Computing marginal posterior densities of genetic parameters of a multiple trait animal model using Laplace approximation or Gibbs sampling

Deux procedures de calcul de la densite marginale a posteriori des heritabilites et des correlations genetiques, a savoir la methode de Laplace pour l'approximation des integrales et l'echantillonnage de Gibbs, sont comparees. Pour cela, nous considerons un modele animal multicaractere avec un effet aleatoire, sans observations manquantes et avec un modele identique pour chaque caractere. L'approximation de Laplace conduit au calcul de la densite marginale a posteriori pour differentes valeurs du parametre qui nous interesse. Cela necessite l'evaluation repetee de traces et de determinants qui sont simples a calculer une fois que les valeurs propres d'une matrice de dimension egale au nombre d'animaux ont ete determinees. Ces valeurs propres peuvent etre calculees de maniere efficace a l'aide de l'algorithme de Lanczos. L'echantillonnage de Gibbs genere des echantillons de la densite conjointe a posteriori. Ces echantillons sont utilises pour estimer la densite marginale a posteriori, qui est exacte a une erreur de Monte Carlo pres. Les deux procedures ont ete appliquees a un fichier de donnees comportant les caracteres de production de semence recueillies sur 1957 taureaux Normands. Les caracteres analyses etaient le volume de l'ejaculat, une note de motilite et la concentration en spermatozoides. L'approximation laplacienne a permis une approximation tres precise de la densite marginale a posteriori de tous les parametres avec un cout de calcul beaucoup plus reduit.


INTRODUCTION
The optimization of breeding programs relies on accurate estimates of genetic parameters, ie, heritabilities and genetic correlations. Genetic evaluations of selection candidates by best linear unbiased prediction (BLUP; Henderson, 1973) assumes known variance components which, in practice, are replaced by estimates. Thus, it seems desirable to have an idea about the accuracy of the estimates used. Restricted maximum likelihood (REML; Patterson and Thompson, 1971) is widely regarded as the method of choice for estimation of variance components. REML estimates have known large sample properties under the concept of repeated sampling of the data analyzed but unknown small sample distributions. In contrast, a Bayesian analysis yields an exact posterior probability density of the parameters of interest for the data set at hand (Gianola and Fernando, 1986). REML estimates correspond to the mode of the posterior distribution marginalized with respect to the location parameters in a Bayesian analysis with flat priors for fixed effects and variance components (Harville, 1977). Gianola and Foulley (1990) used two arguments for suggesting an alternative method for variance component estimation. First, REML estimates are joint modes of all (co)variance components rather than marginal modes. The latter are a better approximation to the posterior mean, which is the optimum Bayes estimator under a quadratic loss function. Second, if interest is only in a subset of the (co)variance components the remaining (co)variance components should be regarded as nuisance parameters and inferences should take into account the error incurred in estimating these nuisance variance components. Gianola and Foulley (1990) suggest basing inferences about the variance components of interest on their marginal posterior density.
Various methods exist for marginalizing the joint posterior density of variance components. Analytical integration is possible only for certain parameters in simple univariate models, eg, for obtaining the marginal posterior density of the variance ratio (or the heritability) in a single trait linear mixed model with one random effect (Gianola et al, 1990b). With more complex models, approximations or Monte-Carlo integration techniques are used to obtain marginal posterior densities. Tierney and Kadane (1986) have suggested computing marginal posterior densities by Laplace's method to approximate integrals. In animal breeding the method was first considered by Cantet et al (1992) for a linear mixed model with maternal effects and has been applied by Tempelman and Gianola (1993) to a Poisson mixed model and recently by Ducrocq and Casella (1996) to a mixed survival model. Laplacian integration involves the repeated computation of second derivatives of the logarithm of the posterior density of all (co)variance components after integration of location parameters. For single trait animal models with one random effect or for multiple trait models where a canonical transformation is possible, the derivatives can be efficiently computed once all eigenvalues of a symmetric matrix of dimension of the number of animals are determined (Robert and Ducrocq, 1996). Robert and Ducrocq (1996) have shown that the Lanczos algorithm is well suited to computing these eigenvalues.
As an alternative, Monte-Carlo integration can be used to obtain the exact marginal posterior distribution up to the Monte-Carlo error. Gibbs sampling (Gelfand and Smith, 1990;Casella and George, 1992), a Markov chain Monte-Carlo procedure, is becoming an increasingly important tool in statistical analysis (Gilks et al, 1996). The Gibbs sampler generates samples from the joint (marginal) posterior distribution. The samples are used to derive the desired summary statistics of the target distribution. Applications in animal breeding have rapidly increased in recent years (eg, Wang al, 1994;Janss et al, 1995;Sorensen et al, 1995;Van Tassell and Van Vleck, 1996). This paper considers the computation of the marginal posterior density of heritabilities or genetic correlations of a simple multiple trait animal model. Heritabilities and genetic correlations are nonlinear functions of (co)variance components. A second order Laplace approximation of marginal densities of nonlinear functions has been derived by Tierney et al (1989,1991) and is presented in detail. The Laplace approximation is compared to Gibbs sampling by applying both procedures to a data set of semen production traits of Normande bulls.

Model
Consider the multiple trait mixed model with only one random effect, equal incidence matrices for all t traits and no missing observations: where y is the vector of observations of order n ! t, It is an identity matrix of order t by t, X and Z are known incidence matrices of order n by p and n by q, respectively, b is the vector of fixed effects of order p . t and a is the vector of random additive genetic effects of order (j f t, and e is the vector of residuals of order nt.
The conditional probability density function of the observations is: with R o the (co)variance matrix of residual effects of order t by t.
The prior distribution for the unknown location parameters are assumed to be: with Go the matrix of additive genetic (co)variances of order t by t and A the numerator relationship matrix of order q by q.
For the (co)variance matrices the prior distribution is assumed to be an inverse Wishart distribution (eg, Gelman et al, 1995): . 1 with the known parameters degree of belief v i (analogous to degrees of freedom) and scale matrix V i of order t by t, where i stands either for G or R. The expected value of the inverse Wishart distribution is (for example for Go) E(G o w G , V G ) _ (v G -t -1)-1 V G , which may be used for choosing V G . The parameters b, a, Go and R o are assumed to be independent a priori. The joint posterior probability density of all parameters is defined as the product of the probability density of the observations and the joint prior density of the parameters and can be written as: For this study the parameters of interest are functions of the (co)variance components in Go and R!, ie, variance ratios and correlations. Therefore, 0 is a nuisance parameter that should be integrated out of the joint posterior density.
To draw inference about functions of a subset of the (co)variance components, a further marginalization of the posterior density is required. Analytical solutions are not available.

Laplace approximation
Approximation of marginal posterior probability densities using Laplace's method has been suggested by Tierney and Kadane (1986). They considered the case where the vector of parameters can be partitioned as 6 = [o'i, ( T2] , where !1 is the parameter of interest and Q2 is the nuisance parameter that is integrated out of the posterior distribution using Laplace's method to approximate integrals. Genetic parameters such as heritabilities or genetic correlations are nonlinear functions of (co)variance components. Inferences about genetic parameters should therefore be based on marginal densities of nonlinear functions of (co)variance components. Tierney et al (1989Tierney et al ( , 1991 extended the work of Tierney and Kadane (1986) for the approximation of marginal posterior distributions of nonlinear functions.
Let T = g(u) be the nonlinear function of interest of the m parameters in 0 &dquo;. We assume here that T is a scalar (eg, heritability or genetic correlation) but this assumption is not required for the following derivation. A new parameterization is defined as <1>( 0 &dquo;) = {<I>l (0&dquo;), <1>2 ( O &dquo;n = {g( 0 &dquo;), !2(!)} partitioned into the nonlinear function of interest and a function 4 ) 2 that ensures that the transformation is one to one, although it might be difficult to specify <1> 2 explicitly. The approximation proposed by Tierney et al (1989,1991) does not depend on an explicit reparameterization. The only requirement is the existence of a one to one transformation in a sufficiently small neighbourhood U of the joint mode 6. In our application 4 ) 2 is easy to specify and the approximation caused by the restriction to the neighbourhood U does not occur.
In order to obtain the marginal distribution of'[ = g( 0&dquo;), the nuisance parameters u 2 = 4Y2 (u) in the new parameterization must be integrated out where J (4) -1 !(r,U2)) is the Jacobian matrix of the transformation <t> -1.
Let 8 r be the argument on the original scale that maximizes log p(6!y) subject to the constraint g(u) = T . In the new parameterization q!(8T) maximizes log p( <t> -1 (g( 0&dquo;), U2 ) Iy) subject to the constraint g( 0&dquo;) = T . At the constrained mode 4 )(&, r ), the gradient of the Lagrangian log p(<t> -l (g( O &dquo;), U 2 )!y) + A(g(u) --r) is zero, where A is a Lagrange multiplier. Now, log p( 4 ) -'(r, u 2 )ly) in [4] for fixed T is approximated using a second order Taylor series expansion around 4) 2 (& ' ): In expression [5] the part that depends on U2 is the kernel of a normal distribution with mean <l> 2 (Ô''1') and variance Qr, 22 . The normalizing constant of this kernel is (2 7r ) -9 1 f2r,221 2, which leads to the following approximation of equation [4]: In expression [6] S2 T , 22 and J(6'-r) depend on an explicit specification of !2(!). Following Tierney et al (1989, 1991, we now derive an alternative expression for [6] that does not require an explicit reparameterization. Let S2T and HT be the matrices of second derivatives of the Lagrangian evaluated at the constrained mode in the new and the original parameterization, respectively, ie with u' = [g( 0&dquo;), u'], and Note that n!,!2 in equation [5] is equal to the diagonal block of f l -1 pertaining to u 2 = !2(!)! Following Meyer and Smith (1996, equation 53) and using the fact that the gradient of log p(!-1(g(a'), u 2 )ly) + A (g (u) -r) is zero at q!(8r) , we have evaluated at u = 8 r . This leads to: Using the formula for the determinant of the partitioned matrix in equation [7] (Searle, 1982) and recalling that n!,!2 is equal to the diagonal block of S2T I pertaining to u 2 = < P2 ( u), the determinant of nor can be obtained as: Substituting In TI = IH TII J 1 2 which follows from equation [7] into equation [8] yields: This equality can be used in equation [6] to obtain the following approximation of the marginal probability density of the nonlinear function T = g( 6 ) (Tierney et al, 1989(Tierney et al, , 1991: Note that in contrast to equation [6] an explicit parameterization <1> 2 is not needed for approximation !9!. To obtain the marginal posterior probability density of the genetic parameter T = g(u), equation [9] is repeatedly used for different values of T .
In equation [9] observed second derivatives of log p(!!y) are required. Model [1] allows a transformation to canonical scale, which facilitates the computation of derivatives. Colleau et al (1989) have given expressions on the canonical scale for the derivatives of the residual likelihood, which is one part of log p( C Tly). Derivatives of log p(!!y) on the canonical scale and the transformation to the original scale are given in the Appendix.

Approximating determinant and traces
Evaluation of the marginal density !3!, which is needed in equation !9), requires the computation of the determinant of the coefficient matrix of the mixed model equations for each canonical trait (Appendi!). For the evaluation of derivatives of log p( C Tly) traces involving the inverse of the coefficient matrix of the mixed model equations on the canonical scale have to be computed (Appendi!). The computation of these quantities becomes trivial once the eigenvalues, -y 2 , of L'Z'MZL have been determined, where L is the Cholesky factor of A, ie, A = LL', and M is the absorption matrix, ie, M = I -X(X'X)-1 X'. For the determinant we have (eg, Gianola et al, 1990a): The traces can be obtained as (eg, Robert and Ducrocq, 1996): The procedure described by Robert and Ducrocq (1996) can be used to compute all eigenvalues of L'Z'MZL. Here we only give a brief outline, for details see Robert and Ducrocq (1996) and the references cited therein. The procedure uses the Lanczos algorithm to transform the original matrix B = L'Z'MZL of size q to a tridiagonal matrix T k of a given size k, which in theory has the same eigenvalues as B. A standard method can then be used to compute the eigenvalues of T k . The advantage of this algorithm is that memory requirements are minimal because matrix B is not altered. The matrix-vector product Bv needs to be computed repeatedly, which can be achieved very efficiently taking advantage of the special structure of B (Robert and Ducrocq, 1996).
Owing to rounding errors in the practical application, the resulting Lanczos matrix T k is inaccurate. When the size k of T k is increased, the eigenvalues of T k provide increasingly accurate estimates of eigenvalues of B. For k > q T k has 'good' eigenvalues which are true approximations of the distinct eigenvalues of B but also extra eigenvalues which are either copies of a good eigenvalue or spurious. A test is used to identify the spurious eigenvalues. A numerically multiple eigenvalue of T k is accepted as an accurate approximation of an eigenvalue of B. For the remaining good eigenvalues an error estimate is computed (Cullum and Willoughby, 1985).
The Lanczos algorithm is unable to determine the multiple eigenvalues and their multiplicities. The approach used by Robert and Ducrocq (1996) for identifying the multiple eigenvalues is based on the fact that if 7 is a multiple eigenvalue of B, then it will also be an eigenvalue of the matrix B obtained from B by adding a symmetric rank one matrix, which is proportional to the product of the starting vector of the Lanczos algorithm and its transpose. Hence the Lanczos algorithm is applied to B and the eigenvalues of the resulting Lanczos matrix T k that are also eigenvalues of T k are identified as multiple eigenvalues of B. Using the fact that the multiplicity m l of the zero eigenvalue is known to be q -(nrank(X)), the multiplicity m i of the ith multiple eigenvalue -y i can be obtained as a solution to the set of the following three equations: subject to the constraint m i > 1, where r m is the number of multiple and r s is the number of single eigenvalues of B. Robert and Ducrocq (1996) successfully applied this method to a model with one fixed effect. Although the multiplicities determined were not consistent for different sizes of the Lanczos matrix, the traces were very well approximated using these multiplicities even for a Lanczos matrix of size k = 2q.

Gibbs sampling
The Gibbs sampler (Gelfand and Smith, 1990;Casella and George, 1992) generates samples from the joint posterior distribution by repeatedly sampling from the fully conditional distributions of the parameters. Fully conditional densities are derived from the joint posterior probability density given in equation [2] by collecting the terms that involve 0. These terms are proportional to the kernel of a multivariate normal density (eg, Gianola et al, 1990a): the solution to the mixed model equations. Therefore, obtaining a new realization of all location parameters jointly would require sampling from: which is impractical for larger problems. A computationally much less demanding alternative is to sample one location parameter at a time. Let e-z be e without its ith component and c' be the ith row of C without its ith element. Then, a new realization of the ith location parameter 9, is obtained by sampling from (eg, Wang et al, 1994): The disadvantage of using equation [14] is that the samples of subsequent cycles show higher correlations than if equation [13] were used. This leads to a slower mixing of the chain and therefore a lower effective number of independent samples (Sorensen et al, 1995), ie, a lower information content of a chain of given length (see below). Recently, Garcfa-Cort6s and Sorensen (1996) have shown a way to sample from equation [13] that circumvents the need for the inverse of the coefficient matrix of the mixed model equations but requires a solution to the mixed model equations.
With their simulated data for a single trait animal model, the effective number of independent samples for the additive genetic variance was only doubled using equation [13] instead of equation (14!. Because computing costs for equation [13] using the strategy proposed by Garcia-Cortes and Sorensen (1996) are increased by a factor much larger than two it is concluded that equation [14] is the preferred way to sample the location parameters in single trait animal models. With the model considered here the canonical transformation can be used to transform the correlated traits to uncorrelated variables. This allows us to sample the location parameters on the canonical scale using equation (14), which is equivalent to jointly sampling the location parameters for all traits of one effect on the original scale. To improve mixing of the chain additive genetic values for sires and their final progeny can be sampled jointly as proposed by Janss et al (1995).
The fully conditional distribution for the genetic (co)variance matrix Go is: using the fact that e/(G010!)e=tr(SaGOl), where Sa={a!A-1aj} for i = 1, ... , t and j = 1, ... , t, and a i is the vector of additive genetic effects for trait i. The conditional distribution is in the form of an inverse Wishart distribution with q + v G degrees of freedom and scale matrix V G + S a .
Similarly, the fully conditional distribution for the residual (co)variance matrix R o is: where e = y -(I d 9W)8, S e = {eie! ! for i = 1, ... , t and j = l, ... , t, and e i is the subvector of e for trait i. This is an inverse Wishart distribution with n+v R degrees of freedom and scale matrix V R + S e . Sorensen (1996) Samples from the marginal posterior density of the nonlinear function g(u) can be easily obtained by applying g( 0') to the samples generated for 6 in each cycle of the Gibbs sampler (eg, Wang et al, 1994).

APPLICATION Data
The methodology was applied to data set DS2 of Ducrocq and Humblot (1995) with semen production traits of 1957 Normande bulls born between 1976 and 1986. The three traits considered were volume of the ejaculate (mL), motility score (on a scale from 0 to 4) and spermatozoa concentration(10 9 /mL). The observations available were means of these traits for 11.4 ! 3.6 sperm collections. There were no missing observations. The means + standard deviations were 3.08 ! 0.88, 2.99 ! 0.63 and 0.98 ! 0.26 for volume, motility and concentration, respectively. All known ancestors without records were added with the exception of ancestors that provided no additional information, ie, ancestors with both parents unknown and only one progeny. The total number of animals considered was 5 566. Further details about this data set are given by Ducrocq and Humblot (1995).

Model
The same model as considered by Ducrocq and Humblot (1995) was used for each of the three traits. The two fixed effects were birth year x birth trimester x station of performance test (65 classes) and the effect of total number of sperm collections (six classes: unknown, < 6, 7-9, 10-12, 13-15, > 15). The additive genetic effect was the only random effect. Because the number of sperm collections that made up a record did not vary much, and for simplicity, a constant residual variance was assumed for each of the three traits.
Proper inverse Wishart distributions were assumed as priors for the matrices of genetic and residual (co)variances. The degree of belief parameters were arbitrarily set to 20 for both prior distributions. The scale matrices were chosen such that the expected values correspond to the phenotypic standard deviations found in the raw data (see above) with heritabilities of 0.5 for all three traits and genetic and residual correlations of -0.2 between volume and motility as well as concentration and of 0.6 between motility and concentration. The expected prior heritability of 0.5 for means of about ten observations correspond to a heritability of around 0.1 I for single observations. The chosen prior distribution reflects our prior belief based on literature values (Ouali, 1984;Taylor et al, 1985;Makulska et al, 1993).

Implementation of Laplace approximation
Eigenvalues of L'Z'MZL were computed using Lanczos matrices of size 2q, 4q, 8q and 12q. Multiple eigenvalues and their multiplicities were determined using the procedure of Robert and Ducrocq (1996) described above. As observed already by Robert and Ducrocq (1996), eigenvalues were difficult to estimate in some intervals.
With increasing size of the Lanczos matrix, new eigenvalues were detected in these intervals and the intervals with eigenvalues of low precision became smaller. These observations led to a new strategy where intervals with eigenvalues with high error estimates had the first eigenvalue with an error estimate > 10-5 on each side of the interval arbitrarily declared as a multiple eigenvalue in order to compensate for the eigenvalues not yet detected in the interval. For these 'multiple' eigenvalues and the multiple eigenvalue of 0.5 found with the procedure of Robert and Ducrocq (1996) multiplicities were estimated with equations !10!, (11! and !12!, using a least squares procedure. Integer programming was also attempted but failed on all occasions. Application of the formula for the Laplace approximation [9] requires maximization of log p( O&dquo;ly) subject to the constraint g( 0&dquo;) -T = 0. The iterative algorithm used to solve this problem follows ideas developed by Bard (1974, ch 6-6). First, log p(!!y) is approximated by a second order Taylor series expansion around the current estimate 8! at round r: where b 6 = u -6' . Similarly, g(o-) -T is approximated by a first order Taylor series expansion: Now, p is maximized subject to the constraint ! = 0, which is equivalent to finding the stationary point of the Lagrangian p + A!. Setting the first derivatives of the Lagrangian with respect to b 6 and A to zero leads to the following system of equations: The derivatives are evaluated at the current value of 8!. The new solution is then found as C T = C T + 58!!!. In our implementation observed second derivatives in equations [17] were replaced by expected second derivatives, which are faster to compute as will be shown below. Expected values for second derivatives on the canonical scale for the residual likelihood, which is one part of log p(uly), were derived by the method of Meyer (1985). First derivatives and expected values of second derivatives of log p(Q!y) on the canonical scale and the transformation to the original scale are given in the Appendix. Once the constraint was met log p(!!y) was computed for the proposed new solution and if necessary, step size was successively halved until the density was found to increase. Iterations were stopped when the maximum absolute change of heritability or genetic correlation estimates was less than 0.001. At convergence, observed second derivatives were computed for use in equation (9!. The computation of derivatives of log p(oyy) involves the solution to the mixed model equations on the canonical scale. Single trait mixed model equations were iteratively solved for each canonical trait by successive overrelaxation using a relaxation factor of 1.8 and solutions obtained in the previous round of the constrained maximization by equation [17] as starting values. The sum of squared differences between solutions in successive rounds of iterations divided by the sum of squared solutions in the current round was used as stopping criterion and was set to 10 -12 .
The computation of observed second derivatives involves the quadratic form aiA-1C!°A-lai, where a i is the solution to the mixed model equations for additive genetic effects for canonical trait i and Cq a is the submatrix of (W'W + a!E)-1 (the inverse of the coefficient matrix of the mixed model equations for canonical trait j) pertaining to additive genetic effects. This quadratic form was evaluated by first computing Û i = A-'iii, followed by iteratively solving the mixed model equations: In addition to the Laplace approximation [9] a computationally less demanding approximation of the marginal posterior distribution was obtained by replacing the matrix of observed second derivatives in equation [9] by the matrix of expected second derivatives. This approximation will be called LaplaceE in the following.
For heritabilities (genetic correlations) both approximations were computed for 35 (55) values equally spaced by 0.02 that covered the expected range of nonnegligible densities. For a more accurate location of the mode an additional three values spaced by 0.005 were evaluated in the intervals on each side of the value with the highest posterior probability density. The normalizing constant, the mean and quantiles of the marginal posterior densities were computed by simple numerical integration using the trapezoidal rule.

Implementation of Gibbs sampling
The Gibbs sampler was run in three parallel chains of 500 000 cycles each. At the start of each chain the (co)variance components were set to a random sample from their a priori distribution and the location parameters to the solution of the mixed model equations given the starting values for the (co)variance components. MZRAN of Marsaglia and Zaman (1994) was used as the basic random number generator, which has a period of the order of 2 94 .
Because the starting point of the Gibbs sampler is chosen arbitrarily samples are not immediately from the desired target distribution. Therefore, the samples from the first cycles, the burn-in period, have to be discarded. In a preliminary analysis six parallel chains with extreme starting values were run for 10 000 cycles. Starting values were: all three heritabilities 0.05 or 0.95; within these two alternatives all three possible combinations of two genetic = residual correlations of +0.9 and one genetic = residual correlation of -0.9; variance of raw observations as phenotypic variance in all six chains. Visual inspection of the plots of samples for genetic parameters versus cycles indicated that after approximately 1000 cycles convergence to the desired marginal posterior distributions was reached for all parameters. Based on these results the first 2 000 cycles of each of the three Gibbs chains were discarded. Samples from subsequent Gibbs cycles are correlated. The higher the lag autocovariances the lower the information content of the samples of a chain of given length. As suggested by Sorensen et al (1995) the 'initial positive sequence estimator' of Geyer (1992) was used to estimate the error variance of the marginal posterior mean and from this the effective chain length (ECL), ie, the number of independent samples that would result in the same error variance.
Marginal densities of heritabilities and genetic correlations were obtained by the normal kernel estimator (Silvermann, 1986, Wang et al, 1994. Let T i = g(a!i)(i = 1, ... s) denote the samples of the parameter of interest T = g(u). Then the marginal posterior density can be estimated as: where h is the window width. For all heritabilities and genetic correlations a window width of 0.01 was used and the posterior density computed for values equally spaced by 0.005 covering the whole range of the parameter space. Summary statistics of the marginal posterior distributions were computed by numerical integration using the estimated densities.

Approximation of determinant and traces
The eigenvalues determined as multiple and their estimated multiplicities are shown in table I. With the procedure of Robert and Ducrocq (1996) only the eigenvalue 0.5 was detected as multiple with all sizes of the Lanczos matrix. All other eigenvalues were determined as multiple with one size of the Lanczos matrix only. These results suggest that 0.5 is the only real multiple eigenvalue and that the others were falsely determined as multiple, probably due to a too loose criterion for comparing eigenvalues obtained from the two Lanczos matrices T and T. No attempt was made to change this criterion because it was also used for other comparisons within the Lanczos subroutine.
In the alternative strategy, only 0.5 was accepted as a real multiple eigenvalue and the first eigenvalue with an error estimate > 10-5 on each side of intervals with high error estimates were arbitrarily determined as 'multiple'. For the Lanczos matrix of size 2q two such intervals were found. For larger Lanczos matrices, only one of these intervals remained and became smaller with increasing size of the Lanczos matrix. Table II shows the relative errors of the determinant and the traces obtained with increasing size of the Lanczos matrix for error to genetic variance ratios a of 99, 4 and 1/3 corresponding to heritabilities of 0.01, 0.20 and 0.75, respectively.
For each strategy the values obtained with the Lanczos matrix of size 12q were regarded as exact. These values were very similar for the two procedures, the largest absolute relative difference being < 2 x 10-4 for the log-determinant and a = 1/3. In general, absolute relative errors tended to increase with decreasing a. The new strategy resulted in more accurate approximations than the procedure of Robert and Ducrocq (1996). A Lanczos matrix of size 8q was required to obtain absolute relative errors < 2 x 10-4 with the procedure of Robert and Ducrocq (1996), whereas such accuracies were already obtained with a size of 2q using the new strategy.
Eigenvalues computed with the new strategy and a Lanczos matrix of size 12q were used in the Laplace approximation.

Gibbs sampling
The lag-20 autocorrelations for samples of heritabilities and genetic correlations for the three Gibbs chains were all very high and ranged from 0.79 to 0.87 (table III).
Lag correlations were highest for the heritability of motility and for the genetic correlation between volume and motility and lowest for the heritability of volume. Lag-1 autocorrelations were between 0.99 and 0.98 and lag-100 autocorrelations between 0.53 and 0.36 over all genetic parameters and all three chains (data not shown). The estimates of the standard deviation of the marginal posterior means (SDM) ranged from 0.001 to 0.003. The effective chain length (ECL) ranged from 1535 to 2648, which is by a factor of 324 to 188 smaller than the actual chain length.
Marginal posterior probability density of genetic parameters from 2q to 12q. The adhoc declaration of two eigenvalues in the intervals of low precision as multiple with a Lanczos matrix of size 2q compensated well for the eigenvalues not yet detected leading to accurate approximations of traces and determinants. In contrast, much larger Lanczos matrices were required to achieve the same accuracy with the procedure used by Robert and Ducrocq (1996). Robert and Ducrocq (1996) successfully applied their strategy to a model with one fixed effect and found accurate approximations of traces already with a Lanczos matrix of size 2q. In their application, Robert and Ducrocq (1996) consistently found three multiple eigenvalues (0.5, 0.6875, 0.75), which corresponded to animals with certain genetic relationships that had records in the same level of the fixed effect. They report that distinct eigenvalues not yet detected with a Lanczos matrix of 4q were in the intervals [0.72, 0.75] and [0.84, 0.87]. Therefore, the estimated multiplicities, especially for the multiple eigenvalue of 0.75, has compensated for the eigenvalues not yet detected with small sizes of the Lanczos matrix, leading to very accurate approximations of traces with a Lanczos matrix of 4q.
Marginal posterior probability density of genetic parameters The Laplace approximation yielded very accurate approximations of the marginal posterior densities. As evident from equations [5] and [6], the approximation assumes normality, which is true only asymptotically. Our results suggest that the 1957 animals with observations were a large enough sample for normality to be a good approximation. The performance of the Laplace approximation with smaller samples is not known and will be the subject of another study. The computing time for the Laplace approximation of the marginal densities of all six genetic parameters was by a factor of 16 smaller than for Gibbs sampling.
Computing time for the Laplace approximation could be reduced by considering more sophisticated numerical integration techniques, eg, iterative Gauss-Hermite quadrature as used by Ducrocq and Casella (1996), which might require evaluation of the density for less values. It should be kept in mind that in this study the three combined Gibbs sampling chains were used to obtain an exact reference with small Monte-Carlo error to investigate the accuracy of the Laplace approximation.
The small variation of all summary statistics among the three chains suggests that a combined ECL > 1 500 is enough for obtaining marginal densities sufficiently accurate for most practical applications. With increasing sample size computing time is expected to increase nearly linearly for both procedures, Gibbs sampling and Laplace approximation, with the exception of the determination of the eigenvalues of the Lanczos matrix. However, increasing the number of traits will lead to a quadratic increase in computing time for the Laplace approximation if marginal densities are desired for all genetic parameters. For Gibbs sampling, the increase in computing time is expected to be nearly linear because canonical transformation can be used.
A simple multiple trait animal model with one random effect, no missing observations and equal incidence matrices was considered in this study. Hence, canonical transformation was applied and traces and determinants for canonical traits were easily computed once the eigenvalues of L'Z'MZL had been determined. Without these computational advantages, the application of the Laplace approximation is feasible only for small problems, since for each value where the marginal density is computed, the traces need to be evaluated repeatedly to locate the constrained mode and the determinant has to be evaluated once. The application of the computational strategies used in this study to multiple trait models with missing observations or models with more than one random effect is not possible. For the Laplace approximation to be applied to such models new efficient computing strategies have to be developed. Recent advances in the application of the canonical transformation to situations with missing observations for the solution of multivariate mixed model equations (eg, Ducrocq and Chapuis, 1997) suggest that further research towards an extension to multiple trait models with one random effect but missing observations seems most promising.