Generalizing the use of the canonical transformation for the solution of multivariate mixed model equations

La transformation canonique remplace t caracteres correles par t caracteres genetiquement et phenotypiquement independants. Son application dans des evaluations genetiques de type BLUP multi-caracteres diminue les besoins informatiques, accroit la vitesse de convergence d'algorithmes de resolution iterative et simplifie la programmation. Cet article presente differentes manieres de conserver au moins partiellement ces caracteristiques favorables dans les situations ou la transformation canonique est theoriquement impossible, c'est-a-dire quand certains caracteres sont manquants pour certains animaux (y compris lorsqu'un modele animal reduit est utilise), quand plus d'un effet aleatoire est inclus dans le modele et quand differents caracteres sont decrits par differents modeles.

evaluations decreases computing requirements, increases the convergence rate of iterative solvers and simplifies programming. This paper presents alternative ways to retain, at least partly, these desirable characteristics in situations where the canonical transformation is theoretically impossible: when some traits are missing in some animals (including when a reduced animal model is used), when more than one random effect is included in the model and when different traits are described by different models. genetic evaluation / mixed model / computing algorithm / multiple trait / animal model Résumé -Généralisation de l'utilisation de la transformation canonique pour la résolution des équations du modèle mixte multicaractère. La transformation canonique remplace t caractères corrélés par t caractères génétiquement et phénotypiquement indépendants. Son application dans des évaluations génétiques de type BL UP multicaractères diminue les besoins informatiques, accroît la vitesse de convergence d'algorithmes de résolution itérative et simplifie la programmation. Cet article présente di f férentes manières de conserver au moins partiellement ces caractéristiques favorables dans les situations où la transformation canonique est théoriqv,ement impossible, c'est-à-dire INTRODUCTION In routine genetic evaluations, theoretical considerations suggest that in situations where records can be described as linear functions of fixed and random effects, best linear unbiased prediction (BLUP) of genetic effects based on a multiple trait animal model should be used (Henderson and Quaas, 1976;Foulley et al, 1982;Quaas, 1984;Schaeffer, 1984). The inclusion of the known relationship between traits in a joint analysis of these traits increases the amount of information available and as a result, improves the accuracy of prediction and corrects potential biases resulting from selection. van der Werf et al (1992) and Ducrocq (1994a) review the benefits to be drawn from a multiple trait BLUP genetic evaluation. Flexible general purpose packages, eg, PEST ; ) exist and are successfully used to solve complex multiple trait evaluations.
However, the simple iterative algorithms commonly implemented in such packages can be extremely slow to converge when traits are missing for some animals or when several random effects or groups of unknown parents are defined in the model (Groeneveld and Kovac, 1992;Ducrocq, 1994a, b). Although generally acceptable for data files of moderate size, slow convergence can become a limiting factor for routine national evaluations.
In the particular case when the same model with only one random (genetic) effect applies to all traits and no records are missing, a canonical transformation of the t records of each animal into uncorrelated records replaces the large system of multiple trait mixed model equations with a set of t simpler univariate systems (Foulley et al, 1982;Quaas, 1984;Arnason, 1986;Thompson and Meyer, 1986;Jensen and Mao, 1988;Ducrocq and Besbes, 1993). The resulting reduction in computing costs is often drastic. However, the restrictions on the model and data structure required for the implementation of the canonical transformation are rarely fulfilled in practice. Other transformations have been proposed when some traits are missing (Pollak and Quaas, 1982;Quaas, 1984) but it was found that a strategy where missing values are iteratively replaced by their expectation and therefore retaining the possibility to implement the canonical transformation is clearly superior (Ducrocq and Besbes, 1993;Ducrocq, 1994a, b). The purpose of this paper is to demonstrate that the basic objective of the canonical transformation, ie, the reduction of a large linear system of equations into sets of smaller, sparser systems can be achieved in even more general situations, eg, with different models for each trait or with more than one random effect other than the residual. For the sake of completeness, the simple canonical transformation is briefly described with and without missing values on some traits. An extension of the above-mentioned strategy for the missing values case to reduced animal models is also presented.

MULTIPLE TRAIT MIXED MODEL EQUATIONS
First, consider the general situation encountered in multiple trait genetic evaluations. For each trait i, i = 1, ..t, assume the linear model: where y i is the vector of records for trait i; b i and a i are vectors of fixed and random effects and X i and Z i are the corresponding incidence matrices. Here, the only assumption is that no more than one random effect other than the residual e i is considered in the model. The variance-covariance structure for the random effects is summarized as follows: Concatenating the random (genetic) effects and the residuals for all traits into vectors a and e, respectively, the G ij and RZ! blocks are grouped into matrices G = Var(a) and R = Var(e). The (i, j) blocks of the inverse matrices G-' and R-1 are denoted G ij and R ij , respectively.
The submatrices G ij and R2! are functions of the pedigree and data structures and of Go and R o , the genetic and residual variance-covariance matrices between traits. The general form of the mixed model equations is: The number of equations and the memory requirements for such systems increase with t and t 2 , respectively. Iterative solvers can be used but they are relatively complex to implement in the general case. More importantly, convergence rate can be extremely slow (Arnason, 1986;Groeneveld and Kovac, 1992;Reents and Swalve, 1991).
In this section, we consider the particular case where there are no missing records, ie, each one of the recorded animals has a record on each of the t traits, and the same model applies to all traits. Let y = ( Y ' Y ' ... y')' be the vector including all records for all traits and b = (b ...... bt)' be the vector of fixed effects. Each vector y i is of size N Define Q to be a matrix such that QG oQ ' = D -1 , where D is a diagonal matrix, and QRoQ' = It. Such a matrix always exists. A way to compute Q can be found, eg, in Quaas (1984) or Ducrocq and Besbes (1993). Quaas (1984) described different ways to simplify the multivariate system [3] transforming its coefficient matrix into a block-diagonal matrix. A first approach consists in applying a linear transformation: to the data vector y and to manipulate the model of analysis accordingly. This leads to the transformed model: where b Q = (Q Q9 I B )b, a Q = (Q Q 9 I, V * )a and e Q = (Q Q9 I N )e. B and N * are the dimensions of b i and a i , respectively. Then: Since D and It are diagonal t x t matrices, the resulting system of mixed model equations is block-diagonal and, therefore, the solutions for the fixed and random effects for each transformed trait i can be obtained solving the univariate system: where d i is the ith diagonal element of D. The solutions on the original scale are obtained by simple back-transformation: For later use, we will now describe another enlightening way of obtaining this result (Quaas, 1985, pers comm), through matrix manipulation of the multivariate mixed model equations corresponding to model !4!: Rewrite system [12] as C u = W'y and define S = ( Q Q9 0 IB Q Q9 0 IN* ) and S * = Q Q 9 IN. Premultiply both sides of the system by S-' = (S-')' and insert I( B+Ar* ) t = S-1 S in the left-hand side and I Nt = S * -'S * in the right-hand side.
This results in: and is equal to: which simplifies again into univariate systems !14!.

Canonical transformation and reduced animal model
The canonical transformation applies without modifications to multivariate reduced animal models (RAM; Quaas and Pollak, 1981). This will be illustrated here in order to introduce notations for later use. Let the indices (p) and (n) refer to parent and non-parent animals (N P + N n = N * ). One can rewrite model [1] as: where K!n! is a matrix relating records of non-parent animals to their parents. A typical row of matrix K(!) has two non-zero elements equal to 0.5 in the columns corresponding to parents. For the t traits, with records ordered within trait: The part of e * corresponding to non-parents includes the residual effect e( n ) as well as the mendelian sampling contribution !!n!. Let Var(e * ) = R * . If e m represents the t elements of the residual vector e * for a particular animal m, we have: When parents are not inbred, 8 m = 0.75 or 8 m = 0.5 depending on whether only one or both parents are known. Let D n be the diagonal matrix of size N n with diagonal element 6 m If App is the relationship matrix between parents, we have: Then, after transformation of the data y !--> yQ (or after matrix manipulations similar to !13!), system [21] can be partitioned into t univariate 'RAM' systems to solve. For the transformed trait i and defining SZ!!! = INn + diD n :

MISSING VALUES
As previously indicated, the transformation [6] or the matrix manipulation [13] require identical incidence matrices X and Z for each trait. Therefore they cannot be directly implemented when some recorded animals have missing values for some traits. A simple strategy to avoid this constraint has been proposed by Ducrocq and Besbes (1993) and Ducrocq (1994a, b) and is briefly reviewed here. The underlying idea is to iteratively replace the missing values by their expectation given our current knowledge of all parameters and to solve the resulting sytem as if they were not missing, ie, applying the canonical transformation. It can be algebraically shown that this technique leads to the same solutions for fixed and random effects as the usual general approach. A formal justification results from the use of the expectation-maximization (EM) algorithm of Dempster et al (1977).
Using subscripts a and { 3 for observed and missing observations, respectively, and assuming that, given a and b, the complete (= augmented) data vector y = (y!, y) ) ' follows a multivariate normal distribution with mean (It (9 X)b+ (It 0 Z)a, the estimation of b and the prediction of a require the knowledge of the vector of sufficient statistics T (y) where: The vector y, 3 being unknown, we replace T (y) at iteration k by its expectation (E step): where for observed records: . k (k) = y a and for animal m with missing records: In the above formula, R om , aa and R Om , ¡3a are obtained from Ro by choosing the rows and columns corresponding to missing and observed traits for animal m. X m ¡3 (respectively, X ma ) are obtained from (It 0 X) by choosing rows corresponding to missing (respectively, observed) traits for animal m. Similarly, a m , and a&dquo; La are the elements of a m corresponding to missing and observed traits.
The M step consists in solving the mixed model equations in order to obtain new values b!!+1! and a!!+1! for b and a. This is much simpler to implement than in the general case because now a canonical transformation is possible: from the records actually observed and the prediction of the missing ones at the current EM iteration, transformed records on the canonical scale can be computed. After solution of the mixed model equations on the canonical scale and backsolution on the original scale, new predictions for the missing values are made and this iterative scheme is repeated until convergence. In practice, it is not necessary to go back to the original scale as updating can be done on the canonical scale. Consider that all traits for animal m have been ordered such that observed traits precede missing ones: C y & d q u o ; ' ' a J . If this is not the case, re-order y m , Q and R. Partition Ym¡3 Q = ((aa Q ¡ 3 ) and Q-1 = C Q a J . Then, the vector of observations for animal m Q,3 on the transformed scale at iteration (k) is: but we have: where b Q .&dquo;,, is, on the transformed scale, the vector of fixed effects pertaining to animal m. Finally: where X m represents the rows of It 0 X pertaining to animal m. The matrices JQ! (of size t x t a ) and J Q2 (of size t x t) depend on the missing pattern only, and they are computed only once for use at each iteration. Furthermore, each EM step can be interlaced with the iterative procedure used to solve the mixed model equations. This results in large savings in computing time.
Application to reduced animal models (following a suggestion from R Thompson) With the previous approach, in RAM situations, it is necessary to predict y; ;,b for all non-parent animals. This requires in [25] the knowledge of a!!>. This is in contradiction to the original purpose of using a reduced animal model, which is to solve a smaller system of mixed model equations with the additive genetic values of the parent animals only. To avoid the computation of non-parent animals' genetic values, one can replace the E step: L These predicted missing values influence the right-hand side of the RAM equations, which after canonical transformation is of the form: After each solution of the reduced system of equations, or after each iteration completed, the missing terms in YQi(P) and YQi(n) of [30] are computed again given the current values of b &dquo; and a (') Q (P)Q*

MORE THAN ONE RANDOM EFFECT
The second necessary condition in order to apply the regular canonical transformation is the existence of only one random effect other than the residual. In this section, this requirement will be relaxed. Consider for example a model with a direct additive genetic effect and a maternal genetic effect. Assume the same model for all traits (no missing values): where m is the vector of maternal effects, M the corresponding incidence matrix and: The corresponding mixed model equations can be written:

Simultaneous diagonalization
A straightforward extension of the canonical transformation was suggested by Lin and Smith (1990) in a particular situation: if G!,&dquo;,, = G&dquo;,, a = 0 and G aa , Gm m and R o are proportional, then it is possible to find a matrix Q such that, after a transformation similar to (6!: The resulting system of mixed model equations is block-diagonal and simplifies to t univariate systems: Again, this result can be obtained via a manipulation of the system of equations as in [13]. The conditions required to diagonalize three (or more) covariance matrices are rather drastic. Misztal et al (1995) clearly showed that accurate results can still be obtained when the true covariance matrices are replaced with simultaneously diagonisable approximations of these matrices. However, this approach is not applicable when the random effects are correlated (G am -I-0).

Block-iterative canonical transformation
A more general strategy consists in solving [34] block-iteratively (Hackbusch, 1994 At each iteration, one can solve the univariate systems:

DIFFERENT MODELS FOR DIFFERENT TRAITS
A third requirement for the applicability of the canonical transformation is the use of the same model for each trait. Again, several strategies exist to at least partly retain the benefits of the transformation for computational simplicity. For simplicity, assume here that in model [1], the incidence matrices Z i = Z are the same for all traits i (no missing values on recorded animals) but that the incidence matrix X i can vary from one trait to another.

Block-iterative approach
In the general expression (3!, one can isolate in the coefficient matrix the diagonal blocks corresponding to the fixed effects on the one hand and to the random effects on the other. The off-diagonal blocks are moved to the right-hand side after multiplication by the current solutions of the corresponding effects. For the fixed effects part, the system remains multivariate: But for the random effects block, it is possible to take advantage of the fact that the incidence matrix is the same for all traits: where y¡ k+1J = y i -Xb¡k+1J. Applying the regular canonical transformation (as in !6!) to (47!, the random effect block becomes a set of t univariate systems: Gengler and Misztal's approach Gengler and Misztal (1996) proposed an approach that makes use of the algorithm described for the missing values case (note: this approach was also found and used by Goddard (1995, pers comm)). Their method is better illustrated using a two-trait example: Here, the vector f 1 (with associated incidence matrix F) refers to fixed effects influencing trait 1 only, h 2 (matrix H) refers to those influencing trait 2 only and b (matrix X) includes effects affecting both. Consider for example an animal m with two recorded traits: Gengler and Misztal (1996)  For the animal m above, this means the use of four records, two of them being considered missing: Here, dum refers to a dummy level defined for missing records only and different from any level corresponding to observed records. After this artificial modification, the same incidence matrices apply to all traits (for the first and second records for each trait in [52], respectively) and a regular canonical transformation can be performed using the approach described in the previous section to include missing records. No additional coding is required as long as the initial code accepts multiple records. All records are analyzed with the proper model.

An approach using constraints on unnecessary effects
We will slightly generalize Gengler and Misztal's example described in [49). Consider two groups of traits: The idea here is to define the complete model [51] for all traits and to solve the resulting mixed model equations under the constraints: The full system of mixed model equations is: Applying once more a block-iterative strategy for the solution of this system, we can first solve: This system has exactly the same form as described in [12] and can be transformed into a set of t smaller systems by canonical transformation. The other two blocks to solve are: The constraints will be applied on these smaller systems. For example, for the f block, we want to solve: Note that the set of constraints can be written as: To is a diagonal t x t matrix with diagonal element equal to one for each effect to be where W o is a t x t matrix computed only once and f oQ are the unconstrained solutions. These unconstrained solutions are obtained solving t simple univariate systems. If more than one effect is included in f, the inverse of a full rank submatrix of F'F is used in [66] or an iterative approach is used to obtain f¿!+1). . A numerical example of the algorithm is given in the Appendix.

DISCUSSION
The systematic use of a canonical transformation to solve multiple trait mixed model equations is desirable for several reasons, all relying on the computational advantages it offers.
The resulting system is much sparser. Increased sparsity has a beneficial effect on the convergence rate of iterative methods. This has been repeatedly found in univariate evaluations (eg, when an equivalent model is used to increase the sparsity of the coefficient matrix when groups of unknown parents are defined (Quaas, 1988)) as well as for multivariate analyses (Pollak and Quaas, 1982;Ducrocq and Besbes, 1993).
As a result of the block-diagonal structure of the coefficient matrix, the system breaks down to several systems of smaller size. Again, this generally leads to faster convergence in univariate (eg, with reduced animal models) as well as in multivariate situations (Arnason, 1986;Ducrocq, 1994a, b).
Programming is easier, since each system is equivalent to a single-trait analysis.
Programming can be made more efficient: iteration on data (Schaeffer and Kennedy) may be replaced by (possibly partial) storage of the coefficient matrices in core. The system being smaller and sparser, a direct solution may become feasible in some cases, possibly on parts of the system only. The search for optimized iterative algorithms (eg, with optimum relaxation factors (Misztal and Gianola, 1987) may become worthwhile.
The estimation of variance components is greatly simplified and less computer intensive (Meyer, 1985;Jensen and Mao, 1988).
In this paper, it was shown that there are ways to retain at least partly these computational advantages when the three main requirements for the regular canonical transformation to be feasible are not fulfilled.
When data are missing, one can replace the missing records at each iteration by their expected values given the current parameters; the extra coding is minimal; the potential drawbacks, such as the need to backsolve for each non-parent solution when a reduced animal model is used (Ducrocq and Besbes, 1993), can be avoided.
When more than one random effect other than the residual is included in the model and a simultaneous diagonalization of all the covariance matrices between traits is not possible, a canonical transformation can still be applied to the diagonal block corresponding to each random effect. Block iteration is a well-known extension of iterative techniques (Hackbusch, 1994). Although the procedure proposed here is relatively simple, it is not guaranteed that such a strategy is always more efficient than when other partitions of the coefficient matrix in [33] are used. Quaas (pers comm) suggests than in some cases, it may be better to sort the equations in [33] by traits within animal and to treat together all the equations referring to the same animal in a block-iterative procedure. Direct and maternal effects of each animal are then computed jointly instead of separately as in [42] and (43!. The behavior of both strategies needs to be compared in real-life situations. This was not considered here.
When the models describing each trait differ, three alternatives are proposed. The first one relies on the same idea developed in the case with more than one random effect: one can isolate in the mixed model equations a block on which the canonical transformation can be applied. Most often, this will represent by far the largest block, corresponding to additive genetic values. However, this implies a specific coding for the solution of the remaining multivariate part [46].
The second approach (Gengler and Misztal, 1996) has the advantage of being applicable without additional coding. Its drawbacks are the large increase in the size of the data file (the initial size is multiplied by as many times as there are distinct models) and the slower convergence resulting from the large number of missing records whose values have to be computed at each iteration. It is not known whether convergence is always guaranteed in practice. The third alternative consists of creating a complete model including all fixed effects of interest and constraining to zero the solutions of the unwanted effects for each trait. Equation [67] illustrates the fact that this constraint is easy to implement. Our limited experience indicates that convergence is reached at .about the same rate as when the complete model is applied without constraints. It should be noted, however, that this approach imposes a block-iterative solution of the fixed effects one block at a time, which sometimes may be much less efficient than a simultaneous solution of all fixed effects, for example using direct methods. Again, it was not our intention to actually compare the practical performances of these approaches because each one may be valuable in a particular context.

CONCLUSION
Multiple trait BLUP equations have been known for a long time (Henderson and Quaas, 1976) but despite the desirable theoretical properties of the resulting predictions and the huge progress in computing technology, their large scale application is far from systematic, mainly because computing time remains a limiting factor. Some of the strategies presented here may lack the flexibility or simplicity required for easy implementation in a general purpose package for the solution of multivariate mixed model equations, but they may lead to invaluable savings in computing time (Ducrocq and Besbes, 1993;Ducrocq, 1994a, b), eg, for routine national genetic evaluations. These savings can be further increased by the use of more efficient single-trait solvers (Misztal and Gianola, 1987;Ducrocq, 1992;Carabano et al, 1992).
Obviously, these different ways to generalize the use of the canonical transformation can be extended to more general situations with simultaneously missing values, different models and more than one random effect. The strategies proposed are complementary and can be combined in a straightforward manner. One exception that was not dealt with here is the case when the number of random effects considered may vary between traits, for example, when only some traits may have repeated observations (implying the prediction of a permanent environment effect) or may be under the influence of maternal effects. However, the basic idea developed can be extended to such a case: in the initial system of mixed model equations, the blocks corresponding to each particular random effect can be isolated and a canonical transformation can be applied to each of these blocks, through the matrix manipulation described in !13!. is measured at 12 (BW12) and 16 (BW16) weeks of age. Leanness is assessed through the measure of backfat thickness using an ultrasonic probe (ultrasonic backfat thickness, UBT). The description model used for body weights includes a contemporary group (hatch) as only fixed effect, while the UBT measure is also affected by the operator effect. Consider the following records for four turkeys A, B, C and D: The incidence matrix F for the operator effect is: 5.8 20 50 The matrix To is 1 Applying a block-iterative strategy, the solutions of the batch and animal effects are obtained through a regular canonical decomposition and the solution of the operator effect f Q on the transformed scale is given by [66]: The solutions on the original scale are obtained by simple back transformation.
In particular f = ( Q -0 I f ) f Q : As expected, the solutions of the operator effect are zero for BW12 and BW16.