An expression of mixed animal model equations to account for different means and variances in the base population

Abstract - This paper presents a general expression to predict breeding values using animal models when the base population is selected, i.e. the means and variances of breeding values in the base generation differ among individuals. Rules for forming the mixed model equations are also presented. A numerical example illustrates the procedure. © Inra/Elsevier, Paris

mixed model equations / animal model / base population / selection


INTRODUCTION
The prediction of breeding values involves assumptions on animals with unknown parents, commonly named the base animals. Correct understanding and definition of the base population are critical for animal models because all subsequent breeding values are tied to them. The usual assumption is to consider base animals unselected. However, this condition often does not hold because it is not always possible to trace the complete genealogy or to describe the selection process back to the unselected foundation generation. In this case, the distribution of breeding values is altered and, in particular, it is no longer valid to assume that the breeding values of the base animals have the same mean and variance, or that the genetic variance of the base generation is twice the Mendelian variance.
In a Gaussian setting, Henderson [6] derived a modification of the mixed model equations (MME) that yields predictors of breeding values that are unbiased even if base animals were selected, provided that the variance components associated with the model are known. In many applications these equations are difficult to set up and various alternatives have been suggested. In sire evaluation, Henderson [5] proposed to logically assign animals to fixed groups according to some existing prior knowledge of breeding values, or instead to treat animals as fixed if selection occurred in an unspecified manner. Quaas and Pollak [12] showed the equivalence between the MME for a sire model with genetic groups and those derived by Henderson [6] under his selection model, provided that the appropriate genetic groups were defined.
The alternative formulation of the MME derived by Quaas and Pollak [12] for a model with genetic groups was exploited in Graser et al. [4] and in Quaas [11]. They gave easy rules to set up the equations corresponding to an animal model with base animals treated as fixed and an additive genetic animal model with groups and relationships, respectively. Cantet and Fernando [1] extended these rules to allow for heterogeneous additive genetic variances and segregation variance between groups. However, these rules assume that each base animal is randomly sampled within the group, and therefore that its variance is the same as before selection took place. Although Henderson [7,8] and van der Werf and Thompson [14] developed MME that account for reduced genetic variance of base animals due to selection, they did not explicitly give a set of rules to set up the associated MME.
The purpose of this paper is to present a general approach to predicting breeding values when genetic means and variances of base animals are not homogeneous. The problem has been dealt with in the literature, and easy rules are available to set up the MME when individuals from the base population have different means [11], or can be easily derived when they have distinct genetic variances [14]. However, the two aspects have not been dealt with together in one practical approach. This paper brings these two problems together. The generalisation gives a convenient formulation for illustrating the relationships between several methods of dealing with selected base populations. This includes obtaining MME which can be constructed using an extension of the rules given by Quaas [11] to cope with different assumptions concerning the variance of breeding values of base animals. A numerical example is given.

THEORY
The usual animal model expression can be written as:

$$y = Xb + Z_1 a_b + Z_2 a_r + e$$

where $y$ is the vector of records; $b$ is the vector of fixed effects; $a_b$ is the random vector of breeding values of base animals; $a_r$ is the random vector of breeding values of non-base animals; $e$ is the random vector of residuals; and $X$, $Z_1$ and $Z_2$ are known incidence matrices associated with $b$, $a_b$ and $a_r$, respectively.
The vector of breeding values of non-base animals can be partitioned as:

$$a_r = Q a_b + s^*$$

where $s^*$ is a linear transformation of the random vector of the Mendelian sampling effects ($s$) of animals with known parents, such that

$$s^* = (I - P_2)^{-1} s$$

where $P_2$ is a matrix relating non-base animals among themselves; and $Q$ is the incidence matrix relating base animals with their descendants, such that

$$Q = (I - P_2)^{-1} P_1$$

where $P_1$ is a matrix relating base animals with non-base animals. $P_1$ and $P_2$ are matrices with 0.5 in the parents' columns in each row.
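The role of Q can be illustrated with a minimal Python sketch. It assumes the usual recursion in which the breeding value of a non-base animal is the average of its parents' breeding values plus a Mendelian sampling term, so that each row of Q is the average of the parents' rows. The pedigree, the animal labels and the function name `build_q` are hypothetical and are not taken from the numerical example of this paper; animals are assumed to be listed with parents before offspring and with both parents known.

```python
from fractions import Fraction

def build_q(base, pedigree):
    """Return {animal: row of Q} for the non-base animals.

    base: list of base animal ids (these define the columns of Q).
    pedigree: list of (animal, sire, dam) tuples for non-base animals,
    ordered so that parents appear before their offspring.
    """
    col = {b: j for j, b in enumerate(base)}
    # A base animal is related to itself only: its row is a unit vector.
    rows = {b: [Fraction(int(j == col[b])) for j in range(len(base))]
            for b in base}
    q = {}
    for animal, sire, dam in pedigree:
        # Each row is the average of the sire's and the dam's rows.
        rows[animal] = [(s + d) / 2 for s, d in zip(rows[sire], rows[dam])]
        q[animal] = rows[animal]
    return q

# Hypothetical pedigree: A, B, C are base animals; D, E, F are descendants.
base = ["A", "B", "C"]
pedigree = [("D", "A", "B"), ("E", "A", "C"), ("F", "D", "E")]
Q = build_q(base, pedigree)
for animal, row in Q.items():
    print(animal, [str(c) for c in row])
```

Exact fractions are used so that the coefficients (1/2, 1/4, ...) can be read directly; a matrix library would be the natural choice for a pedigree of realistic size.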
We can then write [4]: where $j$ and $k$ are the parents of $i$.
Thus, following equation (1), we can write the variance of $s^*$ as: We denote $V(e) = R$. To complete the definition of the model, we need only to specify the expectation and dispersion matrices for $a_b$. This will serve to develop different hypotheses about the mean and variance of the base population. In most mixed models either the mean or the variance is assumed to be zero. Hence, to be general, we will consider that $E(a_b) = Q_b g$ and $V(a_b) = H_b$, where $g$ is the vector of base population means, $Q_b$ is an incidence matrix relating the base animals to their respective groups, and $H_b$ is the dispersion matrix of breeding values of base animals.

Expression (3) can be rewritten as:
With this model the vector of breeding values of base animals is: Now, following the modification of Quaas and Pollak [12] in a similar manner to that described by Graser et al. [4], the associated MME are: Absorption of the equations for the genetic groups ($g$), together with the use of equations (2) and (4), permits us to rewrite the MME in equation (5) as and $a$ is the vector of breeding values of base ($a_b$) and non-base animals ($a_r$). Now, calling and $Z = [Z_1 \; Z_2]$, the prediction of breeding values when base animals are selected is then obtained by solving the following MME:

The calculation of $G^{*-1}$ is simplified if all the groups are assumed to have the same additive genetic variance, and base animals are unrelated and non-inbred, because in that case $G^{*-1}$ is the usual inverse of the relationship matrix. Otherwise, the calculation of $G^{*-1}$ requires computing $H_b$, introducing the segregation variance between groups and inbreeding, though these effects can be easily accommodated using, for example, the algorithms given by Cantet et al. [1] and Meuwissen and Luo [10], respectively. The second term of $G^{*-1}$ requires the computation of $H^{-1}$. However, from inspection of the MME in equation (5) it can be seen that, if no inbreeding is assumed and base animals are genetically unrelated, $H^{-1}$ does not need to be calculated because $G^{*-1}$ can be constructed directly by extending the algorithm of Quaas [11]. In particular, if base animals are sampled at random from some selected populations and, for simplicity, are assumed to be genetically unrelated, then $H_b$ is diagonal with the $i$th diagonal element defined as $\delta_i \sigma_a^2$, where $\delta_i$ accounts for the reduction in the genetic variance $\sigma_a^2$ due to selection.
In this case, $G^{*-1} = A^{*-1}(1/\sigma_a^2)$ and $A^{*-1}$ can be computed, for $m$ = number of unknown parents of an individual, replacing $x$ $(= 4/(k + 2))$ in the rules of Quaas [11] with:
- $x = 2$, if $m = 0$ ($k = m$);
- $x = 4/(2 + \delta_j)$, if $m = 1$ and the unknown parent is from a population with variance $\delta_j \sigma_a^2$ ($k = \delta_j$);
- and $x = 4/(2 + \delta_j + \delta_k)$, if $m = 2$ and the unknown parents are from populations with variance $\delta_j \sigma_a^2$ and $\delta_k \sigma_a^2$ ($k = \delta_j + \delta_k$).
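The three cases above reduce to a single expression, x = 4/(2 + k), where k is the sum of the variance ratios of the unknown parents. The short Python sketch below illustrates only this coefficient; the function name `quaas_x` and the numerical values of the ratios are hypothetical, and the assembly of the full inverse matrix following Quaas [11] is not reproduced.

```python
def quaas_x(unknown_parent_deltas):
    """Coefficient x = 4 / (2 + k) for one individual.

    unknown_parent_deltas: one variance ratio (delta) per unknown parent,
    so the list has length m = 0, 1 or 2 and k is the sum of its elements.
    """
    if len(unknown_parent_deltas) > 2:
        raise ValueError("an individual has at most two parents")
    k = sum(unknown_parent_deltas)
    return 4.0 / (2.0 + k)

# Hypothetical values of delta, for illustration only.
print(quaas_x([]))              # both parents known: k = 0
print(quaas_x([1.0]))           # one unknown parent, unselected population
print(quaas_x([1.0, 1.0]))      # two unknown parents, unselected population
print(quaas_x([0.8, 1.0]))      # one unknown parent from a selected population
print(quaas_x([float("inf")]))  # unknown parent with infinite variance
```

With all ratios equal to one, the coefficients are 2, 4/3 and 1, that is, 4/(m + 2) for m = 0, 1 and 2 unknown parents. An infinite ratio drives the coefficient to zero.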

NUMERICAL EXAMPLE
Consider the following pedigree: All base sires and dams come from the same population. Dams were taken at random and sires were selected from the offspring of the 1 % of the phenotypically best animals.
Records were made in two time periods as follows: Two different genetic groups can be defined: $g_1$ for the selected base males and $g_2$ for the randomly chosen base females, both with different additive genetic means and variances. Assigning a hypothetical base animal to these groups ($g_1$ and $g_2$), genetic groups can be treated as fixed effects ($\delta = \infty$).
Selection carried out in males is known. Therefore, assuming normality, the proportion of additive genetic variance remaining after selection ($\delta$) can be derived from the following expression: $\delta = 1 - i(i - w)h^2$, with $i$ being the selection intensity, $w$ the standardised truncation point and $h^2$ the heritability [13].
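As an illustration of this expression, the following Python sketch evaluates it for truncation selection on a normally distributed phenotype. The selected proportion of 1 % matches the example, but the heritability of 0.25 and the function name are assumptions made here for illustration. Because the sires of the example are offspring of the selected animals rather than the selected animals themselves, the value printed is not necessarily the one used in the example.

```python
from statistics import NormalDist

def variance_ratio_after_truncation(p_selected, h2):
    """Return (delta, i, w) for truncation selection under normality.

    p_selected: proportion of animals selected (0 < p_selected < 1).
    h2: heritability of the selection criterion.
    """
    std_normal = NormalDist()                 # mean 0, standard deviation 1
    w = std_normal.inv_cdf(1.0 - p_selected)  # standardised truncation point
    i = std_normal.pdf(w) / p_selected        # selection intensity
    delta = 1.0 - i * (i - w) * h2            # proportion of variance remaining
    return delta, i, w

# h2 = 0.25 is a hypothetical value, not taken from the paper.
delta, i, w = variance_ratio_after_truncation(p_selected=0.01, h2=0.25)
print(f"w = {w:.3f}, i = {i:.3f}, delta = {delta:.3f}")
```

Weaker selection (a larger selected proportion) or a lower heritability brings the ratio closer to one, the value for an unselected base population.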
Following the proposed rules, we have: The associated MME are: The coefficient matrix has order 15 but rank 14. Imposing the

DISCUSSION
The results presented in this paper permit us to obtain a general expression to predict breeding values using animal models when the means and variances of breeding values in the base generation differ among individuals. This can be accomplished using equation (5) or equation (7) with a proper definition of $H$ in equation (6). In particular, it is through $Q_b$ and $H_b$ that we account for the distribution of breeding values of base animals, and can illustrate the correspondence among different models for selected base populations. Thus, if $Q_b = 0$ and $H_b = I\sigma_a^2$, expression (7) leads to the usual MME under a non-selection model. With $H_b^{-1} = 0$, which can be obtained by setting Similarly, the MME described in van der Werf and Thompson [14] are the same as in equation (7) with $H_b^{-1} = \delta^{-1}(1/\sigma_a^2)$ and $Q_b = 0$. Further, when selection can be described as a linear function of breeding values of base animals ($M'a_b$), it can be shown that equation (7) is equivalent to equation (3) in Henderson [8] when $M = H_b^{-1}Q_b$ and, therefore, $Q_b'H_b^{-1}a_b$ represents the conditional variable upon which selection is assumed to be based. This can be interpreted generally as a weighted grouping, where groups are weighted by the dispersion matrix of breeding values of base animals. Alternatively, the results of Famula [2] serve to show that this is equivalent to a model of restricted selection using $H_b^{-1}Q_b$ as a restriction matrix.
Hence, predictions of $a_b$ deviations from their group mean are independent of selection decisions made in the past and, assuming normality, selection can be ignored. Note, however, that this is not true if descendants of base animals are also selected, unless they are selected on linear, translation invariant functions of the observations [6]. The latter condition would not be satisfied when the selection criterion included the group effect or, more generally, when base animals were treated as fixed [14]. Nonetheless, this condition for ignoring selection does not need to be met when likelihood [9] or Bayesian [3] methods of inference are used, and it has not been demonstrated that this property leads to maximising the expected genetic progress, as Fernando and Gianola [3] have shown in a simulated example. Equation (7) can also be useful in the estimation of variance components when, as in the example presented, selection can be simply modelled. In this case, the problem of selected base animals could be reduced to estimating some extra parameters, although the amount and the structure of available data would condition the reliability of estimates [14].

CONCLUSION
When additive genetic means and variances of base animals are not homogeneous, prediction of breeding values can be obtained by means of animal models if the covariance matrix of additive genetic values is properly defined. MME construction is similar to that with homogeneous mean and variance in the base population. The different methods that have been proposed for prediction of breeding values when base population animals have been selected in some non-random manner can be deduced from a general expression of MME.