Effects of data structure on the estimation of covariance functions to describe genotype by environment interactions in a reaction norm model

Covariance functions have been proposed to predict breeding values and genetic (co)variances as a function of phenotypic within herd-year averages (environmental parameters) to include genotype by environment interaction. The objective of this paper was to investigate the influence of definition of environmental parameters and non-random use of sires on expected breeding values and estimated genetic variances across environments. Breeding values were simulated as a linear function of simulated herd effects. The definition of environmental parameters hardly influenced the results. In situations with random use of sires, estimated genetic correlations between the trait expressed in different environments were 0.93, 0.93 and 0.97 while simulated at 0.89 and estimated genetic variances deviated up to 30% from the simulated values. Non random use of sires, poor genetic connectedness and small herd size had a large impact on the estimated covariance functions, expected breeding values and calculated environmental parameters. Estimated genetic correlations between a trait expressed in different environments were biased upwards and breeding values were more biased when genetic connectedness became poorer and herd composition more diverse. The best possible solution at this stage is to use environmental parameters combining large numbers of animals per herd, while losing some information on genotype by environment interaction in the data.


INTRODUCTION
The application of genetic covariance functions (CF), to model traits in dairy cattle by predicting breeding values as a function of an environmental parameter (EP), has been suggested several times [1,2,8,13]. The change of an animal's expected breeding value (EBV) across environments represents its environmental sensitivity. The CF includes differences in the environmental sensitivity of genotypes for a trait, also known as the genotype by environment interaction (G × E), in the variance components, regardless of whether it originates from scaling effects or re-ranking of animals across environments. This is in contrast to the usually applied methods for breeding value prediction that either (1) ignore environmental sensitivity or (2) ignore re-ranking by correcting only for heterogeneity of variances [9]. In international breeding value estimation, where G × E is included in the model by regarding records of animals in different countries as different traits [11], both scaling and re-ranking are considered. However, this method has several limitations, for example the grouping of animals based on country borders while herd environments in small neighbouring countries may be much more similar than herd environments in different parts of a large country [14]. Also, a large number of countries implies a large number of traits, which increases the chance that the estimated genetic covariance matrix is not positive definite [6], indicating that problems are likely to appear in the estimation of variance components for such multi trait models. Therefore, the application of CF is of interest to take G × E into account for example in international breeding value estimation, or to investigate the importance of G × E.
In applications in dairy cattle, an EP is usually calculated as the mean phenotypic performance of a trait in an environment [1,2,8,13], which implies that both average genetic level within the herd and the animals own true breeding value (TBV) are included in the EP [1,2,8]. Confounding between EP and TBV might affect EBV for example in herds with a non-average genetic composition or relatively small herds, since it might be difficult to disentangle genetic and environmental effects. Kolmodin et al. [8] tried to partly solve this problem by calculating EP from more animals in the herd, rather than only from animals whose sires are being evaluated. Another problem with the application of CF is that low numbers of daughters per sire might lead to problems in predicting breeding values. The number of records from daughters of a sire is the number of data points through which the curve representing the sires' EBV is fitted and extrapolation of curves of sires with a low number of daughters to extreme environments might be required. Another typical animal breeding problem is that herds with better management tend to use different sires than herds with a low level of management. This might lead to poorer genetic connectedness between herd environments but also to a covariance between genotype and herd environment. Hence, it is not known whether CF can handle these typical animal breeding problems, such as limited genetic connectedness between herds or preferential treatment, that exists in both within country and international breeding value estimation.
The objective of this paper was to investigate the influence of definition of EP and levels of preferential sire use in herds on expected breeding values and estimated genetic variance across the range of EP in one population by stochastic simulation. Data structures were varied by changing the number of daughters per sire and average number of animals per herd for traits with low and high heritabilities, applying three levels of the G × E interaction.

Simulation
Data were simulated to compare estimated variance components, calculated EP and expected breeding values from different models. A record was simulated including the animals breeding value, a herd effect and a residual. A breeding value (a) was simulated as the average of the parents breeding values plus a Mendelian sampling term (ms). Each component included an intercept (a 0 and ms 0 ) and a linear regression on the environment (a 1 and ms 1 ): and The Mendelian Sampling term was simulated dependent on the environment, to ensure that it explained half of the total genetic variance in each given environment. The breeding value in a specific environment with a simulated herd effect herd was calculated as: TBV herd = z herd a, where z herd = z 0 herd z 1 herd , and z 0 herd and z 1 herd are respectively the level and slope of the animals breeding value. TBV herd had a normal distribution N(0, z herd Var(a)z herd ) for each value of herd. Application of reaction norm models as a function of herd average of the analysed trait showed that genetic variances increase with increasing herd level of the trait [1,8]. In order to simulate mainly increasing genetic variance across environments, 99% of simulated herd effects (herd) got positive simulated values by sampling from a normal distribution N(1, 1/9). The residual was simulated homogeneously across environments by sampling from a normal distribution N(0, σ 2 e ), where σ 2 e = 1 − σ 2 a 0 . σ 2 a 0 and σ 2 a 1 were set to 0.04 and 0.02 to reflect a low heritability trait (e.g. a fertility trait) and to 0.4 and 0.2 to reflect a high heritability trait (e.g. a milk production trait). The correlation between level and slope (r a 0 ,a 1 ) was set to −0.5, 0 or 0.5. The simulated genetic correlation between the trait expressed in different environments was calculated by dividing the genetic covariance between two environments, with simulated herd effects of herd 1 and herd 2 , by the square root of the product of the genetic variances in both environments: As a result of the chosen variances, both the low and high heritability traits had simulated values for r g (herd=0.5,herd=1.5) of 0.74, 0.89 and 0.96 representing different amounts of re-ranking for r a 0 ,a 1 being respectively −0.5, 0 or 0.5. Simulated heritabilities across environments for both the low and high heritability traits are shown in Figure 1.

Population structure
Different values were considered for the input parameters (Tab. I). All values in bold were used as default in situations where different values were considered for the other parameters. A simulated population contained 50 000 animals, 500 or 2000 sires and 1000 or 5000 herds. The number of daughters per sire was 25 or 100. The average number of animals per herd was 10 or 50. Only one generation of animals was simulated and no selection was considered.
Daughters of sires were either randomly or non-randomly assigned to herds following three different scenarios, based on the differences in the selection of sires and herds and resulting genetic connection between the groups of herds (Tab. II). In the first scenario sires were assigned randomly across herds. In the second scenario (selective use of sires), sires were ranked based on the simulated breeding value of level. Both sires and herds were split in five equally  sized groups; sires based on ranking of their breeding values for level and herds at random. Daughters of sires from the first group were most likely assigned to herds of the first group; daughters of sires from the second group were most likely assigned to herds of the second group, etc. The chances of a sire from group i to have a daughter in group of herds j, are shown in Table III. The third scenario involved non-random grouping of herds based on an increasing simulated herd effect combined with the selective use of sires, to create a positive correlation between the herd effect and sires breeding values for level. This scenario is referred to as the herd dependent use of sires.

Analysis of simulated data
The general model used to analyse the simulated data, with a linear random regression on a calculated EP, was: where: y jk is the performance of cow k; µ is the average for the trait across all animals; hr j is either a fixed effect of herd j or a fixed polynomial regression common to all evaluated animals on phenotypic average within a herd (see below); 1 i=0 α ik p i j is the additive genetic effect of animal k in herd j where α ik is coefficient i of the random regression on a polynomial (pol(x,t) option in ASREML) [4] of environment of animal k and p i j is element i of a polynomial resembling the calculated EP of herd j; and e jk is the residual effect of cow k in herd j.
Polynomials were used to rescale EP in order to facilitate the convergence of the model. The estimated genetic variance matrix S had variances of level and slope on the diagonal and covariances between those on the off-diagonals. The estimated genetic variance in an environment with EP equal to EP1 was calculated as Φ EP1 SΦ EP1 ', where Φ EP1 is a vector with polynomial coefficients of EP1 on each row. The estimated genetic covariance between environments with EP equal to EP1 and EP2, respectively, is calculated as Φ EP1 SΦ EP2 '. To compare the results to simulated values, all estimates of genetic variance components were calculated back from the polynomial scale to the original scale per replicate and then averaged across replicates. ASREML [4] was used for all analyses. For all situations considered, 50 replicates were simulated, which was sufficient to obtain reliable averages in initial test analyses.

Modelling of EP
Three models were considered for an estimated herd effect (hr j ) and calculated EP: Model 1. hr j is a fixed effect of the herd as normally used in breeding value estimation models [5] and EP was calculated as the average phenotypic performance of the trait within a herd. Model 2. hr j is a fifth order fixed polynomial regression common to all evaluated animals [12] on EP, which was calculated as the average phenotypic performance of the trait within a herd. Model 3. hr j was a fixed effect of herd and EP was iteratively estimated with the general model. In the first iteration EP was equal to the average phenotypic performance of the trait in a herd. In all consecutive iterations EP was equal to the value of the fixed herd effect, estimated in the previous iteration. The iteration was stopped if all EP were equal to the values of the corresponding estimated fixed herd effects, i.e. the difference between each newly estimated fixed herd effect (hr j ) and EP from the last iteration was smaller than the convergence criterion (a maximal absolute change of 0.001).
Model 3 was expected to remove possible bias from EP, resulting from a nonrandom use of sires or low numbers of animals per herd. Model 3 resembled the simulation model most, since the calculated EP was equal to the estimated fixed herd effect. In situations where all three models were applied, a single data set was simulated in each replicate and analysed with each of the three described models.

Comparison of different methods to model EP
The effects of description of an EP were investigated by comparing estimated variance components, expected breeding values and calculated EP to simulated values for the different scenarios across all 50 replicates. Estimated variance components were used to calculate estimated genetic correlations of the trait expressed in different environments. Also, the correlations between TBV and EBV of sires were calculated for different values of EP to indicate problems arising from the selective use of sires when applying CF.

Variance components, breeding values and EP
Each replicate gave estimates of the residual variance, variances of level and slope and the covariance between level and slope. Averages and standard deviations of estimated variance components across the 50 replicates are shown in Table IV for the low and the high heritability trait with r a 0 ,a 1 of 0.0 and random use of sires. The trends were generally the same for the low and high heritability trait. Variance components of models 1, 2 and 3 were hardly different. Estimated variances of the slope were underestimated for situations with 10 animals per herd.
Genetic correlations between level and slope for all situations considered in Table IV were estimated on average 0.2 higher than simulated (results not shown). In replicates where the estimated correlation between level and slope became higher than 1, the (co)variance matrix was forced to be positive definite by fixing the correlation at 0.999 [4]. For the low heritability trait, the variance of the slope became very small in a considerable number of replicates leading to fixation of the correlation between level and slope at 0.999 and on average to a high estimate of the correlation between level and slope. For the high heritability trait, the overestimation of the correlation between level and slope mainly resulted from an overestimation of the covariance between level and slope.
In each replicate, values were calculated for EP for all herds and breeding values of level and slope were predicted for all animals. Average correlations between simulated herd effects and calculated EP, and simulated and expected breeding values of level and slope of sires, are given in Table V for the high heritability trait, r a 0 ,a 1 = 0.0 and random use of sires. Different definitions of EP hardly influenced the correlations between simulated herd effects and calculated EP. The EP of models 1 and 2 were both calculated as phenotypic   herd averages and therefore were the same. Generally, the values of EP in model 3 converged after two or three iterations. The number of animals per herd had a larger effect on the correlations between simulated herd effects and calculated EP, than the number of daughters per sire. The number of daughters per sire had a larger effect on the correlations between simulated and expected breeding values of levels and slopes of sires, than the number of animals per herd.
Genetic variances across environments estimated by model 1 are shown in Figure 2 for the high heritability trait with r a 0 ,a 1 equal to 0.0. Regardless of the data structure, the curve of the estimated genetic variance was flatter than the curve of the simulated genetic variance. The number of animals per herd had a strong influence on the estimates of the genetic variance, while the influence of the number of daughters per sire was limited. In the situation with 100 daughters per sire and 10 animals per herd, estimated genetic variance deviated up to 30% from the simulated value. The simulated value of r g (EP=0.5, EP=1.5) was 0.89 (given r a 0 ,a 1 = 0.0), while estimated values were 0.93, 0.93 and 0.97 (results

Selective use of sires
Averages and standard deviations of estimated variance components of model 1 for selective use of sires are shown in Table VI. Residual variance was strongly overestimated and the variances of level and slope were strongly underestimated in all situations. For situations with selective use of sires, model 3 gave results (not shown) that were comparable to model 1, indicating that model 3 was not better in distinguishing between environmental and genetic effects than model 1.
Correlations between simulated herd effects and calculated EP and between simulated and expected breeding values of sires of level and slope are shown in Table VII. Correlations for herd environment and sires breeding values of level were lower than for situations with random use of sires, while correlations of the slopes of sires breeding values were slightly higher. Biased estimates of EP combined with underestimated variances of level and slope resulted in an underestimation of the genetic variance across environments in all situations with selective use of sires (results not shown).    Herd dependent use of sires, the situation with a confounding of sires breeding values of level and simulated herd effect, was only applied to the situation with 100 daughters per sire and 50 animals per herd with a correlation between level and slope of 0.0. The results (not shown) were comparable to the results for the selective use of sires.  Table VIII were independent from EP, due to  the correlation between level and slope of 0.0. For groups of sires 1 and 5, EBV EP=0.57 and EBV EP=1.43 were on average closer to zero than simulated. As the data became more complex, average EBV of groups of sires 1 and 5 were closer to zero. Correlations were calculated between simulated and expected breeding values for each EP level (Tab. VIII). Correlations were slightly higher for EP = 1.00 and EP = 1.43. Correlations were the same for random and selective use of sires. For herd dependent use of sires, correlations tended to be the highest in the group of herds where sires had most daughters.

Modelling of EP
In this study we started with an idealised situation where the simulation model and the model used to analyse the data, were as similar as possible. One of the major differences between the simulation and estimation models was that EP in model 1 and 2 were calculated as phenotypic averages since they are generally modelled in a reaction norm model [1,2,8,13]. The proposed alternative model (model 3) was expected to correct for genetic influences on EP by iteratively estimating the fixed herd effect in the evaluation model and use this as EP in the next iteration. Model 3 was tested because we expected that this model had closer resemblance with the simulated (and probably the true) model. All models used a linear random regression on EP to model genetic effects. Model 1 performed slightly better than model 2, which likely results from the fact that model 1 exactly fitted the simulation model and used more degrees of freedom to estimate herd effects. The results of model 3 were not different from the results of model 1 even for the situation with selective use of sires and model 3 used about twice as much calculation time as model 1.
Failure of the alternative model to perform better than model 1 could mean that either simply using herd means as EP is not the real underlying problem for estimation when using data under the scenario of selective use of sires, or that the proposed alternative model did not properly account for possible genetic bias in EP. More theoretical models, that for instance include simulated environmental effects as EP, could be used to further explore the nature of this problem. However based on the results of this study which was restricted to practical applicable models, there is no reason to use model 3 instead of model 1.
Model convergence was one of the major problems experienced with all three models. Although the random regression model is the most common applied covariance function in reaction norm models, Jaffrezic and Pletcher [7] showed in a few examples that a character process model was more successful in modelling longitudinal data than random regression. The application of a character process model to model reaction norms appears straightforward, and might provide a solution to get better convergence of the model.

Estimation of G × E
For the trait with a low heritability, it was more difficult to estimate the genetic CF than for the trait with a high heritability. The main problem was that for the low heritability trait in almost 40% of the replicates the covariance structure was forced to be positive definite. Also, the number of animals per herd was important to estimate genetic variance and calculate EP correctly, which illustrates that environmental sensitivity is better estimated in a population with larger herds and likely to be underestimated for a population with small herds. However, in a practical situation small herds may either be too large in number to simply disregard or represent certain management styles that are hardly found in larger herds. This problem might be partly solved by calculating EP based on for instance 50 animals that calved consecutively in one herd, rather than based on herd-year. Changing the data structure from the default situation by reducing the number of animals per herd to 10 or by introducing the non-random use of sires, led to correlations between simulated herd effects and calculated EP of 0.69 and 0.74, respectively (Tabs. V and VII). Although these changes in data structure are arbitrary, it indicates that both relatively low numbers of animals per herd and non random use of sires leads to biased EP.
One of the effects observed was that the estimated genetic correlation between the high heritability trait expressed in different environments was biased upwards, i.e. estimates were 0.93, 0.93 and 0.97 in situations with a random use of sires where the simulated value was 0.89. This resulted from the overestimated covariance between level and slope and the underestimation of variance of slope. Underestimation of variance of slope in situations with random use of sires also resulted in deviations of up to 30% of estimated genetic variance from simulated genetic variance. Variances of slope were more underestimated if the population structure was less informative, which indicates that high estimates of the genetic correlation between a trait expressed in different environments calculated with CF might result from the quality of the data rather than from the absence of re-ranking based on TBV. In the extreme situation where the variance of slope is estimated to be zero, the estimated genetic correlation between a trait expressed in different environments will be 1, since it can easily be derived from equation (1).

Prediction of breeding values across environments
One of the objectives was to investigate the influence of sires breeding values on EP. In situations with selective and herd dependent use of sires, the sires were grouped based on their TBV for level, which is equal to TBV herd=0 . Grouping of sires based on TBV for any other simulated herd effect would have caused only small changes in the composition of groups of sires, since the simulated genetic correlation between the trait expressed in different environments was relatively high. The model clearly had more problems in estimating effects correctly in the case of selective use of sires. Selective use of sires not only implies a possible bias in EP but also poorer genetic connections between groups of herds, which can lead to more difficulties for the model to disentangle genetic and environmental effects [3]. From this study, it is not clear whether problems in the estimation of variance components in the case of non random use of sires are due to genetic influence on EP, poorer genetic connections between groups of herds or failure to disentangle genetic and environmental effects. Random herd effects could be applied to avoid herd effects from absorbing part of the genetic levels within herds. Initial analyses with model 3 using random herd effects showed however that variances of level and slope were severely overestimated and herd variances were severely underestimated.
Groups of sires shown in Table VIII were selected based on their TBV of level. This implies that selection was based on data that is not included in the genetic evaluation and therefore EBV are expected to be biased [5] and correlations between TBV and EBV are expected to be different for different groups of sires. However, groups of sires in Table VIII were the same for the different scenarios. Therefore, differences between scenarios are due to differences in genetic compositions of herds and in case of herd dependent use of sires also due to the fact that sires had most of their daughters in a limited range of environments. Correlations between simulated and expected breeding values indicated that breeding values of sires were predicted accurately across environments with the different models. Absolute values of EBV, however, were closer to zero if the data became less informative. This is not a problem if selection is based on a single trait or if scaling effects are not important. If selection is, however, based on an index based on more than one trait with different scaling effects, scaling effects can cause re-ranking across environments based on the composite index [10]. In that case, non-random use of sires could result in misleading indexes, since scaling effects of traits are likely to be underestimated.
The EBV of cows were not compared to their TBV. In the simulated data, cows only had one record and therefore only one point through which their EBV was fitted. Since the EBV of cows are based on far less data than the EBV of sires, the EBV of cows are likely to be more biased than the EBV of sires, especially if breeding values are extrapolated to extreme environments.
Problems in estimating variance components and lower correlations between TBV and EBV under the presence of selective use of sires seem to contradict suggestions [8] that CF could be useful in overcoming problems with genetic connectedness in international breeding value estimation. However, only one population with one generation of sires was simulated and subsets of sires were not equally distributed, while an international situation ideally would be simulated by different related base populations reflecting different countries. Additional genetic relations between animals would improve genetic connectedness across environments and therefore reduce bias in EBV. Since poor genetic connectedness and confounding between herd and genetic effects are features of the data, bias in estimated variance components might be reduced by selecting data containing genetically well-connected herds with a non-extreme genetic composition and different levels of management.

CONCLUSION
Implications of using phenotypic averages as EP in CF were expected to lead to problems of estimation of variance components. Non average genetic composition of herds and poor genetic connectedness had a large impact on estimated variance components in CF and gave poorer correlations between simulated and predicted sire effects and between simulated herd effects and calculated EP. Estimation problems were not overcome by a new model that was aimed at separating environmental and genetic effects in the EP. The effect of estimation problems was that genetic correlations between the trait expressed in different environments were biased upwards and that EBV were biased if genetic connectedness became poorer and herd composition more diverse. The best possible solution at this stage is to use EP combining a large number of animals per herd.