Estimation of genetic variance for macro- and micro-environmental sensitivity using double hierarchical generalized linear models
Genetics Selection Evolution volume 45, Article number: 23 (2013)
Genetic variation for environmental sensitivity indicates that animals are genetically different in their response to environmental factors. Environmental factors are either identifiable (e.g. temperature) and called macro-environmental or unknown and called micro-environmental. The objectives of this study were to develop a statistical method to estimate genetic parameters for macro- and micro-environmental sensitivities simultaneously, to investigate bias and precision of resulting estimates of genetic parameters and to develop and evaluate use of Akaike’s information criterion using h-likelihood to select the best fitting model.
We assumed that genetic variation in macro- and micro-environmental sensitivities is expressed as genetic variance in the slope of a linear reaction norm and environmental variance, respectively. A reaction norm model to estimate genetic variance for macro-environmental sensitivity was combined with a structural model for residual variance to estimate genetic variance for micro-environmental sensitivity using a double hierarchical generalized linear model in ASReml. Akaike’s information criterion was constructed as model selection criterion using approximated h-likelihood. Populations of sires with large half-sib offspring groups were simulated to investigate bias and precision of estimated genetic parameters.
Designs with 100 sires, each with at least 100 offspring, are required to have standard deviations of estimated variances lower than 50% of the true value. When the number of offspring increased, standard deviations of estimates across replicates decreased substantially, especially for genetic variances of macro- and micro-environmental sensitivities. Standard deviations of estimated genetic correlations across replicates were quite large (between 0.1 and 0.4), especially when sires had few offspring. Practically, no bias was observed for estimates of any of the parameters. Using Akaike’s information criterion the true genetic model was selected as the best statistical model in at least 90% of 100 replicates when the number of offspring per sire was 100. Application of the model to lactation milk yield in dairy cattle showed that genetic variance for micro- and macro-environmental sensitivities existed.
The algorithm and model selection criterion presented here can contribute to better understand genetic control of macro- and micro-environmental sensitivities. Designs or datasets should have at least 100 sires each with 100 offspring.
The term “genotype by environment (G × E) interaction” refers to the fact that the best genotype in one environment may not be the best genotype in another environment and that genotypes differ in their response to environmental factors, which means that genetic variance for environmental sensitivity or phenotypic plasticity exists. Some environmental factors (e.g., temperature, soil, diet, etc.) are identifiable and can be categorised (e.g., temperate or tropical climate) or quantified (e.g., temperature) and thus are referred to as macro-environmental factors. Other environmental factors are unknown and referred to as micro-environmental factors. Therefore, genetic variance in macro-environmental sensitivity is the genetic variance due to known environmental factors and can be expressed as the genetic variance in the slope of a reaction norm when environments can be quantified on a continuous scale. If environments are categorized, then phenotypes in different environments are considered as separate traits and the genetic covariances between environments are a measure of genetic variation in macro-environmental sensitivity. Genetic variance in micro-environmental sensitivity is the genetic variance due to unknown environmental factors and can be expressed as differences in environmental variance, sometimes called genetic heterogeneity of environmental variance.
Numerous studies in the last 70 years have studied G × E interactions or genetics of macro-environmental sensitivity in animal and plant breeding as well as in evolutionary biology[4–6]. Different modeling approaches have been used. In recent years, reaction norm models have been applied in animal breeding to better understand the environmental factors that determine G × E interactions[7–9]. In evolutionary genetics, many experiments on Drosophila and other laboratory species have been carried out to understand the genetics of macro-environmental sensitivity or phenotypic plasticity[6, 10, 11]. Other examples of studies on wild life populations include analyses of great tit and butterfly populations that showed the existence of genetic variation in phenotypic plasticity to temperature change.
The genetics of micro-environmental sensitivity or environmental variance have been studied less extensively than the genetics of macro-environmental sensitivity. In evolutionary genetics, several studies have focused on canalization, i.e. selection for reduced variance[14–16]. Recently, there has been renewed interest on this topic due to the development of methods to estimate genetic variance in environmental variance, e.g. Bayesian methods[17, 18] and double hierarchical generalized linear models (DHGLM) in a REML setting. Hill and Mulder reviewed 14 studies on this subject and concluded that there is empirical evidence for the existence of genetic variance in environmental variance.
Although there is substantial evidence for genetic variance in macro- and micro-environmental sensitivities, very few studies have studied them together or studied their genetic relationship. Jinks and Pooni postulated the concepts of macro- and micro-environmental sensitivity and studied both in Nicotiana rustica but did not investigate the relationships between these two types of environmental sensitivity. However, Perkins and Jinks reported that both types of environmental sensitivity were weakly genetically correlated for most traits in Nicotiana rustica. In a study on Populus, Wu showed that micro-environmental sensitivity was less heritable than macro-environmental sensitivity and that they were weakly genetically correlated. In another paper, also on Populus, Wu and O'Malley detected different sets of genes for macro- and micro-environmental sensitivities. In a mapping study of QTL for macro- and micro-environmental sensitivities in barley, Kraakman et al. identified only a few QTL for micro-environmental sensitivity, but did not detect any QTL for macro-environmental sensitivity. Yampolsky and Scheiner observed a weak genetic correlation between developmental noise, similar to micro-environmental sensitivity, and macro-environmental sensitivity in Daphnia Magna. In Drosophila melanogaster, both positive and negative genetic correlations between macro- and micro-environmental sensitivities have been reported[27, 28]. In farm animals, the genetic correlation between macro- and micro-environmental sensitivities has not been investigated. In addition, to date, no suitable method was available to estimate such correlations in outbred populations and the requirements in terms of number of families and family sizes of the designs necessary to estimate genetic parameters of macro- and micro-environmental sensitivities were unknown.
The objectives of this study were to extend the double hierarchical generalized linear model (DHGLM) of Rönnegård et al. with a reaction norm model to estimate genetic variance for macro- and micro-environmental sensitivities and to investigate bias and precision of estimated variance components in designs resembling dairy cattle populations. In addition, we evaluated Akaike’s information criterion (AIC) using approximated h-likelihood to select the best fitting model and studied situations in which true genetic and best fitting statistical models differed. Finally, we applied the model to lactation milk yield in dairy cattle.
In this section, we describe the assumed quantitative genetic model underlying genetic variance for macro- and micro-environmental sensitivities, the statistical model to estimate genetic variance in macro- and micro-environmental sensitivities, and the AIC based on an approximation of the h-likelihood that can be used to select the best fitting model. Finally, we present the simulation used to test the statistical model and describe the evaluated scenarios as well as an application to milk yield in dairy cattle.
Quantitative genetic model
Here, we assume that genetic variance for macro-environmental sensitivity is expressed as the genetic variation in the slope of a linear reaction norm. Genetic variance in micro-environmental sensitivity implies that genotypes respond differently to one or several unknown environmental factors and thus, we assume that it is expressed as genetic variance in environmental variance according to an exponential model. Thus, the quantitative genetic model underlying genetic variance in macro- and micro-environmental sensitivities can be formulated as:
where P is the phenotype, μ is the population mean for the phenotype, x is the environmental parameter (continuous or discrete) that cause genotypes to respond differently, A int , A sl and A v are the additive genetic or breeding values for the intercept, for the slope of the reaction norm (= macro-environmental sensitivity), and for the environmental variance (= micro-environmental sensitivity), respectively, is the environmental variance of the exponential model and ϵ is a scaled environmental deviation with variance one. Note that the average environmental variance is. The additive genetic values A int , A sl and A v have variance
The basis of the statistical model is the DHLGM as presented by Rönnegård et al.. The original DHGLM algorithm in Rönnegård et al. iterates between a linear mixed model for the phenotypic observations: y i = P i = phenotypic observation of animal i; and a Gamma GLM for the residual variance φ i , where is the squared estimated residual of y i and h i is the diagonal element of the hat-matrix of y corresponding to y i . Felleki et al. extended the model of by allowing estimation of the genetic covariance () between the genetic effect in the mean model and the genetic effect in the variance model (Equation (2)) in the absence of a linear reaction norm model and reported on computational details and a theoretical assessment of this algorithm. In brief, equivalent to using a Gamma GLM, linearizing around its current fitted value results in vector ψ and yields the following bivariate linear model:
where X(X v ) is the incidence matrix for fixed effects for y (ψ), b (b v ) is the vector with solutions for fixed effects for y(ψ), Z(Z v ) is the incidence matrix for relating observations of y(ψ) to the additive genetic values a (a v ) for phenotype (environmental variance), where A is the additive genetic or numerator relationship matrix and is the genetic variance of a(a v ). The residuals of y (e) and ψ (e v ) are assumed to be independent normally distributed, and, and are scaling variances, which are expected to be equal to 1, because W and W v already contain the reciprocals of the residual variances per observation. Estimating and allows more flexibility. An iterative estimation procedure is required to obtain estimates for all parameters because of the dependence of the model for y on the results of the model for ψ and vice versa. The vector ψ and the diagonals of W and W v are updated at each iteration until convergence. The model in Equation (2) can be considered as combining the DHGLM method with the iterative method of Mulder et al..
In order to consider macro- and micro-environmental sensitivity simultaneously, the model for phenotype y in Equation (2) was extended with a linear reaction norm model to estimate genetic variance in macro-environmental sensitivity, whereas the residual variance model using ψ estimates the genetic variance in micro-environmental sensitivity or environmental variance. Instead of using an animal model, we used a sire model because implementation of the animal model DHGLM resulted in severely biased estimates of variance components with single observations per animal. As an indication of poor model fit, implementation of the animal model DHGLM gave a lower adjusted profile h-likelihood (see next section “Model selection”) than an animal model without heterogeneity of residual variance. Sire models are commonly used in combination with reaction norm models[7, 8, 33]. In the reaction norm model, we assumed that the environmental parameter x (see Equation (1)) is known without error and does not need to be estimated from the data. The resulting combined macro–micro environmental sensitivity sire model can be formulated as:
where is a vector with linearized values of transformed squared residuals (see calculation in Appendix), Z s and Z sv are respectively the incidence matrices for the sire effects for the intercept of the reaction norm and for the environmental variance, Z x is the matrix with the environmental parameter x as a covariate for the sire effects for the slope of the reaction norm, and s int , s sl and s v are the vectors with the estimated sire effects for intercept, slope, and environmental variance. The sire effects s int , s sl , and s v are assumed trivariate normally distributed, assuming that sire (co)variances are a quarter of the additive genetic variance. The residuals of y (e s ) and ψ s (e sv ) are assumed to be independent normally distributed, because cov(e, e2) = 0, when e is normally distributed (), where and are scaling variances for the residual variances in a sire model. Computations of ψ s and of the diagonals of W s and W sv are different due to the use of a sire model compared to an animal model and are explained in the Appendix. The algorithm was implemented by iterating over several runs of ASReml. In each ASReml run, REML-estimates of the variance components were obtained for the current values of ψ s , W s and W sv . The vector ψ s and the diagonals of W s and W sv were updated after each ASReml run. On simulated data, 10 ASReml runs were sufficient to obtain converged parameter estimates. ASReml was run with an option to check whether the variance-covariance matrices were positive definite and were forced to be positive definite if they were not.
When working with real data, the true genetic model is not known, thus statistical inference can be used to find the statistical model that fits the data best. The h-likelihood concept can be applied when using DHGLM, in which the adjusted profile h-likelihood (APHL) is used to assess the significance of variance components. Because we use an iterative reweighted least square approximation in ASReml, the APHL can be approximated from the log REML likelihood (logL) for the bivariate model in Equation (3) fitted in ASReml, after correcting for the fact that the squared estimated residuals of y are used to compute the response variable ψ s (see Appendix for derivation):
where is the weight for the variance model for observation i, i.e. the ith diagonal of W sv . Note that Equation (4) can also be used for animal models by replacing elements and with the corresponding elements of Equation (2). To compare nested and non-nested models, we propose to use APHL in combination with Akaike’s information criterion (AIC):
where t is the number of variance parameters.
Monte Carlo simulation was used to evaluate the bias and precision of the estimated genetic parameters of the model and proposed algorithm. Populations of paternal half-sib families were simulated, resembling a simplified population structure in an animal breeding context with large half-sib offspring groups per sire like in dairy cattle. Either the number of sires (50, 100 or 200) was varied while keeping the number of offspring per sire at 100, or the number of offspring per sire (20, 50, 100 or 200) was varied while keeping the number of sires at 100. Sires and dams were assumed unrelated and were not selected. The offspring had phenotypes but the dams or sires did not. Phenotypes were generated for offspring according to the quantitative genetic model in Equation (1). Dams were randomly assigned to herds (herd size = 100 cows) and their offspring (daughters in the case of dairy cattle) were in the same herd. For each herd, the environmental covariate x was sampled from N(0,1). No fixed effects other than a general mean were simulated (μ = 0) but offspring were randomly allocated to contemporary groups with size 10 to investigate the effect of fixed effects on the precision of genetic parameters. Additive genetic effects were sampled from N(0, G ⊗ A). Different values for genetic variances and genetic correlations were used for simulation. Table 1 gives an overview of the parameter values used in the simulation, both the default and alternative values. In each scenario, only one parameter was varied at a time, i.e. other parameters were kept at their default values. For each set of parameters, 100 replicates were generated and means and standard deviations of estimates were calculated across replicates.
Scenarios in which true genetic and statistical models differ
First, we tested situations in which the true genetic and statistical models contained both types of environmental sensitivities for different sets of genetic parameters and designs. In addition, we investigated the power of AIC to select the correct model given the true genetic model. Three situations were simulated with default parameters: presence of genetic variance in (i) macro- or (ii) micro-environmental sensitivity or in (iii) both. Four statistical models were applied to these simulated situations: a combined macro–micro environmental sensitivity model “macro–micro”, a macro-environmental sensitivity model “macro”, a micro-environmental sensitivity model “micro” and a simple model “simple” with only one additive genetic effect for the phenotype. The “macro–micro” model and the “micro” model were run for 10 ASReml runs to update the weights and the transformed squared residuals, whereas the “macro” model and the “simple” model required only one ASReml run. AIC (Equation 5) was used to select the best fitting model.
Finally, we tested scenarios in which the statistical model deviated from the true genetic model used for simulation, i.e. when model selection failed to select the right model (Table 2). These scenarios were analyzed to explore whether genetic parameters are biased when the statistical model deviates from the true genetic model and whether macro-environmental sensitivity can be detected with a micro-environmental sensitivity model or vice versa. These scenarios were run with the default set of parameters as listed in Table 1 but with various values of and. Scenarios for which the true genetic and statistical models were the same and included either macro or micro-environmental sensitivity were not analyzed since these have been previously investigated, i.e. see for macro-environmental sensitivity and for micro-environmental sensitivity.
Application to milk yield in dairy cattle
The “macro–micro”, “macro”, “micro” and “simple” models were applied to 305-day lactation milk yield of 142 565 first parity Swedish Holstein cows. Lactation yields were calculated based on test-day data, as described in using the test-day interval method. Seasons were defined as three-month periods (January-March, April-June, July-September, October-December). Herd-year-season (HYS) classes with less than five cows were excluded. Data on a total of 762 sires with on average 187 daughters were available and 213 of these sires had more than 100 daughters. In the model for phenotype (y), HYS was a fixed effect and in the model for residual variance (ψ s ), HYS was a random effect to allow regression to the mean with few observations per HYS and avoid extreme HYS estimates. The model for y included a fixed fourth order polynomial for age at calving and a fixed ninth-order polynomial for lactation length, while the model for ψ s included a third and a sixth order polynomial for age at calving and lactation length, respectively. The orders of polynomials were determined using a Wald test in univariate models for y and ψ s and were kept the same for “macro–micro”, “macro”, “micro” and “simple” models. Herd-year average milk yield standardized to a mean of zero and a variance of one was used as an environmental parameter for macro-environmental sensitivity. The “macro–micro”, “macro”, “micro” and “simple” models were all run for 50 ASReml runs. After 50 runs, the change in variance components was less than 0.2% of the change between consecutive runs and therefore parameters were considered converged.
Variance components and effect of fixed effects
Estimated genetic variances were close to their true values when simulated genetic correlations were zero, although occasionally the average value deviated slightly from the true value (Table 3). The average of the estimated genetic variance in slope was biased upwards in all cases, but the bias was small (between +2 and +13%). Surprisingly, standard deviations of relative to were small (between 21 and 30% of). Note that standard deviations across replicates represent standard errors of estimates from a single replicate, e.g. an analysis of one data set. The average of the estimated genetic variance in environmental variance had on average larger deviations from the true value but no direction in bias could be detected. Standard deviations of relative to were large (between 31 and 74%), indicating that this parameter had a low precision. In the presence of fixed effects (i.e. contemporary group) for the mean and residual variance, standard deviations of estimated parameters increased on average by 9% and the number of replicates for which the genetic variance-covariance structure was forced to be positive definite also increased. Thus, the conclusion is that estimates of genetic variances are practically unbiased but that the precision of is low.
Effect of genetic correlations
In general, estimates of genetic variances remained close to their true values when genetic correlations were varied (<10% difference from the true value) (Table 4). When all genetic correlations were equal to 0.5, the genetic variance of environmental variance was slightly underestimated (−11%). Estimates of genetic correlations were unbiased but had very high standard deviations, especially the genetic correlation between the breeding value for slope and environmental variance (). The number of replicates in which the variance-covariance structure was forced to be positive definite was in general small but slightly higher when all genetic correlations were equal to 0.5. In addition to the results shown, we simulated an extreme negative genetic correlation (), as well as an extreme positive genetic correlation (). In these scenarios, for 40 to 60% of the replicates, the variance-covariance matrix was bended to be positive definite. Due to the large proportion of replicates with bended variance-covariance matrices, the average genetic correlation was biased towards zero. Nevertheless, the general conclusion is that estimates of genetic correlations are largely unbiased but are estimated with low precision.
Effect of different designs
When varying the number of offspring per sire or the number of sire families, means of genetic variances were close to their true simulated values in most cases, except when the number of offspring per sire was 20 for which estimates for and were biased upwards (25% and 20%, respectively) (Table 5). In this case, 51% of the replicates had variance-covariance matrices that were forced to be positive definite, indicating that this is not a suitable design to estimate genetic parameters for macro- and micro-environmental sensitivities. When increasing the number of offspring per sire, the standard deviations of estimates of and decreased substantially, e.g. from 50 to 200 offspring with 54 and 60%, respectively, whereas the standard deviation of estimates of decreased only slightly (−10%). When increasing the number of sire families from 50 to 200, standard deviations of estimates of all genetic parameters decreased substantially (from −52 to −58%). Based on these results, we conclude that designs with at least 100 sires with 100 offspring each are required in order to estimate variance components for macro- and micro-environmental sensitivities with low standard errors.
Using AIC to select the best fitting model, the true genetic model was selected as the best statistical model in at least 90% of the 100 replicates when the number of offspring per sire was 100 (Table 6). In only a few cases did the best statistical model differ from the true genetic model. However, when the number of offspring per sire was 50, the correct model was selected for 63% to 95% of the replicates. When both macro- and micro-environmental sensitivity existed and the number of offspring per sire was 50, the macro model was selected rather than the macro–micro model in 27% of the replicates. When only micro-environmental sensitivity existed and 50 offspring per sire, the power to select the correct model was reduced. In conclusion, the power to select the correct model was high in designs with at least 100 offspring per sire.
Scenarios in which true genetic and statistical models differ
It is of interest to determine whether genetic parameters are biased if the wrong statistical model is applied. When either macro or micro-environmental sensitivity was simulated (Table 7), the variance-covariance structures were forced to be positive definite in many replicates, due to estimates of variances at the boundary. Estimates at the boundary were expected since the estimated genetic variance should be close to zero when the true genetic variance is zero. Forcing the genetic variance-covariance structures to be positive definite can bias estimates of parameters. When true macro-environmental sensitivity existed, some micro-environmental sensitivity was detected () with the “micro” (scenario A) and the “macro–micro” (scenario B) models, indicating that part of the macro-environmental sensitivity was captured as micro-environmental sensitivity, but were small (0.010-0.018). When true micro-environmental sensitivity existed, no macro-environmental sensitivity was detected () with the “macro” (scenario C) and “macro–micro” (scenario D) models, but were biased downwards (between −16% to −28%), possibly because of the high number of genetic variance-covariance structures that were forced to be positive definite. When both macro and micro-environmental sensitivity existed, the “macro” (scenario E) and “micro” (scenario F) models gave unbiased estimates of either macro-environmental sensitivity or micro-environmental sensitivity and the genetic variance-covariance structure was forced to be positive definite for at maximum three replicates. Thus, the general conclusion is that a discrepancy between the true and statistical models does not lead to large biases in estimated genetic parameters and that contamination between estimates of the two types of environmental sensitivity is very limited.
Application to milk yield in dairy cattle
The “macro–micro”, “macro”, “micro” and “simple” models were applied to 305-day first lactation milk yield data of Swedish Holsteins (mean = 8693 kg, standard deviation = 1652 kg, skew = 0.18, kurtosis = 0.28). “Macro”, “micro” and “macro–micro” models fitted significantly better than the “simple” model. The “micro” model was favoured by AIC (Table 8) and had the best fit. The genetic variance for micro-environmental sensitivity was substantial but lower than for most reported traits: one genetic standard deviation changed micro-environmental sensitivity (= environmental variance) by 21%. The difference in AIC between the “macro–micro” and “micro” models was small and therefore it was interesting to examine the genetic parameters of the “macro–micro” model. The estimated genetic variance for macro-environmental sensitivity was small in comparison to the genetic variance in intercept. For instance, the estimate of the genetic correlation between environments that were −2 and 2 standard deviations from the overall mean was 0.92, indicating a small level of reranking of sires across the environmental gradient. Estimates of genetic correlations between intercept and macro- and micro-environmental sensitivities were 0.81 and 0.63, respectively, indicating that selection for higher milk yield increases both types of environmental sensitivity. The estimate of the genetic correlation between macro- and micro-environmental sensitivities was 0.76, indicating that they are genetically similar. Standard errors of parameter estimates were smaller than the standard deviations found for the simulations, due to the larger dataset i.e. more sires and more offspring per sire. Thus, macro- and micro-environmental sensitivities may exist for milk yield in dairy cattle and are positively correlated.
Model and design
In this study, we developed a model to estimate genetic variances for macro- and micro-environmental sensitivities. The model is an extension of the DHGLM as presented by Rönnegård et al.. Here, we combined a linear reaction norm model to estimate genetic variance for macro-environmental sensitivity with the DHGLM to estimate genetic variance for micro-environmental sensitivity. The animal model in Equation (2) was adapted to a sire model because the animal model produced highly biased estimated variance components because of the high dependence of the estimated breeding values and residuals on the variance ratio used in the mixed model equations, which would differ for each animal. Felleki et al. also reported the presence of bias in variance components with few repeated observations per animal but the bias decreased as the number of repeated observations per animal increased. Furthermore, an animal model with heterogeneous residual variance (DHGLM implementation) gave a poorer adjusted profile h-likelihood than an animal model without homogenous residual variance, which indicates that the former did not produce a better fit than the latter in a scenario that included both macro and micro-environmental sensitivities. Therefore, we decided to use a sire model implementation, because it is more robust than the animal model implementation with DHGLM when animals only have a single observation. Furthermore, sire models are commonly used for reaction norm models, because of their substantially lower computational burden compared to animal models and genetic information about environmental sensitivity typically comes from paternal half-sibs that perform in different environments. As far as we know, this is the first time that a model to estimate genetic variance in macro- and micro-environmental sensitivities suitable for outbred animal populations is presented.
Monte Carlo simulation was used to evaluate bias and precision of estimated genetic parameters. Genetic parameters were unbiased in most situations. The precision was not very high, especially of estimates for, as indicated by the high standard deviation of estimates across replicates, particularly in designs with small families. Designs with at least 100 sire families, each with at least 100 offspring, are required to have sufficient precision. Presence of fixed effects, such as contemporary group effects, would increase the required number of sire families and the number of offspring per sire family. These results are in agreement with standard error and power calculations reported by Hill, Mulder and Hill and Mulder with respect to estimation of genetic variance for micro-environmental sensitivity or for environmental variance. For instance, Hill and Mulder derived that the optimal family size to estimate genetic variance for environmental variance with a family design is approximately 2/γ2, where γ2 is the square of the coefficient of variation of the within-family variance. Thus, the optimal family size for half-sibs is 137 when. The only study providing Monte Carlo results for the DHGLM to estimate genetic variance for micro-environmental sensitivity is by Rönnegård et al.. They considered a design with clones, which is more powerful and leads to lower standard deviations of estimates than those in our study.
With respect to macro-environmental sensitivity, the magnitude of the standard deviations of the estimates across replicates was similar to that reported by Calus et al., but larger than those reported by Lillehammer et al., which is explained by the fact that the latter authors simulated more sire families, i.e., 1000 sire families with 100 offspring each. In general, the number of required offspring per family is lower for macro-environmental sensitivity than for micro-environmental sensitivity, as indicated by the lower standard deviations of estimates of genetic variance for the former. This is in agreement with a previous study by Mulder, which showed that the power to detect G x E interactions between two environments is greater than the power to detect genetic variance for environmental variance or micro-environmental sensitivity. In the present study, we assumed that the environmental parameter x used for the reaction norm was known without error. This will be true in some cases, e.g. when using temperature or rainfall or other herd characteristics. In other cases, an estimated herd mean is used as the environmental parameter, which is estimated from the data[7, 8, 33]. Calus et al. showed that genetic variance in macro-environmental sensitivity () was severely underestimated when the environmental parameter was estimated from the data. This may have also led to the genetic variance in macro-environmental sensitivity to be underestimated in our application to milk yield. Su et al. reported that a Bayesian approach that estimates simultaneously the herd mean and the reaction norm parameters leads to unbiased estimates of.
Lillehammer et al. showed that sire models gave upward biased estimates of when heterogeneity of residual variance was ignored, because the unexplained genetic variance in the reaction norm parameters becomes part of the residual variance when using a sire model. In our study, the bias in estimates of was smaller than in Lillehammer et al.. Lillehammer et al. proposed including a dummy animal effect in the model to account for the residual three-quarters of the genetic variance that is not accounted for by the sire effect. This solution was also tested in our model but gave severely biased variance components because of the high dependency of estimated breeding values and residuals on variance ratios that were used in the mixed model equations, which differ by animal when considering heterogeneity of residual variance.
The algorithm developed in this study allowed estimating genetic correlations between the different genetic effects. Standard deviations of estimated genetic correlations were large, especially those of the genetic correlation between macro- and micro-environmental sensitivities (). This large magnitude of the standard deviation of estimates of the genetic correlation was expected considering that the genetic correlation between macro- and micro-environmental sensitivities is mainly based on paternal half-sib information and that both traits have low heritability. Using the equation in Robertson and assuming a heritability of 0.05 for both traits, the standard error is approximately 0.26 when the true genetic correlation between the traits is zero, which is close to the value found here, i.e. 0.25 (Table 4). To increase the precision of estimates of the genetic correlation between macro- and micro-environmental sensitivities, designs with a larger number of families and larger family sizes are required. The application to milk yield data in dairy cattle shows that it should be possible to estimate genetic correlations with standard errors between 0.06 and 0.10, since in most countries datasets with at least 100 bulls each with 100 daughters are easily obtained. Full-sib families or clones would also reduce standard errors of estimates of genetic correlations in comparison to half-sibs.
Here, we showed that the adjusted profile h-likelihood (APHL) can be approximated from REML-output and used in combination with AIC to provide an efficient model selection tool. In addition, we showed that biases in genetic parameters were relatively small when statistical and true models differed. Both results are re-assuring that these models can discriminate between macro- and micro-environmental sensitivities and that the true model of environmental sensitivity can be elucidated using AIC. We also found that the Bayesian information criterion (BIC) was too conservative and favoured the simpler model too often (results not shown). AIC has the advantage that it can be used independent of the order of the fitted models, whereas the likelihood ratio test requires a hierarchical structure such as in a forward selection scheme.
Improving biological understanding of environmental sensitivity
The proposed model can contribute to better understand the genetic architecture of environmental sensitivity, e.g. whether macro- and micro-environmental sensitivities are genetically related. Most studies in plants and laboratory species indicate that macro- and micro-environmental sensitivities are weakly correlated[22–24, 26–28]. This seems to indicate that selection on one type of environmental sensitivity will hardly affect the other. The first application of the model on milk yield data in dairy cattle, revealed a high genetic correlation between macro- and micro-environmental sensitivities (0.76), suggesting that selection on one type of environmental sensitivity will also affect the other in the same direction. Generally, little is known about these relationships in livestock. Knowledge about these genetic correlations could be used to optimize selection strategies for environmental sensitivity.
In this study, we assumed a linear reaction norm model, but reaction norms can also be non-linear. The model presented here can easily be extended to higher-order polynomials. Furthermore, the genetic basis of micro-environmental sensitivity may not be the same along an environmental gradient and the model for residual variance or micro-environmental sensitivity in Equation (3) could be extended to contain a reaction norm with a known environmental gradient. For instance, in stressful environments, there might be more genetic variance for micro-environmental sensitivity than in less stressful environments. Furthermore, G × E interactions often exist between categorical environments and are often analysed with character state or multivariate models. Character state models do not explicitly estimate breeding values for macro-environmental sensitivity, but these breeding values could be back-calculated by using covariance functions when the environmental parameter responsible for G × E interactions is identified because reaction norm models and character state models are interchangeable. In the case of two environments, a reaction norm model with a dummy environmental parameter with values 0 and 1 would yield results that are identical to a bivariate character state model. Multivariate versions of the DHGLM or reaction norm models with dummy environmental variables could be used to simultaneously investigate macro- and micro-environmental sensitivities when environments are discrete.
Application to breeding
Taking macro- and micro-environmental sensitivities into consideration is highly relevant in animal breeding. Due to the high level of globalisation in animal breeding programs, it is necessary to breed animals that can perform well in a wide range of environments. Therefore, it may be important to select animals that have limited environmental sensitivity, especially for environments with a higher risk of environmental disturbances. Reduction in environmental sensitivity increases the predictability of performance and reduces risk for farmers. Furthermore, reduction in micro-environmental sensitivity will increase the uniformity of animal products, which is a general goal. In plant breeding, application of a model for macro- and micro-environmental sensitivities is also highly relevant since G × E interactions are generally very strong and uniformity of crops is very important. Recent papers by Ordas et al., Makumburage and Stapleton and Kliebenstein show that there is an interest for increased uniformity in plants. Economic values could be derived for micro-environmental sensitivity. The economic value of macro-environmental sensitivity can be determined as a function of the importance of environments along the environmental gradient. Based on Mulder et al.[3, 47], progeny testing schemes are more efficient than sib testing schemes to reduce micro-environmental sensitivity since it behaves as a trait with a small heritability. Genomic selection could be an alternative selection strategy with sufficient accuracy and shorter generation intervals.
In this study, a model was developed to estimate genetic parameters of macro- and micro-environmental sensitivities, combining a reaction norm model with a double hierarchical generalized linear model within a REML framework. Simulations showed that the genetic parameters obtained were mostly unbiased, but designs with at least 100 sires, each with 100 half-sib offspring, were required to estimate genetic parameters with sufficient precision. Using AIC, the true genetic model was selected as the best statistical model in at least 90% of replicates when the number of offspring per sire was 100. Application of the model to milk yield data in dairy cattle showed that both types of environmental sensitivity existed. Our model and AIC based on h-likelihood can be used to increase our understanding of the genetic control of environmental sensitivity in livestock populations but more research is needed to test the model in a wider range of situations.
DHGLM algorithm for a sire model
The original DHGLM algorithm of Rönnegård et al. was developed for an animal model. Here we describe the estimation algorithm for the sire model used in the current paper, including a few adjustments of the algorithm in Rönnegård et al. to correct for the fact that the residual variance in a sire model (without permanent environmental effects, e.g. with animals with a single observation) contains three quarters of the additive genetic variance in addition to the environmental variance. The adjustments in the algorithm are as follows: adjustment of the squared residuals (yv) accounting for the fact that the residual variance in a sire model includes three quarters of the additive genetic variance, use of average residual variance to calculate ψ s (instead of predicted individual values), and computations of the diagonal weight matrices W s and W sv . These adjustments resulted in a computationally robust algorithm with small or no bias (as presented in Results).
To compute the linearized response ψ s for the bivariate sire model in Equation (3), first we calculate as:
This is equivalent to the calculations for the response yv in, except for the multiplication by the ratio of the average estimated residual variance in a sire model (, whereand n is the total number of records) and the average residual variance in an animal model, which is calculated as, (). Because we use a log link function, is linearized as:
The diagonals of Ws are the reciprocals of (i.e.), which is the vector of predicted residual variances for each observation based on the previous iteration and is calculated as:
The sire effects for environmental variance only affect the part of the residual variance which is truly environmental variance and therefore three-quarters of the additive genetic variance is subtracted in the multiplicative part of Equation (8). The diagonals of W sv are the reciprocals of the residual variance of ψ s and were calculated as:
since the true residuals are assumed normal and the squared true residuals are therefore Gamma distributed with a variance of.
The algorithm can be summarized as:
Run model on y in Equation (3) with homogeneous residual variance.
Calculate ψ s , W s , W sv , where in iteration 1.
Run bivariate model in Equation (3).
Update ψ s , W s , W sv
Iterate steps 3 till 4 until convergence
Approximation of the adjusted profile h-likelihood
In our paper, the model selection was based on an approximation of the adjusted profile h-likelihood (APHL), which is defined as:
where H is the Hessian of the h-likelihood and τ is the vector of all fixed and random effects both in the mean and variance parts of the model. For the model in Equation (3), minus two times the h-likelihood (− 2h) is:
where l is log-likelihood, (, with being the ith diagonal of W s ) is the residual variance of observation i, is the estimated squared residual of observation i, is the covariance matrix of all random effects (), s is a vector of all random sire effects () and is the weight for the mean model for observation i. The minus two log REML likelihood (logL) from the bivariate model in Equation (3) is:
where the first n residual variances come from the first part of the bivariate model (y) and the next k residual variances come from the second part of the bivariate model (ψ s ) (k = n = number of records), and C is the Hessian of the bivariate model (i.e. left-hand-side of the mixed model equations). Because log (det(C)) is a reasonable approximation of log (det (H)), we can approximate APHL as given in Equation (4) in the main text:
Thus, the REML likelihood for the bivariate model is corrected for the fact that the squared residuals are used as “observations” in the bivariate model. Note that equation (14) can also be used for animal models by replacing the elements and with the corresponding elements of Equation (2) or for models with more or fewer random effects in the mean and variance model.
Falconer DS, Mackay TFC: Introduction to quantitative genetics. 1996, Essex: Pearson Education Limited, 4
De Jong G, Bijma P: Selection and phenotypic plasticity in evolutionary biology and animal breeding. Livest Prod Sci. 2002, 78: 195-214. 10.1016/S0301-6226(02)00096-9.
Mulder HA, Bijma P, Hill WG: Prediction of breeding values and selection responses with genetic heterogeneity of environmental variance. Genetics. 2007, 175: 1895-1910. 10.1534/genetics.106.063743.
Meyer K: Factor-analytic models for genotype x environment type problems and structured covariance matrices. Genet Sel Evol. 2009, 41: 21-10.1186/1297-9686-41-21.
Piepho HP, Mohring J, Melchinger AE, Buchse A: BLUP for phenotypic selection in plant breeding and variety testing. Euphytica. 2008, 161: 209-228. 10.1007/s10681-007-9449-8.
Scheiner SM: Selection experiments and the study of phenotypic plasticity. J Evol Biol. 2002, 15: 889-898. 10.1046/j.1420-9101.2002.00468.x.
Kolmodin R, Strandberg E, Madsen P, Jensen J, Jorjani H: Genotype by environment interaction in Nordic dairy cattle studied using reaction norms. Acta Agric Scand A. 2002, 52: 11-24.
Calus MPL, Veerkamp RF: Estimation of environmental sensitivity of genetic merit for milk production traits using a random regression model. J Dairy Sci. 2003, 86: 3756-3764. 10.3168/jds.S0022-0302(03)73982-4.
Bergsma R, Hermesch S: Exploring breeding opportunities for reduced thermal sensitivity of feed intake in the lactating sow. J Anim Sci. 2012, 90: 85-98. 10.2527/jas.2011-4021.
Bradshaw AD: Evolutionary significance of phenotypic plasticity in plants. Adv Genet. 1965, 13: 115-155.
Scheiner SM: Genetics and evolution of phenotypic plasticity. Annu Rev Ecol Syst. 1993, 24: 35-68. 10.1146/annurev.es.24.110193.000343.
Nussey DH, Postma E, Gienapp P, Visser ME: Selection on heritable phenotypic plasticity in a wild bird population. Science. 2005, 310: 304-306. 10.1126/science.1117004.
Windig JJ: Reaction norms and the genetic basis of phenotypic plasticity in the wing pattern of the butterfly Bicyclus anynana. J Evol Biol. 1994, 7: 665-695. 10.1046/j.1420-9101.1994.7060665.x.
Waddington CH: Canalization of development and the inheritance of acquired characters. Nature. 1942, 150: 563-565. 10.1038/150563a0.
Flatt T: The evolutionary genetics of canalization. Q Rev Biol. 2005, 80: 287-316. 10.1086/432265.
Debat V, David P: Mapping phenotypes: canalization, plasticity and developmental stability. Trends Ecol Evol. 2001, 16: 555-561. 10.1016/S0169-5347(01)02266-2.
Sorensen D, Waagepetersen R: Normal linear models with genetically structured residual variance heterogeneity: a case study. Genet Res. 2003, 82: 207-222. 10.1017/S0016672303006426.
Ibanez-Escriche N, Sorensen D, Waagepetersen R, Blasco A: Selection for environmental variation: a statistical analysis and power calculations to detect reponse. Genetics. 2008, 180: 2209-2226. 10.1534/genetics.108.091678.
Rönnegård L, Felleki M, Fikse F, Mulder HA, Strandberg E: Genetic heterogeneity of residual variance: estimation of variance components using double hierarchical generalized linear models. Genet Sel Evol. 2010, 42: 8-10.1186/1297-9686-42-8.
Hill WG, Mulder HA: Genetic analysis of environmental variation. Genet Res. 2010, 92: 381-395. 10.1017/S0016672310000546.
Jinks JL, Pooni HS: The genetic basis of environmental sensitivity. Proceedings of the Second International Conference on Quantitative Genetics: 31 May - 5 June 1987; Raleigh. Edited by: Weir BS, Eisen EJ, Goodman MM, Namkoong G. 1988, North Carolina State University, 505-522.
Perkins JM, Jinks JL: The assessment and specificity of environmental and genotype-environmental components of variability. Heredity. 1973, 30: 111-126. 10.1038/hdy.1973.16.
Wu RL: Genetic control of macro- and micro-environmental sensitivities in Populus. Theor Appl Genet. 1997, 94: 104-114. 10.1007/s001220050388.
Wu RL, O'Malley DM: Nonlinear genotypic response to macro- and micro-environments. Theor Appl Genet. 1998, 96: 669-675. 10.1007/s001220050787.
Kraakman ATW, Niks RE, Van den Berg PMMM, Stam P, Van Eeuwijk FA: Linkage disequilibrium mapping of yield and yield stability in modern spring barley cultivars. Genetics. 2004, 168: 435-446. 10.1534/genetics.104.026831.
Yampolsky LY, Scheiner SM: Developmental noise, phenotypic plasticity, and allozyme heterozygosity in Daphnia. Evolution. 1994, 48: 1715-1722. 10.2307/2410259.
Weber SL, Scheiner SM: The genetics of phenotypic plasticity. IV. Chromosomal localization. J Evol Biol. 1992, 5: 109-120. 10.1046/j.1420-9101.1992.5010109.x.
Scheiner SM, Caplan RL, Lyman RF: The genetics of phenotypic plasticity. III. Genetic correlations and fluctuating asymmetries. J Evol Biol. 1991, 4: 51-68. 10.1046/j.1420-9101.1991.4010051.x.
SanCristobal-Gaudy M, Elsen JM, Bodin L, Chevalet C: Prediction of the response to a selection for canalisation of a continuous trait in animal breeding. Genet Sel Evol. 1998, 30: 423-451. 10.1186/1297-9686-30-5-423.
Hoaglin DC, Welsh RE: The hat matrix in regression and ANOVA. Am Stat. 1978, 32: 17-22.
Felleki M, Lee D, Lee Y, Gilmour AR, Ronnegard L: Estimation of breeding values for mean and dispersion, their variance and correlation using double hierarchical generalized linear models. Genet Res. 2012, 94: 307-317. 10.1017/S0016672312000766.
Mulder HA, Hill WG, Vereijken A, Veerkamp RF: Estimation of genetic variation in residual variance in female and male broiler chickens. Animal. 2009, 3: 1673-1680. 10.1017/S1751731109990668.
Windig JJ, Mulder HA, Bohthe-Wilhelmus DI, Veerkamp RF: Simultaneous estimation of genotype by environment interaction accounting for discrete and continuous environmental descriptors in Irish dairy cattle. J Dairy Sci. 2011, 94: 3137-3147. 10.3168/jds.2010-3725.
Gilmour AR, Gogel BJ, Cullis BR, Thompson R: ASReml User Guide Release 2.0. 2006, VSN International Ltd: Hemel Hempstead
Lee Y, Nelder JA: Double hierarchical generalized linear models. J Roy Stat Soc C. 2006, 55: 139-167. 10.1111/j.1467-9876.2006.00538.x.
Calus MPL, Bijma P, Veerkamp RF: Effects of data structure on the estimation of covariance functions to describe genotype by environment interactions in a reaction norm model. Genet Sel Evol. 2004, 36: 489-507. 10.1186/1297-9686-36-5-489.
Rönnegård L, Felleki M, Fikse WF, Mulder HA, Strandberg E: Variance component and breeding value estimation for genetic heterogeneity of residual variance in Swedish Holstein dairy cattle. J Dairy Sci. 2013, 96: 2627-2636. 10.3168/jds.2012-6198.
ICAR: International agreement of recording practices. 2012,http://www.icar.org/pages/recording_guidelines.htm,
Hill WG: Heterogeneity of genetic and environmental variance of quantitative traits. J Ind Soc Agric Stat. 2004, 57: 49-63.
Mulder HA: Methods to optimize livestock breeding programs with genotype by environment interaction and genetic heterogeneity of environmental variance. PhD Thesis. 2007, Wageningen University, Animal Breeding and Genetics Group
Lillehammer M, Odegard J, Meuwissen THE: Reducing the bias of estimates genotype by environment interactions in random regression sire models. Genet Sel Evol. 2009, 41: 30-10.1186/1297-9686-41-30.
Windig JJ, Calus MPL, Veerkamp RF: Influence of herd environment on health and fertility and their relationship with milk production. J Dairy Sci. 2005, 88: 335-347. 10.3168/jds.S0022-0302(05)72693-X.
Su G, Madsen P, Lund MS, Sorensen D, Korsgaard IR, Jensen J: Bayesian analysis of the linear reaction norm model with unknown covariates. J Anim Sci. 2006, 84: 1651-1657. 10.2527/jas.2005-517.
Robertson A: The sampling variance of the genetic correlation coefficient. Biometrics. 1959, 15: 469-485. 10.2307/2527750.
Burnham KP, Anderson DR: Model selection and multi-model inference: a practical information-theoretic approach. 2002, New York: Springer
Eskridge KM, Johnson BE: Expected utility maximization and selection of stable plant cultivars. Theor Appl Genet. 1991, 81: 825-832.
Mulder HA, Bijma P, Hill WG: Selection for uniformity in livestock by exploiting genetic heterogeneity of residual variance. Genet Sel Evol. 2008, 40: 37-59.
Ordas B, Malvar RA, Hill WG: Genetic variation and quantitative trait loci associated with developmental stability and the environmental correlation between traits in maize. Genet Res. 2008, 90: 385-395. 10.1017/S0016672308009762.
Makumburage GB, Stapleton AE: Phenotype uniformity in combined-stress environments has a different genetic architecture than in single-stress treatments. Front Plant Sci. 2011, 2: 12-
Kliebenstein DJ: The quantitative genetics of phenotypic error or uniformity. Front Genet. 2011, 2: 59-
Lee Y, Nelder JA, Pawitan Y: Generalized linear models with random effects: unified analysis via H-likelihood. 2006, Chapman & Hall/CRC
We would like to thank Jack Windig, Majbritt Felleki, Jessica Franzén, Jorge Urioste and Susan Wijga for useful discussions about this research. This project was financed by the RobustMilk project, which is financially supported by the European Commission under the Seventh Research Framework Programme, Grant Agreement KBBE-211708. The content of this paper is the sole responsibility of the authors, and it does not necessarily represent the views of the Commission or its services.
The authors declare that they have no competing interests.
All authors were involved in designing the study and revising the manuscript. HAM developed the model, performed the simulations, analysed the data and drafted the manuscript. LR developed the adjusted profile h-likelihood. All authors read and approved the final manuscript.
Authors’ original submitted files for images
Below are the links to the authors’ original submitted files for images.
About this article
Cite this article
Mulder, H.A., Rönnegård, L., Fikse, W.F. et al. Estimation of genetic variance for macro- and micro-environmental sensitivity using double hierarchical generalized linear models. Genet Sel Evol 45, 23 (2013). https://doi.org/10.1186/1297-9686-45-23
- Genetic Correlation
- Residual Variance
- Dairy Cattle
- Genetic Parameter
- Sire Model