Modelling the growth curve of Maine-Anjou beef cattle using heteroskedastic random coefficients models

A heteroskedastic random coefficients model was described for analyzing weight performances of Maine-Anjou beef cattle between the 100th and the 650th days of age. This model combined fixed effects, random linear regressions and heterogeneous variance components. The objective of this study was to analyze the difference in growth curves between animals born as twin and as single bull calves. The method was based on log-linear models for the residual and individual variances, expressed as functions of explanatory variables. An expectation-maximization (EM) algorithm was proposed for calculating restricted maximum likelihood (REML) estimates of the residual and individual components of variances and covariances. Likelihood ratio tests were used to assess hypotheses about the parameters of this model. The growth of Maine-Anjou cattle was described by a third order regression on age for the mean growth curve, two correlated random effects for the individual variability, and independent errors. Three sources of heterogeneity of residual variances were detected. The difference in weight performance between bulls born as singles and as twins was estimated to be about 15 kg over the growth period considered.


INTRODUCTION
The weight performances of animals, recorded repeatedly during their lives, are a typical example of longitudinal data, where the trait of interest changes gradually but continually over time. Until recently in quantitative genetics, such records were frequently analysed by fitting a so-called "repeatability model", i.e. by assuming that all records were repeated measurements of a single trait with constant variances. Other approaches have been (i) to subdivide, somewhat arbitrarily, the range of ages and consider the individual segments to represent different traits in a multivariate analysis, or (ii) to fit a standard growth curve to the records and analyse the parameters of the growth curve as new traits.
Recently, there has been great interest in random coefficient models [22] for the analysis of such data. These models use polynomials in time to describe mean profiles, with random coefficients generating a correlation structure among the repeated observations on each individual. Instead of considering only the overall growth curve, a separate growth curve is assumed for each individual. Such models have by and large been ignored in animal breeding applications so far, although they are common in other areas (see, for example, [22] for a general exposition). Repeated measurements on the same animal are more closely correlated than two measurements on different animals, and the correlation between repeated measurements may decrease as the time between them increases. Therefore, the statistical analysis of repeated measures data must address the issue of covariation between measures on the same unit. Modeling the covariance structure of repeated measurements correctly is important for drawing correct inferences from such data [5]. The main advantages of longitudinal studies are increased power and robustness to model selection [6]. In animal genetics, random regressions in a linear mixed model context have been considered by Schaeffer and Dekkers [36]. Moreover, the recently developed SAS procedure PROC MIXED has greatly increased the popularity of linear mixed models [40].
In quantitative genetics and animal breeding, heteroskedasticity has recently generated much interest. In fact, the assumption of homogeneous variances in linear mixed models may not always be appropriate. There is now a large amount of experimental evidence of heterogeneous variances for most important livestock production traits [14,33,43,44]. Major theoretical and applied work has been carried out for estimating and testing sources of heterogeneous variances arising in univariate mixed models [4,9,11,12,15,30,31,34,45].
In this paper, we extend the random regression model to a more general class of models termed heteroskedastic random regression models. This class of models allows all variances of the random effects to be heterogeneous. Inference is based on likelihood procedures (REML, restricted maximum likelihood, [29]) and on estimating equations derived from expectation-maximization (EM, [2]) theory, more precisely the expectation/conditional maximization (ECM) algorithm recently introduced by Meng and Rubin [23].
The selection of a global model requires the choice of the fixed effects (the model on the phenotypic mean vector, E) and the choice of the random effects (the model on the variance-covariance matrix, V). This choice is complex because the choice of fixed effects depends on the variance-covariance structure of the observations, and in particular on the number of random effects included in the model. In practice, the strategy adopted is as follows: a structure for the variance-covariance matrix V is assumed and a model E is chosen (selection of significant fixed effects); subsequently, with the model E fixed, different structures for V are tested. An alternative approach consists of drawing inferences on the fixed effects using estimators that are robust with respect to the structure of V (the so-called "sandwich estimator", [21]). In this paper, the theory of the "sandwich estimator" is presented and used to select significant fixed effects.
These procedures are illustrated with an example on the growth performance of beef cattle. The aim of this study was to compare the growth curves of animals born as singles or as twins and to quantify the difference in weight at different ages. The data analyzed in this paper comprise 943 weight records of 127 animals of the Maine-Anjou breed and are presented in the section "Materials and methods". The methods section covers models, estimation procedures and tests of hypotheses. Then, the results of the beef cattle example are presented and discussed. The paper ends with concluding remarks on longitudinal data analysis via random coefficient models.

Data
All animals were raised at the experimental Inra herd of "La Grêleraie" (Mayenne, France). This herd is part of a research project aimed at increasing the rate of natural twin calvings in cattle. From an economic point of view, breeders are also interested in a comparison of the growth performance of bull calves born as twins or as singles. The data consisted of 943 weight records taken between 100 and 650 days of age on 127 Maine-Anjou bulls (103 animals born as singles and 24 born as twins), with on average 7 weight records per animal. The distribution of the number of records per animal and all the characteristics of the data set analysed are presented in Table I.
The animals were grouped by year of birth and calving season. For each performance of an animal, the weight, the age at weighing, the calving parity of the mother and the birth status (single vs. twin) were recorded. These variables are presented in Table I.

Models
In this data set, animals can differ both in the number of records and in time intervals between them. One of the frequently used approaches is the linear mixed effects model [19] in which the repeated measurements are modeled using a linear regression model, with parameters allowed to vary over individuals and therefore called random effects.

Models for data
To characterize the effect of twinning on the growth curve between days 100 and 650, a mixed linear model including random effects and heterogeneous variances was used. The classical random coefficient model involves a random intercept and slope for each subject. The model considered here combines random regression with heteroskedastic variances; it can be written as follows:

y_ijl = x'_ijl β + σ_u1i z_1ijl u*_1l + σ_u2i z_2ijl u*_2l + e_ijl,   (1)

where y_ijl is the jth (j = 1, . . . , n_il) measurement recorded on the lth (l = 1, . . . , q) individual at time t_jl in subclass i of the factor of heterogeneity (i = 1, . . . , p); x'_ijl β represents the systematic component, expressed as a linear combination of explanatory variables (x_ijl) with unknown linear coefficients (β); σ_u1i z_1ijl u*_1l + σ_u2i z_2ijl u*_2l represents the additive contribution of two random regression factors (u*_1l is the intercept effect and u*_2l is the slope effect) applied to covariable information (z_1ijl and z_2ijl) specific to each individual l; σ_u1i and σ_u2i are the corresponding components of variance pertaining to stratum i. The random effects u*_1l and u*_2l are correlated, and this correlation is assumed homogeneous over strata and equal to ρ. The e_ijl represent independent errors.
In matrix notation, the model can be expressed as:

y_i = X_i β + σ_u1i Z_1i u*_1 + σ_u2i Z_2i u*_2 + e_i,

where u*_1 = (u*_11, . . . , u*_1l, . . . , u*_1q)' is the vector of normally distributed standardized intercept values, N(0, I_q); u*_2 = (u*_21, . . . , u*_2l, . . . , u*_2q)' is the vector of normally distributed standardized slope effects, N(0, I_q); and e_i is the vector of normally distributed residuals for stratum i, N(0, I σ²_ei). The regression components (u*_1 and u*_2) and the environmental effects e_i are assumed to be independent. Then

Var(y_i) = σ²_u1i Z_1i Z'_1i + σ²_u2i Z_2i Z'_2i + ρ σ_u1i σ_u2i (Z_1i Z'_2i + Z_2i Z'_1i) + I σ²_ei,

with the correlation coefficient ρ defined as previously. It would have been possible to introduce additional levels of random coefficients without any difficulty, but for the sake of simplicity this example only considers two random regression components: the standard random intercept-slope model. However, the equations shown in the appendix apply to both the general (k = 1, . . . , K) and the particular (K = 2) cases.
More generally, a heteroskedastic random coefficient model with K random coefficient components can be written as follows:

y_ijl = x'_ijl β + Σ_{k=1}^{K} σ_uki z_kijl u*_kl + e_ijl.
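To make the model concrete, the two-component version above can be simulated directly from its definition: standardized, correlated intercept and slope effects are drawn once per individual, scaled by stratum-specific standard deviations, and added to a mean curve with independent residuals. All numerical values below (stratum standard deviations, correlation, mean curve, recording ages) are hypothetical choices for illustration, not estimates from this study:

```python
import numpy as np

rng = np.random.default_rng(0)

q, p = 200, 2                       # individuals, heterogeneity strata
rho = -0.3                          # correlation between intercept and slope effects
sigma_u1 = np.array([25.0, 35.0])   # stratum-specific intercept SDs (hypothetical)
sigma_u2 = np.array([0.08, 0.12])   # stratum-specific slope SDs (hypothetical)
sigma_e = np.array([10.0, 15.0])    # stratum-specific residual SDs (hypothetical)

# standardized correlated random effects (u1*, u2*) ~ N(0, [[1, rho], [rho, 1]])
L = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
u_star = rng.standard_normal((q, 2)) @ L.T

records = []
for l in range(q):
    i = l % p                                     # stratum of individual l
    ages = rng.choice(np.arange(100, 651, 50), size=7, replace=False)
    for t in np.sort(ages):
        mean = 150.0 + 0.9 * t                    # hypothetical population growth line
        y = (mean
             + sigma_u1[i] * u_star[l, 0]         # individual intercept deviation
             + sigma_u2[i] * u_star[l, 1] * t     # individual slope deviation
             + sigma_e[i] * rng.standard_normal())
        records.append((l, i, t, y))

print(len(records))  # 1400 weight records
```

The columns of u_star have unit variance by construction, so all stratum-specific dispersion enters through the σ scalings, exactly as in the model equation.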

Models for variances
There are situations where variances are heterogeneous, i.e., variances are assumed to vary according to several factors. A convenient and parsimonious procedure for handling heterogeneity of variances is to model them via a log-linear function [20,27]. This approach has the advantage of maintaining parameter independence between the mean and the covariance structure. As compared to transformations, it also avoids destroying "a simple linear mean relationship making the interpretation and estimation of the mean and covariance parameters more difficult" [46]. In the heteroskedastic model, the residual variances (σ²_ei), for example, were assumed to vary according to several factors such as twinning, season of birth, rank of calving of the mother and age at weighing. The idea was to find a model for the variance that describes the heterogeneity among the p different subclasses (usually a large number in animal breeding) in terms of a few parameters. Following Foulley et al. [10] and San Cristobal et al. [34] among others, the residual variances were modeled as:

ln σ²_ei = p'_i δ,

where δ is an unknown (r × 1) vector of parameters and p'_i is the corresponding (1 × r) row incidence vector of qualitative (e.g., twinning, rank of calving of the mother) or continuous covariates (e.g., age at weighing). Just as was done for the residual variances, the individual variances σ²_u1i and σ²_u2i can be heteroskedastic and are also modeled with a structural model [11]:

ln σ²_uji = h'_ji η_j,  j = 1, 2,

where η_j is an unknown vector of parameters and h'_ji is the corresponding row incidence vector of qualitative or continuous covariates.
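A minimal sketch of the log-linear variance model: each subclass variance is exp(p'_i δ), which is positive by construction and parsimonious in δ. The incidence rows and the δ values below are hypothetical:

```python
import numpy as np

# Hypothetical design: intercept, twin status (0/1), centred age covariate.
P = np.array([
    [1.0, 0.0, -0.5],   # single-born, young
    [1.0, 0.0,  0.5],   # single-born, old
    [1.0, 1.0, -0.5],   # twin-born, young
    [1.0, 1.0,  0.5],   # twin-born, old
])
delta = np.array([5.0, 0.4, 0.6])   # hypothetical log-variance parameters

resid_var = np.exp(P @ delta)       # sigma_e_i^2 = exp(p_i' delta), always > 0
print(resid_var.round(1))
```

With r = 3 parameters, this single δ vector describes the residual variances of arbitrarily many subclasses.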

Estimation of dispersion parameters
For the model developed in this paper, REML (restricted maximum likelihood, [29]) provides a natural approach for the estimation of fixed effects and all (co)variance components. To compute REML estimates, a generalized expectation-maximization (EM) algorithm was applied [7,8,11]. The theory of this method is described by Dempster et al. [2].
Let γ = (δ', η'_1, η'_2, ρ)' denote the vector of parameters. The application of the generalized EM algorithm is based on the definition of a vector of complete data x (where x includes the data vector and the vectors of fixed and random effects of the model, except the residual effects) and on the definition of the corresponding log-likelihood function L(γ; x) = ln p(x|γ). L(γ; x) can be decomposed as the sum of the log-likelihood Q_u of u* as a function of ρ and of the log-likelihood Q_e of e as a function of δ, η_1 and η_2. The E step consists of computing the function Q(γ|γ^[t]) = E_c[L(γ; x)]. The M step consists of selecting the next value γ^[t+1] of γ by maximizing Q(γ|γ^[t]) with respect to γ. Here E_c[·] is a condensed notation for a conditional expectation taken with respect to the distribution of the complete data x given the observations y and the parameters γ set at their current value γ^[t].
The same rules of conditional expectation apply to more complex functions, as shown in the appendix.
Q(γ|γ^[t]) can be decomposed into two parts:

Q(γ|γ^[t]) = Q_e(δ, η_1, η_2 | γ^[t]) + Q_u(ρ | γ^[t]).

Note that Q_u depends only on ρ; thus, the maximisation of Q(γ|γ^[t]) with respect to ρ reduces to the maximisation of Q_u with respect to ρ. The REML estimates can be obtained efficiently via the Newton-Raphson algorithm for the δ, η_1 and η_2 estimates and via the Fisher scoring algorithm for the parameter ρ. The corresponding systems of equations and their necessary inputs are shown in the appendix.
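The full EM-REML system is given in the appendix; the mechanism can be sketched on a deliberately simplified case, a homogeneous random-intercept model with known zero mean, for which both EM steps have closed forms. This is an illustrative reduction under stated assumptions, not the algorithm used in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
q, n = 300, 5                      # individuals, records per individual
true_su2, true_se2 = 4.0, 1.0      # simulated variance components

u = rng.normal(0.0, np.sqrt(true_su2), size=q)
y = u[:, None] + rng.normal(0.0, np.sqrt(true_se2), size=(q, n))

ybar = y.mean(axis=1)
su2, se2 = 1.0, 1.0                # starting values
for _ in range(200):
    # E step: posterior mean m_l and variance v of each u_l given y
    v = 1.0 / (n / se2 + 1.0 / su2)
    m = v * n * ybar / se2
    # M step: maximize the expected complete-data log-likelihood
    su2 = np.mean(m**2 + v)                    # E[u_l^2] averaged over individuals
    se2 = np.mean((y - m[:, None])**2) + v     # E[(y_lj - u_l)^2] averaged over records

print(round(float(su2), 2), round(float(se2), 2))
```

The E step computes conditional moments of the random effects given the data and the current parameters; the M step re-estimates the variances from those moments, which is exactly the mechanism applied to Q_e and Q_u above.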

Tests of hypotheses
Tests of hypotheses involving fixed effects are more complex in mixed than in fixed effects models. The intuitive reason is clear: the fixed effects model has only one variance component and all fixed effects are tested against the error variance; a mixed model, however, contains different variance components and a particular fixed effects hypothesis must be tested against the appropriate background variability which can be expressed in terms of variance components present in a model.
Fitting linear mixed models implies that an appropriate mean structure as well as an appropriate covariance structure needs to be specified; the two are not independent of each other. Adequate covariance modeling is not only useful for the interpretation of the variation in the data, it is essential for obtaining valid inferences for the parameters in the mean structure. An incorrect covariance structure also affects predictions [1]. Conversely, since the covariance structure models all variability in the data which is not explained by systematic trends, it depends strongly on the specified mean structure.

Testing fixed effects
An approach based on robust estimators ("sandwich estimators", [21]) was chosen to select significant fixed effects. The method is defined as follows. Let α denote the vector of all variance and covariance parameters found in V. If y ∼ N(µ, V) with µ = Xβ and α known, the maximum likelihood estimator of β, obtained by maximizing the likelihood function of y conditional on α, is given by [19]:

β̂ = (Σ_i X'_i W_i X_i)⁻¹ Σ_i X'_i W_i y_i,   (5)

and its variance-covariance matrix equals:

Var(β̂) = (Σ_i X'_i W_i X_i)⁻¹ (Σ_i X'_i W_i Var(y_i) W_i X_i) (Σ_i X'_i W_i X_i)⁻¹   (6)
        = (Σ_i X'_i W_i X_i)⁻¹,   (7)

where W_i equals V_i⁻¹. Note that a sufficient condition for (5) to be unbiased is that the mean E(y_i) is correctly specified as X_i β. However, the equivalence of (6) and (7) holds only under the assumption that the covariance matrix is correctly specified; thus, an analysis based on (7) will not be robust with respect to model deviations in the covariance structure. Therefore, Liang and Zeger [21] proposed inferential procedures based on the so-called "sandwich estimator" for Var(β̂), obtained by replacing Var(y_i) in (6) by (y_i − X_i β̂)(y_i − X_i β̂)'. Liang and Zeger [21] showed that the resulting estimator of β is consistent as long as the mean is correctly specified in the model. In that respect, the simplest choice consists of β̂ in (5) fitted by ordinary least squares, i.e., W_i = I. However, it might be worthwhile to consider more complex structures for the working dispersion matrix W_i, or generalized least squares estimation.
When α is not known but an estimate α̂ is available, β can be estimated by using expression (5) in which W_i is replaced by Ŵ_i = V_i⁻¹(α̂). Estimates of the standard errors of β̂ can then be obtained by replacing α by α̂ in (6) and in (7) respectively, both of which are available in the SAS MIXED procedure [35]. However, as noted by Dempster et al. [3], these estimates underestimate the variability introduced by estimating α. The SAS MIXED procedure accounts to some extent for this downward bias by providing approximate t- and F-statistics for tests concerning β [18].
In practice, the resulting robust standard errors can be requested in the SAS MIXED procedure by adding the option "empirical" to the proc mixed statement. Note that this option does not affect the standard errors reported for the variance components in the model. For some fixed effects, the robust standard errors tend to be somewhat smaller than the model-based standard errors, leading to less conservative inferences for the fixed effects in the final model, but for others they are larger, with the opposite effect on the actual size of the test [41]. In any case, this procedure relies on asymptotic properties and should therefore be applied with at least a minimum number of individuals (about 100).
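The sandwich computation itself is straightforward. The sketch below fits β by ordinary least squares (working matrix W_i = I) on simulated clustered data and builds the robust variance by replacing Var(y_i) with the outer product of the cluster residuals, in the spirit of the Liang-Zeger proposal; the data-generating values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
q, n = 120, 6                        # individuals, records each
beta_true = np.array([100.0, 0.8])   # hypothetical intercept and age slope

X_list, y_list = [], []
for l in range(q):
    t = np.linspace(100, 650, n)
    X = np.column_stack([np.ones(n), t])
    u = rng.normal(0.0, 20.0)        # shared individual effect -> within-cluster correlation
    y = X @ beta_true + u + rng.normal(0.0, 10.0, n)
    X_list.append(X)
    y_list.append(y)

# OLS fit (working matrix W_i = I)
XtX = sum(X.T @ X for X in X_list)
Xty = sum(X.T @ y for X, y in zip(X_list, y_list))
beta_hat = np.linalg.solve(XtX, Xty)

# sandwich: replace Var(y_i) by r_i r_i' with r_i = y_i - X_i beta_hat
meat = sum(X.T @ np.outer(y - X @ beta_hat, y - X @ beta_hat) @ X
           for X, y in zip(X_list, y_list))
bread_inv = np.linalg.inv(XtX)
V_sandwich = bread_inv @ meat @ bread_inv

se = np.sqrt(np.diag(V_sandwich))
print(beta_hat.round(2), se.round(3))
```

Here the "bread" is (Σ X'_i X_i)⁻¹ and the "meat" is Σ X'_i r_i r'_i X_i; with W_i = I this is the OLS special case discussed above, and the robust standard errors reflect the within-cluster correlation that the working matrix ignores.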
In this study, comparisons between robust and standard estimators will be presented for different homogeneous models: (0) a fixed effect model with independent errors, (1) a classical mixed model with one random effect and independent errors, (2) a fixed effect model with errors following a first order autoregressive process and (3) a random coefficient model with two correlated random effects (intercept and slope effects) and independent errors.
After selection of fixed effects in the model, random effects and factors of heterogeneity can be tested.

Testing random effects
Although the estimation of the parameters in the model is generally the main interest in an analysis, tests of hypotheses are usually required in assessing the significance of effects and in model selection. Tests of significance of random effects usually involve testing whether a single variance component is 0. For example, testing the significance of a random-intercept effect involves testing whether σ 2 u 1 = 0. These tests are carried out by using residual maximum likelihood ratio tests. However, the null hypothesis places the parameter on the boundary of the parameter space and the non-regular likelihood ratio theory is required [37]. Stram and Lee [38] considered the specific issue of tests concerning variance components and random coefficients.
For a single variance component, the asymptotic distribution of the likelihood ratio statistic is a mixture of a Dirac mass at zero and of a chi-square with a single degree of freedom, with mixing probabilities equal to 0.5 [38]. The approximate P-value for the residual likelihood ratio statistic δ = −2 log(Λ) is easily calculated as 0.5 Pr(X > d), where X ∼ χ²₁ under the null hypothesis and d is the observed value of δ. The residual maximum likelihood ratio test that p variance components are 0 involves a mixture of χ²-variates with 0 to p degrees of freedom, whose mixing probabilities depend on the geometry of the situation [37]. Stram and Lee [38] found that the likelihood ratio test is conservative, and for the residual maximum likelihood ratio test this was confirmed in a limited simulation study reported in Verbyla et al. [42]. A similar application was presented in Robert-Granié et al. [32].
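The mixture P-value is a one-liner. Using only the standard library (for χ²₁, Pr(X > d) = erfc(√(d/2))), a sketch:

```python
import math

def boundary_lrt_pvalue(d):
    """P-value for testing a single variance component equal to zero:
    the null distribution is a 50:50 mixture of a point mass at 0
    and a chi-square with one degree of freedom."""
    chi2_1_sf = math.erfc(math.sqrt(d / 2.0))   # P(chi2_1 > d)
    return 0.5 * chi2_1_sf

print(boundary_lrt_pvalue(2.71))   # about 0.05
```

For d = 2.71, the naive χ²₁ P-value would be about 0.10, twice the mixture value, which is why ignoring the boundary makes the test conservative.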

Plot of data
With longitudinal data, an obvious first graph to consider is the scatterplot of the weights of the animals against time. Figure 1 displays the weight of the bulls in relation to age at weighing. This simple graph reveals several important patterns. All bulls gained weight. The spread among animals was substantially smaller at the beginning of the study than at the end; this pattern of increasing variance over time can be explained in terms of variation in the growth rates of the individual animals. For the beef cattle data, a linear function between the 100th and the 650th days seemed appropriate for fitting the mean growth curve.

Model selection
As explained in the section "Tests of hypotheses", fixed effects were selected using robust estimators [21]. Comparisons between robust and standard estimators are presented for four homogeneous models with different structures of the variance-covariance matrix. The four models chosen are traditional models in longitudinal data analysis [13]: (0) a fixed effects model: y = Xβ + e, with independent errors, y normally distributed and a variance-covariance matrix equal to Iσ²_e; (1) a classical homogeneous mixed model: y = Xβ + Zu + e, with u ∼ N(0, Iσ²_u) and e ∼ N(0, Iσ²_e); (2) a fixed effects model: y = Xβ + e, with first order autoregressive errors, e ∼ N(0, Σ), where Σ_ij = σ²_e ρ^|t_i − t_j|, ρ is a positive real number and |t_i − t_j| represents the distance between measurements i and j on the same animal (the error term corresponds to the contribution of a stationary Gaussian time process, in which the correlation between repeated measurements decreases as the time between them increases); (3) a random coefficient model: y = Xβ + Z_1 u_1 + Z_2 u_2 + e, with u_1 ∼ N(0, Iσ²_u1), u_2 ∼ N(0, Iσ²_u2), Cov(u_1, u_2) = σ_u12 and e ∼ N(0, Iσ²_e). For each model, all fixed effects were tested. Table II presents the F-statistic and the P-value associated with each fixed effect under each model considered; in each case, both standard and robust estimators are given. Whatever the method considered, the interactions "age*twins", "age*rank of calving" and "twins*rank of calving" were not significant at the 5% level. The robust method led to the same conclusions with respect to the fixed effects whatever the model considered. In contrast, using the standard approach, the interactions "rank of calving*period of birth" and "twins*period of birth" were either significant or not, depending on the structure of the variance-covariance matrix.
Despite the linear trend shown in Figure 1 for the mean growth curve of the animals, age² and age³ were statistically significant and thus were kept in the model.
Finally, the fixed effects retained in the model were: age, age², age³, twins, rank of calving, period of birth, age*period of birth, rank of calving*period of birth and twins*period of birth; the non-significant interaction age*twins was also included because this parameter is of primary interest for evaluating the difference in growth rate between single and twin born bulls.
In a second step, a set of random effects was chosen for the covariance model. The selection of random effects is summarized in Table III. The choice of random effects was based on the set of fixed effects selected with the robust procedure presented in the first step. Likelihood ratio tests (REML version) were used for comparisons among the following models: (a) a fixed effects model: y = Xβ + e, with e ∼ N(0, Iσ²_e); (b) a classical homogeneous mixed model with a random intercept for each subject: y = Xβ + Zu + e, with u ∼ N(0, Iσ²_u) and e ∼ N(0, Iσ²_e); (c) a homogeneous random coefficient model with a random intercept and slope for each animal, the two random effects being assumed correlated: y = Xβ + Z_1 u_1 + Z_2 u_2 + e, with u_1 ∼ N(0, Iσ²_u1), u_2 ∼ N(0, Iσ²_u2), Cov(u_1, u_2) = σ_u12 and e ∼ N(0, Iσ²_e).
The results in Table III show large values for the likelihood ratio statistics. The model finally accepted is a homogeneous mixed model with two correlated random effects and independent errors. This model includes a third-degree polynomial in time to describe the mean growth curve, and an intercept and a slope for each animal.
From the model defined above (the model including as fixed effects age, age², age³, twins, rank of calving, period of birth, age*period of birth, rank of calving*period of birth, age*twins and twins*period of birth, and as random effects an intercept and a slope for each animal), sources of heterogeneity (e.g., rank of calving, season of birth, twinning or age at weighing) were tested on the different variances (intercept, slope or residual variances) of the model. Only the residual variances were found to be heterogeneous, according to rank of calving, season of birth and age at weighing. No heterogeneity of variances was observed for the individual intercepts and slopes. Final estimates of the variance-covariance parameters are presented in Table IV. The correlation between the two random effects is negative and equal to −0.34; i.e., if an animal's intercept is larger than the others, its slope will tend to be smaller. The individual variability for the intercept is very large and equal to 827.65. The variance of the slope is equal to 0.012, which corresponds to a coefficient of variation of 12%, indicating a rather substantial variability in the growth rate of the bulls. The results on the heterogeneity of variances suggest an increasing variance of weight records over time and a larger variability for bulls born in the spring and out of heifers. Figure 2 presents the mean growth curves estimated from the last model (the heteroskedastic random coefficient model presented in Tab. IV) for bulls born as twins or as singles. It shows that single born bulls were larger at birth than twins and that the weights of both increased linearly; the growth difference between single and twin bulls was approximately constant and equal to about 15 kg over the period of growth considered.
Figure 3 shows the differences between the mean growth curves of single born bulls and twins under two models: (a) a fixed effects model and (d) the heteroskedastic random coefficient model with the same fixed effects as in model (a). The difference between singles and twins shows two opposite patterns: decreasing under model (a) and increasing under model (d). For instance, at 550 days, the difference is estimated to be 11 kg under model (a) and 17 kg under model (d). How can this be explained, given that both estimators are a priori unbiased? Actually, they are not. The downward pattern seen under OLS (ordinary least squares) can be explained as follows: heavier bulls are usually slaughtered earlier, resulting in an apparent decrease of growth rate with time. These missing data do not arise completely at random. This missingness process is not taken into account under OLS, which leads to an apparently smaller difference between bulls born as singles and twins (heavier bulls being in general single born).

Concluding remarks
This study illustrates a way to analyze repeated measurements with models in which the variance-covariance structure of the observations is modeled as a function of time. Random coefficient models are convenient tools for modeling such data. They not only reduce the number of parameters as compared to multiple-trait analyses, but they can also easily cope with irregular recording patterns over time. They are easily interpretable and manageable under mixed model methodology. For instance, they are of great interest in practice, since they allow for easy calculation of trait performance at typical ages (e.g., here, weights at 100, 200 and 400 days). They can also be very useful in genetic evaluation for breeding purposes [36]. More generally, random coefficient models provide a valuable tool for modeling repeated records in animal breeding adequately, especially if the traits measured change gradually over time (e.g., the analysis of lactation curves in dairy cattle, or of feed intake or growth curves in beef cattle). However, there are critical issues to be aware of in order to use these models properly and efficiently. With respect to the fixed effects, a critical question lies in the order of the polynomials used to model the response. In many studies, especially in animal breeding, the authors assume the same regression structure for the fixed and random effects [24-26, 28, 39]. This is neither mandatory in theory nor desirable in practice, since the variation between populations and between subjects within populations does not necessarily follow the same pattern. In practice, the order of the polynomials for fitting the random part of the model (adjusted individual profiles) is usually lower than that for the fixed part (population trend), as was the case here.
In addition, semiparametric methods (e.g., splines or kernel methods) can be applied at the fixed effect level, while the between subject variation is fitted via random regression [16,42,47,48].
With respect to the random part, dispersion models can be improved significantly (i) by the application of stochastic time processes to take into account the existing correlations between successive measurements, e.g. autoregressive processes [6,13,41], and (ii) by allowing for heterogeneity of variances, as was done here (see also [46]).

APPENDIX

The REML estimates of δ, η_1, η_2, . . . , η_K are computed with an iterative Newton-Raphson system whose coefficient matrix is built from the blocks P' W_δδ P, P' W_δη_j H_j and H'_j W_η_j η_k H_k (j, k = 1, . . . , K), where P and H_j are the incidence matrices of the structural variance models and the W's are the corresponding weight matrices. In the general case,

−2Q_u = ln |G| + E_c[u*' G⁻¹ u*] = q (ln |G_0| + tr[G_0⁻¹ D*]),

with D* = {d*_kl} and d*_kl = (1/q) E_c(u*'_k u*_l). The current estimate of ρ is then computed by setting ∂(−2Q_u)/∂ρ to zero via Fisher scoring; calculations are made easier by taking advantage of the simple expression of the Fisher information matrix, since E[D*] = G_0. For K = 2, this system reduces to a third degree polynomial equation in ρ. This equation can be solved either analytically or numerically. If the individuals are not independent, G has to be replaced by G_0 ⊗ A, where A is a symmetric, positive definite matrix of known coefficients.
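If the cubic in ρ is solved numerically rather than analytically, a generic polynomial root finder suffices; only real roots in the admissible correlation range (−1, 1) are retained. The coefficients below are hypothetical, chosen purely to illustrate the root-selection step:

```python
import numpy as np

# Hypothetical coefficients of a3*rho^3 + a2*rho^2 + a1*rho + a0 = 0
coeffs = [2.0, -1.0, -0.8, 0.2]

roots = np.roots(coeffs)
# keep only real roots that are admissible as a correlation coefficient
real_roots = sorted(r.real for r in roots
                    if abs(r.imag) < 1e-10 and -1.0 < r.real < 1.0)
print(real_roots)
```

Among the admissible roots, the one maximizing Q_u would be retained as the updated estimate of ρ at each round of the algorithm.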