A general approach to mixed effects modeling of residual variances in generalized linear mixed models

We propose a general Bayesian approach to heteroskedastic error modeling for generalized linear mixed models (GLMM) in which linked functions of conditional means and residual variances are specified as separate linear combinations of fixed and random effects. We focus on the linear mixed model (LMM) analysis of birth weight (BW) and the cumulative probit mixed model (CPMM) analysis of calving ease (CE). The deviance information criterion (DIC) was demonstrated to be useful in correctly choosing between homoskedastic and heteroskedastic error GLMM for both traits when data was generated according to a mixed model specification for both location parameters and residual variances. Heteroskedastic error LMM and CPMM were fitted, respectively, to BW and CE data on 8847 Italian Piemontese first parity dams in which residual variances were modeled as functions of fixed calf sex and random herd effects. The posterior mean residual variance for male calves was over 40% greater than that for female calves for both traits. Also, the posterior means of the standard deviation of the herd-specific variance ratios (relative to a unitary baseline) were estimated to be 0.60 ± 0.09 for BW and 0.74 ± 0.14 for CE. For both traits, the heteroskedastic error LMM and CPMM were chosen over their homoskedastic error counterparts based on DIC values.


INTRODUCTION
Most genetic evaluation programs for livestock are based on generalized linear mixed model (GLMM) analyses [39] with breeding values or genetic effects modeled as random with a covariance structure based on known genetic relationships in a pedigree. Often heterogeneous residual variances, also 32 K. Kizilkaya, R.J. Tempelman defined as residual heteroskedasticity, are observed across various environmental subclasses. If heteroskedasticity is not properly modeled, biased breeding value predictions may result such that a disproportionate numbers of animals are selected from highly variable environmental subclasses with consequent adverse effects on genetic improvement [16].
Genetic evaluation systems based on linear mixed models (LMM) have been proposed for the analysis of continuous production characters in which residual variances are modeled as a function of both fixed, e.g. region and sex, and random, e.g. herd, effects [9,17,27,29,31]. It is well recognized that using random effects or, equivalently, empirical Bayes specifications for a factor with many small subclasses facilitates the efficient borrowing of information from all subclasses for inference on any one particular subclass [13,30]. There has been increasing interest to model residual heteroskedasticity in GLMM other than the LMM. For example, the cumulative probit mixed model (CPMM), often labeled as the threshold model in animal breeding [12], was developed to provide genetic evaluations on ordinal categorical traits, such as calving ease (CE) as the LMM is only suitable for the analysis of nearly normally distributed characters. The CPMM was extended by Foulley and Gianola [10] to model the logarithm of residual variances as a linear function of fixed effects on the underlying latent scale that characterizes this model. This heteroskedastic CPMM now forms the basis for CE genetic evaluations in French Montbéliarde, Normande, Holstein [7] and Italian Holstein [3] cattle by which residual variances are modeled, for example, as a function of calf sex and age of dam. San Cristobal-Gaudy et al. [32] further extended the CPMM of Foulley and Gianola [10] to specify log residual variances as a function of both fixed and random effects, including correlated genetic effects.
With the exception of Sorensen and Waagepetersen [37], virtually all of the heteroskedastic error GLMM analyses procedures presented thus far have involved either rather tenuous large sample approximations or complicated numerical integrations as necessary to derive marginal likelihood functions. Furthermore, we perceive the lack of a unifying framework for the structural modeling of heterogeneous variances in GLMM analysis, whether for LMM, CPMM, or other models such as those for censored data [36] or count data [40]. The objectives of our study then were (1) to develop and validate a fully Bayesian structural mixed effects multiplicative model for residual variances in a GLMM, concentrating on a LMM analysis of normally distributed data and a CPMM analysis of ordinal data and (2) to apply the model to a dataset of BW and CE observed on calvings from first parity Italian Piemontese dams.

MODEL CONSTRUCTION
For a number of GLMM, data augmentation schemes exist such that a n × 1 vector of either observed or augmented variables L = {L i } n i=1 conceptually maps one-to-one to the data vector Y = {Y i } n i=1 . Examples where such augmented variables have been particularly useful for Bayesian inference include the CPMM [35] and censored data models [36].
We write a linear mixed effects model as where β is a vector of fixed location effects, u is a vector of random location effects, X and Z are known design matrices and e ∼ N(0, R(ξ)) is a vector of normally distributed residuals with variance covariance matrix R(ξ) having a certain heteroskedasticity specification based on unknown ξ as defined later. The linear model in (1) is equivalent to the following distributional specification: For a LMM of normally distributed data, there is no distinction between Y and L, i.e. Y ≡ L, such that p(Y | β, u, ξ) ≡ p(L | β, u, ξ). However, for ordinal data with say, C = 4 categories, L maps to Y as follows: where τ 0 = −∞ < τ 1 < τ 2 < τ 3 < τ 4 = ∞ are threshold parameters that define bin boundaries for Y based on L.
We further partition e into residual variance subclasses e = [e 11 e 12 · · · e st ] where e kl ∼ N(0, R kl (ξ) = I n kl σ 2 e kl ) pertains to the n kl × 1 subvector of residuals identified with the kth level (k = 1, 2, . . . , s) of a single fixed factor, e.g. sex, and the lth level (l = 1, 2, . . . , t) of a single random factor, e.g. herd, that jointly influence the residual variance σ 2 e kl for the klth group. For simplicity of presentation, we concentrate on a specification based on only one fixed factor and one random factor for residual heteroskedasticity; implementation details involving multiple fixed and random factors are forthcoming in future work. We partition the data realization y of Y as y = [y 11 y 12 · · · y st ] and the corresponding augmented variables L = [L 11 L 12 · · · L st ] as with e. We propose a 34 K. Kizilkaya, R.J. Tempelman multiplicative structural effects model as follows: where σ 2 e is a "fixed" reference residual variance, γ k > 0 is a multiplicative scaling factor identified with the kth level of the fixed effect subclass and ν l > 0 is a random multiplicative scaling factor unique to the lth level of the random effect. When noninformative priors are used for σ 2 e and γ, identifiability constraints are required. In that case, we arbitrarily set γ s = 1 such that σ 2 e then specifies the expected residual variance for the last fixed effects subclass k = s given E(ν l ) = 1. If one writes equation (4) on the logarithmic scale, this constraint (log(γ s ) = 0) is analogous to the corner parameterization used for location parameters [5] and is the default parameterization using, for example, SAS  linear model software [21].
Note then that fixed and random effects specifications are specified for both location parameters, i.e. β and u, and for dispersion parameters, i.e. σ 2 e , γ = [γ 1 γ 2 . . . γ s ] and v = [ν 1 ν 2 . . . ν t ] . Different classes of effects could be specified for the two sets of parameters. For example, for fixed effects, there may be calf sex differences specified for means, i.e. location parameters, but not for residual variances whereas for random effects there may be variability in contemporary groups specified for residual variances but not for location parameters.
We specify a subjective prior density on β: where p(β) is typically specified to be a bounded flat uniform prior or an informative, perhaps conjugate normal, prior. Here β could be parameterized in a number of different ways, e.g. to partition factor level effects and interactions thereof from a reference mean, say µ, analogous to the partitioning of σ 2 e from γ for residual variance modeling. For elements of β, as with γ, identifiability constraints should be specified if non-informative priors are used [5,21].
As typical for any GLMM, u is specified by a structural multivariate prior: Here G(ϕ) is a variance-covariance matrix that is a function of several unknown variance components or variance-covariance matrices in ϕ, depending on whether or not there are multiple sets of random effects and/or specified covariances between these sets; an example of the latter is the covariance between additive and maternal genetic effects. Furthermore, flat priors, inverted Gamma densities, inverted Wishart densities or products of conditionally independent components thereof may be specified for the prior density on ϕ, i.e. ϕ ∼ p(ϕ) depending, again, on the number of sets of random effects and whether there are any covariances thereof [34]. Analogously to β, a flat (bounded) prior density may be specified separately for σ 2 e and for each γ k ; alternatively, proper conjugate inverted-gamma or other subjective priors may be specified, i.e. and Conversely, a structural prior is used to model the random residual dispersion effects, ν l , l = 1, 2, . . . , t, analogous to that used for the random location effects u in equation (6). We conveniently choose this structural prior to be an inverted-gamma density with parameters α ν and α ν − 1: (9) Here E(ν l ) = 1 and σ 2 ν = Var(ν l ) = 1 α ν −2 , l = 1, 2, . . . , t. Note that as α ν → ∞ then σ 2 ν → 0 such that the influence of v on residual heteroskedasticity diminishes. The structural prior in (9) facilitates a borrowing of information across the t random effects of v, similar to what occurs for classical random effects modeling of location parameters using u [30]. Since E(ν l ) = 1, ν l can be interpreted as a relative measure of variance for random subclass l. Generally, α ν is unknown such that a subjective prior may be placed on it. One vaguely informative proper prior commonly used for strictly positive parameters [1] and used in some of our previous applications [20,41] is specified by The remaining hierarchical specifications in this heteroskedastic GLMM depend upon the first (data sampling) stage of the n × 1 data vector y. For a LMM analysis of normal error data, equation (1) suffices, i.e. Y ≡ L, such that no augmented variables are required. However, for a CPMM analysis of ordinal data with C ordinal categories, numbered j = 1, 2, . . . , C, we specify the first stage of our hierarchical model using Sorensen et al. [35]: where L ikl and y ikl are the ith elements of L kl and y kl , respectively. As before, τ = [τ 0 τ 1 · · · τ C ] denotes a vector of unknown threshold parameters that delimit the augmented variables L into their respective observed data bins y with 1(.) denoting the indicator function having value 1 if the condition within the function is true and 0 otherwise. The second stage of the model was given previously in equation (2) and can be re-expressed as a product of conditionally independent normals: and p(L ikl | β, u, σ 2 e , γ k , ν l ) = N(x ikl β + z ikl u, σ 2 e γ k ν l ). Here, x ikl and z ikl are known incidence row vectors from X and Z, respectively, corresponding to animal i, i = 1, 2, . . . , n kl , within the klth residual variance subclass, k = 1, 2, . . . , s; l = 1, 2, . . . , t.
Recall that τ 0 = −∞ and τ C = +∞; furthermore, τ 1 is fixed to an arbitrary constant in order to satisfy identifiability constraints. We adapt the alternative parameterization presented by Sorensen et al. [35] in which the residual variance is explicitly modeled rather than being constrained, e.g. σ 2 e = 1, the latter being the more common parameterization used in homoskedastic error CPMM [12]. This specification thereby requires one additional constraint on τ, say on τ 2 > τ 1 , such that C − 3 parameters in τ are uniquely identifiable. A prior distribution on the remaining elements of τ may be specified provided that the order constraints on elements of τ are satisfied [35], i.e.
The joint posterior density of β, u, γ, v, ϕ, α ν and any other parameters necessary for the GLMM in question, i.e. L and τ in the CPMM, is simply specified as the product of the various stages of the hierarchical model. That is, for the LMM where Y ≡ L, the joint posterior density of all unknowns specified to proportionality is whereas for the CPMM, the joint posterior density is A Markov chain Monte Carlo (MCMC) inference strategy requires determination of and sampling from the full conditional distributions (FCD) of each parameter or blocks thereof. The joint FCD for β and u, regardless of whether inference is based on the LMM or the CPMM, can be readily shown to be multivariate normal are the typical mixed model solutions to β and u based on current MCMC samples of other unknown parameters/variables. Furthermore, represents the direct sum operator [33] across all st groups. Univariate MCMC sampling strategies, as an alternative to a joint block sample from (15), are elucidated in Wang et al. [43].
As in Kizilkaya et al. [20], we use an efficient Metropolis Hastings update [6] to sample from the joint FCD of L and τ under the CPMM. If some partitions of ϕ form a variance-covariance matrix, then their respective FCD can be readily shown to be inverted-Wishart whereas if other partitions of ϕ involve scalar variance but no covariance components, then the FCD of each component can be shown to be inverted-gamma [34].
We subsequently determine the FCD for σ 2 e which is virtually identical for both the CPMM and the LMM. Under the CPMM, where X kl and Z kl are appropriate partitions of X and Z that relate y kl and L kl to β and u, respectively. With a flat, preferably bounded, uniform prior specified in (8a), i.e. p(σ 2 e ) ∝ 1, the FCD for σ 2 e from (16) can then be seen to be simply inverted gamma with parameters α ++ = n ++ 2 − 1 and β ++ = 1 e kl e kl γ k + α ν − 1 for l = 1, 2, . . . , t. Note that MCMC sampling of elements of ξ under the LMM is identical to that for the CPMM except that e kl = y kl − X kl β − Z kl u rather than e kl = L kl − X kl β − Z kl u.
The FCD for α ν based on the prior p(α ν ) adopted from equation (10), is not readily recognizable: Here we sample from (17) indirectly using the change of variable ψ ν = log(α ν ) with a random walk Metropolis-Hastings update based on a Gaussian proposal density [4] in a manner virtually identical to how we sampled the degrees of freedom parameter for a cumulative t-link mixed model (CTMM) in our previous work [20].

MODEL COMPARISON
The deviance information criterion (DIC) has been proposed for comparing goodness of fit for alternative constructions of hierarchical models to data [38]. The DIC is based on the posterior distribution of the deviance statistic or −2 times the sampling distribution of the data as specified in the first stage of a hierarchical model. For the LMM, the data sampling stage is specified as: whereas for the CPMM, the data sampling stage is based on the marginal likelihood of the data [20] based on the integration of the product of (11a) and (11b) jointly across all n observations with respect to L: Smaller DIC values are indicative of better data fit. Further details on DIC including its components and computations for LMM and CPMM can be found in Kizilkaya et al. [20].

Simulation study
A simulation study was carried out to validate Bayesian inference on the proposed heteroskedastic error LMM and CPMM and to assess the ability of the DIC to correctly choose between homoskedastic and heteroskedastic error GLMM. A simple mixed effects model was used to generate latent variables L for 50 progeny from each of 50 unrelated sires: Here is a 100 × 1 vector of independent random location effects, and {sire k } 50 k=1 ∼ N(0, Iσ 2 s ) represent a 50 × 1 vector of independent random location effects with σ 2 h = 0.25 and σ 2 s = 0.10. Hence, L i jkl and e i jkl are, respectively, the latent and residual variables generated for calf l of sex i with sire k and raised within herd j. Now, is the vector of residuals associated with the n i j records from sex i and herd j where ν j ∼ Inverted-Gamma(α ν , α ν − 1), j = 1, 2, . . . , 100. Three replicated datasets from each of four different populations or different values of α ν were generated. Three of the populations had residual heteroskedasticity specifications that differed with respect to degree of herd variability: Population I: α ν = 3, Population II: α ν = 12, and Population III: α ν = 50. That is, the values α ν = 3, 12 and 50 represented extreme, moderate and mild levels, respectively, of residual heteroskedasticity across herds, given that the standard deviations σ ν = 1 α ν −2 of the relative variance factors are then 1, 1 10 , and 1 48 , respectively. Each population was based on σ 2 e = 1.25 and γ 1 = 0.80, with constraint γ 2 = 1 meaning that Sex 1 (say females) had a marginal residual variance of σ 2 e f emale = σ 2 e γ 1 = 1.00 whereas males (Sex 2) had a marginal residual variance of σ 2 e male = σ 2 e γ 2 = 1.25. Our MCMC implementation was slightly reparameterized from that presented previously in that σ 2 e male and σ 2 e f emale were sampled directly. The three replicates from Population IV were generated with homoskedastic error (α ν = ∞) with σ 2 e = 1 and γ 1 = γ 2 = 1.
Levels of fixed and random effects for both location and residual variance parameters were randomly assigned to individuals in generating data. Augmented data L was mapped to ordinal data y based on C = 4 categories with thresholds τ 1 = −0.50, τ 2 = 1.00 and τ 3 = 2.00 in all populations. Here L and its ordinal mapping to y were analyzed using the appropriate GLMM (LMM and CPMM, respectively) based on both homoskedastic and heteroskedastic error models with flat priors placed on β, σ 2 e male , σ 2 e f emale , σ 2 s , σ 2 h , and the prior from equation (10), used for α ν . We invoked the constraints τ 1 = −0.50 and τ 2 = 1.00 to facilitate parameter identifiability in the CPMM analysis of ordinal y. Although these constraints were identical to the true values of τ 1 and τ 2 , so as to avoid rescaling of estimates of other parameters for comparison to their true values, any values of τ 1 and τ 2 could have been used provided τ 2 > τ 1 . For each replicated data set within each population, a burn-in period of 10 000 MCMC cycles was discarded before saving samples from each of an additional 100 000 MCMC cycles. DIC values were computed for each model on each replicated dataset to validate those measures as appropriate for model choice. Posterior densities on σ 2 e f emale , σ 2 e male , and α ν were summarized by their 95% equal-tailed posterior probability intervals (95% PPI), i.e. the 2.5th and 97.5th percentiles of the posterior densities.

Italian Piemontese birth weight and calving ease data
The very same first parity calf BW and CE scores analyzed by Kizilkaya et al. [20] are also considered here. These data were recorded on Italian Piemontese cattle from January, 1989 to July, 1998 by Associazione Nazionale Allevatori Bovini di Razza Piemontese (ANABORAPI), Strada Trinità 32a, 12061 Carrù, Italy. Only herds that were represented by at least 100 records were considered in the study, leaving a total of 8847 records. CE was scored into five categories by breeders and subsequently recorded by technicians who visited the breeders monthly. The five ordered categories were: (1) unassisted delivery, (2) assisted easy calving, (3) assisted difficult calving, (4) Caesarean section, and (5) foetotomy. As the incidence of foetotomy was less than 0.5%, the last two ordinal categories were combined, leaving a total of four mutually exclusive categories. The effects of dam age, sex of the calf, and their interaction were modeled by combining eight different age groups with sex of calf for a total of 16 nominal subclasses. Herd-year-season (HYS) subclasses were created from combinations of herd, year, and two different seasons (from November to April and from May to October). Further details on the data, including factor definitions and number of levels thereof, are provided in Kizilkaya et al. [20].
As in Kizilkaya et al. [19,20], the LMM and CPMM used for the analyses of BW and CE included the fixed effects of age of dam classifications, sex of calf and their interaction in β, whereas the random effects u included independent herd-year-season effects in h, random sire effects in s and random maternal grandsire effects in m. We also assume: , with σ 2 s denoting the sire variance, σ 2 m denoting the maternal grandsire variance, σ sm denoting the sire-maternal grandsire covariance, and σ 2 h denoting the HYS variance. Furthermore, ⊗ denotes the Kronecker product [33], and A is the numerator additive relationship matrix between sires due to identified male ancestors [14].
Residual heteroskedasticity was modeled as a function of fixed sex effects and random herd effects. For fixed effects, γ male = 1 was considered as a constraint such that σ 2 e f emale = σ 2 e γ f emale is the marginal residual variance for female calves and σ 2 e male = σ 2 e is the marginal residual variance for male calves as specified in the simulation study. Random effects of 66 herds for residual variability were specified by ν j ∼ Inverted-Gamma(α ν , α ν − 1), j = 1, 2, . . . , 66. Flat (bounded) uniform priors were placed on β, σ 2 s , σ 2 m , σ sm , σ 2 h , σ 2 e male , σ 2 e f emale , and the noninformative prior from equation (10) was used for α ν .
MCMC inference was based on the running of three independent chains for each model using the same specifications as in Kizilkaya et al. [20]. That is, for each chain, a total of 20 000 burn-in MCMC cycles was followed by the saving of samples from each of 100 000 additional MCMC cycles.
Key genetic parameters were calf sex specific, including direct heritabilities, i.e. h 2 d _ f emale and h 2 d _male , maternal heritabilities, i.e. h 2 m _ f emale and h 2 m _male , as based on the same expressions used by Kizilkaya et al. [19,20] except that σ 2 e male or σ 2 e f emale was substituted for σ 2 e in the denominator for males and female calves, respectively. Posterior densities of these parameters, σ ν = 1 √ α ν −2 , all variance components, the direct-maternal genetic correlation (r dm ) and key differences such as h 2 m _male , and σ 2 e male − σ 2 e f emale were summarized by posterior means and standard deviations well as by 95% PPI. For computational expediency, 95% PPI on the herd specific relative variance measures ν j , j = 1, 2, . . . , 66, were approximated by their posterior means ± two posterior standard deviations. Analyses of both BW and CE were also carried out by the appropriate homoskedastic GLMM in order to facilitate DIC comparisons against their heteroskedastic counterparts for model fit. Furthermore, Spearman rank correlations were computed between the heteroskedastic and homoskedastic GLMM for posterior means of s for both LMM and CPMM analyses of BW and CE, respectively.

Simulation study
Posterior inference on σ 2 s and σ 2 h is not reported in detail but summarized as follows. Posterior means of σ 2 s and σ 2 h were generally slightly biased upwards for the LMM analysis on L and for the CPMM analysis on y. Nevertheless, the 95% PPI of these two variance components, whether for the LMM analysis of L or the CPMM analysis of y, included the true values of parameters in all cases. The 95% PPI for these two components were also understandably wider for the CPMM analysis of y than for the LMM analysis of L, likely due to the substantial loss in data information in mapping from continuous L to ordinally discrete y.
The 95% PPI on σ 2 e male , σ 2 e f emale and α ν for each of the three replicated datasets from each of the four populations, are provided in Table I for the direct LMM analysis of L and Table II for CPMM for the corresponding ordinally mapped values of y. The 95% PPI of σ 2 e f emale and σ 2 e male in either the LMM and CPMM analyses of L and y, respectively, always included the true values of σ 2 e f emale = 1 and σ 2 e male = 1.25, respectively, for each of the three replicated datasets from each of Populations I, II, and III. For Population IV, in which three replicates were generated with homoskedastic error, posterior means and 95% PPI for σ 2 e f emale and σ 2 e male were found to be similar to each other and concentrated around the true value σ 2 e f emale = σ 2 e male =1 as anticipated. For all populations, the 95% PPI of σ 2 e f emale and σ 2 e male tended to center about their corresponding posterior means (results not shown). The 95% PPI for α ν for each of the three datasets from each of Populations I, II, and III also included the corresponding true parameter value. The widths of the 95% PPI of α ν increased appreciably as α ν increased. Again, because of the lower information content of ordinal data relative to underlying latent data, CPMM analyses of y produced understandably wider 95% PPI on α ν relative to LMM analyses of L. Furthermore, posterior means of α ν were biased upwards for both analyses in Population III (results not shown), thereby reflecting the greater skewness of the posterior densities for higher values of α ν . As anticipated, the 95% PPI of α ν in Population IV (α ν = ∞) included very large values (>300) for either the LMM or CPMM analyses.    The simulation study was also used to validate DIC as model choice criteria. For this purpose, DIC values based on homoskedastic and heteroskedastic GLMM analyses for each replicated dataset within each of the four populations were determined for the LMM analysis of L and for the CPMM analysis of y. A DIC difference exceeding 7 has been suggested by Spiegelhalter et al. [38] as indication of a decisive difference in model fit. The DIC differences between homoskedastic and heteroskedastic GLMM (LMM for y and CPMM for L) as matched for each replicate dataset within populations are shown in Figure 1. In all cases, DIC differences were clearly in favor the correct model except when α ν = ∞ where model choice was always indecisive except for one CPMM case where the homoskedastic error model was correctly chosen. Furthermore, as expected, the magnitude of DIC differences involving LMM analyses of L was consistently higher than that for CPMM analyses of y as there is likely greater statistical power for inference on continuous L as opposed to discrete y. Note that the DIC differences between homoskedastic and heteroskedastic error GLMM approached 0 with increasing α ν , such that there was understandably lower power to conclude the existence of random sources of residual heteroskedasticity when it was very mild.

Genetic parameter inference
Sire and maternal grandsire LMM and CPMM were used for the analyses of BW and CE, respectively, from Italian Piemontese cattle. Posterior means and standard deviations and 95% PPI on dispersion parameters, i.e. variance components and heteroskedastic parameters, are summarized in Table II, whereas similar quantities are presented for genetic parameters, i.e., heritabilities and their key differences and the direct-maternal genetic correlation, in Table III. The effective sample size (ESS) as a measure of the number of effectively independent samples amongst the 100 000 correlated MCMC samples saved from each chain are also reported for each parameter using Sorensen et al. [35]. All reported results in Tables II and III are based on combining information from the three independent MCMC chains after burn-in. All posterior densities (not shown) were nearly symmetric and unimodal which is further illustrated by the near symmetricity of the 95% PPI about the reported posterior means in Tables II and III. The total number of ESS for dispersion parameters across the three chains ranged from 1836 to 21 839 for the LMM analysis of BW and from 841 to 14 045 for the CPMM analysis of CE scores. As anticipated from the results of the simulation study, the CPMM analysis of CE generated lower ESS than the LMM analysis of BW. The additional inference on α ν , σ 2 e male , and σ 2 e f emale relative to a homoskedastic GLMM specification did not appear to adversely impact MCMC mixing on other dispersion parameters; for example, the reported ESS for the heterogeneous CPMM in Tables II and III appeared to be comparable to those for the homogeneous CPMM on the same data and for the same number of MCMC cycles as reported in Kizilkaya et al. [20]. We observed that the posterior means of the residual variance using the homoskedastic error models (results not reported) were nearly equal to the average of the posterior means for σ 2 e male and σ 2 e f emale in the heteroskedastic error models for both GLMM reported in Table II. Note that for BW and CE, the posterior mean of σ 2 e male was over 40% greater than that for σ 2 e f emale , and in either case the 95% PPI on σ 2 e male − σ 2 e f emale did not include 0. These significant differences further translated into sex-specific differences for direct and maternal heritabilities estimated to be, respectively, 0.07 and 0.03 greater for BW in females and 0.13 and 0.03 greater for CE in females; furthermore the 95% PPI on each of these differences did not include 0 as also indicated in Table III. Heteroskedastic LMM and CPMM led to relatively large posterior means (≥0.60) and 2.5th percentiles (≥0.46) for σ ν = 1 √ α ν −2 , thereby providing convincing evidence that the degree of residual heteroskedasticity across herds is great for both BW and CE. Approximate 95% PPI (based on the posterior mean ±2 posterior standard deviations) for herd-specific ν j in the LMM and CPMM analyses of BW and CE, respectively, are presented in Figure 2. For both traits, it can be seen that the approximate 95% PPI for ν j in many herds do not overlap with the prior expected value of 1, indicating that residual variances for these herds are significantly higher or lower than average, as would be anticipated from the inference on σ ν in Table II. It should be noted that the ranges of posterior means of ν j were greater than 10 fold for the LMM analysis of BW in Figure 2a and greater than 20 fold for the CPMM analysis of CE in Figure 2b; i.e. residual variances in some herds are estimated to be as large as 20 times greater than residual variances in other herds.
For each homoskedastic and heteroskedastic GLMM, DIC values were averaged across the three independent MCMC chains. Differences in DIC values within models never ranged by more than 3 units across these three chains, thereby indicating satisfactory control of Monte Carlo error. For the LMM analysis of BW, the average DIC value for the heteroskedastic error model was 44 332 compared to 45 240 for the homoskedastic error model, thereby clearly establishing the heteroskedastic error LMM as the better fitting model. Similarly, for CE, the average DIC value for the heteroskedastic CPMM was 15 949 compared to 16 563 for the homoskedastic CPMM, again favoring the heteroskedastic error model. Kizilkaya et al. [20] analyzed the same CE data, comparing the homoskedastic error CTMM to the homoskedastic error CPMM and determined that the average DIC value (16 348) for the homoskedastic CTMM made that the model of choice. Using the results from Kizilkaya et al. [20] and the results reported here, the heteroskedastic error CPMM appears to be clearly superior to both the homoskedastic error CTMM and CPMM models for fit to CE in Italian Piemontese cattle.

Inferences on sire effects
Posterior means of elements of s were used as corresponding point estimates of expected progeny differences (EPD) under both homoskedastic and heteroskedastic error GLMM. Scatterplots of the heteroskedastic error EPD versus homoskedastic error EPD for the LMM analysis of BW and for the CPMM analysis of CE are provided in Figure 3. Although the degree of overall reranking was not great between the two error specifications (rank correlations > 0.96 for both GLMM), the same was not necessarily true for the bottom and top 10% of sires as ranked by the appropriate heteroskedastic error GLMM. For the bottom 10%, rank correlations were less than 0.86 for both GLMM whereas the same correlations were less than 0.80 for the top 10%. EPD differences between the heteroskedastic and homoskedastic error GLMM ranged from -0.74 kg to 0.96 kg for BW and from -0.20 to 0.22 (on the latent scale) for CE; these ranges are not trivial relative to σ s as based on the  Table III. It is interesting to note, in particular, the large EPD discrepancy for BW between the heteroskedastic and homoskedastic error LMM for the left-most point in Figure 3a. This sire had all of his progeny in year-seasons deriving from one herd; incidentally, this herd corresponds to the right-most 95% PPI provided in Figure 2a and having the largest posterior mean for relative residual variance (ν j ).

DISCUSSION
Accounting for heterogeneous residual variances in genetic evaluations has represented an important area of research in animal breeding (e.g. [7,9,11,17,18,26,29,31,32]). We have developed a general framework for structural modeling of heterogeneous residual variances in GLMM based on fully Bayesian inference using MCMC methods. Our method can be readily extended to allow for heterogeneous residual variances across subclasses for other GLMM analyses, particularly those that involve normally distributed latent variables, including censored data models [36].
We validated two different heteroskedastic error GLMM by simulation, demonstrating that fully Bayesian inference allows satisfactory inference on the parameters that specify residual heteroskedasticity. Our simulation study further indicated that Bayesian model choice criteria such as DIC can be used with confidence to choose between heteroskedastic and homoskedastic GLMM error models.
Residual variances for BW and CE were estimated to be over 40% greater in male calves compared to female calves in Italian Piemontese cattle, contributing therefore to smaller heritabilities of direct and maternal effects for male calves. These results for BW are in relatively good agreement with those from Garrick et al. [11]. Our results for CE are in agreement with analyses on Blonde d'Aquitaine cattle [8] but are substantially more deviant than those results reported by Ducrocq [7] who estimated residual variances to be 7-18% greater in male calves in French Holstein, Normande and Montbéliarde populations and those results reported by Canavesi et al. [3] who estimated residual variances to be less than 3% greater for male calves in Italian Holsteins. Because of the large estimated differences in residual variances, direct and maternal heritabilities were then determined to be substantially larger for female calves than for male calves for both BW and CE. However, our results and particularly its CE comparisons with Ducrocq [7] and Canavesi et al. [3] should be treated with caution. Firstly, our population involves a double muscled breed with a high frequency of unfavorable CE scores relative to the cattle populations considered in Ducrocq [7] and Canavesi et al. [3]. This distinction appears to be further apparent in that our reported direct heritability estimates for either calf sex were substantially larger than corresponding estimates from recent studies using CPMM [3,7,22,24,25,27,28,42]. Nevertheless, our inference on a strongly negative direct-maternal genetic correlation for both BW and CE is in good agreement with previous work (e.g. [2,42]). Secondly, Ducrocq [7] and Canavesi et al. [3] also modeled constant genetic/residual variance ratios, following the work of Foulley and Gianola [10], whereas we model constant genetic variances across environments; a more formal comparison using, e.g. DIC, of these two specifications is worthy of further study. To our knowledge, nobody has yet inferred upon sex-specific genetic variance for CE. This is an area for further research and requiring a substantially larger dataset than what we considered as the power for inferring upon sex-specific genetic variance is likely to be substantially lower than that for sex-specific residual variance using the CPMM. Finally, Ducrocq [7] and Canavesi et al. [3] used approximate marginal maximum likelihood (MML) inference which has been demonstrated to lead to biased estimates of dispersion parameters in simulation studies [15] although our recent work does not suggest meaningful differences between MML and MCMC inference using a CPMM for CE based on larger datasets [19].
There appeared to be no appreciable differences in overall sire rankings based on EPD rank correlations between homoskedastic and heteroskedastic error GLMM for either BW or CE; however, this was not necessarily the case for sires with more extreme EPD for direct genetic effects, specifically the top and bottom 10%. These results are significant in that breeder selection strategies for both traits invariably involve either direct selection of sires in one of these tails and/or avoidance of sires in the other tail, particularly for first parity dams. Furthermore, absolute differences in EPD between the two GLMM for each trait were large relative to the genetic variability. Even greater differences in EPD rankings and absolute differences might be anticipated for genetic evaluations of dams using animal model GLMM since they are used less extensively across potentially heteroskedastic subclasses, e.g. herds, compared to sires.
Kizilkaya et al. [20] used homoskedastic error CTMM whereas Ducrocq [7], Foulley [8], and Canavesi et al. [3] used heteroskedastic error CPMM, all as independent efforts to provide stable genetic evaluations of sires for subjectively recorded CE. Elements of both sets of models, however, may be required. That is, CE data quality may be compromised if, for example, some breeders assign CE scores on a much wider ordinal scale than others, thereby requiring a heteroskedastic error model, or decisions on CE assistance, e.g. Caesarean sections, are more likely for highly valued dams, thereby requiring a robust specification like the CTMM. It is conceptually easy to jointly extend the work presented here for heteroskedastic error GLMM and for t-error GLMM as in Kizilkaya et al. [20] to develop heteroskedastic t-error LMM for BW or heteroskedastic error CTMM for CE. We have already pursued this work with DIC values clearly pointing to improved model fit based on these enhanced specifications for both BW and CE; this work will be reported in a future study.
The existence of potentially sizable residual and genetic correlations between BW and CE also facilitate more accurate genetic evaluations of sires for CE based on a bivariate threshold/linear multiple trait analysis with BW [23,42]. We believe the model that we have presented will be useful in facilitating a mixed effects extension to bivariate heteroskedastic error modeling for both BW and CE jointly using, say, inverted Wishart rather than inverted gamma structural prior specifications.