Inference for threshold models with variance components from the generalized linear mixed model perspective

L'analyse des modeles a seuils avec effets fixes et aleatoires et des composantes de variance correspondantes est ici placee dans la perspective des modeles lineaires mixtes generalises (GLMMs). Les parametres sont estimes par une procedure iterative, appelee minimum de vraisemblance restreinte repondere obtenu par iteration (IRREML). Cette procedure est une extension de l'algorithme iteratif des moindres carres reponderes pour les modeles lineaires generalises. Elle a l'avantage de suggerer immediatement une maniere d'etendre la methodologie habituelle du modele mixte aux GLMMs. Une application a des donnees de difficultes d'agnelage est presentee. IRREML peut etre mis en œuvre avec les logiciels standard disponibles pour les modeles lineaires mixtes normaux habituels. Le lien avec d'autres procedures d'estimation, par exemple l'approche du minimum a posteriori (MAP), est discute. Une comparaison par simulation avec une methode voisine montre un biais caracteristique du MAP et de l'IRREML pour l'heritabilite. Quand le nombre des effets fixes est diminue, a nombre total d'observations constant, le biais passe d'une valeur fortement positive a une valeur fortement negative, apparemment independantes de l'importance des effets fixes


INTRODUCTION
In his paper on sire evaluation Thompson (1979) already pointed out the potential interest for binomial data in modifying the generalized linear model (GLM) estimating equations to allow for random effects. He conjectured that if modification is feasible, generalization towards other distributions such as the Poisson or gamma distribution should be easy. The iterated re-weighted restricted maximum likelihood (IRREML) procedure (Schall, 1991;Engel and Keen, 1994) for generalized linear mixed models (GLMM) proves to be exactly such a modification. IRREML is motivated by the fact that in GLMMs the adjusted dependent variate in the iterated re-weighted least squares (IRLS) algorithm (McCullagh and Nelder, 1989, § 2.5) approximately follows an ordinary mixed-model structure with weights for the residual errors and, in the absence of underor overdispersion, residual error variance fixed at a constant value (typically 1). IRREML is quite flexible and not only covers a variety of underlying distributions for the threshold model but also easily extends to other types of data such as count data, for example, litter size. This entails simple changes in the algorithm with respect to link and variance function employed. When the residual error variance for the adjusted dependent variate is not fixed, it represents an additional underor overdispersion parameter which is a useful feature, for example, underor overdispersed Poisson counts. Calculations in this paper are performed with REML (Patterson and Thompson, 1971) facilities for ordinary mixed models in Genstat 5 (1993). Software for animal models such as DFREML (Meyer, 1989), after some modification, can be used for IRREML as well.
Methods for inference in ordinary normal data mixed models, eg, the Wald test (Cox and Hinkley, 1974, p 323) for fixed effects, are also potentially useful for GLMMs, as will be illustrated for the lambing difficulty data. Simulation results for the Wald test in a GLMM for (overdispersed) binomial data were presented in Engel and Buist (1995).
For threshold models with normal underlying distributions and known components of variance, Gianola and Foulley (1983) observe that their Bayesian maximum a posteriori (MAP) approach produces estimating equations for fixed and random effects such as those anticipated by Thompson. Under normality assumptions, for fixed components of variance, IRREML will be shown to be equivalent to MAP. IRREML therefore offers an alternative, non-Bayesian, derivation of MAP. The MAP approach was also presented in Harville and Mee (1984), including estimation of variance components. Their updates of the components of variance are akin to those of the estimation maximization (EM) algorithm (Searle et al, 1992, § 8.3) for REML. The algorithm presented in Engel and Keen (1994), which is used in this paper, is related to Fisher scoring. Both algorithms solve the same final estimating equations, but the latter is considerably faster than the former. Gilmour et al (1985) presented an iterative procedure for threshold models with normal underlying distributions, which also uses an adjusted dependent variate and residual weights. This approach, which will be referred to as GAR, is different from MAP and IRREML. In the terminology of Zeger et al (1988) MAP and IRREML are closely related to the subject-specific nature of the GLMM, while GAR is of a population-averaged nature, as will be explained in more detail in this paper.
A number of authors, eg, Preisler (1988), Im and Gianola (1988), and Jansen (1992), have discussed maximum likelihood estimation for threshold models. Apart from the fact that straightforward maximum likelihood estimation does not correct for loss of degrees of freedom due to estimation of fixed effects, as REML does in the conventional mixed model, it is also handicapped by the need for highdimensional numerical integration. Maximum likelihood estimation for models with several components of variance, especially with crossed random effects, is practically impossible. IRREML is more akin to quasi-likelihood estimation (McCullagh and Nelder, 1989, chap 9;McCullagh, 1991): conditional upon the random effects only; the relationship between the first 2 moments is employed while no full distributional assumptions are needed beyond existence of the first 4 moments.
Since practical differences between various methods proposed pertain mainly to their subject-specific or population-averaged nature, we will give some attention to a comparison between GAR and IRREML. Simulation studies were reported in Gilmour et al (1985), Breslow and Clayton (1993), Hoeschele and Gianola (1989), and Engel and Buist (1995). Conclusions from the Hoeschele and Gianola study differ from conclusions from the other studies with respect to bias of MAP/IRREML and GAR. Since the Hoeschele and Gianola study was rather modest in size, it was decided to repeat it here in more detail, ie under a variety of parameter configurations and for larger numbers of simulations.

GLMMs and threshold models
The GLMM model Suppose that random effects are collected in a random vector u, with zero means and dispersion matrix G, eg, for a sire model G = Ao, 8 2, where A is the additive relationship matrix and 0 & d q u o ; ; the sire component of variance. Conditional upon u, eg, for given sires, observations y are assumed independent, with variances proportional to known functions V of the means p: For binary data, y = 1 may denote a difficult birth and y = 0 a normal birth. The mean f.1, is the probability of a difficult birth for offspring of a particular sire. The conditional variance is Var(y!u) = V(p) = /l (1-p) and 0 equals 1. For proportions y = x/n, an appropriate choice may be: Parameter 0 may be included to allow for underor overdispersion relative to binomial variation (McCullagh and Nelder, 1989, § 4.5). Observe that [2] is inappropriate when n is predominantly small or large: for n = 1 no overdispersion is possible and 0 should equal 1 and for n -! oo, [2] vanishes to 0 while extrabinomial variation should remain. More complicated variances (Williams, 1982) may be obtained by replacing 0 in [1] by {1 + (n -1)!0{ or by {1 + (n -1)u5 f .1,(1-f.1,n. In both expressions ao is a variance corresponding to a source of overdispersion (for a discussion of underdispersion see Engel and Te Brake, 1993). Limits for n -> oo of the variances are o, 2 t L (1 -0 f .1,) and o, 0 2 /-t 2 (1_M)2 respectively. Both can be accomodated in a GLMM for continuous proportions, eg, motility of spermatozoa, and are covered by IRREML. The mean f.1, is related to a linear predictor by means of a known link function g: = g(f.1,). The linear predictor is a combination of fixed and random effects: ! = x l l3 + z'u, where x and z are design vectors for fixed and random effects collected in vectors 13 and u, respectively. For difficulty of birth, for instance, 77 may include main effects for parity of the dam and a covariable for birthweight as fixed effects and the genetic contribution of the sire as a random effect. Popular link functions for binary or binomial data are the logit and probit link functions: logit(p) = log(f.1,/(l-¡.¡,)) = q and probit (p) = !-1(p.) = 17 , where !-1 is the inverse of the cumulative density function (cdf) of the standard normal distribution.
The threshold model Suppose that r is the 'liability', an underlying random variable such that y = 1 when r exceeds a threshold value 0 and y = 0 otherwise. Without loss of generality it may be assumed that 0 = 0. Let 77 be the mean of r, conditional upon u. Furthermore, let the cdf of the residual e = (r &mdash; 7!), say F, be independent of u. Then where F-1 is the inverse of F. It follows that the threshold model is a GLMM with link function g( f .1,) = -F-1 (1f.1,), which simplifies to g(f.1,) = F-1 (f.1,) when e is symmetrically distributed. Residual e may represent variation due to Mendelian sampling and environment. Probabilities p do not change when r is multiplied by an arbitrary positive constant and the variance of e can be fixed at any convenient constant value, say or 2 . When F is the cdf L of the standard logistic distribution, ie F(e) = L(e) = 1/(1 + exp(-e)), g is the logit link and a 2 = !r2/3. When F is the cdf 4) of the standard normal distribution, g will be the probit link and Q Z = 1. Although the logistic distribution has relatively longer tails than the normal distribution, to a close approximation (Jonhson and Kotz, 1970, p 6): where c = (15/16)!/!. Results of analyses with a probit or logit link are usually virtually equivalent, apart from the scaling factor c for the effects and C 2 for the components of variance. Heritability may be defined on the liability scale, eg, for a sire model: h 2 = 4a;/(a;+a2). As a function of 0 ,2/or 2 heritability does not depend on the choice of !2. Hence, estimates h 2 for the probit and logit link are often about the same.

Conditional and marginal effects
In a GLMM, effects are introduced in the link-transformed conditional means, ie in the linear predictor q = g(p). Consequently, effects refer to subjects or individuals.
The GLMM and the threshold model are both subject-specific models, using the terminology of Zeger et al (1988). This is in contrast with a population-averaged model where effects are introduced in the link-transformed marginal means g(E(f.1,)) and refer to the population as a whole. In animal breeding, where sources of variation have a direct physical interpretation and are of primary interest, a subjectspecific model, which explicitly introduces these sources of variation through random effects, seems the natural choice. For fixed effects however, presentation in terms of averages over the population is often more appropriate. In the threshold model, there is no information in the data about the phenotypic variance of the liability, allowing QZ to be fixed at an arbitrary value. Intuitively one would expect the expressions for marginal effects to involve some form of scaling by the underlying phenotypic standard deviation. For normally distributed random effects and probit link this is indeed so. From r -N(x'j3, z'Gz + 1) the marginal probability, say p, follows directly: Hence, the probit link also holds for marginal probabilities, but the effects are shrunken by a factor Ap = (z'Gz + 1)-°!5. For a sire model Ap = (u £ + 1) -0 . 5 . That the same link applies for both conditional means f .1, and marginal means p is rather exceptional. For the logit link, the exact integral expression for p cannot be reduced to any simple form (Aitchison and Shen, 1980). However, from [3] it follows that the logit link holds approximately for p, with shrinkage factor A L = ((z'Gz/c 2 ) + 1 )-0 . 5 .
Without full distributional assumptions, for relatively small components of variance, marginal moments may also be obtained by a Taylor series expansion (see Engel and Keen, 1994).
Binary observations y i and y j corresponding to, for instance, the same sire will be correlated. For the probit link the covariance follows from: Here V2 (a, b; p) is the cdf of the bivariate normal distribution with zero means, unit variances and correlation coefficient p, p2! is the correlation on the underlying scale, eg, in a simple sire model Pij = or 8 2/( 0 ,2 + 1). For the logit link, using !3!, Ap should be replaced by !L/c, while the value of the correlation, expressed in terms of the components of variance in the logit model, is about the same. The double integral in !2 may effectively be reduced to a single integral (Sowden and Ashford, 1969), which can be evaluated by Gauss quadrature (Abramowitz and Stegun, 1965, p 924). Alternatively, for small p 2 , a Taylor expansion (Pearson, 1901;Abramowitz and Stegun, 1965, 26.3.29, p 940) may be used: where T ( t ) is the tth derivative of the probability density function (pdf) T of the standard normal distribution. For a sire model, under normality assumptions, the first-order approximation appears to be satisfactory, except for extreme incidence rates p (Gilmour et al, 1985). By grouping of n binary observations pertaining to the same fixed and random effect, moments for binomial proportions y immediately follow from !4), eg: where p is the intra-class correlation on the liability scale. Expression [6] can be simplified by using !5!. Results for the logit link follow from !3!.

Estimation of parameters
The algorithm for IRREML The algorithm will be described briefly. For details see Engel and Keen (1994) and Engel and Buist (1993a Here, X and Z are the design matrices for the fixed and random effects respectively, W is a diagonal matrix with weights w along the diagonal and l. denotes the vector of values of the adjusted dependent variate. 13 0 and u o are replaced by (3 and u, l. and w are updated and a new MINQUE step is performed. This is repeated until convergence. Note that MINQUE does not require full distributional assumptions beyond the existence of the first 4 moments and may be presented as a weighted least-squares method (Searle et al, 1992, Ch 12).

Some properties of IRREML
When the MMEs are expressed in terms of the original observations y, it is readily shown that at convergence the following equations are solved: where * denotes a direct elementwise (Hadamard) product. These equations are similar to the GLM equations for fixed u (with appropriate side conditions) except for the term 0 G-1 u on the right-hand side. Equations [8] may also be obtained by setting first derivatives with respect to elements of j3 and u of D + u'G-l u equal to zero, where D is the (quasi) deviance (see McCullagh and Nelder, 1989, § 2.3 and § 9.2.2.) conditional upon u. The assumption of randomness for u imposes a 'penalty' on values which are 'too far' from 0. When the pdf of observations y conditional upon normally distributed random effects u is in the GLM exponential family, eg, a binomial or Poisson distribution, maximization of D + u'G-l u is easily shown to be equivalent to maximization of the joint pdf of y and u.
Suppose that we have a sire model with q sires and sire variance component or 2 The IRREML estimating equations for o,2 s and 0 (see, for example, Engel, 1990) are: Here Z o = I, Z I is the design matrix for the sires, A o = W-1 , A l = A, P = SZ-1 _ [21 X(X'[2-1 X)-l X'[2-I and S2 = ZGZ' + cpW-1 . The difference with ordinary REML equations is that depends on the parameter values as well. The MINQUE/Fisher scoring update of IRREML can be recovered from !9!, by using P = P[2P 8 AZ'P + cpPW-1p on the left-hand side: where 0'5 = ø and at = as. When 0 is fixed at value 1, the equation for k' = 0 is dropped from (10!. Alternative updates related to the EM algorithm may also be obtained from [9] (see, for example, Engel, 1990), and will be of interest when other estimation procedures are discussed: Here T / ø is the part of the inverse of the MME coefficient matrix corresponding to u.

Application to birth difficulties in sheep
The data are part of a study into the scope for a Texel sheep breeding program in the Netherlands employing artificial insemination. Lambing difficulty will be analyzed as a binary variable: 0 for a normal birth and 1 for a difficult birth. Variability on the liability scale may depend on litter size. Therefore, observations corresponding to a litter size of 1 and litter sizes of 2 or more are analyzed separately.
Corresponding data sets are referred to as the S-set (single; 191 observations) and M-set (multiple; 483 observations). The M-set is reproduced in Engel and Buist (1993) and is available from the authors. Some summary statistics are shown in   [-3.5, 3.5!. In addition to fixed HYS effects, factors for age and parity of the dam (P), sex of the lamb (S), and for the M-set a covariate for litter size (L = litter size -2) and included. Levels for factor P consist of the following 6 combinations of age and parity: (1;1), (2;1), (2;2), (3; ! 2), (4; ! 3) and (> 5; ! 4). In models 3, 4 and 5 a factor D for pelvic dimension of the dam ('wide', 'normal' or 'narrow'), and in models 4 and 5 a covariate W = birthweightaverage birthweight of the lamb is also included, with separate averages of 4.27 and 3.63 for the Sand M-sets respectively.
Fixed effects may be screened by applying the Wald test to the values of ! saved from the last iteration step. Some results for the M-set are shown in Standard errors are in parentheses. S, P and D denote main effects, W and L are covariables, S x L and D x W are interactions. When an estimate is negative (-), the component is assumed to be negligible and set to 0. table III. In all cases, test statistics are calculated for the values of the variance components obtained for the corresponding full model, ie model 1-5. Variability due to estimation of the variance components is ignored. For each line in the table the corresponding test statistic accounts for effects above that line, but ignores effects below the line. Referring to a chi-square distribution, in model 1 seasonal effects seem to be unimportant and are excluded from the subsequently fitted models.
The effects refer to the probits of the conditional probabilities. For the probits of the marginal probabilities, effects have to be multiplied by (1+(J&dquo;!+(J&dquo;5)-0.5 = 0.769 (from table II). The difference between 'narrow' and 'wide', for example, becomes 1.55 (0.43). In model 5 for the M-set, separate coefficients for birthweight are fitted for the 3 levels of pelvic opening. The estimated coefficient for birthweight for a dam with a narrow pelvic opening is 0.72 (0.61); this is about 0.47 (0.63) higher than the estimates for the other 2 levels, which are about the same. Although a larger coefficient is to be expected for a narrow pelvic opening, the difference found is far from significant. Fitting a common coefficient, ie dropping the interaction D x W between pelvic opening and birthweight in model 4, gives an estimated coefficient for birthweight of 0.28 (0.15), which becomes 0.21 (0.12) after shrinkage. By comparison, the coefficient for the S-set, after shrinkage, is 0.92 (0.30). The reduced effect of birthweight for the M-set agrees with the negligible values found for the component of variance for sires.

Relation to other methods
We will mainly concentrate on differences between GAR and IRREML. GAR is based on QL for the marginal moments with a probit link and normally distributed random effects (see also Foulley et al, 1990). QL-estimating equations for dependent data are (McCullagh and Nelder, 1989 where R = diag({p(l -p) -pr 2 (ÀpXj3)}/n), B = diag(r(ÀpXj3)), C = A2G and p represents the appropriate intraclass correlation. Var(y) will be replaced by [12], ignoring terms of order p 2 and higher. The QL equations may be solved by iterated generalized least squares on an adjusted dependent variate, say ( GAR : where B is B evaluated at j3,. By comparison, the adjusted dependent variate from [7] is: The latter variate relates to a first-order approximation of the conditional means p, while the former relates to a first-order approximation of the marginal means p. Approximately Var(( GAR ) = B-l Var(y)B-1 = B-I RB-1 + ZCZ' = w G1 R +ZCZ', and Gilmour et al (1985) solve MMEs in terms of !GAR and WGAR! For example, in a sire model with q sires, predictions from these MMEs for Apu, say Û GAR , are used to update the intraclass correlation p. Analogous to !11! with <!=1: Both GAR and IRREML are based on an approximate mixed-model structure for an adjusted dependent variate. Note however that in GAR, in contrast to IRREML, predictions of the random effects are only used to update the components of variance and do not enter W GAR and ( GAR directly. At the end of this section we shall see that this may be an advantage for GAR over IRREML in situations where there is little information in the data about individual random effects, eg, in a sire scheme with small family sizes or extreme incidence. The marginal quasi-likelihood method (MQL) presented in Breslow and Clayton (1993) can be regarded as a generalization of GAR. However, in its more general setting, MQL does not have the benefit of some of the exact results for binomial data and underlying normal distributions used to derive GAR.
In MAP, assuming a vague prior for (3 and a normal prior for u, the posterior mean for (0', u')' is approximated by the posterior mode. Hence, (13 , û')' maximize the joint pdf of y and u. Under normality this is equivalent to maximization of a penalized deviance. Hence, the estimating equations for 13 and u for MAP and IRREML are the same and given by [8]. As shown in Foulley et al (1987) an estimator for, for instance, a sire component QS , may be solved from: Eo[ 8/8 u £ s log (f (u I or 2))] = 0. Here, f(.[.) denotes a (conditional) pdf for the variables indicated, and expectation E o is with respect to f (u!y, as). The latter pdf may be approximated by a normal density with mean u and dispersion T. In an iterative scheme this leads to the EM-type update !11!. Consequently, MAP and IRREML also solve the same final estimating equations with respect to the components of variance, although the algorithms used are different.
The penalized quasi-likelihood method (PQL) presented in Breslow and Clayton (1993) is based on Laplace integration. Random effects are assumed to be normally distributed. The log pdf of y conditional upon u is replaced by a quasi-likelihood. When parameter 0 is involved, the extended quasi-likelihood (Nelder and Pregibon, 1987) can be used, see Engel and Buist (1993). In the resulting integral expression, the logarithm of the integrand in terms of u is approximated by a quadratic function around its optimum in u, employing expected second-order derivatives. Now random effects are easily integrated out. Some further approximations, eg, approximating conditional deviance residuals by Pearson residuals, result in a normal log likelihood for the adjusted dependent variate ( from [7]. An REML type adjustment for loss of degrees of freedom due to estimation of fixed effects finally yields an REML log likelihood, which is maximized by Fisher scoring. Hence, although motivated quite differently, PQL is equivalent to IRREML (for details see Engel and Buist, 1993).
A comparison between GAR and MAP/P(!L/IRREML by simulation was presented in Gilmour et al (1985). They consider a simple one-way model, eg, a sire model with q unrelated sires, with n offspring per sire, and with a binary observation per offspring. An overall mean is the only fixed effect. Gilmour et al (1985) observe that there is a tendency for MAP/P(!L/IRREML to underestimate h 2 for small family size n or extreme incidence p. For extreme incidence p, GAR tends to overestimate h'. As also noted by Thompson (1990), for this simple set-up, closed-form expressions can be derived from [8] and !11!, viz, P GAR = {MS B -y (1y )}/{(n -1)T(!-1(y)z}. Here MS B is the mean sum of squares between sires for the binary observations and y is the overall mean. The adjusted dependent variable and weights are: ( GAR = !-1(y) + (y -y)/T(!'-1(y)) and W GAR = T(!-1(y))z/{!(1y) -/&dquo;'(!!07))!}-As long as the first-order approximation from [5] holds, and q is not too small, P GAR will be nearly unbiased.
Actually, for this simple scheme, GAR may be shown to be equivalent to Williams' method of moments for estimating overdispersion (Williams, 1982). Starting from u o = 0 and ?7o = !-1 (y), the adjusted dependent for IRREML is the same as for GAR, but the weights w = T(!-1(y))z/{y(1 -V)l are smaller! Consequently, the first estimate a 2 = {MS B -y(l -y)}/{nr(<p-l (y)) 2 } underestimates U s approximately by a factor (n -1)/n. Simulation results in Gilmour et al (1985) suggest that with further iteration underestimation by this factor persists. For extreme incidence, say close to 1, IRREML weights are approximately T[r(i¡) , as follows from Abramowitz and Stegun (1965, 26.2.12, p 932). Most of the information is in the negative-valued random effects, and over-shrinkage of random effects will yield smaller weights in the next iteration, ie 'residual variances' will be too high. Therefore, underestimation by IRREML may be expected to be more serious for extreme incidence, even when the first-order approximation from [5] still holds. Lack of information about individual random effects, because of small family size or extreme incidence, seems a serious problem with MAP/P(aL/IRREML. The smaller weights in IRREML are a consequence of 'residual variances' being derived from predicted values of conditional variances. Alternative weights w o = E(g'(f.1,) 2 V(f.1,))-1 for IR-REML are suggested in Engel and Keen (1994). For the logit link evaluation of alternative weights is straightforward: w o = {2 + 2exp( QZ/ 2) cosh(x'I3)}-B but for the probit link it is problematic. It is suggested that we use w o = {g' (ji)2 E(V (f.1,))) -1 instead. Note that in the first step for the simple sire scheme this weight equals WGAR . Simulation results in Breslow and Clayton (1993) also show a negative bias for MAP/P(aL/IRREML. However, Hoeschele and Gianola (1989) find a positive bias. Their simulation study concerns a sire model which includes 135 fixed HYS effects. MAP/PQL/IRREML has smaller bias and smaller mean-squared error than GAR. For both methods an upward bias in the order of 20% (based on 45 simulations) of the true h 2 = 0.25 is observed. An upward bias is also apparent for some of the models simulated in Hoeschele et al (1987). Simulation results for overdispersed binomial data in Engel and Buist (1995) only indicate a serious upward bias for variance components estimated by IRREML when the true component value is small and the non-negativity constraints are active.

Simulation results
Data are generated as described in Hoeschele and Gianola (1989)  (the value -0.10 in HG was assumed to be a typing error), 0.15 and 0.40. The numbers of sires in the groups are 12, 14, 13 and 11. The 50 independent sire effects are generated from an N(0, < 7 g) distribution, where or 2 = h 2 /(4 -h 2 ). The residuals for the liability are from an N(0, 1) distribution. The overall constant on the probit scale is -!-1(po) (1 + Q H YS + a!)0.5, where p o determines the overall incidence. Suppose that aH YS corresponds to a proportion / HYS of 1 + o h ys 2 + e r g, then QHYS = (4f HYS/ (1 -,I HYS ))/(4 -h 2 ). In HG, h 2 = 0.25, f HYS = 0.30, po = 0.9 and PHYS = 0.2. The expected total number of records is 2 025, with about 40 offspring per sire. This is configuration 10, which is presented with some of the other configurations of parameter values studied in table IV. For each configuration, either 1 or 2 series of 200 simulations are performed. In a series, HYS effects, sire-offspring configuration and data are all generated from a sequence of random numbers from the same seed. The first series corresponds to the same seed for all configurations. Therefore, for the first series, for example for configurations 3 and 8, the design is the same. Seeds of the second series are all different. In table IV bias (%bias) and root mean square error (8 l 4Q) are both expressed as a percentage of the true value of h 2 . In contrast with the results in Gilmour et al (1985) and in agreement with Hoeschele and Gianola (1989), with one exception, MAP/PGZL/IRREML shows a positive bias. The exception is configuration 13 where HYS effects are not included, neither in the generation of the data nor in the model fitted. The negative value for this configuration indicates that, although the bias seems fairly independent of the size of the fixed effects, it may depend on the (relative) number of fixed effects in the model. Table V shows that this is indeed so. Starting from the original HG scheme, using new seeds for generating the data, the number of fixed effects is reduced by factors 1/3, 1/2, 2/3 and 3/4. A reduction of 1/3, for instance, is effected by combination of HYS effects (2, 3), (5, 6), (8,9) and so forth, replacing the original values for The negative bias agrees with the calculations in the preceding section for the simple sire model and with the results in Gilmour et al (1985), where an overall constant was the only fixed effect in the model. Data for configurations 10 and 16 was also generated with random HYS effects, with a new set of HYS effects for each simulation, and analysed with HYS as a random factor in the model. Estimated bias was -35.8 (2.7)% for configuration 10 and -11.8 (1.7)% for configuration 16. Corresponding root mean square errors were 52.0 and 27.3%, respectively. Bias and root mean square error are reduced when the progeny size is increased, see, for example, configurations 18, 21 and 22. Estimators are also less biased when incidence is less extreme, eg, configurations 3 versus 8 and 1 versus 7. Differences between series within configurations are sometimes greater than would be expected on the basis of the standard errors involved, showing that the configuration of sires and offspring is also of importance.
Bias and root mean square error for estimated heritability are expressed as a percentage of the true value of h 2 . Standard errors in parentheses.
GAR was programmed, similar to MAP/PQL/IRREML, in terms of Fisher scoring for the updates of the sire variance. Estimates are the same as for the original method proposed, which uses EM steps, but speed of convergence is often increased by at least a factor 10. Some results, using the same seeds as for MAP/P(aL/IRREML, are presented in tables IV and V. For high incidence (po = 0.9) results of MAP/PQL/IRREML and GAR are comparable. For moderate incidence (p o = 0.6), the bias for GAR is clearly less than for MAP/P(aL/ IRREML. GAR also has smaller mean square error than MAP/P(aL/IRREML at the lower incidence, although the difference is less marked. In all cases, the root mean square error is considerably larger than the corresponding bias. Plots of the estimated h 2 values of MAP/P(aL/IRREML against GAR suggest a strong, fairly linear relationship. Correlations between sire predictions from MAP/P(aL/IRREML and GAR are high, eg, over 0.99 for the first series of configurations 10 and 16. Table V also includes results obtained for IRREML with the alternative weights Wo = T ( 1]?/ {p(l-p) -pT(Àp1])2} from the previous section. There is a distinct change in the bias due to the change of weights. However, although bias is considerably improved for the top lines in the table, underestimation for the bottom lines has become more serious.
In table VI, approximate standard errors of heritability estimates from MAP/ / PQL/IRREML are compared with standard errors estimated from the series of 200 simulations (referred to as empirical values). The approximate standard errors, based on Fisher information assuming normality for the adjusted dependent variate, perform quite well, as was also observed in Engel and Buist (1995). These standard errors are standard output in Genstat 5.
Mean, 10 and 90% percentage points of the approximate standard errors are presented as percentages of the empirical value.
The approximately linear relationship between IRREML and GAR estimates within a configuration of parameter values, and the similar standard errors, suggest that it should be possible to correct for bias in IRREML, at the least in those cases where GAR performs well. For incidence 0.90 and the full set of 135 HYS effects, IRREML and GAR have large positive bias of similar size. In this case the direction of the bias is in line with results for GLM S , where bias correction often involves shrinkage towards the origin. Possibly, the results of Cordeiro and McCullagh (1991), involving an extra iteration step for adjusted response variables, may be extended to minimization of the penalized deviance, thus improving estimates /3 and predictions u. This would imply modification of the adjusted dependent variate (. Users of MAP/P(aL/IRREML should be aware of the problems involved when family sizes are small, or in the context of an animal model, when many animals are only weakly related. With extreme incidence, say over 0.90, and a sizeable number of HYS effects, say in the order of 6% of the number of binary observations, MAP, PQL, IRREML and GAR are liable to seriously overestimate heritability, and actual selection response may be considerably less than expected on the basis of model calculations.