Alternative models for QTL detection in livestock. III. Heteroskedastic model and models corresponding to several distributions of the QTL effect

This paper describes two kinds of alternative models for QTL detection in livestock: a heteroskedastic model, and models corresponding to several hypotheses concerning the distribution of the QTL substitution effect among the sires (a fixed and limited number of alleles, or an infinite number of alleles). The power of different tests built with these hypotheses was computed under different situations. The genetic variance associated with the QTL was also estimated in some situations. The results showed small power differences between the different models, but large differences in the quality of the estimates. In addition, a model was built in a simplified situation to investigate the gain from using possible linkage disequilibrium.


INTRODUCTION
In theoretical papers dealing with QTL detection in livestock, the QTL effects are most often considered to be different across the sires i, and the residual variance within the QTL genotype as constant among the sires (e.g. [9, 10]). These hypotheses were made in the two previous papers about alternative models for QTL detection in livestock [4, 8]. In this third paper, these two sets of parameters are studied.
First, a heteroskedastic model with a residual variance $\sigma_{ei}^2$ specific to each sire i is evaluated. The rationale for this test is that it should be more robust against true heteroskedasticity, for instance when different alleles are segregating at a QTL other than the one under consideration. However, the power of the tests may be smaller than in the homoskedastic model if the homoskedastic model is correct. Different possibilities concerning the within-sire QTL substitution effect $a_i$ will also be considered: a fixed and limited number of alleles, or an infinite number of alleles. Taking these distributions of the QTL effect into account can increase the power of the tests if the model is correct, and decrease it if the model is incorrect. Therefore, the behaviour of the tests based on these different models will be compared under different situations concerning the distribution of the QTL effect. More specifically, the case of a biallelic QTL in linkage disequilibrium with the marker will be explored in greater detail.
Jansen et al. [6] also considered the same kind of model concerning the residual variances and the number of alleles, but did not compare the power of the tests. Coppieters et al. [3] also considered these kinds of models and compared the power of regression analysis and of a non-parametric approach.
Most hypotheses and notations are given in Elsen et al. [4]. To simplify the computations, all the comparisons were made using the most probable sire genotype $\hat{hs}_i = \arg\max_{hs_i} P(hs_i \mid M_i)$ and the linearised approximation of the likelihood described in the previous paper. All the simulations were made with 5 000 replications, and the length of the confidence interval for the simulated power was smaller than 1 %. When an analytical solution could not be found, we used a quasi-Newton algorithm to compute the maximum likelihood. The chromosome length was 1 Morgan, with 3 or 11 equally spaced markers, each with two alleles segregating at an equal frequency in the population.

EVALUATION OF A HETEROSKEDASTIC MODEL
In this section, the power of the T2 test built under a homoskedastic model [8] will be compared to the power of the T6 test built under a heteroskedastic model, where $\sigma_{ei}^2$ is used in place of $\sigma_e^2$ in the likelihood $\Lambda^{x,hs}$. This comparison will be made for both homoskedastic and heteroskedastic situations. The heteroskedastic situation will be modelled assuming the existence of an independent QTL, i.e. located on another chromosome. This QTL is assumed to be biallelic, with balanced frequencies (0.5) in the sire population and with an additive effect. Dams are homozygous for this QTL. Under this hypothesis, the within offspring residual variance is lower for sires homozygous for this QTL than for heterozygous sires. Powers were calculated considering an $H_0$ rejection threshold corresponding to a correct type I error, which is computed in the same situation, homoskedastic or heteroskedastic, with no QTL on the tested chromosome.
Table I concerns the true homoskedastic situation: the power of the T2 and T6 tests is given for different values of the number of markers (3 or 11), of the position of the QTL (0.05 or 0.35) and of the additive effect of the QTL (a = 0.5 or 1). The two possible QTL alleles had the same probability. Note that in this case, the QTL substitution effect equals the QTL additive effect.
Tables II and III concern true heteroskedastic situations. A QTL located on another chromosome was simulated with an additive effect $a_2$. The thresholds of the T2 and T6 tests are given in table II for different values of $a_2$ and for 20 sires, 50 progeny per sire and 11 markers. The results were obtained with 5 000 simulations. The power of the T2 and T6 tests is given in table III for different values of the linked QTL additive effect (a = 0.5 or 1.0), of the position of this linked QTL (x = 0.05 or 0.35) and of the independent QTL additive effect ($a_2$ = 0, 1, 1.5 or 2). For each QTL, the two possible alleles had the same probability.
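The way the independent QTL creates heteroskedasticity can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the text: a unit residual variance, and progeny of a heterozygous sire falling into two equally likely classes whose means differ by the additive effect $a_2$ (dams being homozygous). The function name `within_sire_residual_variance` is hypothetical.

```python
def within_sire_residual_variance(a2, sire_heterozygous, sigma2_e=1.0):
    """Residual variance among the progeny of one sire (illustrative only).

    Assumption: a heterozygous sire transmits one of two alleles with
    probability 1/2 each, shifting the progeny mean by +a2/2 or -a2/2,
    which adds (a2/2)**2 to the within-sire variance.
    """
    if not sire_heterozygous:
        return sigma2_e
    return sigma2_e + (a2 / 2.0) ** 2


# Values of a2 taken from the range studied in tables II and III.
for a2 in (0.0, 1.0, 1.5, 2.0):
    homo = within_sire_residual_variance(a2, sire_heterozygous=False)
    hetero = within_sire_residual_variance(a2, sire_heterozygous=True)
    print(f"a2={a2}: homozygous sire {homo:.4f}, heterozygous sire {hetero:.4f}")
```

Under these assumptions the residual variance of heterozygous sires doubles for $a_2 = 2$, which gives an idea of the degree of heteroskedasticity explored.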
In the true homoskedastic situation, and for a given number of sires and markers, the thresholds of the two tests appear to be very close to each other in all cases (data not shown), which is in agreement with the asymptotic theory of linear models. In a linear model, the asymptotic distribution of the Fisher test statistic is the same if the residual variance used in the denominator is replaced by any consistent estimate of this variance. The estimate of the residual variances in the model corresponding to the T6 test is consistent, as is the estimate in the other model. The thresholds given in table II show that the T6 test is not sensitive at all to the value of $a_2$, whereas T2 is slightly more sensitive. Using the threshold corresponding to $a_2 = 0$ when this is not true can lead to a type I error of 5.5 % instead of 5 %.
The power of the T6 test appears to be only slightly smaller than the power of the T2 test in the case of $\sigma_{ei} = \sigma_e$. This very small decrease is in agreement with the difference in power of an analysis of variance test when the number of degrees of freedom of the residual varies from 50 to 1000, i.e. from the number of progeny per sire to the total number of progeny.
The power of the T6 test is slightly larger than that of the T2 test only in cases where the QTL leading to heteroskedasticity has a large effect. Even in these cases, the differences between the power of the two tests remain small and of the same order as for homoskedastic situations, but with the opposite sign.
From these results, and considering that the tests based on the heteroskedastic model take a little less time to compute (about 5 %), the following tests will be based on this model.

VARIOUS NUMBERS OF ALLELES AT THE QTL LOCUS
In the previous papers [4, 8], QTL substitution effects $a_i$ were defined within each sire i. In this paper, two possible alternative situations concerning these effects are considered.
- A limited number of QTL alleles, and therefore a set of only a few possible values for $a_i$. In this case, the parameters are these values and the probability of QTL genotypes. This is the model used by Knott
- An infinite number of QTL alleles, with a normal distribution of the $a_i$.
In these two situations, we will consider that the QTL effects are independently and identically distributed between the sires.
In the two cases, the linearised version of the likelihood is obtained by integrating over the distribution of the $a_i$, where $f(a_i)$ denotes the density of this distribution. In the situation with two possible alleles at the QTL locus, the likelihood becomes a finite mixture, where $p^a = P(a_i = a) = P(a_i = -a)$ and $a$ are the two parameters of the distribution.
In the situation with a normal distribution of the QTL effect, the density $f(a_i)$ is the normal density $\phi(a_i; 0, \sigma_a^2)$ and the likelihood is written as $\Lambda^{x,hs}$(normal).
The test built with the likelihood $\Lambda^{x,hs}$(two alleles) will be denoted T7, and the test built with the likelihood $\Lambda^{x,hs}$(normal) will be denoted T8.
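As a hedged sketch only, the structure of these likelihoods can be written in a generic marginal form. The symbol $\Lambda_i(a_i)$ is a hypothetical shorthand (not the original notation) for the linearised likelihood of the progeny of sire i given its substitution effect, $v_k$ denotes the possible values of $a_i$ in the case of a limited number of alleles, and $\sigma_a^2$ denotes the variance of the normal distribution of the $a_i$. The product over sires follows from the assumption that the effects are independently and identically distributed between the sires.

```latex
% General form: integrate the within-sire likelihood over the density f
\[
  \Lambda = \prod_{i} \int \Lambda_i(a_i)\, f(a_i)\, \mathrm{d}a_i
\]
% Limited number of alleles: the integral becomes a finite mixture
\[
  \Lambda(\text{limited number of alleles})
    = \prod_{i} \sum_{k} P(a_i = v_k)\, \Lambda_i(v_k)
\]
% Infinite number of alleles: normal density with mean 0 and variance sigma_a^2
\[
  \Lambda(\text{normal})
    = \prod_{i} \int \Lambda_i(a_i)\, \phi(a_i; 0, \sigma_a^2)\, \mathrm{d}a_i
\]
```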
In table IV, the T7 and T8 test thresholds are given for different situations concerning the number of markers and the number of progeny per sire. In table V, the power of the T6, T7 and T8 tests is presented for two kinds of situations. In the first, the QTL had two possible equiprobable ($p^a = 1/2$) alleles with no dominance and an additive effect a. The QTL substitution effect $a_i$ for each sire i is therefore 0 with a probability of 1/2 and a with a probability of 1/2. We have $E(a_i^2) = a^2/2$. The QTL variance due to the sire in the progeny of i is $a_i^2/4$, and globally $\sigma_s^2 = E(a_i^2/4) = a^2/8$. In the second, each value $a_i$ was drawn at random from a normal distribution of null expectation and variance $\sigma_a^2 = a^2/2$. Therefore, $E(a_i^2) = a^2/2$ and $\sigma_s^2 = E(a_i^2/4) = a^2/8$ as in the first case. The results are presented for different values of the parameters.
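The equality of the QTL variance due to the sire in the two situations can be checked numerically. The sketch below is only an illustration: the function names, the number of simulated sires and the seeds are hypothetical choices, and the two-allele case follows the text in drawing each substitution effect as 0 or a with probability 1/2.

```python
import random
import statistics


def sire_variance_two_alleles(a, n_sires, rng):
    """Monte Carlo mean of a_i**2 / 4 when a_i is 0 or a with probability 1/2."""
    effects = [a if rng.random() < 0.5 else 0.0 for _ in range(n_sires)]
    return statistics.fmean([x * x / 4.0 for x in effects])


def sire_variance_normal(a, n_sires, rng):
    """Monte Carlo mean of a_i**2 / 4 when a_i is normal with variance a**2 / 2."""
    sd = (a * a / 2.0) ** 0.5
    effects = [rng.gauss(0.0, sd) for _ in range(n_sires)]
    return statistics.fmean([x * x / 4.0 for x in effects])


rng = random.Random(1)
for a in (0.5, 1.0):
    expected = a * a / 8.0  # theoretical value a^2 / 8
    two = sire_variance_two_alleles(a, 200_000, rng)
    normal = sire_variance_normal(a, 200_000, rng)
    print(f"a={a}: expected {expected:.5f}, two alleles {two:.5f}, normal {normal:.5f}")
```

Both simulated values should be close to $a^2/8$, i.e. 0.03125 for a = 0.5 and 0.125 for a = 1.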
It is interesting to note that the thresholds are appreciably smaller than the thresholds presented in table II. This is due to the fact that there is only one parameter for the QTL effect in T7 and T8, and 20 in T6. The differences between the two kinds of thresholds can be compared with the difference between the 95 % quantile of the $\chi^2$ distribution with 1 degree of freedom, 3.84, and the 95 % quantile of the $\chi^2$ distribution with 20 degrees of freedom, 31.41.
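The two quantiles quoted above can be recomputed with the standard library alone. This sketch uses two textbook facts rather than anything specific to the paper: a chi-square variable with 1 degree of freedom is the square of a standard normal variable, and for an even number of degrees of freedom the chi-square distribution function has a closed form as a finite sum. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def chi2_cdf_even_df(x, df):
    """Chi-square CDF for even df: 1 - exp(-x/2) * sum_{k < df/2} (x/2)**k / k!."""
    if df % 2 != 0:
        raise ValueError("closed form only valid for an even number of df")
    half = x / 2.0
    term, total = 1.0, 0.0
    for k in range(df // 2):
        if k > 0:
            term *= half / k  # builds (x/2)**k / k! incrementally
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_quantile_even_df(p, df, lo=0.0, hi=200.0):
    """Invert the CDF by bisection on [lo, hi]."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even_df(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


q1 = NormalDist().inv_cdf(0.975) ** 2  # 95 % quantile, 1 degree of freedom
q20 = chi2_quantile_even_df(0.95, 20)  # 95 % quantile, 20 degrees of freedom
print(round(q1, 2), round(q20, 2))  # prints 3.84 31.41
```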
The main and rather surprising result was that the power of T6 is always larger than or equal to the power of the other tests.
In order to compare the T6 and T7 tests more thoroughly when the model really has two alleles, a very large number of simulations were performed in a simplified situation. A very informative marker, completely linked to the QTL, was assumed to exist, and the residual variance was assumed to be known (20 sires and 50 progeny per sire). The T6 and T7 tests were simplified accordingly. The T6 test was found to be more powerful (with a difference of 3-4 %) than the T7 test for $0.1 < p^a < 0.9$, and T7 was more powerful (with the same differences) than T6 for the other values of $p^a$. This confirms that the loglikelihood ratio test is not the most powerful test in mixture situations for all values of the alternative parameters. Andrews and Ploberger [1, 2] showed that the loglikelihood ratio test is admissible but not optimal in cases, such as mixture models, where a parameter disappears under the null hypothesis (here the probability of having one of the two alleles). We tried a value $p^a = 0.05$ in the general framework with $m_d = 50$, L = 11, a = 0.5, but unfortunately the T6 test remained more powerful (with a difference of 2 %) than the T7 test.
Concerning the comparison between T6 and T8 in situations where the QTL effect is normally distributed, it is clear in such simple and balanced situations that both T6 and T8 are asymptotically equivalent to the test based on the value of $\sum_i \hat{a}_i^2$, where the $\hat{a}_i$ are the maximum likelihood estimators of the QTL substitution effects. Therefore, their powers should have been nearly the same. The relatively poor performance of T8 is perhaps partially due to numerical problems, because in some cases (2 %) the algorithm had difficulty converging and the corresponding simulations were excluded from the results.
The estimation of the QTL variance due to the sire, $\sigma_s^2$, obtained with the different models is shown in table VI. With the models used in T6 and T7, this estimation is obtained as a function of the estimates of the $a_i$ or of a; with T8, it is estimated directly. The value 0.03125 (resp. 0.125) of $\sigma_s^2$ corresponds to the values a = 0.5 and $\sigma_a^2$ = 0.125 (resp. 1.0 and 0.5).
It appears that the estimator obtained using T8 is the only nearly unbiased estimator of the QTL variance due to the sire. The bias is very large when using the other tests. A practical solution would be to use the simple T6 test to detect a QTL and to use the estimate associated with T8 when a QTL is detected.

BETWEEN SIRES LINKAGE DISEQUILIBRIUM
To investigate the usefulness of using a model including a linkage disequilibrium between markers and QTL alleles at the between sires level, a simplified situation, which mimics the real situation, but which is considerably easier to compute, was considered.
The QTL is supposed to be located on a marker locus, with all 20 sires considered to be heterozygous A/B for this marker. The dams are considered as carrying other alleles and therefore all the progeny are informative. We denote $Y_A(i)$ (resp. $Y_B(i)$) the mean of the $n_A(i)$ (resp. $n_B(i)$) progeny of sire i carrying allele A (resp. B). The two possible alleles at the QTL are denoted Q, with an additive effect of a/2, and q, with an additive effect of -a/2. The model for the expectation of $Y_A(i)$ and $Y_B(i)$ depends on the QTL alleles carried by sire i together with marker alleles A and B. The variability around this expectation will be considered as normally distributed, with mean 0 and variance $\sigma^2/n_A(i)$ (resp. $\sigma^2/n_B(i)$), with $\sigma^2$ assumed to be known. We will consider two tests: the analysis of variance test, which corresponds to the model $E(Y_A(i)) - E(Y_B(i)) = a_i$ without any assumption concerning the distribution of the $a_i$, and the likelihood ratio test corresponding to the mixture model concerning the sire allele. The first test is analogous to test T6 and will be denoted T6', and the second, analogous to test T7, will be denoted T7'. This is only an analogy because the residual variance is assumed to be known, all the progeny are informative and the tests are computed only on the marker.
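The first of these two tests can be sketched as follows. With a known variance, one natural form of the analysis of variance statistic is the sum over sires of the squared difference between the two progeny means divided by its variance, which follows a chi-square distribution with 20 degrees of freedom under the null hypothesis. This form, the threshold of 31.41 and all names and default values below are assumptions made for illustration, not formulas taken from the paper.

```python
import random


def t6_prime(mean_a, mean_b, n_a, n_b, sigma2=1.0):
    """Sum over sires of (Y_A - Y_B)**2 / Var(Y_A - Y_B), with known sigma2."""
    stat = 0.0
    for ya, yb, na, nb in zip(mean_a, mean_b, n_a, n_b):
        var_diff = sigma2 * (1.0 / na + 1.0 / nb)
        stat += (ya - yb) ** 2 / var_diff
    return stat


def null_rejection_rate(n_sires=20, n_per_allele=25, threshold=31.41,
                        n_rep=2000, seed=0):
    """Fraction of simulated data sets without QTL whose statistic exceeds the threshold."""
    rng = random.Random(seed)
    sd_mean = (1.0 / n_per_allele) ** 0.5  # sd of a progeny mean when sigma2 = 1
    counts = [n_per_allele] * n_sires
    rejected = 0
    for _ in range(n_rep):
        ya = [rng.gauss(0.0, sd_mean) for _ in range(n_sires)]
        yb = [rng.gauss(0.0, sd_mean) for _ in range(n_sires)]
        if t6_prime(ya, yb, counts, counts) > threshold:
            rejected += 1
    return rejected / n_rep


print(null_rejection_rate())  # expected to be close to 0.05
```

The mixture test T7' would replace the free differences $a_i$ by a small number of parameters, which is the source of the lower thresholds noted for T7.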
The powers of these two tests for $\sigma^2 = 1$, a = 0.5, with different numbers of informative progeny ($n_A(i) + n_B(i)$ constant across the sires) and different values of the parameters $p_1$ and $p_2$, are given in table VII. Note that 25 informative progeny would correspond to the mean number of informative progeny for 50 dams and a single biallelic marker.
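The figure of 25 informative progeny can be checked by enumeration. The sketch assumes (this is not stated in the text) that a progeny is informative only when its own marker genotype is homozygous, so that the allele received from the heterozygous sire is identified without using dam genotypes, and that the two marker alleles have equal frequencies in the dams. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def informative_fraction(freq_a=Fraction(1, 2)):
    """Probability that a progeny of an A/B sire is homozygous at the marker."""
    sire_alleles = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
    dam_alleles = {"A": freq_a, "B": 1 - freq_a}
    total = Fraction(0)
    for (s, p_s), (d, p_d) in product(sire_alleles.items(), dam_alleles.items()):
        if s == d:  # homozygous progeny: the sire allele is unambiguous
            total += p_s * p_d
    return total


print(50 * informative_fraction())  # prints 25
```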
It appears that the use of a model with a linkage disequilibrium can increase the power if there really is a linkage disequilibrium (that is, a large difference between $p_1$ and $p_2$), but can decrease it when the linkage disequilibrium is small. These results depend heavily, however, on the hypotheses made in this simplified situation:
- QTL location knowledge; this knowledge increases the power of the two tests but perhaps does not change the difference between the two tests.
- The females do not carry either of the sire's alleles; it is not a very realistic situation, but it leads to easier computations and one can think that it does not change the power difference between the two tests.
- The use of a completely linked marker; it is considerably more difficult to build a model with one or several partially linked markers and the gain in using this information would be smaller than the gain presented in table VII.

CONCLUSIONS
In many situations, the power of the simple T6 test, which is easier and faster to compute, is equal to or slightly better than the power of the other tests. This result could be specific to QTLs of small effect. In the present study, we focused on QTL effects of such a relatively small magnitude because, with QTLs of larger effect, all the tests would have had the same power, namely one. For QTLs with large effects, the comparison should rely upon criteria other than power, such as the length of the confidence interval of the QTL location. Nevertheless, the T8 test is appreciably better than the other tests in estimating the QTL variance.
The model using a linkage disequilibrium can lead to more power in some situations. Nevertheless, it is of interest only if one can be sure that there is really a linkage disequilibrium. The other problem for the use of this model is the extension to a general situation where the QTL is not located on a marker.