A note on mate allocation for dominance handling in genomic selection
Genetics Selection Evolution volume 42, Article number: 33 (2010)
Estimation of non-additive genetic effects in animal breeding is important because it increases the accuracy of breeding value prediction and the value of mate allocation procedures. With the advent of genomic selection these ideas should be revisited. The objective of this study was to quantify the efficiency of including dominance effects and practising mating allocation under a whole-genome evaluation scenario. Four strategies of selection, carried out during five generations, were compared by simulation techniques. In the first scenario (MS), individuals were selected based on their own phenotypic information. In the second (GSA), they were selected based on the prediction generated by the Bayes A method of whole-genome evaluation under an additive model. In the third (GSD), the model was expanded to include dominance effects. These three scenarios used random mating to construct future generations, whereas in the fourth one (GSD + MA), matings were optimized by simulated annealing. The advantage of GSD over GSA ranges from 9 to 14% of the expected response and, in addition, using mate allocation (GSD + MA) provides an additional response ranging from 6% to 22%. However, mate selection can improve the expected genetic response over random mating only in the first generation of selection. Furthermore, the efficiency of genomic selection is eroded after a few generations of selection, thus, a continued collection of phenotypic data and re-evaluation will be required.
Estimation of non-additive genetic effects in animal breeding is important because ignoring these effects produces less accurate estimates of breeding values and alters the ranking of animals on breeding value. As a consequence, including these effects yields more accurate predictions and, therefore, a greater genetic response. This potential increase in genetic response is about 10% for traits with a low heritability, a high proportion of dominance variance, a low selection intensity and a high percentage (>20%) of full-sibs.
However, dominance effects have rarely been included in genetic evaluations. The reasons usually given are the greater computational complexity and the inaccuracy of variance component estimates (it is commonly believed that 20 to 100 times more data are required, including a high proportion of full-sibs). It has also been claimed that there is little evidence of non-additive genetic variance in the literature (see for example ). However, although estimates are scarce, dominance variance usually amounts to about 10% of the phenotypic variance. Furthermore, an extensive review has reported estimates of the ratio of additive to dominance variance in wild species, i.e. about 1.17 for life-history traits, 1.06 for physiological traits and 0.19 for morphological traits. In the same study, the estimate of this ratio for domestic species was 0.80.
Moreover, mating plans (or mating allocations) have been used in animal breeding for several reasons: a) to control inbreeding; b) in situations where economic merit is not linear; c) when there is an intermediate optimum (or restricted traits); d) to increase connection among herds and, finally, e) to profit from dominance genetic effects. With respect to the last point, it is well known that every methodology intended to exploit non-additive effects [6–8] must consider two types of mating: a) matings from which the population will be propagated; b) matings to obtain commercial animals. Among all the methodologies aimed at profiting from dominance, mating allocation could be the easiest option. Optimal mating allocation relies on the idea that, although selection should be carried out on estimated additive breeding values, animals used for commercial production should be the product of planned matings that maximize the overall (additive plus dominance) genetic merit of the offspring. Mating allocation profits from dominance when the commercial population is constructed, but only additive effects are transmitted to the next generation.
Although not considered here, other ideas could be used to exploit dominance in later generations. The key idea is that selection should be applied not only to individuals but also to matings. Although it is usually thought that applying these ideas requires two separate lines, as in classical crossbreeding programmes or in so-called reciprocal recurrent selection, it can be carried out in a single population [6, 7]. Furthermore, a 'super-breed' model can be implemented to exploit both across- and within-breed dominance variances.
With the recent availability of very dense SNP panels and the advent of genomic selection, it seems natural that methods using dominance variation should be revisited. The aim of this study was to quantify the efficiency of mating allocation under a whole-genome evaluation scenario in terms of genetic response to selection in the first and subsequent generations.
A population was simulated for 1000 generations at an effective size of 100. After 1000 generations, the actual size of the population was increased to 1000 (500 per sex) and remained at 1000 for three discrete, consecutive generations. During the whole process, each individual was generated from one gamete of a random father and one gamete of a random mother. The data set for the estimation of marker effects therefore consisted of the 3000 individuals from the last three generations (1001, 1002 and 1003). These individuals were genotyped and phenotyped and then used as the training population to estimate the additive and dominance effects of the SNP.
The genome was assumed to consist of 10 chromosomes, each 100 cM long, with 1000 loci per chromosome (i.e. a total of 9000 SNP plus 1000 QTL) located at random map positions. Both SNP and QTL were biallelic. Mutations were generated at a rate of 2.5 × 10⁻³ per locus per generation at the marker loci and at a rate of 2.5 × 10⁻⁵ at the QTL. These mutation rates, taken from  are unrealistic but they seem to provide a reasonable level of segregation after only 1000 generations. Both the additive and the dominance effects were sampled from a standard normal distribution and scaled to obtain the desired values of h2 (VA/VP) and d2 (VD/VP), where VA, VD and VP are the additive, dominance and phenotypic variances as defined in, for example . The simulation of additive and dominance effects was somewhat simplistic, because it is known that the distribution of additive effects is leptokurtic and that the distribution of dominance effects depends on the additive effects. In generation 1, about half of the loci were fixed for allele 1 and the other half were fixed for allele 2.
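The variance definitions used for the scaling can be sketched as follows. This is a minimal illustration that assumes the standard single-locus expressions summed over loci in linkage equilibrium; the function names are hypothetical, and the text does not describe how the authors actually performed the scaling.

```python
import numpy as np

def genetic_variances(p, a, d):
    """Additive and dominance variance summed over loci (linkage equilibrium assumed)."""
    p, a, d = (np.asarray(x, dtype=float) for x in (p, a, d))
    q = 1.0 - p
    alpha = a + d * (q - p)                 # average effect of allele substitution
    va = np.sum(2 * p * q * alpha ** 2)     # V_A = sum of 2pq * alpha^2
    vd = np.sum((2 * p * q * d) ** 2)       # V_D = sum of (2pq * d)^2
    return va, vd

def residual_variance(va, vd, h2):
    """Residual variance that makes V_A / V_P equal to the target h2."""
    return va / h2 - va - vd

# One locus with p = 0.5 and a = d = 1 (illustrative values only)
va, vd = genetic_variances([0.5], [1.0], [1.0])
ve = residual_variance(va, vd, h2=0.20)
d2 = vd / (va + vd + ve)
```

With these illustrative values VA = 0.5 and VD = 0.25, so a residual variance of 1.75 gives h2 = 0.20 and d2 = 0.10. In general the effects themselves must also be rescaled to hit both targets at once.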
Model of analysis
For simplicity, estimation of marker effects was carried out using a Bayes A method  with two alternative models:
The first model assumed that the phenotypic value of individual j (j = 1, ... N) is
where p is the number of SNP and xij are indicator functions that take the values 1, 0, -1 for the SNP genotypes AA, Aa and aa at each locus, respectively. The assumed distributions for each additive effect (ai) and residual (ej) were:
The prior distribution of the variances was the scaled inverted chi-square distribution:
where S is a scale parameter and v is the number of degrees of freedom. The values of v = 4.012 and S = 0.0020 were taken from .
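For reference, one way of writing the additive model that is consistent with the definitions above is given here. It is a reconstruction under assumptions, not the authors' own display: in particular the general mean \mu and the locus-specific variance notation are assumed, following the usual Bayes A formulation.

```latex
y_j = \mu + \sum_{i=1}^{p} x_{ij} a_i + e_j, \qquad
a_i \sim N\left(0, \sigma^2_{a_i}\right), \qquad
e_j \sim N\left(0, \sigma^2_e\right), \qquad
\sigma^2_{a_i} \sim \chi^{-2}(v, S)
```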
The second model additionally included a dominance effect for each SNP:
where wij are indicator functions that take the values 0, 1, 0 for the SNP genotypes AA, Aa and aa, respectively. The assumed distribution for each dominance effect (di) was:
The prior distribution of the variances of the dominance effects was the scaled inverted chi-square distribution
where S is a scale parameter and v is the number of degrees of freedom. As before, the values of v = 4.012 and S = 0.0020 were assumed.
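Under the same assumptions (a general mean \mu and locus-specific variances, neither confirmed by the surviving text), the model with dominance could be written as:

```latex
y_j = \mu + \sum_{i=1}^{p} x_{ij} a_i + \sum_{i=1}^{p} w_{ij} d_i + e_j, \qquad
d_i \sim N\left(0, \sigma^2_{d_i}\right), \qquad
\sigma^2_{d_i} \sim \chi^{-2}(v, S)
```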
Gibbs sampling based on posterior distributions conditional on other effects was implemented for estimation by averaging the samples from 10,000 cycles, after discarding the first 1,000.
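A single-site Gibbs sampler of this kind can be sketched as below. It is an illustrative implementation rather than the authors' program. The function name `bayes_a`, the flat priors on the mean and the residual variance, and the parameterisation of the scaled inverted chi-square as scale divided by a chi-square draw are all assumptions of the sketch.

```python
import numpy as np

def bayes_a(y, X, W=None, v=4.012, S=0.0020, n_iter=10000, burn_in=1000, seed=0):
    """Posterior means of marker effects from a single-site Bayes A Gibbs sampler.

    X: (N, p) additive codes (1, 0, -1); W: optional (N, p) dominance codes (0, 1, 0).
    Returns p additive effects, followed by p dominance effects when W is given.
    """
    rng = np.random.default_rng(seed)
    Z = (X if W is None else np.hstack([X, W])).astype(float)
    n, m = Z.shape
    zz = (Z ** 2).sum(axis=0)          # sum of squares of every covariate
    b = np.zeros(m)                    # current marker effects
    var_b = np.full(m, S)              # one variance per effect (Bayes A)
    mu, var_e = y.mean(), y.var()
    resid = y - mu - Z @ b
    post_sum = np.zeros(m)
    for it in range(n_iter):
        # General mean, flat prior (assumption of this sketch)
        resid += mu
        mu = rng.normal(resid.mean(), np.sqrt(var_e / n))
        resid -= mu
        for k in range(m):
            if zz[k] == 0.0:
                # Covariate carries no information: draw from the prior
                b[k] = rng.normal(0.0, np.sqrt(var_b[k]))
            else:
                resid += Z[:, k] * b[k]
                lhs = zz[k] + var_e / var_b[k]
                b[k] = rng.normal(Z[:, k] @ resid / lhs, np.sqrt(var_e / lhs))
                resid -= Z[:, k] * b[k]
            # Full conditional of the effect variance: (S + b^2) / chi2(v + 1)
            var_b[k] = (S + b[k] ** 2) / rng.chisquare(v + 1)
        # Residual variance, flat prior on the variance (assumption)
        var_e = (resid @ resid) / rng.chisquare(n - 2)
        if it >= burn_in:
            post_sum += b
    return post_sum / (n_iter - burn_in)
```

The defaults mirror the 10,000 cycles and 1,000 discarded samples mentioned above. For the 9000 SNP of the simulation, the inner loop would normally be written in a compiled language.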
Prediction of breeding values
From the estimates of additive and dominance effects, breeding values (ui) were calculated, according to , for each individual in both models:
where wij is an indicator function of the genotype of the jth marker of the ith individual that takes the values 1, 0, -1 when the genotypes are AA, Aa or aa, respectively. Moreover, pj and qj are the frequencies of alleles A and a for the jth marker in the training population, and αj is the average effect of substitution for the jth marker, calculated as αj = aj under model a) and αj = aj + dj (qj - pj) under model b).
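The calculation can be illustrated with the following sketch. It assumes the standard single-locus breeding values 2qα, (q - p)α and -2pα for genotypes AA, Aa and aa, summed over markers; the function name and argument layout are hypothetical.

```python
import numpy as np

def breeding_values(G, p, a, d=None):
    """Sum over markers of single-locus breeding values.

    G: (n, m) genotype codes 1, 0, -1 for AA, Aa, aa.
    p: frequency of allele A at each marker in the training population.
    a, d: estimated additive and (optional) dominance marker effects.
    """
    G = np.asarray(G)
    p = np.asarray(p, dtype=float)
    a = np.asarray(a, dtype=float)
    q = 1.0 - p
    # Average effect of substitution: a under model a), a + d(q - p) under model b)
    alpha = a if d is None else a + np.asarray(d, dtype=float) * (q - p)
    per_marker = np.where(G == 1, 2 * q * alpha,
                          np.where(G == 0, (q - p) * alpha, -2 * p * alpha))
    return per_marker.sum(axis=1)

# One marker with p = 0.25, a = 1 and d = 1 (illustrative values only)
u = breeding_values([[1], [0], [-1]], p=[0.25], a=[1.0], d=[1.0])
```

Here α = 1 + 1 × (0.75 - 0.25) = 1.5, so the three genotypes receive breeding values of 2.25, 0.75 and -0.75.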
Prediction of genotype effects of future matings
The predicted performance of a future mating (Gij) between the ith and jth individuals is given by:
where pr ijk (AA), pr ijk (Aa) and pr ijk (aa) are the probabilities of the genotypes AA, Aa and aa for the combination of the ith and jth individual and the kth marker.
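A sketch of this prediction follows. It assumes that the expected genotypic value at each marker is pr(AA)·a + pr(Aa)·d - pr(aa)·a, in line with the genotype coding of the models above, and that offspring genotype probabilities follow Mendelian transmission from the parental genotypes. The function name is hypothetical.

```python
import numpy as np

def mating_values(sires, dams, a, d):
    """Expected genotypic value of the offspring of every sire x dam pair.

    sires: (n_sires, m) and dams: (n_dams, m) genotype codes 1, 0, -1 for AA, Aa, aa.
    Returns an (n_sires, n_dams) matrix.
    """
    s = (np.asarray(sires, dtype=float) + 1) / 2   # P(sire transmits A): 1, 0.5, 0
    t = (np.asarray(dams, dtype=float) + 1) / 2    # P(dam transmits A)
    a = np.asarray(a, dtype=float)
    d = np.asarray(d, dtype=float)
    # pr(AA) - pr(aa) = s + t - 1 at each marker
    additive = (s @ a)[:, None] + (t @ a)[None, :] - a.sum()
    # pr(Aa) = s + t - 2st at each marker
    dominance = (s @ d)[:, None] + (t @ d)[None, :] - 2 * (s * d) @ t.T
    return additive + dominance

# One marker with a = 2 and d = 0.5; sires AA and Aa, dams aa, Aa and AA
G = mating_values([[1], [0]], [[-1], [0], [1]], a=[2.0], d=[0.5])
```

Writing the sums as matrix products avoids building a sires × dams × markers array, which matters with 25 sires, 250 dams and thousands of markers.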
Generation 1004 was formed from 25 sires and 250 dams selected from generation 1003. Two strategies of selection, carried out over five generations, were compared. In the first strategy, 25 males and 250 females were selected from 500 males and 500 females based on the breeding values predicted from the marker effects estimated under models a) and b), denoted GSA and GSD, respectively. They were then mated randomly (10 dams per sire) and four sibs were obtained from each mating; the true genotypic values of the offspring were calculated.
In the second strategy (GSD + MA), from the 6250 (25 × 250) possible matings, we chose the best 250 based on the predicted performance of each mating (Gij), and we generated four new individuals from each chosen mating. The true genotypic values of the offspring were also calculated. The search algorithm used was simulated annealing.
Finally, phenotypic selection was also carried out as a control, and we replicated the selection strategies by considering the true QTL as markers and by taking the simulated additive and dominance effects of the QTL as known.
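The search over matings can be sketched with a simple simulated annealing routine. The constraints used here (each dam mated once, each sire mated to the same number of dams), the swap move and the geometric cooling schedule are assumptions made for illustration; the text does not specify them.

```python
import numpy as np

def allocate_matings(G, dams_per_sire, n_steps=20000, t0=1.0, cooling=0.9995, seed=0):
    """Simulated annealing over assignments of dams to sires.

    G: (n_sires, n_dams) matrix of predicted mating values.
    Returns the best assignment found (sire index for each dam) and its total value.
    """
    rng = np.random.default_rng(seed)
    n_sires, n_dams = G.shape
    assert n_dams == n_sires * dams_per_sire
    dams = np.arange(n_dams)
    # Random start with the required number of dams per sire
    sire_of = rng.permutation(np.repeat(np.arange(n_sires), dams_per_sire))
    current = G[sire_of, dams].sum()
    best, best_value, temp = sire_of.copy(), current, t0
    for _ in range(n_steps):
        # Proposal: swap the sires of two dams (keeps the dams-per-sire counts)
        i, j = rng.choice(n_dams, size=2, replace=False)
        si, sj = sire_of[i], sire_of[j]
        delta = G[sj, i] + G[si, j] - G[si, i] - G[sj, j]
        # Metropolis rule: always accept improvements, sometimes accept losses
        if delta >= 0 or rng.random() < np.exp(delta / temp):
            sire_of[i], sire_of[j] = sj, si
            current += delta
            if current > best_value:
                best, best_value = sire_of.copy(), current
        temp *= cooling
    return best, best_value
```

For the design described above the call would be `allocate_matings(G, dams_per_sire=10)` with a 25 × 250 matrix of predicted values. The temperature and number of steps would need tuning to the scale of Gij.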
Fifty replicates of each method and strategy were performed.
Results and discussion
In generation 1003, around 8000 SNP markers and 65 QTL were segregating. The average linkage disequilibrium between adjacent polymorphisms was 0.1097. The linkage disequilibrium among the polymorphic loci (QTL and SNP) in generation 1003, measured as the square of the correlation (r2), is shown in Figure 1a as a function of map distance, and Figure 1b shows the r2 values between QTL and SNP. Furthermore, the observed distribution of the number of SNP with different degrees of linkage disequilibrium with their nearest QTL is presented in Table 1. Thus, in generation 1003, an average of 1.39 SNP had an r2 greater than 0.5 with the nearest QTL. This indicates that there was enough LD with the QTL for selection based on SNP information. Finally, the r2 among the QTL themselves was very low (0.0014).
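The r2 statistic used here can be computed from phased haplotypes as in the sketch below. The 0/1 haplotype coding and the function name are assumptions; the result equals the squared correlation between the allele indicators at the two loci.

```python
import numpy as np

def ld_r2(h1, h2):
    """r2 between two biallelic loci from 0/1 allele indicators on the same haplotypes."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    pa, pb = h1.mean(), h2.mean()
    dcoef = (h1 * h2).mean() - pa * pb            # disequilibrium coefficient D
    denom = pa * (1 - pa) * pb * (1 - pb)
    # r2 is undefined when either locus is fixed
    return float("nan") if denom == 0 else dcoef ** 2 / denom
```

Two loci that always carry the same allele together give r2 = 1, whereas loci whose alleles are combined at random give values close to 0.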
First generation response
The results of the first generation of selection are presented in Table 2 for all the situations studied: MS (mass selection), GSA (genomic selection without dominance), GSD (genomic selection with dominance), and GSD + MA (genomic selection with dominance and mate allocation). Apart from the clear superiority of genomic selection over mass selection (MS), introducing dominance effects in the evaluation model (GSD) gives a clear advantage over genomic selection with an additive model (GSA). The advantage ranges from 9 to 14% of the expected response (e.g. 0.527 vs. 0.471 for h2 = 0.20 and d2 = 0.05). These results for the expected response are confirmed by the accuracies of breeding value prediction, which are also presented in Table 2. In addition, the use of mate allocation (GSD + MA) provides an additional response ranging from 6% (h2 = 0.40, d2 = 0.05) to 22% (h2 = 0.20, d2 = 0.10). In general, the superiority of GSD + MA increases as the proportion of dominance variance increases and as the heritability decreases. Both advantages are similar to those reported when dominance is included in the classical polygenic model [1, 2].
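As a check on the quoted example, the relative advantage of GSD over GSA for h2 = 0.20 and d2 = 0.05 works out as follows (the symbols R_GSD and R_GSA are shorthand introduced here for the two responses):

```latex
\frac{R_{GSD} - R_{GSA}}{R_{GSA}} = \frac{0.527 - 0.471}{0.471} \approx 0.119
```

that is, about 12%, which lies within the reported range of 9 to 14%.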
Furthermore, it must be mentioned that the use of a model including dominance does not give worse results even when the true simulated model is purely additive. For just one generation, the selection responses with and without dominance in the evaluation model were 0.4724 vs. 0.4670 (h2 = 0.20) and 0.7832 vs. 0.7728 (h2 = 0.40), respectively.
Subsequent generation response
Unfortunately, the results in subsequent generations are rather discouraging for both genomic selection and mating allocation procedures. Medium-term genetic responses to selection for each simulated case are presented in Figures 2 and 3. As observed, the advantage of GSD and GSD + MA over MS shown in Table 2 disappears in subsequent generations, although it must be noted that MS would require extra cost and time to record the phenotypes of the selection candidates in each generation.
In addition, it is notable that the increase in response due to GSD + MA over GSD is observed only in the first generation, the responses being similar from generation two to five. Thus, the advantage in selection response obtained in the first generation is merely maintained, not increased, in the subsequent ones. However, a single generation of random mating eliminates this superiority, as shown in Figure 4, which presents two generations of accumulated response of the selected population for four alternative selection strategies: a) GSD (1st generation) - GSD (2nd generation), b) GSD (1st) - GSD + MA (2nd), c) GSD + MA (1st) - GSD (2nd) and d) GSD + MA (1st) - GSD + MA (2nd).
The loss of efficiency of GS after the first generation can be attributed to the reduction of genetic variance caused by the reduced population size of the selected population and by the increase of linkage disequilibrium among the QTL as a consequence of selection, the so-called Bulmer effect . In fact, the LD among QTL increases from an r2 value of 0.0014 in generation 1003 to a value of 0.0032 in generation 1004.
Furthermore, additional reduction of the expected response is explained by the loss of linkage disequilibrium between the SNP and the QTL due to recombination.
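As a point of reference, and under simplifying assumptions that are not part of the simulation (random mating, no drift and unchanged allele frequencies), the disequilibrium coefficient D between a SNP and a QTL separated by a recombination fraction c is expected to decay geometrically over t generations:

```latex
D_t = (1 - c)^t D_0 \quad\Rightarrow\quad r^2_t \approx (1 - c)^{2t} \, r^2_0
```

so the loss per generation is small for tightly linked pairs and largest for loosely linked ones.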
Response after random mating
In order to gain some insight into the loss of efficiency observed in Figures 2 and 3, we studied the response when GSD and GSD + MA are carried out after 0, 1, 2 and 3 previous generations of random mating without selection, in order to evaluate the consequences of the reduction of linkage disequilibrium between SNP and QTL in the absence of selection. The results are presented in Table 3. The observed selection response is eroded, but to a much lesser degree than in the cases where selection was carried out in previous generations.
To illustrate this fact, we calculated the linkage disequilibrium between QTL and SNP markers in generation 1003 and in generation 1004, with and without selection. Figure 5a represents the relationship between the r2 values in generations 1003 and 1004 for every pair of QTL and SNP with r2 > 0.10 in generation 1003 when genomic selection was carried out. By contrast, Figures 5b and 5c show the same relationship when individual selection or no selection took place between generations 1003 and 1004, respectively. The LD between QTL and SNP is better conserved when no selection is carried out, or when selection uses only phenotypic records, irrespective of the distance (results not shown). Thus, the efficiency of selection by SNP markers is reduced when a previous step of genomic selection has been performed.
Known QTL genotypes and effects
In addition, we compared the results of GSD in two other scenarios. First, we assumed that the QTL genotypes were known and used them as markers in a Bayes A algorithm (Scenario A). Second, we assumed that the true effects of the QTL were also known and used them directly (Scenario B), which represents the maximum achievable response. Results are presented in Table 4. As in the previous simulations, the advantage of GSD + MA over GSD is only observed in the first generation, independently of the information used for mating prediction.
If we examine the increase of response due to MA in the first generation, in Scenario A (QTL genotypes known) it ranges from 19% (h2 = 0.40 and d2 = 0.05) to 45% (h2 = 0.20 and d2 = 0.10) and in Scenario B (QTL genotypes and effects known) from 17% (h2 = 0.40 and d2 = 0.05) to 38% (h2 = 0.20 and d2 = 0.10). Although the percentage of increase over GSD is greater in Scenario A, the absolute value of extra response due to MA is bigger in Scenario B, as expected when maximum information is available. Success of MA is due to the possibility of predicting the genotype of future offspring and of estimating the additive and dominance effects. The first challenge is accomplished even in Scenario A, which shows a higher relative superiority than Scenario B. In addition, these extra genetic responses are greater than the ones shown in Table 2, when SNP genotypes are used to predict additive and dominance effects.
Furthermore, a strong reduction in the genetic response is observed between the first and the second generations for every scenario. However, the response is maintained to a greater degree when QTL effects are known than when SNP or QTL effects are estimated. As expected, the scenario in which QTL genotypes are known but their effects need to be estimated provides an intermediate response.
Introduction of dominance effects in genetic evaluation is easier to achieve in the whole-genome evaluation scenario than in the classical polygenic model, where potential parental combinations have to be defined and evaluated. Introduction of dominance effects in models of whole-genome evaluation provides two main results. First, it increases the accuracy of prediction of breeding values and second, it makes it possible to obtain an extra response by the appropriate design of future matings using mate allocation techniques.
Thus, mate allocation is recommended in the genetic management of populations under selection by whole-genome evaluation procedures, although the potential extra response is achieved only in the first generation and then maintained afterwards.
Our results also show that in most scenarios of genomic selection a continued collection of phenotypic data and re-evaluation of the additive and dominance effects of markers will be required, because the ability of predicting breeding values is greatly reduced when selection is carried out.
References
1. Varona L, Misztal I: Prediction of parental dominance combinations for planned matings, methodology, and simulation results. J Dairy Sci. 1999, 82: 2186-91. 10.3168/jds.S0022-0302(99)75463-9.
2. Misztal I, Varona L, Culbertson M, Bertrand JK, Mabry J, Lawlor TJ, Van Tassel CP, Gengler N: Studies on the value of incorporating the effect of dominance in genetic evaluations of dairy cattle, beef cattle and swine. Biotechnol Agron Soc Environ. 1998, 2: 227-233.
3. Hill WG, Goddard ME, Visscher P: Data and Theory Point to Mainly Additive Genetic Variance for Complex Traits. PLOS Genetics. 2008, 4: 1-10. 10.1371/journal.pgen.0040001.
4. Varona L, Misztal I, Bertrand JK, Lawlor TJ: Effect of full sibs on additive breeding values under the dominance model for stature in United States Holsteins. J Dairy Sci. 1998, 81: 1126-35. 10.3168/jds.S0022-0302(98)75675-9.
5. Crnokrak P, Roff DA: Dominance variance: associations with selection and fitness. Heredity. 1995, 75: 530-540. 10.1038/hdy.1995.169.
6. Toro MA: A new method aimed at using the dominance variance in closed breeding populations. Genet Sel Evol. 1993, 25: 63-74. 10.1186/1297-9686-25-1-63.
7. Toro MA: Selection of grandparental combinations as a procedure designed to make use of dominance genetic effects. Genet Sel Evol. 1998, 30: 339-349. 10.1186/1297-9686-30-4-339.
8. Maki-Tanila A: An overview on quantitative and genomic tools for utilising dominance genetic variation in improving animal production. Agric Food Sci. 2007, 16: 188-198. 10.2137/145960607782219337.
9. Hayes BJ, Miller SP: Mate selection strategies to exploit across- and within-breed dominance variation. J Anim Breed Genet. 2000, 117: 347-359. 10.1046/j.1439-0388.2000.00252.x.
10. Meuwissen THE, Hayes BJ, Goddard ME: Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001, 157: 1819-1829.
11. Falconer DS, Mackay TFC: Introduction to Quantitative Genetics. 1996, Addison Wesley Longman, England.
12. Bennewitz J, Meuwissen THE: The distribution of QTL additive and dominance effects in porcine F2 crosses. J Anim Breed Genet. 2010, 127: 171-179. 10.1111/j.1439-0388.2009.00847.x.
13. Bulmer MG: The effect of selection on genetic variability. Am Nat. 105: 201-211. 10.1086/282718.
The research was supported by Project CGL2009-13278-C02-02/BOS (Ministerio de Educación y Ciencia, Spain). It was prepared for the 2009 Chapman Lectures in Animal Breeding and Genetics at the University of Wisconsin-Madison.
The authors declare that they have no competing interests.
LV wrote the main computer programs and ran them. Both authors wrote and approved the final manuscript.