Response to genomic selection: The Bulmer effect and the potential of genomic selection when the number of phenotypic records is limiting
© Van Grevenhof et al.; licensee BioMed Central Ltd. 2012
Received: 11 October 2011
Accepted: 18 July 2012
Published: 3 August 2012
Over the last ten years, genomic selection has developed enormously. Simulations and results on real data suggest that breeding values can be predicted with high accuracy using genetic markers alone. However, to reach high accuracies, large reference populations are needed. In many livestock populations or even species, such populations cannot be established when traits are difficult or expensive to record, or when the population size is small. The value of genomic selection is then questionable.
In this study, we compare traditional breeding schemes based on own performance or progeny information to genomic selection schemes, for which the number of phenotypic records is limiting. Deterministic simulations were performed using selection index theory. Our focus was on the equilibrium response obtained after a few generations of selection. Therefore, we first investigated the magnitude of the Bulmer effect with genomic selection.
Results showed that the reduction in response due to the Bulmer effect is the same for genomic selection as for selection based on traditional BLUP estimated breeding values, and is independent of the accuracy of selection. The reduction in response with genomic selection is greater than with selection based directly on phenotypes without the use of pedigree information, such as mass selection. To maximize the accuracy of genomic estimated breeding values when the number of phenotypic records is limiting, the same individuals should be phenotyped and genotyped, rather than genotyping parents and phenotyping their progeny. When the generation interval cannot be reduced with genomic selection, large reference populations are required to obtain a similar response to that with selection based on BLUP estimated breeding values based on own performance or progeny information. However, when a genomic selection scheme has a moderate decrease in generation interval, relatively small reference population sizes are needed to obtain a similar response to that with selection on traditional BLUP estimated breeding values.
When the trait of interest cannot be recorded on the selection candidate, genomic selection schemes are very attractive even when the number of phenotypic records is limited, because traditional breeding requires progeny testing schemes with long generation intervals in those cases.
Genomic selection (GS) is a variant of marker-assisted selection in which genetic markers covering the whole genome are used so that all quantitative trait loci (QTL) are in linkage disequilibrium with at least one marker . Simulation results and practical data in dairy cattle suggest that breeding values can be predicted with high accuracy using genetic markers alone [2, 3]. Since the introduction of the idea by Meuwissen et al.  ten years ago, there have been a number of developments, including the implementation of GS in dairy cattle breeding . Until recently, the major limitations to implementing GS were the large number of markers required and the cost of genotyping these markers. Both these limitations have now been overcome in most livestock species, following the sequencing of the genomes and the subsequent availability of high-density SNP chips . It is now feasible to meet the requirements for the implementation of GS in breeding programs. In fact, after deriving a prediction equation from a reference population that uses markers and phenotypes as input and predicts breeding values as output, there is in principle no need to record phenotypes of the selection candidates. Thus, GS can potentially cut costs for producing and testing potential breeding animals considerably. Moreover, GS can have a large impact on breeding programs for many livestock species as it can shorten generation intervals, which is of special importance in long-lived species such as dairy cattle and horses, or when trait values become available late in life or on progeny only. Generation intervals can be shortened because genotyping can be applied to new-born animals or even embryos , and because the need for progeny testing is reduced.
A limitation of GS, however, is that large reference populations are needed to obtain high accuracies of estimated breeding values (EBV). When the size of the reference population increases, the accuracy of EBV can reach high values, approaching 0.8 to 1.0 [4, 6, 7]. Reference population sizes used in simulations sometimes even exceed 100 000 animals  but in practice, reference populations are in some cases limited to fewer than 1000 animals . In many livestock populations and some livestock species, creating large reference populations is hardly feasible for many traits, especially when phenotypes are difficult or expensive to record, such as methane emission in cattle or traits related to disease resistance. If large reference populations cannot be obtained, GS will reach relatively low accuracies, and may yield no or relatively little additional response compared to traditional selection on EBV based on phenotypic information. This applies particularly to populations with a large historical effective size, and to traits that are determined by many genes, which is common in livestock [10, 11]. The more genes involved, the smaller the effect of individual genes, and the larger the reference population needed to reach a certain accuracy . For those reasons, it is important to investigate when GS offers advantages over traditional selection in cases in which the size of the reference population is limited.
In this study, we compared response to GS with response to selection on BLUP-EBV based on own performance (OP) or progeny testing (PT) information, in cases with a limited number of phenotypic records available to develop a reference population. We focussed on the equilibrium response obtained after a few generations of selection . Thus, we first investigated the reduction of accuracy and response to selection due to the effect of selection on the genetic variance, the so-called Bulmer effect . Second, we investigated the optimal construction of the reference population when the number of phenotypic records is limited. In dairy cattle, construction of a reference population started with genotyping progeny tested bulls, merely because accurate EBV based on progeny testing were available for these bulls . When the number of phenotypic records is limiting, e.g., when records still need to be collected, it may however be suboptimal to use progeny tested individuals to construct the reference population. Finally, we investigated the minimal size of the reference population necessary for GS to become advantageous over traditional BLUP EBV selection, and the dependency of this break-even point on heritability and generation interval.
Predicting response to selection
Response to selection is predicted with deterministic simulations based on selection index theory, using the SelAction software. SelAction predicts the response to selection and accuracy of selection for breeding programs. The software accounts for reduction in variance due to selection, known as the “Bulmer-effect” , and for the use of pedigree information, as with selection on BLUP-EBV. Features of SelAction and the theoretical background are described in . Genomic selection schemes can be simulated in SelAction by including an additional trait representing the marker information [2, 15]. The marker information was modelled as a trait with a heritability of 0.999, which was genetically correlated to the trait of interest. The genetic correlation between the marker information and the trait of interest was equal to the accuracy of genomic EBV, which depends on the reference population. This accuracy is the accuracy of genomic EBV in an unselected population and is calculated using Equation 2a-d given below . Because it is assumed that genotypes can be observed without error, the marker information is fully heritable and has no residual variance. Thus, the environmental correlation between the marker information and the trait of interest is meaningless and was set to zero in SelAction. Further details of this approach are given in .
Because the accuracy of genomic EBV established in the reference population refers to the accuracy in a population that is not under selection, there is a distinction between this base-generation accuracy and the Bulmer-equilibrium accuracy of a breeding scheme based on genomic selection. The Bulmer effect reduces the proportion of genetic variance explained by the markers, so that, in an on-going breeding scheme, the equilibrium accuracy will be smaller than the base-generation accuracy. The results of the deterministic simulations presented in the Results and Discussion section refer to the Bulmer-equilibrium accuracy and response that are reached after a few (≥ 3) generations of selection.
The Bulmer effect
The deterministic simulations accounted for the Bulmer effect. Numerical results may, however, give less insight into the magnitude of the Bulmer effect than a simple mathematical expression. In this section, therefore, a mathematical expression is derived for the Bulmer-equilibrium response, accuracy and additive genetic variance with genomic selection.
where the * denotes values after selection, $\sigma_{M,t}^2$ is the variance of the genomic EBV before selection, i.e. among the candidates for selection in generation t, and k is the proportional reduction in the variance of the selection criterion due to selection of the parents. Because k refers to the variance of the selection criterion, rather than the variance of true breeding values, it is independent of the accuracy of selection. With truncation selection on a normally distributed selection criterion, k is determined entirely by the intensity of selection, $k = i(i - x)$, where i is the intensity of selection and x the standardized truncation point [13, 16–19]. Values of k are usually in the range of 0.7 to 0.9. To illustrate the impact of the Bulmer effect as simply as possible, equal selection intensity is assumed here for both sexes.
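The relation between selected proportion and k can be checked numerically. The following is a minimal sketch, assuming the standard truncation-selection relations on a normal distribution (intensity equal to the density at the truncation point divided by the selected proportion, and k = i(i - x)); the function name `truncation_parameters` is chosen here for illustration.

```python
from statistics import NormalDist


def truncation_parameters(p):
    """Return (x, i, k) for truncation selection with selected proportion p.

    x: standardized truncation point
    i: selection intensity
    k: proportional reduction in variance of the selection criterion, i * (i - x)
    """
    std_normal = NormalDist()            # standard normal: mean 0, sd 1
    x = std_normal.inv_cdf(1.0 - p)      # point that leaves a fraction p above it
    i = std_normal.pdf(x) / p            # mean of the selected group, in sd units
    k = i * (i - x)
    return x, i, k


for p in (0.01, 0.05, 0.10, 0.20):
    x, i, k = truncation_parameters(p)
    print(f"p = {p:.2f}: x = {x:.3f}, i = {i:.3f}, k = {k:.3f}")
```

For a selected proportion of 5%, this gives k of about 0.86, the value used in the example in the text.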
For example, for a phenotypic variance of $\sigma_P^2 = 1$, a heritability in the unselected base generation of $h_0^2 = 0.3$, an initial accuracy of genomic EBV of 0.8, and a selected proportion of 5%, so that k = 0.86, the initial variance of genomic EBV equals $0.8^2 \times 0.3 \times 1 = 0.192$ and the equilibrium variance of genomic EBV equals 0.192/1.86 = 0.103.
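How quickly the variance of genomic EBV approaches this equilibrium can be illustrated by iterating generation by generation. The sketch below assumes a recursion that is not spelled out in the text above: each generation, half of the variance remaining among the selected parents is transmitted, and Mendelian sampling restores half of the base-generation variance. The function name and the number of generations are illustrative choices.

```python
def genomic_ebv_variance_path(var_m0, k, generations=6):
    """Variance of genomic EBV among candidates in successive generations.

    Assumed recursion: var[t+1] = 0.5 * (1 - k) * var[t] + 0.5 * var_m0,
    with the same k in both sexes.
    """
    path = [var_m0]
    for _ in range(generations):
        var_selected_parents = (1.0 - k) * path[-1]      # variance left after selection
        path.append(0.5 * var_selected_parents + 0.5 * var_m0)
    return path


h2_base, accuracy_base, var_p, k = 0.3, 0.8, 1.0, 0.86
var_m0 = accuracy_base**2 * h2_base * var_p              # 0.192 in the example
path = genomic_ebv_variance_path(var_m0, k)
for t, v in enumerate(path):
    print(f"generation {t}: variance of genomic EBV = {v:.4f}")
print(f"closed form var_m0 / (1 + k) = {var_m0 / (1.0 + k):.4f}")
```

Under this assumed recursion, the variance is within rounding of 0.103 after about three generations, in line with the statement that equilibrium is reached after a few generations of selection.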
This result shows that the relative reduction in response due to the Bulmer effect is independent of the accuracy of selection, which agrees with results from the deterministic simulations (see Results and Discussion). For example, for a selected proportion of 5%, the Bulmer effect reduces response to GS by $1 - 1/\sqrt{1 + k} = 1 - 1/\sqrt{1.86}$ = 27%, irrespective of the initial accuracy of genomic EBV.
Continuing the above example yields an equilibrium accuracy of 0.70.
Continuing the above example yields an equilibrium additive genetic variance of $0.3\,[1 - (0.86 \times 0.8^2)/(1 + 0.86)] = 0.21$. Equations 1a through 1d show that Bulmer-equilibrium parameters can be calculated from base-generation parameters using simple equations.
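The worked numbers in this section can be tied together as follows. This is a sketch under assumptions that are not all stated explicitly above: equal selection in both sexes, Mendelian sampling restoring half of the base variance of genomic EBV each generation, genetic variance not captured by the markers being unaffected by selection, and response proportional to the standard deviation of genomic EBV. The symbols $r_0$, $\sigma^2_{M,0}$, $\sigma^2_{A,0}$, $R_0$ (base generation) and $r$, $\sigma^2_M$, $\sigma^2_A$, $R$ (equilibrium) are labels chosen here and need not match the notation or numbering of Equations 1a to 1d.

```latex
\begin{align*}
\sigma^2_{M,t+1} &= \tfrac{1}{2}(1-k)\,\sigma^2_{M,t} + \tfrac{1}{2}\,\sigma^2_{M,0}
  &&\Rightarrow\quad
  \sigma^2_{M} = \frac{\sigma^2_{M,0}}{1+k} = \frac{0.192}{1.86} \approx 0.103, \\
\frac{R}{R_0} &= \frac{\sigma_{M}}{\sigma_{M,0}} = \frac{1}{\sqrt{1+k}}
  &&\Rightarrow\quad
  1 - \frac{1}{\sqrt{1.86}} \approx 0.27, \\
\sigma^2_{A} &= \sigma^2_{A,0} - \left(\sigma^2_{M,0} - \sigma^2_{M}\right)
  = \sigma^2_{A,0}\left[1 - \frac{k\,r_0^2}{1+k}\right]
  &&\Rightarrow\quad
  0.3\left[1 - \frac{0.86 \times 0.8^2}{1.86}\right] \approx 0.21, \\
r &= \sqrt{\frac{\sigma^2_{M}}{\sigma^2_{A}}}
  = \frac{r_0}{\sqrt{1 + k\,(1 - r_0^2)}}
  &&\Rightarrow\quad
  \frac{0.8}{\sqrt{1 + 0.86 \times 0.36}} \approx 0.70.
\end{align*}
```

The second line makes the independence from accuracy visible: $r_0$ cancels from the ratio of equilibrium to base-generation response, so only k, and hence the selected proportion, remains.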
Construction of the reference population to maximise accuracy
where N is the number of half-sib progeny on which the EBV is based. To investigate the optimal construction of the reference population, accuracies of genomic EBV were compared for different numbers of progeny per sire and reference population sizes, for a fixed heritability of 0.3.
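The comparison of reference population designs can be sketched as follows. The sketch assumes a Daetwyler-type accuracy expression, r = sqrt(n r² / (n r² + n_segments)), where n is the number of genotyped reference animals and r² the reliability of the information recorded on each of them, together with the standard reliability of a mean of N half-sib progeny. The number of independent chromosome segments (`N_SEGMENTS`) and the record budget (`TOTAL_RECORDS`) are hypothetical values chosen for illustration, not values taken from this study.

```python
import math


def progeny_mean_reliability(n_progeny, h2):
    """Reliability of a sire EBV based on the mean of n half-sib progeny."""
    return 0.25 * n_progeny * h2 / (1.0 + 0.25 * (n_progeny - 1) * h2)


def genomic_accuracy(n_genotyped, reliability, n_segments):
    """Assumed Daetwyler-type accuracy of genomic EBV in an unselected population."""
    information = n_genotyped * reliability
    return math.sqrt(information / (information + n_segments))


H2 = 0.3
N_SEGMENTS = 1000       # hypothetical number of independent chromosome segments
TOTAL_RECORDS = 2000    # hypothetical fixed budget of phenotypic records

# Design A: every phenotyped animal is also genotyped (own phenotype, reliability h2).
acc_own = genomic_accuracy(TOTAL_RECORDS, H2, N_SEGMENTS)
print(f"{TOTAL_RECORDS} animals with own phenotype: accuracy = {acc_own:.3f}")

# Design B: the same records are spread over progeny groups; only sires are genotyped.
acc_pt_by_group = {}
for n_progeny in (5, 10, 50, 100):
    n_sires = TOTAL_RECORDS // n_progeny
    rel = progeny_mean_reliability(n_progeny, H2)
    acc_pt_by_group[n_progeny] = genomic_accuracy(n_sires, rel, N_SEGMENTS)
    print(f"{n_sires:4d} sires x {n_progeny:3d} progeny: "
          f"accuracy = {acc_pt_by_group[n_progeny]:.3f}")
```

Under these assumptions, a fixed number of records always contributes more information when each record belongs to a genotyped animal, because the reliability of a progeny mean grows less than proportionally with the number of progeny.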
Response of traditional versus GS breeding schemes
For the comparison of GS with selection based on traditional BLUP-EBV estimated from phenotypic information, deterministic simulations were performed with SelAction, using the approach described above. Alternative breeding schemes were compared based on the Bulmer-equilibrium response to selection. Several selection schemes were evaluated, to illustrate the general characteristics of selection on traditional BLUP-EBV versus GS. For GS, the reference population size ($n_P$) and heritability ($h^2$) were varied, to investigate the effect of these parameters on response to selection. All other parameters were kept constant across scenarios. Selection was for a single trait and in males only. To mimic the absence of selection in females, the selected proportion in females was set to 0.99.
(1) Selection on BLUP-EBV estimated from own performance information (OP).
(2) Selection on BLUP-EBV estimated from progeny information (PT).
(3) Genomic selection based on marker information on selection candidates (GS).
To investigate the benefit of genomic information on top of phenotypic information vs. genomic information instead of phenotypic information, the GS scheme was applied both with and without phenotypic information on the candidates for selection.
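The difference between genomic information on top of, and instead of, phenotypic information can be illustrated with a two-source selection index. This is a minimal sketch for an unselected population that ignores pedigree information and the Bulmer effect, so it is not the pseudo-BLUP index of SelAction. It assumes that the covariance of genomic EBV with both the phenotype and the true breeding value equals the variance of genomic EBV; the function name is illustrative.

```python
import math


def index_accuracy(h2, genomic_accuracy, use_phenotype=True):
    """Accuracy of an index of own phenotype and genomic EBV (phenotypic variance = 1)."""
    var_a = h2                               # additive genetic variance
    var_m = genomic_accuracy**2 * var_a      # variance of genomic EBV
    if not use_phenotype:
        return math.sqrt(var_m / var_a)      # genomic EBV only
    if var_m == 0.0:
        return math.sqrt(h2)                 # phenotype only
    # Information sources: x1 = own phenotype, x2 = genomic EBV.
    p11, p12, p22 = 1.0, var_m, var_m        # (co)variances among the sources
    g1, g2 = var_a, var_m                    # covariances with the true breeding value
    det = p11 * p22 - p12 * p12
    b1 = (g1 * p22 - g2 * p12) / det         # index weights b = P^-1 g (Cramer's rule)
    b2 = (g2 * p11 - g1 * p12) / det
    var_index = b1 * g1 + b2 * g2
    return math.sqrt(var_index / var_a)


for r in (0.2, 0.5, 0.8):
    print(f"genomic accuracy {r:.1f}: "
          f"GS only = {index_accuracy(0.3, r, use_phenotype=False):.3f}, "
          f"GS + own phenotype = {index_accuracy(0.3, r):.3f}")
```

With a low genomic accuracy, the combined index is dominated by the own phenotype, which is one way to see why a small reference population adds little when candidates already have phenotypic records.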
The population had discrete generations and a fixed number of sires and dams per generation.
There was a population of 1 000 dams per generation.
Twenty sires were used per generation.
Each dam produced two male and two female offspring per generation.
In the case of progeny testing, 10 half sib progeny per sire were available in the progeny test.
In scenarios (1), (2) and (3) with genomic information in addition to phenotypic information, full pedigree information was available for breeding value estimation as assumed in the pseudo-BLUP selection index used in SelAction .
The historical effective population size was assumed to be 100 (required for Equation 1b).
One-stage selection was applied. It was assumed that, of all progeny born, 50% were not suitable as selection candidates because of health, fertility or veterinary reasons. Therefore, we used selected proportions of 0.02 (2%) in males and 0.99 (99%) in females.
In scenario (1), OP, no phenotypic information was assumed to be available on sibs of the selection candidate.
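The selected proportions follow from the population structure listed above. The short check below reproduces that arithmetic; the variable names are chosen here for illustration, and the selection intensities are computed from the standard truncation-selection formula rather than taken from SelAction output.

```python
from statistics import NormalDist

N_DAMS = 1000
N_SIRES = 20
MALES_PER_DAM = 2
FEMALES_PER_DAM = 2
FRACTION_SUITABLE = 0.5      # half of all progeny are usable as selection candidates

male_candidates = N_DAMS * MALES_PER_DAM * FRACTION_SUITABLE
female_candidates = N_DAMS * FEMALES_PER_DAM * FRACTION_SUITABLE

p_males = N_SIRES / male_candidates        # 20 / 1000
p_females = N_DAMS / female_candidates     # 1000 / 1000, i.e. no selection


def intensity(p):
    """Selection intensity for truncation selection with selected proportion p."""
    std_normal = NormalDist()
    x = std_normal.inv_cdf(1.0 - p)
    return std_normal.pdf(x) / p


print(f"selected proportion in males:   {p_males:.2f}, intensity {intensity(p_males):.3f}")
print(f"selected proportion in females: {p_females:.2f} (modelled as 0.99, "
      f"intensity {intensity(0.99):.3f})")
```

A selected proportion of 0.99 gives an intensity close to zero, which is how the absence of selection in females is mimicked.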
Results will be presented in two ways. First, we compare responses to selection on traditional BLUP-EBV based on own performance or progeny information with GS, where GS schemes either include or exclude phenotypic information. Second, we identify the break-even size of the reference population at which GS without phenotypic information yields the same response as selection on traditional BLUP-EBV. In this approach, we model the break-even size of the reference population as a function of the reduction in generation interval that can be achieved when implementing GS.
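The idea of a break-even reference population size can be sketched with a simplified calculation. The sketch assumes that both schemes have the same selection intensity and genetic standard deviation, so that response per year is proportional to accuracy divided by generation interval, and that the accuracy of genomic EBV follows a Daetwyler-type expression, r² = n h² / (n h² + n_segments). The traditional accuracy, the number of independent segments and the function names are hypothetical values and labels, not those used in the SelAction simulations.

```python
def required_genomic_accuracy(acc_traditional, interval_traditional, interval_gs):
    """Accuracy GS needs for the same response per year as the traditional scheme."""
    return acc_traditional * interval_gs / interval_traditional


def break_even_reference_size(acc_required, h2, n_segments):
    """Invert r^2 = n h2 / (n h2 + n_segments) for the number of records n."""
    rel = acc_required**2
    if rel >= 1.0:
        return float("inf")      # not reachable with a finite reference population
    return n_segments * rel / (h2 * (1.0 - rel))


H2 = 0.3
N_SEGMENTS = 1000            # hypothetical number of independent chromosome segments
ACC_TRADITIONAL = 0.55       # hypothetical accuracy of the traditional scheme
INTERVAL_TRADITIONAL = 1.0   # generation interval of the traditional scheme (relative)

for reduction in (0.0, 0.10, 0.25, 0.50):
    interval_gs = INTERVAL_TRADITIONAL * (1.0 - reduction)
    acc_needed = required_genomic_accuracy(ACC_TRADITIONAL, INTERVAL_TRADITIONAL, interval_gs)
    n_records = break_even_reference_size(acc_needed, H2, N_SEGMENTS)
    print(f"interval reduced by {reduction:4.0%}: required accuracy = {acc_needed:.3f}, "
          f"break-even reference size = {n_records:7.0f}")
```

Because the required number of records rises steeply with the required accuracy, even a moderate reduction in generation interval lowers the break-even size substantially in this simplified setting.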
Results and discussion
Comparison of the Bulmer-effect for mass selection and genomic selection
Equilibrium genetic variance
Results of the deterministic simulations also revealed a second difference between GS and mass selection. With mass selection, the reduction in response due to the Bulmer effect was greater at higher accuracy (i.e. higher $h^2$). With a selected proportion of 5%, for example, response to mass selection is reduced by only 7% when $h^2$ = 0.10 but by 21% when $h^2$ = 0.50 (Table 1). With GS, in contrast, the reduction in response due to the Bulmer effect did not depend on the accuracy of selection. With a selected proportion of 5%, the Bulmer effect always reduced response by 27% in GS schemes, irrespective of the accuracy. Again, this occurred because the estimated genetic effects used in GS are known with full accuracy, since the markers are observed.
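The contrast between mass selection and GS can be reproduced approximately with a short calculation. The sketch below is not the SelAction computation: it assumes the same k in both sexes, a base phenotypic variance of 1, a constant environmental variance, and the usual Bulmer recursion for mass selection in which selection removes a fraction k h² of the genetic variance among parents. Function names are illustrative.

```python
import math


def mass_selection_reduction(h2_base, k, generations=200):
    """Relative loss in response to mass selection at Bulmer equilibrium."""
    var_a0 = h2_base
    var_e = 1.0 - h2_base                  # environmental variance, assumed constant
    var_a = var_a0
    for _ in range(generations):
        h2_t = var_a / (var_a + var_e)     # heritability in the current generation
        # Parents keep a fraction (1 - k * h2_t) of the genetic variance;
        # Mendelian sampling restores half of the base genetic variance.
        var_a = 0.5 * (1.0 - k * h2_t) * var_a + 0.5 * var_a0
    response_base = var_a0 / math.sqrt(var_a0 + var_e)   # per unit of intensity
    response_eq = var_a / math.sqrt(var_a + var_e)
    return 1.0 - response_eq / response_base


def genomic_selection_reduction(k):
    """Relative loss in response to GS at equilibrium; does not depend on accuracy."""
    return 1.0 - 1.0 / math.sqrt(1.0 + k)


K = 0.86   # selected proportion of 5%
for h2 in (0.10, 0.30, 0.50):
    print(f"mass selection, h2 = {h2:.2f}: reduction = {mass_selection_reduction(h2, K):.1%}")
print(f"genomic selection: reduction = {genomic_selection_reduction(K):.1%}")
```

Under these assumptions, the reductions for mass selection come out near 7% and 21% for heritabilities of 0.10 and 0.50, and near 27% for GS, consistent with the values quoted above.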
The above results show that reduction in response due to the Bulmer effect is always larger for GS than for selection based directly on phenotypic information (e.g. mass selection), except when accuracy of selection approaches 100%, in which case the reduction will be the same. The theoretical results found here (Equations 1a-e) are identical to the results found by Dekkers [18, 19] for selection on traditional BLUP-EBV. Hence, the impact of the Bulmer-effect on response, accuracy and additive genetic variance is the same for GS as for traditional BLUP selection. The reduction in response to selection, for example, is independent of the accuracy of selection for both GS and selection on BLUP-EBV. The above calculations of the Bulmer effect will be approximations when marker effects are updated each generation (known as “retraining”). Nevertheless, the effect of updating marker effects is expected to be small, because the additional data becoming available for retraining each generation will usually be smaller than the already existing reference population, and the change in accuracy due to adding records to the reference population shows a diminishing return relationship. The expressions derived here for the Bulmer-equilibrium response, additive genetic variance and accuracy with GS (Equations 1a-e above) are identical to those for selection on classical BLUP-EBV presented in [22, 23]. This makes sense because a model with genome-wide estimated marker effects is equivalent to a mixed model with a genomic relationship matrix . Hence, results for the Bulmer effect presented here are consistent with the equivalence of GS based on estimated marker effects vs. a mixed model with a genomic relationship matrix. However, the derivations in [22, 23] have a different foundation; they rely on the property of BLUP that selection on EBV does not affect the prediction error variance of the EBV [25, 26]. 
Hence, the agreement of our results with those in [22, 23] constitutes an independent proof of the expressions derived here.
Construction of the reference population to maximise accuracy
Studies using stochastic simulations have shown that the accuracy of genomic EBV decreases as the number of generations between the selection candidates and the animals in the reference population increases . Thus, a reference population is ideally constructed using individuals most closely related to the candidates for selection . Buch et al.  also showed that the number of daughters that need to be genotyped to replace their sires in the reference population is a function of the number of offspring underlying the sire’s EBV but is independent of the number of sires. Our results are based on a theoretical relationship between reference population size and accuracy (Equation 2), which assumes that individuals in the reference population and selection candidates are not closely related . Hence, the accuracies of genomic EBV used here may be conservative. In addition, we assumed no decay of linkage disequilibrium (LD) or change in marker frequencies.
Figure 1 shows that the increase in the accuracy of genomic EBV with the number of phenotypic records is strongly non-linear, showing a diminishing-return relationship. As a consequence, increasing the total number of phenotypic records increases accuracy less than proportionally. Doubling the number of phenotypes in the reference population from 5 000 to 10 000, for example, increases accuracy of genomic EBV by only 32% (Figure 1).
Response of traditional versus GS breeding schemes
Results presented in this section refer to the Bulmer-equilibrium response and assume that the reference population is optimized for a limited number of phenotypic records. Thus, the same individuals are both phenotyped and genotyped, so that $n_P$ refers to both the number of phenotypic records and the number of genotyped individuals in the reference population.
Figure 2b compares selection on traditional BLUP-EBV to GS schemes when the selection candidates also have phenotypic information. Hence, in the GS schemes in Figure 2b, genomic information is available in addition to phenotypic information. Results show that in these cases, GS is of little additional value, unless the reference population is very large. Figures 2a and b show that the response pattern does not change much with heritability. In conclusion, Figures 2a and b show that GS cannot compete with traditional selection when the number of phenotypic records is limited, unless the generation interval can be decreased by GS.
Breeding programs usually focus on improvement of multiple traits and, thus, the generation interval is not determined by a single trait. This raises the question whether the conclusions drawn above from Figures 3a and b can be applied to breeding programs in practice. We believe they can, for the following reasons. First, for traits that are easy to record but that cannot be recorded on the selection candidates of both sexes, such as milk yield or egg number, GS is attractive since it allows a substantial reduction in generation interval. Thus, selection for such traits will not be an obstacle for the reduction in generation interval that is required to make GS of interest for traits with a limited number of phenotypic records. Second, for traits that can be recorded early in life on all candidates for selection, such as growth rate in broilers, GS can be combined with phenotypic information to estimate breeding values early in life . Hence, in such cases, GS for the trait with a limited number of phenotypic records will allow a reduction in generation interval and this also increases response in traits that can be recorded on all candidates early in life. Thus, when considering multi-trait selection, GS is equally or more beneficial than suggested by results in Figures 3a and b.
In this work, the accuracy of genomic EBV was based on the expression presented in  (Equation 1a), rather than based on stochastic simulation. This expression is independent of allele frequencies, in contrast to the expression derived by Goddard . However, Hayes et al.  found these two expressions to result in very similar accuracies but the method in  yielded slightly lower accuracies at low to moderate heritabilities. For traits with a limited number of phenotypes, heritabilities will be mostly in this low to moderate range . Hence, this suggests that the accuracies used here are slightly conservative.
With an equivalent intensity of selection, the reduction in response to selection due to the Bulmer-effect is the same for GS and for selection on traditional BLUP-EBV, irrespective of the accuracy of EBV used for selection. Hence, when schemes have the same selection intensity in both sexes, accounting for the Bulmer-effect is not essential to obtain the correct ranking of GS versus traditional BLUP schemes. However, when selection intensities differ between schemes, the Bulmer effect can affect the ranking and a comparison based on accuracies in an unselected population can be misleading . Schemes in which selection is based directly on phenotypic information, such as mass selection, have a lower reduction in response due to the Bulmer effect than GS or traditional BLUP schemes.
To maximize the accuracy of genomic EBV when the number of phenotypic records is limiting, the same individuals should be genotyped and phenotyped, rather than genotyping parents and phenotyping their progeny. When the generation interval cannot be decreased with GS, large reference populations are required to obtain a similar response to that with own performance selection or progeny testing. However, the accuracy of genomic EBV has a diminishing-return relationship with the size of the reference population. As a consequence, when GS schemes have a moderate decrease in generation interval, relatively small reference population sizes are needed to obtain a response equal to that with selection on traditional BLUP-EBV based on own performance or progeny information. Thus, when the trait of interest cannot be recorded on the selection candidate, GS schemes are very attractive, even when the number of phenotypic records is limited, because traditional breeding schemes would have to rely on information from relatives with many phenotypic records and long generation intervals in the case of progeny testing.
The authors would like to thank JCM Dekkers for his effort in improving the contents of this article.
1. Goddard ME, Hayes BJ: Genomic selection. J Anim Breed Genet. 2007, 124: 323-330. 10.1111/j.1439-0388.2007.00702.x.
2. Schrooten C, Bovenhuis H, Van Arendonk JAM, Bijma P: Genetic progress in multistage dairy cattle breeding schemes using genetic markers. J Dairy Sci. 2005, 88: 1569-1581. 10.3168/jds.S0022-0302(05)72826-5.
3. VanRaden PM, Van Tassell CP, Wiggans GR, Sonstegard TS, Schnabel RD, Taylor JF, Schenkel FS: Invited review: Reliability of genomic predictions for North American Holstein bulls. J Dairy Sci. 2009, 92: 16-24. 10.3168/jds.2008-1514.
4. Meuwissen THE, Hayes BJ, Goddard ME: Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001, 157: 1819-1829.
5. Bredbacka P: Progress on methods of gene detection in preimplantation embryos. Theriogenology. 2001, 55: 23-34. 10.1016/S0093-691X(00)00443-X.
6. Daetwyler HD, Villanueva B, Woolliams JA: Accuracy of predicting the genetic risk of disease using a genome-wide approach. PLoS One. 2008, 3: e3395. 10.1371/journal.pone.0003395.
7. Zhao HH, Fernando RL, Dekkers JCM: Power and precision of alternate methods for linkage disequilibrium mapping of quantitative trait loci. Genetics. 2007, 175: 1975-1986. 10.1534/genetics.106.066480.
8. Goddard ME, Hayes BJ: Mapping genes for complex traits in domestic animals and their use in breeding programmes. Nat Rev Genet. 2009, 10: 381-391. 10.1038/nrg2575.
9. Calus MPL: Genomic breeding value prediction: methods and procedures. Animal. 2010, 4: 157-164. 10.1017/S1751731109991352.
10. Hayes BJ, Chamberlain AJ, Goddard ME: Use of markers in linkage disequilibrium with QTL in breeding programs. Proceedings of the 8th World Congress on Genetics Applied to Livestock Production: 13–18 August 2006. 2006, Belo Horizonte, communication 30–06.
11. Goddard ME: The genetic architecture of quantitative traits. Proceedings of the 62nd Annual Meeting of the European Federation of Animal Science (EAAP): 29 August - 2 September 2011. 2011, Stavanger, 114.
12. Goddard ME: Genomic selection: prediction of accuracy and maximisation of long term response. Genetica. 2008, 136: 245-252.
13. Bulmer MG: The effect of selection on genetic variability. Am Nat. 1971, 105: 201-211. 10.1086/282718.
14. Rutten MJM, Bijma P, Woolliams JA, Van Arendonk JAM: SelAction: Software to predict selection response and rate of inbreeding in livestock breeding programs. J Hered. 2002, 93: 456-458. 10.1093/jhered/93.6.456.
15. Dekkers JCM: Prediction of response from marker-assisted and genomic selection using selection index theory. J Anim Breed Genet. 2007, 124: 331-341. 10.1111/j.1439-0388.2007.00701.x.
16. Cochran WG: Improvement by means of selection. Proceedings of the 2nd Berkeley Symposium on Mathematics, Statistics and Probability: 31 July-12 August 1950; Berkeley. Edited by: Neyman J. 1951, University of California Press, Berkeley, 449-470.
17. Tallis G: The moment generating function of the truncated multi-normal distribution. J R Stat Soc Ser B. 1961, 23: 223-229.
18. Dekkers JCM: Reduction of response to selection due to linkage disequilibrium with selection on best linear unbiased predictors. Edited by: Hill WG, Thompson R, Woolliams JA. 1990, Edinburgh, 280-287.
19. Dekkers JCM: Asymptotic response to selection on best linear unbiased predictors of breeding values. Anim Prod. 1991, 54: 351-360.
20. Hayes BJ, Daetwyler HD, Bowman PJ, Moser G, Tier B, Crump R, Khatkar M, Raadsma HW, Goddard ME: Accuracy of genomic selection: comparing theory and results. Proc Assoc Advmt Anim Breed. 2009, 17: 352-355.
21. Falconer DS, Mackay TFC: Introduction to Quantitative Genetics. 1996, Pearson Education Limited, Essex.
22. Dekkers JCM: Asymptotic response to selection on best linear unbiased predictors of breeding values. Anim Prod. 1992, 54: 351-360. 10.1017/S0003356100020808.
23. Bijma P: Accuracies of estimated breeding values from ordinary genetic evaluations do not reflect the correlation between true and estimated breeding values in selected populations. J Anim Breed Genet. 10.1111/j.1439-0388.2012.00991.x.
24. Misztal I, Legarra A, Aguilar I: Computing procedures for genetic evaluation including phenotypic, full pedigree, and genomic information. J Dairy Sci. 2009, 92: 4648-4655. 10.3168/jds.2009-2064.
25. Henderson CR: Best linear unbiased estimation and prediction under a selection model. Biometrics. 1975, 31: 423-447. 10.2307/2529430.
26. Henderson CR: Best linear unbiased prediction in populations that have undergone selection. Edited by: Barton RA, Smith WC. 1982, Massey University: Dunmore Press, Palmerston North, 191-201.
27. Buch LH, Kargo M, Berg P, Lassen J, Sorensen AC: The value of cows in reference populations for genomic selection of new functional traits. Animal. 2012, 6: 880-886. 10.1017/S1751731111002205.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.