Accuracy of estimated breeding values with genomic information on males, females, or both: an example on broiler chicken
© Lourenco et al. 2015
Received: 14 October 2014
Accepted: 22 June 2015
Published: 2 July 2015
As more and more genotypes become available, accuracy of genomic evaluations can potentially increase. However, the impact of genotype data on accuracy depends on the structure of the genotyped cohort. For populations such as dairy cattle, the greatest benefit has come from genotyping sires with high accuracy, whereas the benefit due to adding genotypes from cows was smaller. In broiler chicken breeding programs, males have fewer progeny than dairy bulls, females have more progeny than dairy cows, and most production traits are recorded for both sexes. Consequently, genotyping both sexes in broiler chickens may be more advantageous than in dairy cattle.
We studied the contribution of genotypes from males and females using a real dataset with genotypes on 15 723 broiler chickens. Genomic evaluations used three training sets that included only males (4648), only females (8100), and both sexes (12 748). Realized accuracies of genomic estimated breeding values (GEBV) were used to evaluate the benefit of including genotypes for different training populations on genomic predictions of young genotyped chickens.
Using genotypes on males, the average increase in accuracy of GEBV over pedigree-based EBV for males and females was 12 and 1 percentage points, respectively. Using female genotypes, this increase was 1 and 18 percentage points, respectively. Using genotypes of both sexes increased accuracies by 19 points for males and 20 points for females. For two traits with similar heritabilities and amounts of information, realized accuracies from cross-validation were lower for the trait that was under strong selection.
Overall, genotyping males and females improves predictions of all young genotyped chickens, regardless of sex. Therefore, when males and females both contribute to genetic progress of the population, genotyping both sexes may be the best option.
Large amounts of genomic information have accumulated for nearly all livestock species and its use has led to increases in the accuracy of estimated breeding values (EBV). These increases are mainly due to improved inferences on relationships between individuals and linkage disequilibrium (LD) between quantitative trait loci (QTL) and markers. Higher accuracies are obtained when relationships between animals in the training population are weak and the relationship between the training and validation populations is strong.
Questions about how the genotyped population should be structured and which animals should be used in the training population are still a matter of debate in all species. In dairy cattle, for example, phenotypes for production traits are collected on females and combined with genotypes of males for successful genomic evaluation. According to Rendel and Robertson [4], genetic progress in a population is a combination of the progress in each of the four paths of selection. In dairy cattle, selection intensities are highest for elite sires of bulls and elite dams of bulls because strong selection pressure can be applied in both these pathways. With genomic selection, very young females (e.g., heifers) can be chosen as dams of bulls, and elite cows are often genotyped. Although accurate genomic breeding values for females are highly relevant, including female genotypes and phenotypes in the training population resulted in very small increases in the accuracy of evaluation of young dairy bulls [6, 7]. For instance, adding 17 000 female genotypes to 7000 male genotypes increased the accuracy of evaluation of young bulls from 0.70 to 0.72. This small increase is due to female phenotypes being largely redundant, since these phenotypes are already included in their sire’s information, either explicitly in the form of pseudo-phenotypes, or implicitly, as in the single-step genomic best linear unbiased predictor (ssGBLUP). However, in dairy cattle, genotyping females is useful for intra-herd selection of females and for identifying elite females to produce future sires.
In species such as broiler chickens or pigs, the number of progeny is smaller per male and larger per female than in dairy cattle. Therefore, the impact of female paths on genetic progress is potentially stronger. Also, when phenotypes are recorded on both sexes (e.g., body weight), then not only can female phenotypes contribute to male evaluations but male phenotypes can also contribute to female evaluations. For this reason, genotyping females in these species can make a substantial contribution to accuracy and genetic progress.
Realized accuracies of genetic values can be obtained from the correlation between true and estimated breeding values for the validation population. There are large discrepancies between theoretical accuracy (e.g., by inversion of the coefficient matrix of the mixed model equations) and realized accuracy of EBV in populations under selection, where the latter is noticeably smaller. For genetic values obtained through genomic BLUP methods (GBLUP), the accuracies that are obtained by inversion of the coefficient matrix depend on the assumed allele frequencies, although scaling of genomic relationships for compatibility with pedigree relationships [13, 14] reduces this dependency.
The objective of our work was to analyze a commercial broiler chicken population and determine the gains in the accuracy of genomic evaluations on males and females due to the use of genotypes and phenotypes of males, females, or both sexes.
Table 1 Heritabilities (diagonal), genetic correlations (above the diagonal), and phenotypic correlations (below the diagonal) for the four traits
Table 2 Number of genotyped birds with phenotypes for each trait
The birds that were genotyped were chosen randomly or based on phenotypes, depending on the trait. The dataset available for this study was split into training and validation populations according to date of birth. Thus, 2975 birds born in generation 4 were chosen as validation animals and their phenotypes were removed from the analyses.
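The split described above can be sketched as follows. This is a minimal illustration, assuming each bird is a record with hypothetical keys `id`, `generation`, and `phenotype`; it is not the authors' data pipeline. Validation birds stay in the dataset (so their genotypes and pedigree links remain available) but their phenotypes are masked.

```python
def split_by_generation(birds, validation_generation):
    """Split bird records into training and validation sets by generation.

    `birds` is a list of dicts with the (hypothetical) keys "id",
    "generation", and "phenotype". Validation birds keep their record but
    lose the phenotype, so they can only be predicted from genotypes and
    relatives.
    """
    training, validation = [], []
    for bird in birds:
        if bird["generation"] == validation_generation:
            # Copy the record and mask the phenotype; the input is not modified.
            validation.append({**bird, "phenotype": None})
        else:
            training.append(dict(bird))
    return training, validation


# Toy records, invented for illustration only.
birds = [
    {"id": "a", "generation": 3, "phenotype": 2.1},
    {"id": "b", "generation": 4, "phenotype": 1.8},
    {"id": "c", "generation": 4, "phenotype": 2.5},
]
training, validation = split_by_generation(birds, validation_generation=4)
```

Copying the records rather than editing them in place keeps the original phenotypes available for the later comparison with predictions.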
Model and analysis
where t indexes traits T1 to T4; y, b, u, and e are vectors of phenotypes, fixed effects of sex and generation-hatch interaction, random additive direct genetic effects, and random residuals, respectively; X and Z are incidence matrices for b and u, respectively. A vector of random maternal permanent environmental effects was added for T1. Although a sex effect was fitted in the model, no sexual dimorphism was considered and the traits on males and females were assumed to have a genetic correlation of 1, which may not always be the case in practice.
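The symbol definitions above are consistent with a linear animal model of the following form. This is a reconstruction from those definitions rather than a quotation; the symbols W and c for the maternal permanent environmental term of T1 are assumed notation.

```latex
% Model for trait t (t = T1, ..., T4), reconstructed from the definitions of y, b, u, e, X, Z
\[
\mathbf{y}_t = \mathbf{X}\mathbf{b}_t + \mathbf{Z}\mathbf{u}_t + \mathbf{e}_t
\]
% For T1 only: additional maternal permanent environmental effects
% (W and c are assumed symbols, not taken from the text)
\[
\mathbf{y}_{T1} = \mathbf{X}\mathbf{b}_{T1} + \mathbf{Z}\mathbf{u}_{T1} + \mathbf{W}\mathbf{c}_{T1} + \mathbf{e}_{T1}
\]
```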
where G is the genomic relationship matrix that was constructed as in VanRaden [13], using observed allele frequencies; $A_{22}^{-1}$ is the inverse of the pedigree-based relationship matrix for genotyped animals. Weights were assigned for G (α = 0.95) and $A_{22}$ (β = 0.05) to avoid singularity problems. Coefficients a and b were used to match pedigree and genomic relationships [14, 19, 20]. Different H matrices were used based on different G that contained 2975 birds from the validation population plus one of the three training populations: males (n = 4648), females (n = 8100), and both sexes (n = 12 748).
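As an illustration of the two steps named above, the sketch below builds a genomic relationship matrix from allele counts using observed allele frequencies (the first method of VanRaden) and then blends it with the pedigree relationships of the genotyped animals using the stated weights. It is a toy version under several assumptions: genotypes are coded 0/1/2, the function names are hypothetical, the tuning coefficients a and b are omitted, and the inversion needed for ssGBLUP is not shown.

```python
def vanraden_g(genotypes):
    """Genomic relationship matrix from allele counts (0, 1, or 2 per SNP).

    `genotypes` is a list of rows, one per animal. Allele frequencies are
    the observed frequencies in the genotyped animals themselves.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Observed frequency of the counted allele at each SNP.
    p = [sum(row[k] for row in genotypes) / (2 * n) for k in range(m)]
    # Centre each genotype by twice the allele frequency.
    z = [[row[k] - 2 * p[k] for k in range(m)] for row in genotypes]
    # Scaling factor 2 * sum(p * (1 - p)); zero if every SNP is monomorphic.
    scale = 2 * sum(pk * (1 - pk) for pk in p)
    if scale == 0:
        raise ValueError("all SNPs are monomorphic; G is undefined")
    return [
        [sum(z[i][k] * z[j][k] for k in range(m)) / scale for j in range(n)]
        for i in range(n)
    ]


def blend(g, a22, alpha=0.95, beta=0.05):
    """Weighted combination alpha * G + beta * A22, used to avoid singularity."""
    n = len(g)
    return [[alpha * g[i][j] + beta * a22[i][j] for j in range(n)] for i in range(n)]


# Toy data: three animals, three SNPs; A22 is taken as identity (unrelated animals).
genotypes = [[0, 1, 2], [1, 1, 0], [2, 1, 1]]
a22 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
g = vanraden_g(genotypes)
g_blended = blend(g, a22)
```

Because G is centred with observed frequencies, its rows sum to zero, so G alone is singular; adding a small weight on $A_{22}$ is what makes the blended matrix invertible.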
Traditional and genomic evaluations were computed using the software BLUP90IOD [21, 22]. The convergence criterion was set to $10^{-14}$ for all evaluations. Variance components used in all analyses were pre-computed by Cobb-Vantress Inc. using the same data and model as presented here.
Composition of genomic estimated breeding values from ssGBLUP
where $PA_i$ is the parent average EBV for animal i, $YD_i$ is the yield deviation (phenotype adjusted for the solutions of all model effects other than additive genetic effects and errors) for animal i, and $PC_i$ is the progeny contribution for animal i. When both parents are known, the phenotype is available, and each progeny has a known mate, weights $w_1$ to $w_3$ sum to 1. The decomposition of EBV can be derived by analyzing a row of the mixed model equations for a given animal. More specifically, YD is based on own phenotypic information, PA is the average of the parental EBV, and PC is the sum, over the progeny of animal i, of the EBV of each progeny minus one half of the EBV of that progeny’s dam (or the mate of animal i).
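The terms defined above fit the usual animal-model decomposition of an EBV, which can be written as below. This is a reconstruction from the definitions in the text, not a quotation of the displayed equation.

```latex
% EBV of animal i as a weighted combination of parent average,
% yield deviation, and progeny contribution
\[
\mathrm{EBV}_i = w_1\,\mathrm{PA}_i + w_2\,\mathrm{YD}_i + w_3\,\mathrm{PC}_i ,
\qquad w_1 + w_2 + w_3 = 1
\]
```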
where $g^{ij}$ and $a_{22}^{ij}$ are the off-diagonal elements of $G^{-1}$ and $A_{22}^{-1}$, respectively; $u_j$ is the GEBV of animal j.
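The definitions above, together with the discussion of $w_4$, DGV, and PP that follows, are consistent with a decomposition of the form below. It is offered as a hedged reconstruction: the weight name $w_5$ and the use of the diagonal elements $g^{ii}$ and $a_{22}^{ii}$ as divisors are assumptions, chosen so that PA and PP cancel when all animals are genotyped.

```latex
% GEBV of a genotyped animal i: the pedigree-based terms plus a direct
% genomic value (DGV), minus the pedigree prediction (PP) it replaces
\[
\mathrm{GEBV}_i = w_1\,\mathrm{PA}_i + w_2\,\mathrm{YD}_i + w_3\,\mathrm{PC}_i
                + w_4\,\mathrm{DGV}_i - w_5\,\mathrm{PP}_i
\]
% DGV and PP from the off-diagonal elements of the inverse matrices
\[
\mathrm{DGV}_i = \frac{-\sum_{j \neq i} g^{ij}\,u_j}{g^{ii}},
\qquad
\mathrm{PP}_i = \frac{-\sum_{j \neq i} a_{22}^{ij}\,u_j}{a_{22}^{ii}}
\]
```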
In general, PP accounts for the part of PA that is explained by DGV. When all animals are genotyped, $A = A_{22}$, PA and PP cancel out, and DGV explains a larger fraction of the GEBV. When a genotyped animal is unrelated to the genotyped population, PP = 0 and DGV explains a smaller portion of the GEBV. When both parents are genotyped, PP will include a large part of PA. The accuracy of DGV differs between animals, depending on how many ancestors of that animal are genotyped, as reported by Mulder et al. [25]. When a genotyped animal has many progeny, $w_3 \approx 1$ and its GEBV is mainly driven by PC; however, genotyping those animals is useful since they are usually included in the training population. When an animal is not genotyped, $w_4 = 0$ and predictions can be improved due to improved PA and PC if its relatives are genotyped. When an animal is not genotyped and has no phenotypes and no progeny, the GEBV is driven by PA and, in most cases, only a slight improvement in prediction is achieved based on genotyped relatives [17, 18, 26].
where (G)EBV can be either EBV or GEBV.
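A minimal sketch of a realized-accuracy calculation follows, assuming it is based on the correlation between (G)EBV and phenotypes corrected for fixed effects in the validation birds. Dividing that correlation by the square root of the heritability is a common convention and is included here only as an optional, assumed step; the function and argument names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def realized_accuracy(predictions, adjusted_phenotypes, heritability=None):
    """Correlation between (G)EBV and phenotypes adjusted for fixed effects.

    If `heritability` is given, the correlation is divided by its square
    root (an assumed convention, not confirmed by the text above).
    """
    r = pearson(predictions, adjusted_phenotypes)
    if heritability is None:
        return r
    return r / sqrt(heritability)


# Toy values for four validation birds, invented for illustration.
gebv = [1.0, 2.0, 3.0, 4.0]
adjusted = [2.0, 1.0, 4.0, 3.0]
raw = realized_accuracy(gebv, adjusted)
scaled = realized_accuracy(gebv, adjusted, heritability=0.64)
```

The same function applies to EBV and GEBV, so the gain from genomic information is the difference between the two results for the same validation birds.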
Correlation between EBV and GEBV
Correlations between EBV and GEBV using genotypes for both sexes were calculated for sires with large (≥500) and small (<50) progeny groups, and for dams with large (≥50) and small (<5) progeny groups, to check the importance of progeny group size versus genomic information on the EBV of proven parents.
Results and discussion
Family structure for all birds and for genotyped birds in the dataset (table; rows: number of parents in the dataset and average progeny per parent; number of genotyped parents in the training population and average progeny per parent; total parents of the validation population, number of genotyped parents of the validation population, and average progeny per parent)
Accuracies and genomic contributions
Correlations between EBV and GEBV were equal to 0.97 and 0.93 for sires with more than 500 and fewer than 50 progeny, respectively, whereas correlations for dams with more than 50 and fewer than 5 progeny were equal to 0.89 and 0.88, respectively. Correlations for dams were lower because dams have fewer progeny than sires and, as a result, the weight on genomic information is greater than the weight on PC for dams. For sires, even if there was some re-ranking between EBV and GEBV by including genomic information, the accuracy of the GEBV of sires with many progeny came mostly from PC, because the contribution from other sources was small or null. Although genomic information had a smaller impact on the GEBV of parents with large numbers of progeny, genotyping those birds was helpful to improve predictions for related birds.
Traits for which male genotypes had a greater impact (T1 and T3) either had more male phenotypes than the other traits (T1) or had no female phenotypes (T3) (Table 2). For T1, the number of phenotypes on males was 57 % of the number of phenotypes on females, but for T2 and T4 the number of phenotypes on males was roughly 27 % of the number of phenotypes on females. In contrast to using a training population with only males, using genotypes for both sexes improved accuracies for all traits except for T3, for which females had no phenotypes. When males were evaluated, including only female genotypes increased the accuracy only slightly. Likewise, when females were evaluated, including only male genotypes hardly increased accuracies. The same trend was observed by Cooper et al. [32] in a study on the US Holstein population.
Accuracy for pedigree and genomic parent average for genotyped and non-genotyped birds
For males in the validation population, accuracy improved significantly when male genotypes were added to the training population (Fig. 3). Similarly, for females, accuracy improved significantly when female genotypes were added. Consequently, genotypes for a particular sex that are linked to phenotypic information benefit the genotyped birds of that sex. Cooper et al. [32] showed that using only female genotypes in the training population, as opposed to using genotypes only on males, was advantageous for predicting the GEBV for cows, and the same was true for bulls; however, adding female genotypes to an already existing training population of bulls resulted in a very small benefit.
In our study, when genotypes of both sexes were included, as opposed to using genotypes for one sex, there was an additional increase in accuracy for each sex (Fig. 3). This may be caused by the contribution of males versus females to the population being different in broiler chickens than in dairy cattle, in which males have a much greater impact on the population due to larger progeny groups. Part of this increase is likely due to the use of the ssGBLUP method, which can model phenotypes and genotypes from both sexes when genotypes are not available for the entire population. This method weights the records of males and females and avoids double-counting of phenotypic and pedigree information. It also establishes connections among more animals with independent information (since it avoids double-counting) through genomic relationships, and combines genomic and pedigree predictions.
The increase in accuracy from including genotypes of the opposite sex was greater for validation males than for validation females (Fig. 3). This could be due to several factors: (1) the number of genotypes for females was much larger than that for males and consequently more links were established through H (as G is identical by state) and estimates of DGV and PP were improved; (2) genetic correlations between phenotypes on males and females differ from 1 (our study assumes a correlation of 1); or (3) genomic imprinting is present and thus gene expression depends on the parental origin of the allele.
The relative increase in accuracy for females from adding male genotypes was larger for trait T1 than for T4 because T1 had a larger number of male phenotypes (4648) than trait T4 (2017 male phenotypes) (Table 2 and Fig. 3). Since accuracy was computed as the correlation between EBV or GEBV and phenotypes corrected for fixed effects, no accuracy could be computed for T3 for females because this trait was only recorded for males. Therefore, there was no improvement in accuracy of GEBV from adding female genotypes for T3. In fact, the accuracy deteriorated slightly from 0.50 to 0.46, although adding genotypes is not expected to decrease accuracy if the model is correct, the genomic information is accurate, and all selection is accounted for. Thus, the observed decrease in accuracy could be due to modeling issues, e.g., insufficient modeling of factors associated with T3, structure of the validation population, unaccounted selection, or sexual dimorphism.
Our study ignored sexual dimorphism [16, 35, 36] because genetic correlations between sexes were assumed to be equal to 1. If this assumption does not hold, realized accuracies could be higher with proper modeling. Follow-up research is required to evaluate the change in ranking for animals evaluated for different traits when sexual dimorphism is accounted for and genomic information is available.
Realized accuracy and accuracy from the inverse of the coefficient matrix of the mixed model equations
Accuracies in genomic selection depend on the number, distribution, and contributions of genotypes and phenotypes to the genomic evaluation. Contrary to what has been reported for dairy cattle, in this chicken population, the gain in accuracy of GEBV for young genotyped animals was higher when the training population included genotypes for both males and females. We also observed that when the training population has only animals from one sex, the greatest benefit is for young genotyped animals from the same sex. However, when both sexes are genotyped, the amount of genomic information increases greatly and accuracy of GEBV also increases. Thus, genotyping both sexes may be a suitable option in species and production systems for which not only males but also females have a high reproductive impact. For highly selected traits, realized accuracy of GEBV is smaller because it accounts for selection.
This study was partially supported by USDA Agriculture and Food Research Initiative (Grant no. 2009-65205-05665 from the USDA National Institute of Food and Agriculture Animal Genome Program). We would like to thank Cobb-Vantress Inc. (Siloam Springs, AR) for providing access to the dataset, and Robyn Sapp for helping with data details. Helpful comments from the anonymous reviewers are also gratefully acknowledged.
1. VanRaden PM, Van Tassell CP, Wiggans GR, Sonstegard TS, Schnabel RD, Taylor JF, et al. Invited review: reliability of genomic predictions for North American Holstein bulls. J Dairy Sci. 2009;92:16–24.
2. Daetwyler HD, Kemper KE, van der Werf JH, Hayes BJ. Components of the accuracy of genomic prediction in a multi-breed sheep population. J Anim Sci. 2012;90:3375–84.
3. Pszczola M, Strabel T, Mulder HA, Calus MPL. Reliability of direct genomic values for animals with different relationships within and to the reference population. J Dairy Sci. 2012;95:389–400.
4. Rendel JM, Robertson A. Estimation of genetic gain in milk yield by selection in a closed herd of dairy cattle. J Genet. 1950;50:1–8.
5. Schaeffer LR. Strategy for applying genome-wide selection in dairy cattle. J Anim Breed Genet. 2006;123:218–23.
6. Wiggans GR, Cooper TA, VanRaden PM, Cole JB. Technical note: adjustment of traditional cow evaluations to improve accuracy of genomic predictions. J Dairy Sci. 2011;94:6188–93.
7. Tsuruta S, Misztal I, Lawlor TJ. Short communication: genomic evaluations of final score for US Holsteins benefit from the inclusion of genotypes on cows. J Dairy Sci. 2013;96:3332–5.
8. Harris BL, Winkelman AM, Johnson DL. Impact of including a large number of female genotypes on genomic selection. Interbull Bull. 2013;47:23–7.
9. Di Croce FA, Osterstock JB, Weigel DJ, Lormore MJ. Gains in reliability with genomic information in US commercial Holstein heifers [abstract]. J Dairy Sci. 2014;97:154.
10. Legarra A, Granie CR, Manfredi E, Elsen JM. Performance of genomic selection in mice. Genetics. 2008;180:611–8.
11. Bijma P. Accuracies of estimated breeding values from ordinary genetic evaluations do not reflect the correlation between true and estimated breeding values in selected populations. J Anim Breed Genet. 2012;129:345–58.
12. Strandén I, Christensen OF. Allele coding in genomic evaluation. Genet Sel Evol. 2011;43:25.
13. VanRaden PM. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91:4414–23.
14. Vitezica ZG, Aguilar I, Misztal I, Legarra A. Bias in genomic predictions for populations under selection. Genet Res (Camb). 2011;93:357–66.
15. Groenen MA, Megens HJ, Zare Y, Warren WC, Hillier LW, Crooijmans RP, et al. The development and characterization of a 60 K SNP chip for chicken. BMC Genomics. 2011;12:274.
16. Closter AM, van As P, Elferink MG, Crooijmans RPMA, Groenen MAM, Vereijken ALJ, et al. Genetic correlation between heart ratio and body weight as a function of ascites frequency in broilers split up into sex and health status. Poult Sci. 2012;91:556–64.
17. Aguilar I, Misztal I, Johnson DL, Legarra A, Tsuruta S, Lawlor TJ. Hot topic: a unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score. J Dairy Sci. 2010;93:743–52.
18. Christensen OF, Lund MS. Genomic prediction when some animals are not genotyped. Genet Sel Evol. 2010;42:2.
19. Chen CY, Misztal I, Aguilar I, Legarra A, Muir WM. Effect of different genomic relationship matrices on accuracy and scale. J Anim Sci. 2011;89:2673–9.
20. Christensen OF. Compatibility of pedigree-based and marker-based relationship matrices for single-step genetic evaluation. Genet Sel Evol. 2012;44:37.
21. Aguilar I, Misztal I, Legarra A, Tsuruta S. Efficient computation of the genomic relationship matrix and other matrices used in single-step evaluation. J Anim Breed Genet. 2011;128:422–8.
22. Tsuruta S, Misztal I, Strandén I. Use of the preconditioned conjugate gradient algorithm as a generic solver for mixed-model equations in animal breeding applications. J Anim Sci. 2001;79:1166–72.
23. VanRaden PM, Wiggans GR. Derivation, calculation, and use of national animal model information. J Dairy Sci. 1991;74:2737–46.
24. VanRaden PM, Wright JR. Measuring genomic pre-selection in theory and in practice. Interbull Bull. 2013;47:147–50.
25. Mulder HA, Calus MPL, Druet T, Schrooten C. Imputation of genotypes with low-density chips and its effect on reliability of direct genomic values in Dutch Holstein cattle. J Dairy Sci. 2012;95:876–89.
26. Legarra A, Aguilar I, Misztal I. A relationship matrix including full pedigree and genomic information. J Dairy Sci. 2009;92:4656–63.
27. Misztal I, Tsuruta S, Aguilar I, Legarra A, VanRaden PM, Lawlor TJ. Methods to approximate reliabilities in single-step genomic evaluation. J Dairy Sci. 2013;96:647–54.
28. Garcia-Cortes LA, Legarra A, Chevalet C, Toro MA. Variance and covariance of actual relationships between relatives at one locus. PLoS One. 2013;8:e57003.
29. Hill WG, Weir BS. Variation in actual relationship as a consequence of Mendelian sampling and linkage. Genet Res (Camb). 2011;93:47–64.
30. Wang H, Misztal I, Legarra A. Differences between genomic-based and pedigree-based relationships in a chicken population, as a function of quality control and pedigree links among individuals. J Anim Breed Genet. 2014;131:445–51.
31. Forneris NS, Legarra A, Vitezica ZG, Tsuruta S, Aguilar I, Misztal I, et al. Quality control of genotypes using heritability estimates of gene content at the marker. Genetics. 2015;199:675–81.
32. Cooper TA, Wiggans GR, VanRaden PM. Short communication: analysis of genomic predictor population for Holstein dairy cattle in the United States–effects of sex and age. J Dairy Sci. 2015;98:2785–8.
33. Pszczola M, Strabel T, van Arendonk JAM, Calus M. The impact of genotyping different groups of animals on accuracy when moving from traditional to genomic selection. J Dairy Sci. 2012;95:5412–21.
34. de Koning DJ, Rattink AP, Harlizius B, van Arendonk JA, Brascamp EW, Groenen MA. Genome-wide scan for body composition in pigs reveals important role of imprinting. Proc Natl Acad Sci USA. 2000;97:7947–50.
35. Mignon-Grasteau S, Beaumont C, Poivey JP, Rochambeau H. Estimation of the genetic parameters of sexual dimorphism of body weight in ‘label’ chickens and Muscovy ducks. Genet Sel Evol. 1998;30:481–91.
36. Maniatis G, Demiris N, Kranis A, Banos G, Kominakis A. Genetic analysis of sexual dimorphism of body weight in broilers. J Appl Genet. 2013;54:61–70.
37. Edel C, Neuner S, Emmerling R, Gotz KU. A note on ‘forward prediction’ to assess precision and bias of genomic predictions. Interbull Bull. 2012;46:16–9.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.