Bias, dispersion, and accuracy of genomic predictions for feedlot and carcase traits in Australian Angus steers

Alexandre, Pâmela A.; Li, Yutao; Hine, Brad C.; Duff, Christian J.; Ingham, Aaron B.; Porto-Neto, Laercio R.; Reverter, Antonio

doi:10.1186/s12711-021-00673-8

Research Article
Open access
Published: 26 September 2021

Pâmela A. Alexandre¹,
Yutao Li¹,
Brad C. Hine²,
Christian J. Duff³,
Aaron B. Ingham¹,
Laercio R. Porto-Neto¹ &
Antonio Reverter ORCID: orcid.org/0000-0002-4681-9404¹

Genetics Selection Evolution volume 53, Article number: 77 (2021)

A Correction to this article was published on 15 November 2021

This article has been updated

Abstract

Background

Improving feedlot performance, carcase weight and quality is a primary goal of the beef industry worldwide. Here, we used data from 3408 Australian Angus steers from seven year-of-birth (YOB) cohorts (2011–2017) with a minimal level of sire linkage and that were genotyped for 45,152 SNPs. Phenotypic records included two feedlot and five carcase traits, namely average daily gain (ADG), average daily dry matter intake (DMI), carcase weight (CWT), carcase eye muscle area (EMA), carcase Meat Standards Australia marbling score (MBL), carcase ossification score (OSS) and carcase subcutaneous rib fat depth (RIB). Using a 7-way cross-validation based on YOB cohorts, we tested the quality of genomic predictions using the linear regression (LR) method compared to the traditional method (Pearson’s correlation between the genomic estimated breeding value (GEBV) and its associated adjusted phenotype divided by the square root of heritability); explored the factors, such as heritability, validation cohort, and phenotype, that affect estimates of accuracy, bias, and dispersion calculated with the LR method; and suggested a novel interpretation for translating differences in accuracy into phenotypic differences, based on GEBV quartiles (Q1Q4).

Results

Heritability (h²) estimates were generally moderate to high (from 0.29 for ADG to 0.53 for CWT). We found a strong correlation (0.73, P-value < 0.001) between accuracies using the traditional method and those using the LR method, although the LR method was less affected by random variation within and across years and showed a better ability to discriminate between extreme GEBV quartiles. We confirmed that bias of GEBV was not significantly affected by h², validation cohort or trait. Similarly, validation cohort was not a significant source of variation for any of the GEBV quality metrics. Finally, we observed that the phenotypic differences were larger for higher accuracies.

Conclusions

Our estimates of h² and GEBV quality metrics suggest a potential for accurate genomic selection of Australian Angus for feedlot performance and carcase traits. In addition, the Q1Q4 measure presented here easily translates into possible gains of genomic selection in terms of phenotypic differences and thus provides a more tangible output for commercial beef cattle producers.

Background

Genomic selection represents a revolution in animal breeding. It enables the identification of superior animals through the estimation of genomic estimated breeding values (GEBV) for relevant quantitative traits, and has led to dramatic genetic progress in farm animals during the last two decades [1,2,3]. This approach is based on the expectation that quantitative trait loci (QTL) are in linkage disequilibrium (LD) with one or more single nucleotide polymorphisms (SNPs) in such a way that a sufficiently dense SNP panel, covering the entire genome, would be able to capture the genetic effects of QTL [4]. Thus, the sum of the estimated effects of all SNP genotypes of an animal is considered to be a predictor of its breeding value [5]. However, the accuracy of GEBV depends on several factors including the size of the reference population, the heritability of the traits and the extent of the LD between SNPs and QTL [6, 7]. One of the most important advantages of genomic selection is the ability to select unproven young candidates; however, accurate predictions are required to support confident decision making. Therefore, Legarra and Reverter [8] have proposed the linear regression (LR) method, which provides population-based semi-parametric estimates of accuracy and bias of GEBV by comparing predictions based on partial and whole data. This cross-validation method has been validated and applied to data from several species, including cattle, sheep, pigs, chickens and trout [9,10,11,12,13,14,15]. One recent finding is the need to assess biases and accuracies using various criteria (truncation points) to define partial vs. whole comparisons so that the effect of random variation across years is accounted for [16]. Here, we used the LR method to evaluate GEBV for feedlot and carcase traits in Australian Angus cattle from a dataset spanning seven years of birth cohorts with a minimal level of sire linkage across cohorts.

In beef cattle, genomic prediction offers an opportunity to evaluate, at an early age, traits that are difficult and/or expensive to measure, or can only be measured post-mortem, such as carcase traits. Few studies have assessed the predictive accuracies of GEBV for feedlot and carcase traits in cattle. For example, GEBV for average daily weight gain in feedlot-finished Nellore steers that were generated using Bayesian models, have been previously reported with accuracies ranging from 0.18 to 0.27 [17]. Similarly, using Bayesian and genomic BLUP methods, Bolormaa et al. [18] reported an average GEBV accuracy of 0.27 across different carcase and meat quality traits in Bos taurus, Bos indicus, and composite beef breeds, and a large variation in accuracy between breeds and between traits. Indeed, it is well established that considerable variation exists between breeds for body composition and meat quality traits, further highlighting the importance of evaluating these traits in specific populations [19].

In the Australian cattle herd, Angus is the dominant breed, with an estimated 5.6 million females influenced by Angus genetics, which accounts for 48% of the national female herd [20]. Considering its importance, our aim was to determine the potential for accurate genomic selection of Australian Angus for feedlot performance, carcase weight and quality by assessing the accuracy of GEBV for these traits using the traditional and the LR methods.

While the LR method has received substantial attention since its development [8], the statistics that it proposes for assessing the quality of genomic predictions have not been widely tested as a function of time (e.g., truncation on year of birth) or with other validation datasets that are not time-dependent. In addition, changes in GEBV accuracies (and other quality metrics) that are observed due to the use of different models and/or validation populations are usually explored separately for different phenotypes. Further compounding these issues is the lack of a clear understanding of the relationship between accuracy values and how much individuals that are extreme based on GEBV will differ in performance. While genetic progress is proportional to accuracy and drives breeding programs for seedstock producers, how changes in accuracy translate to phenotypic differences in commercial settings is poorly understood. An attempt to address this question was reported in [21], in which the distribution of phenotypic values was evaluated after assigning animals to quartiles based on their GEBV.

Here, we complement previous studies in three major aspects: (1) by testing the quality of genomic predictions using the LR method for a complete range of traits that are relevant to feedlot performance and carcase yield and quality and are key components of the beef industry in Australia and worldwide; (2) by exploring the factors, such as heritability estimate, validation cohort and phenotype, that affect the estimates of accuracy, bias and dispersion calculated with the LR method; and (3) by suggesting a novel interpretation for translating differences in accuracy into possible gains of genomic selection in terms of phenotypic differences, providing a more tangible output for beef cattle producers.

Methods

The data for this study were collected as part of the Australian Angus Sire Benchmarking Program (ASBP), a major initiative of Angus Australia [22] with support from Meat and Livestock Australia (MLA) and industry partners. This program aims to generate data on steers that are progeny of modern Angus sires, particularly for hard-to-measure traits such as feed efficiency, abattoir carcase measurements, meat quality attributes, and female reproduction. For the development of the ASBP, each cohort of steers included progeny of a genetically diverse range of sires, which were nominated by breeders from all the states of Australia and New Zealand, while some cohorts also included progeny of sires from the USA and the UK. The sires in each cohort were predominantly young bulls (2–3 years of age), along with a few older influential bulls [23]. For the current study, the dataset included phenotypes and fixed effect information for 3408 Australian Angus steers from seven year-of-birth cohorts (YOB, 2011–2017) for which genotypes for 45,152 autosomal SNPs were available.

The 3408 steers represent 12 breeding properties (herds) and 294 sires with an average of 11.5 progeny per sire, ranging from 1 to 27. In total, 2773 dams were included in the dataset with an average of 1.22 progeny per dam, ranging from 1 to 4. Across the seven YOB cohorts, the numbers of dams with one, two, three, and four progenies were 2221, 485, 65 and 2, respectively. The numbers of progeny (number of sires in brackets) in the YOB cohorts from 2011 to 2017 were 361 (35), 514 (48), 579 (44), 274 (25), 569 (49), 575 (63) and 536 (56), respectively.

Seven phenotypes were analysed including feedlot average daily gain (ADG, kg/d), feedlot average daily dry matter intake (DMI, kg/d), carcase weight (CWT, kg), carcase eye muscle area (EMA, cm²), carcase Meat Standards Australia marbling score (MBL, score), carcase ossification score (OSS, score) and carcase subcutaneous rib fat depth (RIB, mm). Table 1 provides summary statistics for these phenotypes. ADG, DMI, CWT, EMA, and RIB were measured as described in [24]. MBL was measured in scores ranging from 100 to 1100 in increments of 10, with higher scores indicating greater marbling [25]. Finally, OSS scores ranged from 100 to 590 in increments of 10, with lower scores indicating less physiological maturity [26, 27].

Table 1 Summary statistics including number of records, mean, standard deviation (SD), minimum (Min) and maximum (Max) for traits and covariates included in this study

Variance components, heritabilities, and genetic and residual correlations were estimated using the Qxpak5 software package [28]. For this purpose, a linear mixed model was used to analyse all traits, which included a fixed effect of contemporary group (CG), i.e. an amalgamation of property of origin, year and month of birth, and date of measurement, and effects of age of dam (AOD) at birth of the calf (in years) and age at measurement (as a linear covariate). CG were not the same for feedlot and carcase traits because measurement dates differed. In addition, the random additive polygenic and residual effects were fitted with assumed distributions $N(\mathbf{0}, \mathbf{G}\otimes{\mathbf{V}}_{\mathbf{G}})$ and $N(\mathbf{0}, \mathbf{I}\otimes{\mathbf{V}}_{\mathbf{R}})$, respectively, where $\mathbf{G}$ represents the genomic relationship matrix (GRM) generated using the first method of VanRaden [29], ${\mathbf{V}}_{\mathbf{G}}$ is the genetic variance–covariance matrix, $\mathbf{I}$ is an identity matrix, ${\mathbf{V}}_{\mathbf{R}}$ is the residual variance–covariance matrix and $\otimes$ represents the Kronecker product. Two different analyses were undertaken to generate estimates for the whole and partial datasets. First, a multivariate (7-variate) analysis was performed with all seven traits. The resulting GEBV from this multivariate analysis are termed ${\widehat{\mathbf{u}}}_{\mathrm{w}}$ to indicate that they are based on the whole dataset and will be used as the calibration in the computation of the accuracy and bias with the LR method. Next, a series of 49 univariate analyses was undertaken, each with a single trait and with the phenotypes of the animals from one YOB cohort, taken in turn, treated as missing. Hence, the 49 analyses originate from seven traits by seven YOB cohorts. The resulting GEBV from these univariate analyses are termed ${\widehat{\mathbf{u}}}_{p}$ to indicate that they are based on partial data and will be used as validation data.
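The construction of the GRM can be illustrated with a minimal Python sketch of VanRaden's first method, in which centred genotypes are cross-multiplied and scaled by twice the sum of p(1 − p) over SNPs. This is not the Qxpak5 implementation: the function name `vanraden_grm`, the 0/1/2 allele-count coding, the use of observed allele frequencies, and the simulated toy genotypes are all assumptions made for illustration.

```python
import numpy as np


def vanraden_grm(genotypes: np.ndarray) -> np.ndarray:
    """Genomic relationship matrix following VanRaden's first method.

    genotypes: (n_animals, n_snps) array of allele counts coded 0, 1 or 2
    (an assumed coding; the source does not state the coding used).
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0             # observed allele frequency per SNP
    z = m - 2.0 * p                      # centre each SNP column at its mean
    scale = 2.0 * np.sum(p * (1.0 - p))  # sum over SNPs of 2p(1 - p)
    return z @ z.T / scale


# Hypothetical toy data: 50 unrelated animals, 500 SNPs (not the study data).
rng = np.random.default_rng(0)
toy_freq = rng.uniform(0.1, 0.9, size=500)
toy_geno = rng.binomial(2, toy_freq, size=(50, 500))
G = vanraden_grm(toy_geno)
```

For unrelated animals simulated in this way, the diagonal elements of `G` average close to 1 and the off-diagonal elements close to 0.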

To ascertain the quality of the resulting GEBV in the validation population (i.e. the elements of ${\widehat{\mathbf{u}}}_{p}$ corresponding to the focal individuals in the validation population), we used the following four metrics:

(1) Traditional accuracy (${\mathrm{ACC}}_{\mathrm{T}}$): Pearson’s correlation ($r$) between a GEBV and its associated adjusted phenotype (${\mathbf{y}}^{*}$; phenotype $y$ adjusted for CG fixed effects and covariates) for individuals in the validation population was divided by the square root of the heritability [18]:

$${\text{ACC}}_{{\text{T}}} = \frac{{r\left( {{\hat{\mathbf{u}}}_{p} , {\mathbf{y}}^{*} } \right)}}{{\sqrt {h^{2} } }}.$$

(2) Bias calculated with the LR method (${\text{Bias}}_{{{\text{LR}}}}$): the difference between the average GEBV of individuals in the validation population obtained with the partial data and that obtained with the whole data [8, 15]:

$${\text{Bias}}_{{{\text{LR}}}} = \overline{{{\hat{\mathbf{u}}}_{p} }} - \overline{{{\hat{\mathbf{u}}}_{w} }} .$$

(3) Dispersion calculated with the LR method (${\text{Disp}}_{{{\text{LR}}}}$): for individuals in the validation population, dispersion was measured from the slope of the regression of ${\hat{\mathbf{u}}}_{w}$ on ${\hat{\mathbf{u}}}_{p}$ [8, 15]:

$${\text{Disp}}_{{{\text{LR}}}} = \frac{{cov\left( {{\hat{\mathbf{u}}}_{w} , {\hat{\mathbf{u}}}_{p} } \right)}}{{var\left( {{\hat{\mathbf{u}}}_{p} } \right)}}.$$

(4) Accuracy calculated with the LR method (${\text{ACC}}_{{{\text{LR}}}}$): for individuals in the validation population, ${\text{ACC}}_{{{\text{LR}}}}$ was computed as follows [8, 15]:

$${\text{ACC}}_{{{\text{LR}}}} = \sqrt {\frac{{cov\left( {{\hat{\mathbf{u}}}_{w} , {\hat{\mathbf{u}}}_{p} } \right)}}{{\left( {1 + \overline{F} - 2\overline{f}} \right)\sigma_{g, \infty }^{2} }}} ,$$

where $\overline{F}$ is the average inbreeding coefficient, $2\overline{f}$ is the average relationship between individuals, and $\sigma_{g, \infty }^{2}$ is the genetic variance at equilibrium in a population under selection. Assuming the individuals in the validation population are not under selection, $\sigma_{g, \infty }^{2}$ was estimated by the additive genetic variance estimated from the partial dataset.
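The four metrics defined above can be computed from the GEBV vectors of the validation animals, as in the following minimal Python sketch. It is an illustration rather than the code used in the study: the function and argument names are hypothetical, `mean_rel` stands for the term $2\overline{f}$, the defaults of zero for `mean_f` and `mean_rel` are placeholder assumptions, and sample (n − 1) variances and covariances are assumed.

```python
import numpy as np


def lr_metrics(u_partial, u_whole, sigma2_g, mean_f=0.0, mean_rel=0.0):
    """Bias, dispersion and accuracy of GEBV with the LR method.

    u_partial, u_whole: GEBV of the validation animals from the partial and
    whole datasets. sigma2_g: additive genetic variance. mean_f: average
    inbreeding coefficient. mean_rel: average relationship (the 2f-bar term).
    """
    u_p = np.asarray(u_partial, dtype=float)
    u_w = np.asarray(u_whole, dtype=float)
    cov_wp = np.cov(u_w, u_p, ddof=1)[0, 1]
    bias = u_p.mean() - u_w.mean()             # expected value 0
    dispersion = cov_wp / np.var(u_p, ddof=1)  # slope of u_w on u_p, expected 1
    # A negative covariance would give NaN here; such a case needs inspection.
    accuracy = np.sqrt(cov_wp / ((1.0 + mean_f - mean_rel) * sigma2_g))
    return {"bias": bias, "dispersion": dispersion, "accuracy": accuracy}


def traditional_accuracy(u_partial, y_adjusted, h2):
    """Correlation between GEBV and adjusted phenotype, divided by sqrt(h2)."""
    r = np.corrcoef(u_partial, y_adjusted)[0, 1]
    return r / np.sqrt(h2)
```

In a cross-validation based on cohorts, both functions would be called once per trait and validation cohort, using only the animals of that cohort.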

Then, to characterise the factors affecting the GEBV quality metrics, accuracy, bias and dispersion were treated as dependent variables in an ANOVA model that included h² estimate, validation cohort and trait as independent predictor variables.

Finally, using only the animals in the validation population, we ranked animals based on GEBV, identified those in the highest (Q1) and lowest (Q4) quartiles of the GEBV scale, and calculated the difference (Q1Q4) between the adjusted phenotypes of these two sets of animals. Then, we used the following models to evaluate the relationship between individual GEBV accuracy metrics and Q1Q4 using the PROC GLM program (SAS Inst. Inc.):

$${\text{Q}}1{\text{Q}}4 = {\text{Trait}} + {\text{Cohort}} + {\text{ACC}}_{{\text{T}}} + {\mathbf{e}},$$

$${\text{Q}}1{\text{Q}}4 = {\text{Trait}} + {\text{Cohort}} + {\text{ACC}}_{{{\text{LR}}}} + {\mathbf{e}},$$

where ${\text{Q}}1{\text{Q}}4$ is the difference, in SD units, between the highest and the lowest quartile for adjusted phenotypes based on GEBV ranking, ${\text{Trait}}$ corresponds to the seven phenotypes analysed, ${\text{Cohort}}$ corresponds to the seven validation cohorts, and ${\mathbf{e}}$ is the vector of residual effects.
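The Q1Q4 difference described above can be sketched in Python as follows. The function name, the use of the 25th and 75th percentiles of GEBV as quartile boundaries, and the standard error formula (that of a difference between two independent means) are assumptions for illustration; the text does not specify how ties or the SE were handled.

```python
import numpy as np


def q1q4_difference(gebv, y_adjusted, phenotypic_sd=None):
    """Difference in mean adjusted phenotype between the highest (Q1) and
    lowest (Q4) GEBV quartiles of the validation animals.

    Returns (difference, standard error). If phenotypic_sd is given, both
    values are expressed in SD units.
    """
    gebv = np.asarray(gebv, dtype=float)
    y = np.asarray(y_adjusted, dtype=float)
    low_cut, high_cut = np.quantile(gebv, [0.25, 0.75])
    top = y[gebv >= high_cut]    # Q1: animals with the highest GEBV
    bottom = y[gebv <= low_cut]  # Q4: animals with the lowest GEBV
    diff = top.mean() - bottom.mean()
    # Assumed SE: difference between two independent group means.
    se = np.sqrt(top.var(ddof=1) / top.size + bottom.var(ddof=1) / bottom.size)
    if phenotypic_sd is not None:
        diff, se = diff / phenotypic_sd, se / phenotypic_sd
    return diff, se
```

The resulting values, one per trait and validation cohort, would then serve as the response variable in the two models above.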

Results

In this study, we used data from 3408 Australian Angus steers from seven YOB cohorts (2011 to 2017). These steers represented 294 sires from 12 breeding properties (or herds). A low level of sire linkage across cohorts was identified (see Additional file 1: Table S1), as was intended in the ASBP design. The 12 breeding properties contributed on average 284 animals, ranging from 57 to 495, and all except two contributed animals across three YOB cohorts. One breeding property was represented in a single YOB cohort while another one was represented in five YOB cohorts (see Additional file 1: Table S2). These sire and breeding property linkages across YOB cohorts can have an impact on the accuracies of GEBV since each cohort is used as the validation population. Of note, the GRM showed that the within- (i.e. diagonals of the GRM) and between-animal relationships (off-diagonals of the GRM) were close to the expected values of 1 and 0, respectively (see Additional file 1: Table S3). Equally interesting was the very similar variation that we observed across these two types of relationships, which indicates a single population from the point of view of genetic variation [30].

Heritability estimates were generally moderate to high, ranging from 0.30 for ADG to 0.53 for CWT (Table 2). Genetic correlations were strong and positive between ADG and DMI (0.59) and between ADG and CWT (0.65) and close to zero between MBL and OSS (− 0.01) and between MBL and RIB (− 0.09). In general, the estimates of the residual correlation were lower and closer in magnitude to zero than the genetic correlations. For instance, between the growth traits ADG, DMI and CWT, the genetic and residual correlations were estimated at ~ 0.60 and ~ 0.30, respectively. Finally, except for CWT, the estimates of the genetic and residual correlations between feedlot and carcase traits were weak.

Table 2 Genomic estimates of heritability (italics on the diagonal), genetic (above the diagonal) and residual (below the diagonal) correlations for feedlot and carcase traits

The four GEBV quality metrics (ACC_T, ACC_LR, Bias_LR, and Disp_LR; see the “Methods” section) are in Table 3. ACC_T ranged from 0.28 for ADG to 0.51 for DMI, while ACC_LR ranged from 0.44 for RIB to 0.64 for CWT. We found a strong correlation of 0.73 (P-value < 0.001) between ACC_T and ACC_LR (Fig. 1a). ACC_T were on average lower than ACC_LR (Table 3) and more variable (Fig. 1b). This resulted in a much higher coefficient of variation for ACC_T (Fig. 1c), particularly for ADG (41.06 vs. 7.79%) and OSS (37.07 vs. 10.24%). For all the traits, the Bias_LR values were close to 0 and the Disp_LR values close to 1 (Table 3), as expected in the absence of bias.

Table 3 Traditional (ACC_T) and LR (ACC_LR) accuracies, LR bias (Bias_LR) and LR dispersion (Disp_LR) of GEBV for feedlot and carcase traits from a 7-way cross-validation^a scheme based on year of birth (YOB) cohorts

In magnitude, the Bias_LR for CWT (on average 0.27 kg; Table 3) appears to be larger than that observed for the other traits. However, in relative terms, this bias is equivalent for all traits. For instance, the SD of the GEBV Bias_LR is 0.61 kg and 0.16 cm² for CWT and EMA, respectively (Table 3), which are equal to ~ 1% of the SD observed for each trait (Table 1).

By investigating the effects of heritability, validation cohort and trait on the GEBV quality metrics (Table 4) we found that, in the cross-validation scheme and for a given trait, there is a significant negative correlation between the estimated heritability and the slope of the dispersion (r = − 0.56 ± 0.089; P-value < 0.001). Based on the coefficient of determination (R²), a model that includes the effects of heritability estimate, validation cohort and trait explained 65.3, 84.9, 14.5 and 73.3% of the variation in ACC_T, ACC_LR, Bias_LR and Disp_LR, respectively. Thus, validation cohort, trait, or heritability of the trait did not significantly affect the Bias_LR of GEBV. In addition, it is important to note that validation cohort was not a significant source of variation (P-value > 0.10) for any of the four GEBV quality metrics (Table 4).

Table 4 P-value and coefficient of determination (R²) of the effect of heritability (h²), validation cohort and trait on GEBV quality metrics, including traditional accuracy (ACC_T) and accuracy (ACC_LR), bias (Bias_LR) and dispersion (Disp_LR) obtained with the LR method

After ranking the validation animals according to their GEBV and calculating the phenotypic differences (Q1Q4) between animals in the highest and lowest GEBV quartile (Table 5), we observed that, averaged across the 49 estimates (7 cohorts and 7 traits), the estimate of the Q1Q4 difference is 5.59-fold larger than its SE, which indicates the consistency of this metric. When expressed in SD units (Table 5, last row), the smallest (0.35) and largest (0.94) Q1Q4 differences were found for ADG and CWT, respectively. After adjusting for the effects of trait (P-value < 0.0001) and validation cohort (P-value > 0.05), we found that for each 0.1 increase in ACC_LR, the Q1Q4 difference increased by an average of 0.132 SD units, while for each 0.1 increase in ACC_T this increase was smaller, i.e. 0.081 SD units. In both cases, the intercept did not significantly differ from zero (P-value > 0.05).
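As a worked illustration of these regression slopes, the short Python sketch below converts a change in accuracy into the corresponding expected change in the Q1Q4 difference. The slopes are those reported above, re-expressed per unit of accuracy; the function name is hypothetical, the phenotypic SD passed for conversion to trait units is a user-supplied value rather than one taken from Table 1, and linearity over the whole accuracy range is an assumption.

```python
# Slopes reported in the text (0.132 and 0.081 SD units per 0.1 increase in
# accuracy), re-expressed per unit of accuracy.
SLOPE_PER_UNIT = {"ACC_LR": 1.32, "ACC_T": 0.81}


def q1q4_change(delta_accuracy, metric="ACC_LR", phenotypic_sd=1.0):
    """Expected change in the Q1Q4 difference for a given change in accuracy.

    With phenotypic_sd=1.0 the result is in SD units; passing a phenotypic SD
    (hypothetical input) converts it to trait units.
    """
    return SLOPE_PER_UNIT[metric] * delta_accuracy * phenotypic_sd


# A 0.2 gain in ACC_LR corresponds to about 0.264 SD units of Q1Q4 difference,
# compared with about 0.162 SD units for the same gain in ACC_T.
gain_lr = q1q4_change(0.2, metric="ACC_LR")
gain_t = q1q4_change(0.2, metric="ACC_T")
```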

Table 5 Average difference (± SE) in adjusted phenotypes between the highest and lowest quartile (Q1Q4) based on GEBV ranking

Discussion

Genomic predictions need to be accurate to be successfully implemented. The accuracy of predictions depends highly on size of the reference population, relatedness between test animals and those in the reference population, and heritability of the target traits, but it can also vary between different breeds and populations [18]. Here, we tested the accuracy of genomic predictions for seven feedlot and carcase traits that were generated using 3408 Australian Angus steers genotyped for 45,152 SNPs. Our estimates of genetic parameters for Australian Angus were all genomic-based and no pedigree data was used in the estimation process. Heritability estimates as well as ACC_T and ACC_LR were moderate to high. ACC_T were highly correlated with ACC_LR. Since the lowest ACC_LR value obtained was 0.44 (for RIB and OSS), and the measures of bias and dispersion fell within expected values, our results provide evidence of the potential for accurate genomic selection of the evaluated traits in Australian Angus cattle.

We have shown that the 7-way cross-validation scheme implemented here, based on YOB cohorts within the same population, is as accurate as genomic prediction using a training set from a different (target) subpopulation [9]. In that work [9], the authors argued that genomic predictions using genetically heterogeneous training sets could provide more flexibility and showed that a training set that includes animals from genetically related lines can be as valuable as a training set from the target population. In our study, since the YOB cohorts used to generate the validation populations presented a low level of sire linkage, we could use this experimental design.

Heritability estimates ranged from 0.30 for ADG to 0.53 for CWT which is consistent with previously reported values. For instance, Somavilla et al. [17] using Bayesian genomic best linear unbiased prediction (GBLUP) to evaluate feedlot ADG in Nellore cattle reported a heritability of 0.31, and Su et al. [31] reported heritabilities of 0.48 and 0.43 for marbling score and of 0.51 and 0.34 for CWT, in Hereford and Simmental cattle, respectively. In Angus cattle, a previous study using animals from the ASBP but based on pedigree information only, reported heritabilities of 0.33, 0.34, 0.52, 0.55 and 0.66 for ADG, RIB, EMA, DMI, and CWT, respectively [24].

Genetic correlations were high and positive between feedlot and weight traits (ADG, DMI and CWT) and close to zero between carcase quality traits (MBL, OSS and RIB). Moreover, low correlations were observed between these two groups of traits. These results corroborate the findings from previous studies that found lower correlations between live/carcass weight and traits such as fat deposition and marbling [32]. Particularly in Angus cattle, similar results based on pedigree information have been reported using a subset of six [24] and four [33] of the seven YOB cohorts used here. In those studies, the standard error (SE) associated with pedigree-based estimates of h² and genetic correlation ranged from 0.06 to 0.11 and from 0.04 to 0.27, respectively. In the literature on livestock genomics, there is ample evidence showing that the SE associated with genomic estimates of genetic parameters is lower than that associated with pedigree-based estimates (see for instance [34,35,36]), which is attributed to the genomic relationship matrix being more informative than the pedigree-based numerator relationship matrix.

Based on a simulation study, Macedo et al. [15] showed that the LR method works in the presence of selection and verified that LR accuracies agreed with theoretical accuracies once the Bulmer effect is correctly accounted for. In the current study, we used real data and report that the ACC_T and ACC_LR for each trait were highly correlated (r = 0.73; P-value < 0.001). One key advantage of the LR method for computing accuracy is that it does not need adjustment factors to pre-correct phenotypes, which are themselves estimates and prone to errors, for instance, in situations with many contemporary groups each with few records or when heritability is poorly estimated (i.e. when the selection process is inadequately described in the data and environmental trends are present). Instead, the LR method obviates the need for adjustment factors and has been shown to perform optimally even if the model uses an incorrect heritability or if a hidden trend exists in the data [15].

It is worth noting that the complete dataset was used to obtain estimates of CG fixed effects and covariates, and these estimates were used to adjust the phenotypes of individuals in the validation population. These adjusted phenotypes were needed in the computation of ACC_T and Q1Q4. Animals in the validation and training sets were raised in different CG. Therefore, the only linkage between these animals is through genomic relationships and no link was created as a consequence of using records in the validation sets to obtain the estimates for the precorrection. However, while the key advantage of ACC_LR is that it does not require estimating adjustment factors from fixed effects corresponding to the validation population, whether that is a sufficient argument to favour ACC_LR over ACC_T cannot be determined with certainty because it is likely that they capture different aspects of predictions.

In agreement with previous studies, our results suggest that the accuracy for carcase traits is higher than for live animal body composition traits [37] and that the accuracy is higher for traits with a higher heritability [18, 38]. In fact, a high correlation (r = 0.91, P-value < 0.001) was observed between heritability and GEBV accuracy. An absence of GEBV bias was indicated by values close to zero for all traits. Bias was not significantly influenced by validation cohort, heritability of the trait, or trait. In the absence of bias, the expected value of dispersion is 1. Although a negative correlation between heritability and dispersion was observed, such that higher estimates of heritability were associated with overdispersion in the resulting GEBV, Disp_LR values were mostly around 1, ranging from 0.93 for OSS and RIB to 1.12 for DMI.

The breeding properties that contributed data to the ASBP were selected on a YOB basis and on their ability to supply data on hard-to-measure traits and from sires that were not already represented in other YOB. This particular structure allows for a unique paradigm by which each YOB cohort can be considered as a truly independent validation dataset to generate the “partial” GEBV which, in turn, gives us the opportunity to better test the optimality of the genomic predictions than if the partial datasets were generated at random or based on the last generation (as often used to mimic the “old” versus the “recent” predictions). Indeed, analysis of the variability of accuracy estimates within and across traits and years revealed that ACC_LR were less affected by random variation within trait across years (Figs. 1b and c) and within year across traits than ACC_T. Averaged across the seven YOB cohorts, the SD of ACC_LR was 0.09 compared to 0.14 for ACC_T.

To further characterise the factors that affect GEBV quality metrics, accuracy, bias, and dispersion were treated as dependent variables in an ANOVA model that included h² estimate, validation cohort and trait as independent predictor variables (Table 4). We confirmed that bias was not significantly affected by any of the independent variables (P-value > 0.05). Similarly, in spite of the low level of sire linkage across cohorts and the varying size of the cohorts (274–579), validation cohort was not a significant source of variation for any of the GEBV quality metrics.

The high correlation between heritability and GEBV accuracy was also reflected in the phenotypic differences between validation animals in the highest and lowest GEBV quartile (Q1Q4). The higher the GEBV accuracy, the larger the phenotypic difference between quartiles and, therefore, the greater the genetic gain that could be expected when selecting for the trait. Moreover, we found a larger increase in Q1Q4 difference (0.132 SD units) for each 0.1 increase in ACC_LR than that (0.081 SD units) for the same 0.1 increase in ACC_T. These results suggest an improved ability of ACC_LR to discriminate between extreme GEBV quartiles. The fact that neither intercept was significantly different from zero indicates that when either ACC_T or ACC_LR is zero, GEBV are not different from randomly guessed values, and hence, the Q1Q4 difference is zero, as expected.

Conclusions

We have performed a series of analyses aimed at investigating the behaviour of bias, dispersion, and accuracy of GEBV according to the characteristics of the validation dataset, and the value of these quality metrics for reflecting extreme-performing individuals. The GEBV quality metrics based on the LR method, i.e. accuracy, bias, and dispersion, as well as the heritabilities reported here, suggest that there is potential for accurate genomic selection of Australian Angus for feedlot performance and carcase weight and quality.

Availability of data and materials

The datasets used and analysed during the current study are available from the corresponding author on reasonable request and upon signing a data transfer agreement.

Change history

15 November 2021
A Correction to this paper has been published: https://doi.org/10.1186/s12711-021-00677-4

References

Hayes BJ, Lewin HA, Goddard ME. The future of livestock breeding: genomic selection for efficiency, reduced emissions intensity, and adaptation. Trends Genet. 2013;29:206–14.
Article CAS Google Scholar
Goddard ME, Hayes BJ, Meuwissen THE. Genomic selection in livestock populations. Genet Res. 2010;92:413–21.
Article CAS Google Scholar
Wiggans GR, Cole JB, Hubbard SM, Sonstegard TS. Genomic selection in dairy cattle: the USDA experience. Annu Rev Anim Biosci. 2017;5:309–27.
Goddard ME, Kemper KE, MacLeod IM, Chamberlain AJ, Hayes BJ. Genetics of complex traits: prediction of phenotype, identification of causal polymorphisms and genetic architecture. Proc Biol Sci. 2016;283:20160569.
Boichard D, Ducrocq V, Croiseau P, Fritz S. Genomic selection in domestic animals: principles, applications and perspectives. C R Biol. 2016;339:274–7.
Hayes BJ, Bowman PJ, Chamberlain AJ, Goddard ME. Invited review: Genomic selection in dairy cattle: progress and challenges. J Dairy Sci. 2009;92:433–43.
VanRaden PM, Van Tassell CP, Wiggans GR, Sonstegard TS, Schnabel RD, Taylor JF, et al. Invited review: reliability of genomic predictions for North American Holstein bulls. J Dairy Sci. 2009;92:16–24.
Legarra A, Reverter A. Semi-parametric estimates of population accuracy and bias of predictions of breeding values and future phenotypes using the LR method. Genet Sel Evol. 2018;50:53.
Aliakbari A, Delpuech E, Labrune Y, Riquet J, Gilbert H. The impact of training on data from genetically-related lines on the accuracy of genomic predictions for feed efficiency traits in pigs. Genet Sel Evol. 2020;52:57.
Silva RMO, Evenhuis JP, Vallejo RL, Gao G, Martin KE, Leeds TD, et al. Whole-genome mapping of quantitative trait loci and accuracy of genomic predictions for resistance to columnaris disease in two rainbow trout breeding populations. Genet Sel Evol. 2019;51:42.
Durbin HJ, Lu D, Yampara-Iquise H, Miller SP, Decker JE. Development of a genetic evaluation for hair shedding in American Angus cattle to improve thermotolerance. Genet Sel Evol. 2020;52:63.
Cesarani A, Hidalgo J, Garcia A, Degano L, Vicario D, Masuda Y, et al. Beef trait genetic parameters based on old and recent data and its implications for genomic predictions in Italian Simmental cattle. J Anim Sci. 2020;98:skaa242.
Chu TT, Bastiaansen JWM, Berg P, Romé H, Marois D, Henshall J, et al. Use of genomic information to exploit genotype-by-environment interactions for body weight of broiler chicken in bio-secure and production environments. Genet Sel Evol. 2019;51:50.
Granado-Tajada I, Legarra A, Ugarte E. Exploring the inclusion of genomic information and metafounders in Latxa dairy sheep genetic evaluations. J Dairy Sci. 2020;103:6346–53.
Macedo FLL, Reverter A, Legarra A. Behavior of the Linear Regression method to estimate bias and accuracies with correct and incorrect genetic evaluation models. J Dairy Sci. 2020;103:529–44.
Macedo FL, Christensen OF, Astruc JM, Aguilar I, Masuda Y, Legarra A. Bias and accuracy of dairy sheep evaluations using BLUP and SSGBLUP with metafounders and unknown parent groups. Genet Sel Evol. 2020;52:47.
Somavilla AL, Regitano LCA, Rosa GJM, Mokry FB, Mudadu MA, Tizioto PC, et al. Genome-enabled prediction of breeding values for feedlot average daily weight gain in Nelore cattle. G3 (Bethesda). 2017;7:1855–9.
Bolormaa S, Pryce JE, Kemper K, Savin K, Hayes BJ, Barendse W, et al. Accuracy of prediction of genomic breeding values for residual feed intake and carcass and meat quality traits in Bos taurus, Bos indicus, and composite beef cattle. J Anim Sci. 2013;91:3088–104.
Marshall DM. Breed differences and genetic parameters for body composition traits in beef cattle. J Anim Sci. 1994;72:2745–55.
Angus Australia. Australian beef breeding insights. 2020. https://www.angusaustralia.com.au/australian-beef-breeding-insights/. Accessed 16 Jul 2021.
Hine BC, Duff CJ, Byrne A, Parnell P, Porto-Neto L, Li Y, et al. Development of Angus SteerSELECT: a genomic-based tool to identify performance differences of Australian Angus steers during feedlot finishing: phase 1 validation. Anim Prod Sci. 2021. https://doi.org/10.1071/AN21051.
Angus Australia. Angus Sire Benchmarking Program. 2020. https://www.angusaustralia.com.au/sire-benchmarking/about/general-information/. Accessed 16 Jul 2021.
Parnell PF, Duff CJ, Byrne AI, Butcher NM. The Angus sire benchmarking program—a major contributor to future genetic improvement in the Australian beef industry. In Proceedings of the 23rd Conference of the Association for the Advancement of Animal Breeding and Genetics (AAABG): 27th October-1st November 2019; Armidale. 2019; pp 492–5.
Torres-Vázquez JA, van der Werf JHJ, Clark SA. Genetic and phenotypic associations of feed efficiency with growth and carcass traits in Australian Angus cattle. J Anim Sci. 2018;96:4521–31.
McGilchrist P, Polkinghorne RJ, Ball AJ, Thompson JM. The meat standards Australia index indicates beef carcass quality. Animal. 2019;13:1750–7.
Gudex BW, McPhee MJ, Oddy VH, Walmsley BJ. Prediction of ossification from live and carcass traits in young beef cattle: model development and evaluation. J Anim Sci. 2019;97:144–55.
Watson R, Polkinghorne R, Thompson JM. Development of the Meat Standards Australia (MSA) prediction model for beef palatability. Aust J Exp Agric. 2008;48:1368–79.
Pérez-Enciso M, Misztal I. Qxpak.5: old mixed model solutions for new genomics problems. BMC Bioinformatics. 2011;12:202.
VanRaden PM. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91:4414–23.
Simeone R, Misztal I, Aguilar I, Legarra A. Evaluation of the utility of diagonal elements of the genomic relationship matrix as a diagnostic tool to detect mislabelled genotyped animals in a broiler chicken population. J Anim Breed Genet. 2011;128:386–93.
Su H, Golden B, Hyde L, Sanders S, Garrick D. Genetic parameters for carcass and ultrasound traits in Hereford and admixed Simmental beef cattle: accuracy of evaluating carcass traits. J Anim Sci. 2017;95:4718–27.
Nkrumah JD, Keisler DH, Crews DH, Basarab JA, Wang Z, Li C, et al. Genetic and phenotypic relationships of serum leptin concentration with performance, efficiency of gain, and carcass merit of feedlot cattle. J Anim Sci. 2007;85:2147–55.
Reverter A, Hine BC, Porto-Neto L, Li Y, Duff CJ, Dominik S, et al. ImmuneDEX: a strategy for the genetic improvement of immune competence in Australian Angus cattle. J Anim Sci. 2021;99:skaa384.
Lassen J, Poulsen NA, Larsen MK, Buitenhuis AJ. Genetic and genomic relationship between methane production measured in breath and fatty acid content in milk samples from Danish Holsteins. Anim Prod Sci. 2016;56:298–303.
Aldridge MN, Vandenplas J, Bergsma R, Calus MPL. Variance estimates are similar using pedigree or genomic relationships with or without the use of metafounders or the algorithm for proven and young animals. J Anim Sci. 2020;98:skaa019.
Veerkamp RF, Mulder HA, Thompson R, Calus MPL. Genomic and pedigree-based genetic parameters for scarcely recorded traits when some animals are genotyped. J Dairy Sci. 2011;94:4189–97.
Boerner V, Johnston DJ, Tier B. Accuracies of genomically estimated breeding values from pure-breed and across-breed predictions in Australian beef cattle. Genet Sel Evol. 2014;46:61.
Fernandes Júnior GA, Rosa GJM, Valente BD, Carvalheiro R, Baldi F, Garcia DA, et al. Genomic prediction of breeding values for carcass traits in Nellore cattle. Genet Sel Evol. 2016;48:7.

Acknowledgements

The authors acknowledge Angus Australia for facilitating access to progeny from the Angus Sire Benchmarking Program for testing and associated data, and thank cooperating herd owners, managers, and staff. The authors are also thankful to the Associate Editor and two anonymous reviewers for their insightful comments. The assistance of Andres Legarra in addressing some of the issues raised during the peer-review process is gratefully recognized.

Funding

This work was co-funded by Meat and Livestock Australia (MLA, on behalf of the Australian Lot Feeders’ Association), Angus Australia and CSIRO (project P.PSH.0528).

Author information

Authors and Affiliations

CSIRO, Agriculture and Food, Queensland Bioscience Precinct, 306 Carmody Rd., St Lucia, Brisbane, QLD, 4067, Australia
Pâmela A. Alexandre, Yutao Li, Aaron B. Ingham, Laercio R. Porto-Neto & Antonio Reverter
CSIRO, Agriculture and Food, F.D. McMaster Laboratory, Chiswick, New England Highway, Armidale, NSW, 2350, Australia
Brad C. Hine
Angus Australia, 86 Glen Innes Rd., Armidale, NSW, 2350, Australia
Christian J. Duff

Contributions

AR conceived the study; AR and PAA performed the formal analysis; AR, PAA, YL, LPN and BCH carried out the investigation; CJD and ABI performed curation of data; AR and PAA prepared the original draft; YL, BCH, CJD, ABI and LPN edited and prepared the final manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Antonio Reverter.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Table S1. Levels of sire linkage (negligible) across cohorts. The data provided represent the level of sire linkage across the year of birth cohorts used as validation population.
Table S2. Breeding property linkages across cohorts. The data provided represent the breeding property linkages across the year of birth cohorts used as validation population.
Table S3. Summary statistics for the genomic relationship matrix (GRM) values computed using Method 1 of VanRaden [29]. The data provided represent the summary statistics for the genomic relationship matrix (GRM) values computed using Method 1 of VanRaden [29].
Table S4. Accuracy (ACC_T) of GEBV for feedlot and carcase traits from a 7-way cross-validation scheme based on YOB cohorts. The data provided represent the accuracy (computed from the correlation between GEBV and adjusted phenotypes in the validation population divided by the square root of heritability) of GEBV for feedlot and carcase traits from a 7-way cross-validation scheme based on year of birth cohorts used as validation population.
Table S5. Estimates of GEBV bias using the LR method (Bias_LR) for feedlot and carcase traits from a 7-way cross-validation scheme based on YOB cohorts. The data provided represent the estimates of GEBV bias using the LR method for feedlot and carcase traits from a 7-way cross-validation scheme based on year of birth cohorts used as validation population.
Table S6. Estimates of GEBV dispersion using the LR method (Disp_LR) for feedlot and carcase traits from a 7-way cross-validation scheme based on YOB cohorts. The data provided represent the estimates of GEBV dispersion using the LR method for feedlot and carcase traits from a 7-way cross-validation scheme based on year of birth cohorts used as validation population.
Table S7. Estimates of GEBV accuracy using the LR method (ACC_LR) for feedlot and carcase traits from a 7-way cross-validation scheme based on YOB cohorts. The data provided represent the estimates of GEBV accuracy using the LR method for feedlot and carcase traits from a 7-way cross-validation scheme based on year of birth cohorts used as validation population.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Alexandre, P.A., Li, Y., Hine, B.C. et al. Bias, dispersion, and accuracy of genomic predictions for feedlot and carcase traits in Australian Angus steers. Genet Sel Evol 53, 77 (2021). https://doi.org/10.1186/s12711-021-00673-8

Received: 18 March 2021
Accepted: 15 September 2021
Published: 26 September 2021
DOI: https://doi.org/10.1186/s12711-021-00673-8
