Genomic evaluation for two-way crossbred performance in cattle

Mei, Quanshun; Liu, Huiming; Zhao, Shuhong; Xiang, Tao; Christensen, Ole F

doi:10.1186/s12711-023-00792-4

Research Article
Open access
Published: 17 March 2023

Quanshun Mei¹,², Huiming Liu³, Shuhong Zhao¹, Tao Xiang¹ & Ole F Christensen² (ORCID: orcid.org/0000-0002-8230-8062)

Genetics Selection Evolution volume 55, Article number: 17 (2023)

Abstract

Background

Dairy cattle production systems are mostly based on purebreds, but recently the use of crossbreeding has received increased interest. For genetic evaluations including crossbreds, several methods based on single-step genomic best linear unbiased prediction (ssGBLUP) have been proposed, including metafounder ssGBLUP (MF-ssGBLUP) and breed-specific ssGBLUP (BS-ssGBLUP). Ideally, models that account for breed effects should perform better than simple models, but knowledge on the performance of these methods is lacking for two-way crossbred cattle. In addition, the differences in the estimates of genetic parameters (such as the genetic variance component and heritability) between these methods have rarely been investigated. Therefore, the aims of this study were to (1) compare the estimates of genetic parameters for average daily gain (ADG) and feed conversion ratio (FCR) between these methods; and (2) evaluate the impact of these methods on the predictive ability for crossbred performance.

Methods

Bivariate models using standard ssGBLUP, MF-ssGBLUP and BS-ssGBLUP for the genetic evaluation of ADG and FCR were investigated. To measure the predictive ability of these three methods, we computed four estimators (bias, dispersion, population accuracy and ratio of population accuracies) using the linear regression (LR) method.

Results

The results show that, for both ADG and FCR, the heritabilities were low with the three methods. For FCR, the differences in the estimated genetic parameters were small between the three methods, while for ADG, those estimated with BS-ssGBLUP deviated largely from those estimated with the other two methods. Bias and dispersion were similar across the three methods. Population accuracies for both ADG and FCR were always higher with MF-ssGBLUP than with ssGBLUP, while with BS-ssGBLUP the population accuracy was highest for FCR and lowest for ADG.

Conclusions

Our results indicate that in the genetic evaluation for crossbred performance in a two-way crossbred cattle production system, the predictive ability of MF-ssGBLUP and BS-ssGBLUP is greater than that of ssGBLUP, when the estimated variance components are consistent across the three methods. Compared with BS-ssGBLUP, MF-ssGBLUP is more robust in its superiority over ssGBLUP.

Background

Crossbreeding is commonly used in many livestock production systems [1], especially for pigs and poultry. For dairy and beef cattle, production systems are mostly based on purebreds, but recently the use of crossbreeding between dairy breed cows and beef breed bulls has received increased interest for a number of reasons [1]. In particular, meat production from crossbreds between dairy cows and beef bulls has a lower environmental footprint than that from beef cattle [2]. Furthermore, because the improved reproductive performance of dairy cows reduces the need for replacement heifers, some dairy cows in a herd can be inseminated with beef semen.

For livestock production systems that use crossbreds, although the breeding goal is to improve crossbred performance, selection usually takes place in the purebred lines [3], which is sub-optimal since the genetic performances of purebred (PB) and crossbred (CB) animals differ [4,5,6]. By reviewing the existing literature on the genetic correlations between the performances of purebred and crossbred pigs, Wientjes and Calus [7] found an average genetic correlation of 0.6, while based on the review of 14 studies on broilers and layers, Calus et al. [8] found an average genetic correlation of 0.71. These results indicate that it is meaningful to select for CB performance as well as for PB performance in crossbred production systems.

Since 2010, single-step genomic best linear unbiased prediction (ssGBLUP) has been used as a standard genomic selection (GS) method in the pig industry, and has shown a high predictive ability for both genotyped and non-genotyped animals [9,10,11]. However, when crossbred information is considered, ssGBLUP is not well suited because of genetic differences between breeds (i.e., allele frequency, linkage disequilibrium, and gametic phase) [12]. An alternative ssGBLUP, called breed-specific ssGBLUP (BS-ssGBLUP), which integrates purebred and crossbred information, was proposed by Christensen et al. [13] based on multiple breed-specific relationship matrices [14]. Xiang et al. [5] applied this method to real pig data and validated its superiority over ssGBLUP. However, other studies have not confirmed this superiority [15, 16]. Another method called metafounder ssGBLUP (MF-ssGBLUP) has been developed by Legarra et al. [17] to model genetic differences between breeds. The differences in the estimates of genetic parameters and predictive ability between these three methods have been investigated using pig data [5, 18] and simulated data [19], but not with data on crossbreds between dairy and beef cattle; thus, more research is needed. In addition, to date, ssGBLUP and MF-ssGBLUP have been successfully used in crossbred cattle to estimate genetic parameters [20, 21], but BS-ssGBLUP has not.

Thus, the objectives of this study were to (1) compare the estimates of genetic parameters in crossbred beef and dairy cattle for average daily gain (ADG) and feed conversion ratio (FCR) with ssGBLUP, MF-ssGBLUP and BS-ssGBLUP; and (2) evaluate the impact of these methods on the predictive ability for crossbred performance.

Methods

Data

All datasets were provided by SEGES Innovation Cattle and Nordic Cattle Genetic Evaluation. In this study, 4089 two-way crossbred calves (BH) with purebred Belgian Blue beef (BBL) sires and purebred Holstein dairy (HOL) dams were on test for about 1 month. During this period, feed intake was recorded for each animal, and body weight of each animal was recorded at both the start and end of the test period. ADG (kg/d) and FCR (kg/kg) of each animal within this period were calculated as the increase in body weight divided by number of days and the average daily feed intake divided by average daily gain, respectively. After data editing of feed intake and body weight records (see Additional file 1: Fig. S1), 2592 crossbred calves were retained, with ADG available for all the calves and FCR for 2306 calves. The birth dates of these calves ranged from June 1 2019 to December 1 2021. These 2592 crossbred animals originate from 67 sires and 2419 dams, with an average number of progeny per sire of 38.7 and per dam of 1.1, and the average size of paternal half-sibling families was 37.6. The average age of these calves was 207 days (standard deviation (SD) = 34 days) at the beginning of the test, and 243 days (SD = 33 days) at the end of the test. Descriptive statistics of the phenotypes are in Table 1.
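The two trait definitions can be illustrated with a minimal Python sketch. The function name `adg_and_fcr`, its argument names and the example weights are hypothetical and are not taken from the study; only the two formulas follow the text.

```python
def adg_and_fcr(start_weight_kg, end_weight_kg, total_feed_kg, n_days):
    """Return (ADG in kg/d, FCR in kg feed per kg gain) for one calf.

    ADG = body weight gain / number of days on test.
    FCR = average daily feed intake / ADG.
    """
    if n_days <= 0:
        raise ValueError("n_days must be positive")
    adg = (end_weight_kg - start_weight_kg) / n_days
    if adg <= 0:
        # FCR is not meaningful without positive gain (illustrative choice).
        return adg, float("nan")
    daily_feed = total_feed_kg / n_days
    return adg, daily_feed / adg


# Hypothetical calf: 300 kg at start, 342 kg after 30 days, 252 kg of feed eaten.
adg, fcr = adg_and_fcr(300.0, 342.0, 252.0, 30)
```

How records with zero or negative gain were actually handled during data editing is not stated here; returning NaN for FCR is an assumption of this sketch.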

Table 1 Descriptive statistics

Pedigree for the crossbred animals was traced back three generations, and included 30,643 animals with 846 BBL, 25,709 HOL and 4088 BH. Among these animals, 43 BBL and 882 HOL were genotyped with the EuroG 10K Bead chip, and 39 BBL, 1590 HOL, and 1780 BH were genotyped with the Eurogenomics 75K custom SNP chip. Among the parents of the BH, 52 BBL and 319 HOL were genotyped. For all genotyped animals, the procedure of filling-in missing genotypes and imputation from the EuroG 10K Bead chip to the Eurogenomics 75K custom SNP chip was done with the Beagle 5.2 software [22]. Quality control of the genomic data was done using the Plink software [23] as follows: first, we checked that no individuals had a call-rate lower than 90%; then, SNPs with a call-rate lower than 90%, SNPs with a minor allele frequency lower than 0.01, and SNPs that deviated strongly from the Hardy–Weinberg equilibrium within breed ($p < 10^{-7}$) were removed. Finally, 4329 animals (81 BBL, 2468 HOL, and 1780 BH) and 48,777 SNPs were retained after quality control and imputation. The retained genotype data were phased with the Beagle 5.2 software [22].

Statistical models

A bivariate animal model was used to estimate genetic parameters and breeding values for ADG and FCR. To construct the single-step relationship matrices, three methods, standard ssGBLUP, MF-ssGBLUP, and BS-ssGBLUP, were incorporated in the bivariate model, as follows.

Standard ssGBLUP

With the aim of extending the marker-based relationship matrices to the non-genotyped animals, Legarra et al. [9] and Christensen and Lund [10] developed ssGBLUP.

The statistical bivariate model for ssGBLUP is:

$$\mathbf{y}=\mathbf{X}\mathbf{b}+\mathbf{Z}\mathbf{u}+\mathbf{e}, \tag{1}$$

where ${\mathbf{y}}$ is the vector of phenotypic records for ADG and FCR in crossbred calves; $\mathbf{b}$ is the vector of fixed effects including the effects of sex, pen (during the experiment), herd-year-season (year and season of the testing period), and covariate of the weight at the start of the test for ADG and FCR; $\mathbf{u}$ is the vector of random additive genetic effects for ADG and FCR; ${\mathbf{e}}$ is the vector of the random residual error for ADG and FCR; ${\mathbf{X}}$ and ${\mathbf{Z}}$ are the corresponding incidence matrices.

It is assumed that the random effects follow normal distributions, i.e. $\mathbf{u}\sim \mathrm{N}(\boldsymbol{0}, {\sum }_{\mathbf{u}}\otimes \mathbf{H})$ and $\mathbf{e}\sim \mathrm{N}(\boldsymbol{0}, {\sum }_{\mathbf{e}}\otimes \mathbf{I})$, where $\mathbf{H}$ is the combined pedigree-based and marker-based relationship matrix presented below; $\mathbf{I}$ is the identity matrix of corresponding dimension; ${\sum }_{\mathbf{u}}$ is the genetic (co)variance matrix, ${\sum }_{\mathbf{e}}$ is the residual (co)variance matrix, and $\otimes$ denotes the Kronecker product. The (co)variance matrices are as follows:

$${\sum }_{\mathbf{u}}=\left[\begin{array}{cc}{\upsigma }_{{\mathrm{u}}_{\mathrm{ADG}}}^{2}& {\upsigma }_{{\mathrm{u}}_{\mathrm{ADG}}{\mathrm{u}}_{\mathrm{FCR}}}\\ \mathrm{sym}& {\upsigma }_{{\mathrm{u}}_{\mathrm{FCR}}}^{2}\end{array}\right],$$

$$\mathrm{and }\,{\sum }_{\mathbf{e}}=\left[\begin{array}{cc}{\upsigma }_{{\mathrm{e}}_{\mathrm{ADG}}}^{2}& {\upsigma }_{{\mathrm{e}}_{\mathrm{ADG}}{\mathrm{e}}_{\mathrm{FCR}}}\\ \mathrm{sym}& {\upsigma }_{{\mathrm{e}}_{\mathrm{FCR}}}^{2}\end{array}\right],$$

where ${\upsigma }_{{\mathrm{u}}_{\mathrm{ADG}}}^{2}$ is the additive genetic variance of ADG, ${\upsigma }_{{\mathrm{u}}_{\mathrm{FCR}}}^{2}$ is the additive genetic variance of FCR, ${\upsigma }_{{\mathrm{u}}_{\mathrm{ADG}}{\mathrm{u}}_{\mathrm{FCR}}}$ is the additive genetic covariance between ADG and FCR; ${\upsigma }_{{\mathrm{e}}_{\mathrm{ADG}}}^{2}$ is the residual variance of ADG, ${\upsigma }_{{\mathrm{e}}_{\mathrm{FCR}}}^{2}$ is the residual variance of FCR, and ${\upsigma }_{{\mathrm{e}}_{\mathrm{ADG}}{\mathrm{e}}_{\mathrm{FCR}}}$ is the residual covariance between ADG and FCR.

The combined pedigree-based and marker-based relationship matrix $\mathbf{H}$ is defined as [9, 10]:

$$\mathbf{H}=\left[\begin{array}{cc}{\mathbf{A}}_{\boldsymbol{11}}-{\mathbf{A}}_{\boldsymbol{12}}{\mathbf{A}}_{\boldsymbol{22}}^{-\boldsymbol{1}}{\mathbf{A}}_{\boldsymbol{21}}+{\mathbf{A}}_{\boldsymbol{12}}{\mathbf{A}}_{\boldsymbol{22}}^{-1}\mathbf{G}{\mathbf{A}}_{\boldsymbol{22}}^{-1}{\mathbf{A}}_{\boldsymbol{21}}& {\mathbf{A}}_{\boldsymbol{12}}{\mathbf{A}}_{\boldsymbol{22}}^{-\boldsymbol{1}}\mathbf{G}\\ \mathbf{G}{\mathbf{A}}_{\boldsymbol{22}}^{-1}{\mathbf{A}}_{\boldsymbol{21}}& (1-\upomega )\mathbf{G}+\upomega {\mathbf{A}}_{\boldsymbol{22}}\end{array}\right],$$

where $\mathbf{A}$ is the pedigree relationship matrix, $\mathbf{G}$ is the genomic realized relationship matrix; subscripts 1 and 2 stand for non-genotyped and genotyped animals, respectively; $\upomega$ is interpreted as the relative weight on the polygenic effect, which is set as 0.05 in this study as commonly done [24, 25].
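As an illustration of how the blocks of $\mathbf{H}$ fit together, the following NumPy sketch assembles the matrix exactly as written in the definition above. The function name `build_h` and its arguments are hypothetical; in practice software such as DMU works with the inverse of $\mathbf{H}$ rather than forming this dense matrix, so this is a didactic sketch for small examples only.

```python
import numpy as np


def build_h(A, G, genotyped, omega=0.05):
    """Assemble H from a full pedigree matrix A and a genomic matrix G.

    A         : (n, n) pedigree relationship matrix for all animals
    G         : (g, g) genomic matrix, rows ordered as in `genotyped`
    genotyped : indices (into A) of the genotyped animals
    omega     : relative weight on the polygenic effect
    """
    A = np.asarray(A, dtype=float)
    G = np.asarray(G, dtype=float)
    genotyped = np.asarray(genotyped)
    non = np.setdiff1d(np.arange(A.shape[0]), genotyped)

    A11 = A[np.ix_(non, non)]
    A12 = A[np.ix_(non, genotyped)]
    A22 = A[np.ix_(genotyped, genotyped)]
    B = A12 @ np.linalg.inv(A22)          # A12 * A22^{-1}

    H = np.empty_like(A)
    H[np.ix_(non, non)] = A11 - B @ A12.T + B @ G @ B.T
    H[np.ix_(non, genotyped)] = B @ G
    H[np.ix_(genotyped, non)] = G @ B.T
    H[np.ix_(genotyped, genotyped)] = (1.0 - omega) * G + omega * A22
    return H


# Toy pedigree: sire (0), dam (1), offspring (2); dam and offspring genotyped.
A_toy = np.array([[1.0, 0.0, 0.5],
                  [0.0, 1.0, 0.5],
                  [0.5, 0.5, 1.0]])
H_toy = build_h(A_toy, A_toy[np.ix_([1, 2], [1, 2])], [1, 2])
```

A useful sanity check on any such implementation is that setting $\mathbf{G}={\mathbf{A}}_{22}$ must return $\mathbf{H}=\mathbf{A}$, which is what the toy call does.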

The relationship matrix $\mathbf{G}$ was constructed as [26]:

$$\mathbf{G}=\frac{\mathbf{Z}\mathbf{Z}\mathbf{^{\prime}}}{\sum_{\mathrm{i}=1}^{\mathrm{m}}2{\mathrm{p}}_{\mathrm{i}}{\mathrm{q}}_{\mathrm{i}}},$$

where $\mathrm{m}$ is the number of SNPs, ${\mathrm{p}}_{\mathrm{i}}$ is the frequency of allele A at marker $\mathrm{i}$ and ${\mathrm{q}}_{\mathrm{i}}=1-{\mathrm{p}}_{\mathrm{i}}$; $\mathbf{Z}$ is the incidence matrix with elements of $2-2{\mathrm{p}}_{\mathrm{i}}$, $1-2{\mathrm{p}}_{\mathrm{i}}$ and $-2{\mathrm{p}}_{\mathrm{i}}$ for AA, Aa, and aa, respectively. Matrix $\mathbf{G}$ was adjusted to be compatible with matrix $\mathbf{A}$ as described by Christensen et al. [27].
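The construction of $\mathbf{G}$ can be sketched in a few lines of NumPy. The function name `vanraden_g` and the toy genotypes are hypothetical, genotypes are assumed to be coded as the number of copies of allele A (0, 1 or 2), and the adjustment that makes $\mathbf{G}$ compatible with $\mathbf{A}$ is deliberately left out.

```python
import numpy as np


def vanraden_g(M, p=None):
    """Genomic relationship matrix G = ZZ' / sum(2 p_i q_i).

    M : (n_animals, m) genotype matrix coded 0/1/2 (copies of allele A)
    p : optional vector of allele frequencies; if omitted, the observed
        frequencies in M are used (an assumption of this sketch)
    """
    M = np.asarray(M, dtype=float)
    if p is None:
        p = M.mean(axis=0) / 2.0
    p = np.asarray(p, dtype=float)
    Z = M - 2.0 * p                      # elements 2-2p, 1-2p, -2p
    return Z @ Z.T / np.sum(2.0 * p * (1.0 - p))


# Three animals, two SNPs; observed allele frequencies are both 0.5.
M_toy = np.array([[2, 0],
                  [0, 2],
                  [1, 1]])
G_toy = vanraden_g(M_toy)
```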

MF-ssGBLUP

To account for allele frequency in the base population and compatibility between the pedigree and genomic additive relationship matrices, Legarra et al. [17] developed a new method named MF-ssGBLUP, based on developments described in Christensen [27].

The statistical bivariate model for MF-ssGBLUP is:

$$\mathbf{y}=\mathbf{X}\mathbf{b}+\mathbf{Z}\mathbf{u}+\mathbf{e}, \tag{2}$$

where ${\mathbf{y}}$, $\mathbf{b}$, $\mathbf{u}$, $\mathbf{e}$, $\mathbf{X}$ and $\mathbf{Z}$ are as defined above.

The difference between Eq. (1) and Eq. (2) is the definition of the additive genetic relationship matrix. For Eq. (2), it is assumed that the random effects follow normal distributions, i.e. $\mathbf{u}\sim \mathrm{N}(\boldsymbol{0}, {\sum }_{\mathbf{u}}\otimes {\mathbf{H}}_{\mathbf{M}\mathbf{F}})$, where ${\mathbf{H}}_{\mathbf{M}\mathbf{F}}$ is the combined pedigree-based and marker-based metafounder relationship matrix, and ${\sum }_{\mathbf{u}}$ contains the genetic variance and covariance parameters.

The matrix ${\mathbf{H}}_{\mathbf{M}\mathbf{F}}$ is defined as:

$${\mathbf{H}}_{\mathbf{M}\mathbf{F}}=\left[\begin{array}{cc}{\mathbf{A}}_{11}^{{\varvec{\Gamma}}}-{\mathbf{A}}_{12}^{{\varvec{\Gamma}}}{\left({\mathbf{A}}_{22}^{{\varvec{\Gamma}}}\right)}^{-1}{\mathbf{A}}_{21}^{{\varvec{\Gamma}}}+{\mathbf{A}}_{12}^{{\varvec{\Gamma}}}{\left({\mathbf{A}}_{22}^{{\varvec{\Gamma}}}\right)}^{-1}{\mathbf{G}}^{0.5}{\left({\mathbf{A}}_{22}^{{\varvec{\Gamma}}}\right)}^{-1}{\mathbf{A}}_{21}^{{\varvec{\Gamma}}}& {\mathbf{A}}_{12}^{{\varvec{\Gamma}}}{\left({\mathbf{A}}_{22}^{{\varvec{\Gamma}}}\right)}^{-1}{\mathbf{G}}^{0.5}\\ {\mathbf{G}}^{0.5}{\left({\mathbf{A}}_{22}^{{\varvec{\Gamma}}}\right)}^{-1}{\mathbf{A}}_{21}^{{\varvec{\Gamma}}}& (1-\upomega ){\mathbf{G}}^{0.5}+\upomega {\mathbf{A}}_{22}^{{\varvec{\Gamma}}}\end{array}\right],$$

where $\upomega$ is as defined above; ${\mathbf{A}}^{{\varvec{\Gamma}}}$ is the pedigree relationship matrix with metafounders, ${\mathbf{G}}^{0.5}$ is the genomic realized relationship matrix with allele frequencies equal to 0.5; subscripts 1 and 2 stand for non-genotyped and genotyped animals, respectively.

The construction of the relationship matrix ${\mathbf{A}}^{{\varvec{\Gamma}}}$ is based on the estimated metafounder relationship matrix, ${\varvec{\Gamma}}$ [17], which represents the within- and across-population relationship matrix and is expressed as follows:

$${\varvec{\Gamma}}=\left[\begin{array}{cc}{\upgamma }_{\mathrm{B}}& {\upgamma }_{\mathrm{B},\mathrm{H}}\\ \mathrm{sym}& {\upgamma }_{\mathrm{H}}\end{array}\right],$$

where ${\upgamma }_{\mathrm{B}}$ is the metafounder relationship for BBL; ${\upgamma }_{\mathrm{H}}$ is the metafounder relationship for HOL; ${\upgamma }_{\mathrm{B},\mathrm{H}}$ is the across-metafounder relationship between the BBL and HOL populations. The generalized least squares method was used to estimate ${\varvec{\Gamma}}$ as described by Garcia-Baccino et al. [28].

The relationship matrix ${\mathbf{G}}^{0.5}$ is constructed as:

$${\mathbf{G}}^{0.5}=\frac{\mathbf{Z}\mathbf{Z}\mathbf{^{\prime}}}{\mathrm{s}},$$

where $\mathbf{Z}$ is the incidence matrix with elements of 1, 0 and −1 for AA, Aa, and aa, respectively; $\mathrm{s}=\mathrm{m}/2$.

The genetic variance and covariance parameters from MF-ssGBLUP were estimated under the assumption that founders are related, while in other models usually unrelated founders are assumed for the genetic variance. To be comparable with estimates from other models that estimate genetic variance for unrelated founders, such as standard ssGBLUP and BS-ssGBLUP in our study, we multiplied the estimates of the genetic parameters estimated with MF-ssGBLUP by $1+\frac{\overline{\mathrm{diag }({\varvec{\Gamma}})}}{2}-\overline{{\varvec{\Gamma}} }$, following the suggestion of Legarra et al. [17].
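The construction of ${\mathbf{G}}^{0.5}$ and the rescaling factor can be illustrated with a short NumPy sketch. The function names `g_half` and `mf_scaling_factor` are hypothetical; the entries of ${\varvec{\Gamma}}$ used in the example are the estimates reported in the Results section of this article (0.702, 0.570 and 0.672).

```python
import numpy as np


def g_half(M):
    """G^0.5 = ZZ' / (m / 2), with Z coded 1, 0, -1 for AA, Aa, aa.

    M : (n_animals, m) genotype matrix coded 2/1/0 (copies of allele A)
    """
    Z = np.asarray(M, dtype=float) - 1.0
    return Z @ Z.T / (Z.shape[1] / 2.0)


def mf_scaling_factor(Gamma):
    """Factor 1 + mean(diag(Gamma)) / 2 - mean(Gamma).

    Multiplying a variance estimated with related founders by this factor
    gives the variance on the scale of unrelated founders.
    """
    Gamma = np.asarray(Gamma, dtype=float)
    return 1.0 + np.mean(np.diag(Gamma)) / 2.0 - np.mean(Gamma)


# Gamma with BBL first and HOL second, using the estimates from the Results.
Gamma_est = np.array([[0.702, 0.570],
                      [0.570, 0.672]])
k = mf_scaling_factor(Gamma_est)   # 1 + 0.3435 - 0.6285 = 0.715
```

With these estimates the factor is 0.715, so in this example the MF-ssGBLUP variance components would be reduced by roughly 29% to put them on the scale of the other two methods.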

BS-ssGBLUP

BS-ssGBLUP assumes that the substitution effects of breed-specific alleles differ between breeds. This method was developed by Christensen et al. [13] based on previous studies [14, 29].

The statistical bivariate model for BS-ssGBLUP is:

$$\mathbf{y}=\mathbf{X}\mathbf{b}+{\mathbf{Z}}_{\mathbf{B}}{\mathbf{u}}_{\mathbf{B}}+{\mathbf{Z}}_{\mathbf{H}}{\mathbf{u}}_{\mathbf{H}}+\mathbf{e}, \tag{3}$$

where ${\mathbf{y}}$, $\mathbf{b}$, $\mathbf{e}$, and $\mathbf{X}$ are as defined above; ${\mathbf{u}}_{\mathbf{B}}$ is the vector of random additive genetic effects from BBL for ADG and FCR, ${\mathbf{u}}_{\mathbf{H}}$ is the vector of random additive genetic effects from HOL for ADG and FCR; ${\mathbf{Z}}_{\mathbf{B}}$ and ${\mathbf{Z}}_{\mathbf{H}}$ are the corresponding incidence matrices.

It is assumed that the random effects follow normal distributions, i.e. ${\mathbf{u}}_{\mathbf{B}}\sim \mathrm{N}(\boldsymbol{0}, {\sum }_{{\mathbf{u}}_{\mathbf{B}}}\otimes {\mathbf{H}}_{\mathbf{B}})$ and ${\mathbf{u}}_{\mathbf{H}}\sim \mathrm{N}(\boldsymbol{0}, {\sum }_{{\mathbf{u}}_{\mathbf{H}}}\otimes {\mathbf{H}}_{\mathbf{H}})$, where ${\mathbf{H}}_{\mathbf{B}}$ and ${\mathbf{H}}_{\mathbf{H}}$ are combined pedigree-based and marker-based breed specific partial relationship matrices for BBL and HOL; ${\sum }_{{\mathbf{u}}_{\mathbf{B}}}$ is the genetic (co)variance matrix for BBL, ${\sum }_{{\mathbf{u}}_{\mathbf{H}}}$ is the genetic (co)variance matrix for HOL. The (co)variance matrices are as follows:

$${\sum }_{{\mathbf{u}}_{\mathbf{B}}}=\left[\begin{array}{cc}{\upsigma }_{{{\mathrm{u}}_{\mathrm{B}}}_{\mathrm{ADG}}}^{2}& {\upsigma }_{{{\mathrm{u}}_{\mathrm{B}}}_{\mathrm{ADG}}{{\mathrm{u}}_{\mathrm{B}}}_{\mathrm{FCR}}}\\ \mathrm{sym}& {\upsigma }_{{{\mathrm{u}}_{\mathrm{B}}}_{\mathrm{FCR}}}^{2}\end{array}\right] ,$$

$$\mathrm{and }\,{\sum }_{{\mathbf{u}}_{\mathbf{H}}}=\left[\begin{array}{cc}{\upsigma }_{{{\mathrm{u}}_{\mathrm{H}}}_{\mathrm{ADG}}}^{2}& {\upsigma }_{{{\mathrm{u}}_{\mathrm{H}}}_{\mathrm{ADG}}{{\mathrm{u}}_{\mathrm{H}}}_{\mathrm{FCR}}}\\ \mathrm{sym}& {\upsigma }_{{{\mathrm{u}}_{\mathrm{H}}}_{\mathrm{FCR}}}^{2}\end{array}\right],$$

where ${\upsigma }_{{{\mathrm{u}}_{\mathrm{B}}}_{\mathrm{ADG}}}^{2}$ is the additive genetic variance of ADG from BBL, ${\upsigma }_{{{\mathrm{u}}_{\mathrm{B}}}_{\mathrm{FCR}}}^{2}$ is the additive genetic variance of FCR from BBL, ${\upsigma }_{{{\mathrm{u}}_{\mathrm{B}}}_{\mathrm{ADG}}{{\mathrm{u}}_{\mathrm{B}}}_{\mathrm{FCR}}}$ is the additive genetic covariance between ADG and FCR from BBL; ${\upsigma }_{{{\mathrm{u}}_{\mathrm{H}}}_{\mathrm{ADG}}}^{2}$ is the additive genetic variance of ADG from HOL, ${\upsigma }_{{{\mathrm{u}}_{\mathrm{H}}}_{\mathrm{FCR}}}^{2}$ is the additive genetic variance of FCR from HOL, and ${\upsigma }_{{{\mathrm{u}}_{\mathrm{H}}}_{\mathrm{ADG}}{{\mathrm{u}}_{\mathrm{H}}}_{\mathrm{FCR}}}$ is the additive genetic covariance between ADG and FCR from HOL.

The breed-specific matrix ${\mathbf{H}}_{\mathbf{B}}$ is defined as:

$${\mathbf{H}}_{\mathbf{B}}=\left[\begin{array}{cc}{\mathbf{A}}_{11}^{(\mathbf{B})}-{\mathbf{A}}_{12}^{(\mathbf{B})}{\left({\mathbf{A}}_{22}^{(\mathbf{B})}\right)}^{-1}{\mathbf{A}}_{21}^{(\mathbf{B})}+{\mathbf{A}}_{12}^{(\mathbf{B})}{\left({\mathbf{A}}_{22}^{(\mathbf{B})}\right)}^{-1}{\mathbf{G}}^{(\mathbf{B})}{\left({\mathbf{A}}_{22}^{(\mathbf{B})}\right)}^{-1}{\mathbf{A}}_{21}^{(\mathbf{B})}& {\mathbf{A}}_{12}^{(\mathbf{B})}{\left({\mathbf{A}}_{22}^{(\mathbf{B})}\right)}^{-1}{\mathbf{G}}^{(\mathbf{B})}\\ {\mathbf{G}}^{(\mathbf{B})}{\left({\mathbf{A}}_{22}^{(\mathbf{B})}\right)}^{-1}{\mathbf{A}}_{21}^{(\mathbf{B})}& (1-\upomega ){\mathbf{G}}^{(\mathbf{B})}+\upomega {\mathbf{A}}_{22}^{(\mathbf{B})}\end{array}\right],$$

where $\upomega$ is as defined above; ${\mathbf{A}}^{(\mathbf{B})}$ is the breed-specific pedigree relationship matrix from BBL, ${\mathbf{G}}^{(\mathbf{B})}$ is the breed-specific genomic realized relationship matrix from BBL; subscripts 1 and 2 stand for non-genotyped and genotyped animals, respectively.

The breed-specific pedigree relationship matrix ${\mathbf{A}}^{(\mathbf{B})}$ was previously described by García-Cortés and Toro [14]. Matrix ${\mathbf{G}}^{(\mathbf{B})}$ is split into submatrices with indices denoting genotyped BBL and crossbred animals as follows:

$${\mathbf{G}}^{(\mathbf{B})}=\left[\begin{array}{cc}{\mathbf{G}}_{\mathbf{B},\mathbf{B}}^{(\mathbf{B})}& {\mathbf{G}}_{\mathbf{B},\mathbf{B}\mathbf{H}}^{(\mathbf{B})}\\ \mathbf{s}\mathbf{y}\mathbf{m}& {\mathbf{G}}_{\mathbf{B}\mathbf{H},\mathbf{B}\mathbf{H}}^{(\mathbf{B})}\end{array}\right],$$

with these submatrices being defined as:

$${\mathbf{G}}_{\mathbf{B},\mathbf{B}}^{(\mathbf{B})}=\frac{({\mathbf{M}}_{\mathbf{B}}-2{\mathbf{p}}_{\mathbf{B}}\boldsymbol{1}){({\mathbf{M}}_{\mathbf{B}}-\boldsymbol{2}{\mathbf{p}}_{\mathbf{B}}\boldsymbol{1})}^{^{\prime}}}{2{\mathbf{p}}_{\mathbf{B}}^{^{\prime}}(\boldsymbol{1}-{\mathbf{p}}_{\mathbf{B}})},$$

$${\mathbf{G}}_{\mathbf{B},\mathbf{B}\mathbf{H}}^{(\mathbf{B})}=\frac{({\mathbf{M}}_{\mathbf{B}}-\boldsymbol{2}{\mathbf{p}}_{\mathbf{B}}\boldsymbol{1}){({\mathbf{Q}}_{\mathbf{B}}-{\mathbf{p}}_{\mathbf{B}}\boldsymbol{1})}^{^{\prime}} }{2{\mathbf{p}}_{\mathbf{B}}^{^{\prime}}(\boldsymbol{1}-{\mathbf{p}}_{\mathbf{B}})},$$

$$\mathrm{and }\,{\mathbf{G}}_{\mathbf{B}\mathbf{H},\mathbf{B}\mathbf{H}}^{(\mathbf{B})}=\frac{({\mathbf{Q}}_{\mathbf{B}}-{\mathbf{p}}_{\mathbf{B}}\boldsymbol{1}){({\mathbf{Q}}_{\mathbf{B}}-{\mathbf{p}}_{\mathbf{B}}\boldsymbol{1})}^{^{\prime}}}{2{\mathbf{p}}_{\mathbf{B}}^{^{\prime}}(\boldsymbol{1}-{\mathbf{p}}_{\mathbf{B}})},$$

where ${\mathbf{M}}_{\mathbf{B}}$ and ${\mathbf{Q}}_{\mathbf{B}}$ contain the BBL-specific allele contents of the reference allele for BBL animals (coded as 0, 1, or 2) and for BH animals (coded as 0 or 1), respectively, for which tracing of the breed of origin of alleles is required; $\boldsymbol{1}$ is a vector of 1s; and ${\mathbf{p}}_{\mathbf{B}}$ is the vector of BBL-specific allele frequencies. Finally, matrix ${\mathbf{G}}^{(\mathbf{B})}$ is adjusted to be compatible with matrix ${\mathbf{A}}^{(\mathbf{B})}$, as described by Christensen et al. [13]. The breed-specific matrix ${\mathbf{H}}_{\mathbf{H}}$ is defined analogously to ${\mathbf{H}}_{\mathbf{B}}$.
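The three submatrices of ${\mathbf{G}}^{(\mathbf{B})}$ share the same centred genotype matrices and the same denominator, which the following NumPy sketch makes explicit. The function name `breed_specific_g` and the toy inputs are hypothetical, animals are assumed to be stored in rows and SNPs in columns, and the final adjustment for compatibility with ${\mathbf{A}}^{(\mathbf{B})}$ is not included. This is not the cBar2 implementation.

```python
import numpy as np


def breed_specific_g(M_B, Q_B, p_B):
    """Breed-specific genomic matrix for purebred and crossbred animals.

    M_B : (n_pure, m) allele contents of purebred animals, coded 0/1/2
    Q_B : (n_cross, m) breed-of-origin allele contents of crossbreds, coded 0/1
    p_B : (m,) breed-specific allele frequencies
    Rows and columns of the result are ordered purebreds first, then crossbreds.
    """
    M_B = np.asarray(M_B, dtype=float)
    Q_B = np.asarray(Q_B, dtype=float)
    p_B = np.asarray(p_B, dtype=float)

    Zp = M_B - 2.0 * p_B                 # purebreds carry two breed alleles
    Zc = Q_B - p_B                       # crossbreds carry one breed allele
    denom = 2.0 * np.sum(p_B * (1.0 - p_B))

    top = np.hstack([Zp @ Zp.T, Zp @ Zc.T])
    bottom = np.hstack([Zc @ Zp.T, Zc @ Zc.T])
    return np.vstack([top, bottom]) / denom


# One purebred and one crossbred animal, two SNPs with frequency 0.5.
G_B_toy = breed_specific_g([[2, 0]], [[1, 0]], [0.5, 0.5])
```

In the toy example the crossbred animal inherited exactly one haplotype of the purebred, so its diagonal element is one quarter of the purebred diagonal and the off-diagonal is one half.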

Tracing the breed of origin of alleles in F1 crosses is expected to be very accurate [30], and was conducted on the phased genotypes, separately, for each chromosome per individual. Among the 1780 genotyped crossbred animals, 1447 crossbred animals had 47 genotyped sires, whereas for the 333 remaining crossbred animals, none of the parents were genotyped. When the sire (or dam) was genotyped, four comparisons between crossbred and purebred phased alleles were made. For each comparison, when a crossbred allele differed from the corresponding purebred allele, it was counted as a difference. The chromosome with the smallest number of differences was assigned to the breed of the parent. When neither of the parents was genotyped, for each non-overlapping sliding window of 50 consecutive SNPs, comparisons between the two crossbred segments of phased alleles and segments of phased alleles in the reference panel were made for each breed. For each of the two crossbred segments, the number of copies was counted for each breed, and the segment was considered to originate from the breed with the largest number of copies. Finally, each crossbred chromosome was assigned to the breed from which the majority of its segments originated. This procedure is the same as in Xiang et al. [5].
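For the case with a genotyped parent, the four haplotype comparisons can be sketched as follows. The function name `assign_parental_haplotype` and the toy haplotypes are hypothetical, alleles are assumed to be coded 0/1, and a whole chromosome is compared at once, so recombination in the parent is ignored. This is a simplified illustration, not the procedure as implemented in cBar2.

```python
import numpy as np


def assign_parental_haplotype(offspring_haps, parent_haps):
    """Pick the offspring haplotype that best matches a genotyped parent.

    offspring_haps : (2, n_snps) phased alleles of the crossbred animal
    parent_haps    : (2, n_snps) phased alleles of the genotyped parent
    Returns (index of the offspring haplotype assigned to the parent's breed,
             2 x 2 matrix of mismatch counts).
    """
    off = np.asarray(offspring_haps)
    par = np.asarray(parent_haps)
    # diffs[i, j] = number of SNPs where offspring hap i differs from parent hap j
    diffs = (off[:, None, :] != par[None, :, :]).sum(axis=2)
    # Smallest mismatch count wins; ties resolve to the first minimum found.
    i, _ = np.unravel_index(np.argmin(diffs), diffs.shape)
    return int(i), diffs


off_toy = [[0, 1, 1, 0],
           [1, 1, 0, 0]]
sire_toy = [[1, 1, 0, 1],
            [0, 0, 0, 0]]
idx, diffs = assign_parental_haplotype(off_toy, sire_toy)
```

In an F1 cross, once one haplotype is assigned to the breed of the genotyped parent, the other haplotype necessarily originates from the other breed.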

For tracing of alleles and the construction of the breed-specific matrices ${\mathbf{H}}_{\mathbf{B}}$ and ${\mathbf{H}}_{\mathbf{H}}$ in BS-ssGBLUP, we developed an R package named cBar2, which is available on GitHub (https://github.com/TXiang-lab/cBar2).

Estimation of genetic parameters in the above bivariate models with the three methods was carried out using the restricted maximum likelihood (REML) algorithm in the software DMU [31] via the wrapper of R package blupADC [32].

The heritability and genetic correlation estimates and their standard errors in ssGBLUP and MF-ssGBLUP were calculated as described by Falconer [33] and Mrode [34]. For BS-ssGBLUP, the heritability estimates for ADG and FCR were calculated as ${\mathrm{h}}_{\mathrm{ADG}}^{2}=\frac{{\upsigma }_{{{\mathrm{u}}_{\mathrm{BH}}}_{\mathrm{ADG}}}^{2}}{{\upsigma }_{{{\mathrm{u}}_{\mathrm{BH}}}_{\mathrm{ADG}}}^{2}+{\upsigma }_{{\mathrm{e}}_{\mathrm{ADG}}}^{2}}$ and ${\mathrm{h}}_{\mathrm{FCR}}^{2}=\frac{{\upsigma }_{{{\mathrm{u}}_{\mathrm{BH}}}_{\mathrm{FCR}}}^{2}}{{\upsigma }_{{{\mathrm{u}}_{\mathrm{BH}}}_{\mathrm{FCR}}}^{2}+{\upsigma }_{{\mathrm{e}}_{\mathrm{FCR}}}^{2}}$, where ${\upsigma }_{{{\mathrm{u}}_{\mathrm{BH}}}_{\mathrm{ADG}}}^{2}$ and ${\upsigma }_{{{\mathrm{u}}_{\mathrm{BH}}}_{\mathrm{FCR}}}^{2}$ are defined as $0.5\left({\upsigma }_{{{\mathrm{u}}_{\mathrm{B}}}_{\mathrm{ADG}}}^{2}+{\upsigma }_{{{\mathrm{u}}_{\mathrm{H}}}_{\mathrm{ADG}}}^{2}\right)$ and $0.5\left({\upsigma }_{{{\mathrm{u}}_{\mathrm{B}}}_{\mathrm{FCR}}}^{2}+{\upsigma }_{{{\mathrm{u}}_{\mathrm{H}}}_{\mathrm{FCR}}}^{2}\right)$, respectively. The standard errors of the heritabilities, ${\upsigma }_{\left({\mathrm{h}}_{\mathrm{ADG}}^{2}\right)}$ and ${\upsigma }_{\left({\mathrm{h}}_{\mathrm{FCR}}^{2}\right)}$, were obtained by the delta method (function deltamethod in the R package msm [35]). The genetic correlation between ADG and FCR was calculated as $\mathrm{r}=\frac{0.5({\upsigma }_{{{\mathrm{u}}_{\mathrm{B}}}_{\mathrm{ADG}}{{\mathrm{u}}_{\mathrm{B}}}_{\mathrm{FCR}}}+{\upsigma }_{{{\mathrm{u}}_{\mathrm{H}}}_{\mathrm{ADG}}{{\mathrm{u}}_{\mathrm{H}}}_{\mathrm{FCR}}})}{\sqrt{{\upsigma }_{{{\mathrm{u}}_{\mathrm{BH}}}_{\mathrm{ADG}}}^{2}\cdot {\upsigma }_{{{\mathrm{u}}_{\mathrm{BH}}}_{\mathrm{FCR}}}^{2}}}$, and its standard error was also obtained by the delta method.
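The point estimates for BS-ssGBLUP follow directly from the breed-specific (co)variance matrices, as the following Python sketch shows. The function name `bs_genetic_parameters` and all numerical values are hypothetical and chosen only to make the arithmetic easy to follow; standard errors via the delta method are not covered.

```python
import math


def bs_genetic_parameters(sigma_uB, sigma_uH, sigma_e):
    """Crossbred heritabilities and genetic correlation from BS-ssGBLUP output.

    Each argument is a 2 x 2 (co)variance matrix with traits ordered (ADG, FCR).
    The crossbred genetic (co)variance is half the sum of the BBL and HOL parts.
    """
    var_adg = 0.5 * (sigma_uB[0][0] + sigma_uH[0][0])
    var_fcr = 0.5 * (sigma_uB[1][1] + sigma_uH[1][1])
    cov = 0.5 * (sigma_uB[0][1] + sigma_uH[0][1])
    return {
        "h2_ADG": var_adg / (var_adg + sigma_e[0][0]),
        "h2_FCR": var_fcr / (var_fcr + sigma_e[1][1]),
        "r_g": cov / math.sqrt(var_adg * var_fcr),
    }


# Hypothetical variance components, not estimates from this study.
params = bs_genetic_parameters(
    sigma_uB=[[0.02, 0.01], [0.01, 0.08]],
    sigma_uH=[[0.04, 0.03], [0.03, 0.16]],
    sigma_e=[[0.07, 0.00], [0.00, 0.28]],
)
```

With these inputs the crossbred genetic variances are 0.03 and 0.12, giving heritabilities of 0.3 for both traits and a genetic correlation of 0.02 / 0.06 ≈ 0.33.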

Model-based reliability

For ssGBLUP, the model-based reliability was calculated as follows:

$${\mathrm{Rel}}_{\mathrm{i}}=\boldsymbol{1}-\frac{{\mathbf{P}\mathbf{E}\mathbf{V}}_{\mathrm{i},\mathrm{i}}}{{\mathbf{H}}_{\mathbf{i},\mathbf{i}}{\upsigma }_{\mathrm{u}}^{2}},$$

where $\mathbf{H}$ is as defined previously; ${\mathrm{Rel}}_{\mathrm{i}}$ is the reliability of the individual $\mathrm{i}$, ${\upsigma }_{\mathrm{u}}^{2}$ is the additive genetic variance estimated with ssGBLUP, $\mathbf{P}\mathbf{E}\mathbf{V}$ is the prediction error (co)variance matrix, which can be obtained by inverting the coefficient matrix of Henderson’s mixed model equations corresponding to the model used [34].
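Given the diagonal of the prediction error variance matrix, the ssGBLUP reliability is an element-wise calculation. The sketch below assumes a single trait; the function name `model_based_reliability` and the example numbers are hypothetical.

```python
import numpy as np


def model_based_reliability(pev_diag, h_diag, sigma2_u):
    """Rel_i = 1 - PEV_ii / (H_ii * sigma2_u) for each animal i.

    pev_diag : prediction error variances of the breeding values
    h_diag   : diagonal elements of H (1 + inbreeding on the H scale)
    sigma2_u : additive genetic variance of the trait
    """
    pev_diag = np.asarray(pev_diag, dtype=float)
    h_diag = np.asarray(h_diag, dtype=float)
    return 1.0 - pev_diag / (h_diag * sigma2_u)


# Two hypothetical sires with sigma2_u = 1.0.
rel = model_based_reliability([0.5, 0.2], [1.0, 1.0], 1.0)
```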

Both MF-ssGBLUP and BS-ssGBLUP can model the genetic difference between breeds, and the individual model-based reliability can be calculated within each breed [13, 36]. For MF-ssGBLUP, the model-based reliability was calculated as described in Bermann et al. [36].

For individual $\mathrm{i}$, the reliability within each metafounder is calculated as:

$${\mathrm{Rel}}_{\mathrm{i}}^{\mathrm{mf}}=\boldsymbol{1}-\frac{\mathbf{P}\mathbf{E}\mathbf{V}\left({\mathbf{u}}_{\mathrm{i}}\right)+\mathbf{P}\mathbf{E}\mathbf{V}\left({\mathbf{u}}_{\mathrm{mf}}\right)-2\mathbf{P}\mathbf{E}\mathbf{V}\left({\mathbf{u}}_{\mathrm{i}}, {\mathbf{u}}_{\mathbf{m}\mathbf{f}}\right)}{\left({\mathbf{H}}_{\mathbf{i}\mathbf{i}}^{\left({\varvec{\Gamma}}\right)}+{\mathbf{H}}_{\mathbf{m}\mathbf{f},\mathbf{m}\mathbf{f}}^{\left({\varvec{\Gamma}}\right)}-2{\mathbf{H}}_{\mathbf{i},\mathbf{m}\mathbf{f}}^{\left({\varvec{\Gamma}}\right)}\right){\upsigma }_{\mathrm{u}\left(\Gamma \right)}^{2}},$$

where $\mathbf{u}$, $\mathbf{P}\mathbf{E}\mathbf{V}$ and ${\mathbf{H}}^{{\varvec{\Gamma}}}$ are as defined previously; ${\mathrm{Rel}}_{\mathrm{i}}^{\mathrm{mf}}$ is the reliability of individual $\mathrm{i}$ within metafounder $\mathrm{mf}$; and ${\upsigma }_{\mathrm{u}(\Gamma )}^{2}$ is the additive genetic variance estimated with MF-ssGBLUP.

For BS-ssGBLUP, the model-based reliability is calculated as follows:

$${\mathrm{Rel}}_{\mathrm{i}}=\boldsymbol{1}-\frac{{\mathbf{P}\mathbf{E}\mathbf{V}}_{\mathrm{i},\mathrm{i}}}{{\mathbf{H}}_{{\mathbf{A}}_{\mathbf{i},\mathbf{i}}}{\upsigma }_{\mathrm{u}}^{2}},$$

where ${\mathbf{H}}_{\mathbf{A}}$ and $\mathbf{P}\mathbf{E}\mathbf{V}$ are as defined previously; ${\mathrm{Rel}}_{\mathrm{i}}$ is the reliability of the individual $\mathrm{i}$, and ${\upsigma }_{\mathrm{u}}^{2}$ is the additive genetic variance estimated with BS-ssGBLUP.

In this study, we investigated the model-based reliabilities of BBL sires having offspring with phenotypes.

Estimators of the LR method

In this study, four estimators, i.e. bias ($\widehat{\Delta }$), dispersion ($\widehat{\mathrm{b}}$), population accuracy ($\widehat{\mathrm{acc}}$) and ratio of accuracies ($\widehat{\uprho }$) were estimated with the LR method [37] and were used to evaluate the impact of each of the three methods (ssGBLUP, MF-ssGBLUP, and BS-ssGBLUP) on the estimated breeding values (EBV) for crossbred performance, since the LR method has proven to show better analytical properties than the ordinary cross-validation method [37,38,39]. EBV of focal individuals were denoted as ${\widehat{\mathbf{u}}}_{\mathbf{p}}$ and ${\widehat{\mathbf{u}}}_{\mathbf{w}}$ based on the partial and the whole dataset, respectively. The partial dataset was defined as the set of crossbred animals which were born before a specified cut-off date (we used two cut-off dates in this study, April 1 2021 and May 1 2021), and focal individuals were those born after the specific cut-off date. The number of individuals in the partial datasets for each trait are in Table 1. For BS-ssGBLUP, EBV are equal to the sum of ${\mathbf{u}}_{\mathbf{H}}$ and ${\mathbf{u}}_{\mathbf{B}}$. The estimators are summarized below.

Bias

The bias estimator $\widehat{\Delta }$ is defined as the difference between the mean of EBV based on the partial dataset and the mean of EBV based on the whole dataset, i.e. $\widehat{\Delta } =\overline{{\widehat{\mathbf{u}} }_{\mathbf{p}}}-\overline{{\widehat{\mathbf{u}} }_{\mathbf{w}}}$.

In the absence of bias, the expected value of this estimator is 0.

Dispersion

The dispersion estimator is defined as the slope of the regression of ${\widehat{\mathbf{u}}}_{\mathbf{w}}$ on ${\widehat{\mathbf{u}}}_{\mathbf{p}}$, which is equal to $\widehat{\mathrm{b}}=\frac{\mathrm{Cov}({\widehat{\mathbf{u}}}_{\mathbf{w}},{\widehat{\mathbf{u}}}_{\mathbf{p}})}{\mathrm{Var}({\widehat{\mathbf{u}}}_{\mathbf{p}})}$. The expected value of this estimator is 1 under the assumption that ${\widehat{\mathbf{u}}}_{\mathbf{p}}$ has no dispersion bias, while $\widehat{\mathrm{b}}<1$ indicates over-dispersion, and $\widehat{\mathrm{b}}>1$ indicates under-dispersion of ${\widehat{\mathbf{u}}}_{\mathbf{p}}$.

Population accuracy

The population accuracy of focal individuals based on the partial dataset can be calculated as $\widehat{\mathrm{acc}}=\sqrt{\frac{\mathrm{Cov}({\widehat{\mathbf{u}}}_{\mathbf{w}},{\widehat{\mathbf{u}}}_{\mathbf{p}})}{(1+\overline{\mathrm{F} }-2\overline{\mathrm{f} } ){\upsigma }_{\mathrm{u},\infty }^{2}}}$, where $\overline{\mathrm{F} }$ is the average inbreeding coefficient of focal individuals, $2\overline{\mathrm{f} }$ is the average relationship between focal individuals, and ${\upsigma }_{\mathrm{u},\infty }^{2}$ is the genetic variance estimated with the partial dataset (assuming that the focal individuals are not under selection in the partial dataset).

Ratio of population accuracies

The ratio of population accuracies estimator is defined as the Pearson correlation between ${\widehat{\mathbf{u}}}_{\mathbf{w}}$ and ${\widehat{\mathbf{u}}}_{\mathbf{p}}$, which is equal to $\widehat{\uprho }=\frac{\mathrm{Cov}({\widehat{\mathbf{u}}}_{\mathbf{w}},{\widehat{\mathbf{u}}}_{\mathbf{p}})}{\sqrt{\mathrm{Var}({\widehat{\mathbf{u}}}_{\mathbf{w}})\mathrm{Var}({\widehat{\mathbf{u}}}_{\mathbf{p}})}}$. This is an estimator for $\frac{{\mathrm{acc}}_{\mathrm{p}}}{{\mathrm{acc}}_{\mathrm{w}}}$, where ${\mathrm{acc}}_{\mathrm{p}}$ is the population accuracy based on the partial dataset, and ${\mathrm{acc}}_{\mathrm{w}}$ is the population accuracy based on the whole dataset.
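The four estimators defined above can be computed together once the two EBV vectors are available. The following is a minimal NumPy sketch, not the implementation used in this study: the function name `lr_estimators`, its argument names and the toy EBV are assumptions made for illustration, and sample (co)variances with `ddof=1` are assumed.

```python
import numpy as np


def lr_estimators(u_p, u_w, mean_F, mean_rel, sigma2_u):
    """Compute the four LR statistics for one set of focal individuals.

    u_p, u_w : EBV of the same focal individuals from the partial and
               the whole dataset, in the same order
    mean_F   : average inbreeding coefficient of focal individuals (F-bar)
    mean_rel : average relationship between focal individuals (2 f-bar)
    sigma2_u : genetic variance estimated with the partial dataset
    """
    u_p = np.asarray(u_p, dtype=float)
    u_w = np.asarray(u_w, dtype=float)
    if u_p.shape != u_w.shape:
        raise ValueError("u_p and u_w must list the same focal individuals")

    # Sample covariance and variances (ddof=1 is an assumption).
    cov_wp = np.cov(u_w, u_p, ddof=1)[0, 1]
    var_p = np.var(u_p, ddof=1)
    var_w = np.var(u_w, ddof=1)

    # Term under the square root of the population accuracy formula.
    ratio = cov_wp / ((1.0 + mean_F - mean_rel) * sigma2_u)

    return {
        "bias": u_p.mean() - u_w.mean(),           # expected value 0
        "dispersion": cov_wp / var_p,              # expected value 1
        "accuracy": np.sqrt(ratio) if ratio >= 0 else float("nan"),
        "rho": cov_wp / np.sqrt(var_w * var_p),    # Pearson correlation
    }


# Toy EBV for four focal animals (invented numbers, for illustration only).
stats = lr_estimators(
    u_p=[0.0, 1.0, 2.0, 3.0],
    u_w=[0.5, 1.0, 3.0, 3.5],
    mean_F=0.0,
    mean_rel=0.0,
    sigma2_u=2.0,
)
```

The choice of `ddof` cancels in the dispersion and in the ratio of accuracies, but it does affect the population accuracy. A negative covariance makes the population accuracy undefined, which the sketch reports as NaN.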

Results

Genetic parameters

Estimated variance components and heritabilities for ADG and FCR and the estimated genetic correlations between ADG and FCR are in Table 2. The genetic variances and covariance obtained with MF-ssGBLUP were scaled for comparison with those of the other two methods. For MF-ssGBLUP, the metafounder relationship coefficients ${\upgamma }_{\mathrm{B}}$, ${\upgamma }_{\mathrm{BH}}$ and ${\upgamma }_{\mathrm{H}}$ were estimated to be 0.702, 0.570, and 0.672, respectively.

Table 2 Estimates of variance components and their standard error (SE) obtained with three methods


The estimated variance components for ssGBLUP and MF-ssGBLUP were similar. The estimates of the heritabilities for ssGBLUP and MF-ssGBLUP were also similar for both ADG (0.082 and 0.076) and FCR (0.079 and 0.080). However, for BS-ssGBLUP, ADG had a heritability estimate of 0.140, which differed from the estimate obtained with the other two methods. The genetic correlation between ADG and FCR was negative and moderate to high for all methods, i.e. − 0.531(0.239), − 0.515(0.251), and − 0.620(0.197) for ssGBLUP, MF-ssGBLUP and BS-ssGBLUP, respectively.

Model-based reliability

Table 3 shows the mean model-based reliabilities of purebred sires for their crossbred performance for ssGBLUP, MF-ssGBLUP and BS-ssGBLUP. Model-based reliabilities were computed for sires having offspring with phenotypes, and are presented as an average of all sires, an average of genotyped sires, and an average of non-genotyped sires. Among these 67 sires, only 47 have been genotyped. On average, genotyped sires had higher reliabilities than non-genotyped sires, regardless of which method was used. For ADG, MF-ssGBLUP had the highest model-based reliability (0.323), and BS-ssGBLUP had the lowest model-based reliability (0.221), while for FCR, BS-ssGBLUP and MF-ssGBLUP had the highest model-based reliability (0.348), and ssGBLUP had the lowest model-based reliability (0.261). For both traits, MF-ssGBLUP always had a higher model-based reliability than ssGBLUP.

Table 3 Mean model-based reliability of purebred bulls for their crossbred performance


Predictive ability

Four estimators ($\widehat{\Delta }$, $\widehat{\mathrm{b}}$, $\widehat{\mathrm{acc}}$ and $\widehat{\uprho }$) in the LR method were used to evaluate the predictive ability of ssGBLUP, MF-ssGBLUP, and BS-ssGBLUP for two focal sets of individuals. The results are in Table 4 for the cut-off date of April 1, 2021 and in Additional file 2: Table S1 for the cut-off date of May 1, 2021. The results differed slightly between datasets, but the conclusions were similar. Therefore, in the remainder of the paper, we focus only on the results in Table 4.

Table 4 Bias ($\widehat{\Delta }$), dispersion ($\widehat{\mathbf{b}}$), population accuracy ($\widehat{\mathbf{a}\mathbf{c}\mathbf{c}}$) and ratio of population accuracies ($\widehat{{\varvec{\uprho}}}$) of EBV for focal individuals (cut-off date, April 1, 2021) obtained with three methods


As shown in Table 4, the differences in $\widehat{\Delta }$ across the three methods were small. For all methods, the values of $\widehat{\Delta }$ were close to the expected value (equal to 0) for both traits, while the values of $\widehat{\mathrm{b}}$ were close to the expected value (equal to 1) for ADG and deviated from the expected value for FCR. For ADG, population accuracy ($\widehat{\mathrm{acc}}$) was highest (0.273) with MF-ssGBLUP, and lowest (0.239) with BS-ssGBLUP, while for FCR, it was highest (0.257) with BS-ssGBLUP, and lowest (0.210) with ssGBLUP. For both traits, population accuracy was higher with MF-ssGBLUP than with ssGBLUP. The ratios of population accuracies based on the partial and whole datasets were, for ssGBLUP, MF-ssGBLUP and BS-ssGBLUP, respectively, 0.714, 0.729, and 0.520 for ADG, and 0.691, 0.699, and 0.737 for FCR.

Discussion

In this work, first we compared the estimates of genetic parameters for ADG and FCR obtained with ssGBLUP, MF-ssGBLUP and BS-ssGBLUP. In general, variance components and heritability estimates for FCR did not differ considerably between methods, while for ADG, those estimated with BS-ssGBLUP deviated largely from those estimated with ssGBLUP and MF-ssGBLUP. Then, we evaluated the impact of these methods on the predictive ability for crossbred performance. For both traits, the estimators ($\widehat{\Delta }$, $\widehat{\mathrm{b}}$, and $\widehat{\mathrm{acc}}$) in the LR method showed that the predictive ability of MF-ssGBLUP was always superior to that of ssGBLUP, whereas the comparison of the predictive ability of BS-ssGBLUP with the other two methods showed no consistent result.

Genetic parameters

Variance components, heritabilities and genetic correlations obtained with ssGBLUP and with MF-ssGBLUP were similar for both ADG and FCR. This observation is in line with previous studies [18, 40]. However, for BS-ssGBLUP, the estimated genetic parameters for FCR were similar to those with the other two methods, whereas those for ADG were not. As shown in Table 2, the additive genetic variance for FCR in the sire breed and dam breed was 0.227 and 0.106, respectively, while for ADG, it was 0.004 and 0.023, respectively. Our results are not consistent with those reported by Poulsen et al. [19] on simulated data, who found that the estimated variance components from the three methods were similar, with those from MF-ssGBLUP being closer to those from BS-ssGBLUP than those from ssGBLUP. One possible reason for this difference may be the lack of sufficient information in our dataset to distinguish the additive genetic variances between the sire breed and the dam breed in BS-ssGBLUP. To date, few studies have examined whether there are differences in the variance components, heritabilities and genetic correlations between these three methods, and further investigation is needed.

In our study, ADG and FCR had low heritabilities, with estimates ranging from 0.081 to 0.153 for ADG, and from 0.080 to 0.084 for FCR. A few studies have reported similarly low values [41,42,43], but in general, ADG and FCR are considered moderately to highly heritable traits [44, 45]. Our results could be due to the short testing period used: ADG and FCR are normally recorded over longer test periods (3–6 months) [44, 45] than the one-month test period in our study. Furthermore, Ahlberg et al. [46] pointed out that the phenotypic correlations for each shortened test duration differed between periods. Although the heritability estimates for ADG and FCR are lower than those reported in previous studies, the moderate to high negative genetic correlation between ADG and FCR is in agreement with other studies [44, 45].

Model-based reliabilities

In terms of model-based reliability with MF-ssGBLUP, the usual definition (expressed as $1-\frac{{\mathrm{PEV}}_{\mathrm{ii}}}{{\mathrm{H}}_{\mathrm{ii}}^{\left(\Gamma \right) }{\upsigma }_{\mathrm{u}\left(\Gamma \right)}^{2}}$) is inappropriate for metafounder relationships, as pointed out by Bermann et al. [36], since it would underestimate reliabilities. To account for this, Bermann et al. [36] proposed a new method where reliabilities are calculated from contrasts to a reference metafounder. By applying this method with MF-ssGBLUP in our study, the reliabilities of purebred sires increased by almost 30%, compared with the usual definition (results not shown). In our study, there were two metafounders, one representing BBL and the other HOL. Each individual would have two reliabilities corresponding to BBL and HOL. For BS-ssGBLUP, there were also two reliabilities based on two breed-specific relationship matrices.
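For reference, the usual definition mentioned above (one minus the prediction error variance divided by the prior variance of the breeding value) can be written as a short Python sketch. This covers only the usual definition, which the text notes is inappropriate with metafounders; it does not implement the contrast-based method of Bermann et al. [36]. The function name, argument names and input values are hypothetical.

```python
def usual_reliability(pev_ii, h_ii, sigma2_u):
    """Usual model-based reliability: 1 - PEV_ii / (H_ii * sigma2_u).

    pev_ii   : prediction error variance of the animal's EBV
    h_ii     : diagonal element of the relationship matrix for the animal
    sigma2_u : additive genetic variance on the same scale as h_ii
    """
    prior_var = h_ii * sigma2_u
    if prior_var <= 0:
        raise ValueError("h_ii * sigma2_u must be positive")
    return 1.0 - pev_ii / prior_var


# Invented inputs, for illustration only.
rel = usual_reliability(pev_ii=0.6, h_ii=1.2, sigma2_u=1.0)
```

With these invented inputs, the prior variance is 1.2 and the prediction error variance is half of it, giving a reliability of 0.5.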

Within each method, for ADG and FCR, the reliabilities for the genotyped sire group were always larger than for the non-genotyped sire group. This result is in line with previous studies [5, 18]. In terms of reliabilities across methods, as expected, MF-ssGBLUP always had higher reliabilities than ssGBLUP. However, for BS-ssGBLUP, the results were not consistent, i.e. for FCR the reliabilities from BS-ssGBLUP were similar to those from MF-ssGBLUP, but for ADG they were the lowest among the three methods. This could be due to the fact that the genetic parameters estimated for ADG with BS-ssGBLUP deviated a lot from the estimated parameters with the other two methods, but also to the small sample size for the sires.

Predictive ability

In this study, four estimators, $\widehat{\Delta }$, $\widehat{\mathrm{b}}$, $\widehat{\mathrm{acc}}$ and $\widehat{\uprho }$, in the LR method [37] were used to evaluate the predictive ability of ssGBLUP, BS-ssGBLUP and MF-ssGBLUP. Table 4 shows that for $\widehat{\Delta }$ and $\widehat{\mathrm{b}}$, there is little difference between these three methods.

The difference between $\widehat{\mathrm{b}}$ and its expected value showed that the EBV of FCR were over-dispersed, and that their deviation from the expected value was larger than for ADG. Mäntysaari et al. [47] have suggested that over-dispersion of EBV may be due to strong selection. In terms of $\widehat{\uprho }$, Legarra and Reverter [37] pointed out that it is an estimator of the change in population accuracy, but not a measure of population accuracy. Its reciprocal minus 1 can be interpreted as the relative increase in population accuracy from partial to whole information. For example, a value of 0.699 for the $\widehat{\uprho }$ of FCR with MF-ssGBLUP means that the corresponding increase in population accuracy from the partial to the whole dataset is 43.1%.
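The arithmetic behind the 43.1% figure is the reciprocal of $\widehat{\uprho }$ minus 1. A minimal Python sketch reproduces it; the function name `relative_accuracy_increase` is hypothetical and the input is the FCR value for MF-ssGBLUP quoted above.

```python
def relative_accuracy_increase(rho_hat):
    """Relative increase in population accuracy from partial to whole data.

    rho_hat estimates acc_p / acc_w, so acc_w / acc_p - 1 = 1 / rho_hat - 1.
    """
    if rho_hat <= 0:
        raise ValueError("rho_hat must be positive")
    return 1.0 / rho_hat - 1.0


# Ratio of population accuracies for FCR with MF-ssGBLUP (Table 4).
fcr_mf = relative_accuracy_increase(0.699)
print(f"{fcr_mf:.1%}")  # prints 43.1%
```

A smaller ratio therefore corresponds to a larger gain from adding the focal individuals' information.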

As expected, MF-ssGBLUP always had a slightly higher population accuracy than ssGBLUP. In a multiple-breed beef cattle population, Junqueira et al. [20] and Kluska et al. [21] found that, compared to ssGBLUP, MF-ssGBLUP decreased bias in genomic evaluations. The same result has also been found for crossbred pigs [18]. However, with BS-ssGBLUP, opposite $\widehat{\mathrm{acc}}$ values were obtained for ADG and FCR, which is similar to the model-based reliabilities that also showed opposite results for ADG and FCR with BS-ssGBLUP. As already mentioned, one possible reason is that the estimated genetic variance with BS-ssGBLUP for ADG deviated markedly from the estimates obtained with the other two methods. For FCR, for which the estimated genetic variance components were similar across the three methods, both BS-ssGBLUP and MF-ssGBLUP had a better predictive ability than ssGBLUP, which is in line with a previous study [19]. In addition, we found that for FCR, BS-ssGBLUP had a better predictive ability than MF-ssGBLUP, which was not consistent with the results of Poulsen et al. [19] who reported similar predictive abilities for BS-ssGBLUP and MF-ssGBLUP. A possible reason for the conflicting results observed in our study may be that the metafounder relationship matrix ${\varvec{\Gamma}}$ could be accurately estimated in the simulated dataset in Poulsen et al. [19], whereas in our case the estimates of ${\varvec{\Gamma}}$ may be inaccurate, and could be biased because of the small number of genotyped animals, as is the case for BBL. Inaccurate estimates of ${\varvec{\Gamma}}$ may affect the performance of MF-ssGBLUP. Moreover, missing genotypes were imputed based on a combination of different SNP panels (EuroG 10k Bead chip and Eurogenomics 75K custom SNP chip), which could make the estimation of ${\varvec{\Gamma}}$ even less accurate.
We have also investigated the predictive ability of pedigree BLUP and metafounder pedigree BLUP methods (see Additional file 2: Table S1) and found that these two methods had a higher estimated population accuracy than ssGBLUP, MF-ssGBLUP and BS-ssGBLUP, but also that the estimated genetic variances were much smaller. These are puzzling results, which show that it is necessary to better understand how the estimation of the population accuracy in the LR method performs with imprecisely estimated parameters.

In terms of allele tracing, errors in detecting the breed of origin of alleles can affect a model’s predictive ability, especially for a distantly related crossbred population [15, 30, 48]. In our study, few such errors were expected since all the alleles on one chromosome should originate from the same breed (either the sire breed or the dam breed) [30]. We also tested the accuracy of allele tracing in a simulated two-way crossbred population, and it was equal to 100% (results not shown). However, in more complicated situations (three-way, four-way, and rotational crossbred populations), our method is not suitable, and a more advanced method for tracing the breed origin of alleles is needed [30, 49].

Overall, MF-ssGBLUP and BS-ssGBLUP had a better predictive ability than ssGBLUP, when the estimated variance components were consistent across the methods. However, more research with larger datasets is needed for investigating the differences between these methods.

Conclusions

Our results reveal that, for FCR, there is little difference in the estimated genetic parameters of a bivariate model among the ssGBLUP, MF-ssGBLUP, and BS-ssGBLUP methods. However, for ADG, the estimated genetic parameters obtained with BS-ssGBLUP showed a large deviation compared to those with ssGBLUP and MF-ssGBLUP. The values of four estimators implemented in the LR method showed that, for the genetic evaluation for crossbred performance in a two-way crossbred cattle production system, MF-ssGBLUP and BS-ssGBLUP had a better predictive ability than ssGBLUP, when the estimated variance components were consistent across the three methods. In general, compared with BS-ssGBLUP, MF-ssGBLUP is more robust in its superiority over ssGBLUP.

Availability of data and materials

The phenotypic data are owned by partners of the FutureBeefCross project. The pedigree and genomic data are the property of Nordic Cattle Genetic Evaluation Ltd (NAV, Aarhus, Denmark) and Viking Genetics (Randers, Denmark). None of these data are available for public distribution.

References

1. Berry D. Invited review: beef-on-dairy—the generation of crossbred beef × dairy cattle. J Dairy Sci. 2021;104:3789–819.
2. de Vries M, van Middelaar CE, de Boer IJM. Comparing environmental impacts of beef production systems: a review of life cycle assessments. Livest Sci. 2015;178:279–88.
3. Stock J, Bennewitz J, Hinrichs D, Wellmann R. A review of genomic models for the analysis of livestock crossbred data. Front Genet. 2020;11:568.
4. Bedere N, Berghof TV, Peeters K, Pinard-van der Laan M-H, Visscher J, David I, et al. Using egg production longitudinal recording to study the genetic background of resilience in purebred and crossbred laying hens. Genet Sel Evol. 2022;54:26.
5. Xiang T, Nielsen B, Su G, Legarra A, Christensen OF. Application of single-step genomic evaluation for crossbred performance in pig. J Anim Sci. 2016;94:936–48.
6. Dekkers JC. Marker-assisted selection for commercial crossbred performance. J Anim Sci. 2007;85:2104–14.
7. Wientjes YCJ, Calus MPL. Board invited review: the purebred-crossbred correlation in pigs: a review of theory, estimates, and implications. J Anim Sci. 2017;95:3467–78.
8. Calus M, Bos J, Duenk P, Wientjes Y, editors. The purebred-crossbred correlation in broilers and layers: a review. In: Proceedings of the 71st Annual Meeting of the European Federation of Animal Science: 1–4 December 2020; virtual meeting. 2020.
9. Legarra A, Aguilar I, Misztal I. A relationship matrix including full pedigree and genomic information. J Dairy Sci. 2009;92:4656–63.
10. Christensen OF, Lund MS. Genomic prediction when some animals are not genotyped. Genet Sel Evol. 2010;42:2.
11. Legarra A, Christensen OF, Aguilar I, Misztal I. Single step, a general approach for genomic selection. Livest Sci. 2014;166:54–65.
12. Lourenco DAL, Tsuruta S, Fragomeni BO, Chen CY, Herring WO, Misztal I. Crossbreed evaluations in single-step genomic best linear unbiased predictor using adjusted realized relationship matrices. J Anim Sci. 2016;94:909–19.
13. Christensen OF, Madsen P, Nielsen B, Su G. Genomic evaluation of both purebred and crossbred performances. Genet Sel Evol. 2014;46:23.
14. García-Cortés LA, Toro MÁ. Multibreed analysis by splitting the breeding values. Genet Sel Evol. 2006;38:601–15.
15. Ibáñez-Escriche N, Fernando RL, Toosi A, Dekkers JC. Genomic selection of purebreds for crossbred performance. Genet Sel Evol. 2009;41:12.
16. Sevillano CA, Vandenplas J, Bastiaansen JWM, Bergsma R, Calus MPL. Genomic evaluation for a three-way crossbreeding system considering breed-of-origin of alleles. Genet Sel Evol. 2017;49:75.
17. Legarra A, Christensen OF, Vitezica ZG, Aguilar I, Misztal I. Ancestral relationships using metafounders: finite ancestral populations and across population relationships. Genetics. 2015;200:455–68.
18. Xiang T, Christensen OF, Legarra A. Genomic evaluation for crossbred performance in a single-step approach with metafounders. J Anim Sci. 2017;95:1472–80.
19. Poulsen BG, Ostersen T, Nielsen B, Christensen OF. Predictive performances of animal models using different multibreed relationship matrices in systems with rotational crossbreeding. Genet Sel Evol. 2022;54:25.
20. Junqueira VS, Lopes PS, Lourenco D, Silva FFE, Cardoso FF. Applying the metafounders approach for genomic evaluation in a multibreed beef cattle population. Front Genet. 2020;11:556399.
21. Kluska S, Masuda Y, Ferraz JBS, Tsuruta S, Eler JP, Baldi F, et al. Metafounders may reduce bias in composite cattle genomic predictions. Front Genet. 2021;12:678587.
22. Browning SR, Browning BL. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet. 2007;81:1084–97.
23. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81:559–75.
24. van Grevenhof EM, Vandenplas J, Calus MP. Genomic prediction for crossbred performance using metafounders. J Anim Sci. 2019;97:548–58.
25. Macedo FL, Christensen OF, Astruc J-M, Aguilar I, Masuda Y, Legarra A. Bias and accuracy of dairy sheep evaluations using BLUP and SSGBLUP with metafounders and unknown parent groups. Genet Sel Evol. 2020;52:47.
26. VanRaden PM. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91:4414–23.
27. Christensen OF. Compatibility of pedigree-based and marker-based relationship matrices for single-step genetic evaluation. Genet Sel Evol. 2012;44:37.
28. Garcia-Baccino CA, Legarra A, Christensen OF, Misztal I, Pocrnic I, Vitezica ZG, et al. Metafounders are related to Fst fixation indices and reduce bias in single-step genomic evaluations. Genet Sel Evol. 2017;49:34.
29. Wei M, van der Werf JH. Maximizing genetic response in crossbreds using both purebred and crossbred information. Anim Prod. 1994;59:401–13.
30. Eiríksson JH, Karaman E, Su G, Christensen OF. Breed of origin of alleles and genomic predictions for crossbred dairy cows. Genet Sel Evol. 2021;53:84.
31. Madsen P, Jensen J. A user’s guide to DMU. A package for analysing multivariate mixed models. Version 6, release 5.2. University of Aarhus: Center for Quantitative Genetics and Genomics. 2013.
32. Mei Q, Fu C, Li J, Zhao S, Xiang T. blupADC: An R package and shiny toolkit for comprehensive genetic data analysis in animal and plant breeding. bioRxiv. 2021. https://doi.org/10.1101/2021.09.09.459557.
33. Falconer D. Introduction to quantitative genetics. Harlow: Pearson Education Limited; 1996.
34. Mrode RA. Linear models for the prediction of animal breeding values. Wallingford: CABI Publishing; 2014.
35. Jackson C. Multi-state models for panel data: the msm package for R. J Stat Softw. 2011;38:1–28.
36. Bermann M, Aguilar I, Lourenco D, Misztal I, Legarra A. Reliabilities of estimated breeding values in models with metafounders. Genet Sel Evol. 2023;55:6.
37. Legarra A, Reverter A. Semi-parametric estimates of population accuracy and bias of predictions of breeding values and future phenotypes using the LR method. Genet Sel Evol. 2019;50:53.
38. Macedo FL, Reverter A, Legarra A. Behavior of the linear regression method to estimate bias and accuracies with correct and incorrect genetic evaluation models. J Dairy Sci. 2020;103:529–44.
39. Bermann M, Legarra A, Hollifield MK, Masuda Y, Lourenco D, Misztal I. Validation of single-step GBLUP genomic predictions from threshold models using the linear regression method: an application in chicken mortality. J Anim Breed Genet. 2021;138:4–13.
40. Fu C, Ostersen T, Christensen OF, Xiang T. Single-step genomic evaluation with metafounders for feed conversion ratio and average daily gain in Danish Landrace and Yorkshire pigs. Genet Sel Evol. 2021;53:79.
41. Inoue K, Kobayashi M, Shoji N, Kato K. Genetic parameters for fatty acid composition and feed efficiency traits in Japanese Black cattle. Animal. 2011;5:987–94.
42. Rolf MM, Taylor JF, Schnabel RD, McKay SD, McClure MC, Northcutt SL, et al. Genome-wide association analysis for feed efficiency in Angus cattle. Anim Genet. 2012;43:367–74.
43. Martin P, Taussat S, Vinet A, Krauss D, Maupetit D, Renand G. Genetic parameters and genome-wide association study regarding feed efficiency and slaughter traits in Charolais cows. J Anim Sci. 2019;97:3684–98.
44. Polizel GHG, Grigoletto L, Carvalho ME, Junior PR, Ferraz JBS, de Almeida Santana MH. Genetic correlations and heritability estimates for dry matter intake, weight gain and feed efficiency of Nellore cattle in feedlot. Livest Sci. 2018;214:209–10.
45. Torres-Vázquez JA, van der Werf JH, Clark SA. Genetic and phenotypic associations of feed efficiency with growth and carcass traits in Australian Angus cattle. J Anim Sci. 2018;96:4521–31.
46. Ahlberg CM, Allwardt K, Broocks A, Bruno K, McPhillips L, Taylor A, et al. Test duration for water intake, ADG, and DMI in beef cattle. J Anim Sci. 2018;96:3043–54.
47. Mäntysaari E, Koivula M, Strandén I. Symposium review: single-step genomic evaluations in dairy cattle. J Dairy Sci. 2020;103:5314–26.
48. Lopes MS, Bovenhuis H, Hidalgo AM, Van Arendonk JA, Knol EF, Bastiaansen JW. Genomic selection for crossbred performance accounting for breed-specific effects. Genet Sel Evol. 2017;49:51.
49. Vandenplas J, Calus MP, Sevillano CA, Windig JJ, Bastiaansen JW. Assigning breed origin to alleles in crossbred animals. Genet Sel Evol. 2016;48:61.

Acknowledgements

Project partners in the FutureBeefCross project are acknowledged for conducting the data collection. Anders Fogh and Line Hjortø (SEGES Innovation Cattle) are acknowledged for providing explanations about the experiment and doing quality control of the data. Jette Odgaard Villemoes is acknowledged for English editing of the manuscript.

Funding

SZ acknowledges funding from the National Key R&D Program of China (2019YFE0115400); OFC and HL acknowledge funding from the FutureBeefCross project supported by the Green Development and Demonstration Programme (GUDP) from the Danish Ministry of Food, Agriculture and Fisheries (J. nr. 34009-18-1434); QM acknowledges funding from the China Scholarship Council (CSC) Scholarship.

Author information

Authors and Affiliations

Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education & Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, China
Quanshun Mei, Shuhong Zhao & Tao Xiang
Center for Quantitative Genetics and Genomics, Aarhus University, C. F. Møllers Allé 3, 8000, Aarhus C, Denmark
Quanshun Mei & Ole F Christensen
SEGES Cattle, Agrofood Park 15, 8200, Aarhus N, Denmark
Huiming Liu

Contributions

QM performed data analysis and wrote the manuscript. OFC, TX, SZ and HL supervised and assisted at all stages of the study, including the writing of the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Tao Xiang or Ole F Christensen.

Ethics declarations

Ethics approval and consent to participate

Data recording and sample collection were conducted following Danish laws of management and welfare procedures for animal production.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Figure S1.

Details of phenotype editing.

Additional file 2: Table S1.

Bias ($\widehat{\Delta }$), dispersion ($\widehat{\mathrm{b}}$), population accuracy ($\widehat{\mathrm{acc}}$) and ratio of population accuracies ($\widehat{\uprho }$) of EBV, average inbreeding coefficient (F) and average relationship (2f) for focal individuals (cut-off date, May 1, 2021), and the estimated additive genetic variance in the partial dataset (${\upsigma }_{\mathrm{u},\infty }^{2}$) and the whole dataset (${\upsigma }_{\mathrm{u}}^{2}$) with the different methods.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Mei, Q., Liu, H., Zhao, S. et al. Genomic evaluation for two-way crossbred performance in cattle. Genet Sel Evol 55, 17 (2023). https://doi.org/10.1186/s12711-023-00792-4

Received: 30 June 2022
Accepted: 08 March 2023
Published: 17 March 2023
DOI: https://doi.org/10.1186/s12711-023-00792-4
