 Research Article
 Open Access
 Published:
Bias and accuracy of dairy sheep evaluations using BLUP and SSGBLUP with metafounders and unknown parent groups
Genetics Selection Evolution volume 52, Article number: 47 (2020)
Abstract
Background
Bias has been reported in genetic or genomic evaluations of several species. Common biases are systematic differences between averages of estimated and true breeding values, and their over or underdispersion. In addition, comparing accuracies of pedigree versus genomic predictions is a difficult task. This work proposes to analyse biases and accuracies in the genetic evaluation of milk yield in Manech Tête Rousse dairy sheep, over several years, by testing five models and using the estimators of the linear regression method. We tested models with and without genomic information [best linear unbiased prediction (BLUP) and singlestep genomic BLUP (SSGBLUP)] and using three strategies to handle missing pedigree [unknown parent groups (UPG), UPG with QP transformation in the \({\mathbf{H}}\) matrix (EUPG) and metafounders (MF)].
Methods
We compared estimated breeding values (EBV) of selected rams at birth with the EBV of the same rams obtained each year from the first daughters with phenotypes up to 2017. We compared within and across models. Finally, we compared EBV at birth of the rams with and without genomic information.
Results
Within models, bias and overdispersion were small (bias: 0.20 to 0.40 genetic standard deviations; slope of the dispersion: 0.95 to 0.99) except for model SSGBLUPEUPG that presented an important overdispersion (0.87). The estimates of accuracies confirm that the addition of genomic information increases the accuracy of EBV in young rams. The smallest bias was observed with BLUPMF and SSGBLUPMF. When we estimated dispersion by comparing a model with no markers to models with markers, SSGBLUPMF showed a value close to 1, indicating that there was no problem in dispersion, whereas SSGBLUPEUPG and SSGBLUPUPG showed a significant underdispersion. Another important observation was the heterogeneous behaviour of the estimates over time, which suggests that a single check could be insufficient to make a good analysis of genetic/genomic evaluations.
Conclusions
The addition of genomic information increases the accuracy of EBV of young rams in Manech Tête Rousse. In this population that has missing pedigrees, the use of UPG and EUPG in SSGBLUP produced bias, whereas MF yielded unbiased estimates, and we recommend its use. We also recommend assessing biases and accuracies using multiple truncation points, since these statistics are subject to random variation across years.
Background
Genetic progress in selection schemes depends on using correct models for genetic evaluation. Models are simplifications of reality and never completely perfect, which is why tools to analyze systematic errors are necessary. There are three important aspects to check in genetic evaluations: bias, dispersion and accuracy. Bias \(\left( {b_{0} = \bar{\hat{u}}  \bar{u}} \right)\) is the difference between estimated breeding values (EBV) \(\hat{u}\) and true breeding values (TBV) \(u\) and could lead to over or underestimation of genetic trend and to poor selection decisions (for example, selecting too many young individuals instead of keeping old ones). In the same way, on the one hand, values of the slope of the regression of TBV on EBV less than 1 imply overdispersion of the EBV and could lead to an overestimation of the genetic merit of preselected candidates. On the other hand, an unbiased estimate of accuracy (the correlation between TBV and EBV) is important to correctly predict the response to selection.
Bias has been found in genetic evaluations of several species. The use of genomic information in dairy cattle selection is widespread and the existence of bias has been extensively studied (e.g. [1,2,3,4]). Bias has also been studied in other species, such as pigs [5], dairy goats [6], turkeys [7] and beef cattle [8,9,10]. In general, biases decrease with more adequate models. However, all these studies rely on the use of precorrected data such as deregressed proofs or daughter yield deviations (DYD), which may give wrong estimates of biases if fixed effects are not well estimated [11].
Studies in France and Spain using DYD detected bias in genetic evaluations of dairy sheep breeds. For example, predictions in Lacaune showed bias and overdispersion of EBV, with more impact for traits under strong selection [12, 13]. Similar results were obtained for milk yield of Pyrenean dairy sheep breeds [14], although genomic evaluations decreased bias compared to pedigree evaluations. Manech Tête Rousse (MTR) is one of the major French Pyrenean dairy sheep breeds. For this breed, the selection scheme switched to genomic selection in 2018 and it is important to verify the bias, dispersion and accuracies, to avoid poor selection decisions. In particular, the bias detected in [14] is not well understood. However, it is difficult to assess such biases with DYD in dairy sheep, since DYD from “first crops” of 20 to 40 daughters are not very accurate.
Legarra and Reverter [11] described the linear regression method (LR method) to detect bias in genetic evaluations. The advantage of this method is the simplicity of the application; it compares EBV of a group of individuals obtained in different evaluations, with less (“partial”) and more (“whole”) information. Comparing the two subsets of EBV, estimators of bias, dispersion and accuracies (relatives or directs) are easily computed. Therefore, it is easy to analyze a genetic evaluation comparing the results of two consecutive evaluations.
To perform genetic evaluation, it should be possible to include genomic information and also to model missing pedigrees if needed. In this work, we tested models using only pedigree information (best linear unbiased prediction (BLUP) model) or including genomic information (in a singlestep genomic BLUP (SSGBLUP) model) and applying different strategies to deal with missing pedigree. Missing pedigree may be a problem in most species—in ruminants, parents may be unrecorded, whereas in monogastric species, new lines may be introduced. If we do not consider this missingness, we are assuming the same genetic mean for all missing parents in the pedigree. In dairy sheep, females born from natural mating usually do not have an assigned sire. However, these natural mating rams are offspring of highly selected artificial insemination (AI) rams and thus their breeding value increases over time. In addition, new flocks that entered the breeding scheme until (roughly) 1990 did not have pedigree data. Two strategies can be used to model the missing pedigree: unknown parent groups (UPG) [15, 16] and metafounders (MF) [17]. There is some evidence that the use of MF improves the performance of genetic evaluation [18], but it has not been systematically studied.
The aim of this work was to analyze bias, dispersion, and accuracies in the genetic evaluation of milk yield of MTR using the LR method with several evaluation models and performed over many truncation points of data. A second aim was to compare different strategies (UPG or MF) to manage missing pedigree in BLUP and SSGBLUP contexts. In this manner, we assessed the genetic evaluation of MTR, addressed the best method to consider missing pedigrees in SSGBLUP, and explored the possibilities of the LR method to discriminate models for prediction.
Methods
Records and pedigree
Milk production is recorded by the breeding scheme according to the International Committee for Animal Recording rules. The data that we analyzed were collected between 1978 and 2017 and comprise 1,842,295 performance records and 540,999 individuals in the pedigree, with a generation interval of about 4 years. There are missing parentships, either “sire unknown and dam known” (~ 15% of all animals) or “both sire and dam unknown” (~ 15% of all animals). This situation is particularly important in our case, because if we ignore the missing pedigree, the unknown parents of the more recently improved animals will be assigned to the base population at the beginning of the selection program. As a result, these animals will be unfairly penalized and it will not be possible to correctly model the genetic progress. Thus, we defined 13 UPG (or MF; see later). We computed a crude “number of equivalent records” from the first “offspring” of UPG (disregarding later generations). For instance, an individual with \(n\) records contributes \(n\) to its ancestor UPG if both parents are unknown and \(n/2\) if one parent is known. In all cases, the number of equivalent records was larger than 10,000.
Genomic information
We included genomic information on 3007 AI males (years of birth from 1999 until 2017), all of which have both parents known and are genotyped with the 50 k Illumina chip OvineSNP50. Only autosomal SNPs were considered. Quality control included individual and marker call rate, minor allele frequency (MAF) higher than 0.05, removal of Mendelian conflicts, deviation from Hardy–Weinberg equilibrium (number of heterozygotes deviating more than 15% from the expectation based on allele frequencies), and heritability of gene content (markers with an estimated heritability < 0.98 and significant pvalues of the likelihood ratio test, p < 0.01, were discarded) [19]. After quality control, 37,168 effective SNPs were retained.
Focal individuals
It is possible to apply the LR method to any group of individuals of a population, provided that they represent a homogenous tier (i.e. they are similarly selected, and prediction at the time of selection is based on the same sources of information). In this work, we were interested in evaluating bias, dispersion and accuracy of males at the time of their selection, i.e. at birth before they have progeny with records. The reason we are interested in this group is that most of the genetic gain in dairy sheep is obtained via males. In total, 10 groups of focal individuals were analyzed; each group corresponding to selected rams born from 2005 to 2014. These males were selected based on parent average to be progenytested and thus their genetic variation is smaller than that of their contemporaries [20].
Estimators of the LR method
In brief, the LR method estimates bias, dispersion and accuracies, based on the comparison of two subsets of EBV, estimated with less and more information, for the same group of individuals. In this paper, we will use the symbols \(\hat{u}_{p}\) or EBV_{p} to refer to the EVB estimated with less information (or “partial” dataset) and \(\hat{u}_{w}\) or EBV_{w} to refer to the EBV estimated with more information (or “whole” dataset). The LR method presents one estimator for the bias (\(\hat{\Delta }_{p}\)), one estimator for the dispersion (\(\hat{b}_{p}\)) and four estimators related to the accuracies (\(\rho_{wp}\), \(acc_{p}^{2}\), \(\rho_{wp}^{2}\), \(\widehat{{\varvec{ }rel}}_{p}\)). The estimators are summarized below; for a deeper overview and properties of the estimators see [11, 21].
Bias (\(\hat{\Delta }_{p}\))
The estimator of the bias is obtained from the difference between the mean of EBV_{p} and the mean of EBV_{w}, \(\hat{\Delta }_{p} = \overline{{\hat{u}_{p} }}  \overline{{\hat{u}_{w} }}\). In absence of bias, the expected value of this estimator is 0.
Dispersion (\(\hat{b}_{p}\))
The estimator of dispersion of EBV is the slope of the regression of EBV_{w} on EBV_{p}, \(\hat{b}_{p} = \frac{{cov\left( {\hat{u}_{p} ,\hat{u}_{w} } \right)}}{{var\left( {\hat{u}_{p} } \right)}}\). If over or underdispersion does not exists, the expected value of the estimator is 1, values of \(\hat{b}_{p} < 1\) indicate overdispersion whereas values of \(\hat{b}_{p} > 1\) indicate underdispersion.
Estimators related to accuracies
Ratio of accuracies (\(\hat{\rho }_{w,p}\))
This estimator estimates the inverse of the relative gain in accuracy from EBV_{p} to EBV_{w}. It is the correlation between EBV_{p} and EBV_{w}, \(\hat{\rho }_{w,p} = \frac{{cov\left( {\hat{u}_{p} ,\hat{u}_{w} } \right)}}{{\sqrt {var\left( {\hat{u}_{p} } \right)var\left( {\hat{u}_{w} } \right)} }}\) and the expected value is \(\frac{{acc_{p} }}{{acc_{w} }}\). A high value of this estimator means a small increase in accuracy, whereas a low value means a large increase in accuracy, when we add phenotypic information to genetic evaluations. For instance, a value of 0.7 means that the evaluation with the “partial” dataset is quite similar to the evaluation with the “whole” dataset, i.e. more phenotypes do not add much new information. This can be seen also as the relative increase in accuracy brought by phenotypes is \(\frac{1}{{\hat{\rho }_{w,p} }}  1 = \frac{{acc_{w}  acc_{p} }}{{acc_{p} }}\) (Matias Bermann, University of Georgia, personal communication). Thus, it is expected that genomic evaluations have higher \(\hat{\rho }_{w,p}\) than pedigreebased evaluations.
Ratio of reliabilities (\(\hat{\rho }_{p,w}^{2}\))
This estimator is the slope of the regression of EBV_{p} on EBV_{w}, \(\hat{\rho }_{p,w}^{2} = \frac{{cov\left( {\hat{u}_{p} ,\hat{u}_{w} } \right)}}{{var\left( {\hat{u}_{w} } \right)}}\) and, similar to the ratio of accuracies, it represents the inverse of the gain in reliabilities from EBV_{p} to EBV_{w}. The expected value is \(\frac{{acc_{p}^{2} }}{{acc_{w}^{2} }}\).
Selected reliability of EBV_{p} (\(\widehat{acc}_{p}^{2}\))
In a general formulation, \(\widehat{acc}_{p}^{2} = \frac{{cov\left( {\hat{u}_{p} ,\hat{u}_{w} } \right)}}{{\sigma_{g*}^{2} }}\), where \(\sigma_{g* }^{2}\) is the genetic variance of the group of individuals of interest. We use this more general formulation as in [21] instead of the formulation used in [11], because the latter is adequate only for a group of animals that represent the whole population after selection. In this work, we analyzed EBV of sets of contemporary young rams of the population, in other words highlyselected individuals, which decreases reliability [20, 21]. A difficulty associated to this estimator is the necessity of an estimation of the genetic variance of a group of individuals. We estimated the genetic variance of each group of focal individuals following [22] using the complete dataset. We used Gibbs sampling with the complete dataset with 150,000 iterations and a burnin of 15,000 iterations. At each 150th iteration, we took samples of the EBV of all AI males in the 10 focal groups and we computed, for each of these groups, the variance of these samples. This results in samples from the posterior distributions of the 10 genetic variances, one for each group of AI males.
Unselected reliability of EBVp (\(\widehat{rel}_{p}\))
This estimator estimates the reliability as if there was no selection, \(\widehat{rel}_{p} = 1  \frac{{\sigma_{g*}^{2} }}{{\sigma_{g}^{2} }}\left( {1  \widehat{acc}_{p}^{2} } \right)\) as in [23], where \(\sigma_{g}^{2}\) is the genetic variance of the base population and \(\sigma_{g* }^{2}\) is the genetic variance of the group of individuals of interest (see above). A short derivation of \(\widehat{rel}_{p}\) follows from [21, 24], \(Cov\left( {\hat{u}_{p} ,\hat{u}_{w} } \right) = \sigma_{g*}^{2}  PEV\), so \(PEV = \sigma_{g*}^{2}  Cov\left( {\hat{u}_{p} ,\hat{u}_{w} } \right)\). The reliability that is unaffected by selection is \(r^{2} = 1  \frac{PEV}{{\sigma_{g}^{2} }}\) leading to \(r^{2} = 1  \frac{{\sigma_{g*}^{2} }}{{\sigma_{g}^{2} }}\left( {1  r^{2*} } \right)\) [23], where \(r^{2*}\) is the selected reliability. The reliability \(\widehat{rel}_{p}\) can be interpreted as if the focal individuals were not selected, or, in other words, as the average theoretical reliability of the focal individuals obtained from the mixed model equations (MME).
Data analysis
To apply the LR method, we have to obtain EBV from the partial dataset and the whole dataset. In this work, in order to obtain an empirical distribution of the statistics of the LR method, we performed several comparisons between EBV_{p} and EBV_{w}, taking EBV_{p} from rams born in year \(y_{p}\) (2005 to 2014) and EBV_{w} from years \(y_{p} + 2\) until year 2017 (last year of records for this work). The year of the first set of EBV_{w} was \(y_{p} + 2\) because the first daughters of the selected rams generally start to produce 2 years after birth. For example, if we take the EBV at birth of rams born in 2005 as EBV_{p}, we have EBV_{w} of these rams from years 2007 to 2017, thus we have 11 sets of estimators; and if we take EBV from rams born in year 2014 as EBV_{p}, we only have EBV_{w} from year 2016 to 2017, thus only two sets of estimators. In total, we performed 65 comparisons, e.g. 2005 vs 2007, 2005 vs 2008 ... 2005 vs 2017 … 2014 vs 2016 and 2014 vs 2017.
Bias or accuracies are properties of the partial dataset only, and not of the whole dataset. Sampling several “partial” years allows to describe possible variations due to chance, i.e. properties of BLUP only hold on expectation. In addition, by considering multiple “whole” datasets, we tried to evaluate random deviations of the estimates of biases and accuracies. For instance, a ram may stop getting progeny performances after a few years, yet the estimates of contemporary groups may change. The theory of the LR method (actually, BLUP theory) shows that the estimators of the LR method are correct regardless of whether rams are selected (and having more and more offspring) or not.
We considered several models for the evaluations that are presented below. We applied the LR method within models, with both EBV_{p} and EBV_{w} obtained with the same model. We also applied this method across models: EBV_{p} obtained with one model, for example regular BLUP with MF, and EBV_{w} from another model, for example SSGBLUP with UPG. Finally, because the addition of genomic data to the evaluation can be seen as “more information”, it is possible to see EBV obtained at the same time but without and with genomic information as EBV_{p} and EBV_{w}, respectively. Thus, we also compared the EBV of the rams at birth estimated with the BLUP and SSGBLUP models. For example, the EBV of rams at birth in 2005 were estimated with BLUP as EBV_{p} and estimated with SSGBLUP as EBV_{w}.
Although there is no theoretical support for using the LR method across models [21], our objective was to check the consistency of models with each other, in the sense that a refinement of the model should not introduce unexpected changes in the evaluations. Otherwise, one of the models could possibly be quite wrong. For instance, switching the genetic evaluation of milk yield from lactational measures to testday models should not introduce big changes. Likewise, selection schemes that start adding genomic information to the genetic evaluations must change models without too large changes in the EBV. Viewed in this way, it is important to check the coherence (lack of strong changes) from one model to the other. We focused on the regression coefficient \(\hat{b}_{p}\), with an expected value of 1.
To summarize the 65 comparisons, raw averages of estimators are not correct because some years are more represented that others, e.g. 2005 has 11 comparisons whereas 2014 has two comparisons. Thus, we used the pseudomodel \({\mathbf{es}}_{pw} = {\mathbf{Xy}}_{p} + {\mathbf{Zy}}_{w} + {\varvec{\upvarepsilon}}\), where \({\mathbf{es}}_{pw}\) is a vector of the 65 values of the estimator (\(\hat{\Delta }_{p}\), \(\hat{b}_{p}\), \(\hat{\rho }_{wp}\), \(\widehat{acc}_{wp}^{2}\), \(\hat{\rho }_{pw}^{2}\), \(\widehat{rel}_{p}\)) from the comparison of EBV_{p} of the rams born in year \(p\) and of EBV_{w} of same rams obtained in year \(w\), \({\mathbf{y}}_{p}\) contains values for years \(p\) (2005 to 2014) and \({\mathbf{y}}_{w}\) contains values for years \(w\) (\({\text{y}}_{\text{p}} + 2\) until 2017), and we report an estimable function that yields \(\widehat{{{\mathbf{es}}_{pw} }}\) as if the design was balanced: \(\widehat{{{\mathbf{es}}_{pw} }} = \frac{1}{np}1^{\prime}{\hat{\mathbf{y}}}_{p} + \frac{1}{nw}1^{\prime}{\hat{\mathbf{y}}}_{w}\) where \(np\) and \(nw\) are the number of different years for the “partial” dataset (8) and “whole” dataset (11). The pseudomodel was fit by least squares (lm function in R), and the R package Gmodels version 2.18.1 was used to compute the contrasts. The code is given in “Appendix”.
Models
The genetic evaluations were performed using the regular linear model for genetic evaluation of MTR. This is a univariate model with repeated records for milk yield that accounts for heterogeneity of variances across contemporary groups [25]:
where \({\varvec{\Lambda}}\) is a diagonal matrix of scaling factors for heterogeneity of variances, \({\mathbf{y}}\) is a vector of milk yield records, \({\mathbf{y}}_{c}\) is a vector of the observations corrected for heterogeneity of variances, \({\mathbf{b}}\) is a vector of the fixed effects: contemporary group, age and number of lactation, month of lambing and interval “from lambing to first milk recording”, \({\mathbf{u}}\) is a vector of breeding values, \({\mathbf{p}}\) is a vector of permanent animal effects, \({\mathbf{e}}\) is a vector of residuals, and \({\mathbf{X}}\), \({\mathbf{W}}_{p}\) and \({\mathbf{W}}_{u}\) are incidence matrices for fixed effects, permanent animal effects, and breeding values. Following [25], the \(i{\text{th}}\) diagonal element in \({\varvec{\Lambda}}\) is \(\exp \left( {\frac{{\tau_{i} }}{2}} \right)\); a scaling factor for fixed and random effects. The linear model for \(\tau_{i} = {\mathbf{S}}_{i} {\varvec{\upbeta}}\), where \({\varvec{\upbeta}}\) is the vector of unknown effects for year (fixed) and flockyear (random) and \({\mathbf{S}}_{i}\) is the design vector. Heritability was fixed at 0.30 (the value used in official evaluations; an estimate calculated with the complete dataset was equal to 0.28). In models with UPG, EBV cannot be estimated, and the genetic basis changes with the model used. Therefore, we referred all estimates of EBV to the average EBV of the females born in 2005. Using this animal model, different (sub) models were defined depending on: (1) the use or not of genomic information, and (2) the strategy to model missing pedigree.
We used BLUP models with the matrix of additive genetic relationships \({\mathbf{A}}\) [24] and models that include the genomic information in a single step (SSGBLUP). The SSGBLUP models replaces \({\mathbf{A}}\) with a matrix \({\mathbf{H}}\). that combines pedigree and genomic relationships [26,27,28].
To model the missing pedigree, we used three strategies, unknown parent groups for \({\mathbf{A}}\) (UPG) and for \({\mathbf{H}}\) (EUPG) and metafounders (MF). Unknown parents groups were developed to avoid bias due to differences in genetic means of groups of individuals with different origins [15, 29]. The theory of UPG adapted to SSGBLUP models was reviewed by [16]. Later, Legarra et al. [17] conceived the theory of MF that represents base populations by related, inbred pseudoindividuals. The aim of MF was to provide a coherent theory, where UPG would account for the reduction in genetic variance due to drift and for relationships across base populations. Using genomic information, it is possible to estimate the relatedness between groups of unknown parents (\({\varvec{\Gamma}}\) matrix) [17, 30], and this relationship matrix across MF can be used also in purely pedigreebased BLUP models. We estimated matrix \({\varvec{\Gamma}}\) from observed genotypes using the GLS method of [30].
Let index 0 denote the base populations (either UPG or MF), index 1 “nongenotyped animals”, and index 2 “genotyped animals”. Denote \({\mathbf{A}}^{  1} = \left( {\begin{array}{*{20}c} {{\mathbf{A}}^{11} } & {{\mathbf{A}}^{12} } \\ {{\mathbf{A}}^{21} } & {{\mathbf{A}}^{22} } \\ \end{array} } \right)\) as the usual inverse of the relationship matrix and \({\mathbf{A}}_{22}^{  1}\) the inverse including only genotyped animals, \({\mathbf{A}}^{*} = \left( {\begin{array}{*{20}c} {{\mathbf{A}}^{00} } & {{\mathbf{A}}^{01} } & {{\mathbf{A}}^{02} } \\ {{\mathbf{A}}^{10} } & {{\mathbf{A}}^{11} } & {{\mathbf{A}}^{12} } \\ {{\mathbf{A}}^{20} } & {{\mathbf{A}}^{21} } & {{\mathbf{A}}^{22} } \\ \end{array} } \right)\) as the generalized inverse (as it is not full rank) including UPG, and \({\mathbf{A}}^{\left( \varGamma \right)  1} = \left( {\begin{array}{*{20}c} {{\mathbf{A}}^{\left( \varGamma \right)00} } & {{\mathbf{A}}^{\left( \varGamma \right)01} } & {{\mathbf{A}}^{\left( \varGamma \right)02} } \\ {{\mathbf{A}}^{\left( \varGamma \right)10} } & {{\mathbf{A}}^{\left( \varGamma \right)11} } & {{\mathbf{A}}^{\left( \varGamma \right)12} } \\ {{\mathbf{A}}^{\left( \varGamma \right)20} } & {{\mathbf{A}}^{\left( \varGamma \right)21} } & {{\mathbf{A}}^{\left( \varGamma \right)22} } \\ \end{array} } \right)\) as the inverse using MF. All three matrices are easily built using simple modifications of Henderson’s algorithm [31].
The SSGBLUP model proceeds by modifying the conditional variances and covariances in the inverse matrices according to observed genomic information, by obtaining \({\mathbf{H}}^{  1}\) matrices from \({\mathbf{A}}^{  1}\) matrices. Corresponding matrices are, for SSGBLUPUPG:
where \({\mathbf{G}}\) is the genomic relationship matrix that is built following the first method in [32], using observed allele frequencies, and made comparable to \({\mathbf{A}}_{22}\) following [5].
It is well known that this matrix is, at best, an approximation [16] because the theory of matrix \({\mathbf{H}}\) was derived under the constraint that \({\mathbf{A}}\) is full rank, which is not the case for \({\mathbf{A}}^{*}\). The same authors in [16] proposed a full transformation hereafter called “exact UPG” (EUPG) that can be written as:
where \({\mathbf{Q}}_{2}\) is the matrix containing UPG compositions for genotyped animals.
Whereas in “regular” SSGBLUP the only changes concern genotyped animals, here there are extensive changes that make programming difficult. In addition, because \({\mathbf{G}}\) accounts correctly for the different origins and does not need pedigree completion, there is, depending on the pedigree structure, some sort of doublecounting as observed by [18]. These problems are solved by MF which proposes:
where \({\mathbf{G}}_{05}\) is built with allele frequencies of 0.5 and there is no extra scaling to match \({\mathbf{A}}_{22}^{\left( \varGamma \right)}\), although there is blending as described below.
For all SSGBLUP models, the blending between \({\mathbf{G}}\) and \({\mathbf{A}}_{22}\) or between \({\mathbf{G}}_{05}\) and \({\mathbf{A}}_{22}^{\left( \varGamma \right)}\) was done using 0.95 and 0.05, as respective weights [32,33,34]. An analysis using MF also needs to consider that the population is more related by construction. We used the scaling of genetic variance in [17] such that if the genetic variance considering BLUP_UPG was \(\sigma_{u}^{2}\), the genetic variance component attributed to \({\mathbf{H}}_{MF}^{*}\) was \(\sigma_{u}^{2} /k\) for \(k = 1 + \frac{{\overline{{diag\left( {\varvec{\Gamma}} \right)}} }}{2}  {\bar{\mathbf{\varGamma }}}\).
Now we can describe the five models:

1.
BLUPUPG uses \({\mathbf{A}}^{*}\) and is the reference method known to be robust.

2.
BLUPMF uses \({\mathbf{A}}^{\left( \varGamma \right)  1}\). The main difference is that the latter assumes that MF are random effects and that they are correlated, whereas the former uses UPG that are fixed (unbounded a priori) effects.

3.
SSGBLUPUPG uses \({\mathbf{H}}_{UPG}^{*}\) and is expected to be somewhat biased because it is an approximation.

4.
SSGBLUPEUPG is supposed to be biased also because there is some doublecounting. However the bias is not necessarily the same as in SSGBLUPUPG.

5.
SSGBLUPMF is supposed to be the most accurate method.
All genetic evaluations were performed with heterf90 (not publicly released), which solves the outer model for heterogeneity of variances as in [25], whereas inner iterations used blup90iod2 [35]. To estimate the relationships across MF, we used gammaf90 (not publicly released), which uses the GLS method in [30].
Results
The estimated value of \({\varvec{\Gamma}}\) is presented below (each row/column corresponds to MF separated by 3 years). We did not explore these values in depth since it was out of the scope of this paper, but, in general, values showed moderate relationships across MF, i.e. most correlations obtained as \({\varvec{\Gamma}}_{{\left( {i,j} \right)}} /\sqrt {{\varvec{\Gamma}}_{{\left( {i,i} \right)}} {\varvec{\Gamma}}_{{\left( {j,j} \right)}} }\) ranged from 0.5 to 0.6. The second and third MF present somewhat extreme values because they have few genotyped descendants. For instance, if the allele frequencies in the base generation were uniformly distributed, the expected value in the diagonal is 2/3 [36]. Matrix \({\varvec{\Gamma}}\) is estimated from estimates of allele frequencies in the base population with standard errors ranging from 0.15 to 0.33, which are the highest values for the second and third MF. These errors seem large but we take the estimate of \({\varvec{\Gamma}}\) as a crude guess, i.e. just as breeding programs start with guessed heritabilities.
As for the LR method, Table 1 shows the values of estimators within models, i.e. when the model to estimate EBV_{p} and EBV_{w} were the same. In this case, the smallest bias (\(\hat{\Delta }\) of 0.23 genetic standard deviations (\(\sigma_{g}\)) and 0.25 \(\sigma_{g}\) for SSGBLUPMF and BLUPMF, respectively) was obtained with MF. All models are slightly biased and overestimate the genetic trend (around 0.25 genetic standard deviations, equivalent to 1 year of selection).
For the estimator of dispersion (\(\hat{b}_{p}\)), for all models, except for SSGBLUPEUPG, the values were close to 1, meaning absence of over or underdispersion of EBV. However, SSGBLUPEUPG model was biased (\(\hat{b}_{p} = 0.88\)), which indicates inflation of EBV. This agrees with [18] who found that SSGBLUPEUPG was biased. In Fig. 1, we present the values of each estimate of \(\hat{b}_{p}\) for BLUPMF (Fig. 1a), which has the average value of \(\hat{b}_{p}\) closest to 1, and for SSGBLUPEUPG (Fig. 1b), which generates the most overdispersion. The variability of the estimates of EBV_{p} within and across years is similar for both models, but the estimates of dispersion with SSGBLUPEUPG are systematically the smallest. As Fig. 1 shows, the year of birth 2008 seems to yield biased estimators. This agrees with [14] who found biases for predictions of rams born in this year. Figure 1 also illustrates that there is a large variability of estimates within and across years of the “partial” and “whole” datasets, with the implication that a single timepoint is not sufficient to describe the behavior of the genetic evaluation.
Estimator \(\hat{\rho }_{wp}\) represents the inverse of the relative gain in accuracy from EBV_{p} to EBV_{w}, thus high values of this estimator imply higher accuracy in the “partial” dataset, as expected for SSGBLUP. In agreement, values of this estimator were lower for the BLUP models (roughly 0.55) than for the SSGBLUP models (roughly 0.65). In other words, the EBV of the rams obtained without the records of theirs daughters were more accurate in SSGBLUP than in BLUP, which agrees with [14]. Similar results were found for \(\hat{\rho }_{pw}^{2}\), which estimates the ratio between reliabilities in EBV_{p} and EBV_{w}.
The direct estimators of accuracy (\(\widehat{acc}_{p}^{2}\) and \(\widehat{rel}_{p}\)), both based on the covariance between EBV_{p} and EBV_{w}, presented extremely high values (in some cases, the variance of EBV_{w} was larger than the genetic variance), for SSGBLUPUPG and SSGBLUPEUGP, and are therefore not reported. This may be an indirect indicator of the poor fit of UPG to SSGBLUP, whereas BLUPUPG shows reasonable values that agree with the other estimates of accuracy. For BLUP models, \(\widehat{acc}_{p}^{2}\) values were lower than for SSGBLUPMF (0.24 vs 0.32), which agrees with the information obtained from the other estimates of accuracy. Although these values are apparently small, this is expected because this is a sample of animals that are selected based on parent average [20]. In contrast, the estimation of “unselected” reliabilities, \(\widehat{rel}_{p}\), results in values within the usual scale of individual modelbased accuracies. Again, the SSGBLUPMF model estimated higher reliabilities than the BLUP models (0.59 vs 0.54 and 0.53, respectively). The increase in accuracy is fairly consistent across all four estimators of accuracy.
In Table 2, we presented the values of the slope of the regression of EBV_{w} on EBV_{p} (\(\hat{b}_{p}\)) when EBV_{p} was estimated with one model and EBV_{w} with another model. This gives some sort of measure of the disagreement across models, i.e. we expect models to behave similarly in terms of biases. Cases that estimate in a “partial” dataset with SSGBLUP and in a “whole” dataset with BLUP are not considered, since they seem unnatural in practice; for instance the decision on which animals to genotype may be based on the information of the whole dataset. When we use pedigreebased models to estimate EBV_{p} and EBV_{w}, the dispersion is around 1 (0.93 and 1.01), regardless of whether UPG or MF are used.
When EBV_{p} were estimated with the BLUP models and EBV_{w} with SSGBLUPUPG or SSGBLUPEUPG (the case when genomic selection is implemented), we observed an important underdispersion (around 1.25). However, SSGBLUPMF yielded \(\hat{b}_{p}\) values close to 1. Similar results were obtained when we compared EBV of the rams at birth, estimated with the BLUP models as “partial” with those estimated with the SSGBLUP models as “whole” (Table 3). The models SSGBLUPUPG and SSGBLUPEUPG show important underdispersion whereas SSGBLUPMF results in values of \(\hat{b}_{p}\) close to 1. This indicates that if we want to change a pedigreebased genetic evaluation for one that includes genomic information, the use of MF is a better option. Moreover, SSGBLUPEUPG is biased with itself as shown in Table 1, perhaps due to poor compatibility with the \({\mathbf{G}}\) matrices, because of doublecounting, or both.
Discussion
This study provides a comprehensive analysis of bias, dispersion and accuracies in dairy sheep genetic evaluation with several truncation points of data and several models. Estimates of bias, dispersion and accuracy were obtained with evaluation models that used only pedigree or a combination of pedigree and genomic relationship matrices with different strategies to model missing pedigree and using the LR method. The properties of such types of models have recently been extensively investigated [18, 30, 36,37,38,39,40]. The current study adds further evidence that the metafounder approach should be the preferred one for genomic evaluation across species.
The values of accuracy estimators confirm that the inclusion of genomic information increases the accuracy of the EBV of individuals without daughter records, which is consistent with other studies [41,42,43,44].
For \(\widehat{acc}_{p}^{2}\), we found extremely high values for models SSGBLUPUPG and SSGBLUPEUPG, due to values out of the parametric space. For example, for SSGBLUPUPG and the comparison 2010–2015, \(cov\left( {\hat{u}_{p} ,\hat{u}_{w} } \right) = 235\), \(var\left( {\hat{u}_{p} } \right) = 283\) and \(var\left( {\hat{u}_{w} } \right) = 580\), when the genetic variance in the base population is 565. This could indicate a difficulty for these models to manage correctly missing pedigree through UPG and the genomic information. Values within the expected range of reliabilities were found for the other models, and the SSGBLUPMF model reached the highest average value. These results agree with the values of estimators of the ratio of accuracies (\(\hat{\rho }_{wp}\) and \(\hat{\rho }_{p}^{2}\)), since the use of genomic information increases the reliability of EBV estimated without daughter records. We should note that \(\widehat{acc}_{p}^{2}\) tries to estimate the square of the correlation between EBV and TBV in the focal individuals, that are selected and with reduced variance, whereas \(\widehat{rel}_{p}\) would be the squared correlation if they were unselected. These two estimators have different purposes in practice [20]: the first, populational reliability \(\widehat{acc}_{p}^{2}\), describes the possible genetic gain, whereas the second describes stability of EBV. In the current breeding scheme of the Manech Tête Rousse, more candidates are genotyped for selection, so that our estimate \(\widehat{acc}_{p}^{2}\) is possibly a lower bound.
Concerning the bias (\(\hat{\Delta }_{p}\)), the lowest values were observed when MF were used to model the missing pedigree. As for the estimator of dispersion (\(\hat{b}_{p}\)), we did not observe important over or underdispersion, except for SSGBLUPEUGP. The closest values to 1 of this estimator were obtained when we used BLUPMF and SSGBLUPMF. Similar results were obtained in a recent work [18], which indicates that MF could be the best option to manage missing pedigree for SSGBLUP models. In the case of SSGBLUPEUPG, an important inflation of EBV was observed. A possible cause for this behavior could be that EUPG ignores the covariance between genetic groups (average relationship across MF is 0.38) whereas this relationship is included in \({\mathbf{G}}\). Similar results were reported by [18] using simulated data to compare the same three strategies to model missing parents, and they found that MF generated the smallest bias in evaluations.
In general, when BLUP or SSGBLUP_MF were used, no bias was found, although Legarra et al. [14] found biases in these same breeds using DYD both as pseudophenotypes and for validation. However, as we already mentioned, the validation set in [14] was composed of rams born in 2008–2009 with predictions that were also biased according to the LR method, which was due to a problem in collecting elite rams across flocks.
Finally, we consider important to highlight that a single cutoff point to estimate accuracy or bias is highly uncertain, as shown in Fig. 1. Breeding schemes should not rely on a single study based on a single point in time to define models for genetic evaluation.
Conclusions
The addition of genomic information increases the accuracy of the EBV of young rams in Manech Tête Rousse. In this population, that has missing pedigrees, the use of UPG and “exact UPG” in SSGBLUP produced bias, whereas MF yielded unbiased estimates and, thus we recommend its use. We also recommend assessing biases and accuracies using multiple truncation points, as these statistics are subject to random variation.
Availability of data and materials
The data set is available under reasonable request.
References
 1.
Spelman RJ, Arias J, Keehan MD, Obolonkin V, Winkelman AM, Johnson DL, et al. Application of genomic selection in the New Zealand dairy cattle industry. In: Proceedings of the 9th world congress on genetics applied to livestock production: 1–6 August 2010; Leipzig; 2010.
 2.
Patry C, Ducrocq V. Evidence of biases in genetic evaluations due to genomic preselection in dairy cattle. J Dairy Sci. 2011;94:1011–20.
 3.
Sargolzaei M, Chesnais J, Schenkel F. Assessing the bias in top GPA bulls. Canadian Dairy Network Open Industry Session: 30 October 2012; Guelph; 2012. p. 1–9.
 4.
Tyrisevä AM, Mäntysaari EA, Jakobsen J, Aamand GP, Dürr J, Fikse WF, et al. Detection of evaluation bias caused by genomic preselection. J Dairy Sci. 2018;101:3155–63.
 5.
Christensen OF, Madsen P, Nielsen B, Ostersen T, Su G. Singlestep methods for genomic evaluation in pigs. Animal. 2012;6:1565–71.
 6.
Carillier C, Larroque H, RobertGranié C. Comparison of joint versus purebred genomic evaluation in the French multibreed dairy goat population. Genet Sel Evol. 2014;46:67.
 7.
Abdalla EEA, Schenkel FS, Emamgholi Begli H, Willems OW, van As P, Vanderhout R, et al. Singlestep methodology for genomic evaluation in Turkeys (Meleagris gallopavo). Front Genet. 2019;10:1248.
 8.
Saatchi M, McClure MC, McKay SD, Rolf MM, Kim J, Decker JE, et al. Accuracies of genomic breeding values in American Angus beef cattle using Kmeans clustering for crossvalidation. Genet Sel Evol. 2011;43:40.
 9.
Saatchi M, Ward J, Garrick DJ. Accuracies of direct genomic breeding values in Hereford beef cattle using national or international training populations. J Anim Sci. 2013;91:1538–51.
 10.
Cardoso FF, Gomes CCG, Sollero BP, Oliveira MM, Roso VM, Piccoli ML, et al. Genomic prediction for tick resistance in Braford and Hereford cattle. J Anim Sci. 2015;93:2693–705.
 11.
Legarra A, Reverter A. Semiparametric estimates of population accuracy and bias of predictions of breeding values and future phenotypes using the LR method. Genet Sel Evol. 2018;50:53.
 12.
Astruc JM, Baloche G, Barillet F, Legarra A. Genomic evaluation validation test proposed by Interbull is necessary but not sufficient because it does not check the correct genetic trend. In: Proceedings of the 39th annual conference of the international committee for animal recording ICAR: 19–23 May 2014; Berlin; 2014. p. 50.
 13.
Baloche G, Legarra A, Sallé G, Larroque H, Astruc JM, RobertGranié C, et al. Assessment of accuracy of genomic prediction for French Lacaune dairy sheep. J Dairy Sci. 2014;97:1107–16.
 14.
Legarra A, Baloche G, Barillet F, Astruc JM, Soulas C, Aguerre X, et al. Within and acrossbreed genomic predictions and genomic relationships for Western Pyrenees dairy sheep breeds Latxa, Manech, and BascoBéarnaise. J Dairy Sci. 2014;97:3200–12.
 15.
Quaas RL. Additive genetic model with groups and relationships. J Dairy Sci. 1988;71:91–8.
 16.
Misztal I, Vitezica ZG, Legarra A, Aguilar I, Swan AA. Unknownparent groups in singlestep genomic evaluation. J Anim Breed Genet. 2013;130:252–8.
 17.
Legarra A, Christensen OF, Vitezica ZG, Aguilar I, Misztal I. Ancestral relationships using metafounders: finite ancestral populations and across population relationships. Genetics. 2015;200:455–68.
 18.
Bradford HLL, Masuda Y, VanRaden PMM, Legarra A, Misztal I. Modeling missing pedigree in singlestep genomic BLUP. J Dairy Sci. 2019;102:2336–46.
 19.
Forneris NS, Legarra A, Vitezica ZG, Tsuruta S, Aguilar I, Misztal I, et al. Quality control of genotypes using heritability estimates of gene content at the marker. Genetics. 2015;199:675–81.
 20.
Bijma P. Accuracies of estimated breeding values from ordinary genetic evaluations do not reflect the correlation between true and estimated breeding values in selected populations. J Anim Breed Genet. 2012;129:345–58.
 21.
Macedo FL, Reverter A, Legarra A. Behavior of the Linear Regression method to estimate bias and accuracies with correct and incorrect genetic evaluation models. J Dairy Sci. 2020;103:529–44.
 22.
Sorensen D, Fernando R, Gianola D. Inferring the trajectory of genetic variance in the course of artificial selection. Genet Res. 2001;77:83–94.
 23.
Dekkers JCM. Asymptotic response to selection on best linear unbiased predictors of breeding values. Anim Prod. 1992;54:351–60.
 24.
Henderson CR. Best linear unbiased estimation and prediction under a selection model. Biometrics. 1975;31:423–47.
 25.
Meuwissen THE, De Jong G, Engel B. Joint estimation of breeding values and heterogeneous variances of large data files. J Dairy Sci. 1996;79:310–6.
 26.
Legarra A, Christensen OF, Aguilar I, Misztal I. Single step, a general approach for genomic selection. Livest Sci. 2014;166:54–65.
 27.
Christensen OF, Lund MS. Genomic prediction when some animals are not genotyped. Genet Sel Evol. 2010;42:2.
 28.
Legarra A, Aguilar I, Misztal I. A relationship matrix including full pedigree and genomic information. J Dairy Sci. 2009;92:4656–63.
 29.
Westell RA, Quaas RL, Van Vleck LD. Genetic groups in an animal model. J Dairy Sci. 1988;71:1310–8.
 30.
GarciaBaccino CA, Legarra A, Christensen OF, Misztal I, Pocrnic I, Vitezica ZG, et al. Metafounders are related to F_{st} fixation indices and reduce bias in singlestep genomic evaluations. Genet Sel Evol. 2017;49:34.
 31.
Henderson CR. A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics. 1976;32:69–83.
 32.
VanRaden PM. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91:4414–23.
 33.
Aguilar I, Misztal I, Legarra A, Tsuruta S. Efficient computation of the genomic relationship matrix and other matrices used in singlestep evaluation. J Anim Breed Genet. 2011;128:422–8.
 34.
Christensen OF. Correction: compatibility of pedigreebased and markerbased relationship matrices for singlestep genetic evaluation. Genet Sel Evol. 2012;44:37.
 35.
Tsuruta S, Misztal I, Strandén I. Use of the preconditioned conjugate gradient algorithm as a generic solver for mixedmodel equations in animal breeding applications. J Anim Sci. 2001;79:1166–72.
 36.
van Grevenhof E, Vandenplas J, Calus MPL. Genomic prediction for crossbred performance using metafounders. J Anim Sci. 2019;97:548–58.
 37.
Meyer K, Tier B, Swan A. Estimates of genetic trend for singlestep genomic evaluations. Genet Sel Evol. 2018;50:39.
 38.
Xiang T, Nielsen B, Su G, Legarra A, Christensen OF. Application of singlestep genomic evaluation for crossbred performance in pig. J Anim Sci. 2016;94:936–48.
 39.
Xiang T, Christensen OF, Legarra A. Technical note: genomic evaluation for crossbred performance in a singlestep approach with metafounders. J Anim Sci. 2017;95:1472–80.
 40.
Yoshida GM, Carvalheiro R, Rodríguez FH, Lhorente JP. Genomics Singlestep genomic evaluation improves accuracy of breeding value predictions for resistance to infectious pancreatic necrosis virus in rainbow trout. Genomics. 2019;111:127–32.
 41.
Guarini AR, Lourenco DAL, Brito LF, Sargolzaei M, Baes CF, Miglior F, et al. Use of a singlestep approach for integrating foreign information into national genomic evaluation in Holstein cattle. J Dairy Sci. 2019;102:8175–83.
 42.
Duchemin SI, Colombani C, Legarra A, Baloche G, Larroque H, Astruc JM, et al. Genomic selection in the French Lacaune dairy sheep breed. J Dairy Sci. 2012;95:2723–33.
 43.
Carillier C, Larroque H, Palhière I, Clément V, Rupp R, RobertGranié C. A first step toward genomic selection in the multibreed French dairy goat population. J Dairy Sci. 2013;96:7294–305.
 44.
Wiggans GR, VanRaden PM, Cooper TA. The genomic evaluation system in the United States: past, present, future. J Dairy Sci. 2011;94:3202–11.
Acknowledgements
We are grateful to the Genotoul Bioinformatics Platform Toulouse MidiPyrenees (Bioinfo Genotoul) for providing computing and storage resources and to the Center for Quantitative Genetics and Genomics, Aarhus University, for receiving FLM during an internship, during which analyses were performed to complete the work. Furthermore, we want to thank Vincent Ducrocq, Alain Charcosset, Catherine Larzul, Anne Ricard, Leopoldo SanchezRodriguez, Francis Fidelle and the two anonymous reviewers for their helpful advice.
Funding
This work received funding from the project ARDI funded by INTERREG POCTEFA, the European Unions’ Horizon 2020 Research & Innovation program under grant agreement N°772787—SMARTER, Metaprogram SELGEN of INRAE, the Animal Genetics Division of INRAE and La Région Occitanie.
Author information
Affiliations
Contributions
FLM performed all analyses and wrote the paper. AL provided input on the procedures, models, data analyses. OFL provided input on procedures and data analyses. JMA provided the data and the model of the official genetic evaluation. IA and YM contributed to the code of BLUPF90 family applications, used to analyze the data. AL, OFC and IA commented on an improved draft. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Appendix
R code to obtain estimable functions of statistics in LR method
We used function estimable() from the R package Gmodels v. 2.18.1
#https://www.rdocumentation.org/packages/gmodels/versions/2.18.1/topics/estimable
library(gmodels)
# read input file
input = read.table(“input.txt”)
# 1 represent the intercept, 9 is number of years in “partial” minus one (included in the intercept), 10 is number of years in “whole” minus one (included in the intercept) cm = c(1,rep(1/9,9),rep(1/10,10))
# example with bias
lm_bias = lm(bias ~ as.factor(y_partial) + as.factor(y_whole),data = input)
# estimable function
estimable(lmbias,cm,conf.int = 0.95)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Macedo, F.L., Christensen, O.F., Astruc, J. et al. Bias and accuracy of dairy sheep evaluations using BLUP and SSGBLUP with metafounders and unknown parent groups. Genet Sel Evol 52, 47 (2020). https://doi.org/10.1186/s12711020005671
Received:
Accepted:
Published: