Single-step genomic evaluation with metafounders for feed conversion ratio and average daily gain in Danish Landrace and Yorkshire pigs

Background The single-step genomic best linear unbiased prediction (SSGBLUP) method is a popular approach for genetic evaluation with high-density genotype data. To solve the problem that pedigree and genomic relationship matrices refer to different base populations, a single-step genomic method with metafounders (MF-SSGBLUP) was put forward. The aim of this study was to compare the predictive ability and bias of genomic evaluations obtained with MF-SSGBLUP and standard SSGBLUP. We examined feed conversion ratio (FCR) and average daily gain (ADG) in DanBred Landrace (LL) and Yorkshire (YY) pigs using both univariate and bivariate models, as well as the optimal weighting factors (ω), which represent the proportions of the genetic variance not captured by markers, for ADG and FCR in SSGBLUP and MF-SSGBLUP. Results In general, SSGBLUP and MF-SSGBLUP showed similar predictive abilities and bias of genomic estimated breeding values (GEBV). In the LL population, the predictive ability for ADG reached 0.36 using uni- or bi-variate SSGBLUP or MF-SSGBLUP, while the predictive ability for FCR was highest (0.20) for the bivariate model using MF-SSGBLUP, but differences between analyses were very small. In the YY population, predictive ability for ADG was similar for the four analyses (up to 0.35), while the predictive ability for FCR was highest (0.36) for the uni- and bi-variate MF-SSGBLUP analyses. SSGBLUP and MF-SSGBLUP exhibited nearly the same bias. In general, the bivariate models had lower bias than the univariate models. In the LL population, the optimal ω for ADG was ~ 0.2 in the univariate or bivariate models using SSGBLUP or MF-SSGBLUP, and the optimal ω for FCR was 0.70 and 0.55 for SSGBLUP and MF-SSGBLUP, respectively. In the YY population, the optimal ω ranged from 0.25 to 0. 35 for ADG across the four analyses and from 0.10 to 0.30 for FCR. Conclusions Our results indicate that MF-SSGBLUP performed slightly better than SSGBLUP for genomic evaluation. There was little difference in the optimal weighting factors (ω) between SSGBLUP and MF-SSGBLUP. Overall, the bivariate model using MF-SSGBLUP is recommended for single-step genomic evaluation of ADG and FCR in DanBred Landrace and Yorkshire pigs.


Background
Single-step genomic best linear unbiased prediction (SSGBLUP), as a standard genomic evaluation method using single nucleotide polymorphism (SNP) genotypes, has been successfully used in the pig industry [1]. By using SSGBLUP, genomic selection (GS) can be implemented even if some animals are not genotyped because it can integrate phenotypic records, pedigree, and genomic information on all relevant animals [2,3]. However, some problems with SSGBLUP need to be solved. First, theoretically, the allele frequencies used to construct the genomic relationship matrix should be those of the base population of the pedigree [4]. However, allele frequencies in the base populations are usually unknown, because animals in the base population are often not genotyped [5]. Using allele frequencies other than those of the base population for the construction of the genomic relationship matrix causes incompatibility between the pedigree-based and marker-based relationship matrices [6]. Second, to make the genomic relationship matrix invertible and capture genetic variance that cannot be captured by SNPs, the genomic relationship matrix is usually combined with the pedigree-based relationship matrix, using a weighting factor (ω) that may be breed-and trait-specific, which needs to be further investigated. Some approaches are available to address these two problems, but they are not perfect. For example, instead of allele frequencies in the base population, those of the genotyped population are used to construct the genomic relationship matrix. Other studies have adjusted the marker-based relationship matrix to make it compatible with the pedigree-based relationship matrix [7][8][9]. Some studies use the same weighting factor (ω) for different traits, leading to a decrease in the accuracy of estimated breeding values [10]. Although these solutions have been widely used in SSGBLUP and appear to be effective in practice, they do not fully solve the abovementioned issues and further developments are needed to improve SSGBLUP.
To address the issue of unknown base allele frequencies and incompatibility of genomic and pedigree relationship matrices, Christensen [6] and Legarra et al. [5] defined a metafounder as a finite-sized pool of gametes from which alleles are randomly extracted to form diploids that represent animals in the base population [5]. Based on metafounder theory [5,6], allele frequencies of 0.5 should be used to construct the genomic relationship matrix and referred to in the construction of the pedigree-based relationship matrix, such that compatibility between pedigree-based and marker-based relationship matrices is automatically achieved. Thus, metafounder theory solves the two above-mentioned issues of SSGBLUP.
Since this theory was formulated, a few studies have focused on the application of single-step genomic evaluation coupled with metafounders, i.e. MF-SSGBLUP. For instance, Garcia-Baccino et al. [11] used simulated data to investigate methods to estimate relatedness in the base population (γ). Using MF-SSGBLUP, Xiang et al. [12] estimated breeding values for total number of piglets born in purebred pigs for two-way crossbred performance, and van Grevenhof et al. [13] estimated breeding values of purebred pigs for three-way crossbred performance, based on simulated data. In dairy cattle, Bradford et al. [14] compared the predictive performance of SSG-BLUP with unknown parent groups and MF-SSGBLUP using simulated data. Kudinov et al. [15] recommended MF-SSGBLUP for genomic prediction in red dairy cattle. Macedo et al. [16] reported that MF-SSGBLUP can yield unbiased estimates of breeding values when pedigree information is not complete.
Studies that compare MF-SSGBLUP and standard SSG-BLUP for genomic evaluation on production traits such as feed conversion rate (FCR) and average daily gain (ADG) are lacking in pigs. In addition, only a few studies have applied these two methods in a multi-trait model, which is extensively used in pig breeding programs. Finally, the effect of using different weighting factors ω for the pedigree-based relationship matrix in MF-SSGB-LUP on genomic predictions has not been investigated. Therefore, the aims of this study were to (1) compare estimates of genetic parameters for ADG and FCR obtained with SSGBLUP and MF-SSGBLUP; (2) investigate and compare the accuracy and bias of GEBV using SSGBLUP and MF-SSGBLUP with univariate and bivariate animal models; and (3) find optimal weighting factors, ω, for ADG and FCR when using SSGBLUP or MF-SSGBLUP.

Phenotypic records
All datasets were provided by the SEGES Danish Pig Research Centre. The traits ADG (ADG = weight gain/ days) and FCR (FCR = feed intake/weight gain) were measured during the 30 to 100 kg interval for both Dan-Bred Landrace (LL) and Yorkshire (YY) pigs between the years 2000 and 2017. Records for ADG were available on 686,420 LL pigs, of which 18,889 also had FCR records, and on 570,493 YY pigs, of which 19,387 also had FCR records. Pedigrees of the two breeds were traced back to January 1, 1994, and included 700,960 LL pigs and 582,114 YY pigs.

Genotypes
In total, 37,699 LL and 37,845 YY pigs were genotyped with the Illumina PorcineSNP60 Genotyping BeadChip, with approximately equal numbers by breed using version 1 and version 2. SNPs were mapped to chromosomes based on the pig genome build 10.2 [17]. Genotype quality control was conducted separately for the two versions and the two breeds. First, animals with a genotype call rate lower than 90% were excluded. Then, SNPs with a genotype call rate lower than 90% and a minor allele frequency lower than 0.05 were filtered out, as were SNPs that deviated strongly from Hardy Weinberg equilibrium within breed (p < 10e−7). Fimpute v2.2 [18] was used to impute missing genotypes. After quality control, 37,621 SNPs for 37,699 LL pigs and 36,687 SNPs for 37,845 YY pigs were retained for further analysis.

Methods to construct the relationship matrix
For SSGBLUP, the inverse of the combined genomic and pedigree-based relationship matrix ( H −1 ) was constructed following Christensen and Lund [3] and Christensen et al. [9], as: where G a,ω = (1 − ω)G a + ωA 22 and A 22 contains the pedigree-based relationships among the genotyped pigs. Matrix G a is an adjusted version of the genomic relationship matrix G , that was constructed following VanRaden's [4] method 1 as: where Z is a matrix with entries 0-2p i , 1-2p i , 2-2p i for genotypes AA, AB, and BB, respectively, for SNP i ranging from 1 to m , where m is the number of markers. Theoretically, p i are the minor allele frequencies in the base population but, in practice, p i is often computed from the observed genotypes, and this was the case in our study. To be compatible with the pedigree-based relationship matrix, matrix G was scaled to create a matrix G a , following the method described by Christensen et al. [9], as: G a = βG + α11 ′ , where β and α are obtained by solving the following equations: Avg diag(G) β + α = Avg diag(A 22 ) and Avg(G)β + α = Avg(A 22 ) . The inverse of the pedigree-based relationship A matrix was constructed according to Henderson's rule [19] by considering inbreeding coefficients in the relationship matrix computation. Matrix A 22 , with inverse A −1 22 , was constructed following Colleau [20]. To investigate the effects of different weighting factors, ω, between the pedigree-based and genomic relationship matrices on genomic predictions in SSGBLUP, we tested values of ω ranging from 0.05 to 0.95 with an interval of 0.05.
For MF-SSGBLUP, parameter γ, which represents the relatedness of the animals in the LL and YY base populations, must be estimated to construct the inverse of the combined relationship matrix. Christensen [6] and Garcia-Baccino et al. [11] reported that γ = 8σ 2 p where σ 2 p is the variance of the true allele frequencies in the base population. To estimate γ, σ 2 p was calculated by the method of generalized least squares reported by Garcia-Baccino et al. [11] and Xiang et al. [12]. For this purpose, gene content at a SNP, i.e. the number of copies of a given reference allele (e.g. 0/1/2 for genotypes AA/AB/BB) was viewed as a quantitative trait with a heritability of 1, and all the variance was assumed to be additive genetic [21]. The linear model used for analysis of gene content, which was independently proposed by McPeek et al. [22] and Gengler et al. [23], was: where m i is a vector of gene contents for SNP i across all animals; µ i is the overall mean of gene content, with µ i = 2p i , where p i is the allele frequency in the base population; u i is the vector of random genetic effects and was assumed to follow a multivariate normal distribution u i ∼ N 0, Aσ 2 u [24], where A is the pedigree-based additive genetic relationship matrix; matrix W is an incidence matrix relating individuals to their genotypes; and e i is a vector of error terms. We used σ 2 u = 0.999 and σ 2 e = 0.001 , such that heritability was almost 1. Similar to Xiang et al. [25], the overall mean µ i for each locus i was estimated as the best linear unbiased estimate (BLUE) using the BLUPF90 [26] software. Then, the estimate of the base allele frequency p i was calculated as half of the μ i . The variance of base population allele frequencies, σ 2 p , was obtained, followed by γ = 8 σ 2 p . After estimating γ, the inverse combined relationship matrix H(γ) −1 was constructed for the LL and YY populations as follows: where G MF,ω = (1 − ω)G MF + ωA γ 22 , and A γ 22 contain the pedigree-based relationships among the genotyped pigs. The marker-based relationship matrix was derived as M is a matrix with entries 0, 1, and 2 for genotypes AA, AB, and BB, respectively; 1 m and 1 n are vectors of ones of lengths m for the number of markers and n for the number of animals, respectively; and s is a scaling parameter equal to m 2 [27]. The calculation of H γ −1 was based on the method reported by Garcia-Baccino et al. [11]. Values of ω ranging from 0.05 to 0.95 with an interval of 0.05 were investigated. The inverse of the pedigree-based relationship matrix A γ −1 was constructed according to Henderson's [19] rules and matrix A γ 22 was constructed following the method described by Colleau [20], with a minor modification based on metafounders [5].

Statistical models
Both univariate and bivariate animal models were applied to estimate breeding values for ADG and FCR for both LL and YY pigs. The statistical models for analysis of ADG and FCR were as follows: where y ADG and y FCR are vectors of phenotypic records for ADG and FCR, respectively; β ADG and β FCR are the vectors of fixed effects, which include contemporary groups effects (animals present at the same point of time), sex effect, and covariates on initial weights for ADG and FCR, respectively; a ADG , a FCR , l ADG , and p ADG are the vectors of random effects, with a ADG and a FCR representing the additive genetic effects for ADG and FCR, respectively; l ADG and p ADG are vectors of random birth litter effects and random pen effects for ADG; X , Z , K , W are the corresponding incidence matrices; and e ADG and e FCR are vectors of random residual effects. Random litter and pen effects were assumed to follow normal distributions, i.e., l ADG ∼ N 0, Iσ 2 litter and p ADG ∼ N 0, Iσ 2 pen , where I is an identity matrix. In the univariate models, vectors a ADG and a FCR were assumed to be normally distributed with a mean of 0 and (co)variance structures var(a ADG ) = Hσ 2 a_UNI_ADG and var(a FCR ) = Hσ 2 a_UNI_FCR for SSGBLUP, and var(a ADG ) a_UNI_MF _ADG , and σ 2 a_UNI_MF _FCR are, respectively, the additive genetic variances for ADG and FCR in the univariate models using SSGBLUP and MF-SSGBLUP. In the bivariate model, a ADG and a FCR were assumed to follow a multivariate normal distribution: a_BI_MF _FCR are the additive genetic variances for ADG and FCR in the bivariate model for SSGBLUP and MF-SSGBLUP, respectively, and σ a_BI_ADG,a_BI_FCR and σ a_BI_MF _ADG,a_BI_MF _FCR are the additive genetic covariances between ADG and FCR in the bivariate model for SSGBLUP and MF-SSGBLUP, respectively. It should be noted that the (co)variance components in MF-SSGBLUP cannot be directly compared with those in SSGBLUP based on the assumption that animals in the base population were unrelated. In MF-SSGBLUP, all the genetic (co)variance components, including the additive genetic variances for ADG and FCR and the genetic covariance between ADG and FCR, were multiplied by (1 − γ/2) , with γ estimated separately for the LL and YY populations [5,26].

Methods and genetic parameters
Using the HE regression algorithm in the HIBLUP software (https:// hiblup. github. io/), all available data for the LL and YY pigs were used to estimate (co)variance components and heritabilities ( h 2 ) with ω = 0.25 for both ADG and FCR using four methods: (1)

Predictive ability
Genomic predictions for the four methods of analysis were obtained using the preconditioned conjugate gradient algorithm in the DMU software [28], using different ω (ranging from 0.05 to 0.95). Notably, in the univariate and bivariate models, each time only one ω was used to construct the relationship matrix. Predictive abilities were determined as the correlations between corrected phenotypes ( Y c ) and genomic estimated breeding values ( â ) in different methods ( cor Y c ,â ), in a validation population, following the method reported by Christensen et al. [9].  Table 1. The predictive ability was investigated for each subgroup and for the total validation population for both the YY and LL populations.
A Hotelling-Williams t-test at a 5% confidence level was performed to evaluate the statistically significant differences in predictive ability between methods. The value of ω that resulted in the highest accuracy of genomic predictions was considered the optimal ω. However, if several values of ω resulted in accuracies that were close to the highest accuracy, the value of ω with the lowest bias was considered the optimal weighting factor. Tables 2 and 3 show the estimated genetic parameters with a ω of 0.25 for ADG and FCR for the four methods of analysis and estimates of the genetic correlation between ADG and FCR for the bivariate analyses (BI_ SSGBLUP and BI_MF-SSGBLUP) for the LL and YY populations, respectively. For MF-SSGBLUP, parameter γ was estimated at γ L = 0.605 and γ Y = 0.553 for the LL and YY populations, respectively.

Genetic parameters
For the LL population, estimates of the additive genetic variance for ADG and FCR were similar for the four methods (Tables 2 and 3). Estimates of heritabilities for ADG ( h 2 a_ADG ) and FCR ( h 2 a_FCR ) were also almost identical for the four methods, with h 2 a_ADG ranging from 0.23 to 0.24 and h 2 a_FCR ranging from 0.10 to 0.11, but with very high standard errors (SE), i.e. ranging from 0.04 to 0.05 for h 2 a_ADG and from 0.05 to 0.07 for h 2 a_FCR . The estimate of the genetic correlation between ADG and FCR was higher negative with BI_MF-SSGBLUP ( r g = − 0.27, SE = 0.10) than with BI_ SSGBLUP ( r g = − 0.19, SE = 0.09).
Results for the YY population were similar to those for the LL population. Variance component estimates were similar for the four methods, except for estimates of the additive genetic variance for ADG and FCR, which were slightly greater with the use of metafounders (UNI_MF-SSGBLUP and BI_MF-SSGBLUP) than for the two SSGBLUP methods. Estimates of heritability were consistent for the four methods and ranged from 0.26 to 0.28 for ADG, with SE from 0.03 to 0.04, and from 0.19 to 0.20 for FCR, with SE from 0.03 to 0.04. The BI_MF-SSGBLUP and BI_ SSGBLUP methods resulted in the same estimate of the genetic correlation Table 1 Numbers of animals with records for average daily gain (ADG) and feed conversion ratio (FCR) and with genotypes in the training and validation datasets for the two populations between ADG and FCR (− 0.46, with SE ranging from 0.12 to 0.13).

Predictive ability
The effect of the weighting factor (ω) on accuracy for ADG and FCR for the four methods are shown in Fig. 1. For the LL population, the four methods resulted in nearly the same accuracy for ADG, with the highest accuracy equal to 0.355. The optimal weighting factors (ω) were 0.20, 0.15, 0.20, and 0.25 for UNI_ SSGBLUP, UNI_MF-SSGBLUP, BI_SSGBLUP, and BI_MF-SSGBLUP, respectively. For FCR, the predictive ability was slightly higher for the bivariate (0.19) than for the univariate methods (0.16). The optimal ω was 0.70 for UNI_SSGBLUP and BI_SSGBLUP, and 0.55 for UNI_ MF-SSGBLUP and BI_MF-SSGBLUP. For the YY population, the curves of the predictive ability for ADG as a function of ω for the four methods nearly overlapped at the top (Fig. 1). The highest predictive ability (0.347) was obtained when ω was set to 0.30 for all four methods. For FCR, the univariate models performed slightly better than the bivariate models when ω was lower than 0.85, i.e. the predictive ability was highest, 0.355, with ω equal to 0.10 and 0.15 for UNI_SSGBLUP and UNI_MF-SSGB-LUP, respectively, and 0.344 with a ω of 0.25 and 0.30 for BI_SSGBLUP and BI_MF-SSGBLUP, respectively. Table 2 Estimates of variance components, heritabilities, and genetic correlations for average daily gain (ADG) and feed conversion ratio (FCR) in Landrace pigs using pedigree-based univariate BLUP and univariate (UNI) or bivariate (BI) single-step genomic prediction models with or without metafounders (MF) Variance components for genetic parameters correspond to the usual genetic variance (in MF-SSGBLUP, all the variance components are multiplied by ( 1 − γ L )/2 ), which is the variance among unrelated individuals in the base population σ 2 a_ADG is the additive genetic variance for ADG, σ 2 a_FCR is the additive genetic variance for FCR; σ a_ADG,a_FCR is genetic covariance between ADG and FCR; σ 2 litter is the variance of litter effect for ADG; σ 2 e_ADG is the residual variance for ADG; σ 2 e_FCR is the residual variance for FCR Numbers between brackets are the standard errors of the corresponding parameters   Table 3 Estimates of variance components, heritabilities, and genetic correlations for average daily gain (ADG) and feed conversion ratio (FCR) in Yorkshire pigs using pedigree-based univariate BLUP and univariate (UNI) or bivariate (BI) single-step genomic prediction models with or without metafounders (MF) Variance components for genetic parameters correspond to the usual genetic variance (in MF-SSGBLUP, all the variance components are multiplied by ( 1 − γ Y )/2 ), which is the variance among unrelated individuals in the base population σ 2 a_ADG is the additive genetic variance for ADG, σ 2 a_FCR is the additive genetic variance for FCR; σ a_ADG,a_FCR is genetic covariance between ADG and FCR; σ 2 litter is the variance of litter effect for ADG; σ 2 e_ADG is the residual variance for ADG; σ 2 e_FCR is the residual variance for FCR Numbers between brackets are the standard errors of the corresponding parameters  Taken together, when the same value of ω was used, the predictive ability did not significantly differ between the four methods. The predictive ability of method BI_MF-SSGBLUP was slightly superior compared to that of the other methods.
In order to compare the predictive abilities for genotyped and non-genotyped animals, the LL and YY validation populations were divided into a genotyped and a non-genotyped subgroup, and the predictive abilities for ADG and FCR were calculated at the optimal ω for each subgroup. The results are in Table 4. As shown in Fig. 1, the optimal ω for SSGBLUP and MF-SSGBLUP was nearly the same for a given trait and population. Thus, we only investigated the predictive abilities for the genotyped and non-genotyped subgroups at the optimal ω of 0.20, 0.70, 0.25, and 0.10 for to ADG in LL, FCR in LL, ADG in YY, and FCR in YY, respectively. Based on Table 4, for both the univariate and the bivariate models, the predictive abilities of SSGBLUP and MF-SSGBLUP were similar for both subgroups but the predictive ability was higher for the genotyped subgroups than for the non-genotyped subgroups.
The effect of the weighting factor (ω) on bias of genomic predictions for ADG and FCR for the four methods are shown in Fig. 2. In general, the regression coefficient increased as ω increased. For the LL population, when ω was set to 0.95, the regression coefficients were closest to 1. For ADG, the effect of ω on the regression coefficient was nearly the same for the four methods; for FCR, the regression coefficients deviated more from 1 for the univariate models than for the bivariate models. For the YY population, the regression coefficient for ADG with method BI_MF-SSGBLUP was closest to 1 with a ω of 0.70, indicating that this method was the best among the four methods in terms of bias. For FCR, there were no dramatic differences in bias between the four methods. When ω was set to 0.15, the bias of genomic predictions was lowest for all four methods. When the optimal ω was used, the bias of genomic predictions was lower for the non-genotyped subgroup than for the genotyped subgroup for all four methods (see Table 5).

Discussion
In this work, we compared predictive abilities of genomic predictions for two economically important production traits, ADG and FCR, in DanBred Landrace and Yorkshire populations between using regular SSGBLUP and Fig. 1 Accuracy for ADG and FCR. The effect of the weighting factor (ω) on accuracy for ADG and FCR for the four methods for two populations, where the X-axis represents ω ranging from 0.05 to 0.95, and the Y-axis indicates the correlation coefficients between corrected phenotypes ( Y c ) and genomic estimated breeding values ( â) MF-SSGBLUP. First, we estimated γ to construct the H(γ) −1 matrix for MF-SSGBLUP. Then, we compared genetic parameters estimates and predictive abilities between SSGBLUP and MF-SSGBLUP, both for univariate and bivariate models. This study also explored the optimal weighting factor, ω, on pedigree relationships for ADG and FCR for both SSGBLUP and MF-SSGBLUP.

Estimation of γ for MF-SSGBLUP
For use in MF-SSGBLUP, we obtained estimates of γ of 0.605 and 0.553 for the LL and YY populations, respectively, which differed from those reported previously for the same populations (0.756 for LL and 0.730 for YY) [12]. Legarra et al. [5] have pointed out that several factors can affect estimates of γ, such as effective population size and the SNP panel used. The populations used in the current study and in previous studies [12] originated from the same ancestors, thus the effective population size was not responsible for the difference in γ estimates. However, in our study we used fewer SNPs (about 37,500 SNPs) than in the previous study (41,000 SNPs), which is due to a difference in the quality control criteria on minor allele frequencies between the two studies, which was 0.05 for both the LL and YY populations, while Xiang et al. 's [12] used a threshold of 0.01. Thus, the additional SNPs in their study had a lower allele frequency, which increased the percentage of homozygotes and resulted in greater estimates of γ [12].

Genetic parameters
When using MF-SSGBLUP and SSGBLUP, although both the univariate and bivariate models produced similar genetic (co)variance estimates, the slightly smaller standard errors for the bivariate models indicated that bivariate models can reduce the uncertainty of genetic parameter estimates compared to the univariate models. Estimates of the genetic correlation between ADG and FCR were based on the bivariate models. Although MF-SSGBLUP resulted in a higher negative (− 0.27) estimate than SSGBLUP (− 0.19) for the LL population, this difference was not significant. And MF-SSGBLUP and SSGB-LUP resulted in the same estimate for the YY population.
Estimates of heritability were similar for the four methods within a population, and the heritability estimate for FCR was considerably lower for the LL (about 0.11) than the YY (about 0.20) population. In the literature, a range of heritabilities for FCR has been reported. For example, Kavlak and Uimari [29] reported an estimate of 0.28 for Finnish Yorkshires; Dube et al. [30] estimates ranging from 0.21 to 0.27 for South African Yorkshires; and Hoque et al. [31] reported an estimate of 0.27 for Duroc pigs. In this study, the SE of the estimate of heritability for FCR was substantial (about 0.05). Thus, further studies are needed to compute the relevant genetic parameters based on larger datasets.

Predictive ability
Martini et al. [32] have reported that a high weighting factor ω increases the variance of EBV. Therefore, we Table 4 Accuracies of genomic predictions for average daily gain (ADG) and feed conversion ratio (FCR) in different validation groups for univariate (UNI) or bivariate (BI) single-step genomic prediction models with or without metafounders (MF), each with their optimal weighting factor (ω) in the two populations The results for ADG and FCR bivariate single-step genomic prediction models are obtained with their optimal ω respectively Superscript letters indicate significant differences (p < 0.05) by the Hotelling-Williams t-test explored the optimization of ω. However, predictive ability was only slightly changed by ω. The ω with the highest predictive ability was defined as the optimal ω. For ADG, the optimal ω was almost the same (around 0.2) for the LL and YY populations, but for FCR, the optimal ω was substantially higher for the LL (around 0.7) than for the YY (around 0.1) population. This may be caused by a difference in the proportions of genetic variances accounted for by 50 K SNPs between the two populations. Miao et al. [33] showed that the genes that affect FCR differ between LL and YY [33], thus the gene-linked SNPs may capture different proportions of the total genetic variance for the two populations. In this study, predictive ability was higher than that reported in a previous study [9], likely because of the larger population size both in terms of number of phenotypic records and amount of genomic information. In our study, the estimated accuracy for ADG and FCR with SSGBLUP was higher than that reported for total number of piglets born in a previous study on the same populations [26], likely because the latter trait has a lower heritability than ADG and FCR. As shown in Table 4, the predictive ability for ADG and FCR was higher for the genotyped subgroups than for the non-genotyped subgroups for all four methods, consistent with previous reports [9,12]. However, Guo et al. [34] reported that the predictive reliability of non-genotyped subgroups was higher than that of genotyped subgroups for total number of piglets born and litter size at day 5 after birth in the LL population, which they attributed to pre-selection of the genotyped animals, which reduced the correlations between phenotypes and genotypes. In this study, although genotyped animals were also pre-selected, the number of genotyped animals after pre-selection was still large enough to provide sufficient information to make genomic predictions of genotyped animals more accurate than those of non-genotyped animals. Furthermore, the regression coefficients were lower for the genotyped subgroups than for the non-genotyped subgroups, while the bias of non-genotyped subgroups was similar to that of the whole validation population.
Theoretically, genomic relationships should be constructed using base allele frequencies [4] and some studies [35] have used estimated base allele frequencies to construct the genomic relationship matrix for genomic prediction. Thus, predictive abilities and bias were also evaluated for a bivariate SSGBLUP model with a genomic relationship matrix constructed using estimated base allele frequencies, using the optimal ω of 0.20, 0.70, 0.25, and 0.10 for ADG in LL, FCR in LL, ADG in YY, and FCR in YY, respectively. We obtained the following predictive abilities and bias: 0.245 and 0.743, respectively for ADG in LL pigs, 0.114 and 0.642 for FCR in LL pigs, 0.198 and 0.567 for ADG in YY pigs, and 0.141 and 0.550 for FCR in YY pigs. These values were much lower than those obtained with SSGBLUP with a genomic relationship matrix constructed using observed allele frequencies. There are two possible explanations why the use of a genomic relationship matrix constructed using base allele frequencies resulted in worse predictive performance than the methods evaluated in this study: first, we used estimates of variance components using SSGBLUP with a genomic relationship matrix that was constructed using the observed allele frequencies; second, the estimated base allele frequencies may not have been accurate. However, we found that the performance of MF-SSGBLUP was at least as good as that of SSGBLUP, which indicates that the variance of the estimated base allele frequencies was reliable, since the self-relationship γ was estimated using this variance.
Overall, MF-SSGBLUP performed slightly better for genomic prediction for ADG and FCR than SSGBLUP, although the differences were not always statistically significant. This finding is in agreement with the theory [5]. Thus, overall we recommend the use of MF-SSGBLUP for genomic evaluation in pigs.

Conclusions
The single-step genomic evaluation method with a metafounder (MF-SSGBLUP) was successfully implemented for genomic prediction of ADG and FCR in Danish Landrace and Yorkshire populations and was found to be slightly superior to the regular single-step method. The optimal weighting factor (ω) on pedigree information was slightly different for SSGBLUP versus MF-SSGBLUP. Based on our results, the bivariate model with a metafounder approach is recommended to estimate breeding values for correlated production traits such as ADG and FCR. Table 5 Biases of genomic predictions for average daily gain (ADG) and feed conversion ratio (FCR) in different validation groups for univariate (UNI) or bivariate (BI) single-step genomic prediction models with or without metafounders (MF), each with their optimal weighting factor (ω) in the two populations The results for ADG and FCR for bivariate single-step genomic prediction models are obtained with their optimal ω respectively