A deterministic equation to predict the accuracy of multi-population genomic prediction with multiple genomic relationship matrices

Background A multi-population genomic prediction (GP) model in which important pre-selected single nucleotide polymorphisms (SNPs) are differentially weighted (MPMG) has been shown to result in better prediction accuracy than a multi-population, single genomic relationship matrix (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{GRM}}$$\end{document}GRM) GP model (MPSG) in which all SNPs are weighted equally. Our objective was to underpin theoretically the advantages and limits of the MPMG model over the MPSG model, by deriving and validating a deterministic prediction equation for its accuracy. Methods Using selection index theory, we derived an equation to predict the accuracy of estimated total genomic values of selection candidates from population \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$A$$\end{document}A (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$r_{{{\mathbf{EGV}}_{{A_{T} }} }}$$\end{document}rEGVAT), when individuals from two populations, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$A$$\end{document}A and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$B$$\end{document}B, are combined in the training population and two \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathbf{GRM}}$$\end{document}GRM, made respectively from pre-selected and remaining SNPs, are fitted simultaneously in MPMG. We used simulations to validate the prediction equation in scenarios that differed in the level of genetic correlation between populations, heritability, and proportion of genetic variance explained by the pre-selected SNPs. Empirical accuracy of the MPMG model in each scenario was calculated and compared to the predicted accuracy from the equation. Results In general, the derived prediction equation resulted in accurate predictions of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$r_{{{\mathbf{EGV}}_{{A_{T} }} }}$$\end{document}rEGVAT for the scenarios evaluated. Using the prediction equation, we showed that an important advantage of the MPMG model over the MPSG model is its ability to benefit from the small number of independent chromosome segments (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$M_{e}$$\end{document}Me) due to the pre-selected SNPs, both within and across populations, whereas for the MPSG model, there is only a single value for \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$M_{e}$$\end{document}Me, calculated based on all SNPs, which is very large. However, this advantage is dependent on the pre-selected SNPs that explain some proportion of the total genetic variance for the trait. Conclusions We developed an equation that gives insight into why, and under which conditions the MPMG outperforms the MPSG model for GP. The equation can be used as a deterministic tool to assess the potential benefit of combining information from different populations, e.g., different breeds or lines for GP in livestock or plants, or different groups of people based on their ethnic background for prediction of disease risk scores.

numerically small breeds or lines in livestock or numerically small human ethnic groups, it is difficult or impossible to assemble a large enough training population that can accurately predict the genomic values. Therefore, the accuracy of GP in numerically small populations is limited [9].
A potential option to increase the accuracy of GP in numerically small populations is to use a large training population made up of individuals from multiple populations, including the target population, a method known as multi-population GP. Results from dairy cattle indicate that this approach can lead to substantial increases in the accuracy of GP for numerically small breeds, if the training population is made up of individuals from different but closely-related breeds that have recently had substantial exchanges of genetic material, and that a large number of individuals from the additional breed is included [10]. However, in cases in which distantly related breeds were combined in a single training population, increases in the accuracy of multi-population GP were limited compared to that of within-population GP [11][12][13].
Different statistical approaches have been adopted for multi-population GP. One approach, and the most straightforward, is the univariate single-trait approach in which individuals from multiple populations are pooled and treated as individuals from the same population in a training population. The underlying assumption of this approach is that the genetic correlation between the populations is equal to 1. Deviations from this assumption, for example, when distantly related populations are combined, can result in prediction accuracies that are even lower than the accuracy of within-population GP [11]. Another approach used for multi-population GP is to consider phenotypes of individuals from different populations, e.g. phenotypes from different, but correlated traits [14,15]. The advantage of this multi-trait approach is that it can consider that the genetic correlation between populations can be less than 1. In the worstcase scenario, the accuracy of multi-population GP using a multi-trait approach is expected to be the same as the accuracy of within-population GP, but not lower [16].
In both the single-trait and the multi-trait approaches for multi-population GP, DNA markers such as single nucleotide polymorphisms (SNPs) are equally weighted in the model. However, some studies have shown that accuracy of GP can be improved by prioritising certain SNPs that have a significant effect on the trait or by incorporating prior biological knowledge on SNPs in the prediction model [17][18][19]. Based on that rationale, and to improve the potential to use information from other populations, Raymond et al. [20] proposed the so-called multi-breed, multiple genomic relationship matrices (GRM) GP model (MBMG), which in this study is generalised as MPMG, given that the model can be applied in other species, e.g., plant or humans. The three key features of this model are: (1) SNPs are preselected based on prior knowledge of potential causal effects and are used to build a GRM ; (2) the remaining unselected SNPs are used to make a separate GRM , in order to explain the residual genetic variance for the trait; and (3) information of each population in the training population is weighed by their genetic correlation with the other populations and with the selection candidates. The MPMG model is equivalent to a model with a single GRM, in which different weights are applied to the two classes of SNPs. Using both real and simulated data, Raymond et al. [20] showed that the MPMG model can result in significant increases in the accuracy of GP, as compared with a multi-trait approach in which all SNPs are pooled together in a single GRM (MPSG) [20]. Given the superior performance of the MPMG model over MPSG, the objective of this study was to underpin theoretically the advantages and limits of the MPMG model as compared to the MPSG model, by deriving and validating a deterministic prediction equation for the accuracy.

Multi-population, multiple genomic relationship matrices (MPMG) model
We assume that individuals from populations A and B are combined in the training population to predict the genomic value of selection candidates from population A using the MPMG model following Raymond et al. [20]. The MPMG model is a bivariate model that considers the phenotypes of individuals from populations A and B for the same trait as those from two different, but correlated traits. The prior biological knowledge that exists about the effect sizes of the SNPs is used to pre-select important SNPs that are used to build one GRM . The remaining SNPs are used to build a second GRM to explain the residual genetic variance not explained by the preselected SNPs. Both GRM are fitted simultaneously in the bivariate model [20]. The model can be specified as: where subscripts 1 and 2 represent the first and second GRM fitted in the model and subscripts A and B represent the populations A and B . y A is a vector of phenotypes for individuals from population A and y B is a vector of phenotypes for individuals from population B , µ is the trait mean, W 1 and W 2 are incidence matrices linking phenotypes to the two estimated genomic values,EGV 1 (1) and EGV 2 , and e is the residual. Estimated genomic values are assumed to be normally distributed as: where σ 2 g A and σ 2 g B are genetic variances in populations A and B , respectively, and σ g A,B is the genetic covariance between the populations. The multi-population GRM fitted in the MPMG model are calculated according to Wientjes et al. [21].

Theory
In the following derivation, the main interest is to predict the accuracy of the estimated total genomic value of selection candidates from population A ( r EGV A T ). r EGV A T is a product of the accuracy of estimating SNP effects ( r SNP ) and the square root of the proportion of total genetic variance explained by SNPs ( ρ ) [6,22,23]. As a foundation, we will first derive an equation to predict r EGV A T for within-population GP using two GRM , made from two separate sets of SNPs, that are fitted simultaneously in a GREML model. Subsequently, we will derive the full equation to predict r EGV A T when individuals from two populations, A and B , are combined in the training population and two separate GRM are fitted simultaneously in a GREML model (MPMG). For the derivations, we will use selection index theory [24], and build upon works from Daetwyler et al. [6] and Wientjes et al. [16], who have done similar derivations for within-and multipopulation GP models in which all SNPs are equally weighted in a single GRM.

Accuracy of within-population genomic prediction with two separate GRM (WPMG)
The within-population, multiple GRM model (WPMG) can be represented as: With the WPMG model, two different sets of estimated genomic values are obtained for the validation candidates. These are EGV A 1 , due to GRM 1 , and EGV A 2 , due to GRM 2 . Both estimates of genomic values can be combined as sources of information in a selection index approach to obtain EGV A T as follows: and (3) where b A 1 and b A 1 are weighting factors for the two sources of information. In the context of selection index theory, the breeding goal ( H ) is TGV A and the index trait ( I ) is EGV A T . The optimum values of b A 1 and b A 2 can be obtained as: where P is the (co)variance matrix of information sources EGV A 1 and EGV A 2 , and g is a vector containing the covariances between information sources EGV A 1 and EGV A 2 and the true genomic value ( TGV A ). Thus: and For simplicity of the derivation, we will assume that TGV A are scaled such that they have a variance of 1. Therefore, the variance of the estimated genomic values can be calculated as the reliability ( r 2 ) of the estimated genomic values.
We assume that there is no covariance between EGV A 1 and EGV A 2 , since the expectation is zero when both GRM are jointly fitted [25]. Thus, when GRM are fitted simultaneously in a GREML model, only partial variances are explained by each of the GRM such that the sum of the variances explained by the two GRM equals the total genetic variance for the trait that can be explained by all SNPs. The partial variances explained by SNPs in each GRM can be viewed as partial regression coefficients in a multiple regression scenario.
The first element of the vector g is: Thus, var EGV A 1 = r 2 Hence, P = r 2 Therefore, The accuracy of selection index, representing the accuracy of EGV A T is the correlation between the index values and the breeding goal ( r IH ). Thus, r IH = cov IH σ I σ H . As explained in Falconer and MacKay [26], the selection index is constructed such that one unit of the index is equivalent to one unit of the breeding goal. In other words, the selection index is constructed such that the regression of the breeding goal on the index ( b HI ) is 1, resulting in the expression cov IH = σ 2 I . Thus, The accuracies of estimated genomic values r EGV A 1 and r EGV A 2 can be calculated as ρ A 1 r SNP A 1 and ρ A 2 r SNP A 2 respectively, where r SNP A 1 and r SNP A 2 are the accuracies of estimated SNP effects in population A for SNPs in GRM 1 and GRM 2 respectively, ρ A 1 and ρ A 2 are the square root of the proportions of genetic variance explained in the validation population A by SNPs in GRM 1 and GRM 2 , respectively. The accuracies of estimated SNP effects (9) where h 2 A is the heritability of the trait in population A , N A is the number of individuals from population A in the training population, M e A 1 and M e A 2 are the effective number of chromosome segments segregating in population A , based on variation in GRM 1 and GRM 2 , respectively.
The M e values represent the effective number of effects that are estimated in the model. The values for M e can be calculated as the inverse of the variance of within-population GRM [27,28].

Accuracy of multi-population GP with two separate GRM (MPMG model)
In the case of multi-population GP model with two GRM fitted simultaneously, we assume that individuals from two populations A and B are combined in a training population to estimate the total genomic values for validation candidates from population A ( EGV A T ) . To estimate the accuracy of EGV A T from this model, we estimate the accuracy of a selection index in which EGV for the selection candidates are combined from two different models having either population A or B as training population. The first model is a WPMG model using individuals from population A in the training population ( EGV A 1,A and EGV A 2 ,A ). The second model is an across-population model with two GRM using individuals from population B in the training population ( EGV A 1,B and EGV A 2 ,B ). The selection index was as follows: The (co)variance matrix of information sources is given as: (11) r SNP A 1 and r SNP A 2 can be deterministically predicted by Daetwyler's equation [6]. Hence, Eq. (10) can be written as: Again, we assume that the covariances between EGV based on GRM 1 and GRM 2 are zero. Given also that we assume a variance of 1 for TGV A : Following Eq. (9), g can be written as: The accuracy of the index, representing rĝ A T can be calculated as: With some algebra (see Appendix 2), we show that the equation g ′ P −1 g for the MPMG model can be represented as: which in matrix form can be represented as: is the covariance of the EGV A 1 calculated using SNP effects estimated in the reference populations A and B , respectively. Following the derivation in Appendix 1, we show that: The input parameters are: For predicting r EGV A T , Eq. (17) reduces to Eq. (11) for within-population GP when r g between populations A and B is 0. The values for M e AB can be calculated as the inverse of the variance of the across-population block of multi-population GRM [16,27].

Validation of prediction equations using simulations
The aim of this section was to use simulations to validate Eqs. 11 (WPMG) and 17 (MPMG) in scenarios that differed in the proportion of causal SNPs that are preselected and fitted in the models. Consequently, the scenarios differed in the proportion of total genetic variance explained by SNPs in each of the two GRM fitted simultaneously in the models. The scenarios also differed in heritability of the trait. For the MPMG model, the scenarios also differed in the level of genetic correlation between the populations A and B . Genotype data of two existing cattle populations were used in combination with simulated phenotypes. As validation, we compared the empirical accuracies in each simulated scenario to the accuracy obtained using the derived prediction equations.

Genotype data
Genotypes for SNPs on the Illumina Bovinesnp50 (Illumina Inc., San Diego, CA, USA) with 48,912 SNPs after quality control, were available on 595 New Zealand Jersey bulls and 5553 Dutch Holstein bulls. These SNPs had at least ten copies of the minor allele in a combined Dutch Holstein and New Zealand Jersey population, with a minor allele frequency (MAF) ranging from 0.009 to 0.5. Hereafter, we will refer to the New Zealand Jerseys as population A and to the Dutch Holsteins as population B.

Simulation of phenotypes
Phenotypes for all individuals were simulated using their real genotypes and assuming an additive model. From the 48,912 SNPs, 500 were randomly selected to be causal SNPs in both populations. Allele substitution effect of the causal SNPs ( a) were sampled from a bi-variate normal distribution with a mean of 0, variance of 1, and a correlation of 0.8, 0.6 and 0.4 between populations A and B . Since allele substitution effects were sampled independently from their allele frequency, the correlation between allele substitution effect represents the correlation between genomic values of individuals from populations A and B , which is referred to as the genetic correlation between populations ( r g ). Within each population, TGV for individual i were calculated as x i,j * a j , where x i,j is the genotype of individual i at causal SNP j (coded as 0, 1, 2) and a j is the allele substitution effect of causal SNP j . The corresponding phenotype was computed as TGV i + e i , where e i is the residual effect of individual i , sampled from a standard normal distribution with a mean of 0 and a variance equal to For each population, the residual effects were sampled from a separate normal distribution. The heritability ( h 2 ) was set to 0.3 and 0.8 in each population. Simulation of phenotypes was carried out in R [29] and was replicated 100 times.

Genomic prediction
The WPMG (only 476 individuals from population A in the training population) and MPMG (476 individuals from population A and 5553 individuals from population B in the training population) models were implemented in the software MTG2 [30]. We used three levels for the proportion of causal SNPs that are identified, preselected and used to create the first GRM . The number of causal SNPs underlying the simulated trait was always 500. The levels are: CSNP_125: this level represents a situation in which a quarter of the causal SNPs are identified, preselected and used to create the first GRM. CSNP_250: this level represents a situation in which half of the causal SNPs are identified, pre-selected and used to create the first GRM. CSNP_500: this level represents the extreme situation in which all 500 causal SNPs are identified, preselected and used to create the first GRM.
At all levels, the remaining SNPs that were not used as causal were used to create the second GRM . The level above were evaluated under varying degrees of genetic correlation between populations (0.4, 0.6 and 0.8) and heritability in both populations (0.3 and 0.8). Throughout the study, individuals from population A were used as the validation candidates in a fivefold cross-validation scheme, where individuals from population A were randomly split into five sets of 119 individuals each. The GRM fitted in the MPMG model were constructed according to Wientjes et al. [21], considering populationspecific allele frequencies. The empirical accuracies of prediction at all levels were computed as the correlation between the EGV and the simulated TGV of validation candidates.
To compare how accurate the accuracy of the MPMG model can be predicted compared to a MPSG model, we also fitted a multi-population, single GRM (MPSG) model, with the multi-population GRM made from non-causal SNPs in addition to either the CSNP_125, CSNP_250 or CSNP_500 SNPs. To predict the accuracy of the MPSG model, we used the derived prediction Eq. (17) in which ρ 2 A 2 is set to 0, given that the model has only one GRM . The value used for ρ 2 A 1 was empirically estimated as the ratio of empirical accuracy of the WPSG model and the predicted accuracy using the formula of Daetwyler et al. [6], assuming all the variance is captured by SNPs. Setting ρ 2 A 2 to 0 reduces Eq. (17) to the equation derived by Wientjes et al. [16].

Values of input parameters for the prediction equations
For the prediction of accuracy using Eqs. (11) and (17), we used the simulated values as input for the parameters h 2 A , h 2 B , and r g . The values used for ρ 2 A 1 were 0.25, 0.5 and 1 in the CSNP_125, CSNP_250 and CSNP_500 levels, respectively. This is because a quarter of the causal SNPs explains, on average, a quarter of the total genetic variance for the trait, which was confirmed empirically. An empirical approach was used to determine the appropriate values for the input parameter ρ 2 A 2 . This parameter represents the proportion of total genetic variance of the trait in the validation population explained by the non-causal SNPs ( GRM 2 ) in the training population. We determined empirically that the non-causal SNPs could only explain 66% of the total genetic variance of the trait in the validation individuals of population A using a within-population model with one GRM including all non-causal SNPs. We did this by taking a ratio of the empirical accuracy obtained from cross-validation and the predicted accuracy using Daetwyler's equation [6], assuming that 100% of the total genetic variance for the trait is captured by SNPs. Thus, the values for ρ 2 A 2 as used in the prediction equation were 0.66 × 0.75, 0.66x0.5 and 0, in the CSNP_125, CSNP_250 and CSNP_500 levels, respectively, with the values 0.75, 0.5 and 0 representing the proportion of total genetic variance unexplained by the causal SNPs ( GRM 1 ) in the CSNP_125, CSNP_250 and CSNP_500 levels, respectively. Throughout the study, M e within a population was calculated according to Lee et al. [28] as the inverse of the variance of the within-population GRM , while M e across populations was calculated as the inverse of the variance of the across-population block terms of the multi-population GRM [16,27].
Potential accuracies of different models in relation to different levels of r g , ρ A 1 , and ME AB 2 The aim of this section was to identify the situations under which the MPMG model will outperform all other models tested in terms of prediction accuracy. We evaluated the potential accuracy of predicting the genomic values of selection candidates from a numerically small population A under three cases. These are hypothetical cases that aim to mimic real life situations in dairy cattle breeding programs.

Case 1
For the first case, we assume that, in addition to individuals from the target population A ( N A = 476) , individuals from a larger but different population B ( N B = 5553 ) are available to be included in the training population, mimicking the real sample sizes of the Jersey and Holstein populations used in this study. We investigated the effect of the level of genetic correlation between populations on the accuracy of prediction. The following additional assumptions were made: M e within population A based on pre-selected SNPs (calculated from real genotype data) = 159; M e across populations A and B based on preselected SNPs (calculated from real genotype data) = 280; M e within population A based on remaining SNPs (calculated from real genotype data) = 463; M e across populations A and B based on remaining SNPs (calculated from real genotype data) = 32,970; heritability of the trait: 0.3 in both populations; ρ 2 due to 500 pre-selected causal SNPs = 0.4, ρ 2 due to all SNPs = 0.8 (assuming the remaining non-causal SNPs capture 66% of the residual genetic variance).

Case 2
Many genome-wide association studies have been carried out in livestock with the aim to identify the causal variants underlying complex traits. The variants that were discovered explain varying proportions of the genetic variance for the traits of interest. Here, we evaluated the potential accuracy of prediction under situations ranging from poor causal SNP discovery (discovered "causal SNPs" explain 0% of genetic variance) to accurate causal SNP discovery (discovered causal SNPs explain 100% of genetic variance for the trait). The following additional assumptions were made: M e within population A based on pre-selected SNPs (calculated from real genotype data) = 159; M e across populations A and B based on 500 pre-selected SNPs (calculated from real genotype data) = 280; M e within population A based on 48,412 remaining SNPs (calculated from real genotype data) = 463; M e across populations A and B based on 48,412 remaining SNPs (calculated from real genotype data) = 32,970; genetic correlations between populations A and B = 0.6; heritability of the trait = 0.3 in both populations; proportion of genetic variance explained by all SNPs = 1.

Case 3
In this case, our aim was to explore the impact of the closeness between the validation population and the training populations as measured by the M e across populations. In most studies, the number of variants identified as causal for complex traits is at most a few hundred. In the context of the MPMG model, the M e across populations based on the identified potential causal SNPs is usually small, with an upper bound equal to the total number of identified "causal SNPs". A parameter that is expected to be considerably large, especially with increasing SNP density is the M e based on the remaining unselected SNPs. Here, we evaluated the effect of M e across populations based on the remaining unselected SNPs on the potential accuracy of prediction. We varied M e across populations from 1000 to 50,000. The following additional assumptions were made: M e within 0 to 90%. The following additional assumptions were made: M e across populations A and B based on 500 preselected SNPs (calculated from real genotype data) = 280; M e across populations A and B based on 48,412 remaining SNPs (calculated from real genotype data) = 32,970; genetic correlations between populations A and B : = 0.6; heritability of the trait = 0.3 in both populations; proportion of genetic variance explained by SNPs in In all four cases, we evaluated the potential accuracy of three models using their prediction equations as follows.
Within-population, single-GRM (WPSG) model: To predict the potential accuracy of this model, we used the formula of Daetwyler et al. [6], which takes the proportion of genetic variance explained by all SNPs ( . The value for M e A 1 was calculated based on all SNPs. Within population, multiple GRM (WPMG) model: to predict the potential accuracy of this model, we used the derived prediction Eq. (12).
Multi-population, single GRM (MPSG) model: to predict the potential accuracy of this model, we used the formula of Wientjes et al. [16]: Here also, M e A and M e AB were calculated based on all SNPs.

Number of independent chromosome segments ( M e ) within and across populations
The number of independent chromosome segments per SNP set estimated within population A (595 New Zealand Jersey) and across populations A and B (5553 Dutch Holsteins) are in Table 1.
The same value of M e within population A (280) was obtained when all 48,912 SNPs and when only the noncausal SNPs were used to construct the GRM . Estimated M e within population A differed markedly between SNP sets only when the number of SNPs used to calculate the GRM was small. At lower SNP densities, M e across population A and B were close to the number of SNPs used to calculate the GRM . M e across populations obtained with higher density SNPs were similar.
population A based on 500 pre-selected SNPs (calculated from real genotype data) = 159; M e across populations A and B based on 500 pre-selected SNPs (calculated from real genotype data) = 280; M e within population A based on 48,412 remaining SNPs (calculated from real genotype data) = 463; genetic correlations between populations A and B = 0.6; heritability of the trait = 0.3 in both populations; ρ 2 due to 500 pre-selected causal SNPs = 0.4, ρ 2 due to all SNPs = 0.8 (assuming the remaining non-causal SNPs capture 66% of the residual genetic variance).

Case 4
The study of Van den Berg et al.  Predicted -CSNPs_250 E mpirical -CSNPs_250 Predicted -CSNPs_500 E mpirical -CSNPs_500 As expected, given the large number of M e across populations, empirical accuracies were not significantly different between scenarios differing in r g and percentage of causal SNPs included in the GRM . The standard errors of empirical accuracies were higher at low heritability (Fig. 2), than at high heritability (Fig. 3), which most likely reflects the higher level of noise in the phenotype at low heritability than at a high heritability. As expected, an increase in heritability resulted in an increase in empirical accuracies. In general, the prediction equation for the accuracy of the MPSG model resulted in an over-prediction of empirical accuracy ranging from ~ 5 to 10%.

Empirical versus predicted accuracies of the multi-population, multiple GRM (MPMG) model
The empirical and predicted accuracies of the different levels of pre-selection of causal SNPs, with a simulated genetic correlation between populations A and B of 0.4, 0.6 and 0.8 and a heritability of 0.3 in both populations, are in Fig. 4.
Results show that empirical and predicted accuracies increase with an increasing number of pre-selected causal SNPs in the first GRM , and the level of genetic correlation between populations. Using the prediction equation (Eq. 17), predicted accuracies were less than one standard error away from the average empirical accuracy in seven of the nine scenarios evaluated. We observed overprediction of accuracies in all the scenarios, but only in the CSNP_125 and the CSNP_250 and with a genetic correlation between populations of 0.4 did the predicted accuracies go outside the standard error of the empirical accuracies.
For a higher heritability trait (0.8), similar patterns of results were observed (Fig. 5). Empirical and predicted accuracies increase with increasing number of pre-selected causal SNPs in the first GRM , and the level of genetic correlation between populations. We also observed slight over-predictions of accuracy in the CSNP_125 and CSNP_250 levels, across the three levels of genetic correlation between populations. However, for the CSNP_500 level, we observed a slight under-prediction of accuracies, across the three levels of genetic correlation between populations. The level of over-estimation of empirical accuracy seems to be consistently higher at low heritability (Fig. 4) than at high heritability (Fig. 5). This also the case for the WPMG model ( Fig. 1) and for most scenarios of the MPSG model ( Figs. 2 and 3). A possible explanation might be that for a lower heritability trait, the SNPs have more difficulty in explaining all the genetic variance. This can result from a larger environmental effect in the phenotypes, which can be considered as a noise term in the phenotype around the genetic effect. Furthermore, the level of over-estimation of empirical accuracies with the MPMG model (between ~ 2 and ~ 10%, Figs. 4 and 5) is within the range of over-estimation observed with the MPSG model (between ~ 5 and 10%, Figs. 2 and 3). Hence, the relative advantage of the MPMG model over the MPSG model as assessed by the corresponding deterministic prediction equations should be a good indication of the true advantage of the MPMG over the MPSG model. In general, we observed a positive correlation between the EGV from the two GRM in MPMG, albeit with high standard errors, except in the CSNP_500 for which all the QTL underlying the trait are in one GRM , where the correlation was around zero (see Additional file 1: Table S1).

Potential accuracies of different models in relation to different levels of r g , ρ A 1 , and M e AB 2
The potential accuracies of within-and multi-population GP models, with either single or multiple GRM fitted, in relation to different levels of genetic correlation ( r g ; case 1) between populations are presented in Fig. 6. The accuracy of the within-population, single GRM (WPSG) model, is not affected by r g between populations. The result shows that a 9.6% increase in accuracy is possible by splitting SNPs into separate GRM based on prior information on their causality (WPMG model), when the preselected SNPs explain 40% of the total genetic variance for the trait. Compared to WPSG, the multi-population, single GRM (MPSG) model can result in a small increase in the accuracy of prediction, ranging from 0% ( r g = 0) to a maximum of 3.1% ( r g = 1). When the multipopulation, multiple GRM (MPMG) model is implemented, the increase in accuracy as compared to the WPSG model ranges from 9.6% ( r g = 0) to 32% ( r g = 1),  again, assuming that the preselected SNPs explain 40% of the genetic variance. The potential accuracies of the different models in relation to different proportions of total genetic variance explained by the pre-selected SNP set ( ρ A 1 ; case 2) are presented in Fig. 7. When the pre-selected SNP set explains zero proportion of genetic variance, the accuracy is equal for the models with one or two GRM for both within-and multi-population GP. Simply implementing the multi-population single GRM model, instead of the within-population single GRM model results in a negligible (< 1%) increase in the accuracy of prediction. An increasing proportion of genetic variance explained by the pre-selected SNP set results in a linear increase in the accuracy of prediction, up to a maximum of ~ 18.4% when the WPMG model is implemented instead of the WPSG model. Increase in accuracy, as compared to the WPSG model, reaches a maximum of ~ 29.2% when the MPMG model is implemented instead.
We investigated further the impact of the number of independent chromosome segments between populations based on the non-causal SNPs ( M e AB ; case 3) on the accuracy of GP models, the results are presented in Fig. 8. For the WPSG and WPMG models, the parameter M e AB has no relevance. However, in both the MPSG and MPMG models, the accuracy of prediction decreases with increasing M e AB . The rate of decrease in accuracy with increasing M e AB , however, is higher in the MPSG model than in the MPMG model. For example, an increase in M e AB from 1000 to 20,000 resulted in an 11.3% decrease in accuracy of the MPSG model, while this decrease was only 4.8% in the MPMG model. In general, the difference in accuracy between the MPSG and MPMG models increases with increasing M e AB . For M e AB values smaller than 5000, both multi-population models (MPSG and MPMG) have higher accuracies than any within-population GP model. For larger values of M e AB , however, the accuracy of the MPSG model is lower than that of the WBMG model, while the accuracy of the MPMG model tends to flatten above that of the WPMG model. Empirical -CSNPs_250 Predicted -CSNPs_500 Empirical -CSNPs_500 Furthermore, we assessed the potential bias in the predicted accuracies due to the under-estimation of M e within the predicted population A . Across different GP models, an underestimation of M e within population A of ~ 20% resulted in an inflation of predicted accuracies ranging between 4.5 (MPMG model) to 7.4% (WPSG model). At level of under-estimation of 90%, inflation of predicted accuracies ranged from 36 (MPMG) to 58% (WPSG)".

Discussion
The objective of this study was to underpin theoretically the advantages and limits of the multi-population, multiple GRM (MPMG) genomic prediction model over the multi-population, single GRM (MPSG) genomic prediction (GP) model, by deriving and validating a deterministic prediction equation for its accuracy. We derived the deterministic prediction equation for the accuracy of the MPMG model using selection index theory and building upon previous works by Daetwyler et al. [6] and Wientjes et al. [16]. We showed that, the derived equation can predict the accuracy of the MPMG model under varying levels of genetic correlation between the target population and the additional population in the training population, varying levels of the heritability of the trait and varying levels of the proportions of genetic variance explained by the pre-selected and differentially weighted SNPs. The equation can be used to assess the potential benefit of combining information from different populations, e.g., different breeds or lines for GP in livestock or plants, or different groups of people based on their ethnic background for the prediction of disease risk scores.
To date, in the literature, increase in the accuracy of GP in a multi-population, single-GRM (MPSG) context as compared to within-population GP has been limited as illustrated in Table 2 of the review paper of Lund et al. [32]. This is consistent with our result shown in Fig. 7, where less than 1% increase in accuracy is projected if the MPSG model is implemented, assuming a genetic correlation of 0.6 between populations, instead of a Predicted -CSNPs_250 Empirical -CSNPs_250 Predicted -CSNPs_500 Empirical -CSNPs_500 within-population, single-GRM model. However, Raymond et al. [20] showed that the MPMG model, which differentially weights SNPs based on prior knowledge of potential causality, can yield significant increases in the accuracy of GP as compared to a within-population or multi-population model in which all markers are equally weighted. The prediction equation developed in this study, highlights two parameters: number of independent chromosome segments across populations A and ( M e AB ), and the proportion of genetic variance explained by preselected SNPs ( ρ 2 A 1 ), that may underlie the improved performance of the MPMG model as compared to a single GRM model. These parameters and their estimation are discussed below. We also discuss the values for the genetic correlation between populations as used in the prediction equation.
The number of independent chromosome segments across populations ( M e AB ) is an important parameter that influences the accuracy of multi-population GP, since it determines the effective number of effects that are estimated in the model [22,33]. For example, Wientjes et al. [16] showed that when M e AB is large, combining populations together in a multi-population single GRM model is less likely to result in a significant increase in accuracy as compared to single-population GP model. However, the prediction equation developed here shows that the MPMG model is still able to take advantage of information from distantly related populations (large M e AB ), mainly by partitioning the parameter M e AB into two components, corresponding to pre-selected and remaining SNPs, respectively. In most cases, the value of M e AB due to the pre-selected SNPs is small, given that, in most cases, only a few hundred SNPs are pre-selected from e.g., a genome-wide association study. The small value for M e AB due to pre-selected SNPs means that the accuracy contributed by the pre-selected SNPs is high, and completely unaffected by the value for M e AB due to the remaining SNPs. For distantly related populations such as Holstein and Jersey, a good estimate for M e AB based on a few pre-selected SNPs is the number of pre-selected SNPs (Table 1), which in any case is the maximum possible value. For M e AB based on the remaining SNPs, it is very difficult to get an estimate without genotype data, as this value depends on the relatedness between populations. Similar to Wientjes et al. [16], our suggestion is to consider genotyping a sample of individuals from each of the populations, e.g. 100 each, from which M e AB can be estimated. With the availability of genotype data on individuals from the target population, the number of independent chromosome segments within population ( M e A ) can be estimated empirically as the reciprocal of the variance of within-population genomic relationships [27,28,34]. It is also possible to estimate M e A based on population parameters such as the effective population size [22,35,36]. However, this approach cannot be used to partition M e A into values corresponding to the pre-selected and remaining SNPs, respectively, which are required for the prediction equation.
All the M e values used in our prediction equation were estimated from the GRM . Van den Berg et al. [31] argued that M e values estimated from the GRM are under-estimated, and, when used in a prediction equation, result in the over-prediction of empirical accuracy. We observed that an underestimation of M e within the target population A results in a substantial inflation of predicted accuracies (Fig. 9). The prediction equation developed in our study, with M e values calculated from the GRM tend to over-predict the accuracy of the MPMG model, although in most of the scenarios evaluated, predicted accuracies were still within the standard errors of empirical accuracies. However, the extent of over-prediction of accuracy in van den Berg et al. [31], using the deterministic formula of Goddard et al. [22] with M e values estimated from the GRM , was much higher than in our study. We cannot pinpoint, with certainty, the underlying reasons for the differences in the extent of over-prediction of [31] fitted a polygenic component (pedigree-based relationship matrix) in their model, which was meant to capture the genetic variance for the trait that is not picked up by SNPs. Although the authors did not present the results of variance components estimation, it is likely that the polygenic component picked up some proportion of the total genetic variance for the trait. If that is the case, using a prediction equation with the assumption that SNPs explain 100% of the total genetic variance for the trait, as the authors did, could have resulted in the over-prediction of empirical accuracies. The chances that the polygenic component picks up some proportion of the total genetic variance for the trait are higher when QTL are sampled from sequence variants that have low MAF, and effects that are difficult to estimate or regressed heavily towards zero in the model, than when QTL are sampled from common SNPs [16]. Furthermore, in their prediction equation, van den Berg et al.
[31] corrected for the reduction in prediction error variance as the accuracy of predicted genomic values increases. This correction results in higher predicted accuracies than when the correction is not applied. In our study, however, this correction was not applied. There are other factors, such as the difference in the design of cross-validation schemes to calculate empirical accuracies and structure of the populations analyzed that could potentially underlie the difference in the extent of over-prediction of empirical accuracies between the two studies. The ability of the MPMG model to take advantage of the small value of M e AB due to pre-selected SNPs depends on the accuracy of SNP pre-selection, which in turn determines the proportion of genetic variance explained by the pre-selected SNPs. Our results (Fig. 7) show that for an improvement in accuracy of GP by using the MPMG instead of the MPSG model, it is not sufficient to split randomly SNPs into two classes without accurate prior knowledge on potential causality of the SNPs. Instead, the pre-selected SNPs must explain some proportion of the genetic variance for the trait. Our results agree with those of Sarup et al. [33], who showed that a significant improvement in accuracy by using the "genomic feature" model over the standard GBLUP could be achieved provided the prioritised variants explained more than 10% of the total genetic variance for the trait.
In their study, Sarup et al. [33] fitted a linear mixed model including two random genomic effects, with the genetic effects estimated from "genomic feature" or prioritized variants and the remaining variants, respectively. In general, when the pre-selected SNPs explain zero proportion of genetic variance, the accuracy of the MPMG model is expected to be the same as the accuracy of the MPSG model (Fig. 7). Furthermore, if the prior information is accurate, such that the pre-selected SNPs explain some proportion of the genetic variance for the trait, fitting the pre-selected SNPs in a separate GRM in a within-population GP scenario may be more beneficial than combining distantly related populations in a multi-population single-GRM model (Fig. 6).
In this study, the proportions of total genetic variance explained in the validation population by the SNPs in the first GRM ( ρ 2 A 1 ) was determined in a simulation context. There was 100% certainty that the selected SNPs were causal, and that ρ 2 A 1 did not depend on linkage disequilibrium (LD) between non-causal SNPs and the underlying causal SNPs. In this case, we found that the estimate of genomic heritability (proportion of the genetic variance explained by the SNPs) from the GREML model [37,38] was equivalent to ρ 2 A 1 (results not shown). de los Campos et al. [39] also showed that when all causal mutations are included in the GREML analysis, the genomic heritability parameter is equivalent to the proportion of genetic variance explained by SNPs ( ρ 2 ) . However, in practice, the causal mutations that underlie a trait are not always observed, and pre-selected SNPs will usually come from GWAS studies, in which SNPs can show an association with a trait due to LD with unobserved causal mutations. When SNPs that are proxies for causal mutations are used for the analysis, the variance components estimates depend on the effects of the unobserved causal mutations, the extent of LD between SNPs and the causal mutations, . This means that the estimate of genomic heritability from the GREML model cannot be considered to be equivalent to ρ 2 . Browning et al. [40] also showed that population structure can inflate SNP-based heritability estimates. Although genomic heritability from GREML models are higher than ρ 2 , they are still a good indication for ρ 2 , as they have similar trend [41]. Thus, for comparison of the proportion of genetic variance explained by SNPs in different SNP sets, genomic heritability from a GREML model can be used.
In practice, when two GRM are fitted simultaneously in a bivariate GREML model, two estimates of genetic correlation are obtained, one for each GRM . However, in this study, we used only the simulated value for the genetic correlation between populations for both GRM in the prediction Eq. (17). This is because the pre-selected SNPs were randomly sampled from all causal SNPs. In practice, a general expectation is that causal SNPs that are pre-selected from e.g. GWAS have a higher effect on the trait than the remaining unselected SNPs, and are most likely to be more consistent across populations [20,42]. Therefore, the genetic correlation of the preselected SNPs can be higher than that for the remaining SNPs. In the cases that this expectation holds, it is inappropriate to use a single value of genetic correlation for both GRM in the prediction Eq. (17). A solution to this issue is to use the estimated genetic correlation. However, one must take into account that the estimated genetic correlation can be biased when the causal and non-causal SNPs used to estimate the genetic correlation do not have similar properties, e.g., similar pattern of allele frequencies [43]. In general, it is likely that causal SNPs have lower MAF than non-causal SNPs, which means that estimates of genetic correlation obtained by non-causal SNPs can be underestimating the genetic correlation between populations at the causal SNPs [20].
In the derivation of the predicting Eq. (17), we assumed that there is no covariance between EGV from different GRM , since the expectation is zero when the GRM are jointly fitted [25]. Because a negative sampling covariance might appear when effects cannot be estimated independently, and the covariance might bias predicted accuracies using the derived equation, we tested this assumption. We calculated the correlations between EGV A 1 and EGV A 2 in each replicate and averaged these across replicates. In general, we observed a positive correlation between the EGV from the two GRM in MPMG, albeit with high standard errors, except in the CSNP_500 in which case all the causal SNPs underlying the trait are in one GRM , where the correlation was around 0 (see Additional file 1: Table S1). However, in most cases the correlations were not significantly different from 0. The positive correlation between the EGV mean that individuals that have a high EGV based on the pre-selected SNPs ( GRM 1 ), on average also have a high EGV based on the rest of the genome ( GRM 2 ). To check if the observed correlation between EGV is an artefact of our simulation, we summed the effects of all causal SNPs in GRM 1 as TGV 1 and summed the effect of all causal SNPs not in GRM 2 as TGV 2 . Across 100 replicates, we observed no correlation between the TGV : -0.007 (0.07) for CSNP_125 and 0.01 (0.07) for CSNP_250. The observed correlation between EGV do not seem to be a result of sampling covariance, since the expected sign of the resulting correlation is negative, and are not an artefact of our simulation, given that the TGV based on the separate sets of causal SNPs were not correlated. We think that the empirical correlation between EGV are a result of an estimation issue, but we are not sure. In any case, given that the level of overestimation of empirical accuracy of the MPMG model is in the range of overestimation observed in a model with only one GRM (MPSG), the impact of any possible covariance between the EGV on the predicted accuracy of the MPMG model is expected to be small.

Conclusions
In this paper, we presented a deterministic prediction equation for the accuracy of a multi-population, multiple GRM (MPMG) model, which gives insight into the underlying reasons for the superior performance of the MPMG model over the multi-population, single GRM (MPSG) model. With the help of the prediction equation, we showed that an important advantage of the MPMG model is its ability to benefit from the small number of independent chromosome segments ( M e ) due to the preselected SNPs, both within and across populations, while for the MPSG model, there is only a single value for M e , calculated based on all SNPs. However, this advantage depends on the condition that the pre-selected SNPs can explain some proportion of the total genetic variance for the trait. The prediction equation developed here can be used as a deterministic tool to assess the potential benefit of combining information from different populations e.g., different breeds or lines for GP in livestock or plants, or different groups of people based on their ethnic background for genomic prediction in humans. This is especially the case, when accurate biological information about the causality of the SNPs is available.

Appendix 1
where X A i,j is the genotype of validation candidate i from population A at locus j , β A 1j and β B 1j are the estimated SNP effects for locus j in training populations A and B , respectively, and r g is the genetic correlation between populations A and B.

Appendix 2
Matrix P and vector g are shown below: Since P is a block diagonal matrix, it can be split into 2x2 sub-matrices for ease of inversion: Then, P = P 11 0 0 P 22 and P −1 = P −1 11 0 0 P −1

22
. The inverse of the two non-zero sub-matrices of P can be calculated as: Similarly: Simplifying the equation above results in: The reliability of estimated genomic values is a product of the reliability of the estimated SNP effects ( r 2 SNP ) and the proportion of genetic variance explained in the  validation population by the SNPs ( ρ 2 ) . Ignoring ρ 2 , for now, and replacing r 2 SNP with the deterministic equations to predict accuracy of GP [6,27]: then Eq. (21) becomes: Taking into account the proportion of genetic variance explained by SNPs in GRM 1 ( ρ 2 A 1 ) and GRM 2 ( ρ 2 A 2 ), respectively, Eq. (23) becomes: Received: 11 September 2019 Accepted: 14 April 2020