Predictive performances of animal models using different multibreed relationship matrices in systems with rotational crossbreeding

Background In livestock breeding, selection for some traits can be improved with direct selection for crossbred performance. However, genetic analyses with phenotypes from crossbred animals require methods for multibreed relationship matrices; especially when some animals are rotationally crossbred. Multiple methods for multibreed relationship matrices exist, but there is a lack of knowledge on how these methods compare for prediction of breeding values with phenotypes from rotationally crossbred animals. Therefore, the objective of this study was to compare models that use different multibreed relationship matrices in terms of ability to predict accurate and unbiased breeding values with phenotypes from two-way rotationally crossbred animals. Methods We compared four methods for multibreed relationship matrices: numerator relationship matrices (NRM), García-Cortés and Toro’s partial relationship matrices (GT), Strandén and Mäntysaari’s approximation to the GT method (SM), and one NRM with metafounders (MF). The methods were compared using simulated data. We simulated two phenotypes; one with and one without dominance effects. Only crossbred animals were phenotyped and only purebred animals were genotyped. Results The MF and GT methods were the most accurate and least biased methods for prediction of breeding values in rotationally crossbred animals. Without genomic information, all methods were almost equally accurate for prediction of breeding values in purebred animals; however, with genomic information, the MF and GT methods were the most accurate. The GT, MF, and SM methods were the least biased methods for prediction of breeding values in purebred animals. Conclusions For prediction of breeding values with phenotypes from rotationally crossbred animals, models using the MF method or the GT method were generally more accurate and less biased than models using the SM method or the NRM method.

are the two-way terminal, three-way terminal, and twoway rotational crossbreeding procedures. Among these crossbreeding systems, genetic analysis is most complicated with phenotypes from rotationally crossbred animals [4][5][6]. Nevertheless, rotationally crossbred animals comprise a possible source of both additional and novel phenotypes. Therefore, in the following, we will focus on genetic analyses with phenotypes from rotationally crossbred animals.
Phenotypes from rotationally crossbred animals are often subject to more variable genetic effects than phenotypes from purebred animals and F1 animals [4,7]. Mating animals from different populations often leads to offspring with a high degree of heterozygosity. For dominance effects, the increase in heterozygosity results in a favorable change in the phenotypic mean and increased dominance variance in subsequent generations [7]. For additive genetic effects, the increase in heterozygosity increases the additive genetic variance in following generations [4]. All the aforementioned changes are relative to the average of the genetic parameters in the constituting purebred populations. Animal breeding focuses mainly on additive genetic effects, which are modelled using additive genetic relationship matrices. Since the usual numerator relationship matrix (NRM) [8] can not correctly model additive genetic effects in rotationally crossbred animals [4], specialized additive relationship matrices are needed.
Specialized additive genetic relationship matrices for crossbred animals exist [4,[9][10][11]. These relationship matrices decompose the additive genetic relationships into a breed-specific term for each breed and a segregation term for each pair of breeds. The partial relationship matrices for the breed-specific terms are analogous to NRM-based matrices and they refer to the additive genetic variances in the purebred base populations. Meanwhile, the partial relationship matrices for the segregation terms model the increased additive genetic variances in crossbred animals. Both types of partial relationship matrices have later been approximated [12] and the theory for the partial relationship matrices for breedspecific terms has been extended to incorporate genomic information [13]. The additive relationship matrix with metafounders was proposed by Legarra et al. [14], and it is an alternative to the partial relationship matrices mentioned above. In theory, the relationship matrix with metafounders simultaneously models both breed-specific and segregation terms with one additive genetic effect [14]. There is a need to investigate how models with these relationship matrices compare for prediction of accurate and unbiased breeding values with phenotypes from rotationally crossbred animals.
The objective of this study was to compare methods for relationship matrices in terms of ability to predict accurate and unbiased breeding values with phenotypes from rotationally crossbred animals. We compared the NRM as used by Poulsen et al. [15], the partial relationship matrices by García-Cortés and Toro [9], the approximate partial relationship matrices by Strandén and Mäntysaari [12], and the relationship matrix with metafounders by Legarra et al. [14].
We hypothesized that the methods by García-Cortés and Toro [9] and Legarra et al. [14] were the most accurate and least biased methods because they are the only methods which fully comply with the theory [4].

Methods
The prediction accuracies and prediction biases of the models with different relationship matrices were investigated through a simulation study. The simulation design represents a two-way-rotational crossbreeding system [3]. In this section, we first present how the populations were simulated. This includes the description of their population structure, genomic architecture, genetic effects, and phenotypes. Then, we present how we predicted breeding values with phenotypes from rotationally crossbred animals using statistical models with different relationship matrices. Lastly, we present how we evaluated and compared the statistical models with different relationship matrices. In the following, we refer to the statistical models with different relationship matrices as methods.

Simulation General
A two-way-rotational crossbreeding system and genomic architecture were simulated with the QMSim software [16]. For all populations, generations did not overlap, the numbers of males and females were equal, sires and dams were chosen at random (no selection), mating was random and sampled without replacement, and the litter size was 6. We simulated 100 replicates. The population structures are shown in Fig. 1.

Historical population
The first generation in the historical population consisted of 3000 animals. The population size was constant for 1000 generations, and over the following 200 generations the population size decreased linearly to 2800 animals at the end.

Purebred populations
We created two purebred populations. Each purebred population was founded by 25 males and 25 females from the last generation in the historical population. The Red: Pedigree information and phenotypes. Green: Pedigree information and genomic information sampling of founders was random and independent for the two purebred populations. The purebred populations were kept separate for 39 generations. At each generation, 25 randomly selected sires were mated with 25 randomly selected dams; i.e., the effective population sizes were approximately 50 [17], and not all animals produced offspring. In the following, the two purebred populations are referred to as Population A and Population B.

Crossbred population
The first crossbred generation was founded by mating 75 males from Population A and 75 females from Population B. These animals were drawn from generation 32 of their respective population. The first generation in the crossbred population is referred to as generation 33. For generations 34 to 39, crossbred animals were created by mating 75 males from one of the purebred populations with 150 females from the crossbred population; i.e., for these generations, each purebred sire was mated with two crossbred dams. Sires were from Population A in odd-numbered generations and Population B in evennumbered generations. In the following, the crossbred population is referred to as Population C.

Genomic architecture
The genome consisted of five 100-cM chromosomes. Each chromosome contained 3500 markers and 350 quantitative trait loci (QTL). Marker positions, QTL positions, and allele frequencies were randomly and uniformly distributed. Marker and QTL genotypes were initialized in the first generation of the historical population.
On average, 12,104 of the 17,500 markers segregated with a minor allele frequency (MAF) higher than 0.01 in generation 32 of either purebred population. Meanwhile, 13,769 markers segregated with a MAF higher than 0.01 when marker genotypes were pooled across the purebred populations. Similarly, 1223 of the 1750 QTL segregated in generation 32 of either purebred population and 1394 QTL segregated when QTL genotypes were pooled across the purebred populations.

Genetic effects
We simulated both additive and dominant QTL effects. Additive and dominant QTL effects were identical across populations.
The additive genetic animal effects were solely based on additive QTL effects. The absolute additive QTL effects were drawn from a gamma-distribution with the standard parameters in QMSim [16]. Additive QTL effects were scaled by QMSim such that the additive genetic animal variance was 0.2 after the historical population [16].
The dominant QTL effects, were simulated as described by Wellmann and Bennewitz [18]: where d Q is a vector of dominant QTL effects; h ∼ N ( 1 2 1, 1 10 I) is a vector of dominance degrees; • is the Hadamard product; β 1 is a vector of additive QTL effects of the first QTL-allele; and β 2 is a vector of additive QTL effects of the second QTL-allele. Dominant genetic animal effects, d, were calculated as the sum of dominant QTL effects where the animal was heterozygous. Dominant genetic animal effects were scaled such that the dominant genetic animal variance was 0.1 in Population C. On average, 8% of the loci showed overdominance; 45% showed partial dominance that was greater than half the allele substitution effect; and 46% showed partial dominance that was less than half the allele substitution effect. (1)

Phenotypes
We defined two phenotypes: a phenotype without dominance effects, y A = a + e , and a phenotype with dominance effects, y AD = a + d + e , where a is a vector of additive genetic animal effects, d is a vector of dominant genetic animal effects, e is a vector of environmental effects, and e ∼ N (0, 0.8I) . Note that y A and y AD have different narrow-sense heritabilities: h 2 A = 0.2 and h 2 AD = 0.2/1.1.

Information used for prediction
We used the information such as to represent a system where only crossbred animals were phenotyped and only the purebred animals were genotyped (Fig. 1). It is common practice to not genotype crossbred animals because it is more important to genotype selection candidates than phenotyped animals [19]. More specifically, pedigree information was kept only for animals born in generations 32 through 39. Animals born before generation 32 were regarded as unknown. Marker information was only available for purebred animals born in generations 35 through 39. Phenotypes were only available for crossbred animals born in generations 33 through 39.

Prediction General
We compared four methods for multibreed relationship matrices, i.e., the NRM [8]; García-Cortés and Toro (GT) [9]; Strandén and Mäntysaari (SM) [12]; and Legarra et al. (MF) [14]. All four methods can be extended to include genomic information using the single-step procedure [13,20,21]. For each method, we describe the theory, pedigree(s), incorporation of genomic information, the statistical model, and calculation of predicted breeding values. In Appendix 1, each method is showcased with a small example pedigree. We highly recommend readers who are unfamiliar with the methods to view Appendix 1 after reading their respective sections in Methods.

The NRM method
This method can be used for multibreed analyses in multiple ways. We use the NRM method such that we have one relationship matrix per breed. This allows us to partition the breeding values of crossbred animals into one term per purebred population. Furthermore, it allows the additive genetic variances to differ between purebred populations. In this study, the NRM method required two relationship matrices; one for terms contributed from Population A, and one for terms contributed from population B.
The recursive algorithm for each of the NRM matrices is: where i and j denote animals, a ij is the pedigree-based covariance between the additive genetic effects of animals i and j, s is the sire of j , and d is the dam of j [8].
The pedigrees for the two relationship matrices were different. The pedigree for Population A included animals from both Population A and Population C, and the pedigree for Population B included animals from both Population B and Population C. To create the pedigree for Population A, animals from Population B were removed from the pedigree and vice versa.
We used two genomic relationship matrices for the NRM method; G NRM , were calculated using VanRaden's first method [22], genotypes from purebred animals in generations 35 to 39, and marker allele frequencies in the respective purebred base-populations. When calculating G VanRaden A and G VanRaden B , a marker was included if its minor allele frequency was higher than 0.01 in its respective purebred base-population. The positive definiteness of genomic relationship matrices was ensured by using the weighted average between VanRaden's first method and the sub-matrix of genotyped animals from its respective pedigree-based relationship matrix: G NRM , were scaled and centered such that their average diagonal and off-diagonal elements were equal to those of the sub-matrices of genotyped animals from their respective pedigree-based relationship matrices. We calculated combined relationship matrices for genotyped and non-genotyped animals [20,21] because some animals were not genotyped. The combined relationship matrices for genotyped and nongenotyped animals were H NRM The statistical model for the NRM method was: where y is a vector of phenotypes; b is a vector of parameters for the general mean, pedigree-derived breed proportion, and pedigree-derived heterosis; a A is a vector of additive genetic effects from Population A; a B is a vector of additive genetic effects from Population B; e is a vector of residuals; and X , Z A , and Z B are design matrices. The three vectors with random effects (a A , a B , and e ) were assumed to be distributed as: Poulsen et al. Genetics Selection Evolution (2022)  where â A and â B are the vectors of predicted additive genetic effects in the statistical model for the NRM method (Eq. 4); subscript P denotes that the sub-vector only contains predicted effects from purebred animals; subscript C denotes that the sub-vector only contains predicted effects from crossbred animals; and 0s are vectors of zeros.

The GT method
This method partitions the additive genetic relationship into several partial relationship matrices [9]: one for each breed (partial relationship matrices for breed-specific terms; A GT A and A GT B in our study), and one for each pair of breeds (partial relationship matrices for segregation terms; A GT AB in our study). The partial relationship matrix for segregation terms captures the increase in additive genetic variance in crossbred animals [4,9].
The recursive algorithm for calculating A GT A is [9]: where i , j , s , and d are as described for the algorithm for the NRM method (Eq. 2); a ij is the pedigree-based covariance between the breed-specific partial additive genetic effects of animals i and j; and f A i is the proportion of genetic material from Population A in animal i . The sub-matrix of A GT A for purebred animals is identical to its analogous sub-matrix of A NRM A . The recursive algorithm for A GT AB is: where i , j , s , and d are as for the algorithm for the NRM method (Eq. 2); a ij is the pedigree-based covariance between the additive genetic segregation effects of animals i and j; f A s and f B s are the proportions of genetic material from Population A and Population B, respectively, in the sire of animal j ; and f A d and f B d are the proportions of genetic material from Population A and Population B, respectively, in the dam of animal j . Both diagonal and off-diagonal elements in A GT AB can only be non-zero for descendants of crossbred animals.
The pedigree for the GT method included all the animals, purebred and crossbred, in generations 32 through 39.
We used two genomic relationship matrices for the GT method; G GT A and G GT B . Generally, the single-step procedure for the GT method requires that marker alleles are phased and traced such that their breed of origin can be determined [13]; however, tracing the breed of origin of alleles was not required in this study because we only used genotypes from purebred animals. Therefore, G GT A and G GT B were the same as the genomic relationship matrices for the NRM method; i.e., The single-step procedure [20,21] was used for the partial relationship matrices for breed-specific terms. The combined partial relationship matrices for breed-specific terms for genotyped and non-genotyped animals were H GT A and H GT B for animals with genetic contributions from Populations A and B, respectively. The partial relationship matrix for the segregation term did not include genomic information.
The statistical model for the GT method was: where y, b, and X are as described for the statistical model for the NRM method (Eq. 4); a A is a vector of breed-specific partial additive genetic effects from Population A; a B is a vector of breed-specific partial additive genetic effects from Population B; a AB is a vector of additive genetic segregation effects between Populations A and B; e is a vector of residuals; and Z A , Z B , and Z AB are design matrices. The four vectors of random effects (a A , a B , a AB and e ) were assumed to be distributed as: where A GT A is a partial relationship matrix for the breed-specific term from Population A; A GT B is a partial relationship matrix for the breed-specific term from Population B; A GT AB is a partial relationship matrix for the segregation term between Populations A and B; σ 2 A A is the additive genetic variance in Population A; σ 2 A B is the additive genetic variance in Population B; σ 2 A AB is the segregation variance between Populations A and B; 0s are vectors or matrices of zeros; I is an identity matrix; and σ 2 e is the residual variance. For prediction with genomic information, A GT A was replaced with H GT A and A GT B was replaced with H GT B . The vector of predicted breeding values for the GT method was: where â A , â B , â AB are the vectors of predicted partial additive genetic effects in the statistical model for the GT method (Eq. 8); subscript P denotes that the sub-vector only contains the predicted effects from purebred animals; subscript C:F1 denotes that the sub-vector only contains the predicted effects from F1 crossbred animals; subscript C:R denotes that the sub-vector only contains the predicted effects from rotationally crossbred animals; and 0s are vectors of zeros.

The SM method
This method is an approximation of the GT method and it partitions the additive genetic variance in the same way.
The relationship matrices for the SM method are calculated as: where F A and F B are diagonal matrices with square roots of breed proportions for populations A and B, respectively; A NRM AB is a NRM-based relationship matrix representing at least all the descendants of crossbred animals; and F AB is a diagonal matrix with square roots of the term from Eq. 7. As for the GT method, the sub-matrices of A SM A and A SM B for purebred animals are identical to submatrices from A NRM A and A NRM B for purebreds animals, respectively.
The SM method is equivalent to randomregressions of additive genetic effects on F A , F B , and F AB , respectively [12], because In this study, we apply the SM method through random-regression.
Three pedigrees were constructed for the SM method: one for each purebred population, which are identical to those for the NRM method, and the third is for the partial relationship matrix for the segregation term between Populations A and B. The partial relationship matrix for the segregation term between Populations A and B was calculated with a pedigree from which all purebred and F1 animals had been removed.
We did not use the same pedigree for segregation effects as described by the SM method [12]. They used the full pedigree to construct an additive genetic relationship matrix on which they applied random regression. However, using the full pedigree may promote discreprancies between the GT and SM methods. According to the GT method, segregation effects are independent among all offspring from F1 animals and their magnitude only depend on the breed proportions of parental animals. For the SM method, a deep pedigree for segregation effects would increase the likelihood of both non-zero inbreeding coefficients in offspring from F1 animals and covariance between offspring from F1 animals. Therefore, the compliance between the GT and SM methods should be greater if purebred and F1 animals are removed from the pedigree for segregation effects, as done in this study.
The genomic relationship matrices for this method were the same as for both the NRM and GT methods. As for the GT method, we calculated combined relationship matrices for genotyped and non-genotyped animals for the breed-specific terms but not for the segregation term.
The statistical model for the SM method was: where F A , F B , and, F AB are as defined for the calculation of partial relationship matrices with the SM method (Eqs. 11,12,13); and the remaining components are the same as in the statistical model for the GT method (Eq. 8). Note that the additive genetic vectors now consist of regression coefficients. The four vectors of random effects (a A , a B , a AB and e ) were assumed to be distributed as: (14)  where â A , â B , â AB are the vectors of predicted partial additive genetic effects in the statistical model for the SM method (Eq. 14); and 0, subscript P, subscript C:F1, and subscript C:R are as defined for the GT method (Eq. 10).

The MF method
This method is conceptually different from the other methods. The other methods model populations as separate entities while the MF method models populations as sub-populations derived from a common ancestral population. In practice, this is done by identifying each sub-population through a metafounder, calculating an additive genetic relationship matrix, Ŵ , between metafounders, and then incorporating this information into one shared additive genetic relationship matrix for all populations, A(Ŵ) . In theory, this method should simultaneously account for both the breed-specific terms and the segregation term [14].
The metafounder relationships can be calculated in several ways [14,23]. We used the method proposed by Garcia-Baccino et al. [23]: where γ A is the metafounder relationship for Population A; γ B is the metafounder relationship for Population B; γ AB is the metafounder relationship between Populations A and B; σ 2 is the variance of marker allele frequencies in Population B; σ p * A p * B is a covariance between marker allele frequencies in Populations A and B; p * A and p * B are marker-allele frequencies in the base populations of Populations A and B, respectively; and the asterisk superscripts in p * A and p * B denote that allele annotations were randomized such that E(p * A ) = E(p * B ) = 1 2 . In this study, metafounder relationships were calculated with estimated marker allele frequencies in generation 32. We estimated marker allele frequencies as proposed by Gengler et al. [24] and genotypes from purebred animals in generations 35 to 39. Marker allele frequencies were estimated independently for each purebred population. Finally, metafounder relationships were calculated from markers that have a minor allele frequency higher than 0.01 when averaged across the purebred base-populations. The average metafounder relationship matrix across replicates, Ŵ , was: The recursive algorithm for the MF method is: where i , j , s , and d are as in the recursive algorithms for the NRM and GT methods (Eqs. 4 and 8); a ij is as in the recursive algorithm for the NRM method (Eq. 4); γ A , γ B , and γ AB are the metafounder relationships (Eq. 17); m A is a vector of base animals from Population A; m B is a vector of base animals from Population B; ∧ is the logical "and"; and ∨ is the logical "or". Please note that the last two elements of the recursive algorithm for the MF method are the same as in the algorithm for the NRM method. In other words, the only differences between the NRM method and the MF method are that base animals are related and their inbreeding coefficient can be greater than zero. These differences then carry over into the additive genetic relationships for animals which are not in the base population.
The pedigree for the MF method included all the animals, purebred and crossbred, in generations 32 to 39.
The MF method uses one genomic relationship matrix across all populations; G MF . A preliminary genomic relationship matrix, G VanRaden , was calculated using Van-Raden's first method [22], and genotypes from purebred animals in generations 35 to 39; however, we scaled and centered the genomic relationship matrix with allele frequencies of 0.5. Markers were included in the genomic relationship matrix if their minor allele frequency was higher than 0.01 when pooling genotypes from the purebred base-populations. The positive definiteness of the genomic relationship matrix was ensured by using the weighted average of G VanRaden and the sub-matrix of genotyped animals from the pedigree-based relationship matrix: 22 is the sub-matrix of genotyped animals from the pedigree-based relationship matrix. The genomic relationship matrix, G MF , was not scaled and centered such that its average diagonal and off-diagonal elements were equal to those of {A(Ŵ)} 22 because G MF and {A(Ŵ)} 22 are comparable when G MF and Ŵ are calculated with the same set of markers. We calculated a combined relationship matrix for genotyped and non-genotyped animals [14,20,21], H(Ŵ) , because some animals were not genotyped.
The statistical model for the MF method was: where y, b, and X are as described for the NRM method (Eq. 4); a is a vector of additive genetic effects; e is a vector of residuals; and Z is a design matrix. The two vectors of random effects (a and e ) were distributed as: where A(Ŵ ) is the additive relationship matrix with metafounders [14]; Ŵ is the additive relationship matrix between metafounders; σ 2 A MF is the additive genetic variance in the ancestral population; 0s are vectors or matrices of zeros; I is an identity matrix; and σ 2 e is the residual variance. The additive genetic relationship matrix, A(Ŵ) , was replaced with H(Ŵ) when breeding values were predicted with genomic prediction. (19) The vector of predicted breeding values for the MF method was: where â is the vector of predicted additive genetic effects in the statistical model for the MF method (Eq. 19).

Variance components
We estimated variance components for each method and its respective statistical model (Eqs. 4, 8, 14, and 19). Variance components were only estimated with pedigree information. Breeding values were predicted with these estimated variance components regardless of whether breeding values were predicted with or without genomic information.
The estimated variance components for the phenotype without dominance effects are in Table 1. For presentation only, the estimated additive genetic variance from the MF method was transformed using the estimated metafounder relationships, Ŵ , such that the parametrization was the same as for the GT method [14]: , and σ 2 A AB are the partial additive genetic variance components; σ 2 A MF is the estimated additive genetic variance in the ancestral population (Eq. 20); and γ A , γ B , and γ AB are metafounder relationships (Eq. 8).
We calculated true partial additive genetic variance components and used them as reference for the magnitude of the estimated variance components in Table 1. The true partial additive genetic variance components (21)  were calculated with the parametrization of the GT method and the phenotype without dominance effects. The true partial additive genetic variance for breedspecific effects from Population A was calculated as: where σ 2 A A is the true partial additive genetic variance for breed-specific effects from Population A; n qtl is the number of QTL; p i,A is the allele frequency at QTL i in base animals from Population A; β i,1 is the additive genetic effect of the first QTL allele at QTL i; and β i,2 is the additive genetic effect of the second QTL allele at QTL i. The partial additive genetic variance for breed-specific effects from Population B was calculated in the same way.
The true partial additive genetic variance for segregation effects between Populations A and B was calculated as [4]: where β i,1 , β i,2 , and n qtl are as in Eq. 23; σ 2 AB is the true partial additive genetic variance for segregation effects between Populations A and B; p i,F 1 is a vector of QTL allele frequencies in generation 33 of Population C; σ 2 A is the true partial additive genetic variance for breed-specific effects from Population A; and σ 2 B is the true partial additive genetic variance for breed-specific effects from Population B.

Software for analysis and prediction
Most data-handling was carried out in the R-software [25]. The relationship matrices for the GT and MF methods were calculated using the RcppArmadillo R-package [26]. The relationship matrices for the NRM and SM methods were calculated using the DMU software [27]. Variance components were estimated using the AI-ReML algorithm in the DMU software package [27]. Additive genetic effects were predicted using the best linear unbiased prediction (BLUP) method and the Preconditioned Conjugate Gradient algorithm implemented in DMU software [27].

Comparison of the methods General
We compared how well the methods predicted accurate and unbiased breeding values in animals from generation 39. In the following, we describe how we calculated true breeding values, accuracies, and biases, and the statistical methods used to compare the methods. We stratified the comparison according to population.

True breeding values
The true breeding value depends on whether the phenotype includes only an additive genetic term, or both additive genetic and dominant genetic terms. The true breeding values with only an additive genetic term were calculated as: where Q is a QTL genotype matrix with allelic loads of the first allele; β 1 is a vector of additive genetic effects of the first QTL allele; β 2 is a vector of additive genetic effects of the second QTL allele; and J is a matrix of 1s with dimensions equal those of Q.
In the presence of dominance, the true breeding value of an animal depends on its ability to promote both additive and dominance genetic effects in its offspring [28,29]. Therefore, the true breeding value now depends on the genotypes of the mate. True breeding values with a dominance term can be calculated with allele frequencies from the population of mating candidates [28,29]. The true breeding values with both an additive term and a dominance term were calculated as: where X ∈ {A, B, C} denotes the population to which the possible mating candidates belong; p X is a vector of QTL allele frequencies in population X; d Q is a vector of dominant QTL effects; • is the Hadamard product; 1 is a vector of ones; and tbv A , Q, β 1 , β 2 , and J are as for true breeding values with only an additive genetic term (Eq. 25).

Accuracy and bias
We evaluated the methods according to their prediction accuracy and prediction bias. We used two measures for the prediction bias [30,31]: level bias and dispersion bias.
The prediction accuracy was defined as Pearson's correlation between true breeding values and predicted breeding values: where ρ(.) is the Pearson correlation function; tbv is a vector of true breeding values; and ebv is a vector of predicted breeding values.
The level bias was calculated as: Accuracy = ρ(tbv, ebv), (28) µ bias = ebv − tbv + tbv base , where µ bias is the level bias; ebv is the mean predicted breeding value in validation animals; tbv is the mean true breeding values in validation animals; and tbv base is the mean true breeding value in base animals. The correction for tbv base was required because the true breeding values, in contrast to predicted breeding values, differed from zero in the base populations. There is no level bias when µ bias is equal to 0.
The dispersion bias was calculated as: where b bias is the dispersion bias; cov() is the empirical covariance; ebv is a vector of predicted breeding values; tbv is a vector of true breeding values; and var() is the empirical variance. There is no dispersion bias when b bias is equal to 1.

Statistical analysis of accuracy, level bias, and dispersion bias
The accuracies and biases were compared across methods, use of genomic information, and replicates; but not across populations and definition of true breeding values. We used non-parametric tests because accuracies and biases were heteroscedastic across methods and not normally distributed.
We investigated whether a method was more accurate or biased than others using paired Wilcoxon signed rank tests. We used paired tests to compare the methods to remove the variation caused by the stochastic simulation; i.e., the methods were paired within replicates. Furthermore, we investigated whether the methods were biased using the one-sample Wilcoxon signed rank tests. The null hypotheses for these tests were that the level biases were equal to 0 and that the dispersion biases were equal to 1.
We used the Bonferroni-correction of p-values to control for multiple testing: α bon = α/n tests = 0.05/1000 , where α is the significance level and n tests is the number of statistical tests. Among the 1000 tests, n p × n g × n m × (n m − 1)/2 = 840 were comparisons between validation parameters and (n p − 1) × n g × n m = 160 were tests for whether validation parameters differed from expected values, where n p = 3 is the number of validation parameters (accuracy, level bias, and dispersion bias); n g = 10 is the number of (29) b bias = cov(tbv, ebv) var(ebv) , groups within which validation parameters were compared; and n m = 8 is the number of unique combinations between methods and use of genomic information.

Expected pattern in results
The accuracies and biases are expected to differ between populations A, B, and C. For animals in generation 39 of Population A, halfsibs are the closest relatives with phenotypes. For animals in generation 39 of Population B, cousins are the closest relatives with phenotypes. For animals in generation 39 of Population C, own performance is available for all animals. Therefore, we expect that prediction is most accurate in Population C, less accurate in Population A, and least accurate in Population B.

Prediction accuracy
Generally, the GT and MF methods were as accurate or more accurate than the SM and NRM methods ( Table 2). Use of genomic information always increased the prediction accuracy ( Table 2). For Population A, the methods were equally accurate for prediction of breeding values without genomic information (median: 0.37-0.41, Table 2). When breeding values were predicted with genomic information, the MF and GT methods were the most accurate (median: 0.59-0.65, Table 2). The SM method was generally as accurate as the MF and GT methods (median: 0.58-0.65, Table 2), while the NRM method was always the least accurate (median: 0.56-0.63).
For Population B, the methods were equally accurate for prediction of breeding values without genomic information (median: 0.29-0.35, Table 2). When breeding values were predicted with genomic information, the MF method and the GT method were the most accurate for prediction of any definition of true breeding value (median: 0.50-0.55, Table 2) while the SM and NRM methods were the least accurate (median: 0.48-0.51).
For Population C, the GT and MF methods were the most accurate (median: 0.61-0.63, Table 2). The least accurate methods were the SM method (median: 0.57) and the NRM method (median: 0.48-0.49), respectively ( Table 2).

Level bias
The level biases were not statistically significantly different from 0 for the phenotype without a dominant genetic term (Table 3). For the phenotype with a dominant genetic term, the level biases were statistically significantly different from 0 for mating an animal with another animal from the same population.

Dispersion bias
In general, the dispersion biases for the GT, MF, and SM methods were not statistically significantly different from 1 (Table 4).  For Population A, the dispersion biases for the GT and MF methods were not statistically significantly different from 1 for the phenotype without a dominant genetic term (median: 0.94-1.01). The dispersion biases for SM method were not statistically significantly different from 1 when breeding values were predicted without genomic information or the phenotype was without a dominant genetic term (median: 0.78-0.94). The dispersion biases for the NRM method were always statistically significantly different from 1.
For Population B, the dispersion biases for the GT and MF methods were not statistically significantly different from 1 (median: 0.91-1.12). The dispersion biases for the SM method were only statistically significantly different from 1 when breeding values were predicted with genomic information and the phenotype did not include a dominant genetic term (median: 1.20). The dispersion biases for the NRM method was statistically significantly different from 1 in almost all cases (median: 0.67-0.82).
For Population C, the dispersion biases for the GT and MF methods were not statistically significantly different from 1 when the phenotype did not include a dominant genetic term (median: 0.99-1.02). The dispersion biases for the SM and NRM methods were always statistically significantly different from 1.

Discussion
As hypothesized, the GT and MF methods were generally the most accurate and least biased methods for prediction of breeding values with phenotypes from rotationally crossbred animals. The SM method was almost as accurate as the GT and MF methods but was also more biased. The NRM method was the least accurate and most biased of the methods.

The GT and MF methods
We found that the GT and MF methods performed similarly for prediction of breeding values with phenotypes from rotationally crossbred animals. This is in accordance with the fact that the MF method, in theory, can account for both breed-specific terms and segregation terms from GT method [14]. More specifically, the GT and MF methods are equivalent when the tranformations of Eq. 22 yield the estimated variance components from the GT method. However, this relies on the accurate estimation of the metafounder relationships which has some degree of estimation error. Fortunately for the MF method, it is the relative sizes of γ A , γ B , and γ AB which determine the relative sizes of the partial additive genetic parameters, σ 2 A A , σ 2 A B , and σ 2 A AB (Eq. 22). As long as Eq. 22 holds true, changes to the metafounder relationships are accounted for through changes to the estimated additive genetic variance in the ancestral population, σ 2 MF . One major advantage of the MF method is that genomic information can readily be included in the additive relationship matrix with metafounders using the single-step procedure [14], regardless of the genetic composition of the animals in the relationship matrix. On the contrary, the single-step procedure has only been developed for the partial relationship matrices for breed-specific terms  from the GT method; i.e., a combined partial relationship matrix for both genotyped and non-genotyped animals for segregation terms has not been developed [6]. This may make the MF method more appropriate than the GT method when rotationally crossbred animals are genotyped.
Based on this study, it is not possible to conclude whether the GT or MF method is better for the analysis of these specific populations.

The SM method
This method was generally as accurate and unbiased as the GT and MF methods for prediction in purebred animals but less accurate and more biased for prediction in rotationally crossbred animals (Tables 2 and 4). The inaccuracy and bias of the SM method may be caused by its inability to properly separate the phenotype into its components ( Table 1).
The SM method is only an approximation to the GT method and discreprancies between the two are expected. For example, for the GT method and disregarding inbreeding, the covariance between siblings depends on the diagonal elements of their shared parents (Eq. 7). Meanwhile, for the SM method, the covariance between siblings depends on the product between their own regression covariates for partial additive genetic effects (Eqs. 11,12,13). Consequently, the SM method is a better approximation to the GT method between animals where the weighted average of diagonal and off-diagonal elements of common ancestors is equal to the product between the animals' regression covariates for partial additive genetic effects and their additive genetic covariance according to the NRM method.
In a rotational crossbreeding system, breed proportions differ across generations. Consequently, the weighted average of diagonal and off-diagonal elements of common ancestors can differ from the product between the animals' regression covariates for partial additive genetic effects and their additive genetic covariance according to the NRM method. For example, in this study and disregarding inbreeding, the covariance of partial additive breed-specific effects from Population A between full sibs i and j from generation 34 and Population C was not the same for the GT and SM methods: where subscripts i, j, s, and d denote animals; f A is the breed proportion from Population A; and superscript NRM:A denotes that the covariance was calculated with (30)  It is simple to see that the GT and SM methods do not always produce identical relationships. However, it is challenging to explain how discreprancies between the GT and SM methods across the three partial additive relationship matrices affect the partitioning of random effects. Nevertheless, according to our study, the SM method seems to be a good approximation of the GT method when the aim is to predict breeding values in purebred animals.

The NRM method
This method has the most inaccurate assumptions for additive genetic effects among the methods investigated. In rotationally crossbred animals between divergent purebred populations, the model does not fit the data if the partial additive genetic variances due to breedspecific effects are not proportional to breed proportions [10], and the segregation variance is not modelled [4]. Therefore, it was expected that this method was the least accurate and most biased among those investigated (Tables 2, 3, 4).
The NRM method is a common approach for multibreed analyses. The main argument for the NRM method is that it is commonly implemented into softwares for genetic evaluations. However, we argue that the GT, MF, and SM methods either are accessible or can easily become accessible. Currently, the GT or MF methods may not be implemented in softwares for genetic evaluations, but both random regression and the NRM method are. The combination of random regression and the NRM method enables the use of the SM method which, in this study, was more accurate and less biased than the NRM method (Tables 2, 3, 4). In the future, the GT and MF methods should become accessible through their implementation into commonly used softwares for genetic evaluations. The implementation of both the GT and MF methods is simple as the algorithms for directly (31) computing their inverse covariance matrices are very similar to the algorithm for the NRM method [9,14]. Consequently, the time required for implementing the GT and MF methods should be greatly reduced as a large proportion of program code from the NRM method can be reused. All things considered, we do not recommend the NRM method for genetic analyses with phenotypes of rotationally crossbred animals, because its alternatives are more accurate, less biased, and easily accessible.

Simulation design
Results from simulation studies are most relevant when the simulated populations are representative of real populations. Populations can be described with several parameters, however, the divergence between the populations is a key argument for the relevance of multibreed relationship matrices [4]. The magnitude of divergence between two populations can be represented by the ratio between the segregation variance and the additive genetic variance in F2 animals: σ 2 AB / 1 2 σ 2 A + 1 2 σ 2 B + σ 2 AB ; which, in turn, can be calculated using the metafounder relationships (Eq. 22). Using this measure, the average magnitude of divergence between Populations A and B is 15% based on the metafounder relationships (Table 1). Meanwhile, this measure for the magnitude of divergence is 16% between DanBred Landrace and DanBred Yorkshire pigs [32], 15% between Hereford and Zebu cattle [33], and on average 11% (min: 3%, max: 25%) between subpopulations of Manech Tête Rousse sheep [34]. Therefore, the magnitude of divergence between populations A and B is representative of the divergence between real populations.
It would have been reasonable to compare the methods with a different simulation design, which would most likely give a different result. However, the purebred populations need to have diverged from each other; otherwise segregation effects would be small. We ensured that the purebred populations had diverged by simulating separate population bottlenecks in the two populations, and not a shared population bottleneck; by only sampling 50 animals (0.2% of the historical population) when founding the purebred populations; by keeping the effective population sizes small in the purebred populations ( N e ≈ 50 animals); and by isolating the purebred populations for 32 generations prior to the pedigreed generations. In a scenario where the purebred populations had only slightly diverged from each other, segregation effects would be small and the additive genetic variances would be the same in the purebred populations. This would diminish the argument for partial additive relationship matrices for the breed-specific terms and the segregation term. In other words, it would be better to regard the two purebred populations as one purebred population.
A simulation design with less diverged purebred populations would most likely yield the same ranking of the methods but with less absolute differences between their prediction accuracies.
In this study, only genetic drift caused changes in allele frequencies. In practice, allele frequencies are also affected by selection. Simulating selection would most likely also change the results. However, we have no reason to believe that selection would change the ranking between the methods, because all the methods theoretically can account for selection, and because their mechanism for doing so is the same [8,9,12,14].

Genotypes from crossbred animals
It is simpler to incorporate genomic information from crossbred animals into some methods than into others. For the MF, SM, and NRM methods, genomic information on crossbreds can be incorporated as for purebred animals. For the GT method, it becomes necessary to trace the breed of origin of alleles to construct genomic relationship matrices for breed-specific terms [13]. Furthermore, to our knowledge, it is not known how genomic information should be incorporated into partial relationship matrices for segregation terms. Although it is simple to incorporate genomic information for the MF, SM, and NRM methods, it is not known whether the resulting relationship matrices correctly represent the additive genetic covariance between animals. In particular, this is the case for the SM method and our application of the NRM method, as they are approximations. Although relevant, it was outside the scope of this study to compare the methods in a scenario with genomic information from crossbred animals.

Synthetic breeds
This study was on genetic analyses with rotationally crossbred animals, but our results may also apply to other genetic analyses of mixed populations. For example, some breeding companies create synthetic breeds. In practice, synthetic breeds are crossbred populations and they are subject to the same mechanisms as other crossbred populations. The only difference between a rotationally crossbred population and a synthetic breed is that sires are not necessarily purebred for synthetic breeds. Similar to the rotationally crossbred populations, the complex distributions of genetic effects may complicate accurate and unbiased prediction of breeding values in synthetic breeds. Our results may assist with the choice of method for the relationship matrix used in genetic analysis of synthetic breeds.

Solving BLUP equation systems
The choice between methods may also be impacted by their computational requirements. For all the relationship matrices that were studied here, the inverse can be directly computed [8,9,14]. However, the resulting equation systems differ in dimensions and sparseness. Using the GT, SM, or NRM method results in a larger equation system than with the MF method; especially with large numbers of breeds and crossbred animals. Meanwhile, the MF method contains more non-zero elements than the other methods; and using the MF method with the single-step procedure may require the inversion of one large genomic relationship matrix rather than the inversion of smaller genomic relationship matrices as with the other methods. Comparison of computational demands between the methods was outside the scope of this study but it could be relevant when computer hardware is a limiting factor.

Conclusion
In the scenarios that we investigated, models using the additive relationship matrix with metafounders [14] or the partial relationship matrices by García-Cortés and Toro [9] were generally more accurate and less biased than those using the partial relationship matrices by Strandén and Mäntysaari [12] or the usual numerator relationship matrix [8].

Appendix 1: Multibreed relationship matrices with a small example pedigree
The differences between the NRM, GT, and SM methods are easier to understand through examples. This example is based on a pedigree with both purebred animals, F1 animals, F2 animals, and a F2-backcross animal (Table 5).
We use the GT method as reference because it is theoretically correct. The methods yield different additive relationship matrices for the term from breed A (Tables 6, 7, 8). The NRM method calculates the correct relationships for purebred animals; but is erroneous after it encounters crossbred animals. The diagonal elements for crossbred animals are not scaled according to their breed proportions, and this error affects both diagonal and off-diagonal elements for descendants of the crossbred animals (Tables 6 and 7). The SM method yields the same diagonal elements as the GT method in the absence of inbreeding ( Table 8). The off-diagonal elements between F1 and F2 crossbred animals are also correct. The off-diagonal elements between purebred animals and crossbred animals are erroneous, and so is the off-diagonal element for the F2-backcross (animal 9; Table 8).
The methods also yield different partial additive relationship matrices for the segregation term (Tables 9 and  10). The SM method calculates non-zero off-diagonal elements for related animals where the off-diagonal element is zero for the GT method. Furthermore, the off-diagonal elements between animals 7 and 9 are erroneous as is the diagonal element for animal 9.
The relationship matrix from the MF method is not directly comparable to those from the other methods (Table 11) although it is theoretically equal to the GT method [14].