# Genetic evaluation for three-way crossbreeding

Ole F. Christensen^{1} (corresponding author), Andres Legarra^{2}, Mogens S. Lund^{1} and Guosheng Su^{1}

**47**:98

https://doi.org/10.1186/s12711-015-0177-6

© Christensen et al. 2015

**Received:** 24 June 2015

**Accepted:** 4 December 2015

**Published:** 22 December 2015

## Abstract

### Background

Commercial pig producers generally use a terminal crossbreeding system with three breeds. Many pig breeding organisations have started to use genomic selection, for which genetic evaluation is often done with single-step methods in which the pedigree-based additive genetic relationship matrix is replaced by a combined relationship matrix based on both marker genotypes and pedigree. Genomic selection is implemented for purebreds, but it also offers opportunities for incorporating information from crossbreds and selecting for crossbred performance. However, models for genetic evaluation for the three-way crossbreeding system have not been developed.

### Results

Four-variate models for three-way terminal crossbreeding are presented in which the first three variables contain the records for the three pure breeds and the fourth variable contains the records for the three-way crossbreds. For purebred animals, the models provide breeding values for both purebred and crossbred performances. Heterogeneity of genetic architecture between breeds and genotype by environment interactions are modelled through genetic correlations between these breeding values. Specification of the additive genetic relationships is essential for these models and can be defined either within populations or across populations. Based on these two types of additive genetic relationships, pedigree-based, marker-based and combined relationships (the last using both pedigree and marker information) are presented. All these models for three-way crossbreeding can be formulated using Kronecker matrix products and therefore fitted using Henderson’s mixed model equations and standard animal breeding software.

### Conclusions

Models for genetic evaluation in the three-way crossbreeding system are presented. They provide estimated breeding values for both purebred and crossbred performances, and can use pedigree-based or marker-based relationships, or combined relationships based on both pedigree and marker information. This provides a framework that allows information from three-way crossbred animals to be incorporated into a genetic evaluation system.

## Background

Commercial pig producers generally use a terminal crossbreeding system with three breeds. In this system, F1 sows from two maternal breeds are mated to purebred boars from a breed that has high-level production traits (growth, leanness, feed efficiency) to produce pigs for slaughter. Commonly, boar lines in Europe are Duroc and Pietrain and sows are crosses between Large White and Landrace. Genetic evaluation is usually done within each of these breeds based on recorded phenotypes on purebred animals. However, ideally genetic evaluation of purebreds should incorporate phenotypes of interest recorded on crossbreds, and breeding values for performance in the three-way cross should be estimated.

Table 1. Example pedigree

id | Father | Mother | Population |
---|---|---|---|
1 | 0 | 0 | \(\mathcal {A}\) |
2 | 0 | 0 | \(\mathcal {A}\) |
3 | 0 | 0 | \(\mathcal {B}\) |
4 | 0 | 0 | \(\mathcal {C}\) |
5 | 1 | 2 | \(\mathcal {A}\) |
6 | 3 | 5 | \(\mathcal {A}\mathcal {B}\) |
7 | 4 | 6 | \(\mathcal {C}(\mathcal {A}\mathcal {B})\) |
8 | 4 | 6 | \(\mathcal {C}(\mathcal {A}\mathcal {B})\) |

Table 2. Breed \(\mathcal {A}\) specific partial relationship matrix \(\mathbf {A}^{\mathcal {A}}\) for the pedigree in Table 1 (upper triangle shown)

id | 1 | 2 | 5 | 6 | 7 | 8 |
---|---|---|---|---|---|---|
1 | 1 | 0 | \(\frac{1}{2}\) | \(\frac{1}{4}\) | \(\frac{1}{8}\) | \(\frac{1}{8}\) |
2 | | 1 | \(\frac{1}{2}\) | \(\frac{1}{4}\) | \(\frac{1}{8}\) | \(\frac{1}{8}\) |
5 | | | 1 | \(\frac{1}{2}\) | \(\frac{1}{4}\) | \(\frac{1}{4}\) |
6 | | | | \(\frac{1}{2}\) | \(\frac{1}{4}\) | \(\frac{1}{4}\) |
7 | | | | | \(\frac{1}{4}\) | \(\frac{1}{8}\) |
8 | | | | | | \(\frac{1}{4}\) |

The aim of this work was to develop models for three-way terminal crosses that handle both pedigree-based and marker-based relationships, as well as combined relationship matrices based on both pedigree and marker genotypes. As indicated above, an essential part of the model is the specification of relationships such that the model can be fitted by using standard animal breeding software.

## Methods

Table 3. Breed \(\mathcal {B}\) specific partial relationship matrix \(\mathbf {A}^{\mathcal {B}}\) for the pedigree in Table 1 (upper triangle shown)

id | 3 | 6 | 7 | 8 |
---|---|---|---|---|
3 | 1 | \(\frac{1}{2}\) | \(\frac{1}{4}\) | \(\frac{1}{4}\) |
6 | | \(\frac{1}{2}\) | \(\frac{1}{4}\) | \(\frac{1}{4}\) |
7 | | | \(\frac{1}{4}\) | \(\frac{1}{8}\) |
8 | | | | \(\frac{1}{4}\) |

Table 4. Breed \(\mathcal {C}\) specific partial relationship matrix \(\mathbf {A}^{\mathcal {C}}\) for the pedigree in Table 1 (upper triangle shown)

id | 4 | 7 | 8 |
---|---|---|---|
4 | 1 | \(\frac{1}{2}\) | \(\frac{1}{2}\) |
7 | | \(\frac{1}{2}\) | \(\frac{1}{4}\) |
8 | | | \(\frac{1}{2}\) |

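Let the additive genetic value of individual *i* be \(g_i\); then the additive variance is:

\[\mathrm {Var}(g_i)=\sum _b f^b_i\,\sigma ^2_{g,b}+2\sum _{b<b^{\prime }}\left( f^b_{f(i)}f^{b^{\prime }}_{f(i)}+f^b_{m(i)}f^{b^{\prime }}_{m(i)}\right) \sigma ^2_{g,b,b^{\prime }}+\frac{1}{2}\mathrm {Cov}\left( g_{f(i)},g_{m(i)}\right) ,\]

where *b* and \(b^{\prime }\) denote breeds, \(f^b_i\) is the breed *b* content of individual *i*, \(\sigma ^2_{g,b}\) is the breed *b* genetic variance, \(g_{f(i)}\) and \(g_{m(i)}\) are the additive genetic values of parents *f*(*i*) and *m*(*i*), respectively, and \(\sigma ^2_{g,b,b^{\prime }}\) is the breed *b* and breed \(b^{\prime }\) segregation genetic variance. The additive covariance between genotypic values of individuals *i* and \(i^{\prime }\) is:

\[\mathrm {Cov}\left( g_i,g_{i^{\prime }}\right) =\frac{1}{2}\left( \mathrm {Cov}\left( g_{f(i)},g_{i^{\prime }}\right) +\mathrm {Cov}\left( g_{m(i)},g_{i^{\prime }}\right) \right) ,\]

where \(i^{\prime }\) is not a descendant of *i*.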

Table 5. Breed \(\mathcal {A}\mathcal {B}\) segregation partial relationship matrix \(\mathbf {A}^{\mathcal {A}\mathcal {B}}\) for the pedigree in Table 1 (upper triangle shown)

id | 9 | 10 |
---|---|---|
9 | \(\frac{1}{2}\) | 0 |
10 | | \(\frac{1}{2}\) |

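With this covariance structure, the vector of additive genetic values can be partitioned into independent terms, \(\mathbf {g}=\sum _b\mathbf {g}^b+\sum _{b<b^{\prime }}\mathbf {g}^{b,b^{\prime }}\), with \(\mathrm {Var}(\mathbf {g}^b)=\mathbf {A}^b\sigma ^2_{g,b}\) and \(\mathrm {Var}(\mathbf {g}^{b,b^{\prime }})=\mathbf {A}^{b,b^{\prime }}\sigma ^2_{g,b,b^{\prime }}\), where matrix \(\mathbf {A}^b\) is the breed *b* specific partial relationship matrix and matrix \(\mathbf {A}^{b,b^{\prime }}\) the breed *b* and breed \(b^{\prime }\) segregation partial relationship matrix. The vectors \(\mathbf {g}^b\) and \(\mathbf {g}^{b,b^{\prime }}\) depend on genetic origin, such that \(\mathbf {g}^b\) is the breed *b* specific partial genetic vector, and \(\mathbf {g}^{b,b^{\prime }}\) is the breed *b* and breed \(b^{\prime }\) segregation partial genetic vector. Matrices \(\mathbf {A}^b\) and \(\mathbf {A}^{b,b^{\prime }}\) have sparse inverses that can be computed using the usual methods for the additive relationship matrix (see [14]). In this paper, the approach that partitions the genetic effects into independent terms is named the partial genetic approach.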

Table 6. Common relationship matrix \(\mathbf {A}(\varvec{\Gamma })\) for the pedigree in Table 1 (upper triangle shown)

id | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
---|---|---|---|---|---|---|---|---|
1 | \(1+\frac{\gamma _{\mathcal {A}}}{2}\) | \(\gamma _{\mathcal {A}}\) | \(\gamma _{\mathcal {A}\mathcal {B}}\) | \(\gamma _{\mathcal {A}\mathcal {C}}\) | \(\frac{1}{2}+\frac{3\gamma _{\mathcal {A}}}{4}\) | \(\frac{1}{4}+\frac{3\gamma _{\mathcal {A}}}{8}+\frac{\gamma _{\mathcal {A}\mathcal {B}}}{2}\) | \(\frac{1}{8}+\frac{3\gamma _{\mathcal {A}}}{16}+\frac{\gamma _{\mathcal {A}\mathcal {B}}}{4}+\frac{\gamma _{\mathcal {A}\mathcal {C}}}{2}\) | \(\frac{1}{8}+\frac{3\gamma _{\mathcal {A}}}{16}+\frac{\gamma _{\mathcal {A}\mathcal {B}}}{4}+\frac{\gamma _{\mathcal {A}\mathcal {C}}}{2}\) |
2 | | \(1+\frac{\gamma _{\mathcal {A}}}{2}\) | \(\gamma _{\mathcal {A}\mathcal {B}}\) | \(\gamma _{\mathcal {A}\mathcal {C}}\) | \(\frac{1}{2}+\frac{3\gamma _{\mathcal {A}}}{4}\) | \(\frac{1}{4}+\frac{3\gamma _{\mathcal {A}}}{8}+\frac{\gamma _{\mathcal {A}\mathcal {B}}}{2}\) | \(\frac{1}{8}+\frac{3\gamma _{\mathcal {A}}}{16}+\frac{\gamma _{\mathcal {A}\mathcal {B}}}{4}+\frac{\gamma _{\mathcal {A}\mathcal {C}}}{2}\) | \(\frac{1}{8}+\frac{3\gamma _{\mathcal {A}}}{16}+\frac{\gamma _{\mathcal {A}\mathcal {B}}}{4}+\frac{\gamma _{\mathcal {A}\mathcal {C}}}{2}\) |
3 | | | \(1+\frac{\gamma _{\mathcal {B}}}{2}\) | \(\gamma _{\mathcal {B}\mathcal {C}}\) | \(\gamma _{\mathcal {A}\mathcal {B}}\) | \(\frac{1}{2}+\frac{\gamma _{\mathcal {B}}}{4}+\frac{\gamma _{\mathcal {A}\mathcal {B}}}{2}\) | \(\frac{1}{4}+\frac{\gamma _{\mathcal {B}}}{8}+\frac{\gamma _{\mathcal {A}\mathcal {B}}}{4}+\frac{\gamma _{\mathcal {B}\mathcal {C}}}{2}\) | \(\frac{1}{4}+\frac{\gamma _{\mathcal {B}}}{8}+\frac{\gamma _{\mathcal {A}\mathcal {B}}}{4}+\frac{\gamma _{\mathcal {B}\mathcal {C}}}{2}\) |
4 | | | | \(1+\frac{\gamma _{\mathcal {C}}}{2}\) | \(\gamma _{\mathcal {A}\mathcal {C}}\) | \(\frac{\gamma _{\mathcal {A}\mathcal {C}}+\gamma _{\mathcal {B}\mathcal {C}}}{2}\) | \(\frac{1}{2}+\frac{\gamma _{\mathcal {C}}+\gamma _{\mathcal {A}\mathcal {C}}+\gamma _{\mathcal {B}\mathcal {C}}}{4}\) | \(\frac{1}{2}+\frac{\gamma _{\mathcal {C}}+\gamma _{\mathcal {A}\mathcal {C}}+\gamma _{\mathcal {B}\mathcal {C}}}{4}\) |
5 | | | | | \(1+\frac{\gamma _{\mathcal {A}}}{2}\) | \(\frac{1}{2}+\frac{\gamma _{\mathcal {A}}}{4}+\frac{\gamma _{\mathcal {A}\mathcal {B}}}{2}\) | \(\frac{1}{4}+\frac{\gamma _{\mathcal {A}}}{8}+\frac{\gamma _{\mathcal {A}\mathcal {B}}}{4}+\frac{\gamma _{\mathcal {A}\mathcal {C}}}{2}\) | \(\frac{1}{4}+\frac{\gamma _{\mathcal {A}}}{8}+\frac{\gamma _{\mathcal {A}\mathcal {B}}}{4}+\frac{\gamma _{\mathcal {A}\mathcal {C}}}{2}\) |
6 | | | | | | \(1+\frac{\gamma _{\mathcal {A}\mathcal {B}}}{2}\) | \(\frac{1}{2}+\frac{\gamma _{\mathcal {A}\mathcal {B}}+\gamma _{\mathcal {A}\mathcal {C}}+\gamma _{\mathcal {B}\mathcal {C}}}{4}\) | \(\frac{1}{2}+\frac{\gamma _{\mathcal {A}\mathcal {B}}+\gamma _{\mathcal {A}\mathcal {C}}+\gamma _{\mathcal {B}\mathcal {C}}}{4}\) |
7 | | | | | | | \(1+\frac{\gamma _{\mathcal {A}\mathcal {C}}+\gamma _{\mathcal {B}\mathcal {C}}}{4}\) | \(\frac{1}{2}+\frac{\gamma _{\mathcal {C}}+\gamma _{\mathcal {A}\mathcal {B}}}{8}+\frac{\gamma _{\mathcal {A}\mathcal {C}}+\gamma _{\mathcal {B}\mathcal {C}}}{4}\) |
8 | | | | | | | | \(1+\frac{\gamma _{\mathcal {A}\mathcal {C}}+\gamma _{\mathcal {B}\mathcal {C}}}{4}\) |

Table 7. Pedigree in Table 1 with metafounders

id | father | mother |
---|---|---|
\(\mathcal {A}\) | - | - |
\(\mathcal {B}\) | - | - |
\(\mathcal {C}\) | - | - |
1 | \(\mathcal {A}\) | \(\mathcal {A}\) |
2 | \(\mathcal {A}\) | \(\mathcal {A}\) |
3 | \(\mathcal {B}\) | \(\mathcal {B}\) |
4 | \(\mathcal {C}\) | \(\mathcal {C}\) |
5 | 1 | 2 |
6 | 3 | 5 |
7 | 4 | 6 |
8 | 4 | 6 |

First, partial genetic and common genetic approaches for constructing pedigree-based relationships are presented, then the corresponding two different ways of constructing marker-based relationships are presented, and finally the genetic variances and covariances in model (2) are shown for the two approaches. Detailed derivations are in the “Appendix”.

### Additive genetic model for crossbred \(\mathcal {C}(\mathcal {A}\mathcal {B})\) performance: partial genetic approach

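The breed *b* specific partial relationship matrix \(\mathbf {A}^b\) is defined by the recursive formulas:

\[A^b_{i,i}=f^b_i+\frac{1}{2}A^b_{f(i),m(i)},\qquad A^b_{i,i^{\prime }}=\frac{1}{2}\left( A^b_{f(i),i^{\prime }}+A^b_{m(i),i^{\prime }}\right) ,\]

where \(i^{\prime }\) is not a descendant of *i*, terms involving unknown parents are zero, and \(f^b_i\) is the breed *b* proportion, and the breed-segregation partial relationship matrix is defined by the recursive formulas:

\[A^{b,b^{\prime }}_{i,i}=2\left( f^b_{f(i)}f^{b^{\prime }}_{f(i)}+f^b_{m(i)}f^{b^{\prime }}_{m(i)}\right) +\frac{1}{2}A^{b,b^{\prime }}_{f(i),m(i)},\qquad A^{b,b^{\prime }}_{i,i^{\prime }}=\frac{1}{2}\left( A^{b,b^{\prime }}_{f(i),i^{\prime }}+A^{b,b^{\prime }}_{m(i),i^{\prime }}\right) .\]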

To illustrate the different partial relationship matrices, we analysed the small pedigree in Table 1. Tables 2, 3, 4 and 5 show the partial relationship matrices for this example.
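The partial relationship matrices for this small pedigree can be checked numerically. The following is a minimal sketch, not the authors' implementation. It assumes the usual tabular recursion, with diagonal term \(f^b_i\) for a breed-specific matrix and \(2(f^b_{f(i)}f^{b^{\prime }}_{f(i)}+f^b_{m(i)}f^{b^{\prime }}_{m(i)})\) for a breed-segregation matrix, plus half the relationship between the parents. All function and variable names are hypothetical.

```python
from fractions import Fraction

# Pedigree of Table 1: id -> (father, mother, population); 0 means unknown parent.
PEDIGREE = {
    1: (0, 0, "A"), 2: (0, 0, "A"), 3: (0, 0, "B"), 4: (0, 0, "C"),
    5: (1, 2, "A"), 6: (3, 5, "AB"), 7: (4, 6, "C(AB)"), 8: (4, 6, "C(AB)"),
}
BREEDS = ("A", "B", "C")
ZERO = {b: Fraction(0) for b in BREEDS}


def breed_content(ped):
    """f[i][b]: 1 for a base animal's own breed, otherwise the parental average."""
    f = {0: ZERO}  # an unknown parent carries no breed content
    for i in sorted(ped):  # ids are ordered so that parents precede offspring
        sire, dam, pop = ped[i]
        if sire == 0 and dam == 0:
            f[i] = {b: Fraction(int(b == pop)) for b in BREEDS}
        else:
            f[i] = {b: (f[sire][b] + f[dam][b]) / 2 for b in BREEDS}
    return f


def tabular(ped, diag_term):
    """Tabular recursion; diag_term(i) is the non-inbreeding part of the diagonal."""
    rel = {}

    def get(x, y):
        return Fraction(0) if 0 in (x, y) else rel[(min(x, y), max(x, y))]

    for i in sorted(ped):
        sire, dam, _ = ped[i]
        for j in sorted(ped):
            if j >= i:
                break
            # Off-diagonal: average of the parents' relationships with j.
            rel[(j, i)] = (get(sire, j) + get(dam, j)) / 2
        rel[(i, i)] = diag_term(i) + get(sire, dam) / 2
    return rel


f = breed_content(PEDIGREE)


def breed_specific(b):
    return tabular(PEDIGREE, lambda i: f[i][b])


def segregation(b1, b2):
    def diag(i):
        sire, dam, _ = PEDIGREE[i]
        return 2 * (f[sire][b1] * f[sire][b2] + f[dam][b1] * f[dam][b2])
    return tabular(PEDIGREE, diag)


A_A, A_B, A_C = (breed_specific(b) for b in BREEDS)
A_AB = segregation("A", "B")

print(A_A[(7, 8)], A_B[(3, 7)], A_C[(4, 7)], A_AB[(7, 7)])
```

Entries are stored only for the upper triangle, keyed by `(smaller id, larger id)`, and exact fractions are used so that the values can be compared directly with the tabulated ones. Under these assumptions, the segregation matrix is zero everywhere except for a diagonal of 1/2 for the two \(\mathcal {C}(\mathcal {A}\mathcal {B})\) animals, which do not covary.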

The additive genetic value of a crossbred \(\mathcal {C}(\mathcal {A}\mathcal {B})\) animal can be written as \(g_i=\frac{1}{2}g_{f(i)}+\frac{1}{2}g_{m(i)}+\Phi _{\mathcal {C}(\mathcal {A}\mathcal {B}),i}\) for individual *i*, where \(\Phi _{\mathcal {C}(\mathcal {A}\mathcal {B}),i}\) is the Mendelian sampling term. The Mendelian sampling terms are independent among the \(\mathcal {C}(\mathcal {A}\mathcal {B})\) crossbred animals, and by making the approximation that father *f*(*i*) is not inbred and since mother *m*(*i*) is not inbred, the variance is constant. In this way, the Mendelian sampling error term can be included into the residual error term \(\mathbf {e}_{\mathcal {C}(\mathcal {A}\mathcal {B})}\) in model (2), and the model can be formulated using three breed-specific partial relationship matrices defined on the \(\mathcal {A},\mathcal {B},\mathcal {C}\) and \(\mathcal {A}\mathcal {B}\) animals. However, as explained in Christensen et al. [11], such a reduced model cannot be extended to incorporate marker genotypes since these provide information about the Mendelian sampling term. Therefore, we did not pursue the reduced form of the model any further.

Note that model (2) with relationships as presented here is the most obvious generalisation of the Wei and van der Werf model in Eq. (1) from two to three breeds since base individuals are assumed unrelated. Without a formulation using partial relationship matrices, it would be difficult to estimate parameters in this model using standard animal breeding software.

### Additive genetic model for crossbred \(\mathcal {C}(\mathcal {A}\mathcal {B})\) performance: common genetic approach

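In the common genetic approach, the additive genetic values are assumed to have covariance matrix \(\mathbf {A}(\varvec{\Gamma })\sigma _g^2\). For base individuals, \(A(\varvec{\Gamma })_{i,i}=1+\gamma _b/2\) for individual *i* in breed *b*, and \(A(\varvec{\Gamma })_{i,i^{\prime }}=\gamma _b\) for distinct individuals *i* and \(i^{\prime }\) in breed *b*, i.e. base animals are inbred with coefficient \(\gamma _b/2\) and related with relationship coefficient \(\gamma _b\). Furthermore, \(A(\varvec{\Gamma })_{i,i^{\prime }}=\gamma _{b,b^{\prime }}\) for individuals *i* and \(i^{\prime }\) in different breeds *b* and \(b^{\prime }\), i.e. base animals in different breeds are related. Therefore, a joint relationship matrix is specified among all base animals, and by applying the usual recursive definition:

\[A(\varvec{\Gamma })_{i,i}=1+\frac{1}{2}A(\varvec{\Gamma })_{f(i),m(i)},\qquad A(\varvec{\Gamma })_{i,i^{\prime }}=\frac{1}{2}\left( A(\varvec{\Gamma })_{f(i),i^{\prime }}+A(\varvec{\Gamma })_{m(i),i^{\prime }}\right) ,\]

where \(i^{\prime }\) is not a descendant of *i*, the relationship matrix \(\mathbf {A}(\varvec{\Gamma })\) is obtained for all animals in the pedigree.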

Legarra et al. [15] suggested a framework where individuals in the base population of the pedigree are related because they originate from overlapping ancestral populations with a finite size, and they termed each of these ancestral populations as a meta-founder to be included in the pedigree. Here, \(\mathcal {A},\mathcal {B},\mathcal {C}\) are meta-founders, and each base individual in the pedigree has a meta-founder, which is both its parents; see example in Table 7. When extending the pedigree and the matrix \(\mathbf {A}(\varvec{\Gamma })\) with these meta-founders, Legarra et al. [15] showed that the algorithms for computing the sparse inverse matrix \(\mathbf {A}(\varvec{\Gamma })^{-1}\) directly as in Henderson [16] and submatrices of \(\mathbf {A}(\varvec{\Gamma })\) by the Colleau algorithm [17] are as usual.
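The construction with meta-founders can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the algorithm of [15]. It treats the three meta-founders as pseudo-individuals whose mutual relationships are the entries of \(\varvec{\Gamma }\), applies the usual tabular recursion to the extended pedigree, and returns the dense block for the real animals. It does not compute the sparse inverse. The numerical values of \(\varvec{\Gamma }\) and all names are hypothetical.

```python
import numpy as np

META = ["A", "B", "C"]
# Pedigree with metafounders: (id, father, mother); base animals have a
# metafounder as both parents, and parents are listed before offspring.
PEDIGREE = [(1, "A", "A"), (2, "A", "A"), (3, "B", "B"), (4, "C", "C"),
            (5, 1, 2), (6, 3, 5), (7, 4, 6), (8, 4, 6)]

# Hypothetical gamma values, chosen only for illustration.
gA, gB, gC, gAB, gAC, gBC = 0.4, 0.5, 0.6, 0.1, 0.2, 0.3
GAMMA = np.array([[gA, gAB, gAC],
                  [gAB, gB, gBC],
                  [gAC, gBC, gC]])


def a_gamma(pedigree, meta, gamma):
    """Dense A(Gamma) for the real animals, by tabular recursion."""
    labels = list(meta) + [row[0] for row in pedigree]
    pos = {label: k for k, label in enumerate(labels)}
    m, n = len(meta), len(labels)
    a = np.zeros((n, n))
    a[:m, :m] = gamma  # relationships among metafounders
    for ident, father, mother in pedigree:
        i, f, d = pos[ident], pos[father], pos[mother]
        a[i, :i] = 0.5 * (a[f, :i] + a[d, :i])  # average of parental rows
        a[:i, i] = a[i, :i]                     # keep the matrix symmetric
        a[i, i] = 1.0 + 0.5 * a[f, d]           # inbreeding from parental relationship
    return a[m:, m:]


A = a_gamma(PEDIGREE, META, GAMMA)

# Spot checks against closed-form entries (row/column k is animal k + 1).
assert np.isclose(A[0, 0], 1 + gA / 2)                                   # (1, 1)
assert np.isclose(A[0, 1], gA)                                           # (1, 2)
assert np.isclose(A[0, 4], 1 / 2 + 3 * gA / 4)                           # (1, 5)
assert np.isclose(A[5, 5], 1 + gAB / 2)                                  # (6, 6)
assert np.isclose(A[6, 7], 1 / 2 + (gC + gAB) / 8 + (gAC + gBC) / 4)     # (7, 8)
```

Setting all entries of `GAMMA` to zero recovers the ordinary pedigree relationship matrix with unrelated, non-inbred base animals.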

The parameter \(\sigma _g^2\) in Eq. (5) does not correspond to the usual genetic variance which is the variance among unrelated individuals in the base population. As explained in Legarra et al. [15], \(\sigma _g^2(1 - \gamma _b/2)\) corresponds to the variance among unrelated breed *b* animals, and therefore the genetic variances for crossbred \(\mathcal {C}(\mathcal {A}\mathcal {B})\) performance are \(\sigma _g^2(1 - \gamma _{\mathcal {A}}/2)\), \(\sigma _g^2(1 - \gamma _{\mathcal {B}}/2)\) and \(\sigma _g^2(1 - \gamma _{\mathcal {C}}/2)\), corresponding to \(\sigma _{g,\mathcal {A}}^2\), \(\sigma _{g,\mathcal {B}}^2\) and \(\sigma _{g,\mathcal {C}}^2\) in the previous section, respectively. In addition, Legarra et al. [15] explained that the breed-segregation variance is \(\sigma ^2_g((\gamma _{\mathcal {A}}+\gamma _{\mathcal {B}})/2-\gamma _{\mathcal {A},\mathcal {B}})/4\), which corresponds to \(\sigma _{g,\mathcal {A}\mathcal {B}}^2\) in the previous section.
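The correspondence between the two parameterisations is simple arithmetic, sketched below. The function name and the numerical values are hypothetical, and the formulas are exactly those quoted in the paragraph above.

```python
import math


def partial_variances(sigma2_g, gamma):
    """Map sigma2_g and Gamma entries to the partial-approach variances.

    gamma: dict with keys "A", "B", "C" (within-breed) and "AB" (between A and B).
    """
    out = {b: sigma2_g * (1 - gamma[b] / 2) for b in ("A", "B", "C")}
    # Breed-segregation variance between A and B.
    out["AB"] = sigma2_g * ((gamma["A"] + gamma["B"]) / 2 - gamma["AB"]) / 4
    return out


# Hypothetical values, for illustration only.
example = partial_variances(1.0, {"A": 0.4, "B": 0.5, "C": 0.6, "AB": 0.1})
assert math.isclose(example["A"], 0.8)
assert math.isclose(example["AB"], 0.0875)
```

Since a variance cannot be negative, the segregation term is only meaningful when \(\gamma _{\mathcal {A},\mathcal {B}}\le (\gamma _{\mathcal {A}}+\gamma _{\mathcal {B}})/2\); a negative result would point to an inconsistent \(\varvec{\Gamma }\).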

### Genomic model for crossbred \(\mathcal {C}(\mathcal {A}\mathcal {B})\) performance: partial genetic approach

Marker-based partial relationship matrices are constructed by tracing breed of origin of alleles and defining relationships according to breed of origin. Assume that breed of origin of alleles can be determined for all animals and define breed-specific allele content matrices as: matrix \(\mathbf {m}^b\) with entries 0, 1, 2 for purebred *b* animals, matrices \(\mathbf {z}^{\mathcal {A}}\) and \(\mathbf {z}^{\mathcal {B}}\) with entries 0, 1 for paternal and maternal alleles, respectively, for crossbred \(\mathcal {A}\mathcal {B}\) animals, matrix \(\mathbf {z}^{\mathcal {C}}\) with entries 0, 1 for paternal allele of crossbred \(\mathcal {C}(\mathcal {A}\mathcal {B})\) animals, and finally matrices \(\mathbf {z}_p^{\mathcal {A}}\) and \(\mathbf {z}_p^{\mathcal {B}}\) with entries 0, 1, respectively, for crossbred \(\mathcal {C}(\mathcal {A}\mathcal {B})\) animals when the breed-specific allele is inherited and zero otherwise. This means that breed of origin of each allele needs to be traced, usually by a phasing software [18].

(*i*, *j*) equal to \(p^{\mathcal {A}}_j\) when the crossbred \(\mathcal {C}(\mathcal {A}\mathcal {B})\) individual *i* inherited an \(\mathcal {A}\) specific allele and zero otherwise, and \(s^{\mathcal {A}}\) is a scaling parameter. The marker-based breed \(\mathcal {B}\) specific partial relationship matrix \(\mathbf {G}^{\mathcal {B}}\) is defined similarly to \(\mathbf {G}^{\mathcal {A}}\), and the marker-based breed \(\mathcal {C}\) specific partial relationship matrix is

*n* is the number of markers. Note that diagonal elements of \(\mathbf {G}^{\mathcal {A}\mathcal {B}}\) equal diagonal elements of \(\mathbf {A}^{\mathcal {A}\mathcal {B}}\) (i.e. 1/2). Off-diagonal elements of \(\mathbf {G}^{\mathcal {A}\mathcal {B}}\) measure whether pairs of individuals share more alleles from a particular parental breed (\(\mathcal {A}\) or \(\mathcal {B}\)) than expected. Expectations of off-diagonal elements of \(\mathbf {G}^{\mathcal {A}\mathcal {B}}\) equal off-diagonal elements of \(\mathbf {A}^{\mathcal {A}\mathcal {B}}\) (i.e. 0).

The breed-specific partial marker-based relationship matrices above require estimates of breed-specific allele frequencies. Such estimates can be obtained from marker genotypes of purebred animals and breed-specific marker alleles for crossbred animals. Furthermore, there is a need to adjust these matrices to be compatible with partial pedigree relationship matrices similar to Christensen et al. [11, 19], i.e. \(\mathbf {G}^b_a=\mathbf {G}^b\beta _b+\alpha _b\mathbf {J}^b\) where \(\alpha _b\) and \(\beta _b\) are parameters and \(\mathbf {J}^b\) is a matrix with entries \(\mathbf {J}^b_{i,i^{\prime }}=f^b_i f^b_{i^{\prime }}\). The scaling parameters \(s^b\) in marker-based relationship matrices \(\mathbf {G}^{b}\), b=\(\mathcal {A},\mathcal {B},\mathcal {C}\) are unspecified above, since the compatibility adjustment involves a scaling parameter \(\beta _b\) for each breed, and therefore \(s^b\) can be arbitrary. On the other hand, matrix \(\mathbf {G}^{\mathcal {A}\mathcal {B}}\) does not need an adjustment.

Finally, to incorporate the fact that marker genotypes only capture a fraction of the genetic effects, the partial marker-based relationship matrices \(\mathbf {G}^{b}\), \(b\in \mathcal {A},\mathcal {B},\mathcal {C}\) and \(\mathbf {G}^{\mathcal {A}\mathcal {B}}\) above may be replaced by matrices \(\mathbf {G}^{b}_{\omega }=\mathbf {G}^{b}(1-\omega )+\mathbf {A}^{b}\omega\), \(b\in \mathcal {A},\mathcal {B},\mathcal {C}\) and \(\mathbf {G}^{\mathcal {A}\mathcal {B}}(1-\omega )+\mathbf {A}^{\mathcal {A}\mathcal {B}}\omega\), respectively, where \(\omega\) is the fraction of genetic variance not captured by marker genotypes [4].
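The two adjustments described above, the compatibility adjustment \(\mathbf {G}^b_a=\mathbf {G}^b\beta _b+\alpha _b\mathbf {J}^b\) and the blending with pedigree relationships using \(\omega\), are both linear operations on matrices. The sketch below shows them with NumPy on toy inputs. The function names and all numerical values are hypothetical, and chaining the two steps in this order is an assumption made for illustration.

```python
import numpy as np


def compatibility_adjust(g, breed_content, alpha, beta):
    """Return beta * G + alpha * J, where J[i, k] = f_i * f_k."""
    j = np.outer(breed_content, breed_content)
    return beta * g + alpha * j


def blend_with_pedigree(g, a, omega):
    """Return (1 - omega) * G + omega * A."""
    return (1.0 - omega) * g + omega * a


# Toy inputs for two animals (hypothetical values).
G = np.eye(2)                           # marker-based partial relationships
A = np.array([[1.0, 0.5], [0.5, 0.5]])  # pedigree-based partial relationships
f = np.array([1.0, 0.5])                # breed contents of the two animals

G_a = compatibility_adjust(G, f, alpha=0.1, beta=0.9)
G_w = blend_with_pedigree(G_a, A, omega=0.2)
assert np.allclose(G_a, [[1.0, 0.05], [0.05, 0.925]])
assert np.allclose(G_w, [[1.0, 0.14], [0.14, 0.84]])
```

Because \(\mathbf {J}^b\) is built from breed contents, the additive shift \(\alpha _b\) affects a crossbred animal in proportion to how much of breed *b* it carries.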

### Genomic model for crossbred \(\mathcal {C}(\mathcal {A}\mathcal {B})\) performance: common genetic approach

*s* is a scaling parameter. As in Christensen [20] and Legarra et al. [15], we chose common allele frequencies, i.e. \(p_j=0.5\), and then determined the parameters in matrix \(\varvec{\Gamma }\) and parameter *s* such that the pedigree-based and marker-based relationship matrices are compatible. Parameters in matrix \(\varvec{\Gamma }\) and scaling parameter *s* can be estimated by matching \(\mathbf {A}(\varvec{\Gamma })\) and \(\mathbf {G}\) for purebred individuals; see Legarra et al. [15]. For example, if genotyping is done in each of the three pure breeds then the following system of equations can be used to determine the parameters:

*s* and can therefore be solved directly to obtain estimates.
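For orientation, a marker-based relationship matrix with common allele frequencies \(p_j=0.5\) can be sketched as follows. This is an assumption-laden illustration: it uses the common cross-product form in which genotypes coded 0, 1, 2 are centred by \(2p_j=1\) and the cross-product is divided by a scaling parameter *s*. In the example, *s* is set to \(2\sum _j p_j(1-p_j)=n/2\) only as a placeholder, whereas the text above estimates *s* together with \(\varvec{\Gamma }\). All names and the toy genotypes are hypothetical.

```python
import numpy as np


def common_g(genotypes, s):
    """Cross-product relationship matrix with allele frequencies fixed at 0.5.

    genotypes: (animals x markers) array of allele counts coded 0, 1, 2.
    s: scaling parameter (treated as given here).
    """
    z = np.asarray(genotypes, dtype=float) - 1.0  # centre by 2 * p_j with p_j = 0.5
    return z @ z.T / s


# Toy genotypes for two animals and three markers (hypothetical).
M = np.array([[0, 1, 2],
              [2, 1, 0]])
n_markers = M.shape[1]
G = common_g(M, s=n_markers / 2)  # placeholder scaling, see lead-in
assert np.allclose(G, [[4 / 3, -4 / 3], [-4 / 3, 4 / 3]])
```

No breed-of-origin tracing is needed for this matrix, since the same centring is applied to purebred and crossbred genotypes.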

### Genetic models for both purebred and crossbred \(\mathcal {C}(\mathcal {A}\mathcal {B})\) performances

In the previous sections, partial genetic and common genetic models for additive genetic effects for crossbred \(\mathcal {C}(\mathcal {A}\mathcal {B})\) performance were presented, and in both cases genomic versions of the models and combined relationship matrices were shown. Now, we show what the genetic variances and covariances for the model in Eq. (2) look like in the two cases.

In the common genetic case, there are three parameters \(\sigma _{a,\mathcal {A},\mathcal {B}}\), \(\sigma _{a,\mathcal {A},\mathcal {C}}\) and \(\sigma _{a,\mathcal {B},\mathcal {C}}\) which are genetic covariances between purebred performances, and these parameters are not present in the partial genetic case. The reason is that they would not be identifiable since there is no specification of the relationships across breeds in the partial genetic case. In the common genetic case, the identifiability of \(\sigma _{a,\mathcal {A},\mathcal {B}}\), \(\sigma _{a,\mathcal {A},\mathcal {C}}\) and \(\sigma _{a,\mathcal {B},\mathcal {C}}\) relies on the genomic relationships between pairs of animals in different breeds. In the partial genetic case, there are four genetic parameters for crossbred performance, \(\sigma _{g,\mathcal {A}}^2\), \(\sigma _{g,\mathcal {B}}^2\), \(\sigma _{g,\mathcal {C}}^2\) and \(\sigma _{g,\mathcal {A}\mathcal {B}}^2\) that scale each of the four partial relationship matrices, whereas in the common genetic case there is only one such parameter \(\sigma _g^2\). As explained in a previous section, there is a correspondence between these parameters via the parameters in matrix \(\varvec{\Gamma }\) as follows: \(\sigma _{g,b}^2=\sigma _g^2(1 - \gamma _b/2)\), \(b=\mathcal {A},\mathcal {B},\mathcal {C}\), \(\sigma ^2_{g,\mathcal {A}\mathcal {B}}=\sigma ^2_g((\gamma _{\mathcal {A}}+\gamma _{\mathcal {B}})/2-\gamma _{\mathcal {A},\mathcal {B}})/4\). However, note that there is a difference between estimating \(\sigma _{g,\mathcal {A}}^2\), \(\sigma _{g,\mathcal {B}}^2\), \(\sigma _{g,\mathcal {C}}^2\) and \(\sigma _{g,\mathcal {A}\mathcal {B}}^2\) from phenotypes as in the partial genetic case, and determining these from a general \(\sigma ^2_g\) and parameters in \(\varvec{\Gamma }\), which are estimated based on marker genotypes as in the common genetic case.

## Discussion

For three-way crossbreeding, we presented models based on pedigree-based, marker-based and combined relationships. Using combined relationship matrices results in a model where both pedigree and marker genotypes are used simultaneously for genetic evaluation, i.e. a single-step method for genomic evaluation. This paper provides the models and mathematical formulas, but a numerical implementation is needed before the methods are ready for use in practice. Such methods make it possible to incorporate phenotypes and genotypes on crossbreds into an existing genetic evaluation system, assuming that such a system is based on a single-step method.

The models for three-way crossbreeding investigated in this paper were four-variate models where each variable was measured in a specific population, \(\mathcal {A}\), \(\mathcal {B}\), \(\mathcal {C}\) or \(\mathcal {C}(\mathcal {A}\mathcal {B})\). The main scenario that we have in mind is a scenario where the four variables represent the same biological trait measured in four different genetic backgrounds and possibly different environments, but in principle the four variables could also be different biological traits. An extension of the model to a situation where multiple biological traits are measured in each of the four populations is in principle straightforward since the additive relationship matrices are the same, although in practice it may require the estimation of a very large number of genetic parameters. Extending the approaches to other types of models that are implemented in standard animal breeding software, like threshold models, models with indirect genetic effects, models for test-day records, etc. is also in principle straightforward. Finally, modifying the models to other scenarios with data recording, for example with records on \(\mathcal {A}\mathcal {B}\) individuals or no records on one of the pure breeds, is also straightforward. In general, designing data recording for these complicated models is an issue, and for example to obtain precise estimates of the genetic correlation parameters, it would be important that the relationships between crossbred animals with records and purebred animals with records are close.

Two types of approaches for constructing additive relationships were presented, based on different assumptions about allele substitution effects of causal loci or SNPs. In the partial genetic approach, allele substitution effects of SNPs were assumed independent between breeds, whereas in the common genetic approach, they were assumed to be the same in different breeds. The partial genetic approach requires that alleles are traced according to breed of origin, which is feasible in some scenarios but may be difficult with sufficient accuracy in others. In particular, when crossbred \(\mathcal {C}(\mathcal {A}\mathcal {B})\) animals are genotyped, a reasonable requirement is that breed \(\mathcal {C}\) fathers are also genotyped which would make the tracing of the breed \(\mathcal {C}\) paternal allele feasible, but the tracing of the breed of origin (\(\mathcal {A}\) or \(\mathcal {B}\)) of the maternal allele may be more uncertain and depend on whether \(\mathcal {A}\mathcal {B}\) mothers are genotyped (may not be due to logistical issues), maternal grandfathers are genotyped and maternal grandmothers are genotyped (may be difficult to obtain if these are from multiplier herds). An advantage of the common genetic approach is that the marker-based relationship matrix is easier to construct because tracing the breed of origin of alleles is not required, but a disadvantage may be the computational burden of using a larger relationship matrix. In addition, parameters in matrix \(\varvec{\Gamma }\) need to be estimated and the sensitivity of genetic evaluation to these estimates is unknown. Future research using simulated and real data is needed to clarify the differences between the two approaches.

Other terminal crossbreeding systems are of interest in pig production. Models for two-way crossbreeding are relevant for sow-traits measured on animals from breed \(\mathcal {A}\) and \(\mathcal {B}\) and cross \(\mathcal {A}\mathcal {B}\), and such models were presented in Christensen et al. [11] using partial genetic relationship matrices. An alternative to this partial genetic approach would be to use the common genetic approach presented here. The four-way crossbreeding system where crossbred \(\mathcal {C}\mathcal {D}\) sires are mated to \(\mathcal {A}\mathcal {B}\) dams to produce \((\mathcal {C}\mathcal {D})(\mathcal {A}\mathcal {B})\) pigs for slaughter, is also used in pig production. The approaches in this paper can be extended to such a system, and the resulting model would be a five-variate model. Using the partial genetic approach, there would be four breed-specific partial relationship matrices and two breed-segregation partial relationship matrices, and the corresponding model for purebred and crossbred performances would contain 14 genetic parameters, whereas using the common genetic approach, the model for purebred and crossbred performances would contain 15 genetic parameters.

Many papers have reported genetic correlations between purebred and crossbred performances [21–26]. The reported estimated correlations ranged from 0.38 to 0.946, depending on trait and on differences in the environment, and in general with relatively high standard error on the estimates. The higher the genetic correlation, the less gain there will be by including crossbred data into the genetic evaluation system. All these results are from two-way crosses, and the authors are not aware of publications based on data from three-way crossbreeding where data in purebred and crossbred populations are considered to be different traits. The models presented in this paper should be useful to investigate such data from three-way crossbreeding.

## Conclusion

Models for genetic evaluation in the three-way crossbreeding system are presented. These models provide estimated breeding values for both purebred and crossbred performances, and can use pedigree-based or marker-based relationships, or combined relationships based on both pedigree and marker information. This provides a framework that allows information from three-way crossbred animals to be incorporated into a genetic evaluation system.

## Declarations

### Authors’ contributions

OFC conceived the study and derived the formulas with help from AL. OFC took the lead in writing the manuscript, and AL, GS and MSL helped with the writing and with discussions. All authors read and approved the final manuscript.

### Acknowledgements

The work was performed in a project funded through the Green Development and Demonstration Programme (grant no. 34009-12-0540) by the Danish Ministry of Food, Agriculture and Fisheries, the Pig Research Centre and Aarhus University. AL acknowledges financing from INRA SelGen metaprogram in projects X-Gen, EpiSel, SelHet and SelDir, and is grateful to the genotoul bioinformatics platform Toulouse Midi-Pyrenees for providing computing storage resources. Comments and corrections from reviewers and editors are also acknowledged.

### Competing interests

The authors declare that they have no competing interests.

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

## References

1. Meuwissen THE, Hayes BJ, Goddard ME. Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001;157:1819–29.
2. Legarra A, Aguilar I, Misztal I. A relationship matrix including full pedigree and genomic information. J Dairy Sci. 2009;92:4656–63.
3. Aguilar I, Misztal I, Johnson DL, Legarra A, Tsuruta S, Lawlor TJ. Hot topic: A unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluations of Holstein final score. J Dairy Sci. 2010;93:743–52.
4. Christensen OF, Lund MS. Genomic prediction when some animals are not genotyped. Genet Sel Evol. 2010;42:2.
5. Ibáñez-Escriche N, Fernando RL, Toosi A, Dekkers JCM. Genomic selection of purebreds for crossbred performance. Genet Sel Evol. 2009;41:12.
6. Kinghorn BP, Hickey JM, van der Werf JHJ. Reciprocal recurrent genomic selection for total genetic merit in crossbred individuals. In: Proceedings of the 9th World Congress on Genetics Applied to Livestock Production, 1–6 August 2010; Leipzig; 2010. paper 0036. http://www.kongressband.de/wcgalp2010/assets/pdf/0036.
7. Zeng J, Toosi A, Fernando RL, Dekkers JCM, Garrick DJ. Genomic selection of purebred animals for crossbred performance in the presence of dominant gene action. Genet Sel Evol. 2013;45:11.
8. Wei M, van der Werf JHJ. Maximizing genetic response in crossbreds using both purebred and crossbred information. Anim Prod. 1994;59:401–13.
9. Wei M, van der Werf JHJ, Brascamp EW. Relationship between purebred and crossbred parameters: II genetic correlation between purebred and crossbred performance under the model with two loci. J Anim Breed Genet. 1991;108:262–9.
10. Baumung R, Sölkner J, Essl A. Correlation between purebred and crossbred performance under a two-locus model with additive by additive interaction. J Anim Breed Genet. 1997;114:89–98.
11. Christensen OF, Madsen P, Nielsen B, Su G. Genomic evaluation of both purebred and crossbred performances. Genet Sel Evol. 2014;46:23.
12. Stuber CW, Cockerham CC. Gene effects and variances in hybrid populations. Genetics. 1966;64:1279–86.
13. Lo LL, Fernando RL, Grossman M. Covariance between relatives in multibreed populations: additive model. Theor Appl Genet. 1993;87:423–30.
14. García-Cortés LA, Toro MA. Multibreed analysis by splitting the breeding values. Genet Sel Evol. 2006;38:601–15.
15. Legarra A, Christensen OF, Vitezica ZG, Aguilar I, Misztal I. Ancestral relationships using metafounders: finite ancestral populations and across population relationships. Genetics. 2015;200:455–68.
16. Henderson CR. A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics. 1976;32:69–83.
17. Colleau J-J. An indirect approach to the extensive calculation of relationship coefficients. Genet Sel Evol. 2002;34:409–21.
18. Sargolzaei M, Chesnais JP, Schenkel FS. A new approach for efficient genotype imputation using information from relatives. BMC Genomics. 2014;15:478.
19. Christensen OF, Madsen P, Nielsen B, Ostersen T, Su G. Single-step methods for genomic evaluation in pigs. Animal. 2012;6:1565–71.
20. Christensen OF. Compatibility of pedigree-based and marker-based relationship matrices for single-step genetic evaluation. Genet Sel Evol. 2012;44:37.
21. Brandt H, Täubert H. Parameter estimates for purebred and crossbred performances in pigs. J Anim Breed Genet. 1998;115:97–104.
22. Kiszlinger HN, Farkas J, Köver G, Onika-Szvath S, Nagy I. Genetic parameters of growth traits from a joint evaluation of purebred and crossbred pigs. Agric Cons Sci. 2011;76:223–6.
23. Wei M, van der Werf JHJ. Genetic correlation and heritabilities for purebred and crossbred performance in poultry egg production traits. J Anim Sci. 1995;73:2220–6.
24. Zumbach B, Misztal I, Tsuruta S, Holl J, Heering W, Long T. Genetic correlations between two strains of Durocs and crossbreds from differing production environments for slaughter traits. J Anim Sci. 2007;85:901–8.
25. Lutaaya E, Misztal I, Mabry JW, Short T, Timm HH, Holzbauer R. Genetic parameter estimates from joint evaluation of purebreds and crossbreds in swine using the crossbred model. J Anim Sci. 2001;79:3002–7.
26. Bloemhof K, Kause A, Knol EF, van Arendonk JAM, Misztal I. Heat stress effects on farrowing rate in sows: genetic parameter estimation using within-line and crossbred models. J Anim Sci. 2011;90:2009–119.
27. de los Campos G, Sorensen D, Gianola D. Genomic heritability: what is it? PLoS Genet. 2015;11:1005048.