### Simulation

Datasets were simulated to compare the different models, in terms of accuracy of predicted breeding values. An effective population size of 500 animals was simulated, including 250 females and 250 males. This structure was kept constant for 1000 generations. Mating was performed by drawing the parents of an animal randomly from the animals of the previous generation. In total, 25 replicated datasets were simulated.

The simulated genome spanned 5 M (Morgan). Ten thousand bi-allelic loci were simulated across five chromosomes, with equal 0.05 cM distances between adjacent loci. In the first generation, animals received at random alleles 1 or 2 with equal chance. In the 1000 generations thereafter, each locus had a mutation rate of 2.5 × 10^{-5}, so that a mutation drift balance was reached within a limited number of generations [13]. A mutation caused allele 1 to become allele 2, and vice versa. Genotypes from the last four generations, as well as pedigree information of the last six generations, were retained for analysis. In total, on average across replicates, 5,655 loci segregated in the last four generations. These four generations will hereafter be referred to as generations 1 to 4.

Two hundred loci segregating in generations 1 to 4 and evenly distributed across the genome, were drawn to be QTL loci. These QTL were used to simulate two traits, with heritabilities of 0.9 and 0.6, reflecting average offspring performances such as daughter yield deviations [14] or de-regressed proofs [15]. For example, if one considers that the animals in the reference population reflect dairy bulls each with 100 daughters and their phenotypic records, the chosen heritabilities of 0.6 and 0.9 correspond to traits with heritabilities at the phenotypic level of 0.06 and 0.33, respectively, i.e. a fertility and a production trait in dairy cattle. The heritabilities of 0.6 and 0.9 were derived using the formula
e.g. [16], where
is the reliability of selection (in this case the heritability used to simulate the phenotypes of the animals in the reference population), *n* is the number of daughters and *h*
^{
2
} is the heritability at the phenotypic level. The two traits were simulated by drawing the allele substitution effects of each QTL locus from a multivariate normal distribution that followed the simulated genetic correlation. Three genetic correlations were considered, i.e. 0.2, 0.5, or 0.8.

### Models

Four different models were used to estimate breeding values. The general multi-trait model was:

where *y*
_{
ij
} is the phenotypic record for trait j of animal *i, μ*
_{
j
} is the overall mean for trait *j, animal*
_{
ij
} is the random polygenic effect of animal *i* for trait *j, SNP*
_{
ijkl
} is a random effect for allele *l* on trait *j* at locus *k* of animal *i*, and *e*
_{
ij
} is a random residual for animal *i*.

The first model omitted the SNP effects, and used a relationship matrix based on the pedigree retained to estimate the polygenic effects and the polygenic (co)variances of traits 1 and 2 (model A). The second model was the same as the first model, but included a genomic relationship (

**G**) matrix calculated by using all the markers to estimate the polygenic effects (model G). This

**G** matrix was calculated as described by VanRaden [

17]:

where *p*
_{
i
} is the frequency of the second allele at locus *i*, and **Z** is derived from genotypes of all included animals, by subtracting 2 times the allele frequency expressed as a difference of 0.5, i.e. 2(*p*
_{
i
} - 0.5), from matrix **M** that specifies the marker genotypes for each individual as -1, 0 or 1. Here, we used allele frequencies of 0.5 that reflected allele frequencies in the base generation i.e. in the very first generation of the simulation.

The third and fourth models included both a polygenic effect with a pedigree-based relationship matrix, and SNP effects. The difference between the third and fourth model resulted from considering one (model 3) or two (model 4) distribution(s) for the SNP effects.

SNP effects, in the general model denoted as

*SNP*
_{
ijkl
}, were estimated in models 3 and 4 as

*q*
_{
ijkl
}
*×*
*v*
_{
jk
} , according to Meuwissen and Goddard [

11], where

*q*
_{
ijkl
} is the size of the effect of allele

*l* at locus

*k* and

*v*
_{
jk
} is a scaling factor in the direction vector for locus

*k* that scales the effect at locus

*k* for trait

*j*. In the original implementation by Meuwissen and Goddard [

11], the variance of the direction vector

*v*
_{
.k
}, denoted as

**V**, is sampled per locus for each trait

*j* separately, without considering covariances between the traits across loci. Here, in both models 3 and 4 and for the estimation of

**V**, covariances between traits across loci are considered. Therefore, the prior distribution for

**V** in this case was, according to Meuwissen and Goddard [

11]:

where

**S**
_{0(no) }was chosen such that it reflected the total genetic (co)variance between traits

*n* and

*o*, divided by the total number of SNP.

**V..** was sampled from the following conditional

*m* variate-inverted Wishart distribution with (

*nloc* + 10) degrees of freedom:

where
, *nloc* = number of evaluated marker loci, and 10 is the number of degrees of freedom for the prior distribution.

Model 4 was similar to model 3, but included a QTL-indicator (

*I*
_{
k
}) for each bracket, that had a value of either 0 or 1. According to Meuwissen and Goddard (2004), in this case the prior distribution of

**V..** is similar to that from model 3, but here

**S**
_{
0(..)
} was chosen such that it reflected the total genetic (co)variances of traits

*n* and

*o*, divided by the total number of expected QTL instead of the number of SNP. Furthermore,

**V..** was sampled from an inverted Wishart distribution as described above for model 3, but in this case:

Where the QTL-indicator

*I*
_{
k
} was sampled from:

where *p*
_{
k
} is the prior QTL probability, i.e. the probability that *I*
_{
k
} is equal to 1, which follows a Bernoulli distribution. Prior QTL probabilities used in the analyses reflected the prior assumption that 100 QTL underlie both traits.

The third model is referred to as model BCπ0, since this model is similar to a model that is termed BayesCπ0 [18]. The fourth model is referred to as Bayesian Stochastic Search Variable Selection (BSSVS) e.g. [10].

In all the models, the residuals were assumed to be normally distributed *N*(**0**, **R**), where **R** is the *m* × *m* residual covariance matrix. In models A, BCπ0 and BSSVS, the polygenic values were assumed to be normally distributed *N*(**0**, **A** ⊗ **G**
_{
A
}), where A is the additive relationship matrix and **G**
_{
A
} is the *m* × *m* polygenic covariance matrix. Matrices **R** and **G**
_{
A
} were both sampled in the Gibbs sampler from an inverted Wishart distribution, with a uniform prior distribution.

Models A, BCπ0 and BSSVS were performed using Gibbs sampling with residual updating. Model A was run for 5,000 cycles, discarding 2,000 cycles for burn-in. Models BCπ0 and BSSVS were run for 10,000 cycles, discarding 2,000 cycles for burn-in. Except for the multi-trait runs in the second scenario where 30,000 cycles were run with 10,000 cycles discarded for burn-in, since initial results showed that more cycles were required for convergence in that scenario. Model G was performed using ASReml [19], because initial analyses using the Gibbs sampler showed slow convergence of the genetic variances for scenario 2.

In the multi-trait analyses of scenario 2 for the models that were analyzed using the Gibbs sampler, residuals for missing phenotypes in generation 2 were sampled using an EM algorithm. The missing residuals were drawn from the following distribution, according to VanTassell and VanVleck [

20]:

where *m* stands for missing and *o* for observed records. This allowed us to sample the effects in the model using residual updating. Residual (co-)variance matrices were estimated conditional only on residuals linked to observed records.

Each simulated dataset and scenario were analyzed three times with all four models: first traits 1 and 2 were analyzed separately in a single-trait (ST) model, and then both traits were analyzed together in a multi-trait (MT) model.