We conducted an in silico divergent genomic selection experiment in Drosophila using the sequence based virtual breeding (SBVB) software [15]. SBVB can use real sequence data for founder animals in a simulated population and simulates the genomes and phenotypes of offspring in a very efficient and flexible manner according to specified genetic architectures. Although SBVB is essentially a gene-dropping algorithm, it allows implementing selection by efficiently reading and writing haplotypes (see Additional file 1: Figure S1).

### Data

We downloaded the public SNP data from the 205 inbred lines of the *Drosophila melanogaster* Genetic Reference Panel v2 (DGRP Freeze 2.0, http://dgrp2.gnets.ncsu.edu/ [17]). SNPs from chromosome 4 and indels were removed, resulting in 3,954,651 SNPs that were used for analyses. Chromosome 4 in Drosophila is normally ignored in population genetic studies since it is very small, does not recombine and is mostly heterochromatic. Missing values were imputed with Beagle4 [18]. SBVB allows the specification of variable recombination rates along the genome and between sexes as well as sex chromosomes. We used the genetic map from Flybase (www.flybase.org) and allowed for the fact that no recombination occurs in male Drosophila.

### Genetic architecture

Four hundred causal SNPs, i.e. quantitative trait nucleotides (QTN) were considered in the analysis. We used the 103 SNPs and their estimated additive effects on the phenotype “chill coma recovery time” which are reported by [19] in their supplementary Table 3. For those SNPs that were identified in both sexes (12 out of 103), we used the average between-sex effect size as QTN effect. Otherwise, the specific sex effect was taken as the additive genetic effect in both sexes. In addition, we used 297 randomly chosen SNPs with their additive effects that were simulated by using an exponential distribution with rate parameter equal to 5. The purpose of this was to generate a larger number of loci with smaller effects than those detected by the association study in [19], since the quantitative trait loci (QTL) that are reported as significant are typically those with the largest effects, leading to a marked upward bias [20]. The sign (− or +) of the simulated additive effect sizes was sampled with equal probability.

Genotypic values were simulated according to two extreme architectures.

#### Epistatic architecture

All loci (400 = 103 + 297) were randomly grouped in 200 epistatic pairs. Second-order epistasis followed the complementary model [5], where the genotypic values for the nine possible two-locus genotype combinations are equal to:

$$\begin{array}{*{20}c} {} & {C_{1} C_{1} } & {C_{1} C_{2} } & {C_{2} C_{2} } \\ {B_{1} B_{1} } & z & z & 0 \\ {B_{1} B_{2} } & z & z & 0 \\ {B_{2} B_{2} } & 0 & 0 & 0 \\ \end{array}$$

This architecture assumes equal values for the additive effect of locus *B* and locus *C* and complete dominance (*a* = *d*), with *B*
_{1} and *C*
_{1} being the dominant alleles. The genotypic value ‘*z*’ was computed as the arithmetic mean of the effects of the two original loci, that is, either the effect reported by [19] or that simulated from the exponential distribution. In this architecture, the double mutation *B*
_{1}
*C*
_{1} yields the same phenotype as either one alone. The genotypic value of an individual was computed as the sum of the two-locus genotypic value at each of the 200 epistatic pairs. We chose the complementary model because it is simple to interpret and allows for substantial non-additive genetic effects (~25% of the genetic variance is non-additive with a ‘U’ shape distribution of allele frequencies) [5].

#### Additive architecture

The genotypic value of an individual was obtained by summing all additive genotypic effects across loci, according to each individual’s genotypes (i.e., *a*, 0 or −*a* for *B*
_{1}
*B*
_{1}, *B*
_{1}
*B*
_{2} and *B*
_{2}
*B*
_{2}, respectively). Here, the value ‘*a*’ of each locus is equal to the value ‘*z*’ used for the corresponding epistatic pair previously described.

Each individual’s phenotype was calculated from its genotypic value adding an environmental effect taken from a normal distribution with mean 0 and variance \(\sigma_{e}^{2}\). In both architectures, \(\sigma_{e}^{2}\) was adjusted so that the broad sense heritability (\(H^{2}\)) was 0.5, before selection started. Narrow sense heritability was approximately 0.25 for the epistatic architecture.

### Evaluation model

Breeding values were predicted using GBLUP [21]. Briefly, GBLUP uses SNPs to build genomic relationship matrices (\({\mathbf{G}}\)). Four alternative models were used to evaluate individuals.

#### Additive model using sequence data (A-SEQ)

The evaluation model included a mean plus additive values distributed as \({\mathbf{a}}\sim N({\mathbf{0}},{\mathbf{G}}\sigma_{a}^{2} )\), where \({\mathbf{G}}\) was obtained using all SNPs. Computation of \({\mathbf{G}}\) is described below.

#### Additive model using causal SNPs (A-QTN)

As above except that \({\mathbf{G}}\) was obtained by using only causal SNPs (QTN).

#### Full epistatic model using sequence data (E-SEQ)

The evaluation model included a mean, (statistical) additive values, distributed as \({\mathbf{a}}\sim N({\mathbf{0}},{\mathbf{G}}\sigma_{a}^{2} )\), a dominant random deviation \({\mathbf{d}}\sim N({\mathbf{0}},{\mathbf{D}}\sigma_{d}^{2} )\), and an (statistical) epistatic effect distributed as \({\mathbf{h}}\sim N({\mathbf{0}},{\mathbf{G}} \ne {\mathbf{G}}\sigma_{p}^{2} )\), where \(\ne\) denotes the Hadamard product [22, 23]; \({\mathbf{G}}\) and \({\mathbf{D}}\) were obtained by using all SNPs. Computation of \({\mathbf{D}}\) is described below.

#### Full epistatic model using causal SNPs (E-QTN)

As above except that \({\mathbf{G}}\) and \({\mathbf{D}}\) were obtained by using only the QTN.

For a given set of markers (all or only causal SNPs), \({\mathbf{G}}\) was obtained from \({\mathbf{MM}}^{\prime } /\sum\nolimits_{{j = 1}}^{k} 2 p_{j} q_{j}\), where the elements of the \({\mathbf{m}}\) vectors for each individual are equal to \(- 2p_{j}\), \(1 - 2p_{j}\), and \(2 - 2p_{j}\) for genotypes *B*
_{1j
}
*B*
_{1j
}, *B*
_{1j
}
*B*
_{2j
} and *B*
_{2j
}
*B*
_{2j
}, respectively [21], \(p_{j}\) is the frequency of allele *B*
_{1j
} for the genotyped individuals of the population, \(q_{j} = 1 - p_{j}\), and \(k\) is the number of SNPs. \({\mathbf{D}}\) was obtained as in [7]:

$${\mathbf{D}} = \frac{{{\mathbf{M}}_{{\mathbf{d}}} {\mathbf{M}}_{{\mathbf{d}}}^{'} }}{{\mathop \sum \nolimits_{j = 1}^{k} 4p_{j}^{2} q_{j}^{2} }},$$

where the elements of the \({\mathbf{m}}_{d}\) vectors for each individual are equal to \(- 2p_{j}^{2}\), \(2p_{j} q_{j}\) and \(- 2q_{j}^{2}\) for genotypes *B*
_{1j
}
*B*
_{1j
}, *B*
_{1j
}
*B*
_{2j
} and *B*
_{2j
}
*B*
_{2j
}, respectively.

Breeding values were predicted with the Bayesian generalized linear regression (BGLR) package [24]. For prediction purposes, variance components were assumed unknown and, thus, estimated simultaneously (marginalized). BGLR implements various Bayesian regression models that were developed for genomic applications, including the GBLUP model. An eigenvalue decomposition of the covariance matrices (\({\mathbf{G}}\) and \({\mathbf{D}}\)) was used, given its good convergence properties [25]. Default prior parameters and 10,000 iterations plus 2000 burn-in cycles were used in the Markov chain Monte Carlo (MCMC) method, resulting in 100 to 150 effective iterations [26], depending on the parameter and replicate. To verify whether 10,000 iterations were sufficient for our purposes, we compared the prediction of breeding values obtained in chains with 10k and 200k iterations, which was on average equal to 0.98 so we used 10k for computational speed.

The linear predictor included an intercept plus a linear regression on additive effects. We also included both a linear regression on dominant effects and a linear regression on epistatic effects for both E-SEQ and E-QTN evaluation models. Gaussian prior densities were used for all the linear regressions. The residual variance prior was assigned a scaled-inverse Chi-square density and the intercept is assigned a flat prior by default. For the other variances, we also used scaled-inverse Chi-squared densities with hyperparameters set by using the default rules in BGLR (see appendix of [24] at http://www.genetics.org/content/suppl/2014/07/09/genetics.114.164442.DC1/164442SI.pdf). In short, the number of degrees of freedom was 5 (which provides a rather uninformative prior) and scale parameter S such that the R^{2} of the model is matched. BGLR was run at each generation of selection, and predictions were obtained using all the phenotypes and genotypes of all the animals of the preceding generations including the present generation.

### Selection scheme

We generated the base population starting with the 205 sequenced lines of DGRP2 and performing 10 generations of random mating to decrease LD and to generate heterozygous individuals, since parents were homozygous at most sites. The size of the generated base population was N = 500. Selection intensity was 10% in both sexes. At each generation, 25 males and 25 dams were chosen based on genomic breeding values that were predicted using different genomic models (A-SEQ, E-SEQ, A-QTN and E-QTN) and randomly mated; each mating produced 20 offspring with an equal sex ratio. At each generation, breeding values were predicted using all molecular and phenotypic information up to that generation. This scheme was continued for seven discrete generations. Both upward and downward selections were performed. We ran 10 replicates of each of the 16 experiments (two genetic architectures × four evaluation models × two directions of selection). In each replicate, the same set of SNP effects was used, but different base populations with different haplotype structures were generated, although all were initiated with the same real Drosophila data.

### Response to selection, prediction accuracy and additive variance over generations

We investigated the influence of the genetic architecture, the direction of selection and the evaluation model on response to selection, genomic prediction accuracy and the narrow sense heritability over generations. For the total cumulative response to selection, we computed the phenotypic mean per generation (N = 500), averaged across replicates and expressed in standard deviation (SD) units of the base population phenotypic distribution. Prediction accuracy was computed as the Pearson correlation between true and predicted breeding values. Under random mating, the breeding value of an individual is defined as twice the phenotypic mean of its offspring since it deviated from the phenotypic population mean. Calculating true breeding values is straightforward under an additive architecture, since they are equal to the simulated genotypic values. However, this equality does not hold for the epistatic architecture. From the simulations, we have the true total genetic values but not the breeding values. In this case, we used the original definition and empirically estimated the ‘true’ breeding value of an individual that generates 1000 offspring, which result from mating the individual to randomly chosen individuals from the same generation. Additive genetic variance was computed as the variance of ‘true’ breeding values among the individuals in the generation of interest.

### Linkage disequilibrium and inbreeding

Long-range LD between causal SNPs was assessed at the beginning (t_{0}) and at the end (t_{7}) of each selection experiment and replicate. A pair-wise r^{2} estimation implemented in PLINK [27], defined as the squared correlation coefficient of genotypes at two loci, was used to measure LD between all causal SNP pairs within a chromosome. The curve of the decay of r^{2} with physical distance was fitted for each experiment by nonlinear regression, using Hill and Weir’s [28] expectation of r^{2}. Genomic inbreeding coefficient estimates (\(F_{\text{h}}\)) were obtained for each individual using PLINK’s—het function [27], which is based on excess SNP homozygosity, as \(F_{{{\text{h}}_{i} }} = \left( {O_{i} - E} \right)/\left( {k - E} \right)\), where \(O_{i}\) is the number of observed homozygous genotype counts for individual \(i\), \(E = \sum\nolimits_{j = 1}^{k} 1 - 2p_{j} \left( {1 - p_{j} } \right)\) is the number of homozygous genotype counts expected by chance for the base population, \(p_{j}\) is the frequency of *B*
_{1} in the base population, and \(k\) is the number of SNPs.