Strategies for implementing genomic selection in family-based aquaculture breeding schemes: double haploid sib test populations

Nirea, Kahsay G; Sonesson, Anna K; Woolliams, John A; Meuwissen, Theo HE

doi:10.1186/1297-9686-44-30

Research
Open access
Published: 30 October 2012

Genetics Selection Evolution volume 44, Article number: 30 (2012)


Abstract

Background

Simulation studies have shown that accuracy and genetic gain are increased in genomic selection schemes compared to traditional aquaculture sib-based schemes. In genomic selection, accuracy of selection can be maximized by increasing the precision of the estimation of SNP effects and by maximizing the relationships between test sibs and candidate sibs. Another means of increasing the accuracy of the estimation of SNP effects is to create individuals in the test population with extreme genotypes. The latter approach was studied here with creation of double haploids and use of non-random mating designs.

Methods

Six alternative breeding schemes were simulated in which the design of the test population was varied: test sibs inherited maternal (Mat), paternal (Pat) or a mixture of maternal and paternal (MatPat) double haploid genomes or test sibs were obtained by maximum coancestry mating (MaxC), minimum coancestry mating (MinC), or random (RAND) mating. Three thousand test sibs and 3000 candidate sibs were genotyped. The test sibs were recorded for a trait that could not be measured on the candidates and were used to estimate SNP effects. Selection was done by truncation on genome-wide estimated breeding values and 100 individuals were selected as parents each generation, equally divided between both sexes.

Results

Results showed a 7 to 19% increase in selection accuracy and a 6 to 22% increase in genetic gain in the MatPat scheme compared to the RAND scheme. These increases were greater with lower heritabilities. Among all other scenarios, i.e. Mat, Pat, MaxC, and MinC, no substantial differences in selection accuracy and genetic gain were observed.

Conclusions

In conclusion, a test population designed with a mixture of paternal and maternal double haploids, i.e. the MatPat scheme, increases substantially the accuracy of selection and genetic gain. This will be particularly interesting for traits that cannot be recorded on the selection candidates and require the use of sib tests, such as disease resistance and meat quality.

Background

In traditional aquaculture breeding schemes, selection for traits that cannot be measured on the selection candidates (e.g. disease resistance and fillet quality) is based on a performance test of sibs of the candidates, i.e. information on test sibs is used to calculate breeding values for the selection of parents. This is necessary because measuring meat quality traits requires killing the fish, and fish that have been challenge-tested for disease resistance cannot be used as breeding stock. However, a sib test exploits only the between-family component of the genetic variance, i.e. 50% of the total genetic variance of the candidates. Recently, with the advent of high-throughput genotyping of genetic markers, genomic selection[1] has been taken up by animal breeders. With genomic selection, the total genetic value of the selection candidates is predicted based on the simultaneous estimation of single nucleotide polymorphism (SNP) effects using a set of individuals that have been genotyped and phenotyped[1]. Compared to traditional genetic evaluation methodologies and marker-assisted selection, genomic selection can increase the accuracy of selection for an individual without a phenotype, provided that the test set is sufficiently large and relevant to the selected population[2]. Genomic selection is increasingly used in dairy cattle breeding[3–5] and in plant breeding[6, 7] but is not yet used in selective breeding in aquaculture.

Simulation studies have examined possible strategies to implement genomic selection in family-based aquaculture breeding schemes, generally following two stages[8]: first, SNP marker effects are estimated in a test population consisting of sibs of the selection candidates and second, genome-wide breeding values of the genotyped selection candidates are estimated by summing up the estimated SNP marker effects. The benefits reported from these strategies are promising. In one study[8], in which sibs were performance-tested every generation, the accuracy of the estimated breeding values of selection candidates increased, which increased genetic gain because genetic gain is directly related to the accuracy of selection. In addition, genomic selection more than halved the rate of inbreeding compared to traditional BLUP selection using similar resources. A major contributor to these results is the accurate estimation of the within-family variance with genomic selection. Other studies have supported these findings[9].

Two factors are important for maximizing the accuracy of genomic breeding values: (1) increasing the precision of estimates of SNP marker effects in the test population; this can be achieved by increasing the number of animals in the test population and by increasing the number of SNP markers sufficiently to capture the genetic variance throughout the genome[10, 11]; (2) maximising the relationship between individuals in the test and candidate populations[12].

Another means of increasing the accuracy of estimates of SNP effects is to have the test population consist of extreme genotypes. This approach has been exploited with the use of double haploids for QTL mapping in fish[13]. Double haploids are homozygous for all loci and thus achieve in a single generation more homozygosity than 10 generations of continuous full-sib mating[14]. Double haploids are produced by chromosome manipulation techniques such as gynogenesis and androgenesis, which produce female and male double haploids, respectively. With both these techniques, the duplicated chromosomes can be combined either before (mitotic) or after recombination (meiotic). Although the availability of double haploids is a major advantage in fish breeding, a number of drawbacks have also been reported, including technological challenges, costs of implementation, and the low viability of the progeny due to inbreeding depression[15]. An alternative approach to increasing homozygosity is to use non-random mating designs such as maximum coancestry mating.

Based on these considerations, it is hypothesized that designing a test population using double haploids or non-random mating can increase the accuracy of estimates of SNP effects in test sibs, which in turn will increase the accuracy of predicted breeding values when applied in genomic selection schemes. Given the reliance of many aquaculture schemes on sib testing, this hypothesis was tested by simulating a typical breeding scheme in fish.

Methods

Simulation of populations was carried out in two steps: (1) to create base populations (G0) with a set of genomic data and (2) to simulate breeding schemes derived from these base populations. Details are presented in the following section.

Simulation of the base population (generation G0)

We simulated a Fisher-Wright population with an effective population size of 1000 (500 males and 500 females) for 4000 generations to construct the base population G0. Four thousand generations has been shown to be sufficient to achieve mutation-drift equilibrium and stationary distributions of pair-wise linkage disequilibrium[8]. Within each of these generations, 500 males and 500 females were produced by random selection and mating of a sire and dam, with replacement after each mating.

A diploid genome with 10 chromosomes of 1 Morgan (M) each was simulated. SNP mutations and recombinations were introduced every generation at a rate of 10⁻⁸ per base pair per meiosis, assuming 10⁸ base pairs per M, which is close to the infinite sites mutation model[16]. SNP were passed from parent to offspring following Mendelian inheritance and recombination followed the Haldane mapping function[17].
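The no-interference assumption behind the Haldane mapping function can be illustrated with a minimal Python sketch. It is not the simulation code used in this study: the function names, the marker positions expressed in Morgans and the random choice of starting strand are assumptions made for illustration. Crossovers are placed as a Poisson process with a rate of one per Morgan, so that a 1 M chromosome receives on average one crossover per meiosis.

```python
import random


def crossover_positions(length_morgan, rng):
    """Crossover positions (in Morgans) without interference:
    a Poisson process with rate 1 per Morgan along the chromosome."""
    positions = []
    pos = rng.expovariate(1.0)          # distance to the first crossover
    while pos < length_morgan:
        positions.append(pos)
        pos += rng.expovariate(1.0)     # distance to the next crossover
    return positions


def make_gamete(hap_a, hap_b, marker_pos, length_morgan, rng):
    """Build one recombinant haplotype from the two parental haplotypes.

    marker_pos must be sorted and hold the position of each locus in Morgans.
    """
    breaks = crossover_positions(length_morgan, rng)
    strands = (hap_a, hap_b)
    current = rng.randrange(2)          # strand copied at the chromosome start
    gamete = []
    k = 0
    for idx, p in enumerate(marker_pos):
        # switch strand once for every crossover located before this locus
        while k < len(breaks) and breaks[k] < p:
            current = 1 - current
            k += 1
        gamete.append(strands[current][idx])
    return gamete
```

Calling `make_gamete` once per chromosome and per parent yields the two haplotypes of an offspring under Mendelian inheritance.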

After 4000 generations, the G0 generation was created with N_m = 3000 and N_f = 3000 offspring obtained from the random mating of n_s = 50 sires and n_d = 50 dams with replacement. To obtain a reliable result, the base population was replicated 100 times. For each base population, six different breeding schemes were run. The average of these replicates was used for comparison of the breeding schemes. Quantitative SNP effects were simulated to attribute breeding values to each individual for the trait evaluated in the sib test, as described in the following section.

Simulation of generations for breeding schemes

Males and females from each G0 family were equally divided to create a test population and a candidate population, each with 3000 individuals. The phenotypes and genotypes of the test population were then evaluated using genomic evaluation techniques to estimate the SNP effects. Genomic estimated breeding values (EBV) for the candidates were computed based on their genotypes and these estimates. The best n_s = 50 male and n_d = 50 female candidates were selected to be sires and dams based on the EBV and randomly mated in pairs with n_o = 120 offspring/pair to produce generation G1 with 6000 offspring. Generation G0 also acted as the base population for pedigree numerator relationships.
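The selection and mating step described above can be sketched as follows. This is a minimal illustration rather than the authors' program: the tuple layout `(animal_id, sex, gebv)` and the function name are hypothetical.

```python
import random


def select_and_pair(candidates, n_per_sex=50, rng=None):
    """Truncation selection on GEBV within sex, followed by random pairing.

    candidates: list of (animal_id, sex, gebv) tuples, sex being 'M' or 'F'.
    Returns a list of (sire_id, dam_id) mating pairs.
    """
    rng = rng or random.Random()

    def best(sex):
        ranked = sorted((c for c in candidates if c[1] == sex),
                        key=lambda c: c[2], reverse=True)
        return [c[0] for c in ranked[:n_per_sex]]

    sires, dams = best("M"), best("F")
    rng.shuffle(dams)                   # random mating: shuffle the dams
    return list(zip(sires, dams))       # each parent is used in one pair
```

With 50 pairs and 120 offspring per pair, this reproduces the 6000 offspring per generation, which are then split equally into test and candidate sibs.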

In G1, the procedure for G0 was repeated up to the point of mating. The offspring were divided into test and candidate populations. Genomic evaluation was carried out using the phenotypes of the test population to estimate SNP effects; the best n_s = 50 male and n_d = 50 female candidates were then selected. In the genomic evaluation of G1, the accumulated data from both G0 and G1 test populations were used to estimate SNP effects.

The selected males and females in G1 were then used to produce a candidate population and a test population in G2. The mating of the G2 individuals defined the different breeding schemes of the study. For all schemes, the 3000 individuals from the G2 candidate population were created by pair-wise random mating among the n_s = 50 sires and n_d = 50 dams. However, individuals in the test population were created either as diploids following random mating, or following minimum or maximum coancestry mating, or as double haploids as described in the following section. Finally, in G2, SNP effects were re-estimated using all the accumulated test data from G0, G1, and G2. Then, the genomic EBV of the candidates were calculated and comparisons among the different breeding schemes were made.

Deriving alternative test populations in G2

Two types of approaches were adopted to design the test population in G2: a diploid approach in which mating among G1 parents was managed, and a double haploid approach following the random mating among the G1 parents.

Three diploid approaches were simulated as follows:

Random mating (RAND)

Mating pairs were chosen from the n_s and n_d selected parents using random sampling without replacement to produce the G2 test population. Mating pairs were re-sampled from the same sets of parents to produce the G2 candidate population. This was done to allow fair comparisons with the assortative mating schemes described below.

Maximum coancestry (MaxC)

Using the selected males and females, mating pairs were chosen to maximize the average coancestry of the mates based on pedigree. First, a matrix of coancestries for all possible matings was constructed. Starting from an initial set of mating pairs, two pairs were chosen at random and their mates swapped. If this resulted in an increase in average coancestry, the swap was accepted, otherwise it was rejected. This was repeated until no further improvement was obtained.

Minimum coancestry (MinC)

Minimum coancestry mating was the same as maximum coancestry mating, except that mating pairs were designed to minimize average coancestry among the set of mates as in[18].
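The mate-swapping search used for MaxC and MinC can be sketched in a few lines of Python. The sketch makes two assumptions that are not stated in the text: it sweeps systematically over all pairs of matings instead of drawing two pairs at random, and it stops after a full sweep in which no swap was accepted. The function name and the layout of the coancestry matrix are hypothetical.

```python
def optimise_matings(coancestry, maximise=True):
    """Pair sire i with dam mate[i] so that the average coancestry of the
    mates is maximised (MaxC) or minimised (MinC) by repeated mate swaps.

    coancestry[i][j] is the pedigree coancestry between sire i and dam j.
    """
    n = len(coancestry)
    mate = list(range(n))               # initial set of mating pairs
    sign = 1.0 if maximise else -1.0
    improved = True
    while improved:                     # stop after a sweep without any swap
        improved = False
        for a in range(n):
            for b in range(a + 1, n):
                before = coancestry[a][mate[a]] + coancestry[b][mate[b]]
                after = coancestry[a][mate[b]] + coancestry[b][mate[a]]
                # accept the swap only if the summed coancestry improves
                if sign * (after - before) > 1e-12:
                    mate[a], mate[b] = mate[b], mate[a]
                    improved = True
    return mate
```

Because every accepted swap strictly improves the summed coancestry and the number of possible pairings is finite, the loop terminates, although only at a local optimum.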

Three double haploid approaches were simulated. For all three approaches, the parents and mating pairs that created the G2 test and candidate populations were the same randomly selected group. In MatPat, 1500 progeny each were created by mitotic androgenesis from the G1 male parents, and by mitotic gynogenesis from the G1 female parents. Each parent produced 30 double haploid offspring. In Pat, all 3000 progeny came from the 50 male G1 parents by mitotic androgenesis, with 60 offspring per parent and no contribution from the female G1 parents. In Mat, G1 female parents contributed all the offspring by mitotic gynogenesis in a similar fashion as Pat.
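The construction of a MatPat test population can be sketched as below. This is an illustration under stated assumptions, not the authors' code: gametes are formed with a constant recombination probability between adjacent loci (a simplification), each double haploid is modelled as a duplicated recombinant gamete, and all function names are hypothetical.

```python
import random


def recombinant_gamete(hap_a, hap_b, rec_prob, rng):
    """One gamete from a diploid parent; the copied strand switches between
    adjacent loci with probability rec_prob (illustrative simplification)."""
    strands = (hap_a, hap_b)
    current = rng.randrange(2)
    gamete = []
    for locus in range(len(hap_a)):
        if locus > 0 and rng.random() < rec_prob:
            current = 1 - current
        gamete.append(strands[current][locus])
    return gamete


def mitotic_double_haploid(hap_a, hap_b, rec_prob, rng):
    """Duplicate a single gamete: every locus is homozygous, so the
    allele count at each locus is either 0 or 2."""
    gamete = recombinant_gamete(hap_a, hap_b, rec_prob, rng)
    return [2 * allele for allele in gamete]


def matpat_test_population(sires, dams, rec_prob, rng, per_parent=30):
    """MatPat design: per_parent double haploids from every sire
    (androgenesis) and from every dam (gynogenesis).

    sires and dams are lists of (haplotype_a, haplotype_b) tuples.
    """
    return [mitotic_double_haploid(a, b, rec_prob, rng)
            for a, b in list(sires) + list(dams)
            for _ in range(per_parent)]
```

With 50 sires, 50 dams and 30 offspring per parent, the function returns 3000 test individuals; passing only the sires (or only the dams) with `per_parent=60` gives the Pat (or Mat) design.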

Simulation of markers, quantitative trait loci and true breeding values

In G0, a random sample of 1000 SNP from among those with a minor allele frequency (MAF) greater than 0.05 were assigned to be QTL. Additive allelic values were assigned to each QTL by independent sampling of effects from a Laplace distribution. True breeding values were computed as the sum of allelic effects at the 1000 QTL as follows:

$$TBV_{i} = \sum_{j=1}^{1000} \left( z_{ij1} g_{j1} + z_{ij2} g_{j2} \right) \tag{1}$$

where z_ijk is the number of copies (0, 1 or 2) of allele k (k = 1 or 2) at the j^th QTL of individual i and g_jk is the sampled effect of that allele. Allelic effects were then scaled to set the total genetic variance (σ_A²) observed in G0 equal to 10.
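Equation 1 and the scaling step can be sketched as follows. The sketch assumes biallelic QTL, so that an individual carrying z copies of allele 1 carries 2 − z copies of allele 2, and it uses the population variance of the true breeding values for the scaling; both choices, like the function names, are assumptions for illustration.

```python
import random
import statistics


def laplace(rng):
    """Standard Laplace deviate: an exponential deviate with a random sign."""
    return rng.expovariate(1.0) * rng.choice((-1.0, 1.0))


def true_breeding_values(z1, g1, g2):
    """Equation 1 for biallelic QTL: z1[i][j] copies of allele 1 and
    2 - z1[i][j] copies of allele 2 at QTL j of individual i."""
    return [sum(z * a + (2 - z) * b for z, a, b in zip(row, g1, g2))
            for row in z1]


def scale_effects(z1, g1, g2, target_var=10.0):
    """Rescale the allelic effects so that the variance of the true
    breeding values in the base generation equals target_var."""
    tbv = true_breeding_values(z1, g1, g2)
    factor = (target_var / statistics.pvariance(tbv)) ** 0.5
    return [a * factor for a in g1], [b * factor for b in g2]
```

Multiplying all effects by the same factor multiplies every true breeding value by that factor, so the variance changes by the factor squared.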

Phenotypes with a heritability of 0.05, 0.1 or 0.4 were created by sampling environmental deviations from appropriate normal distributions and adding these to the true breeding values. The m = 5000 SNP loci with the highest MAF, excluding those that had been selected to be QTL, were selected as markers and the test and candidate sibs were genotyped for these m markers.
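The environmental variance implied by each heritability follows from h² = σ_A² / (σ_A² + σ_e²). The sketch below assumes that σ_A² = 10 (the G0 value) defines the environmental variance; the function names are hypothetical.

```python
import random


def residual_variance(sigma2_a, h2):
    """Environmental variance implied by h2 = sigma2_a / (sigma2_a + sigma2_e)."""
    return sigma2_a * (1.0 - h2) / h2


def simulate_phenotypes(tbv, h2, sigma2_a=10.0, rng=None):
    """Add a normally distributed environmental deviation to each
    true breeding value."""
    rng = rng or random.Random()
    sd_e = residual_variance(sigma2_a, h2) ** 0.5
    return [a + rng.gauss(0.0, sd_e) for a in tbv]
```

Under this assumption, heritabilities of 0.05, 0.1 and 0.4 correspond to environmental variances of 190, 90 and 15, respectively.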

Estimation of SNP effects

Estimation of SNP effects followed the GS-BLUP model[1] for n phenotypes with the m marker loci:

$$y = \sum_{j=1}^{m} x_{ij} u_{j} + e \tag{2}$$

where y is a vector of phenotypes, x_ij is the standardized number of copies of a randomly chosen reference allele (allele “1”) carried by animal i at the j^th marker locus, as described below, u_j is the effect of allele “1” at locus j, and e is a vector of random errors assumed to be distributed as N(0, σ_e²I). Marker effects were assumed to be drawn from independent and identical distributions with variance $\sigma_{i}^{2} = \frac{\sigma_{A}^{2}}{m}$. The standardized number of “1” alleles was computed as:

$$x_{ij} = \frac{x_{ij1}^{*} - 2 p_{j}}{\sqrt{2 p_{j} (1 - p_{j})}} \tag{3}$$

where p_j is the frequency of allele “1” at the j^th marker locus and x_{ij 1}^* is the number of “1” alleles carried by individual i.
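Equation 3 can be written as a short Python sketch; the function names are hypothetical and the markers are assumed to be polymorphic (0 < p_j < 1), as otherwise the denominator is zero.

```python
def allele_frequencies(counts):
    """Frequency of allele '1' per marker from allele counts (0, 1 or 2)."""
    n = len(counts)
    return [sum(row[j] for row in counts) / (2.0 * n)
            for j in range(len(counts[0]))]


def standardise_genotypes(counts, freqs):
    """Apply Equation 3: counts[i][j] is the number of '1' alleles carried
    by individual i at marker j and freqs[j] the frequency of allele '1'."""
    return [[(x - 2.0 * p) / (2.0 * p * (1.0 - p)) ** 0.5
             for x, p in zip(row, freqs)]
            for row in counts]
```

For p_j = 0.5, the covariate is −√2, 0 or +√2 for 0, 1 or 2 copies. A double haploid is never heterozygous and therefore only takes the two extreme values.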

The elements x_ij form the incidence matrix X and the vector of SNP effects u was estimated (û) from:

$$\left[ X^{T} X + \frac{\sigma_{e}^{2}}{\sigma_{i}^{2}} I \right] \hat{u} = X^{T} y \tag{4}$$

The EBV of candidate i was predicted by using their SNP genotypes and summing up their marker effects, as estimated using the test population, as:

$$GEBV = X \hat{u} \tag{5}$$

Standardization of SNP covariates x_ij for the candidates was carried out using the same values of p_j as used for the test animals, i.e. the estimates of frequencies were obtained using both candidate and test sets.
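Equations 4 and 5 can be sketched in plain Python for a small number of markers. This is an illustration only: with m = 5000 markers a numerical linear algebra library would be used in practice, the function names are hypothetical, and, as in Equation 2, no overall mean is fitted.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def gs_blup(x_test, y, sigma2_e, sigma2_a):
    """Equation 4: solve [X'X + (sigma2_e / sigma2_i) I] u_hat = X'y,
    with sigma2_i = sigma2_a / m."""
    m = len(x_test[0])
    lam = sigma2_e / (sigma2_a / m)
    lhs = [[sum(row[j] * row[k] for row in x_test) + (lam if j == k else 0.0)
            for k in range(m)] for j in range(m)]
    rhs = [sum(row[j] * yi for row, yi in zip(x_test, y)) for j in range(m)]
    return solve_linear(lhs, rhs)


def gebv(x_cand, u_hat):
    """Equation 5: sum of estimated marker effects weighted by the
    standardised genotypes of each candidate."""
    return [sum(xij * uj for xij, uj in zip(row, u_hat)) for row in x_cand]
```

The SNP effects are estimated from the standardised genotypes and phenotypes of the test sibs only, and are then applied to the standardised genotypes of the candidates.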

Statistics

Outputs of the base populations obtained from the simulation of the founder ancestors were stored. Each replicate had its own base population. Averages of 100 replicates of each breeding scheme with different scenarios were compared in terms of inbreeding, genetic gain, accuracy of selection and variance reduction generated in G2. Inbreeding coefficients were computed using G0 as the base population.

Results

Trend in genetic parameters

Figure 1 shows the trend in genetic parameters from G0 to G2 for the candidate population. Results are only shown for the RAND and MatPat schemes and h² = 0.05 because of their extreme values, while with the other schemes intermediate values were obtained. More detailed results will be presented in the next section. In the base generation G0, the accuracy of selection (1A), genetic levels (1B) and levels of inbreeding (1C) were zero and genetic variance (1D) was 10. For all schemes, the onset of inbreeding was in G2. An increasing trend in genetic gain and accuracy of selection was observed from G1 to G2 and genetic variance decreased. Increases in genetic gain and selection accuracy and the reduction in genetic variance were greatest for the MatPat scheme.

Genetic parameters in generation G2

Accuracy of selection

Accuracies of selection generated in the candidate population for all schemes are in Table 1. The highest accuracy of selection was obtained with the MatPat scheme, while with the Pat and Mat schemes it was lowest. As expected, Pat and Mat were always very similar in accuracy. A substantial increase in accuracy was observed for the MatPat scheme compared to the RAND scheme. For example, accuracy of selection increased by 19% for h² = 0.05, by 12% for h² = 0.1 and by 7% for h² = 0.4. In contrast, use of a non-random mating scheme had only a small impact on accuracy and none of the differences were statistically significant (p > 0.05).

Table 1 Accuracy of estimated breeding values for the candidate population in G2


Genetic gain

The genetic gains (ΔG) generated in the candidate population are presented in Table 2 and on the whole they agree with the observed accuracies in Table 1. The highest ΔG was achieved with the MatPat scheme, while the lowest ΔG was obtained with the Pat and Mat schemes. Compared to the RAND scheme, the MatPat scheme had a statistically significant (p < 0.05) increase in ΔG of 22% for h² = 0.05, 12% for h² = 0.1 and 6% for h² = 0.4 (Table 2).

Table 2 Genetic gain generated in G2 in the candidate population


Level of inbreeding

Levels of inbreeding generated in the candidate population are in Table 3. As expected, the level of inbreeding decreased as heritability increased. Compared to the diploid schemes, inbreeding appeared to be slightly higher in the double haploid schemes at the lowest heritability (h² = 0.05). These differences diminished as heritability increased. Across all heritabilities, the levels of inbreeding generated in the double haploid and diploid schemes were not significantly different (p > 0.05) from each other.

Table 3 Level of inbreeding generated in G2 in the candidate population


Variance reduction

The genetic variances generated in the candidate population are in Table 4. As a result of selection of parents in G1, the genetic variance was reduced by 15% to 30% in G2 compared to G0, depending on the scheme and heritability. Comparisons among the double haploid schemes show that the genetic variance retained was slightly higher in the Mat and Pat schemes and lower in the MatPat scheme. In contrast, the genetic variances retained within the diploid schemes were not significantly different (p > 0.05) from each other, and for higher heritabilities tended to be intermediate between the MatPat and the single sex double haploid schemes. Overall, the pattern of differences in genetic variances between heritabilities and schemes was qualitatively similar to the pattern observed for differences in accuracies (Table 1).

Table 4 Fraction of genetic variance retained in G2 in the candidate population


Discussion

It has been reported that genomic selection in aquaculture breeding schemes can increase selection accuracy for candidates and increase genetic gain compared to traditional aquaculture sib testing schemes[8, 9, 19]. Our study shows that, when using genomic selection, creating double haploids as part of the process of sib testing can increase the selection accuracy of candidates and the genetic gain even more. These additional increases in accuracy and genetic gain were most dramatic with a low heritability (~22%) but were still substantial when heritability was 0.4 (~7%). However, this result was only obtained when both sexes were used to create double haploids for testing. When only one sex was used to create double haploids, selection accuracy was reduced because only chromosome sets from the dams (Mat) or from the sires (Pat) entered the test population, and this was not offset by the larger number of observations per chromosome set.

In this study, attempting to increase homozygosity above that obtained from random mating through maximum coancestry mating when breeding the test population had no detectable impact on genetic gain or inbreeding. This is because assortative mating was only done for one generation, which is unlikely to produce extreme genotypes. If breeding of the test population with the MaxC scheme had been continued for 10 generations or more, similar results to those obtained with the double haploids scheme would have been observed because it takes approximately 10 generations of continuous full sib mating to produce fully inbred lines[14]. Clearly one generation of either MaxC or MinC is insufficient to deliver any benefit.

The increase in selection accuracy of the candidate population is due to the increase in accuracy of the estimates of SNP effects in the test population because the double haploids have more extreme genotypic values. It is important to note that the study design used here permits this inference because the non-random mating structure in the test population was not replicated in the candidate breeding population, which was always bred at random. Therefore, the increased predictive accuracy was due to the test design and not the results of differences in family structure amongst the candidates. For example, if MinC mating had been implemented in the candidate population, say for five generations, to improve the family structure, genetic effects would eventually have been better estimated and a substantial increase in selection accuracy might have occurred[20].

The benefit of the improved accuracy from generating extreme genotypes in the test population does come at a cost in robustness to the underlying genetic architecture. The design provides an estimate of a, i.e. half the difference between the genetic values of the two homozygotes. In this study, the allelic effects were simulated to be additive, so the estimates of the allelic substitution effects were not biased by the absence of heterozygotes. However, if the dominance deviation (d) is not equal to zero, the average effects obtained from homozygotes are biased by d(1-2p), where p is the minor allele frequency. This bias increases as p decreases. Most QTL are expected to have a low minor allele frequency, which aggravates the bias. The presence of epistatic gene action is expected to result in similar biases when estimates are based on homozygotes only.

The advantage obtained with the MatPat scheme was achieved at a cost of reduced genetic variation in G2 compared to other schemes. However, the results show that this reduction in genetic variation was mainly generated by additional linkage disequilibrium[21] created by the higher accuracy obtained with the MatPat scheme. First, inbreeding accounted for a loss of only 2% of the genetic variance and any difference in inbreeding between MatPat and the other schemes was not substantial. Second, under the infinitesimal model, the loss of genetic variance due to linkage disequilibrium would be ½ρ², where ρ is the accuracy of selection[21]. Thus with ρ = 0.6, the predicted loss would be 18%. Using (1-½ρ²)(1-F) to predict the fractions of genetic variance retained (Table 4) from the accuracies of selection (Table 1) and the inbreeding coefficients F gives a very close approximation. Experience with the infinitesimal model has demonstrated that increasing selection intensity results in greater short and medium term genetic gain even though this also increases the linkage disequilibrium. However, this is not the case when selection is on a known or marked QTL along with estimates of unlinked polygenic effects, for which slowing fixation of the QTL has been shown to result in increased long-term gain[22]. The existence of a trade-off between accuracy and long-term gain that is independent of inbreeding (genomic or pedigree) has not been reported to date.
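The prediction (1-½ρ²)(1-F) used in this paragraph can be checked with a small Python sketch; the function name is hypothetical and the inputs below are the values quoted in the text, not entries from Table 1 or Table 4.

```python
def fraction_variance_retained(accuracy, inbreeding):
    """Fraction of the base genetic variance retained after one round of
    selection: (1 - 0.5 * accuracy**2) * (1 - inbreeding)."""
    return (1.0 - 0.5 * accuracy ** 2) * (1.0 - inbreeding)


# Accuracy of 0.6 without inbreeding: 82% retained, i.e. a loss of 18%.
no_inbreeding = fraction_variance_retained(0.6, 0.0)

# Same accuracy with an inbreeding coefficient of 0.02: about 80% retained.
with_inbreeding = fraction_variance_retained(0.6, 0.02)
```

The comparison of the two values shows why inbreeding contributes little to the loss: the linkage disequilibrium term removes 18% of the variance, while an inbreeding coefficient of 0.02 removes a further 2% of what remains.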

There are two methodological approaches to produce double haploids, either mitotic or meiotic. In this study, mitotic gynogenesis and mitotic androgenesis were used. There are good reasons to expect that the extra genetic gains would have been somewhat less with meiotic gynogenesis and meiotic androgenesis because these technologies result in fish that are less homozygous than their mitotic counterparts, i.e. only a subset of their genotypes are homozygous[15].

The increased accuracy of selection with use of a double haploid sib test population results from the explanatory variables in the regression equation (Equation 2) taking more extreme values due to inbreeding, which, based on regression theory[23], is known to increase the accuracy of the estimation of SNP effects. As derived in the Appendix, the increase in selection accuracy can also be explained from the perspective of genomic relationships and selection index theory, using the formula:

$$E[r_{GS}^{2}] = A_{21}^{T} P^{-1} A_{21} + \operatorname{trace}(P^{-1} V) \tag{6}$$

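where A₂₁ is the vector of relationships between the sib test individuals and one of the selection candidates, P is the phenotypic covariance matrix of the sib test individuals, and V is the variance of the deviations (δ in the Appendix) of the genomic relationships between the test sibs and the candidate from their pedigree-based expectations.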

This shows the expected reliability (= squared accuracy) of genomic selection is approximately equal to the reliability of traditional EBV plus a term that increases with the variances of the deviations of genomic relationships from their expectations in the test population. The implications of this formula go further in defining conditions that maximize the expected reliability of genomic selection: (1) the training animals should be as little related as possible, which makes P^-1 large; (2) the training animals should be as much related to the selection candidates as possible, which makes A₂₁ large; and (3) deviations of the genomic relationships from the traditional relationships should be as large as possible, resulting in large V. Points (1) and (2) were also observed by[12]. Double haploids have the same expected relationship with the candidates as diploid training animals (½), but the variance of their genomic relationship with the candidates is increased due to inbreeding. Thus, the increase in accuracy of genomic selection when using double haploids can be explained in two ways: (1) the more extreme regression factors in (Equation 2) allow SNP effects to be more accurately estimated, or (2) their more variable relationships with the candidates can be used by selection indices and BLUP to increase the accuracy of the EBV of the candidates.

Here, GS-BLUP[1] was used for genetic evaluation. Similar outcomes in terms of increases in accuracy and genetic gain from double haploids and non-random mating are expected with other genomic evaluation methods that are currently used. For example, the BayesB method[1] concentrates on certain important regions of the genome. Double haploid individuals are also fully homozygous in these regions. Therefore, the variance of genomic relationships between test and candidate individuals also increases in these regions of the genome, which delivers a more accurate estimation of SNP effects than a test population produced by random mating.

It is expected that the use of double haploid sib test populations increases the accuracy of genomic selection for any candidate population, because double haploids allow a more accurate estimation of SNP effects in the test population by increasing the variance of the genomic relationships between the test and candidate populations. However, the selection candidates should not be completely different from the test population.

Relevance of the study

This study shows the benefits of using double haploids as test sibs in aquaculture genomic selection breeding schemes. An increase of 7 to 19% in selection accuracy, leading to a 6 to 22% increase in genetic gain, was obtained. This resulted from more accurate estimation of SNP effects and required a mixture of paternal and maternal double haploid test sibs in combination with genomic selection. Increases in accuracy and genetic gain from use of double haploids were greater with lower heritability levels. Therefore, the outputs of this study can be used to increase genetic gain for difficult traits such as disease resistance and meat quality, which cannot be recorded on selection candidates. In practical applications, eggs could be collected from the nucleus breeding parents and divided into two parts: 50% would be fertilized in the natural way to form the candidate population, and the rest would be further divided into one half submitted to gynogenesis and the other half to androgenesis. This would result in a mixture of fish with maternal and paternal double haploid genomes from each family in the test population.

However, there are practical problems to overcome for the implementation of double haploids in aquaculture breeding schemes, including biases from non-additive gene effects, costs, and other practical constraints. There may also be ethical and regulatory issues related to animal welfare associated with the use of chromosome manipulation techniques.

Conclusions

This study has shown that the use of double haploids that produce inbred test sibs for estimation of SNP effects significantly increased selection accuracy and genetic gain. This required a mixture of test sibs with maternal and paternal double haploid genomes. The approach yielded increases in selection accuracy of up to 19% and in genetic gain of up to 22%. The double haploid technique produces inbred fish in one generation, which increased the accuracy of the estimation of SNP effects. Another strategy, in which the test population was designed based on non-random mating such as maximum coancestry mating, hardly improved selection accuracy. Finally, this study demonstrated the benefit of using distinct designs for the testing versus the candidate population.

Appendix

Genetic relationships and accuracy of selection

Based on selection index theory and assuming that the genetic variance is 1, the reliability (squared accuracy) of traditional selection is:

$$r^{2} = A_{21}^{T} P^{-1} A_{21}$$

where A₂₁ is the relationship between sib test individuals and one of the selection candidates, and P is a phenotypic covariance matrix for the sib test individuals, i.e. P = A₂₂ + E, where A₂₂ is the pedigree-based relationship between the sib test individuals and E is a diagonal matrix of scaled environmental variances with (1-h²) / h² on the diagonal. Let the genomic relationship matrices be denoted by G₂₂ = A₂₂ + Δ and G₂₁ = A₂₁ + δ, where Δ and δ represent changes from the expected relationships due to the genomic information. Then with genomic selection, the squared accuracy of selection is:

$$\begin{aligned} r_{GS}^{2} &= (A_{21} + \delta)^{T} (P + \Delta)^{-1} (A_{21} + \delta) \\ &= A_{21}^{T} (P + \Delta)^{-1} A_{21} + \delta^{T} (P + \Delta)^{-1} \delta \end{aligned}$$

assuming that A₂₁ and δ are independent.

Since (P + Δ) = P(I + P⁻¹Δ), it follows that (P + Δ)⁻¹ ≈ P⁻¹ − P⁻¹ΔP⁻¹, and then:

$$\begin{aligned} r_{GS}^{2} &= (A_{21} + \delta)^{T} (P + \Delta)^{-1} (A_{21} + \delta) \\ &= A_{21}^{T} P^{-1} A_{21} + \delta^{T} P^{-1} \delta - \delta^{T} P^{-1} \Delta P^{-1} \delta \end{aligned}$$

assuming that the term A₂₁ᵀP⁻¹ΔP⁻¹A₂₁ is on average 0, since Δ is on average 0. Taking expectations:

$$E[r_{GS}^{2}] = A_{21}^{T} P^{-1} A_{21} + E[\delta^{T} P^{-1} \delta] - E[\delta^{T} P^{-1} \Delta P^{-1} \delta]$$

where the second term is E[δᵀP⁻¹δ] = trace(P⁻¹V), with V = var(δ). The third term is:

\begin{aligned} E[\delta^{T} P^{-1} \Delta P^{-1} \delta] &= E_{\Delta}\left[ E[\delta^{T} P^{-1} \Delta P^{-1} \delta \mid \Delta] \right] \\ &= E_{\Delta}\left[ \operatorname{trace}(\Delta P^{-1} V P^{-1}) \right] \\ &= 0, \quad \text{since } E[\Delta] = 0. \end{aligned}

Thus,

E[r_{GS}^{2}] = A_{21}^{T} P^{-1} A_{21} + \operatorname{trace}(P^{-1} V)
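The final expression can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original analysis: the helper name `expected_r2_gs`, the 10 full-sib layout, h² = 0.5 and the value V = 0.01·I for var(δ) are all hypothetical. It holds Δ at zero, in line with the derivation, where the Δ terms average to zero, and compares the analytic expectation with a Monte Carlo average of (A₂₁ + δ)ᵀP⁻¹(A₂₁ + δ).

```python
import numpy as np

def expected_r2_gs(a21, P, V):
    """Return a21' P^-1 a21 + trace(P^-1 V)."""
    pedigree_part = a21 @ np.linalg.solve(P, a21)
    genomic_part = np.trace(np.linalg.solve(P, V))
    return float(pedigree_part + genomic_part)

# Illustrative (assumed) inputs: 10 full-sib test individuals, h2 = 0.5.
n_sibs, h2 = 10, 0.5
lam = (1.0 - h2) / h2
A22 = np.full((n_sibs, n_sibs), 0.5) + 0.5 * np.eye(n_sibs)
P = A22 + lam * np.eye(n_sibs)
a21 = np.full(n_sibs, 0.5)
V = 0.01 * np.eye(n_sibs)  # assumed var(delta); not a value from the paper

analytic = expected_r2_gs(a21, P, V)

# Monte Carlo: draw delta ~ N(0, V) and average the quadratic form
# (a21 + delta)' P^-1 (a21 + delta) over all draws.
rng = np.random.default_rng(1)
delta = rng.multivariate_normal(np.zeros(n_sibs), V, size=200_000)
x = a21 + delta
P_inv = np.linalg.inv(P)
simulated = float(np.mean(np.sum((x @ P_inv) * x, axis=1)))

print(f"analytic  = {analytic:.4f}")
print(f"simulated = {simulated:.4f}")
```

The trace(P⁻¹V) term is non-negative whenever V is a valid covariance matrix, so in this approximation the deviations of genomic from pedigree relationships can only add to the expected reliability.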

References

Meuwissen THE, Hayes BJ, Goddard ME: Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001, 157: 1819-1829.
VanRaden PM, Van Tassell CP, Wiggans GR, Sonstegard TS, Schnabel RD, Taylor JF, Schenkel FS: Invited review: reliability of genomic predictions for North American Holstein bulls. J Dairy Sci. 2009, 92: 16-24. 10.3168/jds.2008-1514.
Hayes BJ, Bowman PJ, Chamberlain AJ, Goddard ME: Invited review. Genomic selection in dairy cattle: progress and challenges. J Dairy Sci. 2009, 92: 433-443. 10.3168/jds.2008-1646.
Goddard ME, Hayes BJ, Meuwissen THE: Genomic selection in farm animal species - Lessons learnt and future perspectives. Proceedings of the 9th World Congress on Genetics Applied to Livestock Production. 2010, Leipzig. http://www.kongressband.de/wcgalp2010/assets/pdf/0701.pdf
Habier D: More than a third of the WCGALP presentations on genomic selection. J Anim Breed Genet. 2010, 127: 336-337. 10.1111/j.1439-0388.2010.00897.x.
Heffner EL, Sorrells ME, Jannink JL: Genomic selection for crop improvement. Crop Sci. 2009, 49: 1-12. 10.2135/cropsci2008.08.0512.
Jannink JL, Lorenz AJ, Iwata H: Genomic selection in plant breeding: from theory to practice. Brief Funct Genomics. 2010, 9: 166-177. 10.1093/bfgp/elq001.
Sonesson AK, Meuwissen THE: Testing strategies for genomic selection in aquaculture breeding programs. Genet Sel Evol. 2009, 41: 37. 10.1186/1297-9686-41-37.
Nielsen HM, Sonesson AK, Yazdi H, Meuwissen THE: Comparison of accuracy of genome-wide and BLUP breeding value estimates in sib based aquaculture breeding schemes. Aquaculture. 2009, 289: 259-264. 10.1016/j.aquaculture.2009.01.027.
Daetwyler HD, Villanueva B, Woolliams JA: Accuracy of predicting the genetic risk of disease using a genome-wide approach. PLoS One. 2008, 3: e3395. 10.1371/journal.pone.0003395.
Goddard M: Genomic selection: prediction of accuracy and maximisation of long term response. Genetica. 2009, 136: 245-257. 10.1007/s10709-008-9308-0.
Pszczola M, Strabel T, Mulder HA, Calus MPL: Reliability of direct genomic values for animals with different relationships within and to the reference population. J Dairy Sci. 2012, 95: 389-400. 10.3168/jds.2011-4338.
Martinez VA, Hill WG, Knott SA: On the use of double haploids for detecting QTL in outbred populations. Heredity. 2002, 88: 423-431. 10.1038/sj.hdy.6800073.
Wright S: Systems of mating. II. The effects of inbreeding on the genetic composition of a population. Genetics. 1921, 6: 124-143.
Komen H, Thorgaard GH: Androgenesis, gynogenesis and the production of clones in fishes: a review. Aquaculture. 2007, 269: 150-173. 10.1016/j.aquaculture.2007.05.009.
Kimura M: The number of heterozygous nucleotide sites maintained in a finite population due to steady flux of mutations. Genetics. 1969, 61: 893-903.
Haldane JBS: The combination of linkage values, and the calculation of distances between the loci of linked factors. J Genet. 1919, 8: 299-309.
Sonesson AK, Meuwissen THE: Mating schemes for optimum contribution selection with constrained rates of inbreeding. Genet Sel Evol. 2000, 32: 231-248. 10.1186/1297-9686-32-3-231.
Villanueva B, Fernández J, García-Cortés LA, Varona L, Daetwyler HD, Toro MA: Accuracy of genome-wide evaluation for disease resistance in aquaculture breeding programs. J Anim Sci. 2011, 89: 3433-3442. 10.2527/jas.2010-3814.
Nirea KG, Sonesson AK, Woolliams JA, Meuwissen TH: Effect of non-random mating on genomic and BLUP selection schemes. Genet Sel Evol. 2012, 44: 11. 10.1186/1297-9686-44-11.
Bulmer MG: The effect of selection on genetic variability. Amer Nat. 1971, 105: 201-211. 10.1086/282718.
Gibson JP: Short term gain at the expense of long-term response with selection on identified loci. Proceedings of the 5th World Congress on Genetics Applied to Livestock Production. 1994, University of Guelph, Guelph, 201-204.
Galton F: Regression towards mediocrity in hereditary stature. J Anthropol Inst. 1886, 15: 246-263.

Acknowledgements

This study was funded by grant 190442/S40 from the Research Council of Norway.

Author information

Authors and Affiliations

Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, P.O. Box 5003, Ås, 1432, Norway
Kahsay G Nirea, John A Woolliams & Theo HE Meuwissen
Nofima AS, P.O. Box 210, Ås, 1431, Norway
Anna K Sonesson
The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, Scotland, UK
John A Woolliams

Corresponding author

Correspondence to Kahsay G Nirea.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

All authors were involved in the design of the study. KGN wrote the draft manuscript and ran the computer programs. THEM and AKS wrote simulation computer programs. AKS, THEM and JAW edited the drafted manuscript. All authors have read and approved the final manuscript.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Nirea, K.G., Sonesson, A.K., Woolliams, J.A. et al. Strategies for implementing genomic selection in family-based aquaculture breeding schemes: double haploid sib test populations. Genet Sel Evol 44, 30 (2012). https://doi.org/10.1186/1297-9686-44-30

Received: 25 April 2012
Accepted: 22 October 2012
Published: 30 October 2012
DOI: https://doi.org/10.1186/1297-9686-44-30
