A combination of walk-back and optimum contribution selection in fish: a simulation study

The aim of this paper was to study the performance of a novel fish breeding scheme, which is a combination of walk-back and optimum contribution selection, using stochastic simulation. In this walk-back selection scheme, batches of different sizes (50, 100, 1000, 5000 and 10 000) of the phenotypically superior fish from one tank with mixed families were genotyped to set up the pedigree. BLUP estimated breeding values were calculated. The optimum contribution selection method was used with the rate of inbreeding (ΔF) constrained to 0.005 or 0.01 per generation. If the constraint on ΔF could not be held, a second batch of fish was genotyped, and so on. Compared with the genotyping of all selection candidates (1000, 5000 or 10 000), the use of batches saves genotyping costs. The results show that two batches of 50 fish were often necessary. With a batch size of 100, genetic level was 76–92% of the genetic level achieved for schemes with all fish being genotyped and thus candidates for the optimum contribution selection step. More parents were selected for schemes with larger batches, resulting in a higher genetic gain, especially when all selection candidates were genotyped. There was little extra genetic gain in genotyping 1000 fish instead of 100 for the larger schemes of 5000 and 10 000 candidates. The accuracy of breeding values was similar for all batch sizes (~0.30), but higher (~0.5) when all candidates were included. Since only the phenotypically most superior fish were genotyped, BLUP-EBV were biased. Compared with genotyping of all selection candidates, the use of batches saves genotyping costs, while simultaneously maintaining high genetic gains.


INTRODUCTION
Family-based selection for fish is today based on the rearing of fullsib families in separate tanks until the fish are large enough to be tagged with physical tags. A sample of a given number of tagged individuals from each fullsib family is then mixed. This paper examines how efficient selection and control of inbreeding can be performed without a need for family tanks and extensive individual tagging of fish.
An alternative to the separate rearing in tanks is to mix fish from different families into only one tank at a younger age, and identify parents of individual fish using genetic (DNA) markers. Parentage testing using genetic markers is used for fish populations, e.g. [4]. Physical tagging is also needed on these genotyped fish, so that the genotyping results can be traced back to the right individual. This strategy saves the investment and running costs of holding all tanks, but identification using genetic markers might become more expensive than rearing families in separate tanks until physical tagging is possible. Another result of using only one tank is that no common environment (tank) effect needs to be accounted for in the estimation of breeding values. This tank effect varies greatly. For example, in Atlantic salmon and rainbow trout populations, the common environment effect (i.e., tank effect) on body weight was estimated to be about 2-6% [19,20]. For more recently domesticated species, where the environment is not yet standardized, the common environment effect can be rather large. In the Atlantic cod [7], the common environment effect on juvenile body weight was estimated to be 3-12% and in the rohu carp this effect (i.e. nursery pond effect) on harvest body weight was 32% [6]. Herbinger et al. [13] found a correlation of family growth performance of Atlantic salmon in single or mixed tanks being close to zero, indicating a large environmental tank effect. However, these fish were young, weighing ∼5 g. The tank effect has been shown to decrease with age of the fish [11], i.e. the correlation between fullsibs is expected to decrease with the age of fish.
Identification costs would be high if all fish needed to be genotyped, but genotyping costs can be reduced by only genotyping some individuals in the tank. Walk-back selection is a selection method for schemes with only one tank [3]. In a walk-back selection scheme, one assumes that selection is for a trait that can be easily recorded on the selection candidate itself (e.g. body weight). Firstly, the individual fish with the highest phenotypic value is selected. Thereafter, the individual with the second highest phenotypic value is genotyped (etc.). When using the within-family selection strategy, the second fish will become selected if it is not a full- or halfsib of the previously selected fish. This process is continued until the appropriate numbers of males and females needed for mating are obtained. Doyle and Herbinger [3] argued that within-family selection resulted in low rates of inbreeding (∆F) and genetic gain (∆G) not significantly lower than when there was individual selection. These schemes were, however, not compared at the same rate of inbreeding. Optimum contribution selection is a group selection method that maximises genetic gain with a restriction on ∆F for schemes with both discrete [9,17] and overlapping [10,18] generations. It is dynamic, such that it adapts to current selection candidates, and can therefore correct skewness in the contributions of families over generations. Such skewness of contributions of families results in increased rates of inbreeding. In livestock schemes, the optimum contribution selection method increased genetic gain by up to 44% [18] compared to truncation selection on BLUP estimated breeding values [12] at the same rate of inbreeding. However, optimum contribution selection has not been tested in breeding schemes for aquaculture species.
The aim of this paper was to study the performance (genetic response and inbreeding) of a breeding scheme, which is a combination of walk-back and optimum contribution selection, using stochastic simulation. Because it is not practical to sample just one fish at a time to be genotyped for the parentage test, batches of different sizes of fish with the highest phenotypic values will be tested in the simulation study.

Breeding scheme
The structure of the simulated breeding scheme was that of a closed nucleus with discrete generations. Genetic values were simulated according to the infinitesimal model [2]. Genotypes, $g_i$, of the unrelated base animals were sampled from the distribution $N(0, \sigma_a^2)$. The trait was recorded on an equal number of males and females before selection. Record $y_i$ was calculated as $y_i = g_i + e_i$, where $e_i$ is the environmental effect, which was sampled from $N(0, 1 - \sigma_a^2)$, making the base generation phenotypic variation ($\sigma_p^2$) equal to 1.0. The base generation additive genetic variance, $\sigma_a^2$, was 0.1 or 0.25, corresponding to a heritability, $h^2$, of 0.1 and 0.25, respectively. Later generations were obtained by simulating progeny genotypes from $g_i = 0.5 g_s + 0.5 g_d + m_i$, where $s$ and $d$ denote sire and dam of progeny $i$, respectively, and $m_i$ = Mendelian sampling component, which was sampled from $N(0, 0.5(1 - F_{sd})\sigma_a^2)$, where $F_{sd}$ is the average of the inbreeding coefficients of the sire and the dam.
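The sampling steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the simulation program used in the study: the function names, the dictionary layout and the `f_progeny` argument (the progeny's own inbreeding coefficient, which would come from the pedigree) are assumptions, and it is assumed that the environmental variance stays at $1 - \sigma_a^2$ in later generations.

```python
import random


def simulate_base(n, var_a, rng):
    """Base animals: g ~ N(0, var_a), e ~ N(0, 1 - var_a), y = g + e."""
    sd_a = var_a ** 0.5
    sd_e = (1.0 - var_a) ** 0.5
    animals = []
    for _ in range(n):
        g = rng.gauss(0.0, sd_a)
        # Base animals are unrelated and non-inbred, so F = 0.
        animals.append({"g": g, "y": g + rng.gauss(0.0, sd_e), "F": 0.0})
    return animals


def simulate_progeny(sire, dam, var_a, rng, f_progeny=0.0):
    """g_i = 0.5 g_s + 0.5 g_d + m_i with m_i ~ N(0, 0.5 (1 - F_sd) var_a)."""
    # F_sd is the average inbreeding coefficient of the two parents.
    f_sd = 0.5 * (sire["F"] + dam["F"])
    sd_m = (0.5 * (1.0 - f_sd) * var_a) ** 0.5
    g = 0.5 * sire["g"] + 0.5 * dam["g"] + rng.gauss(0.0, sd_m)
    # Assumption: same environmental variance as in the base generation.
    e = rng.gauss(0.0, (1.0 - var_a) ** 0.5)
    return {"g": g, "y": g + e, "F": f_progeny}
```

With `var_a = 0.25` the phenotypic variance of a large simulated base generation should be close to 1.0, which corresponds to a heritability of 0.25.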
Schemes had 1000, 5000 or 10 000 candidates at each generation. ∆F was restricted to 0.005 or 0.010 per generation, which is an indication of the maximum acceptable rate of inbreeding (e.g. [8]). The size of each batch was set to 50, 100 or 1000. Due to the extensive computer time needed, batch sizes of 5000 and 10 000 fish were each considered for only one scheme, among the schemes with 5000 and 10 000 candidates, respectively.
The results after eight generations of selection are given. They are based on averages over 50 replicated schemes, except the schemes with batch sizes of 5000 (20 replicates) and 10 000 (10 replicates) candidates.
The accuracy of selection was calculated as the correlation between the estimated breeding values and the simulated true breeding values.
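As a minimal sketch of this calculation, the accuracy can be written as a Pearson correlation between two equally long lists. The function name `accuracy` and the plain-list inputs are assumptions made for illustration.

```python
def accuracy(ebv, tbv):
    """Pearson correlation between estimated and true breeding values."""
    n = len(ebv)
    mean_e = sum(ebv) / n
    mean_t = sum(tbv) / n
    cov = sum((e - mean_e) * (t - mean_t) for e, t in zip(ebv, tbv))
    ss_e = sum((e - mean_e) ** 2 for e in ebv)
    ss_t = sum((t - mean_t) ** 2 for t in tbv)
    # The 1/(n-1) factors cancel between numerator and denominator.
    return cov / (ss_e * ss_t) ** 0.5
```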

Selection
The following selection procedure was used:
(1) Select a batch of the largest fish and genotype for parentage testing.
(2) Calculate BLUP estimated breeding values for the genotyped fish.
(3) Use the optimum contribution selection method to calculate optimum contributions of each selection candidate as described below.
(4) If the constraint on ∆F cannot be held, select another batch (1) of the largest fish and go to (2). If the constraint on ∆F can be held, stop and conduct matings of the selected individuals as described below.
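The batch loop of this procedure can be sketched as follows. The sketch only shows the control flow: `estimate_ebv` and `optimise` are hypothetical callables standing in for the BLUP evaluation and the optimum contribution step, and it is assumed that the evaluation is what "go to (2)" refers to and that `optimise` returns `None` when the constraint on ∆F cannot be held.

```python
def walk_back_select(candidates, batch_size, estimate_ebv, optimise):
    """Genotype batches of the largest fish until the constraint can be held.

    candidates   : list of dicts with at least the key "y" (phenotype)
    estimate_ebv : hypothetical callable, genotyped fish -> breeding values
    optimise     : hypothetical callable, returns contributions or None
    """
    ranked = sorted(candidates, key=lambda c: c["y"], reverse=True)
    genotyped = []
    while len(genotyped) < len(ranked):
        # (1) add the next batch of the largest fish to the genotyped set
        genotyped = ranked[:len(genotyped) + batch_size]
        # (2) estimate breeding values for all genotyped fish
        ebv = estimate_ebv(genotyped)
        # (3) optimum contribution selection on the genotyped fish
        contributions = optimise(genotyped, ebv)
        # (4) stop as soon as the constraint can be held
        if contributions is not None:
            return genotyped, contributions
    # All candidates were genotyped and the constraint still could not be held.
    return genotyped, None
```

Passing the two steps in as callables keeps the loop independent of how the evaluation and optimisation are implemented.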
The optimum contribution selection was used as proposed by Meuwissen [17]. This method maximizes the genetic level of the next generation of animals, $G_{t+1} = c_t' EBV_t$, where $c_t$ is a vector of genetic contributions of the selection candidates to generation $t+1$ and $EBV_t$ is a vector of estimated breeding values of the candidates for selection in generation $t$, calculated as in Henderson [12]. The objective function, $c_t' EBV_t$, is maximized for $c_t$ under two restrictions: the first is on the rate of inbreeding and the second is on the contribution per sex. The desired rate of inbreeding, $\Delta F_d$, is obtained by constraining the average coancestry of the selection candidates to $C_{t+1} = 1 - (1 - \Delta F_d)^t$ [9]. The actual contributions of the individuals are then obtained such that they fulfil the constraint $C_{t+1} \geq c_t' A_t c_t / 2$, where $A_t$ is an ($n \times n$) relationship matrix among the selection candidates. Note that the level of the constraint, $C_{t+1}$, can be calculated for every generation before the breeding scheme commences. The contribution of each sex is constrained to 1/2, i.e. $Q' c_t = \tfrac{1}{2}\mathbf{1}$, where $Q$ is an ($n \times 2$) incidence matrix of the sex of the selection candidates (the first column yields ones for males and zeros for females, and the second column yields ones for females and zeros for males) and $\tfrac{1}{2}\mathbf{1}$ is a ($2 \times 1$) vector of halves. The optimization procedure was explained in [17]. The output from the selection method is a vector with genetic contributions for each selection candidate, $c_t$. Random mating was applied. A progeny was allocated a sire and dam assigned by randomly sampling a sire and a dam with sampling probabilities following the optimal contributions of the sires and dams. Each such sample of parents produced one male and one female progeny.
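The two restrictions can be checked for any given contribution vector with a short sketch like the one below. It does not perform the optimization itself, which is described in [17]; the function names, the nested-list representation of the relationship matrix and the tolerance are assumptions. With t = 8 the constraint level evaluates to about 0.077 for ∆F = 0.010 and about 0.039 for ∆F = 0.005.

```python
def coancestry_constraint(delta_f, t):
    """Constraint level C_{t+1} = 1 - (1 - delta_f)^t."""
    return 1.0 - (1.0 - delta_f) ** t


def group_coancestry(c, A):
    """Average coancestry c' A c / 2 for contributions c and relationships A."""
    n = len(c)
    return 0.5 * sum(c[i] * A[i][j] * c[j] for i in range(n) for j in range(n))


def satisfies_constraints(c, A, is_male, delta_f, t, tol=1e-9):
    """True if each sex contributes one half and c' A c / 2 <= C_{t+1}."""
    male = sum(ci for ci, m in zip(c, is_male) if m)
    female = sum(ci for ci, m in zip(c, is_male) if not m)
    sexes_ok = abs(male - 0.5) <= tol and abs(female - 0.5) <= tol
    limit = coancestry_constraint(delta_f, t)
    return sexes_ok and group_coancestry(c, A) <= limit + tol
```

For unrelated, non-inbred candidates with equal contributions the average coancestry is 1/(2n), so eight such parents (0.0625) would meet a limit of about 0.077 while four (0.125) would not.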

RESULTS
The constraint on ∆F was held for all schemes, such that the level of F at generation 9 (F) was at or somewhat lower than 0.077 for schemes with a restriction of 0.010 and 0.039 for schemes with a restriction of 0.005 (Tabs. I, II, III). The standard error of F was between 0.0000 and 0.0004 over all schemes.

Schemes with 1000 candidates
The results of schemes with 1000 candidates per generation are shown in Table I. With a heritability of 0.10 and a restriction on ∆F of 0.010, genetic level at generation 9 (G) was 1.51 $\sigma_p$ for schemes with a batch size of 50, 1.54 $\sigma_p$ for schemes with a batch size of 100 and 1.90 $\sigma_p$ for schemes where all 1000 candidates were genotyped. Hence, genetic gain increased, as expected, with batch size and the highest genetic gain was achieved when all individuals were genotyped and thus selection candidates for the optimum contribution step. The use of batches of 50-100 candidates resulted in 79-92% of the genetic level that was achieved when all 1000 candidates were genotyped.
For schemes with batch size of 50, more than one batch had to be used in some replicates, i.e. average number of male (25.9) plus female (27.1) candidates was 53.0. For schemes with batch size of 100 or 1000, only one batch was used.
More parents were selected for schemes with larger batch size, because the increased selection intensity requires an increased number of selected parents in order to achieve the same ∆F. For schemes with batch size of 50, 24.5 sires and 25.6 dams were selected whereas for schemes with batch size of 1000, 56.4 sires and 56.2 dams were selected. For schemes with a more stringent restriction on ∆F of 0.005 per generation, more selection candidates, i.e. more batches, were in general needed to keep the restriction. For example, with 1000 selection candidates, batch size of 100 and heritability of 0.25, only 100 candidates (i.e. one batch) were needed for the scheme with a ∆F restriction of 0.01, whereas 122 candidates were needed on average for the scheme with a ∆F restriction of 0.005.

Schemes with 5000 and 10 000 candidates
For larger schemes with 5000 (Tab. II) and 10 000 (Tab. III) candidates per generation, the same trends were seen as for schemes with 1000 candidates per generation. In general, the larger schemes resulted in a higher number of selected parents and genetic gain. The latter was possible, because ∆F was constrained. There was little extra gain in genotyping 1000 fish instead of only 100 for these larger schemes. Yet, genotyping of all candidates led to the highest genetic gain, such that the use of batches of 50-1000 candidates resulted in 75-78% of the genetic level that was achieved when all 5000 or 10 000 candidates were genotyped.

Effect of population size on genetic gain
It is expected that genetic gain increases with the size of the breeding scheme, but also that this relationship is not linear. In Figure 1, we see that G increases less when going from 5000 to 10 000 candidates per generation than when going from 1000 to 5000 candidates per generation, but still there is no plateau of genetic gain such that the schemes with even higher number of candidates per generation would probably yield even higher genetic gain than the largest schemes here.

DISCUSSION
This study shows that a combination of walk-back and optimum contribution selection makes it possible to achieve high genetic gains at a constrained rate of inbreeding while substantially reducing the costs of genotyping for parentage testing. With batch sizes of 50-100 fish, genetic level was 75-92% of the genetic level achieved for schemes with all (1000-10 000) fish being genotyped, the higher level being for schemes with high heritability.
In principle, the presented combination of walk-back and optimum contribution selection is a two-stage selection scheme, where in the first stage fish are selected on their phenotypic value and in the second stage, optimum contribution selection is used. Generally, two-stage selection schemes are efficient, as was found here, especially when the correlation between the first and second stage selection criterion is high [21].
The results of this study show that the constraint on ∆F was kept for all schemes. For schemes with a constraint on ∆F of 0.010 or 0.005, one batch of 50 fish was not always sufficient to keep the constraint, but instead two batches were necessary. The reason for working with e.g. two batches of 50 fish instead of one batch with 100 fish is that genotyping costs can be reduced in the first case. If, in some generations of selection, all fish in one batch come from only very few families because they have the highest phenotypic values, we would like to have the opportunity to take in more candidates such that the constraint on ∆F can be kept. In other generations, it might be enough with one batch of candidates.
Breeding values that are calculated using only a selected subset of the total population, which is the case in the schemes in this study, will show a selection bias [12]. These biases may differ for different animals, leading to some reranking of animals and thus to some reduction in accuracy of selection. More research is needed to correct BLUP-EBV for selection biases, because these biases will result in some reduction of accuracy of selection and in biased predictions of selection response.
For these schemes, there was little change in accuracy of the estimated breeding values over batch size, except for the schemes where all candidates were genotyped. For example, for schemes with 5000 candidates per generation, heritability of 0.10 and ∆F restriction of 0.010, accuracy of the estimated breeding values was 0.33, 0.25, 0.28 and 0.55 for batch sizes of 50, 100, 1000 and 5000, respectively (results not presented). The relatively higher accuracy for the batch size of 50 may be explained by the reduced selection intensity due to the low number of selection candidates, which reduces the Bulmer effect [1]. The high accuracy of selection at the large batch size may be due to the larger number of relatives with records. In general, the accuracy of the estimated breeding values was, as expected, higher for schemes with higher heritability.
The results in Tables II and III show approximately no increase in genetic gain when batch size increased from 100 to 1000. This insensitivity of genetic gain to increase in batch size was probably because the number of selected parents was usually around 100 and there was little or no extra gain from the genotyping of 900 more candidates. However, a large increase in genetic gain was found when batch size increased to 5000 and 10 000, respectively. When genotyping all candidates, the accuracy of selection was substantially increased (see previous paragraph), which resulted in this marked increase in genetic gain.
It was assumed that the survival rate was equal for all families. This is not the case in real populations [13]. Very unequal sizes of families within the batches of genotyped animals may often imply that optimum contribution selection does not achieve its constraint and thus that more batches need to be genotyped, i.e. genotyping costs can be increased substantially. It is, however, possible to reduce differences in early survival by keeping families separated until the survival rate has stabilized, when an equal number of fish per family can be mixed.
After the optimum contributions of the candidates had been calculated, progeny were randomly allocated to a sire and a dam with sampling probabilities following the optimal contributions of the sires and dams. This implies that the actual contributions of the sires and dams will deviate by chance from the optimal contributions. It is possible to set up breeding schemes where the number of progeny of the sires and dams correspond more precisely to the optimal contributions, but in real life breeding schemes, the actual number of progeny will most likely deviate from their optimum values. These deviations were simulated here by the sampling deviations from the optimal number of progeny. In real life breeding schemes, the deviations may, however, be of a different nature than simulated here, because they occur for different reasons, e.g. parents may obtain a large full sib family or no offspring, and fish may be mated to a limited number of mates.
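The sampling of parents described above, and the resulting deviation of realised from optimal contributions, can be illustrated with a small sketch. The function names and the dictionary inputs (parent identifier mapped to optimal contribution) are assumptions; each sampled pair is taken to produce one male and one female progeny, so a parent's realised contribution is its number of sampled matings divided by twice the number of matings.

```python
import random
from collections import Counter


def sample_matings(sire_contrib, dam_contrib, n_matings, rng):
    """Sample sire-dam pairs with probabilities following the contributions."""
    sires = list(sire_contrib)
    dams = list(dam_contrib)
    sire_w = [sire_contrib[s] for s in sires]
    dam_w = [dam_contrib[d] for d in dams]
    matings = []
    for _ in range(n_matings):
        s = rng.choices(sires, weights=sire_w, k=1)[0]
        d = rng.choices(dams, weights=dam_w, k=1)[0]
        matings.append((s, d))
    return matings


def realised_contributions(matings):
    """Realised contribution per parent: matings / (2 * number of matings)."""
    n = len(matings)
    sire_counts = Counter(s for s, _ in matings)
    dam_counts = Counter(d for _, d in matings)
    sire_real = {s: k / (2 * n) for s, k in sire_counts.items()}
    dam_real = {d: k / (2 * n) for d, k in dam_counts.items()}
    return sire_real, dam_real
```

Comparing the returned values with the optimal contributions shows the chance deviations; they shrink as the number of matings grows.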
It was assumed that there were no genotyping errors, such that the pedigree was set up without error. In practice, however, there will probably be errors both with the genotyping and the coupling of genotyping results with the physical tag. Practical experience, however, shows high assignment rates of 90-95% using microsatellites [5,23].
The main limitation of these schemes is the assumption that all traits are measured on the candidates. Today, most comprehensive breeding goals include traits such as fillet quality and disease resistance in addition to growth or shape. Most of these traits are not measured on the candidates, but on sibs of the candidates. However, new technology allows for non-invasive measurements of fillet quality traits, e.g. fat deposition in Atlantic halibut [15] or fat composition in Atlantic salmon [22], such that more traits could be measured on the candidates. Although these two methods are non-invasive, they have not yet been used for live fish (under anaesthesia) under large-scale practical conditions. If not all traits can be recorded simultaneously, the earlier and/or least expensive recordings could be used in the first selection step, and the other traits could be recorded later and only on the candidates for the optimum contribution selection step. Challenge tests for disease resistance, based on sib selection schemes today, could be replaced by marker-assisted selection schemes (e.g. variants of bottom-up schemes [16] or top-down schemes [14] developed for livestock species), where candidates can be tested for genetic markers associated with disease resistance. Efficient marker-assisted selection programs for fish have, however, not yet been presented.
The advantages of combined walk-back and optimum contribution selection schemes are the following: (1) The costs of genotyping for parentage testing can be reduced substantially in large fish breeding schemes while maintaining high genetic gain. (2) There are no common environmental effects due to tank since only one tank is used. (3) New technologies will make it possible to measure more traits on the candidates, where the earlier and/or least expensive recordings would be used in the first selection step, and the later in life and/or more expensive recordings could be included in the optimum contribution selection step. This multi-trait selection increases the accuracy of estimated breeding values and hence genetic response. (4) Some of the costs and "infrastructure" associated with marker-assisted selection programs are already in place, e.g. genotyping and the coupling of genotyping results with the physical tag. Hence, an extension to marker-assisted selection schemes would be possible.