Optimum contribution selection for conserved populations with historic migration
© Wellmann et al.; licensee BioMed Central Ltd. 2012
Received: 11 June 2012
Accepted: 17 October 2012
Published: 15 November 2012
In recent decades, local varieties of domesticated animal species have frequently been crossed with economically superior breeds, which has resulted in considerable genetic contributions from migrants. Optimum contribution selection by maximizing gene diversity while constraining breeding values of the offspring, or vice versa, could eventually lead to the extinction of local breeds with historic migration because maximization of gene diversity or breeding values would be achieved by maximization of migrant contributions. Therefore, other objective functions are needed for these breeds.
Different objective functions and side constraints were compared with respect to their ability to reduce migrant contributions, to increase the genome equivalents originating from native founders, and to conserve gene diversity. Additionally, a new method for monitoring the development of effective size for breeds with incomplete pedigree records was applied. Approaches were compared for Vorderwald cattle, Hinterwald cattle, and Limpurg cattle. Migrant contributions could be substantially decreased for these three breeds, but the potential to increase the native genome equivalents is limited.
The most promising approach was constraining migrant contributions while maximizing the conditional probability that two alleles randomly chosen from the offspring population are not identical by descent, given that both descend from native founders.
Many local varieties of domesticated animal species have been established in the last centuries. However, due to agricultural innovations since the beginning of the 19th century and subsequent intensification of production, many landraces are no longer adapted to their changing environments [1, 2]. They have been crossed with superior breeds in order to improve the economic value of the breeding stock. Gene flow usually occurred from the economically most important breeds to the landraces, but not in the opposite direction. Consequently, most historic breeds are now extinct and the remaining ones have considerable genetic contributions from a small number of economically superior breeds. Efforts are needed to prevent the remaining historic breeds and their gene pools from becoming extinct. Conservation efforts can have different objectives. Objectives of breeding programs can be to breed back the historic breeds by removing genetic contributions of migrants, to conserve the breeds in their present appearance, or to increase their economic value. In any case, genetic contributions arising from more frequent breeds are not subject to conservation efforts since their genes are widespread.
Meuwissen  proposed to maximize the expected mean breeding value of the offspring while constraining its gene diversity to a predefined value. A related but not equivalent approach is to maximize the gene diversity in the offspring with or without constraining its expected mean breeding value to a predefined value. In this paper, the latter approach is applied and generalized. This approach seems more appropriate for conserved populations because for these populations the focus is on conservation. In general, the method consists of calculating an optimum contribution c a (or the desired number of offspring) for each breeding individual a such that the offspring population maximizes an appropriate objective function ϕ under some side conditions.
In the classical approach  (Approach A) the gene diversity GD in the offspring O(c) is maximized, where the vector c contains the genetic contribution of each breeding individual to the offspring population. Thus, ϕ A (c) = GD(O(c)). Gene diversity of a population is the probability that two alleles randomly chosen from the population are not identical by descent (IBD). However, this objective function may not be appropriate for conserved populations because maximization of gene diversity could be achieved by maximization of genetic contributions of migrants. Thus, this approach could eventually lead to extinction of the native breeds. Gene diversity should not fall below a certain level in order to avoid inbreeding depression. Gene diversity is, however, not the parameter that should be maximized in conserved populations. In conserved populations, we are interested in the conservation of alleles that come from native founders, as migrant alleles usually originate from non-endangered breeds. That is, we want to maximize the probability ϕ B that both alleles are not IBD and descended from native founders (Approach B), or the probability ϕ C that both alleles are not IBD and at least one of them descended from native founders (Approach C). We also considered the possibility of maximizing the conditional probability ϕ D that both alleles are not IBD, given that both descended from native founders (Approach D). For Approach D, we constrained the mean migrant contribution in the offspring population.
Lacy  introduced the concept of founder genome equivalents (FGE). The FGE of a population is the minimum number of unrelated founders that would be needed to establish a population that has the same gene diversity as the population under study. Recall that gene diversity is the probability that two alleles chosen at random are not IBD. However, a more important parameter to characterize the value of a breed for conservation purposes is the conditional probability that two randomly chosen alleles are not IBD, given that both descended from native founders. We call it the conditional gene diversity of the population. Large conditional gene diversity indicates that many native founder alleles have been retained in the population even though they may be at low frequencies. This has led to the following definition of the native genome equivalents (NGE) of a population as the minimum number of unrelated founders that would be needed to establish a population that has the same conditional gene diversity as the population under study. It can be interpreted as the FGE that originate from native founders and that are still present in the population. Besides maintaining the economic value of the breed, the main objective of a conservation program for a population with historic migration is to maximize the NGE and to minimize the genetic contributions of migrants simultaneously.
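The link between gene diversity and genome equivalents can be made concrete with a short sketch. It assumes the standard relation that n unrelated, non-inbred diploid founders, sampled with replacement, have gene diversity 1 − 1/(2n), so that n = 1/(2(1 − GD)); applying the same relation to the conditional gene diversity gives the NGE. The function name and the numerical values are illustrative and are not taken from the paper.

```python
def genome_equivalents(diversity):
    """Return n such that n unrelated founders have the given diversity.

    Assumption: two alleles drawn with replacement from n unrelated,
    non-inbred diploid founders are IBD with probability 1/(2n),
    hence diversity = 1 - 1/(2n).
    """
    if not 0.0 <= diversity < 1.0:
        raise ValueError("diversity must lie in [0, 1)")
    return 1.0 / (2.0 * (1.0 - diversity))


# Hypothetical inputs: gene diversity 0.98, conditional gene diversity 0.90.
fge = genome_equivalents(0.98)  # roughly 25 founder genome equivalents
nge = genome_equivalents(0.90)  # roughly 5 native genome equivalents
```

With these made-up inputs the population as a whole looks diverse, while the native part of the genome corresponds to only about five founders, which is the kind of discrepancy the NGE is meant to expose.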
In this paper, we compare objective functions ϕ A ,ϕ B , ϕ C and ϕ D with respect to their ability to conserve the gene diversity, to increase the FGE originating from native founders (i.e. the NGE), and to decrease the genetic contributions of migrants. Algorithms for solving these optimization problems are also derived and implemented in the R package PedAnalysis. Methods were applied and effective sizes were calculated for three German cattle breeds: Vorderwald, Hinterwald and Limpurg.
Since the methods were applied to populations with overlapping generations, all definitions are based on birth cohorts rather than generations. A birth cohort J is a set of individuals born in a particular time interval, e.g. the individuals B t born in year t, or the population P t at time t. Since the date of death is unknown in most cases, the population P t consists of all individuals up to a particular age T. This age T could be the average age of individuals when their last offspring was born, or, for simplicity, it could be the generation interval I. Thus, population P t consists of all individuals born in the time interval [t − T, t].
The gene diversity of birth cohort J is GD(J) = P(X J and Y J are not IBD), where alleles X J and Y J are randomly chosen with replacement from birth cohort J, and founder alleles are assumed to be pairwise different. An equivalent representation is GD(J) = 1 − f̄ J , where f̄ J is the average coancestry in birth cohort J.
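As an illustration of how gene diversity follows from the average coancestry of a cohort, the sketch below computes coancestries with the textbook tabular method and then takes one minus their mean over all ordered pairs, including each individual with itself (sampling with replacement). The function names and the toy pedigree are hypothetical; this is not the implementation in PedAnalysis.

```python
def coancestry_matrix(pedigree):
    """Tabular method for coancestries.

    pedigree: list of (id, sire, dam) with parents listed before their
    offspring; None marks an unknown parent, treated as an unrelated founder.
    Returns the matrix and a dict mapping id -> row index.
    """
    index = {}
    n = len(pedigree)
    f = [[0.0] * n for _ in range(n)]
    for k, (ind, sire, dam) in enumerate(pedigree):
        s = index.get(sire)
        d = index.get(dam)
        for j in range(k):
            # Coancestry with an earlier individual is the mean over parents.
            f_sj = f[s][j] if s is not None else 0.0
            f_dj = f[d][j] if d is not None else 0.0
            f[k][j] = f[j][k] = 0.5 * (f_sj + f_dj)
        # Self-coancestry is (1 + inbreeding) / 2.
        f_sd = f[s][d] if s is not None and d is not None else 0.0
        f[k][k] = 0.5 * (1.0 + f_sd)
        index[ind] = k
    return f, index


def gene_diversity(f, cohort):
    """One minus the mean coancestry over all ordered pairs in the cohort."""
    pairs = [(i, j) for i in cohort for j in cohort]
    return 1.0 - sum(f[i][j] for i, j in pairs) / len(pairs)


# Toy pedigree: two unrelated founders and their two full-sib offspring.
pedigree = [("S", None, None), ("D", None, None),
            ("A", "S", "D"), ("B", "S", "D")]
f, index = coancestry_matrix(pedigree)
gd_sibs = gene_diversity(f, [index["A"], index["B"]])
```

For the cohort consisting of the two full sibs, the mean coancestry is (0.5 + 0.25 + 0.25 + 0.5)/4 = 0.375, so the gene diversity is 0.625.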
where is the set of alleles that come from native founders and is the set of alleles that come from migrants.
A further parameter that can be of interest is the effective size of the population. The effective size N e ([t 1 , t 2 ]) of a population within a time interval [t 1 , t 2 ] is the size of an idealized random mating population of constant size that causes the same decrease of gene diversity as the true population within (t 2 − t 1 )/I generations. However, in breeds with steady gene flow from other populations, the gene diversity does not decrease below a certain level, so this definition of the effective size is not meaningful for populations with migration. Therefore, we use a slightly different definition. We define the native effective size N eN ([t 1 , t 2 ]) as the size of an idealized random mating population of constant size that causes the same decrease of the conditional gene diversity condGD(P t ) as the true population within (t 2 − t 1 )/I generations. The effective population size at time t, defined as N eN (t) = lim ε→0 N eN ([t − ε, t + ε]), was calculated as described in , except that it was calculated from the conditional gene diversity. The native effective size quantifies the decrease of genome equivalents originating from native founders because the NGE depend only on the conditional gene diversity, as can be seen from the previous two equations. In a population without migration, N e and N eN are equal. However, in a population with steady gene flow from other populations, N eN is smaller than N e because the gene diversity approaches a plateau level, so N e (t) goes to infinity.
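The interval-based definition can be illustrated with a small sketch. It assumes the textbook relation that, in an idealized population of size N, diversity shrinks by the factor 1 − 1/(2N) per generation, and that the number of generations in the interval is its length divided by the generation interval I. The function name and inputs are hypothetical, and this is not the estimator the authors used for N eN (t), which is defined as a limit.

```python
import math


def effective_size(gd_start, gd_end, t_start, t_end, generation_interval):
    """Size of an idealized population with the same diversity decline.

    Assumption: gd_end / gd_start = (1 - 1/(2 N)) ** g, with
    g = (t_end - t_start) / generation_interval generations.
    Pass conditional gene diversities to obtain a native effective size.
    """
    generations = (t_end - t_start) / generation_interval
    if generations <= 0:
        raise ValueError("t_end must be later than t_start")
    per_generation = (gd_end / gd_start) ** (1.0 / generations)
    if per_generation >= 1.0:
        # No decline (e.g. diversity held up by migration): size is unbounded.
        return math.inf
    return 1.0 / (2.0 * (1.0 - per_generation))


# Hypothetical conditional gene diversities one generation (5.3 years) apart.
ne_native = effective_size(0.95, 0.94, 2000.0, 2005.3, 5.3)
# An overall gene diversity that does not decline gives an infinite size.
ne_plateau = effective_size(0.98, 0.98, 2000.0, 2005.3, 5.3)
```

In this made-up example the native effective size is about 47.5, whereas the plateau in overall gene diversity yields an infinite value, mirroring the argument that N e becomes uninformative under steady migration.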
The population P t at time t, which consists of all individuals up to an age of T years, has gene diversity GD(P t ), native genome equivalents , and genetic contribution from native founders. Note that , so is the probability that a randomly chosen allele from age cohort J descends from a native founder. Besides monitoring of these quantities, a major task for a conservation program is the calculation of optimal genetic contributions for the breeding individuals that maximize the conditional gene diversity in the offspring and simultaneously maximize the genetic contribution from native founders in the offspring. Moreover, a sufficient level of gene diversity must be maintained in order to avoid inbreeding depression. In general, however, the quantities and cannot be maximized simultaneously, so an objective function is needed that considers each appropriately.
is maximized. This approach is intuitively appealing because it maximizes NGE. It has, however, the disadvantage that the conditional gene diversity can be large even for offspring populations with very large migrant contributions. This is due to conditioning on the event that the randomly chosen alleles X J and Y J originate from native founders. This can be seen as follows. Take a solution of the optimization problem and suppose that at least one migrant is a potential breeding individual. Then it can be shown mathematically that the genetic contribution of this migrant to the offspring population can be arbitrarily increased without changing the value of the objective function. Thus, the solution of the optimization problem may not be unique, and one solution maximizes migrant contributions. In order to avoid this, we put an additional constraint on the maximum permissible value for the genetic contribution from migrants to the offspring population.
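The non-uniqueness argument can be checked numerically with a toy example. The sketch assumes that the conditional gene diversity of the offspring can be written as one minus a ratio of two quadratic forms in the contribution vector c: the numerator uses pairwise probabilities that two sampled alleles are IBD and both native, the denominator uses pairwise probabilities that both are native. The matrix names and the three candidates (two unrelated pure native animals and one pure migrant) are hypothetical.

```python
def quad_form(matrix, c):
    """Return c' M c for a square matrix given as nested lists."""
    n = len(c)
    return sum(c[i] * matrix[i][j] * c[j] for i in range(n) for j in range(n))


def conditional_gene_diversity(c, ibd_and_native, both_native):
    """1 - P(IBD and both native) / P(both native) in the offspring."""
    return 1.0 - quad_form(ibd_and_native, c) / quad_form(both_native, c)


# Candidates 0 and 1: unrelated, non-inbred, purely native. Candidate 2: migrant.
both_native = [[1.0, 1.0, 0.0],
               [1.0, 1.0, 0.0],
               [0.0, 0.0, 0.0]]
ibd_and_native = [[0.5, 0.0, 0.0],
                  [0.0, 0.5, 0.0],
                  [0.0, 0.0, 0.0]]

without_migrant = conditional_gene_diversity([0.5, 0.5, 0.0],
                                             ibd_and_native, both_native)
half_migrant = conditional_gene_diversity([0.25, 0.25, 0.5],
                                          ibd_and_native, both_native)
```

Both contribution vectors give a conditional gene diversity of 0.75, although the second assigns half of the offspring gene pool to the migrant. Shifting weight to the migrant scales numerator and denominator by the same factor, which is why a separate constraint on the migrant contribution is needed.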
where allele X i is randomly chosen from the two alleles of individual i at a particular locus.
These probabilities have the advantage that they can easily be computed with existing software, e.g. with function kinship() from the R-package kinship. For calculation of , the parents of all migrants were identified with the same dummy individual and for this individual a pedigree with several generations of selfing was added. The coancestry of individuals i, j, computed from this extended pedigree is equal to . Equality holds only approximately because only a finite number of generations of selfing was added. For calculation of , the parents of all migrants were identified with one single dummy individual, the parents of all native founders were identified with another single dummy individual, and for both individuals pedigrees with several generations of selfing were added. The coancestry of individuals i, j, computed from this extended pedigree, is equal to . For example, consider two full sibs i, j whose sire is a migrant and whose dam is a native founder. Their coancestry is , but , and .
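The effect of the dummy-individual construction on the full-sib example can be verified by exact enumeration. The sketch assumes that enough generations of selfing make all alleles passed on by a dummy individual identical, so that attaching all migrants to one dummy amounts to treating every pair of migrant alleles as a match, and attaching all native founders to a second dummy does the same for native alleles. Allele labels and function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

SIRE = ("m1", "m2")  # alleles of the migrant sire
DAM = ("n1", "n2")   # alleles of the native dam


def match_probability(same):
    """Probability that alleles sampled from two full sibs count as a match."""
    hits = Fraction(0)
    cases = 0
    # Each sib inherits one sire allele and one dam allele (equally likely).
    for a, b, c, d in product(SIRE, DAM, SIRE, DAM):
        # One allele is sampled at random from each sib.
        for x, y in product((a, b), (c, d)):
            hits += 1 if same(x, y) else 0
            cases += 1
    return hits / cases


# True pedigree: only copies of the same founder allele are IBD.
classical = match_probability(lambda x, y: x == y)
# One dummy for migrants: all migrant alleles are treated as identical.
migrants_merged = match_probability(
    lambda x, y: x == y or (x[0] == "m" and y[0] == "m"))
# Two dummies: alleles match whenever they share the same origin class.
both_merged = match_probability(lambda x, y: x[0] == y[0])
```

Under these assumptions the enumeration gives 1/4 for the true pedigree, 3/8 when only migrant alleles are merged, and 1/2 when both migrant and native alleles are merged.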
Let be the N t ×N t coancestry submatrix for the N t individuals from population P t that is obtained from the true pedigree (i.e., . The N t ×N t matrix that contains the probabilities for each pair of individuals i, j from population P t is denoted as , and the N t ×N t matrix that contains the probabilities is denoted as . That is, rows and columns that correspond to individuals not born in time interval [t-T,t] and dummy individuals were excluded from the matrix.
where the approximation is exact if the events and are independent. Therefore, an approximate solution was obtained by maximizing objective function ϕ B under the additional constraint . The resulting contributions for the breeding individuals were used as starting values for general nonlinear optimization in order to obtain the exact solution. In the applications, the threshold value was quite arbitrarily chosen as the 75% quantile of the genetic contributions from native founders to individuals in the population. The same quantile was used for all breeds and years in order to make the results comparable. Results could be improved by choosing breed dependent threshold values.
We used the interior point method ipop in R-package kernlab (see ) for objective functions ϕ B and ϕ D , whereas for objective functions ϕ A and ϕ C with positive definite matrices we used solve.QP from R-package quadprog. It implements the dual method of Goldfarb and Idnani [11, 12].
Only three local cattle varieties of Baden and Württemberg in the south-west of Germany have been preserved from extinction. These are the Vorderwald cattle, Hinterwald cattle, and Limpurg cattle. Other local breeds were replaced by Simmentaler Fleckvieh after their introduction at the beginning of the 19th century because the small landraces were not suitable for tillage .
The small Hinterwald cattle could be preserved as an almost pure breed until the beginning of the 20th century [13, 14] because the poor soil quality in its region of origin was not suitable for larger breeds. Nevertheless, this breed adopted the colour of the Simmentaler Fleckvieh during the 19th century . The Hinterwald cattle were occasionally crossed with the Vorderwald cattle  and with Fleckvieh.
The red-and-white marked, colour-sided Vorderwald cattle were frequently crossed with Simmentaler cattle. Consequently, the white stripe along the back became rare already around 1900 . After the Second World War, Vorderwald cattle were also crossed with Ayrshire, Red Holstein and Montbéliard cattle in order to improve milk yield. These crosses were registered as Vorderwald cattle. Extinction probabilities for Vorderwald and Hinterwald cattle were estimated by .
The yellow coloured Limpurg cattle were not only frequently crossed with Simmentaler cattle , but also occasionally with Braunvieh and Gelbvieh cattle  in order to increase body size. Nevertheless, the population size decreased dramatically. Only 17 Limpurg cows were registered in 1967, so the breeding association was dissolved. Several Limpurg cattle, however, were rediscovered in 1986 and a new stud book was established. Not only Limpurg cattle were registered, but also Fleckvieh crosses, and some Gelbvieh and Glan-Donnersberger bulls .
The data consisted of the pedigrees and additional information on 25 412 Hinterwald cattle, 185 315 Vorderwald cattle, and 4 150 Limpurg cattle. Vorderwald cattle without offspring were removed from the data in order to reduce the data set. Pedigrees of Hinterwald and Vorderwald cattle trace back only to 1948 because the stud books were renewed after the Second World War. Pedigrees of Limpurg cattle trace back only to 1970. Cattle from other breeds were considered to be migrants. Additionally, Hinterwald and Vorderwald cattle with unknown pedigree born after t s = 1970 were also considered migrants, although some may have purebred ancestors. Limpurg cattle with unknown pedigree were considered to be migrants if they were born after t s = 1988. The generation intervals were similar for the three breeds (unpublished results). Here, we assumed a generation interval of I = 5.3 years for all breeds.
The right hand side of Figure 1 shows for each breed how the genetic contributions of migrants changed over time. Migrant contributions are shown for the true population P t and for the hypothetical offspring populations that would be obtained if optimum contribution selection were applied to population P t . The solid lines show that migrant contributions increased steadily for all three breeds. The dashed line for offspring A shows that all three breeds would become extinct if optimum contribution selection were used to maximize the gene diversity in the offspring. In contrast, objective functions ϕ B and ϕ C would reduce migrant contributions substantially by more than 50% in all three breeds. According to the constraint applied for objective function ϕ D , the corresponding line shows the 25% quantile of the migrant contributions in the population.
The right hand side of Figure 2 shows the changes in gene diversity. It can be seen that the gene diversity is high for all three breeds. This is caused by migration. Note that the native effective population size quantifies the decrease of genome equivalents arising from native founders, so the gene diversity can be constant (or increase due to migration) even if the native effective population size is small. As expected, optimum contribution selection with objective functions ϕ B , ϕ C , or ϕ D would cause a moderate but acceptable loss of gene diversity.
Most of the time, the native effective size N eN was above 50 for the three breeds and, due to migration, N e was larger than N eN . An effective size of at least 50 is considered acceptable, although an N e of 100 is recommended to be on the safe side . Many cattle breeds have effective sizes between 50 and 100 regardless of the total population size. Therefore, in order to conserve the overall gene diversity, it is generally recommended to conserve a large number of breeds with small population sizes rather than a small number of breeds with large population sizes. In this case, different alleles would be preserved in different subpopulations. These populations can be used as resources to identify advantageous genes that can be introgressed into commercial populations. Conserved populations must be sufficiently large to allow for this. However, breeds that are close to the economic viability threshold, and populations that are expected to occupy niches different from those of established commercial breeds, should have larger population sizes in order to enable a sufficient selection response. Examples of the importance of farm animal genetic resources are the introgression of the polled gene into economically important cattle breeds, the introduction of indicine cattle breeds to South America because of their adaptation to extreme environments, and introgression of genes for disease resistance into highly productive susceptible breeds .
The current N eN of the Vorderwald cattle was smaller than the estimates of the effective size obtained by  with other methods. The reason is probably that other methods do not distinguish between migrants and native founders. Genome equivalents arising from native founders are likely to decline faster than those arising from migrants because migrants are usually from economically superior breeds. The sufficiently large N eN values show that for all three breeds, migration from other breeds was much larger than was needed to avoid unacceptably high inbreeding depression. As a consequence, these breeds share only a small portion of their genes with the corresponding historic breeds of the same name. We showed that it is still possible to substantially increase the genetic contribution from the historic breeds by optimum contribution selection.
For optimum contribution selection, the choice of the objective function was crucial. Maximization of gene diversity (Approach A) turned out to substantially increase the migrant contributions and thus would lead to the extinction of these breeds. Approach B has the desired effect of substantially decreasing the migrant contributions but does not put enough weight on the conservation of gene diversity. It is not recommended because it would reduce NGE and cause the largest loss of gene diversity. Approach C is recommended for conserved populations because for all three breeds the use of this objective function substantially decreased the migrant contributions, increased the NGE, and caused only a moderate decrease of gene diversity. Approach D can also be recommended, although it requires the choice of a threshold for the migrant contributions. If the threshold is chosen appropriately, then this approach causes the largest increase in NGE. However, the potential to increase the NGE was small for the breeds considered. Interestingly, for the current populations, optimum contributions for Approach A were slightly negatively correlated with the optimum contributions obtained for the other approaches, whereas the optimum contributions for Approaches B, C, and D were pairwise positively correlated (not shown).
Amador et al. proposed two other approaches to reduce migrant contributions. Their first approach was to minimize migrant contributions in the offspring population. Their second approach was to minimize the probability that two alleles randomly chosen from the offspring population are IBD and descend from migrants. This objective function was computed from partial coancestry coefficients , but could also be computed by the methodology introduced in this paper. For both approaches, the maximum rate of inbreeding was restricted. However, provided that an acceptable rate of inbreeding can be achieved, it is not obvious why it is desirable that alleles originating from migrants should not be IBD in the offspring population. In contrast, all approaches proposed in this paper aim at increasing the probability that alleles originating from native founders are not IBD. Amador et al. concluded that even with only a few generations without management, a small amount of introgression can spread into the population and it may be almost impossible to recover. This was not observed in our study. The reason is probably that the total population sizes of the cattle breeds were much larger than their effective sizes, which increased the probability of finding individuals with small migrant contributions. Moreover, the cattle populations may deviate from random mating populations because some breeders avoid the use of bulls with high migrant contributions.
Another approach could be to minimize the effective number of non-founders N enf , as defined by Caballero and Toro , in the offspring population. This approach would be equivalent to maximization of which would be achieved by increasing the average relationship in the offspring population O and by increasing the effective number N ef of founders in the offspring generation. Thus, the rate of inbreeding would have to be restricted by this alternative approach. This approach, however, would by definition not be optimal with respect to the objective functions introduced in this paper.
Our results show that migrant contributions can be substantially decreased for all three breeds, but the potential to increase the NGE is limited. The reduction of migrant contributions would be largely achieved in the first generation of management. In subsequent generations, some further improvement would be possible due to biological restrictions in previous generations. However, thereafter the management method becomes equivalent to an equalization of family sizes and no further reduction of migrant contributions could be achieved. Moreover, pedigree-based optimum contribution selection cannot remove genetic contributions of migrants that arose before recording of pedigrees started. However, removal of migrant contributions that arose earlier can be done subsequent to pedigree-based optimum contribution selection by identification of chromosome segments that are also present in the migrant breeds and by removal of those individuals with large migrant contributions from the breeding pool. Since migrants are usually males, haplotype variants of the Y-chromosome can be used as markers for paternal lineage  to identify the migrant breeds. For individuals that are not removed from the breeding pool (i.e., individuals with small migrant contributions), optimum contributions can be calculated based on genomic relationships. To prevent this approach from increasing the frequencies of migrant alleles, the set of breeding individuals could be enlarged with individuals of the migrant breeds. After the optimum contributions have been computed, the contributions of these additional migrant individuals are set to zero, and the optimum contributions for individuals of the breed of interest are rescaled so that they add up to one.
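The final rescaling step described above is simple enough to sketch. The example assumes that optimum contributions are available as a mapping from individual to contribution and that the added migrant-breed individuals are flagged; the function name, identifiers and numbers are hypothetical.

```python
def drop_migrants_and_rescale(contributions, migrant_breed_ids):
    """Zero the added migrant-breed individuals and renormalize the rest.

    contributions: dict id -> optimum contribution (summing to one).
    migrant_breed_ids: ids that were added only to enlarge the candidate set.
    """
    kept = {ind: (0.0 if ind in migrant_breed_ids else value)
            for ind, value in contributions.items()}
    total = sum(kept.values())
    if total <= 0.0:
        raise ValueError("no contribution left for the breed of interest")
    return {ind: value / total for ind, value in kept.items()}


# Hypothetical optimum with one added individual "m" from a migrant breed.
optimum = {"a": 0.3, "b": 0.2, "m": 0.5}
rescaled = drop_migrants_and_rescale(optimum, {"m"})
```

Here the two animals of the breed of interest end up with contributions of 0.6 and 0.4, preserving their ratio, while the added migrant individual receives none.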
Thereafter, it would be beneficial to combine closely related breeds with low gene diversity in order to reduce extinction probabilities , and to split breeds with a high gene diversity into several subpopulations in order to reduce the decrease of overall gene diversity . Breeds with highest value for conservation should be given priority . These breeds are likely found near the domestication center (since genetic diversity declines with increasing distance from the domestication centre ), far from the native areas of economically superior breeds, or live in harsh environmental conditions. Candidates are also breeds that are used for uncommon purposes (e.g. fighting cattle, cattle breeds used for cow racing).
The usual recommendation to optimize contributions for breeding individuals by maximizing gene diversity in the offspring is not suitable for populations with historic migration because maximization of gene diversity would be achieved by maximization of migrant contributions. Thus, this approach, applied to populations with migration, would rapidly lead to their extinction. Two approaches can be recommended. The first is to maximize the probability that two alleles randomly chosen from the offspring population are not IBD and that at least one of them descended from a native founder (Approach C). The other approach is to constrain migrant contributions while maximizing the conditional probability that two alleles randomly chosen from the offspring population are not IBD, given that both descended from native founders (Approach D). Migrant contributions could be substantially decreased for the three breeds investigated here, but the potential to increase the NGE is limited.
Programs for pedigree-based optimum contribution selection and for the analyses presented in this paper are available in R package PedAnalysis from the first author. Since migrants are usually from genetically superior breeds, optimum contribution selection is likely to reduce breeding values if there is no constraint on the expected breeding value of the offspring. The program for optimum contribution selection allows adding the constraint that the expected mean breeding value of the offspring does not fall below a certain value. Moreover, it is possible to put a constraint on the maximum number of offspring per male and female.
The data were kindly provided by Henning Hamann, Landesamt für Geoinformation und Landentwicklung Baden-Württemberg, Germany. The authors thank the referees for pointing us to another paper dealing with this subject.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.