The use of molecular markers in conservation programmes of live animals

Monte Carlo simulation has been carried out to study the benefits of using molecular markers in a conservation programme to minimize the homozygosity by descent in the overall genome. Selection of the breeding individuals was either at random or based on two alternative criteria: overall heterozygosity of the markers or frequency-dependent selection. Even when molecular information was available for all the 1 900 simulated loci, a conventional tactic such as restriction in the variance of the family size was the most important strategy for maintaining genetic variability. In this context: a) frequency-dependent selection seems to be a more efficient criterion than selection for heterozygosity; and b) the value of marker information increases as the selection intensity increases. Results from more realistic cases (1, 2, 3, 4, 6 or 10 markers per chromosome and 2, 4, 6 or 10 alleles per marker) confirm the above conclusions. This is an expensive strategy with respect to the number of candidates and the number of markers required in order to obtain substantial benefits, the usefulness of a marker being related to the number of alleles. The minimum coancestry mating system was also compared with random mating and it is concluded that it is advantageous at least for many generations. © Inra/Elsevier, Paris


INTRODUCTION
The interest in conserving different breeds and strains of farm livestock has arisen owing to the awareness of dangers created by the continuous decrease in the number of commercially exploited breeds and/or by the reduction of genetic variability imposed in modern breeding programmes [14].
The limited size of conserved populations of domestic strains causes inbreeding and loss of genetic variance, which lowers the performance of animals for at least some traits and increases the risk of extinction [12]. There are several ways to measure genetic variation and its loss, but there is a consensus that, in populations with genealogical records, the calculation of inbreeding and coancestry coefficients is the most common tool for monitoring conservation schemes and for designing strategies to minimize inbreeding [3,4].
The application of new technologies in molecular biology provides information on genotypes of several polymorphic loci and therefore allows one to quantify the genetic variability by a list of alleles and their joint distribution of frequencies at many loci. A summary of this information is given by the observed genetic heterozygosity (homozygosity) defined as the proportion of loci heterozygous (homozygous) either at individual or at population level. Other measures are the effective number of alleles or the expected genetic heterozygosity, both related to the squares of allele frequencies [1,2]. The use of molecular markers allows one to increase the efficiency of conservation methods. Chevalet and Rochambeau [8] proposed a selection using an index equal to the inverse of the product of the frequencies of the alleles and more recently Chevalet [7] proposed a selection using an index equal to the heterozygosity measured at several marker loci.
In this paper, we present Monte Carlo simulation results on the benefits of using molecular information in a small conservation nucleus, considering different alternatives: individual or within-family selection, heterozygosity or frequency-dependent selection and random or minimum coancestry mating.

SIMULATION
The breeding population consisted of $N_s$ (= 4, 8 or 16) sires and $N_d = 3N_s$ dams. Each dam produced three progeny of each sex. These $3N_d$ offspring of each sex were the maximum possible number of candidates for selection to form the breeding individuals of the next generation.
The genome was simulated as 19 chromosomes, each with 100 loci placed at 1 cM intervals. At each locus, all $2(N_s + N_d)$ alleles of the founder population were considered different by descent. For selection purposes, a variable number of marker loci with a variable number of alleles were also situated in the chromosomes in an equally spaced manner. These marker loci were generated in linkage equilibrium in the base population.
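The genome model described above can be sketched as follows. This is only an illustration, not the authors' program: it assumes that 1 cM between adjacent loci corresponds to a crossover probability of 0.01 with no interference, and all function names are hypothetical.

```python
import random

N_CHROMOSOMES = 19          # from the text
LOCI_PER_CHROMOSOME = 100   # from the text, loci 1 cM apart
RECOMB_RATE = 0.01          # assumption: 1 cM ~ 1 % crossover between adjacent loci


def make_founder(founder_id):
    """Founder genome: two haplotypes, each carrying its own unique allele label."""
    hap_a = [[2 * founder_id] * LOCI_PER_CHROMOSOME for _ in range(N_CHROMOSOMES)]
    hap_b = [[2 * founder_id + 1] * LOCI_PER_CHROMOSOME for _ in range(N_CHROMOSOMES)]
    return hap_a, hap_b


def make_gamete(parent, rng):
    """Sample one gamete; chromosomes segregate independently of each other."""
    hap_a, hap_b = parent
    gamete = []
    for chrom_a, chrom_b in zip(hap_a, hap_b):
        strands = (chrom_a, chrom_b)
        current = rng.randrange(2)  # random starting strand
        chrom = []
        for locus in range(LOCI_PER_CHROMOSOME):
            if locus > 0 and rng.random() < RECOMB_RATE:
                current = 1 - current  # crossover between adjacent loci
            chrom.append(strands[current][locus])
        gamete.append(chrom)
    return gamete


def homozygosity_by_descent(individual):
    """Proportion of the 1 900 loci carrying two copies of the same founder allele."""
    hap_a, hap_b = individual
    same = sum(a == b for ca, cb in zip(hap_a, hap_b) for a, b in zip(ca, cb))
    return same / (N_CHROMOSOMES * LOCI_PER_CHROMOSOME)
```

An offspring is then the pair `(make_gamete(sire, rng), make_gamete(dam, rng))`; offspring of two unrelated founders have a homozygosity by descent of zero, as expected.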
Selection was either at random or based on two alternative criteria based on genetic markers. a) Selection for overall heterozygosity of the markers (HET), where the value of the genotype at each locus was computed as 1 if it was heterozygous, or 0 if it was homozygous, the value of an individual being the sum over loci.
b) Frequency-dependent selection (FD), where the value assigned to the genotype increased as the population frequency of the alleles that make this genotype decreased. There are many possible schemes of frequency-dependent selection but perhaps the simplest one is that proposed by Crow [9] in his basic textbook on population genetics. In this particular scheme, the value of the genotype $A_iA_j$ at each locus is $(1 - p_i/2)(1 - p_j/2)$, $p_i$ and $p_j$ being the frequencies of the $A_i$ and $A_j$ alleles, respectively, and therefore the homozygote for the rare allele is favoured over the heterozygote, which is favoured over the homozygote for the more common allele (except when the allelic frequencies are equal, where heterozygotes are favoured). For biallelic dominant markers, the equivalent method is to assign to the genotypes $A_2A_2$ and $A_1A\_$ the values $(1 - p_2/2)^2$ and $(1 - p_1/2)^2$, respectively. The value of an individual is the sum over all the marker loci. In a small number of additional simulations, the effective number of alleles of the selected individuals as a group was used as selection criterion. By analogy with the concept defined by Crow and Kimura [10], this parameter was calculated as $n_a = L / \sum_j \sum_i p_{ij}^2$, where $p_{ij}$ is the average frequency, in the selected population, of the allele $i$ at locus $j$, and $L$ is the number of marker loci.
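The two marker-based criteria can be written as short scoring functions. The sketch below is illustrative only: the data layout (a genotype as a list of allele pairs, one per marker locus, and frequencies as one dictionary per locus) and the function names are assumptions, not taken from the paper.

```python
from collections import Counter


def allele_frequencies(population):
    """Allele frequencies per marker locus over a list of codominant genotypes."""
    n_loci = len(population[0])
    total = 2 * len(population)
    freqs = []
    for k in range(n_loci):
        counts = Counter()
        for genotype in population:
            counts.update(genotype[k])  # counts both alleles of the pair
        freqs.append({allele: c / total for allele, c in counts.items()})
    return freqs


def het_score(genotype):
    """HET criterion: number of heterozygous marker loci."""
    return sum(1 for a, b in genotype if a != b)


def fd_score(genotype, freqs):
    """FD criterion: sum over loci of (1 - p_i/2)(1 - p_j/2)."""
    return sum(
        (1 - freqs[k][a] / 2) * (1 - freqs[k][b] / 2)
        for k, (a, b) in enumerate(genotype)
    )
```

With a single locus and frequencies 0.2 and 0.8, the FD values are 0.81 for the rare homozygote, 0.54 for the heterozygote and 0.36 for the common homozygote, which reproduces the ranking described in the text.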
Two types of selection were also considered: a) within-family selection (WFS), where each dam family contributed one dam and each sire family contributed one sire to the next generation; b) individual selection (IND) where no restriction was imposed on the number of breeding animals that each family contributed to the next generation. Two types of matings were implemented: a) random mating, and b) minimum coancestry mating where the average pairwise coancestry coefficient in the selected group was minimized. Minimum coancestry mating was implemented using linear programming techniques [20].
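The difference between the two types of selection can be illustrated with a minimal sketch. Candidates are represented here as dictionaries with hypothetical keys (`score`, `sire`, `dam`); this layout and the function names are assumptions made for the example.

```python
def select_individual(candidates, n_selected):
    """IND: take the best-scoring candidates regardless of family."""
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    return ranked[:n_selected]


def select_within_family(candidates, family_key):
    """WFS: take the best-scoring candidate of each family.

    Use family_key="sire" for male candidates (one sire per sire family)
    and family_key="dam" for female candidates (one dam per dam family).
    """
    best = {}
    for cand in candidates:
        fam = cand[family_key]
        if fam not in best or cand["score"] > best[fam]["score"]:
            best[fam] = cand
    return list(best.values())


males = [
    {"id": 1, "sire": "S1", "dam": "D1", "score": 5},
    {"id": 2, "sire": "S1", "dam": "D2", "score": 9},
    {"id": 3, "sire": "S2", "dam": "D4", "score": 3},
    {"id": 4, "sire": "S1", "dam": "D3", "score": 8},
]
ind = select_individual(males, 2)          # both chosen males are sons of S1
wfs = select_within_family(males, "sire")  # one son from S1, one from S2
```

In this toy example individual selection picks two half-sibs, whereas within-family selection keeps both sire families represented.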
The selection scheme was carried out for 15 generations. In each generation, several parameters were calculated: a) the proportion of the genome identical by descent calculated over the 1 900 loci that describe the genome; b) the proportion of homozygosity for the marker loci used in the selection criterion; c) the average inbreeding and coancestry coefficients of selected individuals calculated from the pedigrees; and d) the effective number of alleles calculated as previously indicated.
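Pedigree-based inbreeding and coancestry coefficients such as those in c) are commonly obtained with the tabular method. The paper does not state which algorithm was used, so the following is a generic sketch under the assumption that parents always precede their offspring in the list and that founders are unrelated and non-inbred.

```python
def coancestry_matrix(pedigree):
    """Tabular method.

    pedigree[i] = (sire_index, dam_index), or (None, None) for a founder.
    Returns the matrix f, where f[i][j] is the coancestry between i and j
    and f[i][i] = (1 + F_i) / 2.
    """
    n = len(pedigree)
    f = [[0.0] * n for _ in range(n)]
    for i, (s, d) in enumerate(pedigree):
        # Inbreeding of i equals the coancestry between its parents.
        f_i = f[s][d] if s is not None and d is not None else 0.0
        f[i][i] = 0.5 * (1.0 + f_i)
        for j in range(i):
            fs = f[j][s] if s is not None else 0.0
            fd = f[j][d] if d is not None else 0.0
            f[i][j] = f[j][i] = 0.5 * (fs + fd)
    return f


def inbreeding(pedigree, f):
    """Inbreeding coefficient of every individual, F_i = 2 f_ii - 1."""
    return [2.0 * f[i][i] - 1.0 for i in range(len(pedigree))]
```

For two founders, two full-sib offspring and one progeny of the full-sib mating, this gives a full-sib coancestry of 0.25 and an inbreeding coefficient of 0.25 for the last individual.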

Complete molecular information
For different population structures, criteria and types of selection (including the situation of no selection due to the lack of molecular information) and random mating, the average homozygosity by descent of the population and the inbreeding coefficient calculated through the pedigree are shown in table I. The average coancestry coefficient of all possible mates between the sires and dams of the previous generation was also calculated but is not included in the table because it gives values almost identical to those of inbreeding, as expected due to random mating.
With random choice of breeding animals (no molecular information available), the true values of genomic homozygosity at generation 15 were almost identical to the values of inbreeding calculated from pedigree records.
On the other hand, the inverse of the effective number of alleles coincided with the mean coancestry (including self-coancestries and reciprocals) since $1/n_a = \sum_j \sum_i p_{ij}^2 / L$ can be interpreted as the probability that two alleles taken at random from the pool of gametes produced by the current population are identical by descent. From table I, it is clear that, besides the obvious effect of the number of breeding individuals, the most important factor lowering the rate of homozygosity was restriction on the variance of family size (i.e. ensuring that each sire family leaves a sire and each dam family leaves a dam to the next generation), which resulted in decreasing this rate by about 25 %.
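The effective number of alleles defined in the simulation section, $n_a = L / \sum_j \sum_i p_{ij}^2$, is straightforward to compute from allele frequencies. The sketch below assumes a hypothetical data layout of one frequency dictionary per locus.

```python
def effective_number_of_alleles(freqs):
    """n_a = L / sum over loci and alleles of p_ij squared.

    freqs: list with one {allele: frequency} dictionary per locus.
    """
    n_loci = len(freqs)
    sum_sq = sum(p * p for locus in freqs for p in locus.values())
    return n_loci / sum_sq


# Two loci: two equally frequent alleles, and four equally frequent alleles.
example = [
    {"a": 0.5, "b": 0.5},
    {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25},
]
n_a = effective_number_of_alleles(example)  # 2 / (0.5 + 0.25)
```

The reciprocal, `1 / n_a`, is the mean over loci of the expected homozygosity, which is the quantity the text interprets as a probability of identity by descent.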
When selection using complete molecular information was practised, the inbreeding coefficient did not reflect the true homozygosity and the discrepancy increased as selection intensity increased. The criterion of restricted family size was of paramount importance. When the maximum molecular information was used but no restriction was placed on family size, the homozygosity was always greater than when molecular information was ignored but within-family selection was practised. With individual selection, from the maximum number of candidates available ($3N_d$), a variable number ($N_d$, $2N_d$ or $3N_d$) was chosen at random to be genotyped and then the best individuals were selected. The efficiency of the use of markers decreased as selection intensity increased. That implies that a selection intensity lower than those tested could have been optimal for this number of generations. Although there is no guarantee that these results will be maintained in the long term, they are rather paradoxical and can be attributed to the fact that as selection intensity increases there is a tendency to co-select full- or half-sibs. This is essentially the same effect that was first considered by Robertson [15] in the context of truncation selection and more recently analysed by Woolliams et al. [22] and Santiago and Caballero [17]. Within-family selection involves a restriction on the family size and, with this type of selection and for both criteria, the efficiency increased as selection intensity increased.
In the framework of individual selection, frequency-dependent selection (FD) is more efficient for controlling the homozygosity than selection for overall heterozygosity of the markers (HET), except for the highest selection intensity, an exception that is also due to the increased importance of Robertson's effect. But with restricted family size, frequency-dependent selection is more efficient in controlling homozygosity than selection for overall heterozygosity in all the analysed cases. An indication of the genetic similarity among the selected individuals is given by the effective number of alleles ($n_a$), inversely related to their coancestry. In the nucleus of eight sires and 24 dams, the values of $n_a$ in generation 15 are 3.82 (HET) and 3.52 (FD) for the more intense individual selection, but 5.37 (HET) and 7.23 (FD) for the more intense within-family selection.
The effect of minimum coancestry mating was also considered. With this mating system, the average value of the coancestry coefficient between pairs of selected sires and dams was greater (from 5 to 29 %) than the inbreeding coefficient of the progeny. It induced in all cases a delay in the appearance of inbreeding. Table II is equivalent to table I but with minimum coancestry mating (mCM) instead of random mating (RM). At generation 15, the values of the homozygosity attained were considerably lower with the use of mCM. The advantage of mCM over RM ranged from 6 to 33 %.
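The idea behind minimum coancestry mating can be shown on a very small example. The paper used linear programming; the sketch below deliberately swaps in an exhaustive search, which is only feasible for a handful of animals, and assumes that every sire is mated to the same number of dams. The function name and data layout are hypothetical.

```python
from itertools import permutations


def min_coancestry_mating(coancestry, dams_per_sire):
    """Exhaustive search for the mating plan with the lowest total coancestry.

    coancestry[s][d] is the coancestry between selected sire s and dam d.
    Returns (total, assignment), where assignment[d] is the sire of dam d.
    """
    n_sires = len(coancestry)
    n_dams = len(coancestry[0])
    if n_dams != n_sires * dams_per_sire:
        raise ValueError("each sire must receive exactly dams_per_sire dams")
    slots = [s for s in range(n_sires) for _ in range(dams_per_sire)]
    best_total, best_plan = None, None
    for plan in set(permutations(slots)):  # set() removes duplicate plans
        total = sum(coancestry[s][d] for d, s in enumerate(plan))
        if best_total is None or total < best_total:
            best_total, best_plan = total, plan
    return best_total, list(best_plan)


# Two sires, four dams; each sire is related to two of the dams.
f_sd = [
    [0.25, 0.0, 0.125, 0.0],
    [0.0, 0.25, 0.0, 0.125],
]
total, plan = min_coancestry_mating(f_sd, dams_per_sire=2)
```

Here the only plan with zero total coancestry pairs each sire with the two dams unrelated to him. For realistic nucleus sizes an assignment or linear programming solver would be needed instead of enumeration.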
The diverse situations analysed were also compared according to their rate of homozygosity per generation. This parameter was calculated from generation 6 to 15 as $\Delta Ho = (Ho_t - Ho_{t-1})/(1 - Ho_{t-1})$, where $Ho_t$ was the average homozygosity by descent of individuals in generation $t$ (averaged over replicates). In the absence of molecular information, the rate of homozygosity per generation was higher for mCM than for RM, when the variance of family size was restricted. The opposite occurred with individual random choice of breeding animals. This indicates that with restriction on family size RM would be superior in the long term. Some simulation results indicated that the RM superiority will be attained very late, mCM being advantageous for more than 50 generations. In the nucleus of eight sires and 24 dams, the values of homozygosity in generation 50 were $Ho_{50}$ = 61.64 (RM) and 59.30 (mCM), for individual random choice, and $Ho_{50}$ = 49.15 (RM) and 48.20 (mCM) for within-family choice of breeding animals.
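The rate of homozygosity per generation can be computed from a series of average homozygosities as sketched below. The text does not say how the per-generation values from generation 6 to 15 were combined, so taking their arithmetic mean is an assumption; homozygosity is expressed here as a proportion rather than a percentage.

```python
def homozygosity_rate(ho_prev, ho_curr):
    """Delta Ho = (Ho_t - Ho_{t-1}) / (1 - Ho_{t-1})."""
    return (ho_curr - ho_prev) / (1.0 - ho_prev)


def mean_homozygosity_rate(ho, first=6, last=15):
    """Mean per-generation rate over generations first..last.

    ho[t] is the average homozygosity by descent in generation t.
    Averaging the per-generation rates is an assumption of this sketch.
    """
    rates = [homozygosity_rate(ho[t - 1], ho[t]) for t in range(first, last + 1)]
    return sum(rates) / len(rates)


# Synthetic check: a constant rate of 2 % per generation.
series = [1.0 - (1.0 - 0.02) ** t for t in range(16)]
rate = mean_homozygosity_rate(series)
```

With a constant rate the function recovers that rate exactly (up to rounding), which is the asymptotic behaviour expected when no molecular information is used.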
The rate of homozygosity summarizes the evolution of genetic variability during the period involved, but when molecular information is used for selection, it does not have an asymptotic meaning and, therefore, it will not necessarily give a good prediction of the increase of homozygosity in later generations. In this case, the disadvantage of the combination of mCM and restricted family size for controlling the homozygosity rate is attenuated. Additional simulation results for a longer term horizon indicated that, in the situations considered, mCM was also superior to RM for more than 50 generations. In the nucleus of eight sires and 24 dams with the more intense frequency-dependent selection, the values of $Ho_{50}$ were 51.35 (RM) and 44.38 (mCM) for individual selection, and 26.59 (RM) and 24.32 (mCM) for within-family selection.

Limited number of markers and alleles per marker
The relative value of the number of markers and the number of alleles per marker has been analysed only for the breeding structure of eight sires, 24 dams and two offspring of each sex per family using RM and WFS in a variety of situations. The homozygosity rate per generation was calculated for both the marker loci and the whole genome.
Two extreme situations were initially considered: a) maximum number of alleles (64, in this particular case) at a limited number of markers per chromosome; and b) maximum number of markers (100 per chromosome) with a limited number of alleles per marker. With totally informative markers, the benefits of using an increasing number of them followed the law of diminishing returns. The use of one marker per chromosome reduced by 5.85 (HET) or 21.00 % (FD) the rate of homozygosity attained without molecular information, while the corresponding values when two markers are genotyped were 8.47 (HET) and 27.16 % (FD). Six markers per chromosome could be enough to achieve similar homozygosity rates to those obtained with 100 markers. On the other hand, if the maximum number of markers is available, then 6-8 alleles per marker allow for the maximum efficiency to be attained.
In a more realistic situation, the joint effect of variable numbers of candidates, markers per chromosome and alleles per marker is shown in figures 1 and 2. The results of figure 1 confirm that frequency-dependent selection was a better method than selection for heterozygosity and that the advantage increased as molecular information increased. The relative value of increasing the number of candidates was also greater with more markers per chromosome although the effect followed the law of diminishing returns as shown in figure 2. Finally, the relative advantage of a higher number of alleles also increased as both the number of candidates and the number of markers increased (figure !).
In summary, these results emphasize that an expensive strategy with respect to the number of candidates and the number of markers is required to obtain appreciable benefits.
More detailed results for both the rate of homozygosity in the whole genome and at the marker loci in a breeding population of eight sires and 24 dams chosen from 48 candidates of each sex, using within-family selection with two selection criteria (HET and FD) and two types of matings (mCM and RM) are given in tables III and IV. Contrary to the genomic homozygosity rate, homozygosity rate of markers increased as the number of alleles and/or markers increased owing to decreasing level of homozygosity in the initial base population.
It was confirmed that the value of a marker is related to the number of alleles, especially for FD selection. For example, two markers with six alleles were equally as valuable as (HET) or more valuable than (FD) three markers with two alleles (HET). The greater efficiency of frequency-dependent selection over selection for heterozygosity was more marked for maintaining marker heterozygosity than for maintaining genome heterozygosity and, for example, in the case of one marker with two alleles, all the initial marker heterozygosity was maintained after 15 generations. This advantageous characteristic could be relevant if the objective were to maintain the heterozygosity of a specific chromosomal region.
The rate of genomic homozygosity was higher for mCM matings owing to the balanced family structure but, as indicated before, the advantage of RM appeared very late (after more than 50 generations in all the situations considered). On the other hand, the rate of marker homozygosity was lower for mCM in all cases of selection for heterozygosity considered or was equal in the cases of low number of markers (one, two or three per chromosome) and frequency-dependent selection. The effective number of alleles retained (results not shown), in contrast to homozygosity, was higher for strategies maintaining more heterozygosity. However, as expected, the loss of alleles was greater when the initial number was higher. For example, with one marker per chromosome, RM and HET, if the number of initial alleles was ten, only half of them ($n_a$ = 4.62) were retained at generation 15, whereas if the number of initial alleles was two, both of them were retained ($n_a$ = 1.91).
A way of diminishing genotyping costs is to use dominant markers such as RAPD or AFLP. In table V, dominant and codominant markers are compared considering bi-allelic loci with either equal or unequal frequencies of the two alleles. For the codominant markers, the results with equal and unequal frequencies were similar although the situation of equal frequencies was advantageous especially as the number of markers increased. The use of frequency-dependent selection with dominant markers caused only a small reduction in efficiency compared with codominant bi-allelic markers, although the reduction was greater if the objective was to maintain heterozygosity at markers. The effectiveness of dominant markers was greater if the two phenotypes of each locus were at intermediate frequencies, which implied that the dominant alleles were at low frequencies. Although this comparison with bi-allelic codominant markers is satisfactory, the usual microsatellites are multi-allelic. According to the results of tables III and IV, obtaining similar homozygosity rates with microsatellites and dominant markers would require, for the latter, a greater number of individuals and/or markers to be genotyped. The first tactic would be adequate for RAPD markers and the second one for AFLP, which produces many markers per analysed sample.

DISCUSSION
Molecular markers have received considerable attention in recent years as a tool to aid conservation of genetic variability in both captive and natural populations [2]. Amplification of DNA sequences by the polymerase chain reaction (PCR) offers a non-destructive means for genotyping endangered species. With this technique, microsatellite DNA markers have been considered the most useful for conservation programmes because they are highly informative and because of their codominant nature.
Other markers such as RAPD and AFLP are also very promising owing to their simplicity and low cost, although generally they are dominant markers which are not yet included in the gene maps of domestic animal species. Until now, genetic markers have been used to calculate genetic distances between breeds, to resolve taxonomic uncertainties and to determine paternity. However, their application in practical conservation programmes of strains of domestic species is only beginning, and there is no example of conservation units where markers are routinely scored and utilized.
Probably the clearest and least controversial application of molecular markers in conservation genetics will be to identify distinct populations that need to be conserved and to infer the genetic relationships among the possible founders so that the initial animals that constitute the conserved population carry most of the genetic variability present in the population. A less studied issue is the usefulness of markers in delaying the inevitable loss of genetic variability in a population of limited size in the generations following its foundation.
Monte Carlo simulation allows one to evaluate the gains expected with the use of these technologies. In the present work, we have studied a particular nucleus of small size mimicking the conservation programme carried out in strains of Iberian pig [16], but the conclusions could be generalized. Markers have been generated in linkage equilibrium, but this limitation is not very important: we have run some simulations with the parameters considered in table III, but assuming that base populations have undergone ten previous generations of random individual selection. As an example, with four markers and six alleles per marker and frequency-dependent selection, the rates of homozygosity (%) of the genome and of the markers were $\Delta Ho$ = 1.02 and $\Delta Ho_m$ = 0.18, respectively, instead of the current values of $\Delta Ho$ = 1.05 and $\Delta Ho_m$ = 0.27 for a base population in linkage equilibrium, indicating that the efficiency of maintaining genetic variability will be improved, especially with respect to the markers.
The main measure of genetic variability that we have chosen is the global homozygosity by descent of all the genome calculated in all the candidates for selection. The homozygosity for the markers themselves would indicate the success of a conservation programme to maintain the variability at specific loci of potential economic or biological interest. Another measure of the genetic variability used in conservation genetics is the effective number of alleles, which is inversely related to the expected homozygosity and therefore to the overall coancestry of the population. According to Allendorf [1], heterozygosity is a simple and accurate indicator of the loss in genetic variation and is a good measure of the ability of the population to respond to selection in the short term, whereas the effective number of alleles will be optimal for long term considerations and will be more affected by bottleneck effects.
When molecular information is used as a selection criterion, there is a disagreement between the true homozygosity by descent and the inbreeding coefficient calculated by pedigree analysis. Moreover, the rate of homozygosity, unlike the rate of inbreeding, does not attain an asymptotic value after the first generations but it will decrease as selection proceeds. Some theoretical work needs to be carried out on the prediction of homozygosity by descent under these circumstances. Figure 3 summarizes the relative advantages of the diverse tactics analysed in this paper. When the molecular information is lacking, the first clear conclusion that appears in this study is that the use of conventional tactics such as restriction of family size is the most important criterion that should be considered in the genetic management of a conservation programme. Standardizing family sizes is predicted to double effective population size and is widely recommended when breeding rare breeds [3,11,12,18]. Brisbane and Gibson [5] proposed the minimization of the mean coancestry of individuals chosen for breeding as the optimal criterion for maintaining genetic variability, but the implementation of this criterion requires an iterative procedure which may be computationally expensive. However, if only full- and half-sib relationships are considered, this criterion would be the same as minimizing the variance of family sizes.
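The doubling of effective population size mentioned above follows from the standard textbook approximation relating effective size to the variance of family size. The relation below is not taken from this paper; it assumes an idealized population of constant size $N$ with equal numbers of each sex, where $V_k$ is the variance of the number of offspring that each parent contributes as breeding animals.

```latex
N_e \approx \frac{4N}{V_k + 2},
\qquad
V_k = 2 \;(\text{random, Poisson-like family sizes}) \;\Rightarrow\; N_e \approx N,
\qquad
V_k = 0 \;(\text{equalized family sizes}) \;\Rightarrow\; N_e \approx 2N .
```

The nucleus simulated here has unequal numbers of sires and dams, so the approximation only indicates the direction and rough size of the effect.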
The use of minimum coancestry matings is another important tool for delaying the loss of heterozygosity and is especially efficient for maintaining the heterozygosity of the markers themselves. The advantage will disappear in the long term if there is a balanced family structure, but only after a very large number of generations. Furthermore, as variance of family size increases, the advantage of random mating will disappear even in the long term (see Caballero [6] for a discussion on this point).
When the use of molecular markers is considered in the framework of the traditional strategies of minimizing the variance of family sizes, frequency-dependent selection seems to be a more efficient criterion than selection for heterozygosity to minimize the increase in homozygosity either of all the genome or of the markers themselves. An additional advantage of frequency-dependent selection is that it can be readily applied to dominant markers such as RAPD or AFLP. However, there are many possible ways of implementing frequency-dependent selection. In this paper, we have followed the model of