Impact of strong selection for the PrP major gene on genetic variability of four French sheep breeds (Open Access publication)

Effective selection on the PrP gene has been implemented since October 2001 in all French sheep breeds. After four years, the ARR "resistant" allele frequency increased by about 35 percentage points in young males. The aim of this study was to evaluate the impact of this strong selection on genetic variability. It is focussed on four French sheep breeds and based on the comparison of two groups of 94 animals within each breed: the first group of animals was born before the selection began, and the second, 3–4 years later. Genetic variability was assessed using genealogical and molecular data (29 microsatellite markers). The expected loss of genetic variability on the PrP gene was confirmed. Moreover, among the five markers located in the PrP region, only the three closest ones were affected. The evolution of the number of alleles, heterozygote deficiency within population, expected heterozygosity and the Reynolds distances agreed with the criteria from pedigree and pointed out that neutral genetic variability was not much affected. This trend depended on breed, i.e. on their initial states (population size, PrP frequencies) and on the selection strategies for improving scrapie resistance while carrying out selection for production traits.


INTRODUCTION
Selection on major genes may affect within-population genetic variability. First, the polymorphism at a major gene itself depends on allele frequencies and disappears when an allele is fixed, a situation that can occur when the best genotype is homozygous. Second, it is well known that, in the vicinity of the genes under selection, allele frequencies change due to the hitchhiking phenomenon. Third, in a finite population, the carriers of the favourable genotype are more related to each other than randomly chosen individuals, which leads, for an equal number of reproducers, to a smaller effective population size than expected in a pure drift situation [13]. The risk of losing genetic variability under gene or marker assisted selection has been highlighted in many theoretical studies, e.g. [7,18], but it has been illustrated in only a few cases of real livestock populations [14]. However, simulations [17] have indicated that, when introduction of selection on a major gene leads to less intense selection on production traits, the selected animals tend to be less closely related.
Since October 2001, a selection programme based on using the existing variability of the PrP gene has been implemented in France under coordination and funding by the French Ministry of Agriculture, and with EU support. The programme covers all French sheep breeds and aims to progressively increase the frequency of the ARR "resistant" allele and to eliminate the VRQ "very susceptible" allele [9]. For cost-effectiveness reasons, it was decided to concentrate selection efforts and funds on registered nucleus flocks, in order to select and provide resistant rams to the whole sheep population. For each breed, a specific programme was defined, taking into account the main breed characteristics: initial PrP allele frequencies, disease prevalence, type of breed (milk, meat and rare), population size, etc. In addition, to reduce the risk of decreasing genetic progress on production traits and to avoid loss of genetic variability, rules dealing with the management of sires [22] and conservation of semen from susceptible elite rams in the national cryobank [5] were followed. After four years of implementation, this large-scale major gene assisted selection programme has provided impressive results: more than 400 000 genotypes have been determined, and the ARR allele frequency in the young candidate sires has increased from 51 to 86%, on average, over breeds [4].
The aim of the present study was to evaluate the consequences on the genetic variability due to selection of French sheep breeds on the PrP gene since 2001. Four breeds representing various situations were chosen for that purpose. The evolution of genetic variability was assessed via both pedigree information and polymorphisms at microsatellite markers.

Breeds and animals sampled
Among the 26 main French sheep breeds undergoing selection, four were studied: three meat breeds, Berrichon du Cher (BCF), Charollais (CHL) and Causses du Lot (CDL), and one dairy breed, Manech tête rousse (MTR). This choice resulted from the diversity of initial PrP allele frequencies among French breeds [21] and from some specificities of the breeding programme, including strategies to select for the ARR allele and preserve genetic variability (Tab. I). The BCF breed had the highest ARR allele frequency, i.e. 80%, before the PrP selection programme started. It was also the breed with one of the worst situations in terms of genetic variability due to the very limited size of the selection nucleus, the lack of management of the genetic variability and the intensity of the selection processes [8]. The CDL breed had the lowest initial ARR frequency (15%), and strong efforts to select for scrapie resistance were made, due to the high prevalence of the disease in its breeding area. As a consequence, genetic progress for production traits and management of the genetic variability were considered of secondary importance. The CHL breed showed the highest evolution of PrP frequencies among the French sheep breeds considering both the VRQ and the ARR alleles. This breed was also characterised by a large population size, weak selection procedures and favourable genetic variability criteria as defined by Huby et al. [8], although no specific rules for managing the population were applied. The MTR breed had a low initial ARR frequency (16%) and the highest prevalence of scrapie. This dairy breed, which represents the second largest population in France, was managed with an efficient breeding programme based on selection for dairy traits and control of the genetic variability. Thus, these four breeds are not representative of a hypothetical "average" situation, but exemplify the diversity of situations encountered in sheep breeding in France.
In each of the four breeds, two groups of 94 young rams were selected, leading to eight samples of animals. These rams were randomly chosen among young candidate sires, which were gathered each year from the different selection flocks and the different elite ram lines, in order to be performance tested in the BCF, CHL and CDL breeds, and progeny tested in the MTR breed. Young candidate sires were considered to be representative of the genetic diversity in selection flocks and, partly, of that in commercial flocks (due to the gene flow). The first group of 94 animals (sample 1) included young rams born before 2000, i.e. before selection for scrapie resistance began. For these rams, DNA was collected and stored, giving samples which retrospectively represented the situation before selection on the PrP gene started. The second group (sample 2) included young rams born in 2004, i.e. after 3-5 years of selection, depending on the breed.

Molecular information
The PrP gene and the 29 microsatellite markers were genotyped for all the animals by LABOGENA (http://www.labogena.fr). For the PrP gene, four alleles were identified using the Taqman method [12]: ARR, AHQ, ARQ and VRQ (ARH and ARQ alleles are confounded). The 29 markers were genotyped using a 3100 ABI PRISM® DNA sequencer (Applied Biosystems, Foster City, CA, USA). Five markers were chosen on chromosome 13, at various distances from PrP: the relative positions of markers McM152, HUJ616 and BMS1669 came from the NCBI map in conformity with the International Sheep Genomics Consortium; S11 and S04 are located within the ovine PRNP gene, at about 20 and 45 kb, respectively, from the DNA site coding for the prion protein in exon 3 [6]. The position of PrP is assumed to be at 2 cM from marker BMS1669, according to [27]. The other 24 markers are on other chromosomes and were therefore considered as neutral. Most of them are recommended for measurement of diversity by the FAO-ISAG [25]. General information about the PrP gene and all the markers used in this study is summarised in Table V.

Pedigree information
Genealogical data came from the national sheep database. The file contained all recorded animals born between 1970 and 2004 and their known ancestors, in the framework of the official performance recording. The numbers of animals in the pedigree data file were about 140, 427, 827 and 364 thousand in the BCF, CHL, CDL and MTR breeds, respectively.

Comparison of samples and comparison of criteria of variability
The analysis of genetic variability was performed separately for each breed. Results obtained for the two young ram samples were compared, allowing quantification of the evolution of genetic variability between two periods: before selection for scrapie resistance (sample 1) and after 3-5 years of intense selection on the PrP gene (sample 2). The genetic variability was assessed from the molecular information and from the pedigree data. Parameters associated with the molecular information were computed locus per locus. Results for the PrP gene and its flanking markers on chromosome 13 are presented separately. The remaining markers, considered as independent, were analysed together to give an overview of the assumed neutral genetic variability, which could be compared to that assessed from the pedigree data.

Criteria of variability based on molecular information
Allele frequencies and number of alleles were estimated by direct counting. At a given locus, the expected heterozygosity (H) was computed according to the classical formula:

$$H = 1 - \sum_i p_i^2$$

where $p_i$ is the estimated frequency of allele i, the sum being over all alleles. Wright's F-statistics $F_{IS}$ and $F_{ST}$, defined as heterozygote deficiency within population and between populations, respectively, were computed using GENEPOP 4.0 [24]. In addition, between-sample diversity was estimated by the Reynolds genetic distance (D), which was chosen because it has been shown to be appropriate for livestock populations with short-term divergence [10,23]. Considering the first sample as the founder population, this distance was computed, following [11], from $p_{1,i}$, the frequency of allele i in the first sample, and $p_{2,i}$, the frequency of this allele in the second sample. Distance D was also calculated between breeds from allele frequencies of the first samples, in order to compare within-breed to between-breed genetic diversity.
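As an illustration of the first two criteria, the following minimal sketch estimates allele frequencies by direct counting and derives the expected heterozygosity, assuming the classical formula is H = 1 - Σ p_i². The function names and the four genotypes are hypothetical and are not data from this study.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate allele frequencies by direct counting.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid animal.
    """
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(freqs):
    """Expected heterozygosity: one minus the sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


# Hypothetical PrP genotypes for four rams (illustration only).
sample = [("ARR", "ARR"), ("ARR", "ARQ"), ("ARQ", "ARQ"), ("ARR", "VRQ")]
freqs = allele_frequencies(sample)
h = expected_heterozygosity(freqs)
print(freqs, h)
```

The number of alleles at the locus is simply `len(freqs)`; H falls to zero once a single allele is fixed.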
We tested for congruence or correlations among the different D distance matrices based on 30 individual loci, according to the procedure developed by Moazami-Goudarzi and Laloë [20]. The Reynolds distance matrices between the eight groups were generated for each locus and correlations between these matrices were estimated using a Mantel procedure [19]. Next, a principal component analysis (PCA) on the matrix of correlations was applied. The correlation circle realised by this PCA provided a visual assessment of marker congruity.
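The correlation between two per-locus distance matrices can be sketched as a simple Mantel test: the Pearson correlation between the off-diagonal entries, with significance assessed by jointly permuting rows and columns of one matrix. This is a generic illustration and not the exact procedure of [19,20]; the function names, the number of permutations and the two 4 x 4 matrices are hypothetical.

```python
import math
import random


def upper_triangle(matrix):
    """Return the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(m1, m2, permutations=999, seed=0):
    """Mantel correlation and one-sided permutation p-value."""
    rng = random.Random(seed)
    n = len(m1)
    observed = pearson(upper_triangle(m1), upper_triangle(m2))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of m2 with the same ordering.
        permuted = [[m2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(m1), upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical distance matrices between four groups at two loci.
d_locus_a = [[0.0, 0.1, 0.4, 0.5],
             [0.1, 0.0, 0.3, 0.6],
             [0.4, 0.3, 0.0, 0.2],
             [0.5, 0.6, 0.2, 0.0]]
d_locus_b = [[0.0, 0.2, 0.5, 0.4],
             [0.2, 0.0, 0.4, 0.7],
             [0.5, 0.4, 0.0, 0.1],
             [0.4, 0.7, 0.1, 0.0]]
r, p_value = mantel(d_locus_a, d_locus_b)
print(r, p_value)
```

Repeating this for every pair of loci gives the matrix of correlations on which the PCA is then run.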

Criteria of variability based on pedigree data
The PEDIG software [2] was used to analyse the genealogical data. For each ram sample, the pedigree completeness level was assessed by computing the average number of equivalent complete generations known (Eq.G) over each ram. The Eq.G was computed as the sum, over all known ancestors, of the terms $(1/2)^n$, where n is the ancestor's generation number [15]. For each sample, the major ancestors were detected using an iterative method [3] and their marginal expected genetic contributions to the gene pool of the sample analysed were computed. Then, the major ancestors were ranked by decreasing marginal contributions, in order to determine the number of ancestors explaining 50% of the gene pool of the sample. The average coefficient of kinship [16] between animals of each sample was computed. Finally, individual coefficients of inbreeding were computed by the method of VanRaden [26]. The evolution of the average coefficient of inbreeding was assessed for the young candidate elite rams (performance tested in BCF, CHL and CDL breeds; progeny tested in the MTR breed) per birth year from 1992 to 2004, and the annual increase of inbreeding was estimated by linear regression over time. This broadened the view of genetic variability evolution, because the period studied was longer and the population analysed involved the whole cohorts of the young candidate sires evaluated each year (no sampling).
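The Eq.G criterion can be sketched as a recursive sum over the pedigree, where each known ancestor at generation n contributes (1/2)^n. This is only an illustration of the definition and not the PEDIG implementation; the pedigree dictionary and animal identifiers are hypothetical.

```python
def equivalent_generations(animal, pedigree, generation=0):
    """Sum (1/2)**n over all known ancestors of `animal`.

    pedigree maps an animal id to a (sire, dam) tuple; an unknown parent
    is None. An ancestor reached through several paths is counted once
    per path.
    """
    total = 0.0
    for parent in pedigree.get(animal, (None, None)):
        if parent is not None:
            n = generation + 1
            total += 0.5 ** n + equivalent_generations(parent, pedigree, n)
    return total


# Hypothetical pedigree: both parents known, grandparents known on the
# sire side only.
pedigree = {
    "ram": ("sire", "dam"),
    "sire": ("grandsire", "granddam"),
    "dam": (None, None),
}
eq_g = equivalent_generations("ram", pedigree)
print(eq_g)  # 2 * 1/2 + 2 * 1/4 = 1.5
```

Averaging this value over the rams of a sample gives the pedigree completeness level of that sample.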

Genetic variability criteria deduced from molecular information
Number of alleles, expected heterozygosity and $F_{IS}$ between samples, for each breed, are presented in Table II. For the PrP gene, the strong change in heterozygosity illustrates the effectiveness of selection for scrapie resistance in elite rams, over a few years. Indeed, all rams in the BCF breed and most in CDL and CHL had ARR/ARR genotypes in 2004, despite the fact that the ARR allele frequencies were not very large at the beginning of selection, especially for CDL and CHL (Tab. I). In the MTR breed, selection response for the PrP gene was impressive as well, with an increase of ARR frequency from 16 to 68%, even if less dramatic than in the other breeds. Most animals were ARQ/ARQ in the first sample and ARR/ARQ in the second, due to assortative mating, which explains the increase of heterozygosity and the high and negative value of $F_{IS}$.
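To see why a sample dominated by ARR/ARQ heterozygotes yields a negative within-population fixation index, a simplified estimate can be written as 1 - H_obs / H_exp. This is a sketch under that simplifying assumption; it is not GENEPOP's estimator, which differs in detail, and the ten genotypes below are hypothetical.

```python
from collections import Counter


def simple_fis(genotypes):
    """Simplified F_IS = 1 - observed / expected heterozygosity."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    h_exp = 1.0 - sum((n / total) ** 2 for n in counts.values())
    h_obs = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    return 1.0 - h_obs / h_exp


# Hypothetical sample: eight heterozygotes and one of each homozygote.
sample = [("ARR", "ARQ")] * 8 + [("ARR", "ARR"), ("ARQ", "ARQ")]
fis = simple_fis(sample)
print(fis)
```

Here both alleles have a frequency of 0.5, so the expected heterozygosity is 0.5 while 80% of the animals are heterozygous, which gives an excess of heterozygotes and a negative index.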
The impact on markers at chromosome 13 was strongly dependent on the relative position of the marker from the PrP coding gene. As expected, the S04 and S11 markers, which are on the PrP gene (Tab. V) and should reach a mono-allelic state as soon as ARR is fixed on PrP, were strongly affected. The BMS1669 marker also showed a reduction of heterozygosity, similar to that of the S04 and S11 markers, except in the CHL breed. The loss of diversity was small for the HUJ616 marker, and even more so for the McM152 marker, which are estimated to be at 13 and 27 cM from PrP, respectively. The impact of selection on neutral genetic diversity seems to be very low, according to the evolution of expected heterozygosity on the 24 microsatellite markers. Average differences between successive samples were close to zero for all breeds. The evolutions of the average number of alleles and values of $F_{IS}$ agree with this trend. The correlation circle among the Reynolds distances computed for each marker (Fig. 1) showed that the PrP gene, S04, S11 and, to a lesser extent, BMS1669, were different from other markers. This was confirmed by a detailed analysis of the Reynolds distances between ram samples within each breed, computed for the three types of loci (Tab. III). As expected, the highest Reynolds distance was found for the PrP gene, more markedly in the CDL (1.852) and the MTR (1.713) breeds. The next highest values were observed for the S04, S11 and BMS1669 markers. The smallest distances were observed for the HUJ616 and McM152 markers and for "neutral markers", providing evidence that genetic differentiation between samples was very small irrespective of breed. In addition, the Reynolds distances observed between samples were much smaller than the distances between breeds, which ranged from 0.101 to 0.186 (data not shown).
The values of $F_{ST}$ between ram samples within breed (results not shown) agree with the results from the Reynolds distances. For the neutral markers, $F_{ST}$ values ranged from 0.0004 in CHL to 0.0086 in BCF whereas for the PrP gene, they ranged from 0.1348 in BCF to 0.6162 in CHL.

Genetic variability assessed via pedigree data
Considering the most recent samples of young rams, pedigrees were found to be rather complete in the BCF, CHL and MTR breeds, with respectively, 7.2, 7.5 and 6.0 Eq.G, and less complete in the CDL breed with only 4.3 Eq.G. The average coefficient of relationship between young rams increased from the first sample to the second, in BCF, CHL and MTR (Tab. IV). The largest increase was found in the BCF breed while the CDL breed showed a decrease of the average coefficient of relationship. The pedigree completeness level has to be considered, because of its impact on the evolution of the average coefficient of relationship. The Eq.G was higher in the second sample, for all breeds: it showed an increase of +0.53 in BCF, +0.79 in CDL, +0.91 in CHL and +1.97 in MTR (results not shown). This partly explains the increase of the average coefficient of relationship in the BCF, CHL and MTR breeds.
The number of ancestors for a cumulative contribution of 50%, which is less sensitive to the quality of genealogical data [3], suggests an evolution between samples similar to that of the average coefficients of relationship. The BCF breed, which already had a reduced genetic variability, showed the highest deterioration. The CDL breed had a gain of genetic variability between successive ram samples. The young rams of the MTR and the CHL breeds were little affected. Figure 2 shows the evolution of inbreeding between 1992 and 2004. Both the average coefficient of inbreeding in a given year and the rate of inbreeding were higher in the BCF breed than in the other breeds. BCF rams born in 2004 had an unusual increase of inbreeding relative to previous birth years. For young rams in MTR, the average coefficient of inbreeding grew gradually, with no visible change in the rate after implementation of the selection programme on the PrP gene. In the CHL and CDL breeds, a slight rise of inbreeding had been observed since 2000 and 2001, respectively. Taking into account the generation lengths of the breeds, these average annual rates of inbreeding roughly correspond to realised effective population sizes of 126 in BCF, 676 in CDL, 399 in CHL and 159 in MTR between 1992 and 1999. In comparison, between 2000 and 2004, the realised effective population sizes were estimated at 43 in BCF, 137 in CDL, 132 in CHL and 206 in MTR.
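The conversion from an annual rate of inbreeding to a realised effective population size can be sketched as follows, assuming the usual approximation Ne = 1 / (2 ΔF), with ΔF per generation taken as the annual regression slope multiplied by the generation length. The exact computation used for the values above is not detailed in the text, and the inbreeding values and the 4-year generation length below are hypothetical.

```python
def regression_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def realised_effective_size(years, mean_inbreeding, generation_length):
    """Approximate Ne from the annual increase of average inbreeding."""
    annual_rate = regression_slope(years, mean_inbreeding)
    delta_f = annual_rate * generation_length  # rate per generation
    return 1.0 / (2.0 * delta_f)


# Hypothetical average inbreeding per birth year (illustration only).
years = [2000, 2001, 2002, 2003, 2004]
mean_inbreeding = [0.010, 0.012, 0.014, 0.016, 0.018]
ne = realised_effective_size(years, mean_inbreeding, generation_length=4.0)
print(ne)
```

With a slope of 0.002 per year and 4 years per generation, ΔF is 0.008 and Ne is about 62. A single atypical birth year can shift the slope noticeably when the period is short.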

Impact of selection for scrapie resistance on genetic variability
The between-sample period length represents about one generation. During this very short time, an impressive loss of genetic variability was observed for the PrP gene, as a consequence of the strong selection acting directly on this gene. In the most recent sample, the ARR allele was found to be fixed in the BCF breed, and close to fixation in the CDL and CHL breeds, whereas in the MTR breed most of the young elite rams carried the ARR/ARQ genotype.
Simultaneously, although to a lesser extent, the variability of the five markers located in the vicinity of the PrP gene changed (Fig. 1). As expected, the S04 and S11 markers were strongly affected by selection for the ARR allele, evidence of their high proximity to the coding gene. Therefore, selection for ARR/ARR animals will result in keeping animals that carry only one haplotype at the three loci (PrP, S11 and S04). However, the S04 and S11 markers were less affected by selection than PrP, due partly to an incomplete linkage disequilibrium and, mostly, to their small initial polymorphism (e.g. for the S04 marker, with alleles 139 and 146, the frequencies moved from 0.87 and 0.13 before selection to 0.94 and 0.04 after selection, in the CDL breed). BMS1669, which is assumed to be at 2 cM from the PrP gene, showed a smaller but significant evolution of its polymorphism. HUJ616 and McM152 were weakly affected, in agreement with their distance from the PrP gene: 13 and 27 cM, respectively.
With regard to neutral genetic variability, pedigree data and the molecular information suggested little evolution between both samples of young rams. Thus, no consequence of severe bottlenecks was observed in our data. Several explanations can be proposed: (1) Considering the short time during which selection was applied (about one generation), it may be too early to observe the consequence of an effective population size reduction, particularly on heterozygosity, which decreases more slowly than allele diversity. However, criteria based on pedigree information (average coefficients of relationship and numbers of ancestors contributing for a cumulative contribution of 50%), usually more sensitive to recent selection events, indicated no strong decrease of genetic variability. The reduction of realised effective population sizes between 1992-1999 and 2000-2004 gives a contradictory picture. However, this can be explained by reasons beyond selection for the PrP gene. In the CHL and CDL breeds, selection effectiveness for production traits has been enhanced (more AI, stronger selection of elite reproducers) since 2000 and 2001, respectively, i.e. when the selection for the PrP gene began. The BCF breed had an unusual value of inbreeding in 2004 (full sibs were selected as candidate sires by mistake), responsible for an abnormally low effective population size.
(2) Introducing selection for scrapie resistance in breeding programmes often led the breeding organisations to redefine the relative importance of the different criteria previously used for selecting elite rams. For instance, the selection load on standard traits was decreased and the pressure on the genetic value of elite dams of young elite rams carrying the ARR allele was lowered. Consequently, elite rams from new origins, ancestors or farms, were selected. This is illustrated in the CDL breed, where genetic variability in young rams increased after introducing selection for the PrP gene (Tabs. II and IV), and also by simulation results [17]. (3) Implementation of practical rules for managing genetic variability in the breeding programmes might limit the loss of within-breed variability. Before the PrP selection began, active sires (resistant and susceptible ones) were grouped depending on their relationship. Selection for production traits and scrapie resistance was done within-group, in order to keep each ram line, using assortative mating with genotyped sire dams and genotyping a large number of candidate young sires before their genetic evaluation (high and early selection on PrP genotypes) [22]. The young rams of the MTR breed, for which this method had been applied rigorously, illustrate well the effectiveness of these rules in preserving genetic variability and genetic progress [4], despite a low initial frequency of ARR (Tabs. II and IV). The alternative strategy using only ARR/ARR rams from the beginning of the PrP selection would elicit a rapid increase in scrapie resistance, but would have strong consequences on genetic progress and genetic variability, as described by Alfonso et al. [1].

Comparison of results from pedigree data and from neutral markers polymorphisms
The criteria measuring genetic variability from pedigree data represent a polymorphism and its evolution at a neutral locus, anywhere in the genome. In the case of the breeds considered here, pedigree data and molecular markers assumed to be neutral (relative to the selection objectives) provided consistent views of neutral genetic variability, as observed by Alfonso et al. [1] in the Latxa breed. However, some differences were found from one breed to another. For instance, in the CDL breed, results from pedigree data provided a more optimistic picture than results from the markers, whereas the opposite was observed in BCF. Among the four breeds studied, BCF had the highest rate of inbreeding (see Fig. 2 and [8]), but the mating structure did not lead to substantial deficiency in heterozygotes in comparison to the expected value from observed allele frequencies, as revealed by the small $F_{IS}$ value (Tab. II). Moreover, the Reynolds distance in the young rams of the CHL breed, which was two times lower than in the other breeds, does not reflect the difference in genetic variability observed from pedigrees, which is similar to those observed in MTR and CDL (in absolute terms). Despite these small differences, pedigree data represent a good source of information for characterising the neutral genetic variability, especially because the information is easy and inexpensive to obtain. As a consequence, these data allow the analysis of larger samples both in terms of number of animals and years, which strongly reduces problems due to sampling (Fig. 2).

Generalisation of results and recommendations
Can the results based on four breeds be extended to other French sheep breeds and to any population applying intensive selection on a major gene? The choice of these four breeds among the 26 main French sheep breeds was made with the idea of considering a variety of situations: small population size (BCF breed), low initial frequency of ARR allele (CDL and MTR breeds), high evolution of PrP frequencies (CDL, CHL and MTR breeds), high weight of the PrP gene in the selection objective (CDL and CHL breeds), lack of effective strategy for maintaining genetic variability and genetic progress on production traits (BCF, CDL and CHL breeds). Faced with this panel of situations, our results can be used to draw some lessons. The initial frequency of the favourable allele (ARR here) may be, in theory, a determining criterion for evaluating the risk of loss of genetic variability. The present study partly contradicts this idea. Young rams of breeds with initial unfavourable PrP frequencies (CDL, CHL and MTR) were found to be little affected whereas young rams of the BCF breed had the highest deterioration of genetic variability, despite a suitable initial ARR frequency. This deterioration did not result from the introduction of selection for the PrP gene (Fig. 2) but was rather evidence of the difficulty in maintaining within-breed genetic variability in a breed with both a small effective population size and effective selection procedures, such as the BCF breed. In addition, it is clear that for any breed, applying rules for the management of active sires within groups of relatives is a sensible way to maintain genetic variability and also genetic progress.