Molecular genetic analysis of a cattle population to reconstitute the extinct Algarvia breed

Background: Decisions to initiate conservation programmes need to account for extant variability, diversity loss and cultural and economic aspects. Molecular markers were used to investigate whether putative Algarvia animals could be identified for use as progenitors in a breeding programme to recover this nearly extinct breed.
Methods: Forty-six individuals phenotypically representative of Algarvia cattle were genotyped for 27 microsatellite loci and compared with 11 Portuguese autochthonous and three imported breeds. Genetic distances and factorial correspondence analyses (FCA) were performed to investigate the relationship among Algarvia and related breeds. Assignment tests were done to identify representative individuals of the breed. Y chromosome and mtDNA analyses were used to further characterize Algarvia animals. Gene- and allelic-based conservation analyses were used to determine breed contributions to overall genetic diversity.
Results: Genetic distance and FCA results confirmed the close relationship between Algarvia and southern Portuguese breeds. Assignment tests without breed information classified 17 Algarvia animals in this cluster with a high probability (q > 0.95). With breed information, 30 cows and three bulls were identified (q > 0.95) that could be used to reconstitute the Algarvia breed. Molecular and morphological results were concordant. These animals showed intermediate levels of genetic diversity (MNA = 6.0 ± 1.6, Rt = 5.7 ± 1.4, Ho = 0.63 ± 0.19 and He = 0.69 ± 0.10) relative to other Portuguese breeds. Evidence of inbreeding was also detected (Fis = 0.083, P < 0.001). The four Algarvia bulls had Y-haplotypes H6Y2 and H11Y2, common in Portuguese cattle. The mtDNA composition showed prevalence of T3 matrilines and presence of the African-derived T1a haplogroup. This analysis confirmed the genetic proximity of Algarvia and Garvonesa breeds (Fst = 0.028, P > 0.05). Algarvia cattle provide an intermediate contribution (CB = 6.18, CW = -0.06 and D1 = 0.50) to the overall gene diversity of Portuguese cattle. Algarvia and seven other autochthonous breeds made no contribution to the overall allelic diversity.
Conclusions: Molecular analyses complemented previous morphological findings to identify 33 animals that can be considered remnants of the Algarvia breed. Results of genetic diversity and conservation analyses provide objective information to establish a management program to reconstitute the Algarvia breed.


Background
Breeding practices designed to alleviate production constraints are prejudicial to the survival of traditional domestic animal breeds, and tend to lead to impoverishment of the gene pool [1]. The Food and Agriculture Organization of the United Nations has encouraged a series of conservation measures designed to help prevent irreversible loss of diversity of domesticated animal species [2]. A heightened awareness of the cultural, historical and social heritage represented by traditional breeds has led to increased interest in their preservation [3]. Despite its small geographic area, Portugal hosts a wide variety of domestic breeds [4], with as many as 13 autochthonous cattle breeds recognized [4,5]. Analysis of genetic diversity of some of these breeds has used blood protein polymorphisms [6], microsatellite variation [7][8][9][10][11][12], mitochondrial DNA (mtDNA) [13,14] and Y chromosome sequence variation [15]. These studies have shown that, among European cattle, the Mirandesa breed is one of the most important targets for preservation based on its contribution to diversity [8]. In general, southern European cattle appear to represent particularly important reservoirs of genetic diversity [16].
Algarvia cattle are native to the Algarve region of southern Portugal and were first represented in 1868 [17] and later described in more detail [18][19][20][21][22][23]. Based on its morphology, the breed has been classified in the Aquitanian (or Red Convex) group, together with Alentejana, Mertolenga, Garvonesa and Minhota [23,24]. Except for Minhota, these breeds are distributed throughout the southern river Tagus valley. Algarvia cattle were used predominantly for meat production and/or draft. Although they were never formally registered as an independent breed, 25,000-29,000 animals were officially catalogued between 1940 and 1970 [25]. Since then, the population has declined rapidly and the breed was considered to be effectively extinct by the 1980s [1], an event that subjected the Regional Agricultural Authorities to criticism for failure to implement proper conservation measures.
The morphology of eight autochthonous Portuguese cattle breeds has been analyzed by numerical taxonomic methods [26], and these analyses have been extended recently to include animals thought to derive from the Algarvia breed [27]. This study identified a uniform group of animals that shared many of the phenotypic characteristics of that breed. These remnants were preserved mainly for cultural reasons by traditional breeders in different areas of Algarve and southern Alentejo regions.
In the present study, we have characterized these putative Algarvia animals using nuclear microsatellites, Y chromosome markers and mtDNA sequences to determine to what extent they can be genetically distinguished from related autochthonous breeds, and to identify individuals that would represent suitable progenitors for a breeding programme to reconstitute the Algarvia breed.

Sampling procedure
Forty-six animals (42 females and four males) were chosen from ten independent herds and consisted of individuals thought to derive from the Algarvia cattle on the basis of phenotypic similarity, as judged by traditional farmers familiar with the breed characteristics [27].
Additionally, samples from Garvonesa (29) and Preta (47) breeds were collected because they are present throughout the southern region of the country and admixture with Algarvia cannot be excluded. The herds in which putative Algarvia individuals were located also had animals from the Alentejana breed. A 9 ml whole blood sample was collected from each individual by jugular venipuncture in tubes containing EDTA-K3 as anticoagulant. Genomic DNA was extracted from leukocytes using the Puregene DNA Isolation Kit (Gentra Systems, Minneapolis, USA).

Statistical analysis of microsatellite data
Allele frequencies were determined with GENALEX version 6 [30]. The software GENEPOP version 3.4 [31] was used to perform global and per locus/per population Hardy-Weinberg Equilibrium (HWE) tests, and to test for genotypic linkage disequilibrium (LD). Exact probability tests were done for loci with four or fewer alleles; otherwise, a Markov chain method was employed [31] with 10,000 dememorization steps, 500 batches and 5,000 iterations. GENETIX version 4.02 [32] was used to estimate within-population observed (Ho) and unbiased expected (He) heterozygosities [33], the mean number of alleles (MNA), and inbreeding coefficients (Fis) [34]. The statistical significance of Fis > 0 was obtained based on 1,000 permutations. Pairwise population Fst values were calculated with FSTAT version 2.9.3 [35] and P-values obtained based on 1,000 randomizations. This software was also used to estimate allelic richness (Rt) per locus and population. To investigate breed relationships, neighbour-joining (N-J) dendrograms [36] were constructed from DA genetic distances [37] using POPULATIONS version 1.2.28 [38]. Bootstrap values were obtained with 1,000 replicates over loci. A dendrogram based on allele-sharing distances between individuals was also constructed using this software. TREEVIEW version 1.6.6 [39] was used to visualize and edit the dendrograms. A factorial correspondence analysis (FCA) was done with GENETIX to investigate relationships among individuals.
Assignment tests were performed to identify individuals most representative of the Algarvia breed and to detect admixture. STRUCTURE version 2.2 [40] was used to estimate the most probable number of population clusters (K). The analysis was done without prior information on populations, assuming correlated allele frequencies and admixture [41]. Ten independent runs with 100,000 Markov Chain Monte Carlo (MCMC) iterations and 10,000 burn-in were performed at each K (1 ≤ K ≤ 9) to calculate ΔK as in Evanno et al. [42]. A longer run (1,000,000 iterations and 100,000 burn-in) was done for the most probable K to determine the number of individuals within each cluster. An assignment test with these settings but including prior breed information was also performed. The partially Bayesian simulation-exclusion procedures of GENECLASS version 2.0 [43] were used for assignment tests with 10,000 Monte Carlo resamplings of individuals [44][45][46].
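To illustrate the ΔK criterion, the following Python sketch computes ΔK from replicate log-likelihoods using the commonly applied form |L(K+1) - 2L(K) + L(K-1)| / sd[L(K)], with L(K) taken as the mean ln P(D) across runs. The function name `delta_k` and all log-likelihood values are hypothetical and are not taken from this study; parsing of STRUCTURE output is omitted.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Delta K for each K that has both neighbours K-1 and K+1.

    lnp_by_k maps K to a list of ln P(D) values from replicate runs.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # Delta K is undefined at the smallest and largest K
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


# Hypothetical replicate log-likelihoods (two runs per K, for brevity).
toy_runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-890.0, -894.0],
    4: [-888.0, -896.0],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
print(best_k)
```

With runs at 1 ≤ K ≤ 9, as in the design described above, this form of ΔK can only be evaluated for K = 2 to K = 8.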
The contribution of each population to the overall genetic diversity was analysed considering the within- (CW) and between-breed (CB) diversity components, and aggregate genetic diversity (D1 = Fst*CB + (1 - Fst)*CW) as described by Ollivier and Foulley [47]. METAPOP software version 1.0.2 [48] was used to account for allelic diversity and estimate the contribution of each population (ci) to a pool of maximal genetic diversity [49]. Equal weights were given to within- and between-breed coancestries (λ = 1). The average molecular coancestry (fm) of each population was also obtained with this software.

Analysis of Y chromosome markers
The four putative Algarvia males were genotyped for one SNP (UTY intron 19 AY936543: g.423C > A), one indel (ZFY intron 10 AF241271: g.697_8indelGT), and five microsatellites (DDX3Y_1, BM861, INRA189, UMN0103 and UMN0307) located in the male-specific region of the bovine Y chromosome. Analyses were done as described by Ginja et al. [15] for a comparison with previously identified patrilines in Portuguese autochthonous breeds.

Analysis of mtDNA sequence variation
A 919 bp PCR fragment containing the complete mtDNA control region was obtained and sequenced for Algarvia animals. The analysis was done as described by Ginja et al. [50] and sequences were aligned with the taurine reference sequence [GenBank: V00654, [51]] using the Multalin interface http://bioinfo.genotoul.fr/multalin/multalin.html. Haplotypes were identified with GENALEX [30] and ARLEQUIN version 2.0 [52] was used to calculate haplotype diversity (H), nucleotide diversity (π), and the mean number of pairwise nucleotide differences (MNPD) accounting for heterogeneity of substitution rates per site [53]. mtDNA sequences of the southern Portuguese autochthonous breeds and the imported Limousin were obtained from the GenBank database [accession numbers: FJ815445-59, FJ815525-40, FJ815573-88, FJ815620-35 and FJ815880-95] and used in ARLEQUIN to estimate pairwise-population Fst values (5% significance level obtained with 10,000 permutations). A Median-Joining (MJ) network of haplotypes was constructed with NETWORK version 4.5.10 [54] software to investigate breed relationships.

Genetic relationships between Algarvia and other breeds
The genetic distance analysis confirmed that the putative Algarvia animals were close to the southern Garvonesa and Alentejana breeds, and distant from northern breeds such as Mirandesa (Additional file 1 Figure S1). It also showed that this group of animals was more closely related to Preta than to Brava de Lide, both of which are considered to belong to the Black Orthoid breed group. Relationships among the individuals of Algarvia and of southern Portuguese breeds that clustered with Algarvia in Additional file 1 Figure S1 are shown in the N-J dendrogram of allele-sharing distances in Figure 1. Limousin cattle were included in the analysis because they are raised in the southern region of Portugal and have been used to upgrade the autochthonous breeds [24]. Thirty-two Algarvia animals formed two closely related subgroups, each containing a few Alentejana animals. Five Algarvia individuals clustered with Garvonesa (AG24, AG27, AG33, AG34 and AG35), four with Preta (AG40, AG41, AG42 and AG43) and five with Mertolenga (AG12, AG17, AG23, AG32 and AG37). Results of the FCA showed that Preta is the most distant breed but that, among southern Portuguese breeds including Algarvia, genetic differentiation was weak (Additional file 2 Figure S2). The relationships among individuals represented in the FCA graph were consistent with those shown in Figure 1.

Assignment of Algarvia cattle
STRUCTURE analyses assume that within a population all loci are in HWE and linkage equilibrium [40]. Although for some breeds a high number of loci showed significant (P < 0.05) deviations from HWE without correction for multiple testing (Additional file 3 Table S1), the assignments with STRUCTURE were conducted to include all loci, because some deviations from HWE are not expected to affect the performance of the test [55]. The HWE deviations found in Brava de Lide and Preta breeds are most probably related to a Wahlund effect and/or inbreeding, considering that the Fis values for these breeds were highly significant [see discussion for Brava de Lide in [12]]. Within breeds, LD was significant (P < 0.001) for one pair of loci in the Alentejana breed, four in the Preta and four in the Brava de Lide, but none of these corresponded to markers located on the same chromosome. The assignment tests of STRUCTURE and GENECLASS were done exclusively for the Algarvia animals, the related southern breeds (Alentejana, Garvonesa, Mertolenga and Preta) and the Limousin cattle to determine which Algarvia animals clustered as an independent group, and to detect admixture. The STRUCTURE assignments without prior information on source breeds showed the highest ΔK at K = 6 (Additional file 4 Figure S3). The estimated genotype membership coefficients (q) obtained for each individual are shown in Figure 2. Algarvia animals clustered with the Alentejana and Mertolenga breeds at K = 2 and only appeared as an independent cluster at K = 6.
Results of GENECLASS assignments are summarized in Additional file 5 Table S2. Animals were correctly assigned if the genotype probability was higher than the threshold exclusively in their source population. For Algarvia animals, ~22 to 50% of the individuals were correctly assigned with accuracies > 0.94 (ratio between the number of correctly assigned individuals and the sum of correctly and incorrectly assigned). Depending on the threshold considered, between 2 and 20% of the Algarvia animals were excluded (genotype probabilities lower than the threshold in all populations) and 30 to 76% were assigned to multiple populations (genotype probabilities greater than the threshold in at least two populations).
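The four outcome categories and the accuracy ratio defined above can be sketched in Python as follows. This is a minimal illustration, not the GENECLASS procedure itself: the function names, the probabilities and the 0.05 threshold are hypothetical, and the "incorrect" category (probability above the threshold in exactly one population that is not the source) is an assumption inferred from the accuracy definition.

```python
def classify_assignment(probs, source, threshold):
    """Return 'correct', 'incorrect', 'multiple' or 'excluded' for one animal.

    probs maps each candidate population to the genotype probability
    of the animal in that population.
    """
    above = [breed for breed, p in probs.items() if p > threshold]
    if not above:
        return "excluded"   # below the threshold in all populations
    if len(above) >= 2:
        return "multiple"   # above the threshold in at least two populations
    # Exactly one population passes the threshold.
    return "correct" if above[0] == source else "incorrect"


def accuracy(labels):
    """Correctly assigned / (correctly + incorrectly assigned)."""
    correct = labels.count("correct")
    incorrect = labels.count("incorrect")
    if correct + incorrect == 0:
        return None
    return correct / (correct + incorrect)


# Hypothetical genotype probabilities for four putative Algarvia animals.
animals = [
    {"Algarvia": 0.40, "Alentejana": 0.01, "Garvonesa": 0.02},
    {"Algarvia": 0.30, "Alentejana": 0.20, "Garvonesa": 0.01},
    {"Algarvia": 0.01, "Alentejana": 0.02, "Garvonesa": 0.03},
    {"Algarvia": 0.02, "Alentejana": 0.01, "Garvonesa": 0.25},
]
labels = [classify_assignment(p, "Algarvia", 0.05) for p in animals]
print(labels, accuracy(labels))
```

Raising the threshold in such a scheme moves animals from the "multiple" category towards "correct" or "excluded", which is the trade-off reflected in the ranges reported above.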

Genetic diversity of Algarvia
The genetic diversity (MNA = 6.0 ± 1.6, Rt = 5.7 ± 1.4, Ho = 0.63 ± 0.19 and He = 0.69 ± 0.10) of the 33 Algarvia animals identified with STRUCTURE was identical to that found for the related Alentejana breed, and slightly lower than the average estimates across all breeds included in this study (Additional file 3 Table S1, MNA = 6.8 ± 0.6, Rt = 6.0 ± 1.8, Ho = 0.67 ± 0.05 and He = 0.70 ± 0.03). Ten population-specific alleles were found in the Algarvia population but none had a frequency greater than 0.05. Deviations from HWE were significant (P < 0.05) due to heterozygote deficit at ten loci (BM203, BM1824, BM2113, BRRIBO, ETH152, ILSTS035, SPS115, TGLA53, TGLA122, and TGLA345). Evidence of inbreeding within the Algarvia group could be inferred from the Fis estimate (0.083), which was significantly (P < 0.001) greater than zero. The molecular coancestry of Algarvia animals (fm = 0.322) was slightly higher than the overall value of 0.310. Pairwise population Fst estimates showed that Algarvia animals are genetically closer to the Alentejana breed (Fst = 0.045, P < 0.05) than to the other southern breeds (results not shown).
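The relationship between heterozygote deficit and Fis can be illustrated for a single locus with the Python sketch below. It assumes the small-sample unbiased form of expected heterozygosity, He = 2n/(2n - 1) * (1 - Σp²), and the simple ratio Fis = 1 - Ho/He; these may differ from the estimators implemented in GENETIX. The function name and the genotypes are hypothetical.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Return (Ho, unbiased He, Fis) for one locus.

    genotypes is a list of (allele_1, allele_2) tuples, one per animal.
    """
    n = len(genotypes)
    # Observed heterozygosity: proportion of animals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies over the 2n gene copies sampled.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    sum_p2 = sum((c / (2 * n)) ** 2 for c in counts.values())
    # Expected heterozygosity with a small-sample correction (assumed form).
    he = (2 * n / (2 * n - 1)) * (1 - sum_p2)
    # Simple ratio estimate; positive values indicate a heterozygote deficit.
    fis = 1 - ho / he if he > 0 else float("nan")
    return ho, he, fis


# Hypothetical microsatellite genotypes (allele sizes in bp) for four animals.
toy_genotypes = [(100, 100), (100, 102), (102, 102), (100, 102)]
ho, he, fis = locus_diversity(toy_genotypes)
print(ho, round(he, 3), round(fis, 3))
```

Applied to the mean values reported above, the same ratio gives 1 - 0.63/0.69 ≈ 0.087, close to the reported Fis of 0.083; a small difference would be expected if the estimator or the averaging over loci differs.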

Y chromosome haplotypes of Algarvia
Among the four Algarvia bulls, two (AG3M and AG4M) had the H11Y2 haplotype, which is fixed in the Alentejana breed but also common in other Portuguese breeds, whereas the two other animals (AG1M and AG2M) had the H6Y2 haplotype, which is fixed in the Garvonesa breed but also found in other Portuguese breeds as well as in Charolais and Limousin breeds [15].

mtDNA haplotypes of Algarvia
Complete mtDNA control region sequences (909 bp) were obtained for the 33 Algarvia animals (four bulls and 29 females) assigned to this group with STRUCTURE (sequence quality of AG19 was low and thus was discarded). Sequence alignment is shown in Additional file 6 Figure S4 with polymorphic positions represented. A total of 12 distinct haplotypes was identified [GenBank: G086285-G086317] based on 21 variable sites, of which 11 were phylogenetically informative, nine were singletons, and one was an indel. The European T3 haplogroup (T at nt16255) was the most frequent (29 animals) but the African-derived T1a type (T at nt16050, C at nt16113 and C at nt16255) was also detected in three Algarvia individuals (AG16, AG40 and AG43). Genetic diversity estimates in the Algarvia population were H = 0.81 ± 0.05, π = 0.003 ± 0.002 and MNPD = 3.03 ± 1.62. Pairwise Fst values showed that Algarvia is significantly differentiated from all breeds except Garvonesa (Fst = 0.028, P > 0.05).
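The diagnostic sites quoted above can be turned into a simple lookup, sketched here in Python. The sketch uses only the three positions named in the text (nt16050, nt16113 and nt16255); the function name and the example base calls are hypothetical, and a full haplogroup assignment would normally consider additional sites.

```python
def classify_haplogroup(bases):
    """Assign T1a or T3 from base calls at the diagnostic positions.

    bases maps a reference position (e.g. 16255) to the observed base.
    """
    is_t1a = (
        bases.get(16050) == "T"
        and bases.get(16113) == "C"
        and bases.get(16255) == "C"
    )
    if is_t1a:
        return "T1a"   # African-derived type as defined in the text
    if bases.get(16255) == "T":
        return "T3"    # European type as defined in the text
    return "unclassified"


# Hypothetical base calls for three animals.
print(classify_haplogroup({16050: "C", 16113: "T", 16255: "T"}))
print(classify_haplogroup({16050: "T", 16113: "C", 16255: "C"}))
print(classify_haplogroup({16050: "C", 16113: "T", 16255: "C"}))
```

Sequences that match neither pattern are returned as "unclassified" rather than forced into a haplogroup.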
Haplotype relationships represented in the MJ-network (Figure 3) showed that the most common haplotype in the Algarvia population (11 animals, including males AG3M and AG4M) was shared with one Alentejana individual, whereas the second most frequent haplotype (10 animals) was shared with three Garvonesa, two Mertolenga and one Preta individuals. Two Algarvia animals (AG40 and AG43) and one Garvonesa shared a T1a haplotype. Interestingly, a T1a haplotype found in one Algarvia (AG16) and three Alentejana animals was substantially different from other haplotypes of this haplogroup. This haplotype has a C and an A at positions nt16122 and nt16196, respectively (nt330 and nt404 in Additional file 6 Figure S4), which are characteristic of the African-derived AA mtDNA lineage described in Latin American Creole cattle and also found in Iberia [56]. Although this haplotype lacks the C and T at positions nt16053 and nt16139, respectively (nt261 and nt347 in Additional file 6 Figure S4), that also define AA, it can represent more ancestral Iberian mtDNA lineages [for a discussion see [50]].

Conservation analysis
Contributions of each breed to the overall genetic diversity are shown in Table 2. Following the Weitzman approach, and based on population pairwise DA values, the between-breed influence on diversity (CB) varied from 3.90 (Arouquesa) to 10.23 (Brava de Lide). The within-breed diversity (CW) values varied from -0.62 (Brava de Lide) to 0.43 (Mertolenga). The Fst value estimated across all breeds and used to calculate the aggregate genetic diversity (D1) was 0.089. Among autochthonous breeds, the lowest value for D1 was found in the Mirandesa breed (0.23) and the highest for the Preta breed (0.92). The influence of the Algarvia breed on the overall genetic diversity was intermediate across all estimates except for the allelic diversity-based (ci) calculation. Algarvia ranked within the six breeds that contributed most to CB (6.18), and showed a lower contribution to both CW (-0.06) and D1 (0.50). When the allelic diversity was considered, Algarvia and seven other autochthonous breeds made no contribution to the overall genetic diversity.
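As a worked check of the aggregate diversity formula of Ollivier and Foulley, D1 = Fst*CB + (1 - Fst)*CW, the following Python sketch recomputes D1 for Algarvia from the CB, CW and Fst values reported above. The function name is hypothetical; only the arithmetic is illustrated.

```python
def aggregate_diversity(cb, cw, fst):
    """Aggregate genetic diversity D1 = Fst*CB + (1 - Fst)*CW."""
    return fst * cb + (1.0 - fst) * cw


# Values reported in the text for Algarvia: CB = 6.18, CW = -0.06, Fst = 0.089.
d1 = aggregate_diversity(cb=6.18, cw=-0.06, fst=0.089)
print(f"{d1:.2f}")  # prints 0.50
```

Because Fst is small (0.089), the within-breed component carries a weight of 0.911, so a breed with a large CB but a negative CW can still end up with a moderate D1.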

Discussion
Decisions to initiate costly conservation programmes need to take into account assessment of extant variability and diversity loss, as well as cultural and economic aspects [57][58][59]. Algarvia cattle were adapted to the climatic and geographical conditions of Algarve, a region that is highly susceptible to climate change [60]. In addition to its cultural and historical relevance, reconstituting this breed could also contribute to reinforce sustainable agriculture, a non-negligible component of the economic activity of the region. The use of molecular data to assess the genetic structure of domestic species in conservation programmes has been described [59,[61][62][63][64][65][66][67]. Analysis of morphological and molecular information provides a more solid basis to define the characteristics of a breed than the use of morphological traits alone. DNA markers were not available at the time when the Algarvia breed became nearly extinct, and no biological samples have been preserved. Based on morphology descriptions [22], putative Algarvia descendants were identified [27]. As preliminary work towards reconstituting this breed, we used molecular markers to investigate to what extent these animals could be distinguished from other Portuguese cattle. In agreement with their morphological classification [20,23,27,68], genetic distances and factorial correspondence analyses showed a close relationship between Algarvia and other breeds of the Aquitanian (Red Convex) group. Algarvia and Garvonesa breeds are considered to be descendants of the Alentejana breed [23], and our study shows that Algarvia is closely related to these breeds. This genetic proximity probably also reflects recent admixture, because most of the Algarvia animals were found in herds predominantly composed of Alentejana animals. Genetic erosion due to crossbreeding is expected to have occurred since the breed began to decline about 35 years ago.
We used assignment tests to determine whether a cluster of putative Algarvia animals could be distinguished from the related southern breeds and to identify potential candidates to reconstitute the breed. STRUCTURE results confirmed that the most probable partition of the data agreed with the number of populations tested. Although the Algarvia group was the last to emerge as an independent cluster, it was possible to identify 17 cows that belonged to this group with q values > 0.95 without using prior information on populations. With prior information on sample origin, 33 animals (30 cows and three bulls) were classified as Algarvia. The partially Bayesian method of GENECLASS resulted in relatively low percentages of animals classified in each breed but, because of the high accuracy, it was useful to confirm reference animals in each population. Even though GENECLASS methods are conservative [55], the results were consistent with those of STRUCTURE with 23 animals assigned to Algarvia of which only two (AG34 and AG1M) were not among those selected by STRUCTURE.
Based on their genotypes, three bulls (AG2M, AG3M and AG4M) identified as Algarvia can be used to reconstitute the breed. The remaining bull (AG1M) was possibly admixed with Alentejana, according to the STRUCTURE analysis, although with GENECLASS it was assigned to the source population but with a low probability. Overall, a certain amount of convergence of results from independent approaches is noted, since 19 of these animals (16 cows and three bulls) were also identified as belonging to the core group to reconstitute the Algarvia breed based on numerical taxonomy analyses of morphological characters [27]. Admixture was detected in several Algarvia females, three of which (AG33, AG35 and AG41) did not represent descendants of this breed because they were misclassified. Based on morphology, these animals clustered within a group also containing Alentejana, Garvonesa and Mertolenga animals [27]. The heterogeneous composition of the putative Algarvia population is reflected by the lower average genotype membership coefficients when compared to those of breeds having herdbook registries. This result was not surprising given the expected dilution of Algarvia through crossbreeding.
Table 2 footnote: Algarvia is represented by the 33 animals identified with STRUCTURE. CB: Weitzman estimate of between-breed genetic diversity; CW: within-breed genetic diversity; D1: aggregate genetic diversity, 0.089*CB + 0.911*CW; ci: contribution of each breed to a pool of maximal genetic diversity [49].
The Y-haplotypes of the putative Algarvia bulls provided additional evidence of the genetic proximity with Alentejana and Garvonesa breeds, but did not exclude possible admixture with imported breeds such as Limousin or Charolais. mtDNA analysis corroborated the close genetic relationship between the core group of 33 Algarvia animals and the Garvonesa breed through their sharing of haplotypes. Common Iberian matrilines (European T3 and African T1a) were found in the Algarvia population, as well as a distinct haplotype (AG16) possibly related to the more ancestral African-derived AA haplogroup found in Creole cattle [50]. The use of genotypic data, together with morphological analysis, facilitated the definition of a group of animals that could be used to reconstitute the Algarvia breed. How conservative the inclusion criteria should be relative to the acceptable degree of admixture will depend on breeding strategies yet to be defined.
The genetic diversity of the core set of 33 Algarvia animals was slightly lower than that found across Portuguese breeds. A significant heterozygote deficit was detected, possibly due to inbreeding, which is consistent with the strong genetic erosion. These results were not unexpected considering the extremely reduced number of extant Algarvia descendants. Another possible explanation for HWE deviations could be the sampling of Algarvia animals from independent herds, which could generate population subdivision and an increased frequency of homozygotes (i.e. a Wahlund effect).
Gene diversity-based estimates indicate that the Algarvia population makes an intermediate contribution to the overall genetic diversity of Portuguese cattle. In contrast, the allelic diversity-based estimates suggest that the variation found in Algarvia is represented in the genetic pool of four other autochthonous breeds. There is no consensus on which of these measures should be used in management programs to evaluate breed contributions to overall genetic diversity [61]. In the case of Portuguese autochthonous breeds that are considered endangered [4], particularly for the Algarvia breed, immediate conservation measures should aim at maximizing gene diversity rather than allelic diversity. This approach would maintain allelic diversity and guarantee a more effective response to selection while controlling inbreeding [61,64].

Conclusion
Although nearly three decades have elapsed since the Algarvia breed was declared effectively extinct, we were able to identify a small group of cows and bulls with phenotypic characteristics of this breed. Analyses of autosomal, maternal and paternal markers have helped refine previous morphological findings to identify 33 animals that can be considered remnants of the Algarvia breed. For cultural and economic reasons, reconstituting the Algarvia breed is relevant to maintain the distinct regional identity of the Algarve. Molecular analyses have characterized the genetic diversity of the core set of animals which, together with information from conservation analyses, can be used to establish a management program to reconstitute the Algarvia breed.