Mitochondrial D-loop sequence variation among Italian horse breeds

The genetic variability of the mitochondrial D-loop DNA sequence in seven horse breeds bred in Italy (Giara, Haflinger, Italian trotter, Lipizzan, Maremmano, Thoroughbred and Sarcidano) was analysed. Five unrelated horses were chosen in each breed and twenty-two haplotypes were identified. The sequences obtained were aligned and compared with a reference sequence and with 27 mtDNA D-loop sequences selected from the GenBank database, representing Spanish, Portuguese, North African, wild horses and an Equus asinus sequence as the outgroup. Kimura two-parameter distances were calculated and a cluster analysis using the Neighbour-joining method was performed to obtain phylogenetic trees among breeds bred in Italy and among Italian and foreign breeds. The cluster analysis indicates that all the breeds except Giara are scattered across the clades of both trees, and no clear relationships were revealed between Italian populations and the other breeds. These results could be interpreted as showing the mixed origin of breeds bred in Italy and probably indicate the presence of many ancient maternal lineages with high diversity in mtDNA sequences.


INTRODUCTION
Mitochondrial DNA (mtDNA) analysis has often been used in evolutionary studies. Feral and domestic equine cells contain a large number of maternally inherited mitochondria (from 100 to 1000) [1,23]. The D-loop region of mtDNA is particularly interesting due to the high variability level [1,23], the moderate mutation rate estimated at one site every 6000 years in humans [18], the matrilineal transmission and the lack of recombination [21].
The aim of this study was to investigate the genetic diversity of the mtDNA D-loop hypervariable region in seven Italian horse populations, in order to evaluate their matrilineal relationships. The Italian autochthonous breeds considered were the following: Giara, Haflinger, Lipizzan, Maremmano, and a small feral Sardinian population called Sarcidano. Italian Trotter and Thoroughbred horses were also included. We considered the relationships between these horse breeds bred in Italy and Spanish, Portuguese, North African and wild horses (the Mongolian wild horse) in order to provide information about the origin of the Italian populations. An Equus asinus sequence was used as the outgroup.

MATERIALS AND METHODS
Total DNA was extracted, following standard procedures, from peripheral blood samples of five horses for each of the following breeds: Giara (GRH), Haflinger (HFL), Italian trotter (ITR), Lipizzan (LPZ), Maremmano (MAH), Sarcidano (SRH) and Thoroughbred (TBH). The Giara and Sarcidano were bred in feral conditions and samples were selected at random. The horses in other breeds were selected by pedigree analysis, in order to obtain information about maternal lineages.
The D-loop region was amplified by the polymerase chain reaction (PCR) using two primers specifically designed from a published horse sequence (GenBank X79547): forward 5'-AACGTTTCCTCCCAAGGACT-3' and reverse 5'-TCAGCAACCCTCCCAACTAC-3' [5,24]. The amplicon obtained was a 397-bp fragment between the tRNA Pro and the large central conserved sequence block (sites 15382 and 15778), which is considered the most polymorphic mtDNA region [24].
The reaction profile was as follows: one cycle of denaturation at 94 °C for 9 min, followed by 30 cycles of denaturation at 94 °C for 60 s, annealing at 48 °C for 45 s and extension at 74 °C for 1 min, and a final extension at 74 °C for 30 min. PCR products were purified and sequenced using the BigDye Terminator Kit (Applied Biosystems) on an ABI PRISM 377 DNA Sequencer equipped with Sequencing Analysis and Sequence Navigator (Applied Biosystems).
Mitochondrial DNA sequences were compared with a reference sequence from a "Swedish horse" (GenBank X79547) by the BLAST2 SEQUENCES programme [19].
Multiple alignments between our sequences and those from the literature were performed using CLUSTAL W software [20]. Kimura two-parameter distances, calculated on the basis of an equal substitution rate per site [11], were estimated with the PHYLIP software package version 3.5c [3]. Cluster analysis using the Neighbour-joining method [17] was performed with the same package to obtain a phylogenetic tree, which was viewed in TreeView software [16]. A bootstrap analysis with 1000 replicates was applied in order to evaluate the robustness of the dendrogram.
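As an illustration of the distance measure named above, the following is a minimal sketch of the closed-form Kimura two-parameter distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions. It is not a reproduction of the PHYLIP computation used in this study; the function name `k2p_distance` and the choice to skip gapped or ambiguous sites are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Closed-form Kimura two-parameter distance for two aligned sequences.

    Hypothetical helper for illustration; sites with gaps or ambiguous
    bases are skipped (an assumption, not the study's stated procedure).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G or C<->T is a transition; anything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Applying such a function to every pair of aligned sequences would give the kind of distance matrix that a Neighbour-joining routine takes as input.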

RESULTS
We identified 22 haplotypes among the 35 horses (Tab. I). For each breed we identified from 2 to 5 haplotypes (Tab. II).
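Identifying haplotypes amounts to grouping animals whose aligned sequences are identical. The sketch below shows one way to do this; the function name `collapse_haplotypes`, the `H1, H2, ...` labels and the toy sequences are hypothetical and are not the study's data or numbering.

```python
from collections import defaultdict


def collapse_haplotypes(sequences):
    """Group sample IDs that share an identical aligned sequence.

    `sequences` maps a sample ID to its aligned sequence. Haplotypes are
    labelled H1, H2, ... in order of first appearance (illustrative labels).
    """
    groups = defaultdict(list)
    for sample_id, seq in sequences.items():
        groups[seq.upper()].append(sample_id)
    return {f"H{i}": ids for i, ids in enumerate(groups.values(), start=1)}


# Toy example with made-up sequences (not real D-loop data)
toy = {"GRH1": "ACGT", "GRH2": "acgt", "HFL1": "ACGA"}
haplotypes = collapse_haplotypes(toy)
```

Counting the distinct haplotypes found within each breed code would then give a per-breed summary of the kind reported in Table II.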
The identified haplotypes differed from the reference sequence (GenBank X79547) by 5-12 sites and from each other by 1-15 sites, within the 397-bp amplicons (Tab. I).
We found 29 base substitution sites in comparison with the reference sequence. The detected mutations corresponded to transitions and we did not find inversions (Tab. I). One deletion, also reported in three sequences (AF481311, AF481320, AF481322) by Hill et al. [4], was identified in Thoroughbred samples. Two substitutions (positions 15945 and 15720), already mentioned in other breeds [2,8,12], were identified (Tab. II).
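A comparison of this kind can be sketched as a scan of each sample against the reference that records every differing site and labels it as a transition or a transversion. The function name `classify_substitutions` is hypothetical, and the sketch assumes a gap-free alignment so that the first base corresponds to site 15382, the start of the amplicon given in Materials and Methods.

```python
PURINES = {"A", "G"}


def classify_substitutions(reference, sample, start=15382):
    """List substitutions of `sample` relative to `reference`.

    Returns tuples (site, reference base, sample base, kind). Assumes the
    two sequences are aligned without gaps, so index 0 maps to `start`.
    """
    calls = []
    for offset, (r, s) in enumerate(zip(reference.upper(), sample.upper())):
        if r == s or r not in "ACGT" or s not in "ACGT":
            continue  # identical site, gap or ambiguous base
        # Purine<->purine or pyrimidine<->pyrimidine is a transition
        same_class = (r in PURINES) == (s in PURINES)
        kind = "transition" if same_class else "transversion"
        calls.append((start + offset, r, s, kind))
    return calls


# Toy example with made-up sequences (not real D-loop data)
calls = classify_substitutions("ACGT", "GCTT")
```

The number of tuples returned for a haplotype corresponds to its count of differing sites relative to the reference.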
The dendrogram reported in Figure 1 was constructed with the Neighbour-joining algorithm using the Kimura two-parameter distances estimated on the D-loop sequences.
The clustering of haplotypes shows seven clades A to G (Fig. 1). Clade A joins horses belonging to haplotypes 9, 10, 13, 15, 18, 19 and 22. They differ from each other by 1-7 nucleotide substitutions (Tab. I). Clade B is represented by haplotype 1 that includes six horses, whereas clade C joins the haplotypes 6 and 21, differing from each other by one mutation (Tab. I). Haplotypes 2, 4 and 8 are clustered in clade D and they have 3-4 nucleotide substitution differences. Haplotype 2 shows a characteristic substitution site at position 15601, whereas haplotypes 4 and 8 presented characteristic nucleotide substitutions at
We also analysed our data in a wider context. Only 7 haplotypes (1, 3, 7, 11, 17, 21 and 22) showed a 100% alignment upon comparison of our sequences with those with the same 397 bp length or longer found in the GenBank database.
Our data from Thoroughbreds, compared with those present in the literature on the same breed, showed a similarity between haplotype 11 and the founder haplotype K identified by Hill et al. [4]. The other haplotypes differed from Hill's haplotypes by 1 to 12 nucleotide substitutions.
The Lipizzan bred in Italy showed haplotypes that are also frequent in other breeds. Haplotypes 1 and 21 were similar to the Allegra and Monteaura haplotypes, respectively, which were identified by Kavar et al. [9] and considered more frequent in the Lipizzan maternal lines.
Figure 2 shows a Neighbour-joining dendrogram built using the mtDNA D-loop sequences selected from the GenBank database.
However, the five Giara samples were selected at random, while selection for the other breeds used pedigree information to maximise maternal diversity.
The presence, in Thoroughbreds and in Lipizzans bred in Italy, of haplotypes that are more frequent in maternal lines and that are considered "ancestral" in the Lipizzan [9] is in accordance with the wide genetic base of the maternal lines of these breeds [4,8,9].
Relationships among the Italian horse breeds obtained using Kimura two-parameter distances are reported in Figure 1.
Consistent with the large number of haplotypes identified in the analysed sample, all our breeds except Giara are spread across the dendrogram clades.
In the dendrogram constructed using mtDNA data, Giara appears to be homogeneous and clustered in a single clade (F) (Fig. 1).
The wide variability of the D-loop sequences among our populations may be caused by the multiple origins of the breeds bred in Italy, in accordance with the results of other authors studying different horse populations [8,10,13]. In reference to these breeds, the high variability of the mtDNA haplotypes within Italian populations is probably due to the important role played by other horse populations influencing the evolution of Italian horse breeds.
The analysis of our sequences in a wider context is reported in Figure 2. All the horse breeds bred in Italy except Giara appear to be spread out in the different clades. Giara is grouped in a single clade together with Sorraia, Garrano and Potoka, considered as very ancient breeds (Fig. 2). The two Sorraia haplotypes are closely related to the two Giara haplotypes 16 and 17. In fact, haplotype 17 aligns 100% with Sorraia AF447764, whereas it differs from Sorraia AF447765 by 3 nucleotide substitutions. Haplotype 16 differs from Sorraia AF447764 by just one substitution and from Sorraia AF447765 by two substitutions. The close relationships between Giara and the other breeds in the cluster could be explained by the presence of common ancient maternal lineages.
However, the small number of Giara horses sampled prevents us from drawing any conclusion on this hypothesis, and the question needs further investigation.
The distribution of our haplotypes in the different clades suggests that, as reported by some authors [7,13,22], the modern horse mtDNA sequences do not define monophyletic groups. In particular, compared to wild progenitors, modern horse populations are not derived from a single stock of wild horses. Horse domestication probably involved several distinct populations. The initially domesticated horses spread out and incorporated wild mares, forming different mtDNA clusters [7]. In this case, the phylogenetic differences detected in our breeds could be explained by the presence of a very ancient mitochondrial diversity.
In conclusion, we provide a preliminary sequence characterisation and phylogenetic study by mitochondrial D-loop DNA polymorphism of seven Italian horse breeds. Horse populations bred in Italy are the result of multiple origins since they retain very ancient mitochondrial diversity.