Genetic variability and history of a native Finnish horse breed

Background The Finnhorse was established as a breed more than 110 years ago by combining local Finnish landraces. Since its foundation, the breed has experienced both strong directional selection, especially for size and colour, and severe population bottlenecks that are connected with its initial foundation and subsequent changes in agricultural and forestry practices. Here, we used sequences of the mitochondrial control region and genomic single nucleotide polymorphisms (SNPs) to estimate the genetic diversity and differentiation of the four Finnhorse breeding sections: trotters, pony-sized horses, draught horses and riding horses. Furthermore, we estimated inbreeding and effective population sizes over time to infer the history of this breed. Results We found a high level of mitochondrial genetic variation and identified 16 of the 18 haplogroups described in present-day horses. Interestingly, one of these detected haplogroups was previously reported only in the Przewalski’s horse. Female effective population sizes were in the thousands, but declines were evident at the times when the breed and its breeding sections were founded. By contrast, nuclear variation and effective population sizes were small (approximately 50). Nevertheless, inbreeding in Finnhorses was lower than in many other horse breeds. Based on nuclear SNP data, genetic differentiation among the four breeding sections was strongest between the draught horses and the three other sections (FST = 0.007–0.018), whereas based on mitochondrial DNA data, it was strongest between the trotters and the pony-sized and riding horses (ΦST = 0.054–0.068). Conclusions The existence of a Przewalski’s horse haplogroup in the Finnhorse provides new insights into the domestication of the horse, and this finding supports previous suggestions of a close relationship between the Finnhorse and eastern primitive breeds. 
The high level of mitochondrial DNA variation in the Finnhorse supports its domestication from a large number of mares but also reflects that its founding depended on many local landraces. Although inbreeding in Finnhorses was lower than in many other horse breeds, the small nuclear effective population sizes of each of its breeding sections can be considered as a warning sign, which warrants changes in breeding practices. Electronic supplementary material The online version of this article (10.1186/s12711-019-0480-8) contains supplementary material, which is available to authorized users.


Background
Domestication of animals and plants has played an essential role in human history. At present, there are more than 8800 breeds of 38 domesticated animal species listed in the Domestic Animal Diversity Information System (DAD-IS) [1]. Domestication has influenced the behaviour, morphology, physiology and performance of a species through selective breeding. It has also led to the fixation of breed-specific traits through inbreeding, genetic drift, founder effect and selection [2,3]. Two examples of the occasionally large phenotypic differences between wild and domesticated animals are the great variation in coat colour [4] and body size (e.g., in dogs [5]) in domesticated species, which nevertheless often show limited variation within breeds. Thus, captive breeding may decrease the overall genetic variation while increasing it for certain traits.
A study of the genome-wide genetic diversity of 36 horse breeds [6] based on single nucleotide polymorphism (SNP) genotyping data showed that the diversity was low in breeds that have experienced high selective to each breeding section (e.g., maximum withers height of 148 cm for pony-sized horses, pulling capability for draught horses, athletic conformation for harness trotters and rhythmic, elastic and advancing gaits for riding horses [15,17]). It is assumed that while all present-day Finnhorses can be traced back to only four founding stallions, which were born between 1879 and 1929, the number of founding mares was large [17]. The small number of founding stallions, together with the strict breeding standards, presumably decreased the genetic variation of the breed at the beginning of the 1900s. However, the later formation of the four separate breeding sections likely increased genetic differentiation. In addition, the decline in numbers until the 1980s created a population bottleneck, which possibly further decreased the genetic diversity and increased the effect of genetic drift. Today, the number of individuals in the draught horse breeding section, in particular, is alarmingly small. However, according to [6], the levels of heterozygosity and inbreeding in the Finnhorse breed are not markedly different from those of other horse breeds (HE = 0.301, inbreeding coefficient FIS = −0.004 and effective population size Ne = 575 estimated from 27 Finnhorses, whereas the corresponding mean values for all studied breeds are equal to 0.295, 0.007 and 341, respectively). This higher than expected level of heterozygosity and lower than expected inbreeding coefficient for a breed with strong directional selection and a recent decrease in population size have been explained by its within-breed genetic structure [6], i.e. its differentiation into four breeding sections.
It is still unknown how the history of the Finnhorse has influenced the genetic diversity of the breed as a whole or the differentiation among its four breeding sections. Therefore, our aims were to (1) characterize the genetic variation of the breed and breeding sections by analyses based on whole-genome SNPs and mitochondrial DNA sequences, (2) estimate the genomic and mitochondrial differentiation between the four breeding sections, and (3) estimate inbreeding and effective population sizes.

Sampling and DNA extraction
Nine hundred and ninety-one samples of horses were obtained either directly from horse owners (hair samples from the mane or tail, N = 960) or from a horse hospital (blood samples, with the permission from the owners, N = 31). Among these, 852 samples were from Finnhorses (67 harness trotters, 79 riding horses, 30 draught horses, 51 pony-sized horses and 641 individuals that are not registered in the studbook) with 16 individuals that were concurrently registered in two breeding sections (three as trotter and riding, three as riding and draught, five as riding and pony-sized, one as draught and pony-sized, four as trotter and draught), and thus they were included in both breeding sections for the estimations of genetic diversity and differentiation between sections. We also analysed samples from several other breeds: eight horses from different Baltic breeds (two Estonian horses, one Estonian riding pony, two Estonian Sport horses, one Tori horse, two Latvian Sport horses and one horse of unknown breed that originated from Estonia), 30

Mitochondrial analyses
We amplified a 774-bp part of the mitochondrial control region using the primers Eca_tRNAThr_L (5′-AAA CCA GAA AAG GGG GAA AA-3′ [18]) and Eca_CR690_H (5′-TTG TTT CTT ATG TCC CGC TACC-3′, designed for this study). PCRs were carried out with 2 µL of 5 × Phusion buffer, 0.2 µL of 10 mM dNTPs, 0.5 µL of each primer (10 µM), 0.5 to 2 µL (~20 to 200 ng) of template DNA and 0.1 µL of Phusion DNA polymerase. The reaction conditions included an initial denaturation step at 98 °C for 30 s, followed by 35 cycles at 98 °C for 10 s, 53 °C for 30 s and 72 °C for 30 s, and a final extension for 10 min at 72 °C. PCR products were sequenced using the primer Eca_CR690_H, the BigDye Terminator v.3.1 kit and an ABI 3730 automatic sequencer (Applied Biosystems). Sequences were aligned by eye with the program BioEdit 7.2.5 [19]. In order to classify the sequences into previously defined haplogroups, we also included in the alignment the sequences from [11] (GenBank Accession Nos. JN398377-JN398457), and drew a haplotype network using TCS v. 1.21 [20]. When both a mare and its foals were sampled, the foals were excluded from the dataset.
We determined the best substitution model for the alignment with the program Mega 6.06 [21] and used the suggested model (see Results) to calculate pairwise ΦST-values between the Finnhorse breeding sections with Arlequin v.3.5.1.3 [22]. Indices of DNA polymorphism (nucleotide diversity π, mutation parameter θ, haplotype diversity ĥ and number of haplotypes) were calculated with DnaSP v. 5.1 [23]. We also used the mitochondrial sequences to estimate the past and present female effective population sizes of the Finnhorse breed based on the Bayesian skyline plot in the program BEAST v.1.8.0 [24]. BEAUti v.1.8.0 [24] was used to create an input file for BEAST, implementing the aforementioned substitution model with five gamma categories. We performed this analysis with two datasets, the first one including all the samples from the four Finnhorse breeding sections together with a random sample of 50 individuals that were not registered in the studbook, and the second one including only the samples of the breeding sections. Markov chain Monte Carlo (MCMC) was run for 10,000,000 steps, using ten groups and the piecewise-constant model. Posterior distributions and effective sample sizes were inspected with TRACER v.1.7.1 [25], which was also used to analyse the skyline. To transform the obtained times into years and female effective size estimates into individuals, we used the minimum mutation rate of 2.9 × 10⁻⁶ and maximum mutation rate of 10 × 10⁻⁶ [26].
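To make the two main polymorphism indices concrete, the following minimal Python sketch computes haplotype diversity (Nei's estimator) and nucleotide diversity π from a list of aligned sequences. It is only an illustration and not the DnaSP implementation: it assumes gap-free sequences of equal length, uses the uncorrected proportion of differing sites rather than a substitution model, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of sequences (uncorrected)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    total_diffs = 0
    for a, b in combinations(seqs, 2):
        total_diffs += sum(x != y for x, y in zip(a, b))
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


if __name__ == "__main__":
    # Toy alignment (hypothetical data, not from the study).
    toy = ["AAAA", "AAAT", "AAAT", "CAAT"]
    print(haplotype_diversity(toy), nucleotide_diversity(toy))
```

Both functions take plain strings, so an alignment exported as FASTA would first need to be read into a list of equal-length sequences.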

SNP data
We randomly sampled 12 horses from each breeding section of Finnhorses (trotters, riding horses, pony-sized horses and draught horses), 12 horses that were not in the studbook and 12 horses from other breeds, i.e. one Estonian horse, two Warmblood trotters, one Irish Cob, one Welsh Mountain, two Shetland ponies, one KWPN (Royal Dutch Sport horse), one Gotland Russ, one FWB (Finnish Warmblood), one Norwegian Fjord and one American Quarter horse (Table 1). These 72 horses were genotyped using the Illumina Equine SNP70 BeadChip at the laboratory of Dr. Van Haeringen (Wageningen, the Netherlands). This chip includes 65,157 SNPs across the horse genome. Across all samples, the average genotyping call rate was 0.988, after excluding three samples with call rates ranging from 0.355 to 0.419 (one mare and one stallion pony-sized Finnhorse and one mare Finnhorse that is not registered in the studbook; these individuals were not included in further analyses). The data were filtered for minor allele frequency (MAF threshold of 0.05) and pruned for linkage (with a sliding window of 50 SNPs, shifting the window 5 SNPs forward and removing SNPs with an r² higher than 0.5; --maf 0.05 --indep-pairwise 50 5 0.5) using the PLINK 1.9 software [27,28]. Finally, 37,445 SNPs remained for further analyses, unless stated otherwise.
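The minor allele frequency filter described above can be sketched in a few lines of Python. This is an illustrative approximation and not PLINK itself: genotypes are assumed to be coded as the count of one allele (0, 1 or 2) with None for missing calls, SNPs with a MAF below the threshold are dropped, and the function names are hypothetical. The linkage-based pruning step is not reproduced here.

```python
def minor_allele_frequency(genotypes):
    """MAF of one SNP from 0/1/2 allele counts; None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None  # no information for this SNP
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def filter_by_maf(genotype_matrix, threshold=0.05):
    """Return the indices of SNPs (rows) whose MAF is at least the threshold."""
    keep = []
    for index, snp in enumerate(genotype_matrix):
        maf = minor_allele_frequency(snp)
        if maf is not None and maf >= threshold:
            keep.append(index)
    return keep


if __name__ == "__main__":
    # Rows are SNPs, columns are individuals (hypothetical toy data).
    toy = [
        [0, 0, 0, 0],     # monomorphic, removed
        [0, 1, 2, None],  # one missing call, kept
        [2, 2, 2, 1],     # rare reference allele, kept
    ]
    print(filter_by_maf(toy))
```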

Genetic variation, inbreeding and effective population size
Genomic diversity and inbreeding were estimated with PLINK 1.9, using the functions --het (observed and expected homozygous genotype counts and method-of-moments F coefficient), --ibc (inbreeding coefficients FI, which is the variance-standardized relationship minus 1, based on the variance of additive genetic values within an individual, FII, which is based on the excess of heterozygosity within an individual, and FIII, which is based on the correlation between uniting gametes within an individual [29]) and --homozyg (runs of homozygosity, ROH, using a scanning window of 50 SNPs and recording a ROH when at least 100 SNPs are included across a total length of 1000 kb or more; the X chromosome was excluded). Pedigree-based inbreeding coefficients (FPED) were obtained from the studbook provided by the Finnish trotting and breeding association [30] or calculated based on the pedigrees from a web portal of a pedigree database of horses that are registered mainly in Finland [31] with a minimum of six generations backwards. Observed and expected heterozygosities were calculated with Arlequin v.3.5.1.3 [22]. Correlations between the different estimates were calculated using the Pearson correlation coefficient. Past and present effective population sizes were estimated with SNeP v.1.1 [32], using default settings, for horses of each breeding section, horses that were not in the studbook, all the Finnhorses and all data (including the mixed breed group). This program estimates the trends of the historical effective population size trajectories from SNP data based on linkage disequilibrium [32].
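The method-of-moments F coefficient mentioned above compares the observed number of homozygous genotypes in an individual with the number expected from the allele frequencies, F = (O − E) / (N − E), where N is the number of genotyped loci. The Python sketch below illustrates this idea under simplifying assumptions: expected homozygosity per locus is taken as 1 − 2pq under Hardy-Weinberg proportions, any small-sample correction that a real tool may apply is omitted, and the function name is hypothetical.

```python
def method_of_moments_f(individual, allele_freqs):
    """F = (observed hom - expected hom) / (loci - expected hom) for one individual.

    individual   : genotypes coded 0/1/2 (1 = heterozygote), None = missing
    allele_freqs : frequency p of the counted allele at each locus
    """
    observed_hom = 0
    expected_hom = 0.0
    n_loci = 0
    for genotype, p in zip(individual, allele_freqs):
        if genotype is None:
            continue  # skip missing calls
        n_loci += 1
        if genotype != 1:
            observed_hom += 1
        # Hardy-Weinberg expectation of homozygosity at this locus.
        expected_hom += 1.0 - 2.0 * p * (1.0 - p)
    if n_loci == 0 or n_loci == expected_hom:
        raise ValueError("F is undefined for these data")
    return (observed_hom - expected_hom) / (n_loci - expected_hom)


if __name__ == "__main__":
    freqs = [0.5, 0.5, 0.5, 0.5]  # hypothetical allele frequencies
    print(method_of_moments_f([0, 2, 0, 2], freqs))  # fully homozygous individual
    print(method_of_moments_f([0, 1, 2, 1], freqs))  # matches the expectation
```

Negative values arise when an individual carries more heterozygous genotypes than expected, which is why breed-level averages close to zero, as reported here, indicate little excess homozygosity.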

Population genetic structure
First, we used a Bayesian iterative algorithm implemented in the program Structure v. 2.3.1 [33] to investigate the presence of population genetic structure by placing samples into groups formed by individuals sharing similar patterns of variation. We used all the 65,157 SNPs and set the length of burn-in periods at 1000, used 10,000 MCMC replications and set the number of populations (K) at 1 to 10 for 10 iterations and used the admixture model with correlated allele frequencies. We performed the runs without LOCPRIOR, i.e. a priori knowledge of the sampled breeding sections or breeds. The Structure results were used to calculate Evanno's ΔK [34] in Excel. Furthermore, the iterations of the best number of populations (K = 2) were used to construct a barplot with the help of the output from the web-based program Structure Harvester v.0.6.94 [35], which was used as input for the program CLUMPP v.1.1.2 [36]. This program aligns the membership coefficients of the iterations. Then,

Table 1 Samples analysed with the Illumina Equine SNP70 BeadChip, and the obtained expected heterozygosity and inbreeding estimates
Note: if only one individual of a breed was analysed, the individual numbers and sizes of ROH and inbreeding coefficients are presented, but not averages. N = number of individuals; HE = expected heterozygosity; FIS = inbreeding coefficient based on the difference between expected and observed heterozygosities; mean number of ROH = mean number of runs of homozygosity extending over 1000 kb; mean size of ROH = mean size of ROH extending over 1000 kb; FI = inbreeding coefficient based on variance-standardized relationship minus 1; FII = inbreeding coefficient based on excess of heterozygosity; FIII = inbreeding coefficient based on correlation between uniting gametes; FPED = inbreeding coefficient based on pedigrees

Genetic variation and inbreeding
Analysis of the mitochondrial sequences resulted in the alignment of a 631-bp long fragment from 743 Finnhorses and 121 horses of other breeds (foals excluded). The best-fitting substitution model was TN93 + G + I, with a gamma shape parameter of 0.51 and proportion of invariant sites of 0.56 (BIC 34822.183 compared to the second best, GTR + G + I, 34843.342). Among all the horses, we detected 249 haplotypes and 158 polymorphic sites, with a nucleotide diversity of 0.022 (SD = 0.00016) and haplotype diversity of 0.982 (SD = 0.0015). Among the Finnhorses, we detected 203 haplotypes and 150 polymorphic sites, with a nucleotide diversity of 0.022 (SD = 0.00018) and haplotype diversity of 0.979 (SD = 0.0019). Among the horses from the four Finnhorse breeding sections, the diversity parameters showed little variation, with nucleotide diversities being highest for draught horses and lowest for trotters and haplotype diversity and θ being lowest for pony-sized horses (Table 2). Of the 18 haplogroups (A to R, as defined in [11]), all except O and K were present in the Finnhorses (Fig. 1; see Additional file 1). The sequenced part of the control region seemed to perform as well as the whole mitogenome for assigning the sequences into haplogroups. In riding horses, the most frequent haplogroups were L and M, in trotters B and Q, in draught horses B, C and M, and in pony-sized horses G and L (Table 3). Interestingly, two of the haplotypes identified in the Finnhorses belong to haplogroup F, which is present in the Przewalski's horse (Equus przewalskii). A BLAST search against GenBank for these Finnhorse sequences also resulted in best matches with the Przewalski's horse sequences (E-value: 0.0, identity: 626-627/630 bp). For the other breeds for which we had more than 20 representatives, Shetland ponies were in haplogroups D, G, I, L, M, N and Q, warmblood trotters in A, B, G, I, L and N and Pura Raza Españolas in B, G, L, N and Q (see Additional file 1).
Within each Finnhorse breeding section, the values of expected heterozygosity from the SNP data were lowest for the trotters (0.318) and highest for the pony-sized horses (0.326). Average FIS, estimated with PLINK, ranged from 0.003 (pony-sized horses) to 0.027 (trotters). FIS-estimates from Arlequin were slightly different, ranging from 0.004 (trotters) to 0.014 (pony-sized horses) (see Table 1).
Compared to all other breeds, we found the smallest number of ROH in the Finnhorse breed, with an average of 8.1, whereas it ranged from 11 to 38 in the other breeds studied here (Table 1).

Effective population sizes
The Bayesian skyline results show that the female effective population size increased until approximately 110 to 300 years ago (applying the maximum and minimum mutation rates, respectively), at which point it started to decrease with a steep decline that started about 30 to 110 years ago (applying the maximum and minimum rates, respectively), both estimates including the time of foundation of the breed (Fig. 2a). When samples from only the Finnhorse breeding sections were analysed, the steep decline was more pronounced, and during the last 50 years, the female effective population size almost halved from 18,200 (8700 with the maximum rate) individuals to 9900 (2800 with the maximum rate) individuals (Fig. 2b). The current female effective population size of the Finnhorse breed that is estimated from the complete data, including the individuals that are not in the studbook, is equal to 17,200 (4900 with the maximum rate) individuals. The effective population sizes estimated from the SNP data using SNeP were small, i.e. only 45 for trotters, 56 for riding horses, 43 for pony-sized horses, 52 for draught horses and 49 for horses that are not registered in the studbook. The effective population size for the entire Finnhorse data was equal to 161, and for the whole SNP data, i.e. including all the other breeds, 205. All the horses of the Finnhorse breeding sections as well as the horses that are not in the studbook showed a decline in effective population size over the complete estimated time period of 1000 generations. When the complete data was considered, the decline began slightly later, about 900 generations ago (Fig. 3).
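The logic behind linkage-disequilibrium-based estimates such as those above can be illustrated with Sved's classical approximation, E[r²] ≈ 1 / (1 + 4·Ne·c), where c is the recombination distance between two markers in Morgans. Solving for Ne gives Ne ≈ (1/E[r²] − 1) / (4c), and marker pairs at distance c are commonly taken to reflect the population roughly 1/(2c) generations ago. The Python sketch below shows only this basic relationship; it is not the SNeP implementation, it ignores sample-size and mutation corrections, and the function names are hypothetical.

```python
def ne_from_ld(mean_r2, c):
    """Effective size from mean r^2 at recombination distance c (Morgans), Sved's approximation."""
    if not 0.0 < mean_r2 <= 1.0:
        raise ValueError("mean r^2 must lie in (0, 1]")
    if c <= 0.0:
        raise ValueError("recombination distance must be positive")
    return (1.0 / mean_r2 - 1.0) / (4.0 * c)


def generations_ago(c):
    """Approximate number of generations in the past reflected by markers at distance c."""
    if c <= 0.0:
        raise ValueError("recombination distance must be positive")
    return 1.0 / (2.0 * c)


if __name__ == "__main__":
    # Hypothetical values: mean r^2 of 0.2 between markers 1 cM apart.
    print(ne_from_ld(0.2, 0.01), generations_ago(0.01))
```

Because closely linked markers inform about the distant past and loosely linked markers about recent generations, binning marker pairs by distance yields a trajectory of effective size over time.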

Table 3 Haplogroup frequencies in the four Finnhorse breeding sections
The two highest frequencies in each section are marked in italics. Haplogroup names A-R are after [9]

Genetic differentiation
In the mitochondrial DNA dataset, the pairwise ΦST-values were significant between riding horses versus trotters and draught horses, and between pony-sized horses versus trotters and draught horses. We detected no differentiation between trotters and draught horses or between pony-sized horses and riding horses (Table 4). For the SNP data, Evanno's ΔK suggested that a value of 2 was the best number of genetic clusters, which separated the Finnhorses from the mixed breed group (Fig. 4). Similarly, the principal components analysis supported the separation between Finnhorses and the other breeds (Fig. 5a). Among the Finnhorse breeding sections, trotters and pony-sized horses are separated along the x-axis (Fig. 5b), but this is not clearly reflected by the FST-values. The highest values were between the draught-type Finnhorses and all other groups, whereas the lowest values were between the riding and pony-sized Finnhorses (Table 4).

Genetic variation
Among our sample of 743 Finnhorses, we found 16 of the 18 previously described horse haplogroups, with only the European haplogroup K and the Middle Eastern haplogroup O missing. Interestingly, haplogroup F, which so far was detected only in the Przewalski's horse, was present in four of the Finnhorses analysed here, of which three were not included in any of the breeding sections and one was registered as a riding horse. To date, haplogroup F has not been found in any modern horse breed, which is unexpected since it was recently suggested that the Przewalski's horse, the only extant wild horse species, is derived from the early domestic horses of the Botai culture [38]. This result contradicted the findings of previous studies that have placed the Przewalski's horse as a sister taxon of domestic horses, with some possible gene flow from the domestic horse to the Przewalski's horse [39,40]. The existence of the Przewalski haplogroup F in the Finnhorse, if verified by sequencing whole mitogenomes,  Within the Finnhorse breeding sections, haplogroups L and M were most frequent in riding horses, haplogroups B and Q in trotters, haplogroups B, C and M in draught horses, and haplogroups G and L in pony-sized horses. All these haplogroups were considered as of European origin by [11] except for haplogroup Q, which has a more Asian or Middle Eastern distribution and is present, for example, in the Arabian horse. Mitochondrial DNA diversity in the Finnhorses (π = 0.022 and ĥ = 0.979) is relatively similar to that detected in many other horse breeds [41-43]. In the domestic horse, it has been suggested that the large number of haplogroups and haplotypes spread over wide geographic regions results from a large number of mares having been incorporated into the domestic horse population [11,12,44].
This ancestral polymorphism has seemingly been retained in the Finnhorse as well, which is probably due to the high level of maternal genetic diversity of the founding population of the Finnhorse. Contrary to a previous Bayesian skyline plot that was constructed from a mixture of breeds in [11] and showed a decrease in the female effective population size about 7000 years ago and an increase thereafter, the female effective size in Finnhorses began to decrease approximately 300 years ago. This decline was accentuated approximately 110 years ago, at the time when the breed was founded, and again approximately 50 years ago, when the breeding sections for trotters, riding and pony-sized horses were founded (Fig. 2a, b). This is possibly due to a founder effect connected with the selection of horses that were included in the breed and in the breeding sections.

Table 4 Mitochondrial Φ ST -values between the Finnhorse breeding sections estimated with the Tamura and Nei substitution model (below the diagonal) and F ST -values for the SNP-genotyped horses (above the diagonal) between the Finnhorse breeding sections and a group of mixed breeds
The present Finnhorse is closely related to the native Scandinavian, Estonian and Mongolian horses [6,13] and presumably to the Russian heavy Mezen horse and the native Lithuanian Žemaitukas horse [14]. In our study, the Finnhorses also cluster with the native Estonian horse and with the native British and Irish breeds (Fig. 5). These close relationships between eastern, southern and western native breeds might explain the presence of haplotypes in the Finnhorse that are found in breeds from Europe, Central Asia and the Middle East. Historically, King Gustav Vasa established stud farms in Finland as early as the 16th century, with horses being imported from the Netherlands and northern Germany, to increase the size of Finnish horses. These horses most likely had the same ancestry as the modern Friesian and Oldenburg breeds, including the Spanish ancestry that was used to create the Friesian breed [45]. During and after the Thirty Years' War (1618-1648), the Finnish cavalrymen who returned home brought with them horses from Central Europe and the Baltic region that were then bred with the Finnish horses. During the eighteenth and nineteenth centuries, some Arab horses were imported to Finland as well as Warmblood and heavy Ardennes horses from Sweden, and Orlov Trotters and 'cossack horses', possibly Don horses, from Russia [14,46]. By the early twentieth century, all this crossbreeding had resulted in three different types of Finnish horses: heavy draught-type horses, light and long-legged (race) horses, and light and tough pony-sized horses [17]. Although most of the imported horses were stallions because they had more breeding value, occasionally imported mares may have introduced their Central European or eastern mitochondria into the Finnish horse population. Thus, the history of the breed may have resulted in a high level of mitochondrial genetic diversity and in a large female effective population size.
Analysis of the nuclear diversity, measured as the expected heterozygosity based on SNP data, showed that its level in Finnhorses was similar to that of many other breeds, including, for example, the Akhal-Teke, Andalusian, Lusitano, Mongolian and Tuva horses [6], which all have 'moderate levels' of nuclear diversity (the highest level, 0.337, being found in Swiss Warmblood and Paint horses and the lowest, 0.239, in Clydesdale). The lowest and highest levels of diversity in our Finnhorse data were in trotters (HE = 0.318) and pony-sized horses (HE = 0.326), respectively. These estimates are slightly higher than the estimate obtained from 27 Finnhorses (0.301) in [6], where the dataset was pruned for linkage disequilibrium at r² > 0.4, compared to r² > 0.5 in our study, which might have had some effect on the estimates. Nevertheless, it is evident that the nuclear diversity of each of the Finnhorse breeding sections remains at a good level. Moreover, although ROH were longer in Finnhorses, there does not appear to be any strong inbreeding, since the numbers of ROH and inbreeding coefficients are smaller than in the other breeds that we studied here. Average inbreeding coefficients (FII) estimated in [6] varied from 0.015 in the Mongolian horse to 0.261 in Clydesdale, with a value of 0.052 for Finnhorses, whereas, based on a larger sample size, we found a lower estimate, i.e. 0.032. Although the criteria used for the detection of ROH vary among studies, the length of the genome covered by ROH can be roughly compared between various studies of horse breeds. Using a 50-SNP window and setting the minimum length of ROH to 1000 kb, we found that the mean genome length covered by ROH in the Finnhorse breeding sections ranged from 104.4 to 137.6 Mb, whereas it ranged from 92.0 to 517.0 Mb in the other breeds that we studied.
In a recent study [47], also based on a 50-SNP window but using another SNP chip to detect homozygous segments of more than 500 kb, the mean genome length covered by ROH was 305.1 Mb and ranged from 227.5 Mb in the Noriker breed (a heavy Austrian draught horse) to 396.5 Mb in Purebred Arabians. Another study [48], which used the previous version of the Illumina Equine BeadChip and a 50-SNP window to detect homozygous segments of more than 40 kb, found that the mean genome length covered by ROH ranged from 416.5 Mb in a Dülmen horse (a native German pony) to 953.2 Mb in a Thoroughbred.
Although we found no clear evidence of inbreeding, the effective population sizes for each breeding section and for the horses not registered in the studbook were very small (from 43 to 56) when estimated based on the nuclear SNPs. This small effective size can result from only having four founding stallions for the breed but may also stem from recent breeding practices; although such practices were designed to avoid inbreeding, only five stallions have made approximately 50% of the genetic contribution during the 2005-2014 period (estimated from pedigree data based on the expected proportion of alleles in an individual originating from an ancestor [49]). Currently, the number of matings for each stallion is limited to 150 per year [15], which could still be far too many if the aim is to increase the effective population size of the breeding sections and of the whole breed. Moreover, the overall number of stallions in each of the breeding sections is currently relatively small (218 stallions in harness trotters, 103 in riding horses, 64 in pony-sized horses and 30 in draught horses [15]), and the horses share many ancestors in their pedigrees. The effective population size of the entire breed obtained by combining all breeding sections and horses not in the studbook was 161, which is still a reasonably good level and might, in fact, favour the current breeding practice, which is that any registered Finnhorse is allowed to be included in the breeding sections provided that the section-criteria are met. Thus, the non-studbook horses serve as a large gene pool for the breeding sections.

Genetic differentiation and history of breed
Based on the SNP data, we found a clear differentiation between the Finnhorses and the other breeds. All the Finnhorses clustered together in the Bayesian clustering analysis and along the first principal component in the PCA, and were separated from the other breeds that we studied here. However, the differentiation between breeding sections was less clear. In the PCA, some clustering of the trotters and pony-sized horses was observed, but FST-estimates were significant only for the comparisons between the draught horses and the others (Table 4). Based on mitochondrial DNA, differentiation was strongest between the pony-sized horses and trotters, and overall, many pairwise comparisons were significant (Table 4). The only non-significant comparisons were between the trotters, draught horses and non-studbook horses and between the riding and pony-sized horses. The higher FST-values obtained based on mitochondrial DNA compared to SNP data are most likely due to the different mode of inheritance of these two marker types: mitochondrial DNA is haploid and maternally inherited and thus has one-fourth of the effective population size of the diploid and biparentally inherited nuclear SNPs. This intensifies the effect of genetic drift on mitochondrial markers, leading to faster differentiation. The mainly non-significant FST-values between the Finnhorse breeding sections suggest very weak, if any, differentiation among trotters, pony-sized horses and riding horses. The traits that are used as criteria for accepting individuals in the breeding sections are quantitative, possibly very complex and likely to have very variable heritabilities. Indeed, estimated heritabilities in the Finnhorse breed vary considerably from height at withers and at croup (0.89 and 0.90, respectively) to movement at walk and trot (0.13 and 0.18, respectively) [50].
The low heritabilities of many of the traits used, the acceptance of new individuals to breeding sections from the 'common gene pool' (i.e. non-studbook horses), the registration of several horses in more than one breeding section and the fairly recent founding of the breeding sections all weaken the differentiation among the breeding sections.

Conclusions
We found that the level of mitochondrial DNA variation was high in the Finnhorse, which supports the view that horse domestication in general involved a large number of mares and that the Finnhorse breed was founded using many local landraces. One of the haplogroups (F) that we identified in the Finnhorse breed has to date not been reported in domestic horses, but only in the Przewalski's horse. This observation opens up new avenues for the study of the domestication of the horse. We show that the female effective population sizes in the Finnhorse and its breeding sections are large, but that their nuclear effective population sizes are small and the genetic differentiation between the breeding sections remains low. Although inbreeding in Finnhorses is lower than in many other horse breeds, the small nuclear effective population sizes can be considered a warning sign. The current common breeding practice of using only a few studs, e.g., those that have been successful in harness racing or riding competitions, should be restricted and the number of studs used should be increased. This could be achieved by, for example, further restricting the number of matings per stud and by encouraging the use of studs from rare pedigrees.

Additional files
Additional file 1. Haplotype network of all the sequenced samples. White haplotypes (marked with an asterisk in the side panel) are from [11] and recovered from GenBank.