Improving our knowledge about the within-breed diversity and the between-breed relationships and structure in cattle is fundamental for improving selection designs and breeds, understanding environmental adaptation, enhancing efficient use of the breeds and implementing conservation programs [32, 33]. With the development of high-throughput genotyping technologies, analyzing the genetic structure of livestock species has become feasible. However, to date, much of the effort has been devoted to dominant commercial breeds, with local breeds generally poorly studied , although they can represent outstanding genetic resources for the local economy. This study investigated the genome-wide structure of 30 Italian local cattle and two cosmopolitan breeds using medium-density genome-wide SNPs. Except for a very few breeds (e.g. Valdostana), we were able to sample all the local cattle breeds reared in Italy and officially recognized. Therefore, our analysis concerns the largest and most complete dataset available for Italian cattle breeds.
Although homozygosity may be overestimated for populations that were not included in the design of the SNP array (ascertainment bias) , the Illumina set of 50 K SNPs was highly informative for all the Italian cattle breeds analyzed here. Heterozygosity values, polymorphism levels and recent and historical Ne estimates were very consistent. Average MAF were homogeneous among the breeds, and, on average, SNPs were equally informative for all breeds, which is consistent with previous observations , even if at the individual level, the MAF of SNPs can vary considerably. The He and MAF values obtained in our study agree with the range of values that was reported in a study on the development and characterization of a medium-density SNP genotyping assay for cattle , and are similar to those observed in other European breeds  using the same SNP genotyping assay. This finding is likely due to the effect of the European ancestry of Italian cattle breeds . We observed a decline in Ne with time in all breeds, as previously reported for other cattle breeds [4, 12, 23]. In fact, estimated trends in Ne indicate that, in the past, the Ne of Italian cattle breeds was large. The size of cNe differed among breeds, and several Italian breeds displayed small values, which is consistent with their low genetic diversity and high genomic inbreeding coefficients. In agreement with our results, small cNe were reported also for Iranian native cattle breeds  (from 13 to 107) and the Belgian Campine breed . In the last 50 years, the number of individuals of these latter breeds has decreased dramatically, as is the case for Italian local breeds. With such small cNe, their inbreeding rate has increased markedly until recently, as shown by the detection of long ROH in several local Italian breeds. The small cNe inferred for most of the Italian local breeds, as well as the low values of their genetic diversity indices, may be also a consequence of population bottlenecks that occurred due to the geographic isolation of some farms and the reduced interest of breeders for these breeds . In contrast, the Sarda breed had a larger Ne at all generations, and the Sardo-Bruna had the largest cNe, which is probably due to admixture within these two breeds and crossbreeding with other cattle breeds. In general, strong selection pressures and use of artificial insemination (AI) are the main causes of small Ne in livestock. In local cattle breeds, selection pressure is usually very weak and AI is almost absent, but conversely, uncontrolled mating of related animals is common, and thus inbreeding and low genetic diversity are the most likely cause of their small Ne . No correlation was found between cNe and recent and historical Ne estimated 13, 20 and 80 generations ago. Estimates of Ne vary a lot with the method used . Moreover, it should be emphasized that these estimates can be strongly biased when the sample size is small, probably because of the LD generated by the sampling process . Actually, none of the formulas proposed to date for estimating Ne from LD provide reliable predictions , probably because they all rely on simplifications or assumptions. Therefore, these estimates (in particular the cNe) must be considered with care, even more so when the sample size is less than 15 animals . Anyway, our results are in agreement with other studies in cattle [4, 23].
Across all the breeds analyzed here, generally, the cosmopolitan breeds (Italian Holstein, Brown and Simmental, Limousin and Charolais) showed moderate levels of genetic diversity. Moreover, although some breeds (Rossa Siciliana, Sardo-Bruna, Sardo-Modicana, Agerolese) are endangered or have a small census population size, they did not show signals of low levels of genetic diversity. ROH analysis and Ne values corroborated these results, likely because of their admixed origins. However, the levels of polymorphism and genetic variability were lower in Pontremolese and Mucca Pisana than in all other breeds. These results are as expected since these two breeds experienced a stronger reduction in numerical size in the second half of the 20th century. Currently, the Pontremolese population includes less than 30 cows and the Mucca Pisana population is composed of nearly 200 cows. In both cases, the number of bulls is very small (2 and 10 for Pontremolese and Mucca Pisana, respectively). The low genetic variability observed for these breeds agrees with the theoretical expectation for populations, which have undergone a severe bottleneck . Genetic diversity is an intrinsic factor that influences the adaptive capacity and resilience of populations. This low genome-wide genetic variability in these two breeds, as that observed in other local breeds, may be also related with a lower adaptation potential, which could represent a threat to their long-term persistence . Therefore, our findings raise the possibility of a risk for the genetic diversity of Italian local cattle breeds, and the decrease in Ne should be taken into account and monitored.
Runs of homozygosity
Abundant genome-wide SNPs are particularly suitable for detecting genomic regions with reduced heterozygosity and, recently, an alternative method called runs of homozygosity (ROH) was implemented . Currently, ROH-based F estimates (FROH) are considered one of the most powerful approaches to detect inbreeding effects  and may also reveal recent population bottlenecks or signatures of directional selection [25, 43]. Because the parameters used to detect ROH vary among analyses, it is not easy to compare results from different studies on ROH. Moreover, ROH have rarely been estimated in local breeds. Notwithstanding, our results are in agreement with the values reported in other studies on cattle [4, 44,45,46]. Analysis of individual ROH may be useful for conservation programs, since animals that have high levels of ROH, as observed in some Pontremolese, Varzese-Ottonese and Mucca Pisana animals, could be excluded or assigned lower priority for mating purposes in endangered populations, to minimize the loss in genetic diversity and maintain or increase Ne . The length of ROH represents an important source of information on past and present demographic and genetic processes that shape the genetic diversity of livestock species . Since recombination events split long chromosome segments, long ROH are a consequence of recent inbreeding. In contrast, short ROH are indicative of relatedness dating back to more ancient times , which are in most cases not considered in an individual’s recorded pedigree. Thus, under the assumption that 1 cM = 1 Mb, a minimum ROH length of 4, 8 or 16 Mb implies a common ancestor 12, 6 or 3 generations ago, respectively . Thus, analyses that are carried out with ROH of different lengths allow us to estimate the distance between the current and base populations, and provide information on the age of inbreeding . Generally, all breeds showed long ROH, but for some local breeds (such as Varzese-Ottonese, Calvana, Pontremolese, Mucca Pisana) their number increased. These results are also corroborated by the higher FROH>16Mb obtained for Garfagnina, Mucca Pisana and Pontremolese. Therefore, strong evidence of recent inbreeding (3 generations ago) exists for these Italian breeds, due to the decrease in their Ne. In fact, isolation of breeds with a small population size increases the probability that identical segments are inherited. It is likely that the long ROH are also signatures of the extensive use of a few bulls within herds and of mating among relatives. If long ROH accumulate in the genome of some individuals, they could seriously impact the overall biological fitness . Indeed, long ROH can be enriched in genomic regions that carry deleterious mutations, and Kim et al.  showed that a strong relationship exists between the proportion of ROH in the genome and the number of individuals that carry deleterious homozygous mutations. Hence, the ROH levels estimated in our study may be informative to better understand the inbreeding history of the breeds analyzed. For example, this consideration is consistent with the management strategies known for the Pontremolese breed and its demographic decline that was reported ~ 30 years ago . Similar results were also reported in Spanish goats, in which the most abundant and long ROH were identified in breeds that are at the verge of extinction . On the contrary, the high FROH and low Ne values observed for the Italian Brown breed can be ascribed to the intense use of a small number of (closely related) bulls for artificial insemination. More generally, these results highlight that both ancient and recent inbreeding have impacted the genome of Italian cattle breeds and that several local breeds, in particular the four autochthonous cattle breeds of Tuscany (Pontremolese, Garfagnina, Mucca Pisana and Maremmana), have probably not undergone recent extensive crossbreeding since the long ROH in their genomes have not been broken down . This scenario probably reflects practices in the management of breeds with less controlled breeding that do not always prevent crosses between related animals, although they deserve special attention and conservation efforts. On the contrary, the limited occurrence of long ROH in Piedmontese indicates that this breed has benefited from proper breed management and has a sufficiently large Ne and thus a low degree of recent consanguinity. Therefore, we conclude that the genetic diversity indices, the effective population size and the genomic inbreeding levels were congruent with the protection status of the local Italian cattle breeds based on their reduced demographic size. The results also reflect the need to implement conservation programs, in particular for the breeds with a limited distribution.
Population structure and admixture
Determination of the structure and the genetic relationships of a population has proven useful in conservation programs and for developing suitable management practices [4, 15, 32,33,34,35]. To understand these aspects, we carried out an MDS and ADMIXTURE analysis, inferred TREEMIX ancestry graphs, calculated the Reynold’s genetic distances and the pairwise estimates of FST for the Italian local cattle breeds. Except for Podolian-derived breeds, the MDS grossly separated the breeds according to their genetic origin and/or to their geographical proximity between breeding areas. A previous study on breeds (Calvana, Chianina and Maremmana) reared in the ancient Etruria region (Tuscany, Central Italy), reported that they are genetically closer to Near Eastern than to European genetic stocks . This oriental genetic signature was observed also in modern Tuscan human populations, which have been shown to be genetically close to Anatolian and Middle Eastern human populations . To better understand the genetic relationships between Italian cattle breeds, by considering the possible connections with breeds/populations that are presumed to have contributed to the shaping of the current genetic background of some of them, we performed an additional MDS analysis among Italian and European cattle breeds. Clustering of the breeds is generally consistent with their geographical origin. The Northern Italian breeds are much closer to several European breeds than to other Italian breeds, which indicates a contribution of continental European ancestry in the formation of these Italian cattle breeds. Another observation was that, even on the European scale, populations of the same major commercial breeds cluster together, e.g. Italian Holstein and European Holstein, and Italian Brown and Brown Swiss. Moreover, our results are consistent with those of a mtDNA study , which showed that the Turkish breeds (such as the Anatolian Black breed), clustered with the Tuscany (particularly with Calvana and Chianina) and the Cinisara breeds. The observed separation between Podolian-derived breeds and Northern Italian populations corroborates previous studies on the genetic relationship of some local Italian cattle breeds using blood group systems and blood proteins . Our results were also corroborated by a recent study on mtDNA variation in different Podolian breeds , which revealed a genetic proximity between the Italian Podolian-derived and the Turkish breeds. Their findings also show that, generally, the values of haplotype diversity indices were lower in some Italian Podolian-derived (Calvana, Podolica and Maremmana) than in non-Podolian breeds, which is consistent with our results of He_hap.
The presence of a general North–South distribution of the genetic diversity along the Italian Peninsula was highlighted and confirmed by the low genetic differentiation (FST) among some local breeds from the same geographic area, such as between Sarda and Sardo-Bruna or among the Sicilian breeds. Previously, a similar geographical pattern was described by using a medium-density SNP array in Italian goat  and sheep breeds . Our results did not show any obvious relationship between the patterns of clustering and the productive aptitude of Italian cattle breeds, contrary to what was reported for sheep breeds . Moreover, no significant correlation between genetic differentiation and geographical distances was observed for Italian cattle breeds. Traspov et al.  showed that, in local pig breeds, there was no geographical gradient of the distribution of their genetic variability and suggested that, even for breeds that had close geographical origins (as was the case for the Mucca Pisana and Pontremolese breeds in our study), the implemented breeding schemes led to a high genetic differentiation. Moreover, similar results were also observed in pig populations from America , which are probably the consequence of complex genetic histories. Therefore, these results confirm that the admixture between geographically distant populations could be a major force in breaking regional genetic-geography concordance . Although the Northern and Northern-central Italian populations have different demographic histories and different breeding goals, our results show that they overlap in a cluster and they cannot be easily discriminated; furthermore, their MDS coordinates identified only small areas on the plot, as a consequence of the reduced within-breed genetic variability . In general, the MDS plot was consistent with the admixture analysis, in which some kind of hierarchical structuring was identified. The plot obtained by ADMIXTURE showed that at K = 8, some groups shared a substantial proportion of their ancestry, such as the Sicilian (Cinisara, Modicana and Rossa Siciliana), the Podolian-derived breeds and Northern and Northern-central Italian populations. This observation was also consistent with the TREEMIX results. It is worth mentioning that, on the one hand, the breeds that were the most homogeneous at the lower K values also displayed the lowest heterozygosity level, a phenomenon known as ‘inbreeding bias’ . On the other hand, for the breeds that, at the best K value (24), displayed less distinct clusters, there was no evidence for high levels of inbreeding due to their admixed origins, and this suggests that crossbreeding with other breeds occurred. Therefore, our results are largely consistent with the breeding history of the Italian cattle breeds, given that some breeds are the result of crossbreeding. For example, the Sardo-Modicana breed originated by crossing local Sarda cows with Modicana bulls  and the Sardo-Bruna breed by crossing local Sarda cows with Brown bulls. The genetic relationships between the Chianina and Calvana breeds are strongly supported by historical data and were previously investigated through AFLP (amplified fragment length polymorphism)  and microsatellite markers . Some Sarda animals were mixed with Piedmontese, which reflects possible admixture. Similarly, Rossa Siciliana showed a high level of gene flow with Limousin. In fact, crossbreeding between local breeds and meat breeds is common practice, to improve meat production but also for long-term breeding. Therefore, this practice explains the results on admixture between local and meat breeds. In addition, the genetic structure of the Agerolese breed was typical of a breed that showed admixture with other breeds, as confirmed by admixture events in Italian Holstein and Italian Brown, and in agreement with its genetic origins. It is indeed a dual-purpose breed, which originated during the 19th century from autochthonous cows crossed with Brown Swiss and Holstein–Friesian bulls . However, the limited number of males (e.g. only 13 males in natural service in the 2002)  that were available in the breeding program could result in a limited Ne size that can, in turn, lead to a strong impact of genetic drift. Our results also indicate that the genetic component of some commercial dairy breeds, such as the Italian Holstein, was relatively small in the Italian local breeds. In fact, the lowest FST values of the Italian Holstein was with the Burlina breed (0.066). Moreover, as illustrated by the MDS and ADMIXTURE results, the Italian Holstein breed is genetically distant from the local breeds. Nonetheless, differentiation between some local breeds (Barà-Pustertaler, Sarda, and Sardo-Bruna) and the commercial meat breeds (Limousin and Charolais) was very low (FST ~ 0.04), which suggests, as mentioned above, that the use of these breeds could have affected their genetic background. Among all the local breeds, Mucca Pisana was highly differentiated and presented only low levels of admixture with other breeds; this breed consistently demonstrated higher FST values with the other breeds analyzed. In a recent study on Russian cattle breeds, analogous results were reported for the Yakut cattle, which was the most divergent breed . For both Yakut and Mucca Pisana breeds, it is not clear why they are so divergent. It may be due to a combination of a small historical population size and a long history of isolation from other breeds. Our results also suggested that, compared to other breeds, the genome of the Mucca Pisana breed contains less genetic data from any other ancestral breed that it may have interacted with, which can be considered as a typical signal of inbreeding  and is consistent with the values of the genetic diversity indices. The long branch observed for this breed suggests that it is a differentiated and isolated population with a small Ne size. In fact, Mucca Pisana showed the smallest recent and past Ne estimates. In a study on sheep breeds, Kijas et al.  found that short branches were associated mainly to highly heterozygous breeds, while long branches were associated to much less heterozygous breeds. The above evidence for the Mucca Pisana breed was also confirmed by the MDS, ADMIXTURE, TREEMIX and FST analyses, and can be explained by its demographic history since it was reared for a long time in a geographically separate valley (Valle del Serchio) . Therefore, it is likely that this breed experienced reproductive isolation and reduced gene flow, and thus acquired a strong genetic identity. Since the Neighbor-Network analysis takes gene flow among breeds into consideration (reticulation), it may provide a more likely reconstruction compared to linear representations . Indeed the Neighbor-Network graph highlighted several clear clusters and relationships between breeds that originated from the same area, as shown in the MDS plot (Fig. 3). The sub-branches of the Neighbor-Network graph were also in agreement with the results of the admixture, genetic relationship and genetic diversity analyses within breeds. In fact, the length of each sub-branch reflected the results of recent and past Ne estimates. The reticulations towards the extremity of the networks also suggest increased genetic relatedness between breeds and therefore past hybridization events between these populations, and are consistent with several of the admixture events highlighted by TREEMIX between breeds and groups. Indeed, Neighbor-Net is a robust tool for reconstructing phylogenetic networks. Therefore, the low pairwise Fst values, shared ancestry, admixture events, and reticulations observed on the phylogenetic tree between some breeds, as well as between the Sicilian breeds, all suggest high levels of gene flow between these populations, and as well between some breeds originating from the same area. These results can be attributed to the fact that geographical proximity facilitates the gene flow, and that breeds from the same breeding area are more likely to have common ancestries.