Signatures of selection reveal candidate genes involved in economic traits and cold acclimation in five Swedish cattle breeds

Background Thousands of years of natural and artificial selection have resulted in indigenous cattle breeds that are well-adapted to the environmental challenges of their local habitat and thereby are considered as valuable genetic resources. Understanding the genetic background of such adaptation processes can help us design effective breeding objectives to preserve local breeds and improve commercial cattle. To identify regions under putative selection, GGP HD 150 K single nucleotide polymorphism (SNP) arrays were used to genotype 106 individuals representing five Swedish breeds i.e. native to different regions and covering areas with a subarctic cold climate in the north and mountainous west, to those with a continental climate in the more densely populated south regions. Results Five statistics were incorporated within a framework, known as de-correlated composite of multiple signals (DCMS) to detect signatures of selection. The obtained p-values were adjusted for multiple testing (FDR < 5%), and significant genomic regions were identified. Annotation of genes in these regions revealed various verified and novel candidate genes that are associated with a diverse range of traits, including e.g. high altitude adaptation and response to hypoxia (DCAF8, PPP1R12A, SLC16A3, UCP2, UCP3, TIGAR), cold acclimation (AQP3, AQP7, HSPB8), body size and stature (PLAG1, KCNA6, NDUFA9, AKAP3, C5H12orf4, RAD51AP1, FGF6, TIGAR, CCND2, CSMD3), resistance to disease and bacterial infection (CHI3L2, GBP6, PPFIBP1, REP15, CYP4F2, TIGD2, PYURF, SLC10A2, FCHSD2, ARHGEF17, RELT, PRDM2, KDM5B), reproduction (PPP1R12A, ZFP36L2, CSPP1), milk yield and components (NPC1L1, NUDCD3, ACSS1, FCHSD2), growth and feed efficiency (TMEM68, TGS1, LYN, XKR4, FOXA2, GBP2, GBP5, FGD6), and polled phenotype (URB1, EVA1C). Conclusions We identified genomic regions that may provide background knowledge to understand the mechanisms that are involved in economic traits and adaptation to cold climate in cattle. Incorporating p-values of different statistics in a single DCMS framework may help select and prioritize candidate genes for further analyses.

decreased rapidly in the early to mid-20th century due to the development of more efficient breeding programs using e.g. artificial insemination (AI), and to the increased competition from breeds with a higher milk yield [19]. However, due to the growing awareness of the value of local breeds as genetic resources, several actions have been adopted for their preservation. The native Swedish cattle breeds still display considerable phenotypic diversity in terms of coat color, body weight and size, milk content, and polled phenotype, as well as substantial genetic variation, as described by Upadhyay et al. [16]. Moreover, indigenous breeds may carry specific gene variants that contribute to adaptation to their local environment. Knowledge about the genomic regions that display signatures of selection in breeds adapted to different environments is of great value to design future breeding strategies, both in local breeds and in larger commercial breeds [20,21].
Well-documented studies on signatures of selection have shown that they can help identify polymorphisms and/or candidate genes that underlie economical traits in cattle breeds. As reviewed by Gutiérrez-Gil et al. [22], several genes have been identified, for example ATP binding cassette subfamily G member 2 (junior blood group) (ABCG2), which is associated with milk composition, coiled-coil-helix-coiled-coil-helix domain containing 7 (CHCHD7) and PLAG1 zinc finger (PLAG1) with body size, diacylglycerol O-acyltransferase 1 (DGAT1) with milk production, XK related 4 (XKR4) with growth trait, and melanocortin 1 receptor (MC1R) and KIT proto-oncogene, receptor tyrosine kinase (KIT) with coat color and spotting. Such findings are important to understand the mechanisms that explain the phenotypic diversity among breeds. However, these studies were often based on single statistical tests which have limited power to detect selection signatures. Thus, the purpose of our study was to combine multiple statistics of signatures of selection within a single DCMS framework [15] by taking the correlation between them into account to detect signatures of selection in the genome of Swedish indigenous and commercial cattle breeds with greater statistical power and higher resolution.

Sample collection and data quality control
Genotype data of Swedish cattle breeds obtained with GeneSeek ® Genomic Profiler High-Density Bovine 150 K (GGP HD150K) single nucleotide polymorphisms (SNPs) array that were previously described in a study by Upadhyay et al. [16] were downloaded from the DRYAD (https ://datad ryad.org/) public data repository. Five breeds including Fjällnära (n = 16), Fjäll (n = 23), Swedish Holstein-Friesian (n = 24), Swedish Red (n = 25), and Swedish Red Polled (n = 18) were retained for subsequent analyses (Table 1), whereas the remaining four breeds with a small sample size were discarded. All individuals from the five retained breeds, except one Swedish Red Polled individual born in 2016, were born between the mid-1970s and the early 2000s [16]. Inclusion of very close relatives such as full-sibs or parent-offspring-pairs was avoided as much as possible by using available information on the location of the farms and on pedigrees [16]. Detailed information about sample collection and DNA extraction is in [16]. This dataset overlapped partially with that used in the study by Johansson et al. [17].
Data quality control was performed using the PLINK v1.9 [23] software, separately for each of the two breed groups, including the northern breeds (Fjällnära and Fjäll) and middle-southern Swedish breeds (Swedish Holstein-Friesian, Swedish Red, and Swedish Red Polled) (see Table 1). SNPs with a call rate lower than 0.95 and a MAF lower than 0.05, and those for which the Hardy-Weinberg equilibrium Chi square test p-value was lower than 10 −6 were discarded. In addition, SNPs that were duplicated in the map file, or located on a sex chromosome, and/or had an unidentified position on the UMD3.1 assembly [24] were removed using the PLINK --exclude option. After merging the two datasets (using PLINK's --merge option), a subset of 105,362 mutual SNPs and 102 individuals (with an overall call rate of 99.92%) remained for subsequent analyses. It should be noted that within-group quality control was performed to provide high-quality SNPs for haplotype phasing as requested by SHAPEIT2 [25].

Principal component (PC) analysis
We used the PLINK --ibs-matrix command to estimate the identity-by-state (IBS) matrix between individuals. The output was used to perform a principal component (PC) analysis of genetic distances with the prcomp R function to visualize the distribution of samples, and the results were plotted by using R (https ://www.r-proje ct.org/).

De-correlated composite of multiple signals (DCMS)
In this study, five statistics including fixation index (F ST ) [3], haplotype homozygosity (H1) [5], modified haplotype homozygosity (H12) [5], Tajima's D index [6], and nucleotide diversity (pi) [4] were combined into a single DCMS framework [15] as described in Yurchenko et al. [20]. DCMS is similar to composite selection signals (CSS) [13] by combining p-values, with the difference that it takes the respective correlation between the various statistics into account [7,15]. The DCMS statistic can be calculated at position l as follows [15]: where p lt shows the p-value at position l for statistic t ; r it refers to the correlation between the test statistic of the i th and t th methods, and n is the total number of test statistics (combined) in the DCMS. The expression 1/ n i=1 |r it | is called weight factor, which ranges from 1/n to 1. For example, in a given dataset with n = 3 different uncorrelated ( r i =j = 0 ) test statistics (t), the weight factor will be 1 for each t and DCMS l will be the sum of the log ( (1 − p lt )/p lt ); if the three statistics are fully correlated ( r i =j = 1 ), the weight factor for each statistic (t) will be 1/3 , and DCMS l will be the average of log((1 − p lt )/p lt ). Indeed, the weight factors help avoiding excessive contribution of highly correlated statistics in the DCMS calculation.
(1) The focus is on conservation; said to be hardy; live weight of cows 350-600 kg [16] Haplotype-based H1 and H12 statistics Effective population size (N e ) is a required parameter in haplotype phasing. It was initially estimated, separately, for each of the two breed groups using the SNeP software [26] with the parameters described in [27]. SHAPEIT2 [25] was used for haplotype phasing of the autosomal genome, also separately for each breed group. In SHA-PEIT2, we set the parameters conditioning states to 400 (--states 400), and the N e to 108 (--effective-size 108) for the group of southern-middle breeds and 60 (--effective-size 60) for the group of northern breeds. A bovine genetic map [28] was used to accompany SHAPEIT2 for haplotype phasing in order to correct for the variation in recombination rate along the cattle genome. Then, using a customized R script, the phased haplotypes were transformed to the format requested by the H12_H2H1. py script (https ://githu b.com/ngaru d/Selec tionH apSta ts). This python script with a window size of 14 SNPs and step size of 1 (-window 14 -jump 1) was run to estimate the H1 and H12 statistics for each autosome and each breed, as described in [20].

Tajima's D and nucleotide diversity (pi) statistics
Both Tajima's D and pi statistics were estimated with the vcftools software [29]. Tajima's D statistics were calculated for each breed and chromosome, separately, using the --TajimaD function considering non-overlapping sliding windows of 300 Mb (--TajimaD 300000). The estimated D values for each 300-Mb bin were assigned to SNPs within that bin, and missing values were converted into zeros. Pi statistics were calculated for each breed and chromosome separately with the --site-pi function; and in order to reduce noise, the outputs were smoothed for each chromosome through the R's runmed function with a window size of 31 SNPs (k = 31, endrule = "constant") as described in [20].

Fixation index (F ST )
Fixation index, which is a measure of population differentiation, was calculated for each SNP and each breed against the remaining samples from the other breeds using PLINK --fst and --within functions. F ST values less than 0 were converted into zeros and the statistics were then smoothed using the runmed function as described for the pi statistics.

Gene annotation
Genes that were located in genomic regions including consecutive SNPs with a q-value lower than 0.05 were considered as statistically significant intervals and the boundary for each interval was determined by searching for the first flanking SNP showing a q-value higher than 0.1. Then, the protein coding genes were extracted from the significant regions based on the UMD3.1 bovine reference genome assembly [24]. Finally, we performed an extensive review of the literature to annotate functions of the identified genes.

Results
As illustrated in Fig. 1a, five cattle breeds that originated from the northern, middle and southern parts of Sweden were included in our study. According to the PC analyses ( Fig. 1b), the first PC clearly differentiated our dataset into two clusters of breeds, i.e. the northern breeds (Fjäll and Fjällnära) and the middle-southern breeds (Swedish Holstein-Friesian, Swedish Red Cattle, and Swedish Red Polled). PC analysis of these cattle breeds (Fig. 1b) led us to remove three samples that were located outside of their expected breed cluster. Therefore, 99 individuals (genotyped for 105,362 SNPs) remained for analyses of signatures of selection.

De-correlated composite of multiple signals (DCMS)
After calculation of the within-breed DCMS statistics for 105,362 SNPs, p-values were fitted to a normal distribution and corrected for multiple testing (FDR < 0.05). We identified 58, 37, 38, 39, and 51 genomic regions for the Fjällnära, Fjäll, Swedish Holstein-Friesian, Swedish Red, and Swedish Red Polled breeds, respectively (Table 2) and [see Additional file 1: Table S1]. The average length of the significant regions ranged from 142.3 (± 187.1) kb in Swedish Red Polled to 315.4 (± 400.8) kb in Swedish Red Cattle ( Table 2). The total length of the genome covered with significant signals and the number of identified protein coding genes between breeds ranged from 72.6 to 134.1 Mb and from 61 to 99, respectively (Table 2). Overall, 355 unique protein coding genes were detected within the significant intervals for all  (Table 2) and [see Additional file 2: Table S2], among which several were detected in more than one breed [see Additional file 3: Table S3].
The distribution of the regions of signatures of selection across the genome of the five Swedish breeds is represented in Fig. 2 2) and [see Additional file 1: Table S1].
Gene annotation of the identified regions detected various verified and novel candidate genes that are associated with a diverse range of traits including high altitude adaptation and response to hypoxia (DCAF8,  Table 3).

Discussion
Both the economic and fitness-related traits that were highlighted in the current study are quantitative traits, which are generally affected by many genes, most of which have small effects. Selection for complex traits may occur simultaneously across many loci (with less intensity), which would leave weak signals across the genome [35]. In spite of this, selection for polygenic traits with some alleles that have a large effect may leave detectable signals. In the current study, genomic regions that were under putative selection in five native and commercial Swedish dairy cattle breeds from different parts of the country were studied. Selection signatures were identified by decomposition of p-values of five test statistics, instead of using single statistical tests separately or focusing on regions detected in more than one test. According to previous studies, composite measures of signatures of selection can provide an unbiased criterion to identify variants under selection more precisely [15,36]. Therefore, by using decomposition of p-values to identify signatures of selection, candidate genes can be identified with higher power and greater precision for future research in medicine, agriculture, and livestock breeding.
The clear clustering of breeds into two groups (mountain breeds vs. middle-southern breeds) as shown in the PC plot presented here was not in complete accordance with the results of Upadhyay et al. [16] who could not clearly differentiate all the Swedish cattle breeds in separate clusters based on the first PC. This could be attributed to the fact that a larger number of Swedish breeds, i.e. representing an additional source of variation, were included in their study [16]. In addition, a recent study showed that the Swedish Fjäll cattle are closely related to the Northern, Western and Eastern Finn cattle and Icelandic cattle [37]. However, the map of signatures of selection reported here is in line with previous studies [16,38,39] that showed that Swedish Red Polled and Fjäll are two distinct breeds sharing no close genetic relationship with each other.
The historical background and usage differ between Swedish cattle breeds. Sweden spans about 1572 km from north to south [40] and climatic conditions for dairy production range from the subarctic cold climate in the more mountainous north, to the continental climate in the more densely populated southern part [41]. According to our results, none of the detected regions with identified genes was significant in all the five breeds analyzed. However, significant genomic regions harboring genes were associated with the polled phenotype, body size and stature, resistance to gastrointestinal nematodes, response to hypoxia, growth and feed intake, feed efficiency, and reproduction overlapped between some of the breeds [see Additional file 3: Table S3]. The limited overlap between the detected putative regions across these Swedish cattle breeds suggest that some breedspecific selection occurred in the past. Nevertheless, in our study of signatures of selection, we found several genomic regions and genes that are involved in economic traits, such as milk production, growth, feed efficiency, reproduction and cold acclimation and that can be used for mapping causal mutations, identifying candidate genes and making better selection decisions in future cattle genetic improvement programs.

Signatures of potential human-mediated selection
In the commercial breeds, which originate from southern Sweden, such as the Swedish Red and the Swedish Holstein-Friesian, production of large quantities of milk has been economically advantageous. The old Swedish Friesian cattle that was rather a dual purpose breed was gradually transformed into the Swedish Holstein cattle from the 1970s to the mid-1990s, when imported Holstein bulls were used to improve milk yield and udder conformation [42]. Thus, the Swedish Holstein-Friesian became a more distinct dairy breed type, with a tall and lean conformation. Interestingly in our study, genes that are known to be associated with these traits were identified for this breed.   [43][44][45][46][47]. The genes that we identified within this region, including PLAG1, CHCHD7, MOS, RPS20, and LYN, are known as major genes with a role in both human height and cattle stature [44,46,[48][49][50][51][52]. This region (i.e. BTA14:24.4-25.4) is also known to affect a variety of reproduction-related traits such as maturity index, scrotal circumference, age at puberty in Brahman cattle [53,54], and blood levels of IGF1 in Brahman and Tropical Composite cattle [55]. The PLAG1 gene is significantly correlated with the lactation phenotype [56], which suggests that this region has a major pleiotropic effect in various cattle breeds. Other genes were also identified such as GBP4, FGF6, TIGAR , CCND2 (involved in body weight and stature), GID4, ATPAF2, ELOVL3, NFKB2 (involved in milk yield and components), TMEM68, TGS, LYN, CEP72, SLC9A3 (involved in growth and feed efficiency) (see Table 3). The modern Swedish Red Cattle is a dairy breed, but historically it was important for beef production in Sweden, and still shows today a better average carcass conformation classification and a slightly higher carcass fat content than Swedish Holstein [57]. The Swedish Red Cattle breed is also known for producing higher concentrations of milk fat and protein than Swedish Holstein [58]. Our results also highlight putative signatures of selection in regions on BTA13 and 14 related to carcass quality and milk composition in Swedish Red Cattle. One of the interesting genes under putative selection is CRH (top ranked gene in the DCMS analysis), which is consistent with the breeding history of this breed. CRH, which is located at 31.49 Mb on BTA14, plays an important role in several physiological and biological pathways regarding the stimulation of ACTH (adrenocorticotropin) secretion, that up-regulates cortisol [59]. The cortisol hormone stimulates gluconeogenesis in liver and lipolysis in adipose tissue, and inhibits glucose consumption in adipose tissue and muscle [59]. Polymorphisms in the CRH gene are associated with marbling score [59], growth and carcass yield [60], and milk production [61,62], which suggests a pleiotropic effect of this gene. FOXA2 is another identified gene that is known to be one of the important transcriptional activators and plays a role in the regulation of energy homeostasis and feeding. FOXA2 is significantly associated with chest girth, body weight, and growth traits in Chinese cattle [63]. Other interesting genes were identified in Swedish Red Cattle such as PLAG1, XKR4, and IMPAD1, which are positional candidate genes for pleiotropic QTL for growth traits [64]. According to the literature, PLAG1 is associated with carcass weight, stature, body weight and milk NFKB2 (8) Milk content [128] q-value: q-value of the most significant SNP within the significant genomic region Breed: breed names are shown in abbreviated form (full names are in Table 1) Candidate genes: candidate genes within significant genomic regions

RFI Residual feed intake
The full list of genes is in Additional file 2 Ghoreishifar et al. Genet Sel Evol (2020) 52:52 yield [45,46,[65][66][67][68], XKR4 with birth weight, growth, and feed intake [51,67,69], and IMPAD1 with carcass weight, stature and body weight [56,65]. In native breeds from remote mountain areas in northern Sweden, which are located far from commercial dairies, the production of milk suitable for storable dairy products such as cheese was most valuable in the past. A putative signature of selection was identified in breeds from the northern region (i.e. Fjäll and Fjällnära) containing the first ranked gene, i.e. FCHSD2 and the gene AQP7 in the Fjäll breed. A meta-analysis on French dairy cattle breeds reported AQP7 and FCHSD2 as candidate genes within QTL regions on BTA8 and 15, respectively, associated with milk fat percentage [70]. We also identified two other genes (UCP2 and UCP3) in the Fjällnära breed, for which significant associations were reported between a polymorphism in UCP2 and calving interval in dairy cattle, and between polymorphisms in UCP3 and production and fat content in dairy cattle [71] and carcass phenotype in beef cattle [72,73]. Gene expression analyses also revealed an association between UCP2 and residual feed intake in Nellore cattle [74].

Body size and cold acclimation
Compared to the commercial southern breeds, and in addition to selection pressures for milk yield and composition, historically the native breeds had to cope with cold weather, in some high mountain regions, a limited amount of feed, and even starvation during winter [75]. These breeds generally have a smaller body size (see Table 1). The DCMS analyses identified potential evidence of adaptation to the cold climate of the mountain regions characterized by a very limited amount of feed supply (see Table 1). We identified several genes including CCND2, FGF6, TIGAR , KCNA6, NDUFA9, AKAP3, C5H12orf4, and RAD51AP1 on BTA5 (BTA5: 105.75-106.52) that have been under strong selection (q-value 2.6E − 13) in Fjällnära cattle. Interestingly, a genomewide association study (GWAS) on cattle populations from Siberia showed associations between these genes and body measurements [76]. In a gene expression study of bovine QTL, TIGAR was found to be significantly differentially expressed and associated with body weight [77]. Moreover, a meta-analysis of GWAS studies for cattle stature reported CCND2 as the second most significant gene regulating stature in mammals [78]. Another gene, CSMD3, which encodes a transmembrane protein [79] was reported to be associated with body measurements in cattle [80].
Body size and shape vary among breeds [2,81]. Whereas most bovine breeds from temperate regions with access to a good supply of high-quality nutrients have a larger body size, breeds that live in high altitude areas with a limited amount of feed often have a smaller body size. Our findings also agree with the Geist theory [82], according to which food availability per animal during the growing season is a determining factor for body size evolution in mammals. We hypothesized that small body size was subject to natural selection in Swedish northern breeds since it probably allowed them to develop hardiness and endurance characteristics necessary for searching food and water over long distances. This is supported by a previous study that compared the grazing pattern of a native cattle (Fjäll) with that of a commercial breed (Swedish Holstein-Friesian) and showed that the Fjäll breed explored across extensive regions with an inclination towards various types of plants [83]. In addition, shortage of stored feed during winters in years with bad harvests or extra cold and long winters occurred before the 20th century. Such historical scarcity of feed, especially towards the end of long winters when animals were kept indoors, probably favored animals with a small body weight because this is associated with lower maintenance requirements [84]. As shown in Table 1, Fjäll cattle have a small body size, which is only a bit larger than that of the Fjällnära breed. However, we did not identify genes associated with body measurements in the Fjäll breed, possibly because they have been under stronger artificial selection for milk production and kept in areas of northern Sweden with less harsh conditions.
We also identified genes under selection in the northern breeds, including PPP1R12A, AQP3, AQP7 and SLC16A3 in the Fjäll breed, and UCP2, UCP3, ACAA2 and TIGAR in the Fjällnära breed, which are known to be related with altitude adaptation and response to hypoxia. Hypoxia or hypoxic stress is described as a decline in oxygen levels below the normal levels of 20.8 to 20.95% and results in metabolic adaptation at both the cellular and organism levels. Phosphorylation of the PPP1R12A (also called MYPTI) gene is known to increase as the level of oxygen decreases [85,86]. In a study comparing Ethiopian sheep breeds that are adapted to different ecological regions, Edea et al. [87] identified four genes including PPP1R12A that showed signatures of selection likely related to high altitude adaptation. Gene expression analyses revealed SLC16A3 as one of the significant up-regulated genes (p-value = 1.77E − 02) in response to high altitude adaptation of sheep fetal carotid arteries [88]. For the other northern breed, Fjällnära, we found that the four genes under putative selection, UCP2, UCP3, ACAA2 and TIGAR , are involved in the response to hypoxia (GO:0001666) and other related Gene Ontologies. The genes identified by the DCMS analysis suggest that the Fjällnära and Fjäll breeds are probably adapted to the high altitude mountains. However, it should be noted that the highest mountain reaches ~ 2000 m in Sweden, and less than 2500 m in Scandinavia (Norway), and that these breeds have been maintained at lower altitudes. For example, the remaining Fjällnära cattle were sampled on farms located at 485 to 682 m above average sea level. Thus, their adaptation to high altitudes may have occurred before their recorded history in Scandinavia. A functional gene network analysis, AQP3 and AQP7 (and AQP5, another gene of the aquaporin family, which we did not identify here but was reported in a study on Russian cattle breeds by Yurchenko et al. [20]) were also identified as candidate genes for the regulation of thermal adaptation via the transport of water, glycerol, and small solutes across cell membranes [89].

Signatures of selection for disease resistance genes
Until the end of the 19th century, the northern cattle grazed on woodlands in the summertime and had to be guarded from predators, which is why they are assumed to have developed hardiness and robustness to cope with their local environment [90,91]. In concordance with the historical background of these breeds, our results suggest selection for stress response and immunity-related traits. Our results revealed an overlapping putative region under selection including the FCHSD2, P2RY2, P2RY6, ARHGEF17 and RELT genes in the Fjällnära and Fjäll breeds. Moreover, the P2RY6, ARH-GEF17 and RELT genes were also identified in Swedish Red Cattle [see Additional file 3 Table S3]. FCHSD2, RELT and ARHGEF17 have already been identified in sheep as candidate genes that play a role in the serum levels of immunoglobulin A (IgA), which is an indicator of resistance to gastrointestinal nematodes [92]. In addition, two other genes, CYP4F2 and SLC10A2, were identified on BTA7 and 12 in Fjäll Cattle. In cattle, CYP4F2 was reported to be involved in the host resistance to the intestinal worm Cooperia oncophora [93] and to gastrointestinal nematodes [94]. Hempel et al. [95] showed that SLC10A2 was among the top five most significantly up-regulated genes (q-value = 0.000897) in a comparison between infected vs. uninfected cows with Mycobacterium avium subsp. paratuberculosis (an intracellular bacterium). Paratuberculosis infections are very rare in Sweden, but this gene could be involved in the response to infections by other bacteria or selected for other reasons related to its role in the uptake of intestinal bile acids. Our findings suggest a possible selection for adaptive immunity genes in these Swedish indigenous breeds. These genes explain a small proportion of the variance [92], which suggests that resistance/susceptibility to gastrointestinal nematodes is a complex trait that is determined by multiple groups of gene networks rather than the effect of an individual gene. However, since the P2RY6, ARHGEF27, RELT, FCHSD2, CYP4F2 and SLC10A2 genes appear to be under selection in the Swedish indigenous breeds, it would be interesting to analyze them in more detail. Moreover, future studies should investigate whether phenotypic variation in susceptibility to gastrointestinal nematodes or other important traits exists between these breeds.
The four CHIA, CHI3L2, PRDM2 and KDM5B genes that we identified on BTA4 and 16 in the Swedish Red Cattle were previously reported as disease resistance genes (i.e. to gastrointestinal nematodes or bacterial infection) in independent studies on sheep and cattle breeds [96][97][98][99]. In Swedish Red Polled, we found that the TMEM37 gene (BTA2: 71.46-71.66), known to be involved in the resistance to gastrointestinal nematodes, ranked as the closest gene to the most significant SNP showing a signature of selection (q-value = 0.0063). Li et al. [94] reported a set of 64 candidate genes that displayed significant overexpression at a high 5% FDR threshold level for resistance to gastrointestinal nematodes and included TMEM37. Similarly, we detected SPSB1 (BTA16: 44.96-45.25), which was previously reported to be under selection in a meta-assembly of studies on signatures of selection in various cattle breeds [100], and in another analysis of bovine signatures of selection [101]. SPSB1 (along with SPSB2) regulates the amount of nitric oxide (NO) produced via the induction of NO synthase [102]. Nitric oxide is one of the main contributors of reactive nitrogen known to have a major role in host defense by killing intracellular pathogens [102]. In Swedish Holstein-Friesian, the GBP6 gene identified on BTA3 (BTA3: 53.73-55.25), is part of the six top genes that are known as key players in immune response to Mycobacterium avium subsp. Paratuberculosis [103].
A relatively small number of genes (n = 61) was identified in significant genomic regions of the Swedish Red Polled breed (Table 1). More importantly, by reviewing the literature, we found only a small number of genes in the detected selection signature region in this breed that were associated with acclimation or economic traits (Table 3). This could be attributed to the rapid reduction of N e of this breed (genetic bottleneck) with only 20 cows still present in 1979 [104], which suggests that strong genetic drift may have occurred in this population, although the genetic variation in this breed increased in the last decades by the importation of red polled cattle from Norway and Finland. Schlamp et al. [8] reported that the identification of signatures of selection in small populations can be problematic, since they can be weakened by the strong genetic drift that occurs in such small populations.

Conclusions
In total, 108 genomic regions (including 355 unique protein-coding genes) that display signatures of selection in Swedish indigenous (northern breeds) and commercial (southern origin) cattle breeds were identified by incorporating p-values of different statistics in a single DCMS framework. These signatures of selection are in line with the history of these breeds. Some of these signatures of selection are located in regions carrying genes associated with economic traits such as milk yield and components, and growth and carcass traits, thus describing response to human-mediated selection in commercial breeds. Other signatures of selection that were detected in regions that harbor genes for body size may be involved in the adaptation of these native breeds to the specific climate of the northern part of the country, which is characterized by cold weather and, historically, by a limited amount of feed during the winter. For example, four genes that are involved in body size traits were subject to selection in both Fjällnära (a native mountain breed with a small body size) and Swedish Holstein-Friesian (a commercial breed with a larger body size). Indeed, our results may provide background knowledge to better understand the genetic mechanisms that are involved in economic traits and adaptation to local climate. Additional high-resolution studies focusing on the putative regions of signatures of selection and using larger sample sizes and recorded phenotypes may help pinpoint the candidate genes under selection.
Additional file 1: Table S1. List of genomic regions under putative selection (q-value < 0.05) identified using the DCMS (De-correlated composite of multiple signals) method for each breed. The first column represents the abbreviated name for each breed, and the second column shows the number (ID) of each genomic region identified as under putative selection within each breed.
Additional file 2: Table S2. Full list of protein coding genes identified (using UMD3.1 genome assembly) from the genomic regions under putative selection. The 1 st column represents the abbreviated name for each breed, and the 6 th column represents the number (ID) of significant genomic region under putative selection (within each breed) and the rank of each gene within that region (i.e. the value 1 represents the closest gene to the most significant SNP within each region).
Additional file 3: Table S3. Mutual genes identified among breeds.