This study used two approaches to identify putative selective sweeps that could be associated with phenotypes that contribute to domesticability, biological types (adaptation, draught, meat and milk) and desirable morphologies, and that might have impacted the extent and distribution of variability within the genomes of South African cattle breeds. The first approach detected complete sweeps that indicate fixation of long haplotypes within breeds as suggested by Ramey et al. [13]. However, the effects of selection on the distribution of genetic variation can be confounded with patterns of genetic variation caused by demographic events such as the size, structure and mating pattern of a population [10]. To distinguish between the effects of selection and those of demographic events, Hayes et al. [38] suggested that the location of the detected loci should be investigated. For instance, demographic events may alter patterns of allele frequencies across the entire genome, while selection events are more likely to alter allele frequencies at the loci that are in close vicinity to the mutations that are under selection [38]. In addition, fixed long homozygous haplotypes can also occur due to strong inbreeding following a founder effect [38]; however, a study by Makina et al. [23] demonstrated that the level of inbreeding was relatively low within each of the breeds studied here. Long homozygous haplotypes in breeds that were not included in the design of the BovineSNP50 assay (e.g. Nguni and Afrikaner) could have been created by chance because of the SNP ascertainment bias, which would lead to a lower overall average MAF for the SNPs on the assay in these breeds. To partially counter this effect, the number of loci required to declare a selective sweep, N, was defined individually for each breed (Table 1) and a larger N was required for breeds with larger numbers of monomorphic and low MAF SNPs.
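The scan for complete sweeps can be illustrated with a minimal sketch. It assumes that SNPs are ordered by position on a single chromosome, that a sweep is declared when at least N contiguous SNPs each have a MAF below a threshold α, and that 47 kb is the median SNP spacing on the assay. The function names, the toy MAF values and the chosen α and N are hypothetical and are not the values used in this study.

```python
from typing import List, Tuple

# Median interval between SNPs on the BovineSNP50 assay, as quoted in the text.
MEDIAN_SNP_SPACING_KB = 47


def find_low_maf_runs(maf: List[float], alpha: float, n_min: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of runs of at least n_min
    contiguous SNPs whose MAF is below alpha (hypothetical helper)."""
    runs = []
    start = None  # index where the current low-MAF run began, if any
    for i, value in enumerate(maf):
        if value < alpha:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= n_min:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the chromosome.
    if start is not None and len(maf) - start >= n_min:
        runs.append((start, len(maf) - 1))
    return runs


def min_sweep_size_kb(n_min: int, spacing_kb: float = MEDIAN_SNP_SPACING_KB) -> float:
    """Smallest sweep that N contiguous SNPs can span: spacing x (N - 1)."""
    return spacing_kb * (n_min - 1)


# Toy MAF values for ten ordered SNPs (illustrative only).
maf = [0.30, 0.00, 0.01, 0.00, 0.00, 0.02, 0.25, 0.00, 0.00, 0.40]
runs = find_low_maf_runs(maf, alpha=0.05, n_min=5)
print(runs, min_sweep_size_kb(5))
```

In this toy example only the five-SNP run is reported; the two-SNP run is ignored. Raising N for a breed with many monomorphic SNPs makes such short chance runs less likely to be called.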
LD-based methods such as the long range haplotype, extended haplotype homozygosity and integrated haplotype score approaches can also be used to identify genomic regions with unusually long haplotypes that have a high frequency in the population [39]. These approaches are useful to identify variants that have undergone a partial or incomplete selective sweep, in which a new mutation has a frequency that has risen to a modest value in the population but has yet to reach fixation [40]; however, these approaches are somewhat sensitive to marker density, which was relatively low in this study. While the across-population extended haplotype homozygosity test can compare haplotype lengths between populations to control for local variation in recombination rate [41], signals of strong recent selection were analyzed within each breed.
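For readers unfamiliar with extended haplotype homozygosity (EHH), the following minimal sketch shows the quantity that these LD-based methods build on: the probability that two randomly chosen chromosomes carrying a given core allele are identical over the interval from the core SNP to a second SNP. It assumes phased haplotypes coded as 0/1; the function name and the toy haplotypes are hypothetical, and this is not an analysis that was performed in this study.

```python
from collections import Counter
from typing import Sequence


def ehh(haplotypes: Sequence[Sequence[int]], core: int, core_allele: int, end: int) -> float:
    """EHH between SNP index `core` and SNP index `end` among haplotypes
    that carry `core_allele` at the core SNP (hypothetical helper)."""
    carriers = [h for h in haplotypes if h[core] == core_allele]
    n = len(carriers)
    if n < 2:
        return 0.0  # no pairs to compare
    lo, hi = sorted((core, end))
    # Count how many carriers share each distinct haplotype over the interval.
    counts = Counter(tuple(h[lo:hi + 1]) for h in carriers)
    identical_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    total_pairs = n * (n - 1) // 2
    return identical_pairs / total_pairs


# Toy phased haplotypes (rows) over four SNPs (columns); illustrative only.
haplotypes = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
decay = [ehh(haplotypes, core=1, core_allele=1, end=e) for e in (1, 2, 3)]
print(decay)
```

EHH starts at 1 at the core SNP and decays with distance; a haplotype under recent selection decays more slowly than expected. With sparse markers the decay curve is sampled at few points, which is one reason why these methods are sensitive to marker density.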
The second approach detected genomic regions with high FST between African and European breed pairs using sliding windows throughout the genome [14] to reveal differentiation that could result from different selection histories for production or adaptation to local environments. However, such differentiation could be caused by drift. In contrast to the first approach, the FST approach can detect different types of selection signatures [40], which may explain why the two methods did not produce overlapping signals. One of the limitations associated with the first approach was the calibration relative to the size of the sweeps. While intensive selection in a small population can cause the rapid fixation of a long haplotype, weak selection in a large population would result in the fixation of only a short haplotype, which may not be identified with this approach [13]. Because of the requirement that each of the N contiguous loci should have a MAF less than α, for a small α, N was chosen to be sufficiently large so that the probability of observing N contiguous loci with a MAF less than α by chance alone would be very low and a sufficiently small chromosomal region was defined so that the targeted sweeps would not be smaller than 47 × (N − 1) kb, where 47 kb represents the median interval between SNPs on the BovineSNP50 assay [13]. Furthermore, the design of the BovineSNP50 assay led to lower average MAF and larger numbers of monomorphic SNPs for the Afrikaner and Nguni breeds, which are phylogenetically distant from the breeds that were used to discover the SNPs on the assay [25]. To adjust for this phylogenetic bias, N was individually defined for each breed (Table 1) and a larger N was required for breeds with larger numbers of monomorphic and low MAF SNPs. Finally, the ascertainment bias of common SNPs in the design of the BovineSNP50 assay might explain the inability to detect common sweeps among the Afrikaner, Nguni and Drakensberger breeds using the first analytical method.
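The sliding-window differentiation scan can be sketched as follows. This is a minimal illustration under stated assumptions: a simple two-population heterozygosity-based estimator with equal weights for both breeds, windows defined by SNP count rather than physical distance, and window values taken as the mean of per-SNP values. The estimator, window definition and allele frequencies are assumptions for illustration and may differ from those used in this study [14].

```python
from typing import List, Tuple


def snp_fst(p1: float, p2: float) -> float:
    """Per-SNP FST = (H_T - H_S) / H_T for two populations with allele
    frequencies p1 and p2 (equal weights assumed)."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity of the pooled population
    if h_t == 0:
        return 0.0  # SNP is monomorphic in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population heterozygosity
    return (h_t - h_s) / h_t


def windowed_fst(freqs_a: List[float], freqs_b: List[float],
                 window: int, step: int = 1) -> List[Tuple[int, float]]:
    """Return (start_index, mean FST) for each window of `window` SNPs."""
    per_snp = [snp_fst(a, b) for a, b in zip(freqs_a, freqs_b)]
    results = []
    for start in range(0, len(per_snp) - window + 1, step):
        chunk = per_snp[start:start + window]
        results.append((start, sum(chunk) / window))
    return results


# Toy allele frequencies for five ordered SNPs in two breeds (illustrative only).
african = [0.5, 0.9, 0.95, 1.0, 0.5]
european = [0.5, 0.1, 0.05, 0.0, 0.4]
windows = windowed_fst(african, european, window=3)
top_window = max(windows, key=lambda w: w[1])
print(windows, top_window)
```

Averaging over a window dampens the noise of single-SNP estimates, so a window stands out only when several neighbouring SNPs are differentiated, as expected around a selected locus. Drift can produce the same pattern, which is why high-FST windows remain candidates rather than confirmed signatures.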
Overall, this study detected 47 candidate genomic regions that are potentially either historically or currently under selection within and between six cattle breeds in South Africa. Twenty of these candidate genomic regions were detected within breeds and 27 were detected as regions that had diverged between breeds. In addition, 12 of these candidate genomic regions were shared between breeds and ten had previously been reported [13, 36, 42–44]. Furthermore, no putative selection signatures were predicted to be shared across the South African (indigenous and locally developed) and Bos taurus cattle breeds (Angus and Holstein), which is probably due to the different environmental and demographic forces to which these breeds were exposed during breed formation [2].
Domestication has caused considerable changes in the morphology and behaviour of livestock species, as has artificial selection for the specific traits that were selected during breed formation and subsequently for specific breeding objectives [17]. Coat colours are easily identifiable phenotypes that probably played an important role in selection before farmers gained access to objective measurements [17]. In certain breeds, such as Nguni, colour patterns have cultural connotations and coloured hides have different economic values [1]. The melanocyte stimulating hormone receptor gene (MC1R) on BTA18 between 14,757,060 and 14,758,700 bp, which influences the production of eumelanin and pheomelanin pigment and is responsible for the pigmentation of skin, eyes and hair [45], was found to be differentially selected between Holstein and Nguni cattle but not between the South African Afrikaner (red), Drakensberger (black) or Bonsmara (red) breeds. This could be due to specific alleles at the MC1R gene that are under selection in the Nguni breed. Ramey et al. [13] observed a sweep at MC1R in Hanwoo cattle, which are yellow. Furthermore, Stella et al. [43] and Flori et al. [46] reported that the MC1R gene was under selection in cattle. MC1R has been proposed to have three alleles, i.e. E^D for breeds with a black coat (e.g., Holstein, Angus and Murray Grey), e for breeds with recessive red coat (e.g., Limousin, Shorthorn and Hereford) and E^+, also called “wild type”, for all other breeds except Hereford [47]. The dominant E^D allele is responsible for black coat colour, whereas the recessive e/e genotype results in red coats. However, wild type E^+/E^+ homozygotes may display variable colour patterns, since other genes (e.g., Agouti) can influence the pigments produced [37]. The presence of a putative selection signature on MC1R in Nguni cattle, which are characterized by multi-coloured skin patterns that may present various forms (white, brown, golden yellow, black, dappled, or spotted), is of interest and suggests the existence of additional functional alleles at MC1R, as was also suggested by the presence of a sweep at MC1R in yellow Hanwoo cattle [13]. Identifying the mutations that underlie these signals would allow a better understanding of the role of MC1R in coat colour patterning in cattle.
Behavioural changes such as reduction in fear and anti-predator responses and increase in sociability are believed to have been selected during domestication [48]. This study detected several putative selection signatures that could be related to the development of the nervous system as well as the regulation of a wide range of tissue and cell functions including behaviour, for example, regions harbouring WNT5B, FMOD and PRELP (Afrikaner), CCR7 (Nguni), and OVOS and SLC6A17 (Bonsmara). The Bovine HapMap Consortium [6] and Gautier et al. [44] also reported selection signatures in regions that contain genes associated with the nervous system of cattle.
South African cattle are farmed in regions that are characterized by periodic drought, seasonal dry periods, and nutritional shortages in the natural veld and are subjected to a variety of external and internal parasites and stock diseases [1]. A number of candidate genes and gene families that were previously associated with one or more performance attributes of tropical adaptation [36, 44] have been selected in Nguni cattle. For example, keratin genes (KRT222, KRT24, KRT25, KRT26 and KRT27) and one heat shock protein gene (HSPB9) on BTA19 between 42,896,570 and 42,897,840 bp were found to be under selection. Heat shock proteins are differentially expressed between indicine and taurine cattle in the tropical environments of Africa and are associated with tropical adaptation in Zebu cattle [36, 44]. Keratins (heteropolymeric structural proteins) form the basis of the structural constituent of the epidermis during epidermal development. Epidermal development occurs in response to adaptation to different climatic and environmental conditions, including tick exposure [49]. In addition, keratins play a role in the formation of the hair shaft [50]. Skin colour and the thickness of the hair directly influence the thermo-tolerance of cattle that live in the tropics [51]. Nguni cattle have a smoother and shinier hair coat than European cattle breeds. Due to these characteristics, Nguni cattle regulate their body temperature and maintain cellular functions more efficiently during heat [20] and are also more resistant to tick infestation [19]. The absence of such signals in other local cattle breeds such as Afrikaner, Drakensberger and Bonsmara, which also display some ability to survive under extreme conditions [19], may be explained by the fact that the method based on FST is most efficient at detecting differentiation when the region is near fixation for alternate alleles in the breeds compared [39]. Thus, while these loci may be under selection in these breeds, the desirable alleles may still have intermediate frequencies. This agrees with the results of Muchenje et al. [19] and Marufu et al. [21], who reported that Nguni cattle were more resistant to ticks and could better survive extreme conditions than other local South African breeds.
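The sensitivity of FST to allele frequencies near fixation can be illustrated with a short worked example. It assumes a simple two-population estimator based on expected heterozygosities with equal weights for both breeds, which may differ from the estimator used in this study; the allele frequencies p_1 and p_2 are hypothetical.

```latex
\[
F_{ST} = \frac{H_T - H_S}{H_T}, \qquad
H_T = 2\bar{p}(1-\bar{p}), \qquad
H_S = \tfrac{1}{2}\left[2p_1(1-p_1) + 2p_2(1-p_2)\right], \qquad
\bar{p} = \tfrac{1}{2}(p_1 + p_2)
\]
% Near fixation for alternate alleles: p_1 = 0.95, p_2 = 0.05
\[
\bar{p} = 0.5,\quad H_T = 0.5,\quad H_S = 0.095,\quad
F_{ST} = \frac{0.5 - 0.095}{0.5} = 0.81
\]
% Intermediate frequencies: p_1 = 0.6, p_2 = 0.4
\[
\bar{p} = 0.5,\quad H_T = 0.5,\quad H_S = 0.48,\quad
F_{ST} = \frac{0.5 - 0.48}{0.5} = 0.04
\]
```

Under these assumptions, a locus at which the favoured allele has reached only an intermediate frequency yields an FST value that is hard to distinguish from the genome-wide background, even if it is under selection.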
Several candidate genes that are related to antigen recognition, which is a key process in the development of immune response, were identified as being under selection in this study, and include MTPN (Afrikaner), CYM (Afrikaner and Nguni), CDC6, CDK10, KCNB1 and TNS4 (Nguni), NDUFA12, ALOX15B, and ALOX12B (Bonsmara), and SLC25A48 and SERPINA3-8 (Drakensberger). The CD family of immune response genes was described by Meissener et al. [52] as being closely involved with molecular functions and pathways of the major histocompatibility complex (MHC). The TNFAIP8L2 gene has a major role in individual immune homeostasis [53] and the NDUFA12 gene, which has diverging allele frequencies between taurine and Zebu cattle, is associated with tick resistance. These observations are consistent with the tolerance of Afrikaner, Nguni, Drakensberger and Bonsmara cattle to various tick and parasitic diseases [19, 21]. Furthermore, candidate genomic regions that include the MTPN and PDPR (Afrikaner), DCC (Afrikaner, Nguni, Drakensberger and Bonsmara), OTX2 (Angus), DNAH2, TMEM88 and GUCY2D (Bonsmara), EBF1 (Nguni), and CXCL14 and SLC25A48 (Drakensberger) genes overlap with previously identified QTL that affect tick resistance and nematode tolerance in cattle.
Several candidate genes within the selected regions are indirectly or directly involved in reproductive pathways including spermatogenesis, ovulation rate, oestrus processes, testis development and prostaglandin development in cattle. These included OVOS2 (Bonsmara), ADIPOR2 (Afrikaner and Nguni), WC1 (Drakensberger and Bonsmara), RBBP8 (Bonsmara), SERPINA3-8, HOXC12 and HOXC13 (Drakensberger), and FBXL4 (Afrikaner and Nguni). It has been shown that all these breeds are able to reproduce under harsh environmental conditions; they are considered to be excellent dam lines for crossbreeding, with few calving difficulties [1], which supports the presence of putative selection signatures at loci involved in reproduction that probably occurred during the adaptation of these breeds to South African conditions. In addition, these regions overlap with previously reported QTL associated with reproduction in cattle.
Candidate genes related to growth and muscle development were also detected as being under selection, i.e. DDX19A, TMEM51, and MTPN (Afrikaner), IGFBP4 (Nguni), TGFB1 and KCNB1 (Drakensberger), MYO6, KIAA1797 and EFHD2 (Bonsmara), AJAP1 (Angus), and ATOX1 (Holstein). In addition, some of these regions overlap with previously identified QTL that are associated with stature, body weight and growth in cattle. Furthermore, some of the putative selection signatures detected in this study overlap with previously reported QTL that affect milk yield and quality (BTA3, 5, 10, 16 and 23), feed efficiency (BTA13, 16 and 18), fat thickness (BTA5, 18 and 19), marbling score and carcass weight (BTA3, 5, 16, 20 and 27) as well as somatic cell count (BTA3, 5, 7, 9, 18 and 22).
The overall goal of this study was to identify candidate genomic regions targeted by selection within and between the major cattle breeds of South Africa. The fact that 12 of the identified candidate genomic regions were shared among several of the breeds analysed in this study and that 10 were validated by previous studies reduces the probability of detecting false positives [13]. False positives that could have been introduced by the SNP ascertainment bias or the LD pruning in the FST analyses should be identified in future studies using the BovineHD BeadChip or sequence data. Results of this study provide insights into the genetic mechanisms that underlie traits of economic importance among cattle breeds in South Africa, in particular with regard to adaptation to tropical and subtropical environments via increased resistance to tick and parasite-borne diseases and enhanced reproduction and production potential.