Genetic variants in mammary development, prolactin signalling and involution pathways explain considerable variation in bovine milk production and milk composition

Abstract

Background

The maintenance of lactation in mammals is the result of a balance between competing signals from mammary development, prolactin signalling and involution pathways. Dairy cattle are an interesting case study to investigate the effect of polymorphisms that affect the function of genes in these pathways. In dairy cattle, lactation yields and milk composition (for example protein percentage and fat percentage) are routinely recorded, and these vary greatly between individuals. In this study, we test 8058 single nucleotide polymorphisms in or close to genes in these pathways for association with milk production traits and determine the proportion of variance explained by each pathway, using data on 16 812 dairy cattle, including Holstein-Friesian and Jersey bulls and cows.

Results

Single nucleotide polymorphisms close to genes in the mammary development, prolactin signalling and involution pathways were significantly associated with milk production traits. The involution pathway explained the largest proportion of genetic variation for production traits. The mammary development pathway also explained additional genetic variation for milk volume, fat percentage and protein percentage.

Conclusions

Genetic variants in the involution pathway explained considerably more genetic variation in milk production traits than expected by chance. Many of the associations for single nucleotide polymorphisms in genes in this pathway have not been detected in conventional genome-wide association studies. The pathway approach used here allowed us to identify some novel candidates for further studies that will be aimed at refining the location of associated genomic regions and identifying polymorphisms contributing to variation in lactation volume and milk composition.

Background

There have been many attempts to identify the genes that control milk production and functional traits in dairy cattle, since these traits have high economic value [1, 2]. Linkage studies and genome-wide association studies (GWAS) have led to the identification of a handful of causative mutations that affect milk production traits in dairy cattle [3–7]. However, the mutations that underlie most of the genetic variation remain elusive, reflecting the fact that the majority of these mutations are likely to have small effects and, therefore, individually explain a small proportion of the genetic variance [8, 9]. New methods are needed to analyse the large quantity of genetic information provided by high-density SNP (single nucleotide polymorphism) panels in order to identify novel genetic variants that have a functional role in lactation traits.

One potential approach is to first filter genetic variants for association analysis by considering pathways of genes that are likely to be involved in lactation. The advantage of this method is that less stringent significance thresholds can be used than in traditional GWAS, since the level of multiple testing is not as high. This also means that associations of smaller effect can be detected. However, the approach does have the limitation that any mutations that affect the traits outside the selected pathways will be missed, which means that the variation we can identify may be reduced compared with that from whole-genome studies.

For dairy traits, genes that are involved in mammary gland development, prolactin signalling and involution pathways are relevant candidates. Genes in the lactation pathway have been well described but are largely inferred from mouse studies [10–13]. Development of the mammary gland (or mammogenesis) involves the formation of the rudimentary mammary structure before puberty and is triggered by secreted signalling proteins and transcription factors that regulate developmental processes, such as the Wnt, Notch and Hedgehog signalling pathways [12]. When the mammary structure begins to form, genes for growth hormone and proteins involved in basement membrane architecture are expressed. At puberty, the concentration of several hormones increases and stimulates the formation of alveolar buds [14]. Prolactin signalling is vital for lobulo-alveolar development and establishment of lactation but appears less important after teat formation in dairy cattle [15, 16]. One hypothesis is that in cattle, prolactin may be more important for immune support at calving [17]. Prolactin interacts with its receptors to trigger paracrine signalling mechanisms through a highly regulated feedback mechanism involving JAK/STAT and MAP kinase activity, as well as other downstream targets, which in turn regulate proliferation and cell differentiation [14]. In involution, milk-producing epithelial cells are removed via cell detachment and apoptosis. Cytokines, interleukins and matrix metalloproteinases (MMP) are involved in complex signal transduction cascades to regulate proliferation and apoptosis in this pathway. The mammary epithelium undergoes several rounds of proliferation, differentiation and apoptosis over up to eight lactations in dairy cattle [18].

These processes are regulated by a number of genes, which represent excellent candidates for harbouring mutations that explain part of the observed variation in milk production traits and thus link genetic variation with the biological mechanisms underlying the phenotype. In this study, we assembled sets of genes involved in the mammary gland development, prolactin signalling and involution pathways. We then tested SNPs in windows of 200 kb (100 kb on each side) surrounding these genes for association with milk production traits in dairy cattle. Our hypothesis is that genes in these pathways will harbour genetic mutations that explain variation in production traits in dairy cattle, and that our approach will detect more of these associations than a traditional GWAS, since we can test variants at lower significance thresholds because of the smaller number of tests conducted.

Methods

Genome-wide association studies

To determine whether SNPs within key lactation pathways were associated with milk production traits, we performed an association analysis. We analysed several traits, including fat kg, fat percentage, milk volume, protein kg, and protein percentage [19, 20]. A total of 16 812 dairy cattle were genotyped using the Illumina BovineHD BeadChip or the BovineSNP50 array [21] and imputed to the higher density [22] (1785 animals were actually genotyped at the higher density). After quality control (as in [22]), the final number of SNPs was 632 003. The genotyped animals included 9015 Holstein cows, 2770 Holstein bulls, 4202 Jersey cows, and 825 Jersey bulls [see Additional file 1: Table S1]. Phenotypes of bulls were constructed as daughter trait deviations (the average of the bull’s daughters’ trait deviations, corrected for breed of mate) and phenotypes of cows as trait deviations (corrected for herd-year-season and permanent environment effects) [see Additional file 2: Table S2]. The distributions of the number of lactations (for cows) and daughters (for bulls) are in Additional file 3: Figure S1. Records were standardised in both breeds to have a mean of 0 and a standard deviation of 1. In all analyses, phenotypes on bulls were weighted as

$$\frac{1 - h^2}{1 + \dfrac{4 - h^2}{n}}$$

where $n$ represents the number of records [23] and $h^2$ is the heritability of the trait (0.33 for milk volume, fat kg and protein kg, and 0.5 for protein percentage and fat percentage, for both breeds [20]). Phenotypes on cows were weighted using the formula [23]:

$$\frac{1 - h^2}{\dfrac{1 + r^2 (l - 1)}{l}}$$

where $r^2$ is the repeatability (0.56 for all milk production traits) and $l$ is the number of lactations. For the percentage traits, we were not able to fit weights for bulls in the model due to problems with convergence, likely because the heritability for these traits was high.

The linear mixed model used to test the association between each SNP and each milk production trait was:

$$y = X\beta + Wb + Zu + e$$

where $y$ is the vector of phenotypes, expressed as trait deviations for cows and daughter trait deviations for bulls, $\beta$ is the vector of fixed effects, including the overall mean and the effects of breed and sex, $X$ is a design matrix allocating phenotypes to fixed effects, $W$ is the vector of animal genotypes (the number of copies of the second allele at the SNP that the animal carries, coded as 0, 1 or 2), $b$ is the additive effect of the second allele of the SNP, $Z$ is an incidence matrix mapping phenotypes to animals, $u$ is the vector of polygenic effects (one for each animal), and $e$ is the vector of random residuals. The polygenic breeding values were fitted as random effects following a normal distribution $N(0, A\sigma_a^2)$, where $A$ is the expected relationship among individuals constructed from the pedigree (which dates back to the 1940s) and $\sigma_a^2$ is the polygenic genetic variance. Variance components and fixed effects were estimated for each SNP with ASReml [24].
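To illustrate the 0/1/2 genotype coding and the meaning of the additive effect b, the following is a deliberately simplified sketch, not the model fitted in ASReml: it drops the polygenic term and the breed and sex effects and fits only a weighted regression of phenotype on allele count. The function name `weighted_snp_effect` and all data values are hypothetical.

```python
from typing import Sequence, Tuple


def weighted_snp_effect(
    genotypes: Sequence[int],    # copies of the second allele: 0, 1 or 2
    phenotypes: Sequence[float],
    weights: Sequence[float],
) -> Tuple[float, float]:
    """Weighted least-squares fit of phenotype = mean + b * genotype.

    Returns (intercept, b). The polygenic term u and the breed and sex
    effects of the full model are left out of this sketch on purpose.
    """
    if any(g not in (0, 1, 2) for g in genotypes):
        raise ValueError("genotypes must be coded 0, 1 or 2")
    total_w = sum(weights)
    mean_g = sum(w * g for w, g in zip(weights, genotypes)) / total_w
    mean_y = sum(w * y for w, y in zip(weights, phenotypes)) / total_w
    s_gy = sum(w * (g - mean_g) * (y - mean_y)
               for w, g, y in zip(weights, genotypes, phenotypes))
    s_gg = sum(w * (g - mean_g) ** 2 for w, g in zip(weights, genotypes))
    if s_gg == 0:
        raise ValueError("SNP is monomorphic in this sample")
    b = s_gy / s_gg
    return mean_y - b * mean_g, b


# Toy data (invented): phenotype rises by 0.5 per copy of the second allele
genotypes = [0, 1, 2, 1, 0, 2]
phenotypes = [1.0, 1.5, 2.0, 1.5, 1.0, 2.0]
weights = [1.0, 2.0, 1.0, 0.5, 1.5, 1.0]
intercept, b = weighted_snp_effect(genotypes, phenotypes, weights)
print(round(intercept, 3), round(b, 3))
```

In the full model the estimate of b is additionally adjusted for relationships among animals through the pedigree-based polygenic term, which this sketch cannot reproduce.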

Analysis of key lactation pathways

Gene sets for analysis were chosen using published reviews of three important developmental stages of the lactating mammary gland. These included the mammary development pathway [12] and the prolactin signalling [14] and involution pathways [25]. We identified 64 genes involved in mammary development, 27 genes involved in prolactin signalling, and 40 genes involved in involution (Tables 1, 2 and 3). The gene families MAP kinase, PI3K and frizzled were not included in the pathways since specific genes were not suggested in the reviews and these gene families have a wide range of signalling functions. The genomic locations of these genes were determined using UMD3.1 in the NCBI database [26]. The SNPs within the genes of a pathway, or within 100 kb on each side of those genes, were then tested for association with each trait using the model above. The effect of a SNP was considered significant at P ≤ 0.05. The GWAS was repeated using other significance thresholds ($P < 10^{-3}$ and $P < 10^{-5}$) but 0.05 had the greatest power to detect enrichment (results not shown). The number of significant SNPs for each pathway was expressed as a proportion of the total number of SNPs in that pathway (PropSig).
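The window-based SNP selection and the PropSig statistic described above can be sketched as follows. This is a minimal illustration under assumed input formats (genes as chromosome, start, end; SNPs as chromosome, position, P-value); the function names `snps_in_windows` and `prop_sig` and the toy coordinates are hypothetical and are not taken from the study's pipeline.

```python
from typing import List, Tuple

Gene = Tuple[str, int, int]    # (chromosome, start bp, end bp)
Snp = Tuple[str, int, float]   # (chromosome, position bp, association P-value)

FLANK_BP = 100_000  # 100 kb on each side of a gene, as in the text


def snps_in_windows(genes: List[Gene], snps: List[Snp],
                    flank: int = FLANK_BP) -> List[Snp]:
    """Return each SNP that lies within a gene or within `flank` bp of it.

    A SNP that falls in the windows of several genes is returned once.
    """
    selected = []
    for snp in snps:
        chrom, pos, _ = snp
        if any(chrom == g_chr and g_start - flank <= pos <= g_end + flank
               for g_chr, g_start, g_end in genes):
            selected.append(snp)
    return selected


def prop_sig(snps: List[Snp], alpha: float = 0.05) -> float:
    """Proportion of SNPs with P <= alpha (PropSig)."""
    if not snps:
        return 0.0
    return sum(p <= alpha for _, _, p in snps) / len(snps)


# Toy data (invented for illustration only)
genes = [("20", 1_000_000, 1_050_000)]
snps = [
    ("20", 950_000, 0.01),     # inside the upstream flank
    ("20", 1_020_000, 0.20),   # inside the gene
    ("20", 1_149_000, 0.04),   # inside the downstream flank
    ("20", 1_200_000, 0.001),  # outside the window
    ("14", 1_020_000, 0.001),  # different chromosome
]
pathway_snps = snps_in_windows(genes, snps)
print(len(pathway_snps), prop_sig(pathway_snps))
```

With these toy values, three SNPs fall in the window and two of them pass the threshold, so PropSig is 2/3.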

Table 1 Proportion of significant SNPs for genes in the mammary development pathway and number of SNPs significantly (P < 0.05) associated with each trait
Table 2 Proportion of significant SNPs for genes in the prolactin pathway and number of SNPs significantly (P < 0.05) associated with each trait
Table 3 Proportion of significant SNPs for genes in the involution pathway and number of SNPs significantly (P < 0.05) associated with each trait

To determine whether the proportion of significant SNPs observed for each pathway was greater than expected by chance at an experiment-wise level, distributions under the null hypothesis of no association were constructed with random permutations of the data. A list of 24 617 uniquely annotated bovine genes was created from the Ensembl BioMart database [27, 28]. From this list, three sets of genes, each containing the same number of genes as the respective pathway, were selected at random. SNPs were selected from within these genes and from the 100 kb on each side of them, to reflect the moderate to high linkage disequilibrium in Holstein cattle [29, 30]. Each SNP set was analysed in ASReml using the linear mixed model described above. This procedure was repeated 10 000 times to construct null distributions, and the 500th highest proportion of significant SNPs was taken as the experiment-wise P < 0.05 threshold. If the observed proportion for a pathway was greater than this value for a particular trait, the pathway was considered significant.
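The permutation threshold can be sketched as below. This is an illustration under stated assumptions rather than the study's code: the PropSig of a gene set is supplied by a caller-provided function (in the study each random set was re-analysed in ASReml), and the names `null_distribution`, `threshold_from_null` and `toy_prop_sig`, as well as all numbers in the toy example, are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def null_distribution(
    all_genes: Sequence[str],
    set_size: int,
    prop_sig_of: Callable[[List[str]], float],
    n_reps: int = 10_000,
    seed: int = 1,
) -> List[float]:
    """PropSig of `n_reps` random gene sets, each with `set_size` genes."""
    rng = random.Random(seed)
    pool = list(all_genes)
    return [prop_sig_of(rng.sample(pool, set_size)) for _ in range(n_reps)]


def threshold_from_null(null: Sequence[float], alpha: float = 0.05) -> float:
    """The (alpha * n)-th highest null value, e.g. the 500th of 10 000."""
    rank = max(1, round(alpha * len(null)))   # 1-based rank from the top
    return sorted(null, reverse=True)[rank - 1]


# Toy example (invented): 1000 genes with 10 SNPs each, of which
# i % 4 are "significant" for gene number i.
genes = [f"gene{i}" for i in range(1000)]
n_sig = {g: i % 4 for i, g in enumerate(genes)}


def toy_prop_sig(gene_set: List[str]) -> float:
    return sum(n_sig[g] for g in gene_set) / (10 * len(gene_set))


null = null_distribution(genes, set_size=40, prop_sig_of=toy_prop_sig, n_reps=200)
threshold = threshold_from_null(null, alpha=0.05)
observed = 0.25  # invented observed PropSig for a pathway
print(threshold, observed > threshold)
```

A pathway is then declared significant for a trait when its observed PropSig exceeds the threshold, which corresponds to a one-sided empirical test.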

To account for differences in functional clustering of genes in the experimental pathways and in the random control gene sets, we compared the distance between genes on the same chromosome [see Additional file 4: Figure S2]. The experimental and control sets were distributed similarly but, due to the smaller number of paired genes for the experimental pathways, there were fewer gene pairs at long distances across the chromosomes (particularly > 10 Mb).

KEGG annotations were used to determine the gene sets that represented other biological pathways [31, 32].

Finally, a variance component analysis was used to determine whether the SNPs within each pathway explained a greater proportion of the genetic variance than an equal number of randomly selected SNPs from the whole genome. The model fitted was

$$y = Wb + Zg + e,$$

where terms were the same as above, and $g$ is a vector of random effects, assumed distributed $N(0, G\sigma_g^2)$, where $G$ is a genomic relationship matrix, constructed using the rules of [33]. The genomic relationship matrix was based on the SNPs from each pathway, plus a set of 4000 SNPs randomly selected from the whole genome. The reason for adding the 4000 randomly chosen SNPs was that SNPs in the genes of the pathways are typically clustered by genomic location (i.e. a number of the genes are located in close proximity) [see Additional file 4: Figure S2]. Given the large number of animals in our dataset, this means that a considerable number of animals can have genomic relationships that are equal to or close to 1, i.e. they have inherited the same segments of the genome at all of the locations of the pathway genes. Consequently, the genomic relationship matrix is singular and impossible to invert. Adding 4000 random SNPs removed the singularities and the genomic relationship matrix could be inverted and variance components estimated. However, with the 4000 SNPs included, we could only assess the marginal contribution of adding SNPs in the pathway.

Estimates of the variance components $\sigma_g^2$ and $\sigma_e^2$ were obtained from the REML analysis with ASReml [24]. The proportion of variance explained by the SNPs in each pathway was compared to that explained by the same number of randomly chosen SNPs within 100 kb of a gene (i.e. the additional SNPs were chosen to be close to genes), plus the set of 4000 randomly chosen SNPs corresponding to each pathway. Five replicates of the randomly chosen sets were analysed to obtain standard errors.
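The comparison against random SNP sets can be illustrated with a short calculation. The text does not state how the standard errors were derived from the five replicates, so this sketch assumes the standard error of the mean of the replicate estimates; the function name `extra_variance` and all values are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def extra_variance(pathway_prop: float,
                   random_props: Sequence[float]) -> Tuple[float, float]:
    """Additional proportion of genetic variance explained by a pathway.

    Returns (difference, standard error), where the difference is the
    pathway estimate minus the mean of the random-set estimates and the
    standard error is that of the random-set mean (an assumption here).
    """
    diff = pathway_prop - mean(random_props)
    se = stdev(random_props) / sqrt(len(random_props))
    return diff, se


# Invented estimates: one pathway set and five random replicate sets
pathway_prop = 0.40
random_props = [0.28, 0.30, 0.32, 0.29, 0.31]
diff, se = extra_variance(pathway_prop, random_props)
print(round(diff, 3), round(se, 4))
```

A difference that is large relative to its standard error suggests that the pathway SNPs capture more variance than gene-proximal SNPs chosen at random.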

Results

Mammary development pathway

The 64 genes identified in the mammary development pathway included 3968 SNPs (Table 1). When the proportion of significant SNPs at P < 0.05 (PropSig) was compared to the null distributions, the mammary development pathway was significantly associated with protein percentage (PropSig = 0.340, P < 0.01; Table 4). The null distributions compared with the experimental results are shown in Additional file 5: Figure S3, Additional file 6: Figure S4 and Additional file 7: Figure S5. The genes that contained the largest proportion of significant SNPs (> 50% significant SNPs) were the following: AREGB, CASB, DKK1, FGF1, FGF10, GHR, PRLR, SOCS2, STAT5A, STAT5B, TGFB1 and WNT10B (Table 1; see Additional file 8: Table S3 for gene abbreviations).

Table 4 Proportion of significant SNPs for milk production traits in the mammary development, prolactin and involution pathway genes

Four genes in the mammary development pathway were located on BTA20, which contains a well-known QTL for milk production [5]. These genes included FGF10, MSX2, PRLR and GHR. FGF10 is located 1 Mb downstream of GHR, which is the gene often described with, though not necessarily underlying [34], this large QTL. To account for any potential bias associated with over-represented genes, we re-ran the pathway test and control permutations without BTA20. The mammary development pathway still reached significance for protein percentage when this chromosome was removed [see Additional file 9: Figure S6].

KEGG annotations of these 64 genes found 25 genes in pathways associated with cancer and 8 to 14 other genes in signalling pathways, such as JAK-STAT, that are known to be activated during lactation (Table 5). The PI3K-Akt pathway is involved in mammary development, and mutations in genes of this pathway are found in approximately 70% of breast cancers [35]. There were eight genes involved in Wnt signalling pathways, which are prominent in mammary development and cancers [36].

Table 5 KEGG associations for the mammary development, prolactin signalling and involution pathways

To determine the extent of pleiotropy for variants in the pathway, we correlated the SNP effect estimates (for the 3968 SNPs in the pathway) for each pair of traits. Milk volume was negatively correlated with fat percentage and protein percentage, while fat percentage and protein percentage were highly positively correlated (Table 6). Fat kg and milk volume were also highly positively correlated with protein kg, as expected.
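The pairwise correlation of SNP effect estimates can be sketched as follows, assuming one estimated effect per SNP and trait, held in the same SNP order for every trait. The effect values, trait labels and function name `pearson` are invented for illustration and do not come from the study.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long lists of SNP effects."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented effect estimates for five SNPs (same SNP order in every list)
effects = {
    "milk volume": [0.20, -0.10, 0.40, 0.00, -0.30],
    "fat percentage": [-0.10, 0.05, -0.20, 0.00, 0.15],
    "protein percentage": [-0.08, 0.06, -0.15, 0.01, 0.10],
}

for trait_a, trait_b in combinations(effects, 2):
    r = pearson(effects[trait_a], effects[trait_b])
    print(f"{trait_a} vs {trait_b}: r = {r:.2f}")
```

A strongly negative correlation between volume and percentage effects, as in this toy set, is the pattern expected when the same variants raise volume and dilute composition.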

Table 6 Correlation between core traits for SNPs within the mammary development, prolactin signalling and involution pathways

Prolactin signalling pathway

The prolactin signalling gene set was considerably smaller (27 genes, 1569 SNPs) than the involution and mammary development sets, since it represents only one signalling pathway, while mammary development and involution represent the combined effects of several sub-pathways (Table 2). Protein kg, fat kg and fat percentage were significantly associated with the prolactin signalling gene set (Table 4) [see Additional file 6: Figure S4]. The SOCS2, STAT3, STAT5A, STAT5B, PRLR and CASB genes had more than 50% of SNPs significant for three or more milk production traits (Table 2).

KEGG annotations for genes in the prolactin pathway showed 12 associations with the JAK-STAT signalling pathway, followed by the PI3K-Akt and insulin signalling pathways (Table 5).

Involution pathway

The involution pathway contained 40 genes and 2521 SNPs (Table 3). The proportion of associated SNPs was significant at the experiment-wise level for all milk production traits, except fat (Table 4) [see Additional file 7: Figure S5]. We identified a large proportion of significant SNPs for ATF4, IGFBP4, IRF1, LIFR, OSMR, PTK2, STAT3, STAT5A and STAT5B (Table 3). KEGG analysis showed a trend towards infection-related pathways (Table 5). JAK-STAT, hepatitis B and PI3K signalling pathways were also highly represented. Traits showed moderate to high correlations, which suggested pleiotropy for milk production traits among SNPs in the involution pathway (Table 6).

Three genes in the involution pathway were located on BTA14 and may be biased by associations with the large QTL at the beginning of BTA14 associated with the mutation in DGAT1 [37]. The CEBPD and MYC genes are located more than 13 Mb upstream of this QTL but PTK2 sits 2 Mb upstream from DGAT1, well within the bounds of this very large QTL. When BTA14 was removed from the analysis, the involution pathway remained significant for the traits for which this was tested [see Additional file 9: Figure S6].

There was some overlap in the genes of the three pathways. Genes STAT5A, STAT5B and SOCS3 were common to all three pathways (Figure 1). Prolactin and mammary development pathways showed the largest overlap, which included TNF, SOCS and prolactin genes. KEGG analyses showed that similar pathways were represented in mammary development and involution but infection-related pathways were more prominent due to the abundance of acute phase response genes such as interleukins and STAT genes (Table 5).

Figure 1 Venn diagram showing the number of overlapping genes in three lactation pathways.

Proportion of variance explained by mutations in pathways

For milk production traits, SNPs in the involution pathway explained 10 to 13% more genetic variation than expected by chance for all traits (Table 7). SNPs in the mammary development pathway explained 7 to 9% more genetic variation than expected by chance for milk, protein percentage and fat percentage. SNPs in the prolactin pathway explained less variation than expected by chance, although results were not significantly different from zero. This could be the result of a combination of two factors, i.e. (1) SNPs within the prolactin signalling pathway do not really explain much variation, and (2) because of the small number of genes in this pathway, the SNPs did not cover all chromosomes (and therefore did not capture variation on those chromosomes), unlike the randomly sampled SNPs. The overall significance of each milk production trait for each pathway tested was very similar, though not identical, to the results from SNP by SNP association testing (perhaps a result of random sampling to construct the null distributions).

Table 7 Additional genetic variance explained by SNPs in genes or within 100 kb of genes in the mammary development, prolactin signalling, and involution pathways, compared with an equal number of randomly chosen SNPs within 100 kb of genes

Discussion

We used information on mammary development, prolactin signalling and involution pathways to identify candidate gene regions that could be associated with milk production traits. SNPs in genes that are involved in the mammary development pathway were highly associated with protein percentage and explained a considerable proportion of the variance for three milk production traits. The prolactin signalling pathway did not explain any additional variance in milk production traits, but contained a significant number of associated SNPs for protein kg, protein percentage and fat percentage. SNPs in genes involved in the involution pathway explained the greatest level of variance in milk production traits in our variance component approach. The involution pathway was also significant for all milk production traits except fat in the association testing approach.

Mammary development, prolactin signalling and involution pathways contained highly significant genes that have been described in GWAS or are known to be important lactation genes. These include CASB, SOCS2, GHR, PRLR, LIFR and the STAT genes. In particular, SNPs within STAT5A have a large effect on milk composition and have been validated in vitro [38, 39]. Figure 2 shows a GWAS for protein percentage as an example, and displays the relationship between genes studied from these pathways and genome-wide QTL patterns. Most genes are located in regions that could not be identified by a traditional GWAS. SNPs within regions not previously associated with milk production traits, such as AREGB, ATF4, IRF1, DKK1, and TGFB1, which were significant for mammary development, may contain novel mutations that affect milk production traits and may represent key genes from the mammary development pathway that explain some of the variance in these traits in cattle.

Figure 2 GWAS of protein percentage in Holsteins and Jerseys. SNPs within the mammary development, prolactin signalling, and involution pathways are highlighted as red, blue and green dots, respectively; * identifies chromosomes 14 and 20, which have been scaled down to allow observation of smaller effects.

The involution pathway explained the greatest level of variance in milk production traits in our variance component approach, although it contained fewer SNPs than the mammary development pathway (2521 versus 3968). This could be because the pathways include genes in or close to previously described QTL with quite large effects on milk production traits (Figure 2), particularly protein percentage [5]. However, when the analysis was run without the genes on BTA20 (FGF10, MSX2, PRLR and GHR), the mammary development pathway was still significant, even for protein percentage. Note that removing the GHR gene from the analysis is questionable because the growth hormone receptor is a vital component of the lactation pathway since it interacts with several relevant substrates during lactation [5]. Similarly, removing the CEBPD, MYC and PTK2 genes on BTA14 (because they were in the region of DGAT1) did not affect the overall significance of the involution pathway. The clustered expression of the genes in a pathway, i.e. they are expressed with other secreted milk genes [40], may result in significant associations that are due to nearby, co-expressed genes. The permutation method generated some replicates with similar genome distributions to the experimental data [see Additional file 4: Figure S2], which implies that the clustered expression of genes probably does not greatly affect the results. There is currently no ideal approach to control for the complicated genetic architectures of traits in pathway analyses. While these genetic structures should be accounted for, caution should be taken to avoid losing information from highly relevant genes.

One of the main limitations of our approach is that if a mutation that affects milk production is not in the analysed pathways, it will automatically be excluded. Perhaps even more importantly, our interpretations could be biased if irrelevant genes are included in the pathways. This may have occurred in cases where broad-acting cellular processes are represented in the gene sets. Improved descriptions of pathways would increase the power to identify genomic regions that influence these traits. The pathways used in this study were primarily derived from mouse studies and are relatively poorly described in cattle. For mammary development, the signalling interactions in the placode epithelium are particularly poorly described. For the prolactin signalling pathway, little is known about the downstream signalling of progesterone receptors. For the involution pathway, it is not known how membrane apoptosis is triggered although this would represent a significant contribution to the description of this biological process. Approaches such as microarray and RNAseq technologies using time-course data could help refine this method so that it represents more closely the true biological action. These approaches have successfully identified genes acting at different physiological states in the lactation cycle. Another potential limitation of our study is that the phenotypes were averages of several records across lactation. The same analyses could be performed using just early or late lactation records. Lactation curve parameters have been used in similar modelling experiments and may further refine these numerous SNP associations [41].

Finally, the value of KEGG pathway annotations was questionable. The relevance of these annotations for the target traits is difficult to establish for genes that are involved in broad and numerous biological processes. A further problem is that KEGG annotations are heavily dominated by cancer-related information.

Conclusions

We have successfully used the information from characterised mammary development, prolactin signalling and involution pathways to identify novel SNP associations with milk production traits. The proportion of significant SNPs in or near genes from the mammary development pathway was considerably greater than expected by chance for protein percentage. Of the three pathways studied, the involution pathway was highly associated with milk production traits and explained the highest level of variation above that expected by chance (up to 13% for protein kg). While we have reported many novel candidates useful for further studies, we must point out that pathway-based methods are restricted by the quality of annotations and completeness of pathway information.

References

  1. Hayes BJ, Pryce J, Chamberlain AJ, Bowman PJ, Goddard ME: Genetic architecture of complex traits and accuracy of genomic prediction: Coat colour, milk-fat percentage, and type in Holstein cattle as contrasting model traits. PLoS Genet. 2010, 6: e1001139-10.1371/journal.pgen.1001139.

  2. Cole JB, VanRaden PM, O’Connell JR, Van Tassell CP, Sonstegard TS, Schnabel RD, Taylor JF, Wiggans GR: Distribution and location of genetic effects for dairy traits. J Dairy Sci. 2009, 92: 2931-2946. 10.3168/jds.2008-1762.

  3. Cohen-Zinder M, Seroussi E, Larkin DM, Loor JJ, Wind AE-v, Lee J-H, Drackley JK, Band MR, Hernandez AG, Shani M, Lewin HA, Weller JI, Ron M: Identification of a missense mutation in the bovine ABCG2 gene with a major effect on the QTL on chromosome 6 affecting milk yield and composition in Holstein cattle. Genome Res. 2005, 15: 936-944. 10.1101/gr.3806705.

  4. Grisart B, Coppieters W, Farnir F, Karim L, Ford C, Berzi P, Cambisano N, Mni M, Reid S, Simon P, Spelman R, Georges M, Snell R: Positional candidate cloning of a QTL in dairy cattle: identification of a missense mutation in the bovine DGAT1 gene with major effect on milk yield and composition. Genome Res. 2002, 12: 222-231. 10.1101/gr.224202.

  5. Blott S, Kim JJ, Moisio S, Schmidt-Kuntzel A, Cornet A, Berzi P, Cambisano N, Ford C, Grisart B, Johnson D, Karim L, Simon P, Snell R, Spelman R, Wong J, Vilkki J, Georges M, Farnir F, Coppieters W: Molecular dissection of a quantitative trait locus: a phenylalanine-to-tyrosine substitution in the transmembrane domain of the bovine growth hormone receptor is associated with a major effect on milk yield and composition. Genetics. 2003, 163: 253-266.

  6. Cohen M, Reichenstein M, Everts-van der Wind A, Heon-Lee J, Shani M, Lewin HA, Weller JI, Ron M, Seroussi E: Cloning and characterization of FAM13A1 gene near a milk protein QTL on BTA6: evidence for population-wide linkage disequilibrium in Israeli Holsteins. Genomics. 2004, 84: 374-383. 10.1016/j.ygeno.2004.03.005.

  7. Viitala S, Szyda J, Blott S, Schulman N, Lidauer M, Mäki-Tanila A, Georges M, Vilkki J: The role of the bovine growth hormone receptor and prolactin receptor genes in milk, fat and protein production in Finnish Ayrshire dairy cattle. Genetics. 2006, 173: 2151-2164. 10.1534/genetics.105.046730.

  8. Lynch M, Walsh B: Genetics and Analysis of Quantitative Traits. 1998, Sunderland: Sinauer Associates

  9. Chamberlain AJ, McPartlan HC, Goddard ME: The number of loci that affect milk production traits in dairy cattle. Genetics. 2007, 177: 1117-1123. 10.1534/genetics.107.077784.

  10. Gjorevski N, Nelson CM: Integrated morphodynamic signalling of the mammary gland. Nat Rev Mol Cell Biol. 2011, 12: 581-593. 10.1038/nrm3168.

  11. Sternlicht M: Key stages in mammary gland development: the cues that regulate ductal branching morphogenesis. Breast Cancer Res. 2006, 8: 201-10.1186/bcr1368.

  12. Macias H, Hinck L: Mammary gland development. Wiley Interdiscip Rev Dev Biol. 2012, 1: 533-557. 10.1002/wdev.35.

  13. Lemay DG, Neville MC, Rudolph MC, Pollard KS, German JB: Gene regulatory networks in lactation: identification of global principles using bioinformatics. BMC Syst Biol. 2007, 1: 56-10.1186/1752-0509-1-56.

  14. Oakes SR, Rogers RL, Naylor MJ, Ormandy CJ: Prolactin regulation of mammary gland development. J Mammary Gland Biol Neoplasia. 2008, 13: 13-28. 10.1007/s10911-008-9069-5.

  15. Akers RM: Lactation and the mammary gland. 2002, Ames: Iowa State Press

  16. Liu X, Robinson GW, Wagner KU, Garrett L, Wynshaw-Boris A, Hennighausen L: Stat5a is mandatory for adult mammary gland development and lactogenesis. Genes Dev. 1997, 11: 179-186. 10.1101/gad.11.2.179.

  17. Nagy EVA, Berczi I: Hypophysectomized rats depend on residual prolactin for survival. Endocrinology. 1991, 128: 2776-2784. 10.1210/endo-128-6-2776.

  18. Australian Dairy Herd Improvement Scheme: Industry Statistics for 2002–2012. 2012, http://www.adhis.com.au

  19. Moser G, Khatkar MS, Hayes BJ, Raadsma HW: Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers. Genet Sel Evol. 2010, 42: 37. 10.1186/1297-9686-42-37.

  20. Pryce JE, Bolormaa S, Chamberlain AJ, Bowman PJ, Savin K, Goddard ME, Hayes BJ: A validated genome-wide association study in 2 dairy cattle breeds for milk production and fertility traits using variable length haplotypes. J Dairy Sci. 2010, 93: 3331-3345. 10.3168/jds.2009-2893.

  21. Matukumalli LK, Lawley CT, Schnabel RD, Taylor JF, Allan MF, Heaton MP, O’Connell J, Moore SS, Smith TPL, Sonstegard TS, Van Tassell CP: Development and characterization of a high density SNP genotyping assay for cattle. PLoS ONE. 2009, 4: e5350. 10.1371/journal.pone.0005350.

  22. Erbe M, Hayes BJ, Matukumalli LK, Goswami S, Bowman PJ, Reich CM, Mason BA, Goddard ME: Improving accuracy of genomic predictions within and between dairy cattle breeds with imputed high-density single nucleotide polymorphism panels. J Dairy Sci. 2012, 95: 4114-4129. 10.3168/jds.2011-5019.

  23. Garrick DJ, Taylor JF, Fernando RL: Deregressing estimated breeding values and weighting information for genomic regression analyses. Genet Sel Evol. 2009, 41: 55. 10.1186/1297-9686-41-55.

  24. Gilmour AR, Gogel BJ, Cullis BR, Thompson R: ASReml User Guide Release 2.0. 2006, VSN International Ltd: Hemel Hempstead

  25. Sutherland KD, Lindeman GJ, Visvader JE: The molecular culprits underlying precocious mammary gland involution. J Mammary Gland Biol Neoplasia. 2007, 12: 15-23. 10.1007/s10911-007-9034-8.

  26. Rincon G, Islas-Trejo A, Castillo AR, Bauman DE, German BJ, Medrano JF: Polymorphisms in genes in the SREBP1 signalling pathway and SCD are associated with milk fatty acid composition in Holstein cattle. J Dairy Res. 2012, 79: 66-75. 10.1017/S002202991100080X.

  27. Hubbard TJP, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, Down T, Dyer SC, Fitzgerald S, Fernandez-Banet J, Graf S, Haider S, Hammond M, Herrero J, Holland R, Howe K, Howe K, Johnson N, Kahari A, Keefe D, Kokocinski F, Kulesha E, Lawson D, Longden I, Melsopp C, Megy K: Ensembl 2007. Nucleic Acids Res. 2007, 35: D610-D617. 10.1093/nar/gkl996.

  28. European Bioinformatics Institute. http://www.ebi.ac.uk/

  29. Gibbs RA, Taylor JF, Van Tassell CP, Barendse W, Eversole KA, Gill CA, Green RD, Hamernik DL, Kappes SM, Lien S, Matukumalli LK, McEwan JC, Nazareth LV, Schnabel RD, Weinstock GM, Wheeler DA, Ajmone-Marsan P, Boettcher PJ, Caetano AR, Garcia JF, Hanotte O, Mariani P, Skow LC, Sonstegard TS, Williams JL, Diallo B, Hailemariam L, Martinez ML, Morris CA, Silva LO: Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science. 2009, 324: 528-532.

  30. Villa-Angulo R, Matukumalli LK, Gill CA, Choi J, Van Tassell CP, Grefenstette JJ: High-resolution haplotype block structure in the cattle genome. BMC Genet. 2009, 10: 19.

  31. Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M: KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 1999, 27: 29-34. 10.1093/nar/27.1.29.

  32. Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M: KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res. 2011, 40 (D1): D109-D114.

  33. Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, Madden PA, Heath AC, Martin NG, Montgomery GW, Goddard ME, Visscher PM: Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010, 42: 565-569. 10.1038/ng.608.

  34. Chamberlain AJ, Hayes BJ, Savin K, Bolormaa S, McPartlan HC, Bowman PJ, Van Der Jagt C, MacEachern S, Goddard ME: Validation of single nucleotide polymorphisms associated with milk production traits in dairy cattle. J Dairy Sci. 2012, 95: 864-875. 10.3168/jds.2010-3786.

  35. Wickenden JA, Watson CJ: Key signalling nodes in mammary gland development and cancer. Signalling downstream of PI3 kinase in mammary epithelium: a play in 3 Akts. Breast Cancer Res. 2010, 12: 202. 10.1186/bcr2558.

  36. Cadigan KM, Nusse R: Wnt signaling: a common theme in animal development. Genes Dev. 1997, 11: 3286-3305. 10.1101/gad.11.24.3286.

  37. Grisart B, Farnir F, Karim L, Cambisano N, Kim JJ, Kvasz A, Mni M, Simon P, Frère JM, Coppieters W, Georges M: Genetic and functional confirmation of the causality of the DGAT1 K232A quantitative trait nucleotide in affecting milk yield and composition. Proc Natl Acad Sci U S A. 2004, 101: 2398-2403. 10.1073/pnas.0308518100.

  38. Khatib H, Monson RL, Schutzkus V, Kohl DM, Rosa GJM, Rutledge JJ: Mutations in the STAT5A gene are associated with embryonic survival and milk composition in cattle. J Dairy Sci. 2008, 91: 784-793. 10.3168/jds.2007-0669.

  39. Khatib H, Monson RL, Huang W, Khatib R, Schutzkus V, Khateeb H, Parrish JJ: Short communication: validation of in vitro fertility genes in a Holstein bull population. J Dairy Sci. 2010, 93: 2244-2249. 10.3168/jds.2009-2805.

  40. Lemay DG, Lynn DJ, Martin WF, Neville MC, Casey TM, Rincon G, Kriventseva EV, Barris WC, Hinrichs AS, Molenaar AJ: The bovine lactation genome: insights into the evolution of mammalian milk. Genome Biol. 2009, 10: R43. 10.1186/gb-2009-10-4-r43.

  41. Strucken EM, Bortfeldt RH, de Koning DJ, Brockmann GA: Genome-wide associations for investigating time-dependent genetic effects for milk production traits in dairy cattle. Anim Genet. 2011, 43: 375-382.

  42. Raven L, Cocks BG, Hayes BJ: Multi-breed genome wide association can improve precision of mapping causative variants underlying milk production in dairy cattle. BMC Genomics. 2013, 15: 62-76.

Acknowledgements

LR is supported by the Dairy Futures CRC Australia.

Author information

Corresponding author

Correspondence to Lesley-Ann Raven.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

LR performed the GWAS and pathway analyses and drafted the manuscript. BC, BH, LR and MG conceived and designed the study. BH performed the variance component analysis. JP assisted in the interpretation. MG assisted with the pathway design. All authors read and approved the final manuscript.

Electronic supplementary material

12711_2013_2664_MOESM1_ESM.zip

Additional file 1: Table S1a: Number of phenotypes for production traits. Sample sizes adapted from [42]. Table S1b. The minimum and maximum phenotypes for production traits in dairy cattle. Phenotypes are expressed in standard deviations, with a mean of zero within each breed. Adapted from [42]. (ZIP 84 KB)

12711_2013_2664_MOESM2_ESM.png

Additional file 2: Table S2: Description and measurement of milk production traits. All non-production traits are expressed as a percentage of the standard deviation from the phenotypic mean. (PNG 52 KB)

12711_2013_2664_MOESM3_ESM.zip

Additional file 3: Figure S1a: Distribution of number of lactations for cows. X-axis is labelled with the mid-point of each bin. Figure S1b. Distribution of number of daughters per bull. X-axis is labelled with the mid-point of each bin. (ZIP 45 KB)

12711_2013_2664_MOESM4_ESM.jpeg

Additional file 4: Figure S2: Distance between genes in lactation pathways and control permutations. Control plots are scaled to represent 10 000 replicates of randomly selected genes of size equivalent to the experimental pathway. (JPEG 519 KB)

12711_2013_2664_MOESM5_ESM.jpeg

Additional file 5: Figure S3: Permutation tests for SNP within the mammary development pathway. Associations were created using 800 k SNP data from Holstein and Jersey cattle. Purple bars represent the null hypothesis distribution. SNP sets were randomised from the 200 kb region spanning 67 genes. The vertical red line is the experimental result (i.e. the observed proportion of SNP in that pathway), while the green line is the P ≤ 0.05 significance threshold for the pathway from the permutation test, for a) fat kg, b) milk volume, c) protein kg, d) fat percentage, e) protein percentage. (JPEG 272 KB)

12711_2013_2664_MOESM6_ESM.jpeg

Additional file 6: Figure S4: Permutation tests for SNP within the prolactin signalling pathway. Purple bars represent the null hypothesis distribution. SNP sets were randomised from the 200 kb region spanning 27 genes. The vertical red line is the experimental result (i.e. the observed proportion of SNP in that pathway), while the green line is the P ≤ 0.05 significance threshold for the pathway from the permutation test, for a) fat kg, b) milk volume, c) protein kg, d) fat percentage, e) protein percentage. (JPEG 284 KB)

12711_2013_2664_MOESM7_ESM.jpeg

Additional file 7: Figure S5: Permutation tests for SNP within the involution pathway. Purple bars represent the null hypothesis distribution. SNP sets were randomised from the 200 kb region spanning 40 genes. The vertical red line is the experimental result (i.e. the observed proportion of SNP in that pathway), while the green line is the P ≤ 0.05 significance threshold for the pathway from the permutation test, for a) fat kg, b) milk volume, c) protein kg, d) fat percentage, e) protein percentage. (JPEG 283 KB)

Additional file 8: Table S3: Gene abbreviations. Gene families are represented in bold. (DOCX 82 KB)

12711_2013_2664_MOESM9_ESM.jpeg

Additional file 9: Figure S6: Control permutations with major QTL regions removed. Histograms show a) mammary development with BTA14 removed and b) involution with BTA20 removed, both for protein percentage. Red lines represent the significance of the pathway. Green lines show the P value cut-off. (JPEG 135 KB)

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.

About this article

Cite this article

Raven, L.A., Cocks, B.G., Goddard, M.E. et al. Genetic variants in mammary development, prolactin signalling and involution pathways explain considerable variation in bovine milk production and milk composition. Genet Sel Evol 46, 29 (2014). https://doi.org/10.1186/1297-9686-46-29

  • DOI: https://doi.org/10.1186/1297-9686-46-29

Keywords