Casein haplotypes and their association with milk production traits in Norwegian Red cattle
© Nilsen et al; licensee BioMed Central Ltd. 2009
Received: 29 January 2009
Accepted: 20 February 2009
Published: 20 February 2009
A high-resolution SNP map was constructed for the bovine casein region to identify haplotype structures and study associations with milk traits in Norwegian Red cattle. Our analyses suggest separation of the casein cluster into two haplotype blocks, one consisting of the CSN1S1, CSN2 and CSN1S2 genes and another consisting of the CSN3 gene. Highly significant associations with both protein and milk yield were found for both single SNPs and haplotypes within the CSN1S1-CSN2-CSN1S2 haplotype block. In contrast, no significant association was found for single SNPs or haplotypes within the CSN3 block. Our results point towards CSN2 and CSN1S2 as the most likely loci harbouring the underlying causative DNA variation. In our study, the most significant results were found for the SNP CSN2_67, with the C allele consistently associated with both higher protein and milk yields. CSN2_67 is a C to A substitution at codon 67 of the β-casein gene that results in histidine replacing proline in the amino acid sequence. This polymorphism determines the protein variants A1/B (CSN2_67 A allele) versus A2/A3 (CSN2_67 C allele). Other studies have suggested that a high consumption of A1/B milk may affect human health by increasing the risk of diabetes and heart disease. Altogether these results argue for increasing the frequency of the CSN2_67 C allele, or of haplotypes containing this allele, in the Norwegian Red cattle population by selective breeding.
Several studies have reported the existence of QTL affecting milk production traits on bovine chromosome 6 (BTA6) [1, 2] (summarized at http://genomes.sapac.edu.au/bovineqtl/ and http://www.vetsci.usyd.edu.au/reprogen/QTL_Map/). Two distinct regions on this chromosome affect milk traits (including protein yield, protein percentage, fat yield, fat percentage and milk yield). One QTL affecting protein and fat percentage has been positioned in a narrow region of 420 kb, and a putative functional polymorphism in the ABCG2 gene underlying the QTL has been suggested [4, 5]. The second region on BTA6 associated with milk traits maps to the casein cluster (e.g. [6–11]). The casein cluster is composed of four genes: αs1-, β-, αs2- and κ-casein (CSN1S1, CSN2, CSN1S2 and CSN3, respectively), which together produce approximately 80 percent of the protein content of cow's milk. The four casein genes have been mapped in the order CSN1S1-CSN2-CSN1S2-CSN3 to BTA6 at q31-33 by in situ hybridisation [13, 14].
Several polymorphisms have been detected in the open reading frame (reviewed by ) and in noncoding regions such as the 5'-flanking region of the casein genes [15, 16]. The most common genetic variants in western dairy breeds are αs1-casein B (here denoted CSN1S1_192*A) and C (CSN1S1_192*G), β-casein A1 (CSN2_67*A), A2 (CSN2_67*C) and B (CSN2_122*C), and κ-casein A (CSN3_136*C), B (CSN3_136*T) and E (CSN3_155*G).
In the present study, we have constructed a dense SNP map in the casein region. The map facilitates accurate haplotype construction and was used for comprehensive association studies in Norwegian Red cattle.
Animals in the QTL study
All animals in the study belonged to the Norwegian Red cattle breed. For the chromosome wide QTL scan, animals were organized in a granddaughter design consisting of 18 elite sire families with a total of 716 sons and 507,000 granddaughters. To fine-map QTL in the casein region, the animal data was expanded to 31 elite sire families with a total of 1112 sons, ranging from 23 to 70 sons for the smallest and largest families, respectively. The total number of daughters in this analysis was approximately 1.9 million, with an average of 1670 daughters per son. The families were chosen based on sufficiently large family sizes and/or availability of trait data. The pedigree of each animal in the study was traced back as far as known. Daughter yield deviations (DYDs) of the sons were used as performance information in the analyses. The DYDs for milk production traits [protein percentage (P%), protein yield (PY), milk yield (MY), fat percentage (F%) and fat yield (FY)] were available from the national genetic evaluation carried out by GENO Breeding and AI Association, and evaluated using a BLUP animal model .
For the initial QTL scan, we used a map consisting of 399 SNPs covering the entire BTA6 . To fine-map QTL, we constructed a dense marker map consisting of 73 SNPs in and around the casein region on BTA6, covering approximately 750 kb. Fifty-four of the 73 SNPs in the map were detected by PCR resequencing of promoters and exon regions of all four casein genes (CSN1S1, CSN2, CSN1S2 and CSN3), nine SNPs were available from , whereas ten SNPs were selected from the Bovine Genome Sequencing Project . Physical distances between markers were determined from one single scaffold, NW_001495211, available from the latest assembly of the bovine genome Btau_4.0 . The average distance between SNPs was 10,462 bp (ranging from 7 to 302,143 bp). A description of the SNPs, including accession numbers in dbSNP, assays for genotyping on the MassARRAY system (Sequenom, San Diego, USA), marker allele frequencies and predicted physical distances between markers can be found in Additional file 1.
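As a quick consistency check on the map figures quoted above, the mean spacing and the number of SNPs can be multiplied back into a total span. This assumes the average was taken over the 72 intervals between adjacent SNPs, which the text does not state explicitly.

```latex
\[
(73 - 1) \times 10{,}462\ \text{bp} = 753{,}264\ \text{bp} \approx 750\ \text{kb}
\]
```

Under that assumption the result agrees with the stated coverage of approximately 750 kb.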
A combined linkage and linkage disequilibrium (LDLA) method was used to analyze milk production traits based on the information on markers from the 399-marker map described in  and a dense SNP map (73 markers) constructed for the casein region (see Additional file 1). For the midpoint of each marker bracket, the log-likelihood of a model containing the QTL ($\mathrm{LogL}(G_i)$) was calculated as well as that of a model fitting only background genes ($\mathrm{LogL}(0)$) using the ASREML package. Our test statistic, LogL difference, was then calculated as the difference in log-likelihood between the first and the second model. This LogL difference times 2 is equal to the Likelihood Ratio Test-statistic (LRT) of . According to Baret and coworkers, the distribution of the LRT under the null hypothesis can be seen as a mixture of two chi square distributions with 0 and 1 degree of freedom (df), respectively. Significance levels for the LRT are then found from a chi square distribution with 1 df but doubling the probability levels. Then, to obtain a significance level of 0.0005, the LRT value corresponding to a chi square distribution with 1 df and P = 0.001 is utilized. This LRT value is 10.8, and thus the corresponding LogL difference must be 5.4 or higher to achieve a significance level of 0.0005.
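The threshold arithmetic above can be reproduced with a short sketch. It is not part of the original analysis (which used ASREML). The function names are illustrative, and the chi square tail for 1 df is computed from the identity P(X > x) = erfc(sqrt(x/2)) so that only the Python standard library is needed.

```python
import math


def chi2_1df_upper_tail(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi square variable with 1 df."""
    if x <= 0.0:
        return 1.0
    # A chi square(1) variable is a squared standard normal, hence the erfc form.
    return math.erfc(math.sqrt(x / 2.0))


def mixture_p_value(logl_diff: float) -> float:
    """P-value for a LogL difference under the 50:50 mixture of chi square
    distributions with 0 and 1 df described in the text."""
    lrt = 2.0 * logl_diff  # LRT = 2 x LogL difference
    if lrt <= 0.0:
        return 1.0
    # Only the 1 df component contributes to the tail, with weight 1/2.
    return 0.5 * chi2_1df_upper_tail(lrt)


# The threshold used in the text: LogL difference 5.4, i.e. LRT 10.8.
p_at_threshold = mixture_p_value(5.4)
print(f"P-value at LogL difference 5.4: {p_at_threshold:.5f}")
```

Halving the 1 df tail probability is the same operation as reading the threshold off the 1 df table at twice the target significance level, which is how the text arrives at P = 0.001 for a target of 0.0005.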
SNP association tests
DYDs of the sons were used as performance information in the analyses. The model fitted to the performance information for each trait and each SNP was $\mathrm{DYD}_i = \mu + s_i + x_i b + a_i + e_i$, where $\mathrm{DYD}_i$ is the performance of son $i$, $\mu$ is the overall mean, $s_i$ is a fixed effect of the sire of son $i$, $x_i$ is 0 if son $i$ is homozygous 1 1 (e.g. AA), 1 if son $i$ is heterozygous 1 2 (e.g. AT or TA), or 2 if son $i$ is homozygous 2 2 (e.g. TT), $b$ is the effect of the SNP, $a_i$ is a polygenic effect of son $i$, and $e_i$ is a residual effect. For each single marker, the log-likelihood of a model containing the SNP effect (LogL(H1)) was calculated as well as that of a model without this SNP effect (LogL(H0)) using the ASREML package. Our test statistic, LogL difference, was then calculated as the difference in LogL between the first and the second model as described above. A SNP effect was regarded as significant if the LogL difference exceeded 5.4.
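To make the genotype coding concrete, the following sketch codes two-letter genotypes into the 0/1/2 covariate and estimates the SNP effect b on invented toy data. It is a deliberate simplification and not the authors' ASREML analysis: DYDs and genotype codes are centred within sire, which absorbs the overall mean and the fixed sire effect, while the polygenic term is ignored. All function names and data values are hypothetical.

```python
from collections import defaultdict


def allele_count(genotype: str, allele2: str) -> int:
    """Return the covariate x: copies of allele 2 (0, 1 or 2) in a genotype."""
    if len(genotype) != 2:
        raise ValueError(f"expected a two-allele genotype, got {genotype!r}")
    return sum(1 for allele in genotype if allele == allele2)


def within_sire_slope(records):
    """Least-squares estimate of b after centring x and DYD within sire.

    records: iterable of (sire_id, x, dyd) tuples. Centring removes the overall
    mean and the fixed sire effect; the polygenic effect is NOT modelled here.
    """
    by_sire = defaultdict(list)
    for sire, x, dyd in records:
        by_sire[sire].append((x, dyd))
    sxy = 0.0
    sxx = 0.0
    for rows in by_sire.values():
        mean_x = sum(x for x, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        for x, y in rows:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    if sxx == 0.0:
        raise ValueError("no within-sire variation in genotype")
    return sxy / sxx


# Toy data, invented for illustration: DYD = sire mean + 2.0 * x, no noise.
genotypes = [("s1", "AA"), ("s1", "AT"), ("s1", "TT"), ("s2", "TA"), ("s2", "TT")]
sire_mean = {"s1": 10.0, "s2": -5.0}
records = [
    (sire, allele_count(g, "T"), sire_mean[sire] + 2.0 * allele_count(g, "T"))
    for sire, g in genotypes
]
b_hat = within_sire_slope(records)
print(f"estimated SNP effect b: {b_hat:.3f}")
```

Because the toy DYDs contain no noise, the estimate recovers the simulated effect; with real data the polygenic covariance between related sons would need to be modelled, as in the mixed model described in the text.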
Additionally, multiple SNP association tests were carried out for the most significant markers from the single SNP association test. The tests were implemented by fitting a fixed effect of the SNP in the above-mentioned model and repeating the analyses for the most significant SNPs in turn. Test statistics for the analyses were as described above.
LD and haplotype block structure of the casein region
An analysis package, CRIHAP, was developed for determining haplotypic phases and imputing missing genotypes for all individuals (Nome and Lien, unpublished). The programs are based on both linkage and linkage disequilibrium information generated by the CRI-MAP 2.4 and PHASE version 2.1 [24, 25] programs. Map information and genotypes for all animals were imported into the Haploview program to calculate LD (r²) between markers.
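For readers unfamiliar with the r² measure, a minimal sketch of its definition for two biallelic SNPs is given here. It assumes phased haplotypes with alleles coded 0/1 and is not a reimplementation of Haploview, which works from genotype data; the function name and example haplotypes are illustrative only.

```python
def r_squared(haplotypes):
    """Pairwise LD (r^2) between two biallelic SNPs.

    haplotypes: list of (a, b) pairs, one per phased haplotype, where a and b
    are the alleles (coded 0 or 1) at the first and second SNP.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Complete LD: the allele at one SNP fully predicts the allele at the other.
complete_ld = r_squared([(0, 0), (0, 0), (1, 1), (1, 1)])
# No LD: all four allele combinations are equally frequent.
no_ld = r_squared([(0, 0), (0, 1), (1, 0), (1, 1)])
print(complete_ld, no_ld)
```

Values near 1 between adjacent markers mean that neighbouring marker brackets carry nearly the same information, which is relevant when interpreting why a single bracket cannot be singled out in a high-LD segment.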
Haplotype blocks were constructed for the casein loci CSN1S1, CSN2 and CSN1S2, for which we found highly significant brackets or single SNPs associated with protein yield. A script was made to deduce maternal and paternal haplotypes for all individuals and different haplotype blocks using haplotypic phases from the CRIHAP program package. As for the single SNP analyses, DYDs of the sons were used as performance information in the analyses. The model fitted to the DYDs, for each trait and each haplotype block, was $\mathrm{DYD}_i = \mu + s_i + x_i b + a_i + e_i$, where $\mathrm{DYD}_i$ is the performance of son $i$, $\mu$ is the overall mean, $s_i$ is a fixed effect of the sire of son $i$, $x_i$ is a row vector indicating which haplotypes, and how many copies of each, are carried by the son, $b$ is a column vector of the random effects of the haplotypes, $a_i$ is a random polygenic effect of son $i$, and $e_i$ is a residual effect. The test statistic (LogL difference) was found as previously described for the single SNP association test. Phenotypic standard deviations for protein and milk yield were 36.75 kg and 1137.79 kg, respectively. These values were used to express the haplotype effects in phenotypic standard deviations of each trait for a standardised presentation.
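The row vector of haplotype copy numbers and the scaling of effects into phenotypic standard deviations can be illustrated as follows. This is a sketch with hypothetical function names and an invented effect size. Only the two standard deviations are taken from the text.

```python
# Phenotypic standard deviations quoted in the text (kg).
PHENOTYPIC_SD_KG = {"PY": 36.75, "MY": 1137.79}


def haplotype_row(paternal, maternal, haplotype_ids):
    """Row vector for one son: copies (0, 1 or 2) of each haplotype he carries."""
    return [int(paternal == h) + int(maternal == h) for h in haplotype_ids]


def scale_effect(effect_kg: float, trait: str) -> float:
    """Express a haplotype effect (kg) in phenotypic standard deviations."""
    return effect_kg / PHENOTYPIC_SD_KG[trait]


block_haplotypes = [1, 2, 3, 4, 5, 6]                   # haplotype labels in one block
heterozygous_son = haplotype_row(1, 5, block_haplotypes)  # carries haplotypes 1 and 5
homozygous_son = haplotype_row(2, 2, block_haplotypes)    # two copies of haplotype 2
scaled = scale_effect(3.675, "PY")                       # invented 3.675 kg effect on PY

print(heterozygous_son, homozygous_son, round(scaled, 3))
```

Each row sums to 2 because every son carries one paternal and one maternal haplotype, so the haplotype effects in b are interpreted per copy.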
Chromosome wide QTL scan
SNP association tests
Extent of LD and haplotype reconstruction
Level of significance of haplotype effects within locus/haplotype block for protein yield (PY) and milk yield (MY). LogL differences above 5.4 are regarded as significant (P < 0.0005)
Our analysis of a dense SNP map in the casein region using the LDLA methodology revealed a high number of significant marker brackets for protein yield, especially in CSN2 and CSN1S2 (Figure 1 and Figure 2). The fact that LDLA could not pinpoint a single marker bracket harbouring the QTL can probably be explained by a high degree of LD between the markers in the region. Analysis of the extent of LD in the region showed high LD in two segments, one consisting of CSN1S1, CSN2 and CSN1S2 and another consisting of CSN3 (Figure 7). The two segments seem to be separated by a possible recombination hotspot. Nilsen et al. have reported evidence for a recombination hotspot between CSN1S2 and CSN3, which supports these findings. Hayes et al. have also reported a recombination hotspot in the casein region in goat. Despite the fact that all four casein genes are coordinately expressed at high levels in a tissue- and stage-specific fashion, the κ-casein gene is not evolutionarily related to the three other casein genes (αs1, β and αs2). The calcium-sensitive caseins (αs1, β and αs2) have originated from a common ancestral gene via intergenic and intragenic duplications and share common regulatory motifs, whereas it has been suggested that κ-casein is related to fibrinogens on the basis of amino acid sequence similarities. This evolutionary origin may also account for the LD segmentation described in this paper.
In accordance with the LDLA results, the single SNP association tests did not detect significant results for the CSN3 region, whereas a large number of significant associations were detected between SNPs within CSN2 and CSN1S2 and both protein and milk yields. The most significant results were found for CSN2_67, CSN2-BMC_9215 and CSN1S2-BMC_17192. When CSN2_67 was fitted as a fixed effect in a multiple SNP association test, it removed almost all peaks for other markers in the region (Figure 5). This indicates that CSN2_67 is in strong LD with the underlying causal variation in Norwegian Red. However, the fact that the two SNP alleles seem to display contradictory effects in various cattle breeds [6–8, 10] argues against CSN2_67 itself being the underlying causal variation.
Notably, CSN2_67 determines the genetic variants A1/B versus A2. The C → A substitution at codon 67 results in the exchange of proline with histidine in the amino acid sequence, leading to a difference in the conformation of the secondary structure of the expressed protein. It is thought that the A allele at CSN2_67 yields the bioactive peptide beta-casomorphin 7 (BCM-7), a peptide with opioid-like effect, which may play an as yet unclear role in the development of some human diseases (for a review, see ). It has been suggested that a high consumption of A1/B milk increases the risk of type 1 (insulin-dependent) diabetes mellitus, ischaemic heart disease, sudden infant death syndrome (SIDS) and the aggravation of symptoms associated with schizophrenia and autism (reviewed in ), and may also correlate with milk allergy [39, 40] in humans.
The high degree of LD between SNPs allowed us to construct haplotypes within and across the CSN1S1, CSN2 and CSN1S2 genes and investigate associations between haplotypes and DYDs for protein yield and milk yield. Analysis for CSN2 reveals two haplotypes (2 and 5) that associate with low protein yield values, whereas four haplotypes (1, 3, 4 and 6) seem to be associated with higher PY levels (Figure 8). The difference between these two classes of haplotypes is characterized by the three SNPs CSN2-BMC_9215, CSN2_67 and CSN2-BMC_6334 (markers 11, 14 and 16, respectively; Figure 6), all of which have high LogL differences in the single SNP association test for both PY and MY.
For the CSN1S2 locus, we detected two haplotypes that seem to be associated with increased protein yield (1 and 5) whereas three haplotypes (2, 3 and 4) tend to be associated with a lower protein yield (Figure 9). CSN1S2 haplotype 5 is part of CSN2 haplotype 5 (see Figure 6). No significant haplotype was detected for CSN1S1 (data not shown). The main reason is probably that CSN2 haplotypes 1 (positive for protein yield) and 2 (negative for protein yield) combine into one frequent haplotype in CSN1S1.
For the extended block covering CSN1S1-CSN2-CSN1S2, we detected four haplotypes that associate with reduced milk and protein production (haplotypes 2, 3, 6 and 7). Interestingly, all of these haplotypes contain the A allele of CSN2_67 (the A1/B variant), in addition to the G allele of CSN2-BMC_9215 (Additional file 2). In contrast, haplotypes containing the CSN2-A2 variant tend to associate with increased milk and protein yields. As consumption of CSN2-A2 milk may have an accompanying positive effect on human health [34–40], it is recommended to increase the frequency of this allele in the Norwegian cattle population. One possible way of implementation would be to preselect calves prior to phenotype testing for growth performance and progeny testing for milk performance.
We would like to thank GENO Breeding and AI association for providing relationship information and DYDs for bulls. This project has been funded by The Research Council of Norway. The authors gratefully acknowledge the early pre-publication access under the Fort Lauderdale conventions to the draft bovine genome sequence provided by the Baylor College of Medicine Human Genome Sequencing Center and the Bovine Genome Sequencing Project Consortium.
- Khatkar MS, Thomson PC, Tammen I, Raadsma HW: Quantitative trait loci mapping in dairy cattle: review and meta-analysis. Genet Sel Evol. 2004, 36: 163-190. 10.1051/gse:2003057.
- Smaragdov MG: Genetic mapping of loci responsible for milk quality parameters in dairy cattle. Genetika. 2006, 42: 5-21.
- Olsen HG, Lien S, Gautier M, Nilsen H, Roseth A, Berg PR, Sundsaasen KK, Svendsen M, Meuwissen THE: Mapping of a milk production quantitative trait locus to a 420-kb region on bovine chromosome 6. Genetics. 2005, 169: 275-283. 10.1534/genetics.104.031559.
- Cohen-Zinder M, Seroussi E, Larkin DM, Loor JJ, Everts-van der Wind A, Lee JH, Drackley JK, Band MR, Hernandez AG, Shani M, Lewin HA, Weller JI, Ron M: Identification of a missense mutation in the bovine ABCG2 gene with a major effect on the QTL on chromosome 6 affecting milk yield and composition in Holstein cattle. Genome Res. 2005, 15: 936-944. 10.1101/gr.3806705.
- Olsen HG, Nilsen H, Hayes B, Berg PR, Svendsen M, Lien S, Meuwissen T: Genetic support for a quantitative trait nucleotide in the ABCG2 gene affecting milk composition of dairy cattle. BMC Genetics. 2007, 8: 32. 10.1186/1471-2156-8-32.
- Boettcher PJ, Caroli A, Stella A, Chessa S, Budelli E, Canavesi F, Ghiroldi S, Pagnacco G: Effects of casein haplotypes on milk production traits in Italian Holstein and Brown Swiss cattle. J Dairy Sci. 2004, 87: 4311-4317.
- Bovenhuis H, Weller JI: Mapping and analysis of dairy cattle quantitative trait loci by maximum likelihood methodology using milk protein genes as genetic markers. Genetics. 1994, 137: 267-280.
- Ikonen T, Bovenhuis H, Ojala M, Ruottinen O, Georges M: Associations between casein haplotypes and first lactation milk production traits in Finnish Ayrshire cows. J Dairy Sci. 2001, 84: 507-514.
- Lien S, Gomez-Raya L, Steine T, Fimland E, Rogne S: Associations between casein haplotypes and milk yield traits. J Dairy Sci. 1995, 78: 2047-2056.
- Velmala R, Vilkki J, Elo K, Maki-Tanila A: Casein haplotypes and their association with milk production traits in the Finnish Ayrshire cattle. Anim Genet. 1995, 26: 419-425.
- Velmala RJ, Vilkki HJ, Elo KT, de Koning DJ, Maki-Tanila AV: A search for quantitative trait loci for milk production traits on chromosome 6 in Finnish Ayrshire cattle. Anim Genet. 1999, 30: 136-143. 10.1046/j.1365-2052.1999.00435.x.
- Farrell HM, Jimenez-Flores R, Bleck GT, Brown EM, Butler JE, Creamer LK, Hicks CL, Hollar CM, Ng-Kwai-Hang KF, Swaisgood HE: Nomenclature of the proteins of cows' milk – sixth revision. J Dairy Sci. 2004, 87: 1641-1674.
- Ferretti L, Leone P, Sgaramella V: Long range restriction analysis of the bovine casein genes. Nucleic Acids Res. 1990, 18: 6829-6833. 10.1093/nar/18.23.6829.
- Threadgill DW, Womack JE: Genomic analysis of the major bovine milk protein genes. Nucleic Acids Res. 1990, 18: 6935-6942. 10.1093/nar/18.23.6935.
- Martin P, Szymanowska M, Zwierzchowski L, Leroux C: The impact of genetic polymorphisms on the protein composition of ruminant milks. Reprod Nutr Dev. 2002, 42: 433-459. 10.1051/rnd:2002036.
- Schild TA, Geldermann H: Variants within the 5'-flanking regions of bovine milk-protein-encoding genes. III. Genes encoding the Ca-sensitive caseins αs1, αs2 and β. Theor Appl Genet. 1996, 93: 887-893. 10.1007/BF00224090.
- Svendsen M, Heringstad B: New genetic evaluation for clinical mastitis in multiparous Norwegian Red cows. Interbull Bull. 2006, 35: 8-11.
- Nilsen H, Hayes B, Berg PR, Roseth A, Sundsaasen KK, Nilsen K, Lien S: Construction of a dense SNP map for bovine chromosome 6 to assist the assembly of the bovine genome sequence. Anim Genet. 2008, 39: 97-104. 10.1111/j.1365-2052.2007.01686.x.
- Lien S, Rogne S: Bovine casein haplotypes: number, frequencies and applicability as genetic markers. Anim Genet. 1993, 24: 373-376.
- Bovine Genome Project. [http://www.hgsc.bcm.tmc.edu/]
- Gilmour AR, Cullis BR, Welham SJ, Thompson R: ASREML reference manual. 2001, New South Wales Agriculture.
- Baret PV, Knott SA, Visscher PM: On the use of linear regression and maximum likelihood for QTL mapping in half-sib designs. Genet Res. 1998, 72: 149-158. 10.1017/S0016672398003450.
- Green P, Falls K, Crooks S: Documentation for CRI-MAP, version 2.4. 1990, Washington University School of Medicine St. Louis.
- Stephens M, Donnelly P: A comparison of bayesian methods for haplotype reconstruction from population genotype data. Am J Hum Genet. 2003, 73: 1162-1169. 10.1086/379378.
- Stephens M, Smith NJ, Donnelly P: A new statistical method for haplotype reconstruction from population data. Am J Hum Genet. 2001, 68: 978-989. 10.1086/319501.
- Barrett JC, Fry B, Maller J, Daly MJ: Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005, 21: 263-265. 10.1093/bioinformatics/bth457.
- Nilsen H, Olsen HG, Hayes B, Nome T, Svendsen M, Meuwissen T, Lien S: Identification of a haplotype on bovine chromosome 6 reducing clinical mastitis while simultaneously increasing protein yield. Anim Genet.
- Hayes B, Hagesaether N, Adnoy T, Pellerud G, Berg PR, Lien S: Effects on production traits of haplotypes among casein genes in Norwegian goats and evidence for a site of preferential recombination. Genetics. 2006, 174: 455-464. 10.1534/genetics.106.058966.
- Alexander LJ, Stewart AF, Mackinlay AG, Kapelinskaya TV, Tkach TM, Gorodetsky SI: Isolation and characterization of the bovine kappa-casein gene. Eur J Biochem. 1988, 178: 395-401. 10.1111/j.1432-1033.1988.tb14463.x.
- Groenen MA, Dijkhof RJ, Verstege AJ, Poel van der JJ: The complete sequence of the gene encoding bovine alpha s2-casein. Gene. 1993, 123: 187-193. 10.1016/0378-1119(93)90123-K.
- Groenen MA, Dijkhof RJ, Poel van der JJ, van Diggelen R, Verstege E: Multiple octamer binding sites in the promoter region of the bovine alpha s2-casein gene. Nucleic Acids Res. 1992, 20: 4311-4318. 10.1093/nar/20.16.4311.
- Jolles P, Loucheux-Lefebvre MH, Henschen A: Structural relatedness of kappa-casein and fibrinogen gamma-chain. J Mol Evol. 1978, 11: 271-277. 10.1007/BF01733837.
- Groves ML: Some minor components of casein and other phosphoproteins in milk. A review. J Dairy Sci. 1969, 52: 1155-1165.
- Kaminski S, Cieslinska A, Kostyra E: Polymorphism of bovine beta-casein and its potential effect on human health. J Appl Genet. 2007, 48: 189-198.
- Elliott RB, Harris DP, Hill JP, Bibby NJ, Wasmuth HE: Type I (insulin-dependent) diabetes mellitus and cow milk: casein variant consumption. Diabetologia. 1999, 42: 292-296. 10.1007/s001250051153.
- McLachlan CN: Beta-casein A1, ischaemic heart disease mortality, and other illnesses. Med Hypotheses. 2001, 56: 262-272. 10.1054/mehy.2000.1265.
- Sun Z, Zhang Z, Wang X, Cade R, Elmir Z, Fregly M: Relation of beta-casomorphin to apnea in sudden infant death syndrome. Peptides. 2003, 24: 937-943. 10.1016/S0196-9781(03)00156-6.
- Knivsberg AM, Reichelt KL, Nodland M: Reports on dietary intervention in autistic disorders. Nutr Neurosci. 2001, 4: 25-37.
- Chatchatee P, Jarvinen KM, Bardina L, Beyer K, Sampson HA: Identification of IgE- and IgG-binding epitopes on alpha(s1)-casein: differences in patients with persistent and transient cow's milk allergy. J Allergy Clin Immunol. 2001, 107: 379-383. 10.1067/mai.2001.112372.
- Chatchatee P, Jarvinen KM, Bardina L, Vila L, Beyer K, Sampson HA: Identification of IgE and IgG binding epitopes on beta- and kappa-casein in cow's milk allergic patients. Clin Exp Allergy. 2001, 31: 1256-1262. 10.1046/j.1365-2222.2001.01167.x.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.