Populations of White Leghorn chickens used to fine-map the QTL for bone strength and understand its function
Population 1
Population 1 was an F2 population (n = 372) that was described previously [16]. It was created by crossing high and low bone quality lines produced by divergent selection of the founder breed [8, 16] (Fig. 1). The QTL on chromosome 1 was originally discovered in this population. The population was used to improve the precision of the location of the QTL with informative single nucleotide polymorphisms (SNPs) at the locus. The measured phenotype was tibial breaking strength and genotyping was performed with microsatellite markers as described previously [16] and SNPs that are listed in Additional file 1: Table S1.
Population 2
Population 2 was a later generation (sampled in 2006, see Fig. 1) of the founder breed used to generate the divergently selected population that was used to create the F2 population where the QTL was discovered [16]. It is a White Leghorn breed used in the production of LSL hybrid layers (Lohmann Tierzucht GmbH). With the ability to obtain high-density genotyping and a larger number of animals, the population allowed better precision for mapping the QTL. As in all the populations for which bones were measured in this study, hens were housed in individual cages to facilitate egg recording. Phenotype for tibia breaking strength, body weight and egg production were available for 1595 hens. DNA was prepared from red blood cells using DNAzol (Invitrogen). Hens were genotyped for 144 SNPs as described in the section on genetic markers below and in Additional file 2: Table S2. To increase power and reduce cost only the top and tail of the population were genotyped. Bone strength is influenced by body weight in hens, as witnessed by its negative weighting in the bone index used to divergently select hens that formed the founder population for the QTL analysis [8]. For this reason and to avoid biasing the sample towards heavier and lighter hens, we chose to select equal numbers of hens from the top and tail of the distribution of the population after ranking the hens using the residuals generated after fitting body weight using regression analysis. The equation derived from the data was tibia breaking strength (N) = 53.0 + 0.0946 body weight (g). Hens laying less than 230 eggs were removed, since these can have stronger bones and do not form part of the normal distribution, as was done in the original QTL study [16]. Equal numbers of individuals from the top and the tail of the population were selected and 998 animals were genotyped. This represented 63% of the population. The phenotype was tibial breaking strength and genotyping was performed with 144 SNPs on chromosome 1.
Further studies were undertaken and samples were collected from the founder breed in subsequent years to characterise the locus and its effects.
Population 3
Samples (n = 111) for genotyping and phenotyping were taken from the 2010 generation (Fig. 1) of the founder breed using a targeted sampling approach, which was used to provide further independent confirmation of the locus effect. As in the other populations, only hens laying more than 230 eggs were analysed. To obtain the best power with the resources available, samples were collected only from hens at post-mortem if, after being tested for breaking strength, the bones were more than 1.5 standard deviations from their predicted breaking strength (PBS). PBS was calculated using a training set of 200 hens. The equation was produced by regressing tibial breaking strength against body weight and total egg production. (PBS = 90.5 + (− 0.687 egg production) + (0.1239 bodyweight)). The distribution of the ‘residual’ between the observed and predicted breaking strength based on the 200-hen training set was used to define 1.5 standard deviations. This ensured that we did not bias the results since larger hens have stronger bones and hens that lay significantly fewer eggs due to pauses in production can have stronger bones as do out-of-lay hens [18]. Hens with four or less follicles or with any evidence of reproduction abnormality such as internal ovulations were rejected at post-mortem. The phenotype was tibial breaking strength and genotyping was performed with the Ost106225194 and Ost112522587 SNPs.
Population 4
Samples of mid-shaft tibia bone were collected from the 2012 generation (Fig. 1) for the preparation of RNA for next-generation sequencing (NGS) and to make bone density measurements. This allowed the characterisation of expression differences related to the QTL. Thirteen sires and 50 dams were genotyped for SNPs Ost106225194 and Ost112522587 and offspring from heterozygous individuals were prioritised for the detection of homozygotes by genotyping the same markers. Homozygous hens (n = 34 AA/AA; n = 22 GG/GG) were phenotyped at the end of their productive cycle at 70 weeks of age. RNA was prepared from mid-shaft tibia. The measured phenotype was mid-shaft cross-sections of tibia that were taken for radiographic measurement of cortical and medullary bone density, volume and area. The animals used for NGS of bone RNA were selected from these (see section on NGS). Genotyping was performed with Ost106225194 and Ost112522587 SNPs.
Population 5
Samples were obtained from the 2013 generation (n = 66) using the same sampling strategy as for Population 4, but samples were collected for evaluating specific properties of bone material at 70 weeks of age to understand how the genotype might result in the observed bone strength phenotype (Fig. 1). The measured phenotype was concentration of homocysteine in the plasma towards the end of peak egg production (48 weeks of age) and end of production (70 weeks of age). A subset of these animals, n = 19 per genotype, with matching physiology with regard to stage of egg formation were further analysed for bone mechanical properties, bone microstructure, chemical composition of the cortical and medullary bone, mineral crystallinity and crystal orientation and collagen maturity using infrared spectroscopy and X-ray diffraction techniques. Genotyping was performed with Ost106225194 and Ost112522587 SNPs (n = 33 AA/AA; n = 33 GG/GG).
Population 6
Population 6 was a White Leghorn line that was not related to the other populations. We used it to provide material for the measurement of CBS enzyme activity as described in the respective sub-section. It consisted of embryos from a White Leghorn population maintained at the Roslin Institute, which was found to carry the SNP rs316554658 that resulted in a predicted amino acid difference. Sires and dams heterozygous for this SNP were bred and the resultant eggs were incubated. At day 15, the embryo livers were harvested and stored at -20 °C prior to use in the assay, and a sample of blood was taken for DNA preparation to confirm the genotype of the embryo. The measured phenotype was CBS enzyme kinetics and genotyping was done with SNP rs316554658.
Genetic markers
For fine mapping, SNPs in the QTL region were sourced in a number of ways. SNPs were sourced from dbSNP [19] along chicken chromosome 1 for use in the F2 cross. A more targeted set of SNPs was defined from sequence information [20] from White Leghorn breeds between 81.1 and 128.8 Mb on chromosome 1 (galGal6). This corresponded to the 95% confidence interval (CI) of the original QTL. The number of SNPs was reduced from 91,455 to 15,761 by using only those segregating in all the sequenced breeds, and finally to 193 using those that had a minor allele frequency (MAF) lower than 0.2 in all three breeds. This list was reduced furthermore by removing SNPs in close proximity to each other and by biasing the density to be higher at the perceived peak and reduced at the tails of the region. Around 25 SNPs failed the test for ability to construct an Illumina Golden Gate assay. The SNP list was supplemented by SNPs that were derived from the alignment of EST sequences from the following genes located in the target region: RGN, EFHC2, EGFL6, DMD, PHEX, AGPAT3, PRDM15, MX1, PIGP, MRPS6, DDX3X, RPL8, ATP6AP2, PDK3 and APOO. The full list of genotyped SNPs is in Additional file 2: Table S2. One hundred and forty-four SNPs were genotyped on Population 2 using the Illumina Golden Gate assay on the Illumina BeadXpress platform. The Illumina Bead Studio Genotyping Module software was used to analyse SNP data and, as a first quality control, to remove poor quality or uninformative SNPs. From the 119 SNPs obtained from sequence data, only 13 were informative in the F2 cross, whereas in the case of the SNPs discovered using EST sequences, 19 out of 39 were informative. The 95% CI of the QTL was chr1: 75,624,794-124,047,308 bp (galGal6) and the region covered by the new markers was chr1: 77,312,158-124,272,094 with an average distance of 0.336 Mb (SD 0.183) between SNPs.
For Population 1, the 32 informative SNPs from those detailed above were genotyped using the Illumina Golden Gate assay on the Illumina BeadXpress platform. The Bead Studio Genotyping Module software was used to analyse SNP data. These SNPs were added to the genetic map for chromosome 1 to determine the QTL position in the F2 Population 2. The new F2 map is in Additional file 1: Table S1. The map was used as previously [16] using the QTL mapping method of Haley et al. [21] but implemented using gridQTL [22].
Small-scale genotyping
Genotyping with SNPs Ost112522587 and Ost106225194 to define parents or individuals for Populations 3, 4 and 5 with the desired genotype was carried out by LGC (LGC, Middlesex, UK), assay reference Chr1_112522587200 and Chr1_1062251946, respectively.
SNP rs316554658, which was predicted to result in an amino-acid change, was diagnosed using RFLP after amplification with primers CBS_NS_F (5′ CGT CTG GTG AAG GGG AAT AA 3′) and CBS_NS_R (5′ TCC CTT TTC AGC TGC TCA GT 3′). This resulted in an amplified product of 596 bp using genomic DNA as target (Chromosome 1, 106,922,247–106,922,842). When digested with the restriction enzyme HpyCH4V, the amplification of allele C gives products of 289, 166, 93 and 48 bp (the restriction site at position 289 in the PCR product is part of a codon for Q at position 498 in the CBS protein, therefore CC). Digestion of the allele A product (part of a codon for K at position 498 in the CBS protein) gives digestion products of 382, 166 and 48 bp. This was used to genotype Population 6 to allow identification of individuals that differed in their CBS amino acid sequence and to collect tissue to assess the function of the enzyme.
RNA seq
Mid-shaft tibia bone total RNA samples were prepared from 70-week old hens from Population 4 for which the egg was in the shell gland to reduce any effect of the egg calcification cycle. RNA was prepared using the TRIzol reagent according to the manufacturer’s instructions (Invitrogen Ltd., Renfrewshire, Scotland). RNA was treated with DNAse I and purified using an RNeasy Mini Kit following the manufacturer’s instructions (Qiagen, Manchester, England), concentration and quality were checked with a Nanodrop ND-1000 spectrophotometer (Thermo Scientific; Waltham, MA USA). Half of the samples were homozygous for the low bone breaking strength genotype (n = 8) and the other half (n = 8) were homozygous for the high bone breaking strength genotype. Total RNA samples (1 µg) were prepared for mRNA sequencing using the Illumina Truseq RNA Sequencing protocol. Resulting libraries were quality-checked on an Agilent DNA 1000 bioanalyzer (Agilent Technologies, South Queensferry, UK) and then clustered onto a paired end flowcell using the Illumina v3 cluster generation kit at a concentration of 8 pM. One hundred cycle paired-ended sequencing was carried out on the HiSeq 2000 using Illumina v2 Sequencing by Synthesis kits (Illumina, Little Chesterford, UK). An Illumina HiSeq 2000 platform at Edinburgh Genomics generated between 40 and 60 million RNAseq reads per sample (819 million in total). Quality control of the raw data was evaluated using the FastQC package [23] (Babraham bioinformatics, Cambridgeshire, England). Reads were adapter-trimmed using cutadapt version 1.3 [24] with the parameters-q 30-m 50-a AGATCGGAAGAGC. Differential expression of genes or tags was assessed using edgeR version 3.6.8 [25], a package in the bioconductor suite [26] implemented in R [27]. The likelihood that the expression of genes differed between the genotypes was estimated using a general linear model in the edgeR package. Genes with a false discovery rate less than 0.05 were reported as of interest. A fuller version is available in Additional file 3. The data were submitted to the European nucleotide archive (study accession PRJEB6782) and have been used in the annotation of the chicken genome [28].
Examination of genes of interest using the NGS data was performed manually using Tablet v1.17.08.17 [29] to ascertain the location of polymorphisms in each bird and these were annotated to the sequence using Seqbuild (DNASTAR, Inc. Madison, WI). Assembly of short reads to produce the CBS mRNA sequence was made using Staden Gap5 v1.2.14-r [30].
The CBS gene
Understanding the CBS gene in the chicken: sequence and variant expression
The CBS gene is considered to be present in the chicken genome based on a prediction. It has the Ensembl transcript ID ENSGALT00000026110 at position chr1:111,011,700–111,029,467 (galGal6). Another gene shares sequence similarity in some of the exons with CBS and is named the CBS-like gene (CBSL); it has the Ensembl transcript ID ENSGALT00000026104 and is located upstream from CBS at position chr1:110,980,772–110,999,637 (galGal6). ENSGALT00000026110 and ENSGALT00000026104 are 1836 bp and 1938 bp long, respectively, and the identity over the regions that align is equal to 73.5% (1106/1505) with 14 gaps using ‘Matcher’ in mEMBOSS [31]. The sequences differ sufficiently for defining specific PCR assays to investigate their relative expression.
CBS quantitative PCR
CBS gene expression was measured using quantitative PCR (qPCR); a standard curve was used to quantify expression of CBS and measurements were standardised using the lamin B receptor (LBR) as reference gene, the expression of which did not differ between genotypes (P = 0.585). The overall methodology for quantification was as previously described in [32, 33].
cDNA for measurement was prepared from mid-shaft tibia samples from the Ost112522587/Ost106225194, AA/AA (n = 36) and GG/GG (n = 25) genotypes from Population 4. Because two versions of the gene appear to exist in the chicken genome as stated above, CBS and CBSL, we measured both to confirm which gene was differentially expressed. In the process, we established that the CBSL gene was not highly expressed using the ΔΔct method [34] to estimate the fold difference in expression of the two genes in the same samples. The CBS primers for qPCR avoided polymorphisms that were known to be within the locus: CBS-F2a, TTGGGCTGAAGTGTGAACTC; CBS-R2, TCAGGACATCCACCTTCTCC; product length 233 and for CBSlikeF2, GCTCCGGAGTCTAACATTCG; CBSlikeR, ATCACCACCATGTGGACCTT product length 164 bp. Using 16 samples of RNA extracted from mid-shaft tibia bone for NGS, which represented eight samples of each genotype (Ost112522587/Ost106225194, AA/AA vs. GG/GG) that were in the same reproductive state, the version of CBS represented by ENSGALT00000026110 was expressed on average 815 ± 157-fold higher than CBSL when calculated from the quantitative PCR data. For this reason, we confined subsequent measurements to the CBS gene.
CBS allelic imbalance
Allelic imbalance of CBS expression in hens that were heterozygous at the Ost112522587 SNP from Population 4 was determined by amplification of cDNA derived from bone of 10 heterozygous hens using primers CBSgenoF1, GTGGAACGTCAGTGTTCAGG; CBSgenoR1, AAGGCTGAACTTTTCCAGCA followed by digestion with HpyCH4V restriction enzyme. This yields DNA products of 132 and 68 bp for the A allele associated with low bone strength or leaves the fragment uncut at 200 bp for the B allele associated with high bone quality. Products were run on a 3% agarose gel containing Sybrsafe (Invitrogen, Paisley, Scotland). The intensity of the bands was calculated using ImageJ 1.32 (http://imagej.nih.gov/ij/) on images taken using a G:Box imager (Syngene, Cambridge, UK). The sum of the intensity of the 132-bp and 68-bp band was compared with the 200-bp band intensity expressed as a fraction of the total area under the curve using a paired t test.
CBS enzyme activity assay in livers of embryos expressing protein from allelic variants
A coding variant was detected at position 498 in CBS coding for glutamine (Q) or lysine (K) giving rise to two allozymes. Estimation of the activity of CBS in liver expressing the protein that contains these alternative amino-acids was based on the production of cystathionine in the presence of varying substrates (L-homocysteine 0.1–5 mmol/L, serine constant at 5 mmol/L) and cofactors (S adenosylmethionine 200 µmol/L and pyridoxal phosphate 50 µmol/L [35]. All reagents were obtained from Sigma-Aldrich. In brief, liver samples from day-15 embryos harvested from Population 6, which segregated for the alternative alleles, were homogenized (Ultraturrax, IKA-Werke GmbH & Co. KG, Germany) in an extraction buffer containing protease inhibitor. Protein concentrations of the homogenate supernatant were estimated using a Pierce™ Coomassie blue assay (Thermo Fisher, UK) and 100 µg of protein was included in the assay in a total volume of 50 µl and incubated for 1 h at 37 °C. Finally, 150 µl of acetonitrile (VWR Chemicals Leicestershire, UK) was added to precipitate proteins and the supernatant retained for liquid chromatography–mass spectrometry (LC–MS) measurement, see below. The Michaelis–Menten constant (Km) and the maximum activity (Vmax) were estimated by plotting a double reciprocal plot using the rate of cystathionine production at each homocysteine concentration; 50, 40, 30, 20, 10, 5, 2 and 1 mM.
Measurement of homocysteine and cystathionine using LC–MS
LC–MS measurements for homocysteine and cystathionine levels in samples from the CBS enzyme activity assay were performed by a selected reaction monitoring assay on an amaZon ETD IonTrap Mass spectrometer (Bruker Daltonics, GmbH, Bremen, Germany) coupled to an Ultimate HPLC (Dionex) system, for more details (see Additional files 3 and 4). In addition, the method was used to measure plasma homocysteine for validation of the commercial kit as described below.
Plasma homocysteine measurement
Plasma homocysteine was measured using the kit HY4036 (Randox laboratories, County Antrim, UK) based on enzymatic conversion of homocysteine to cystathionine by CBS. Plasma was treated prior to measurement with lipoclear (Vetlab supplies, West Sussex, UK) to remove circulating lipids. In humans, the homocysteine assay relies on low endogenous levels of circulating cystathionine to work, but it was not clear if this was true in chickens. To ascertain if concentrations measured by the kit were the same as those measured using LC–MS, a set of samples was measured by both methods. Comparison of the respective measurements resulted in an R-squared value of 92%, which suggested that the biochemical method worked adequately to detect homocysteine in chicken plasma.
Bone material properties
Breaking strength
Among the main morphological and biomechanical properties, tibia breaking strength was determined by a three-point bending test using a material testing machine (JJ Lloyd LRX50, Sussex, UK) as previously described [7].
Radiographic density
Cross-sections of the tibia bone were radiographed in a Faxitron 43855D soft X ray apparatus using Kodak MRE-1 high-resolution mammography film in Min-R2 cassettes with a single Min-R intensifying screen. Each exposed plate included a 16-step aluminium wedge, with 0.25-mm increments, for calibration purposes. Exposed films were developed using an automatic processor and then digitised using a Kodak LS-75 film scanner. Measurement of the radiographic density and proportion of medullary and cortical bone in the tibia was made using the software package ImageJ 1.32 (http://rsb.info.nih.gov/ij/). Each tibia bone was automatically delineated from the background and the mean radiographic density (pre-calibrated in mm of aluminium equivalent) of the whole bone was measured. The proportion of medullary and cortical bone type was calculated directly from the X-ray by delineation [36].
Bone chemistry and structure
Bone samples: Tibia bones were stored in a freezer at − 20 °C until analysed for bone physicochemical material specific properties (e.g., bone microstructure, chemical composition of the cortical and medullary bone, mineral crystallinity and crystal orientation, and collagen maturity) using infrared spectroscopy and X-ray diffraction techniques as fully described in Additional file 3 and briefly below.
Infrared spectrometry: The chemical composition of bone tissues (cortical and medullary bone) were analyzed by infrared spectroscopy as previously described [37]. The relative amounts of water, proteins (collagen), lipids, phosphate and carbonate in the bone samples were determined from the peak area of the absorption bands associated with the characteristic molecular groups of each component [38, 39]. In addition, the absolute water, organic matter, carbonate and phosphate contents in bone were determined by thermogravimetry (TGA) in selected samples. For these analyses, about 25 mg of the powdered bone were introduced into a crucible and analysed using a TGA system from METTLER-TOLEDO (mod. TGA/DSC1). A heating rate of 20 °C/min was used for registering the TGA curves.
X-ray diffraction: Tibiae cortical bone (about 1 × 1 cm) samples cut from the diaphysis were analyzed in transmission mode with a single crystal diffractometer equipped with an area detector (D8 SMART APEX from Bruker) and Mo radiation (50 kV and 30 mA; 0.5 mm collimator). A quantitative estimation of the degree of orientation of apatite crystals (Angular spread; AS) in the cortical bone was determined from the angular breadth of bands displayed in the intensity profile along the Debye–Scherrer ring associated with the 002 reflection of apatite mineral [37].