Weighting sequence variants based on their annotation increases the power of genome-wide association studies in dairy cattle

Background Genome-wide association studies (GWAS) are widely used to identify regions of the genome that harbor genetic determinants of quantitative traits. However, the multiple-testing burden from scanning tens of millions of whole-genome sequence variants reduces the power to identify associated variants, especially if sample size is limited. In addition, factors such as inaccuracy of imputation, complex linkage disequilibrium structures, and multiple closely-located causal variants may result in an identified causative mutation not being the most significant single nucleotide polymorphism in a particular genomic region. Therefore, the use of information from different sources, particularly variant annotations, was proposed to enhance the fine-mapping of causal variants. Here, we tested whether applying significance thresholds based on variant annotation categories increases the power of GWAS compared with a flat Bonferroni multiple-testing correction. Results Whole-genome sequence variants in dairy cattle were categorized according to type and predicted impact. Then, GWAS between markers and 17 quantitative traits were analyzed for enrichment for association of each annotation category. By using annotation categories that were determined with the variants effect predictor software and datasets indicating regions of open chromatin, “low impact” variants were found to be highly enriched. Moreover, when the variants annotated as “modifier” and not located at open chromatin regions were further classified into different types of potential regulatory elements, the high impact variants, moderate impact variants, variants located in the 3′ and 5′ untranslated regions, and variants located in potential non-coding RNA regions exhibited relatively more enrichment. In contrast, a similar study on human GWAS data reported that enrichment of association signals was highest with high impact variants. We observed an increase in power when these variant category-based significance thresholds were applied for GWAS results on stature in Nordic Holstein cattle, as more candidate genes from previous large GWAS meta-analysis for cattle stature were confirmed. Conclusions Use of variant category-based genome-wide significance thresholds can marginally increase the power to detect the candidate genes in cattle. With the continued improvements in annotation of the bovine genome, we anticipate that the growing usefulness of variant category-based significance thresholds will be demonstrated. Electronic supplementary material The online version of this article (10.1186/s12711-019-0463-9) contains supplementary material, which is available to authorized users.


Background
Cattle is one of the most important domestic animals in human history. Both breeding programs and genetic studies in cattle depend largely on the availability of a reliable cattle reference genome [1] and reference populations [2]. In addition, genome-wide association studies (GWAS) have identified valuable links between genetic variants and variations in complex traits [3,4]. For example, numerous GWAS have been conducted in cattle to investigate production traits such as milk yield [5,6], milk composition [7], and mastitis [8][9][10]. However, GWAS alone cannot distinguish causative variants from variants which are in perfect, or near-perfect, linkage disequilibrium (LD) with them.
To address this problem, additional information from independent sources are needed. For example, gene expression data have facilitated the identification of candidate genes from GWAS data [11], and expression quantitative trait loci (eQTL) data [12,13] have helped map causative variants within regulatory regions. More recently, Brown et al. [14] proposed a 'causal-variant evidence mapping using nonparametric resampling' (CaVE-MaN) method to pinpoint causative mutations in eQTL studies. By integrating eQTL data with results from GWAS, genes with expression levels that are associated with complex traits due to pleiotropic effects (e.g., when both gene expression and trait variation are affected) can be identified [13]. However, large-scale eQTL studies can be expensive because they require generation of RNAseq data specific to the population under study, especially in the case of livestock species for which initiatives such as the GTEx [15] project in humans do not exist.
Due to many reasons, such as LD, inaccuracy of imputation, random sampling errors, etc., the lead single nucleotide polymorphism (SNP) may not be the causative one [6]. Using additional information to prioritize variants within the QTL interval has become a popular strategy [16]. It was recently demonstrated that the use of a variant annotation tool [17] and its evolutionary conservation score [16] can help prioritize variants. In particular, variant annotation can be used across different studies without being tissue-or trait-specific. The power to identify associations between genetic variants and phenotypes may be further improved by using functional annotation information [18][19][20]. For example, Sveinbjornsson et al. [21] reported an increase in power for the detection of associations when an annotation enrichment-based weighted Bonferroni adjustment was used to correct for family-wise error rate (FWER).
In this study, we implemented a previously proposed, category-based Bonferroni adjustment based on the enrichment (the probability of a causal variant being from a category divided by the probability of a noncausal variant being from the same category) of variant annotations observed for association signals [21] that were obtained from a GWAS conducted in Nordic Holstein cattle. This adjustment is based on the hypothesis that different types of variants have varying probabilities of being causal mutations, which means the enriched categories of variants could have lower thresholds estimated from their enrichment. The GWAS results for 17 quantitative traits were used to extract the lead SNP along with other significant SNPs showing LD (r 2 > 0.2) with the lead SNPs, as potential causal variants to estimate the enrichment of each of the annotated variants' categories. We make the hypothesis that the category-based significance threshold will increase the power of a GWAS study. We tested this hypothesis by performing an association study on stature in cattle and comparing the results with those of a previously reported large meta-analysis in cattle stature and genes reported for human height.

Phenotype and genotype data
Since, no animal experiments were performed in this study, approval from an ethics committee was not required.
Phenotypic records on 17 traits/indices for Nordic Holstein cattle were obtained from a central national database (Nordic Cattle Genetic Evaluation (NAV), http://www. nordi cebv.info/). For details on the genetic evaluations performed for these 17 traits/indices in Nordic countries, see http://www.nordi cebv.info/produ ction . The phenotypic values used in the association analysis included deregressed proofs that were derived for animals based on the effective daughter contributions of sires and maternal grandsires [22,23], which were obtained from the NAV routine genetic evaluations by using the MiX99 software [24]. De-regressed proofs were available for 5373 sires (the total number of animals varying according to trait). A short description of the 17 traits/indices is presented in Additional file 1: Table S1.
An association study was performed by using imputed WGS data, as previously described by Iso-Touru et al. [5] and Wu et al. [25]. A total of 4921 bulls were genotyped with versions 1 or 2 of the Illumina BovineSNP50 Bead-Chip (54 k) system (Illumina, San Diego, CA, USA). The 54 k genotypes were imputed to the WGS level by using a 2-step approach [26]. First, all the animals were imputed to a high-density (HD) level, by using IMPUTE2 v2.3.1 and a multi-breed reference of 3383 animals (1222 Holsteins, 1326 Nordic Red Dairy Cattle, and 835 Danish Jerseys), which had previously been genotyped with the Illumina Bovine HD BeadChip [27]. The distribution of imputation accuracies according to minor allele frequency is described in [25]. These imputed HD genotypes were imputed with Minimac2 [28] [2] and the whole-genome sequence data from Aarhus University are described in Brøndum et al. [29]. A total of 22,751,039 bi-allelic variants were present in these imputed sequence data. After excluding SNPs with a minor allele frequency lower than 1%, and SNPs that deviate from Hardy-Weinberg proportions (P < 1.0 −6 ), 16,503,508 SNPs on 29 autosomes in Nordic Holstein cattle were retained for association analyses.

Methodology for the detection of multiple QTL
We performed a GWAS according to a previously described approach [6]. First, a single SNP GWAS analysis was performed by using GCTA [30] for each chromosome as the first round. Next, SNPs were ranked based on their − log 10 (P) values. The SNP with the highest − log 10 (P) value, referred to as the lead SNP, was identified for each chromosome. If the − log 10 (P) value of the lead SNP exceeded 8.5 (a threshold value representing a 0.05 type I error-rate after Bonferroni correction for 16,503,508 simultaneous tests, e.g., − log 10 (P) ≈ 8.5), the SNP genotype dosage was fitted as a covariate, and rerun in association analyses for the same chromosome as a second round. If the result of this second round detected another SNP with a − log 10 (P) value exceeding 8.5, and this SNP was also significant in the first round (e.g., − log 10 (P) > 8.5), we fitted it as another covariate, and then scanned the chromosome in a third round. This same procedure was repeated for each chromosome until no additional SNPs remained significant. A list of the lead SNPs identified in each round was compiled. In each round, we checked whether the lead SNP was the only significant SNP identified within ± 1 Mb of flanking region. If it was, the SNP was not considered as a lead SNP since it could represent a false positive or be mapped to a wrong location in the genome. Details regarding the 17 traits and the GWAS results are in Additional file 1: Table S1.

GWAS for stature in Nordic Holstein cattle
Stature in cattle is measured from the top of the spine between the hips to the ground. In Denmark, this trait is measured in cm. We performed a GWAS for stature according to the method described above. However, first we removed extreme phenotypic records according to Tukey's rules of quartiles ± 1.5 × interquartile range. The remaining 4832 phenotypic records were associated with 15,535,049 imputed SNPs. The number of markers used for association with stature differs from that used in the GWAS conducted for the 17 traits (described above) since the set of sires was not exactly the same in both analyses.

LD estimation and variant annotation
PLINK was used to estimate pairwise LD (r 2 ) between lead SNPs and all the other SNPs on the same chromosome. All SNPs that had an r 2 with the lead SNPs higher than 0.2 were extracted. The SNPs that were not significant in the association study were discarded in order to generate a list of possible causal variants. These SNPs were annotated with the variants effect predictor (VEP) (version 92.0) software [17]. The variants were subsequently classified into annotation categories according to the impact for the consequence type predicted by VEP. When a SNP had multiple annotations, the annotation with the highest impact predicted by VEP was retained. Information on transposase-accessible chromatin, i.e. ATAC-seq peaks [31], as well as histone modifications, i.e. H3K27Ac and H3K4me3 peaks [32], were retrieved from previously published studies. The locations of the UTR regions were obtained from Ensembl [33]. The locations of predicted regulatory elements (RE) were also obtained from a previous study [34], while the locations of non-coding RNAs (ncRNAs) were retrieved from the RNAcentral database [35].

Assessment of category enrichment and category-based Bonferroni correction
Methods to assess the enrichment of each category, the enrichment confidence intervals, and weighted Bonferroni corrections were previously described [21]. We classified all the variants based on the VEP annotation: (1) high impact variants (e.g., stop_gained, stop_lost, start_lost, frameshift, splice_acceptor, and splice_donor variants), (2) moderate impact variants (missense_variants), (3) low impact variants (synonymous, stop_retained, upstream_gene, down-stream_gene and splice_region variants), and (4) other variants (including SNPs with a consequence predicted as "modifier"). In addition, we further classified other variants (annotated by VEP as "modifier") using open chromatin (OC) information to two categories, which resulted in a total of five categories: (1) high impact variants, (2) moderate impact variants, (3) low impact variants, (4.1) OC variants (annotated by VEP as "modifer" and located at ATAC-seq peaks [31] or H3K27Ac and H3K4me3 peaks [32]), and (4.2) variants with no known function i.e. NKF variants (including SNPs with a consequence predicted as "modifier", which were not located in OC). Finally, we further classified the NKF variants (category-4.2 above) into four categories leading to a total of eight categories. The four NKF categories were: (4.2.1) variants located within 5′ and 3′ untranslated regions (UTR), (4.2.2) variants located in predicted RE according to a recently proposed algorithm based on conservation among mammals [34], (4.2.3) variants located within ncRNAs retrieved from the RNAcentral database [35], and (4.2.4) variants with no known information (NKI) predicted as "modifier" and not located in any of these first three types of sequence. First, we considered UTR, since these regions mediate the initiation and termination of translation. Next, we considered the experimental datasets of accessible chromatin (ATAC-seq) [31] and active motifs (H3K27Ac and H3K4me3) [32]. Second, we considered RE for two reasons: (1) because promoters and transcription factor binding sites are near transcription start sites [36,37], and regions proximal to genes tend to exhibit greater enrichment of significantly associated variants in GWAS [38]; and (2) predicted RE can potentially help identify causal mutations [34]. ncRNAs play a major role in gene expression regulation [39], although their specific functions are largely unknown [40]. The detailed classification of variants is in Additional file 1: Table S2.
According to Sveinbjornsson et al. [21], we estimated the probability of a causal variant being a particular annotation type (according to the five or eight categories that we established in this study) by using a maximum likelihood method. Accordingly, the enrichment of an annotation category was estimated based on the probability of a causal variant being from a class, divided by its genomic frequency. The significance threshold for each annotation category was then estimated based on a category-based Bonferroni correction threshold that was established based on enrichment of the annotation class. For example, for a total number of sequence variants tested (T), the number of variants in an annotation category C (T C ), and enrichment of category C (e C ), c j is the category to which the jth sequence variant belongs with enrichment e Cj , and the weight for the jth sequence variant is [21]: where P wt j and P bc are weighted significance thresholds for the jth variant and Bonferroni corrected FWER, respectively.

Bootstrapping and confidence interval estimation for enrichment
Association signals (QTL) were resampled 100 times with replacement. For each resample, we estimated enrichment for annotation categories and calculated the averages and 95% confidence intervals. The code for bootstrapping procedure is included in the R code provided in Additional file 2 together with the code for enrichment estimation.

GWAS for 17 traits in Nordic Holstein cattle
A total of 5373 animals with 16,503,508 imputed SNPs were subjected to GWAS for 17 traits. A previously described pipeline [6] was used to detect the 'lead' variants that showed the highest association for each association signal. A total of 261 QTL (see Additional file 1: Table S1) were detected with a genome-wide association significance threshold of − log 10 (p) > 8.5. Significant associations were observed for 16 of the 17 traits examined (see Additional file 1: Table S1). Due to longrange LD in the bovine genome [5], sequence variants that were in LD with the lead SNPs (r 2 > 0.2) and genomewide significant were identified as possibly causal. In total, 78,593 possibly causal variants on 29 autosomes were selected for further analysis.

Annotations for all possible causal variants
The variant effect predictor (VEP) software (version 92.0) [17] was used to predict the maximal consequence of the variants on the nearest genes (e.g., within 5-kb flanking regions). A summary of the annotations obtained is in Fig. 1. Most of the annotated variants are intergenic variants or intron variants (Fig. 1a). Among the coding sequences, the most abundant variants are synonymous variants and missense variants (Fig. 1b). We also examined the distribution of annotations among the possible causal variants and the lead variants. The total number of variants in these two sets were equal to 78,593 and 261, respectively. The overall distributions of the annotations for these two groups ( Fig. 1c-f, respectively) were similar to that of the entire set of variants (Fig. 1a, b). We also observed that no high impact variants (e.g., "stop gained" or "start lost" variants) were present among the lead SNPs identified.

Enriched annotations and annotation-based significance thresholds based on VEP annotation
Based on the VEP-derived annotations, we classified all the annotation types obtained into four categories: (1) high impact variants, (2) moderate impact variants, (3) low impact variants, and (4) other variants. The other variants included SNPs that were annotated by VEP as intergenic variants, and those with a consequence predicted as "modifier". A plot of category enrichment is in Fig. 2. In contrast with the results of previous GWAS that involved quantitative and binary phenotypes in humans [21], low impact variants were the most enriched (245-fold enrichment) category instead of the high impact variants ( Table 1). The next most enriched category was the moderate impact variants, and these exhibited a fivefold enrichment (Table 1). We did not observe enrichment for high impact variants and 'other variants' .

Incorporation of information of open chromatin
While an extensive dataset of DNase I hypersensitivity sites (DHS) is available for the human genome [21], such data are much more limited for the bovine genome. However, an assay for transposase-accessible chromatin with a high-throughput sequencing (ATACseq) dataset for cattle was recently generated to explore accessible chromatin regions in the bovine genome [31]. In addition, a histone modification dataset (for H3K27Ac and H3K4me3) was created to mark active motifs across the bovine genome [32]. Therefore, in combination with VEP-derived annotations, we classified all the annotation types obtained into five categories: (1) high impact variants, (2) moderate impact variants, (3) low impact variants, (4) open chromatin (OC) variants, and (5) variants with no known function (NKF). The latter included variants that are not known to affect biological processes, including SNPs annotated by VEP as intergenic variants, and those with a consequence predicted as "modifier" which were not located within OC regions. A plot of category enrichment is in Fig. 3. In contrast with the results of a previous GWAS that involved quantitative and binary phenotypes in humans [21], low impact variants were the most enriched (405-fold enrichment) category instead of the high impact variants. Moreover, in spite of a large variance in the interval of enrichment for the low impact variants, the lower boundary still represented a high level of enrichment ( Table 2). The next most enriched category was the moderate impact variants, and these exhibited a fivefold enrichment, followed by the high impact variants that exhibited a fourfold enrichment. The lower boundary of the high impact variants was  (Table 2). Furthermore, enrichment was observed only for 2 of 100 replicates (with values of 177-fold and 133-fold, respectively; (see Additional file 3: Table S3). Therefore, we did not consider that these high impact variants were true positives. Finally, the OC category exhibited a 2.5-fold enrichment. Based on these enrichment values, a category-based significance threshold was calculated for each variant category ( Table 2).

Incorporation of additional genomic information to address NKF variants
The enrichment observed for each of these eight categories (including sub-categories from 'modifiers') is provided in Table 3 and represented in Fig. 4

Variant annotation-based significance thresholds in GWAS
To assess the power of using annotation category-based significance thresholds, we applied this approach to a GWAS conducted for stature in cattle and identified 35 QTL on 21 chromosomes (see Additional file 4: Figure  S1 and Additional file 3: Table S4). The number of significant variants within each of the four, five and eight categories of classified annotations (as described above)

Table 3 Enrichment of eight variant categories and their category-based significance thresholds
The confidence interval for each degree of enrichment is the 95% confidence interval obtained from bootstrapping resampled QTL 100 times *Indicates that the category-based significance was not calculated for this annotation class since there was no enrichment for this category  Tables 4, 5 and 6, respectively. When adjusted thresholds based on the four-category classifications were used in comparison with a flat Bonferroni multiple-testing correction across the tested variants, the total number of significant variants increased from 58,539 to 58,992 (Table 4). Then, when we checked whether the additional genes that were identified with the category-based thresholds had been previously identified as candidate genes in a meta-analysis study on bovine stature [41], we found that TNNI2 and TCP11 were identified by the new significant variants (n = 453). Subsequently, when we checked for overlap between the newly identified genes and those from a previous study on human height [42], we detected five overlapping genes i.e. ANKRD52, DNMT1, SCN4A, TP53I13, and TCP11 that may be associated with cattle stature and human height. When adjusted thresholds based on the five-category classifications were used versus a flat Bonferroni multiple-testing correction across the tested variants, the total number of significant variants increased from 58,539 to 58,993 ( Table 5). The list of identified genes is similar to that obtained with the four-category classification threshold, except that one additional gene FBP1 was included. When adjusted thresholds based on the eightcategory classifications were used versus a flat Bonferroni multiple-testing correction across the tested variants, the total number of significant variants increased from 58,539 to 61,191 (Table 6) and the newly identified variants (n = 2652) included TNNI2 and TCP11, as obtained by using the four-or five-category classification thresholds. When we checked for overlap between these newly identified genes with the previous study on human height [42], in this case, we detected more genes i.e. GHR, THADA, RPS6KA1, TP53I13, TCP11, VGLL4, KCNJ12, PPP2R3A, GCKR, and ZBTB38 that may be potentially relevant for cattle stature and human height.

Discussion
The goal of this study was to investigate whether categories of variants with different probabilities of being functional can be identified based on enrichment values obtained from categorization of GWAS results. When we classified all the variants on the basis of VEP annotation only, we observed a greater enrichment of "low impact" variants and a limited enrichment of "moderate impact" variants (Table 1). When we classified all the variants into five categories based on: (1) their impact predicted by VEP, (2) an ATAC-seq dataset indicating accessible chromatin regions [31], and (3) a histone modification dataset (involving H3K27Ac and H3K4me3) [32], we observed a greater enrichment of "low impact" variants and limited enrichment of "high impact", "moderate impact", and OC variants ( Table 2). In contrast, when we classified the original GWAS variants into eight categories, a more than tenfold enrichment was observed for the "high impact", "moderate impact", UTR, and ncRNA categories (Table 3), whereas "low impact" variants, OC, and RE categories exhibited a limited enrichment. This variant enrichment profile differs from that reported in a study on human data [21], for which the enrichments observed were in line with the magnitude of the predicted consequences of the variants. Thus, the variants with consequences that were predicted to be more severe displayed a greater enrichment. One reason for this difference in results may be the inclusion of both quantitative traits (n = 96) and binary (disease) phenotypes (n = 123) in the human study [21] versus the use of only quantitative traits (n = 17) in the bovine study. For complex disease traits, a loss-of-function variant can induce a disease state. Consequently, such variants have a higher probability of being causal for complex disease phenotypes. Economic traits in dairy cattle are generally quantitative traits that are affected by genetic variants in many genes, each gene having a small effect. Furthermore, in addition to loss-of-function and gain-of-function mutations, most of the causal mutations that underlie economic traits are likely to belong to regulatory variants which control the up-or down-regulation of genes. Among the categories of enriched variants that we identified in our study, most of them had a link with the regulation of gene expression, such as cis-regulatory elements (e.g., upstream gene variants, downstream gene variants, and UTR) [43] and ncR-NAs [40]. These variants can alter translation efficiency, particularly the synonymous variants [44], UTR [45], and ncRNAs [40]. In addition, they can affect transcript splicing, particularly the ncRNAs [40] and splice region variants. As a result, these variants can alter the functions of encoded proteins. A previous study demonstrated that regulatory elements are a major source of quantitative trait variation [46]. We go one step further and suggest that variants that affect both gene expression and protein translation (including translation efficiency and protein product stability) could be the source of quantitative trait variation. Data from GWAS on stature in cattle were also used to check if an approach using an annotation category-based significance threshold can identify more associations than the use of a uniform Bonferroni multiple-testing correction threshold. A recent meta-analysis conducted in cattle [41] proposed a list of candidate genes that affect bovine stature. With the three annotation classification approaches that we used here, we identified two additional candidate genes, TNNI2 and TCP11. Furthermore, comparison between the new list of genes reported here and that from a previous GWAS conducted on human height [42] revealed five additional genes with the fourcategory enrichment method, six additional genes with the five-category enrichment method, and ten additional genes with the eight-category enrichment method.  Several reasons can explain why our approaches detected only a fraction of the candidate genes reported in previous studies [41,42]: (1) loss of power for NKF and NKI categories, since we mixed variants with different probabilities due to lack of information for most variants; in this case, the enrichment estimation for NKF, NKI and unannotated variants that belong to another annotation category will be affected; (2) some of the causal variants reported in the meta-analysis study on bovine stature [41] and in the GWAS study on human height [42] do not segregate in the Nordic Holstein population; and (3) the relatively poor annotation of the bovine genome compared with the human genome. Nevertheless, our findings demonstrated that implementation of an annotation category-based significance threshold approach results in a larger number of high confidence candidate genes that include significant SNPs than the use of a flat Bonferroni multiple-testing significance threshold.
In this study, we observed distinct enrichments when two variant classification systems were used (e.g., 5 vs. 8 categories of classification) although the same number of potential causal variants was present in each system. There are two possible explanations for this observation: (1) the number of traits included in the analysis was small, thereby resulting in the detection of a limited number of QTL, e.g. there were only 19 high impact potential causal variants; with resampling, the number of high impact potential causal variants can fluctuate between extremes, as observed for the different replicates (see Additional file 3: Table S3); and (2) the number of annotation categories that were considered, and the underlying genetic architecture of the traits, may be contributing factors, e.g. a larger distribution of the enrichment level was observed across the categories established in this study when the variants were categorized into eight classes instead of five (see Additional file 3: Table S5).
In our study, we achieved slightly higher power with the classification of variants into categories based on annotations (Table 4 vs. Table 5 vs. Table 6). Further improvements are expected as the knowledge on the function of different genomic features in cattle increases. There are more and more studies that report the functions of non-coding sequences. For example, long ncRNAs have been identified as key regulators of chromatin states [47], microRNAs have been shown to play key roles in animal development and physiology [48], and cis regulatory elements may be located in 5′ or 3′ UTR. In combination with trans-regulatory elements, these elements regulate the level of gene transcription [49,50]. In a human study [21], NKF variants were categorized according to the presence or absence of overlap with DHS. The group of variants that overlapped with DHS was more enriched with GWAS hits than the group of variants that did not overlap with DHS. These results provide support for future efforts to classify NKF variants, especially in cattle for which this information is not currently available. In our study, an ATAC-seq dataset [20] of accessible chromatin and a histone modification dataset (H3K27Ac and H3K4me3) [34] provided additional genomic information. In spite of these additional data, the OC variant category was only moderately enriched (Tables 2, 3) but these datasets were each generated from a single tissue from a few individuals, which could have introduced errors. Alternatively, predicted regulatory elements could be used, although they may introduce noise and reduce the estimated degree of enrichment. In our study, the NKF variants were divided into five categories according to their relation to UTR, OC, predicted RE [34], ncRNAs [35], and NKI variants. We anticipate that as the functional annotation of dairy cattle genomes improves, the power of this approach will increase.