A genome-wide association analysis for body weight at 35 days measured on 137,343 broiler chickens

Background Body weight (BW) is an economically important trait in the broiler (meat-type chickens) industry. Under the assumption of polygenicity, a “large” number of genes with “small” effects is expected to control BW. To detect such effects, a large sample size is required in genome-wide association studies (GWAS). Our objective was to conduct a GWAS for BW measured at 35 days of age with a large sample size. Methods The GWAS included 137,343 broilers spanning 15 pedigree generations and 392,295 imputed single nucleotide polymorphisms (SNPs). A false discovery rate of 1% was adopted to account for multiple testing when declaring significant SNPs. A Bayesian ridge regression model was implemented, using AlphaBayes, to estimate the contribution to the total genetic variance of each region harbouring significant SNPs (1 Mb up/downstream) and the combined regions harbouring non-significant SNPs. Results GWAS revealed 25 genomic regions harbouring 96 significant SNPs on 13 Gallus gallus autosomes (GGA1 to 4, 8, 10 to 15, 19 and 27), with the strongest associations on GGA4 at 65.67–66.31 Mb (Galgal4 assembly). The association of these regions points to several strong candidate genes including: (i) growth factors (GGA1, 4, 8, 13 and 14); (ii) leptin receptor overlapping transcript (LEPROT)/leptin receptor (LEPR) locus (GGA8), and the STAT3/STAT5B locus (GGA27), in connection with the JAK/STAT signalling pathway; (iii) T-box gene (TBX3/TBX5) on GGA15 and CHST11 (GGA1), which are both related to heart/skeleton development); and (iv) PLAG1 (GGA2). Combined together, these 25 genomic regions explained ~ 30% of the total genetic variance. The region harbouring significant SNPs that explained the largest portion of the total genetic variance (4.37%) was on GGA4 (~ 65.67–66.31 Mb). Conclusions To the best of our knowledge, this is the largest GWAS that has been conducted for BW in chicken to date. In spite of the identified regions, which showed a strong association with BW, the high proportion of genetic variance attributed to regions harbouring non-significant SNPs supports the hypothesis that the genetic architecture of BW35 is polygenic and complex. Our results also suggest that a large sample size will be required for future GWAS of BW35. Supplementary Information The online version contains supplementary material available at 10.1186/s12711-021-00663-w.

meat will account for nearly half of the additional produced meat [1,2]. Body weight (BW) is one of the most economically important traits in the broiler industry. Traditional broiler breeding programs have achieved an increase in meat production efficiency of ~ 3.3% per year and selection on body weight has contributed to this result [3]. However, in spite of the importance of this trait, relatively little is known about the genetic variants that underlie the variance observed in body weight.
Knowledge of the genetic variants that underlie the variance observed in traits can amplify the breeding efficiency. For example, accuracy of genomic prediction can be increased by using markers that are strongly linked to the causative loci in genomic prediction models. Furthermore, information on the genetic architecture that underlies growth in meat type poultry will help to unravel the genes and pathways that are involved and enhance our understanding on such complex developmental processes.
Quantitative trait loci (QTL) mapping and genomewide association studies (GWAS) have been used to improve BW in chicken [4][5][6][7][8][9]. However, although significant associations have been detected, their practical value to breeding programs is limited. Typically, the associations were not finely mapped and encompassed broad chromosome regions. Furthermore, the populations used were often F 2 or advanced inter-crosses between lines of chicken that have been selected for egg laying, which are relatively slow growing, and fast growing broiler lines [4,10,11]. Relatively few publications on GWAS of body weight are available for commercially relevant lines [9,12,13].
Moreover, a large part of the genetic variance might be due to rare variants, or variants that are highly correlated/linked with other variants [14]. If this is the case, then a GWAS with a large sample size is required to detect those variants [15][16][17][18]. There is theoretical [19] and empirical evidence [16,20,21] that the power of GWAS increases as the size of the dataset increases. For example, a series of studies with datasets of increasing size for human height discovered 180 significant associations with a dataset of 183,727 individuals [15], 697 significant associations with a dataset of 253,288 individuals [16], and recently, another 83 new significant associations not previously detected were identified with a dataset of 711,428 individuals [21]. Analogous results are reported for studies on type 2 diabetes [22] and Crohn's disease [20] in humans. If BW is a complex polygenic trait [23,24], a large number of small-effect variants might regulate its expression. Hence, a large GWAS (in terms of sample size and number of markers analysed) is required to discover such variants.
The routine use of genomic selection in broiler breeding makes large GWAS possible. As part of the routine implementation of genomic selection over the past decade, Aviagen has accumulated both single nucleotide polymorphisms (SNP) array genotype and phenotype data for BW on 157,674 individuals from one of its lines.
Our objective was to conduct a GWAS, with a large sample size, for BW measured at 35 days of age (BW35) in broilers, which is a typical age at which broilers are slaughtered for meat production. After editing routinely collected data from a commercial broiler line, we analysed a dataset consisting of 137,343 broilers with phenotypes and 595,299 imputed SNPs.

Data
In total, we used 157,674 broilers spread across 15 generations of a pedigree for which BW35 and SNP array genotype data were collected as part of the routine commercial broiler breeding program (Aviagen Ltd, Newbridge, UK). The line used in this study was a female line (maternal side). The birds were genotyped with SNP arrays of different densities: 600 k SNPs for 1690 birds, 50 k SNPs for 59,773 birds, 42 k SNPs for 1507 birds, 3 k SNPs for 72,221 birds and 384 SNPs for 2152 birds. The development of these arrays is described in detail in [25]. Of these 632,439 SNPs, 52,408 are proprietary to Aviagen. We included all the SNPs in the analysis but do not show the base pair positions of the proprietary significant SNPs in our results, which represent 18 of the 96 significant SNPs). To unify the data from the different arrays and reach the highest density of 600 k, we imputed the genotypes of all broilers' to the 600 k Affymetrix Axiom chip with the AlphaImpute software v1.9 [26,27]. Broilers with more than 10% missing SNP genotypes were excluded from the analysis. Quality control of the SNPs was carried out using the PLINK v1.07 software [28]; SNPs with a call rate higher than 0.95 and a minor allele frequency higher than 0.01, that showed no extreme deviation from the Hardy-Weinberg proportions (P < 0.000001), and that were located on the Gallus gallus (GGA) autosomes 1 to 28 (except GGA16) were retained. After quality control, 137,343 birds and 392,255 SNPs remained for the analysis. The Galgal4 assembly in the Ensembl Genome Browser (version 85) was used to map the SNP positions on the genome (www. ensem bl. org).

Statistical analysis Pedigree genetic analysis
Variance component and heritability estimates were based on a pedigree-based model using ASReml.v3 [29]: where y is a vector of BW35 records; b is a vector of fixed non-genetic effects (sex, mating group (the average genetic level of the parents) with 325 levels, the pen effects and hatch week with 381 levels); a is a vector of random additive genetic effects; e is a vector of random residuals; X and Z are design matrices linking phenotypes to effects. The model assumptions were a ∼ N (0, Aσ 2 a ) and e ∼ N (0, Iσ 2 e ) , where A is the pedigree relationship matrix, I is an identity matrix and σ 2 a and σ 2 e are the additive genetic and residual variances, respectively. Heritability was estimated as the ratio of σ 2 a to the total phenotypic variance ( σ 2 a + σ 2 e ).

Genome-wide association study
The GWAS was conducted by single SNP regression while simultaneously correcting for the background polygenic effect using the GEMMA software [30]: where y is a vector of BW35 records pre-corrected for the non-genetic effects of sex, mating group, pen, and hatch; μ is intercept, w is a column vector of genotypes for the SNP of interest with the corresponding allele substitution effect b ; g is a vector of random additive genomic (polygenic) effects; and e is a vector of random residuals. The model assumptions were g ∼ N (0, Gσ 2 g ) and e ∼ N (0, Iσ 2 e ) , where G is the genomic relationship matrix calculated following the first method of VanRaden [31] and σ 2 g and σ 2 e are, respectively, the additive genomic and residual variances. Matrix G was constructed and eigen decomposed using an in-house Python script. The eigenvalues and eigenvectors were subsequently used in GEMMA via flag -d and -u, respectively. A false discovery rate (FDR; Benjamini and Hochberg) of 1% was adopted to account for multiple testing when declaring significant SNPs [32]. Manhattan and quantile-quantile (Q-Q) plots of the GWAS results were drawn in R [33] with the qqman package [34]. Annotation of all the significant SNPs was performed with the variant effect predictor (https:// www. ensem bl. org/ Tools/ VEP) program using the Ensembl database and the Galgal4 assembly. Moreover, genes located 1 Mb up/downstream of the top SNP in each genomic region that contained significant SNPs were annotated using the BioMart tool of the Ensembl database and the Galgal4 assembly (http:// www. ensem bl. org/ bioma rt/ martv iew/).

Genetic variance partitioning by genomic region
Based on the GWAS results, the genome was partitioned into different regions that harboured significant and non-significant SNPs. Regions that contained (1) y = Xb + Za + e, (2) significant SNPs were defined by considering the region 1 Mb upstream and 1 Mb downstream from the SNP with the highest p-value in each region. Due to closely located GWAS signals on GGA13 (13a and 13b) and GGA14 (14a and 14b), the two regions on each of these chromosomes were merged. To reduce the computational cost, we used all the significant SNPs from the 600 k Affymetrix Axiom chip and among the nonsignificant SNPs only those that overlap between the 50 k and 600 k Affymetrix Axiom chips. To estimate the variance explained by each region, a Bayesian ridge regression model was implemented using AlphaBayes [35] and the same inputs as for the GWAS, but analysing all the SNPs simultaneously. Posterior samples for SNP effects for each region were obtained from 50,000 Markov-chain Monte Carlo (MCMC) iterations with a burn-in period of 10,000 iterations. For each region and each iteration, breeding values were calculated from SNP effects and SNP genotypes, the variance of these regional breeding values was calculated and divided by the variance of the breeding values for the whole genome to estimate the proportion of the (additive) genetic variance explained per genomic region, accounting for linkage-disequilibrium within and between regions [36].

Descriptive statistics and pedigree genetic parameters
The summary statistics and variance components of the raw data are presented in Fig. 1. The average BW35 in the full dataset was 1840 g, and ranged from 1080 to 2740 g, while the estimated pedigree heritability was 0.44 (0.01).

Genome-wide associations
A summary of the results of the GWAS is in Table 1

Partitioning of the genetic variance by genomic region
The proportion of the genetic variance over the total genetic variance for each genomic region that harboured significant SNPs is in Table 2 SNPs each explained less than 2% of the total genetic variance. GGA4 explained more of the total genetic variance than any other chromosome. Taken together, the regions that harboured significant SNPs on GGA4 explained ~ 8.6% of the total genetic variance.

Discussion
Previous studies have shown that the genomic architecture of BW is age-and population-dependent [8,9,37]. We focused on BW measured at 35 days of age, which is a typical age at which commercial broilers are slaughtered. In total, we found 96 significant SNPs with a 1% FDR that were located in 25 genomic regions across 13 chromosomes and explained ~ 30% of the genetic variance. We identified several candidate genes that might affect BW35 in broilers, and encode e.g. growth factors and the leptin receptor, and are involved in the JAK/ STAT signalling pathway (Table 3). Furthermore, inspection of the Q-Q plot (Fig. 2) [6,7], and carcass and breast muscle yield [38,39]. Nevertheless, such QTL studies detected only large chromosomal regions. The most significant chromosomal regions were mainly located on GGA1 to 4, but also GGA7 to 9, and GGA13, and on the Z chromosome [4,5,39].
Several GWAS have been conducted in broilers for a variety of carcass and growth traits, using commercial lines [9,13], birds from experimental stations [7,37,40], or field data [41], and a variety of SNP array densities that ranged from ~ 44,000 [7][8][9] to 470,486 [41]. In spite of this, previous GWAS that scanned the entire genome, were still limited in terms of statistical power because the size of the samples was usually only a few hundreds of birds.

Growth factor pathways
Growth factors, such as the transforming growth factor-β (TGF-β) and the insulin-like growth factor-1 (IGF1) are known to be key regulators of several traits related to body composition, growth, and development in chicken [42,43]. Our analysis detected seven regions that include genes coding for growth factors, i.e. regions 1a and 1b, 4a, 4c, 4d, 13b, 14a (Table 3). More precisely, the thioredoxin reductase 1 gene (TXNRD1; at ~ 54.74-54.77 Mb) is located in region 1a. The second most significant SNP detected in this region is located within TXNRD1 (54,756,840 bp). Although TXNRD1 is not considered as a growth factor, studies on salivary adenoid cystic carcinoma [44] have found a synergistic action between TXNRD1 and TGF-β. The insulin like growth factor 1 (IGF1; at ~ 55.43-55.48 Mb) gene is located within a 1-Mb region from TXNRD1. IGF1 has a major role in the body size of dogs [45,46], with a single allele causing a small size. It also affects body size in mice [47] and height in humans [48,49]. In broilers, increased IGF1 levels have been related with increased BW [43], growth of muscle [50], and IGF1 levels have been shown to differ between lines that are divergently selected for growth [51]. In a recent GWAS on a F2 chicken population, TXNRD1 and IGF1 have been associated with BW35 and BW41, respectively [52]. Moreover, the IGFBP4 (insulin like growth factor binding protein 4) gene has been identified as a candidate gene for broiler BW in a study that analysed a subset of the population used in our work [13]. This gene is in close proximity (~ 0.5 Mb) to the signal we detected on GGA27. Another insulin-like growth factor modulating protein, namely that encoded by the IGFALS (insulin-like growth factor binding protein, acid labile subunit; ~ 13.2 Mb) gene, is located ~ 8 kb downstream of the region 14a. In addition, on GGA1, near the region 1b (~ 134.49 Mb), we identified the TGFBRAP1 (transforming growth factor beta receptor associated protein 1; ~ 134.45-134.48 Mb) gene.
On GGA4, the region 4a (~ 44.84 Mb) contains two genes with growth factor activity: FGF5 (fibroblast growth factor 5; ~ 44.74 Mb) and BMP3 (bone morphogenetic protein 3; ~ 44.86 Mb). The FGF20 gene, which is a key regulator of skin development in chicken [53], Table 2 Genetic variance explained by genome regions GGA = Gallus gallus chromosome; V A : additive genetic variance by genomic region; V A , %: proportion of additive genetic variance explained by the genomic region; the parentheses denote the 95% high posterior density interval; nonsignificant SNPs were extracted from the 50 k Affymetrix Axiom chip; the two nearby located significant regions on GGA13 (13a and 13b) and GGA14 (14a and 14b) were merged on each chromosome

The leptin receptor overlapping transcript (LEPROT) and leptin receptor (LEPR)
Leptin (LEP) is a well-known hormone that is strongly related to appetite, through regulation of the brain satiety centres [54]. Its effect on reducing weight results from its interaction with another protein, the leptin receptor (LEPR) [55][56][57]. Several studies have investigated the association of the LEP and LEPR genes with obesity [58,59], feed intake [60,61], growth and fat traits [62,63], and their cardio-metabolic implications [64,65] in a variety of species. However, the results are controversial, especially in the chicken literature which currently reports evidence against the leptin system being involved in body weight control in birds [59,[66][67][68]. In our GWAS, a significant association in region 8b was located within the 3ʹ untranslated region of the LEPROT (leptin receptor overlapping transcript; It should be noted that LEPROT is not in any way homologous to the leptin receptor gene, but its expression has been associated with muscle development in turkey [69] and it is thought to regulate the expression of cytokine receptor, growth hormone receptor (GHR) and LEPR genes.    Dadousis et al. Genet Sel Evol (2021) 53:70 The JAK/STAT signalling pathway Although leptin activation of the JAK/STAT pathway in birds may not be important, as it is in other animals [70], this JAK/STAT pathway has a role in the mediation of many cytokine signals, such as those through GHR, and therefore for the growth of poultry. The JAK/STAT pathway is related to the generation of spermatogonial stem cells in chicken [71]. The association signal on GGA27 at ~ 4.96 Mb) (a gene dense region) was located near the STAT3 (signal transducer and activator of transcription 3; ~ 4.90-4.91 Mb) gene, and more precisely is within an intron of the ATP6V0A1 (ATPase, H+ transporting, lysosomal V0 subunit a1; ~ 4.93-4.96 Mb) gene ( Table 3). The association between the STAT3/STAT5B locus and BW in chicken confirms the findings of [13]. We also detected several other genes related to reproduction. The association signal in region 4c (~ 52.  (Table 3).
Apart from the effect of LEPR on the JAK/STAT pathway, Hou and Luo [64] have suggested a relationship between the leptin, JAK/STAT and mitogen-activated protein kinases (MAPK) signal pathways with an effect on cardiovascular diseases. Interestingly, the association on GGA12 was within an intron of the MAPKAPK3 (mitogen-activated protein kinase-activated protein kinase 3; ~ 1.84-1.88 Mb) gene. The results of an experimental study in pigs that compared the expression level of MAPKAPK3 in mini and large-type Diannan small-ear pigs, indicated that MAPKAPK3 might have an important role in growth and development [72]. Interestingly, Tarsani et al. [13] suggested LEMD2 (LEM domain containing 2), located on GGA26, as a strong candidate gene for BW and it is considered to have an important role during embryonic development in mice by regulating the MAPK signalling pathway [73].

T-box genes (TBX5 and TBX3)
Several studies have implicated the effect of the T-box genes (TBX4 and TBX5) in the development of the chicken limb, heart and embryo [74][75][76][77]. The significant SNP on GGA15 was included in the intron of TBX5 (Tbox 5; ~ 12.31-12.35 Mb). The TBX3 gene is adjacent to TBX5. TBX4 is located on GGA19 at ~ 7.59-7.61 Mb but the region that we detected on this chromosome is located further down at ~ 6.62 Mb, in the intron of the SYNRG (synergin, gamma) gene. In the same region on GGA19 (within 1 Mb) are found the TBX2 (~ 7.63-7.64 Mb) and HNF1B (HNF1 homeobox B; ~ 8.64-8.66 Mb) genes. Interestingly, Tarsani and colleagues [13] have recently reported several candidate genes for BW35 in broilers, among which the TBX21 gene and several members of the homeobox family (HOXB1-9 and HOXB13), and Moreira et al. [52] have shown an association of HOXB2,4,7,9 and HOXB13 with BW35 in an F2 cross. However, it should be noted that all these genes are located on GGA27 at more than one Mb from our top SNP on that chromosome. In our analysis, another limb morphogenesis gene was detected in region 1a, with the SNP at 54,681,614 bp being within an intron of the CHST11 (carbohydrate (chondroitin 4) sulfotransferase 11; ~ 54.54-54.74 Mb) gene (Table 3), which has been suggested to be involved in the elongation of limb buds and bone formation [78]. CHST11 has also been recently reported as a candidate gene for BW35 [52].

The zinc finger protein PLAG1
PLAG1, together with the LCORL-NCAPG locus, has been associated with body weight and height in a variety of species (human, cattle, horses, pigs and dogs) [79]. In cattle, a QTL for growth and development is located on bovine chromosome 14 (at ~ 25 Mb) and a cluster of four genes on the same chromosome, namely, PLAG1, LYN (v-yes-1 Yamaguchi sarcoma viral related oncogene homolog), RPS20 (ribosomal protein S20), and CHCHD7 (coiled-coil-helix-coiled-coil-helix domain containing 7) [80][81][82][83][84], is thought to be involved in this trait. Recently, a mutation in PLAG1 has been reported to have a major contribution to stature in modern cattle [85]. More importantly, PLAG1 is also known to affect body weight and milk characteristics [86] and to regulate several growth factors, such as IGF2 [87,88]. The large region 2b contains the PLAG1, LYN, and RPS20 genes. Our top SNP on GGA2 was located 3.5 kb downstream of the IMPAD1 (inositol monophosphatase domain containing 1) gene, but the second most significant SNP in the region was a synonymous variant in PLAG1 (Table 3). Altogether, these findings mark PLAG1 as a very good candidate for BW35 in broilers.

Other candidate genes for BW35 On GGA4 (peak at ~ 65.86 Mb)
In line with previous studies [8,89,90], we found that GGA4 contains genes that play a role in body weight in chicken. Among these, the CCKAR (cholecystokinin A receptor) gene is located at ~ 72.8 Mb on GGA4. Decreased expression of this gene is associated with increased BW and growth in chicken and it has been suggested that modern chicken breeds might have been selected during the early domestication process for the high-growth haplotype [91]. Other studies have reported the LCORL-NCAPG locus, at ~ 74.0 Mb [12,89], which explains part of the variance in stature in many species [92][93][94][95][96] and is associated with carcass traits in beef cattle [97]. However, the region identified in our study is located at ~ 65.86 Mb, which is quite far away (~ 7 Mb) from CCKAR and even further away from the LCORL-NCAPG locus. Whereas a long-range enhancer might be involved, a more conservative hypothesis indicates another gene. The top SNP in the region 4e detected here was ~ 400 bp upstream of the CWH43 (cell wall biogenesis 43 C-terminal homolog) gene, which encodes the PGAP2-interacting protein. This SNP also had the highest positive effect, indicating an effect that increases BW35 (see Additional file 1: Table S1). Eight other SNPs in this region 4e are located in CWH43, among which four are in introns and two are synonymous variants. Forty-one SNPs in this region 4e were mapped to eight additional genes, some of these being located in their introns (Table 3) Among all these genes, SLAIN2 was recently reported as a strong candidate gene regulating BW in broilers [13]. Moreover, several members of the general solute carrier gene family have been associated with BW35 and BW41 [52].
Concerning the second region 4b on GGA4, we identified the group-specific component (GC) gene, which is located ~ 19 kb from the region (Table 3). GC (a vitamin D-binding protein) belongs to the general albumin family, involved in vitamin transportation, and vitamin D, lipids and lipoproteins metabolism and is expressed in all vertebrates [98]. In humans, it is one of the major determinants of the status in vitamin D, as assessed by measuring the circulating concentrations of 25 hydroxyvitamin D (25(OH)D) [99]. The role of vitamin D in the maintenance of skeletal health has been known for over a century but there is now growing evidence that vitamin D plays an important role also in the health of nonskeletal tissues. The linkage between GC and BW35 is consistent with recent studies that found that 25(OH) D supplementation increases breast meat yield in broilers [100]. This observation highlights the importance of understanding how key vitamin D metabolism pathways regulate physiological processes relevant to production in farm animals. In dairy cattle, the same gene was recently associated with complex traits such as mastitis and milk traits [101,102].

On GGA1 to 3
Although the important role of GGA1 was previously reported in the literature, we detected only weak associations in our study, significant at 1% FDR. The top SNP in region 1b (~ 134.49) was located within the 3ʹ untranslated region of the C2orf49 gene (~ 134.48-134.50 Mb), which is near the region containing the C2orf40 gene. The SNP in region 1c was located within the MTMR2 (myotubularin related protein 2; ~ 184.42-184.48 Mb) gene. In region 1d, several genes are mapped. The significant SNP located in region 1c was in an intron of the STIM1 gene (stromal interaction molecule 1; ~ 193. 79-193.83 Mb), which is close to DGAT2 (diacylglycerol O-acyltransferase homolog 2; ~ 193. 95-193.97 Mb), a gene that is related to fatty acid metabolism and associated with changes in carcass and meat quality characteristics in domestic pigeons [103].
On GGA2 two significant regions were detected, namely at 103. 15

On chromosomes other than GGA1, 2, 3 and 4
We detected significant associations on nine other chromosomes (GGA8, 10, 11, 12, 13, 14, 15, 19 and 27). The association on GGA10 points to the PPCDC (phosphopantothenoylcysteine decarboxylase; ~ 1.86-1.88 Mb) gene, which is involved in the biosynthesis of coenzyme A. Coenzyme A is essential for energy production of the body. The association on GGA11 was located between the CRISPLD2 (cysteine-rich secretory protein LCCL domain containing 2; ~ 16.37-16.40 Mb) and HNF4beta (hepatic nuclear factor 4beta; ~ 16.46-14.47 Mb) genes. On GGA13, two regions were identified and because of the "two-peak" pattern in the Manhattan plot (Fig. 3 1 subunit) gene, which has been reported as a strong candidate for BW35 in broilers [13] and for which skeletal muscle-specific isoforms are reported in humans [104].

Implications for breeding programs
Combined together, the 25 genomic regions that contain 96 significant SNPs explained ~ 30% of the total genetic variance. This implies that the genetic architecture of BW35 is polygenic and complex, and therefore genomic prediction (using all available genomic data), rather than targeting specific genes via marker-assisted selection, will be more effective to improve BW35 in broilers. However, the region that contains significant SNPs and explains the largest proportion of the total genetic variance, is region 4e (~ 65.67-66.31 Mb), which explains 4.368% of the total genetic variance, and taken together the regions with significant SNPs on chromosome GGA4 explain ~ 8.6% (Table 2), which suggests that genomic prediction models that upweight regions of the genome known to harbour significant SNPs [105] may be effective. Moreover, information about the relevant genes identified in this paper could be included in the design of future SNP arrays.

Conclusions
To the best of our knowledge, this is the largest GWAS that has been conducted for BW in chicken to date. Our analysis revealed 25 genomic regions that harbour 96 significant SNPs on 13 Gallus gallus autosomes, which combined together explain ~ 30% of the total genetic variance. Although the region on GGA4 at ~ 65.67-66.31 Mb explains 4.37% of the total genetic variance, the high proportion of genetic variance attributed to regions that harbour non-significant SNPs supports the hypothesis that the genetic architecture of BW35 is polygenic and complex. The significant SNPs and associated genes identified here could be used in future experimental designs targeting specific genes and biological pathways, and in the design of future SNP arrays as well as in statistical models of genomic prediction using prior biological knowledge of genome regions known to affect the traits of interest. Our results also illustrate the importance of a large sample size for future GWAS of BW35.
Additional file 1: Table S1. List of all significant SNPs sorted by significance 1 . SNP = name of the single nucleotide polymorphism; GGA = Gallus gallus chromosome; location (bp) = position of the SNP on the chromosome in base pairs on the Galgal4 assembly; MAF = minor allele frequency; EFF = allele substitution effect; SE = standard error of the effect; P-value = P-values from the single SNP regression while simultaneously correcting for the background polygenic effect; P-value FDR = false discovery rate P-value; LOG = the −log 10 of the P-value; V SNP = variance due to the SNP (calculated as 2pqb 2 , where p is the frequency of major allele, q = 1 − p is the frequency of the minor allele and b is the allele substitution effect); V A = additive genetic variance; V P = phenotypic variance; V Asnp (%) = percentage of additive genetic variance explained by each SNP; V Psnp (%) = percentage of phenotypic variance explained by each SNP. 1 Of the 96 significant SNPs 18 are proprietary to Aviagen and therefore their base pair location has been excluded.