Whole-genome SNP association analysis of reproduction traits in the Finnish Landrace pig breed

Background Good genetic progress for pig reproduction traits has been achieved using a quantitative genetics-based multi-trait BLUP evaluation system. At present, whole-genome single nucleotide polymorphisms (SNP) panels provide a new tool for pig selection. The purpose of this study was to identify SNP associated with reproduction traits in the Finnish Landrace pig breed using the Illumina PorcineSNP60 BeadChip. Methods Association of each SNP with different traits was tested with a weighted linear model, using SNP genotype as a covariate and animal as a random variable. Deregressed estimated breeding values of the progeny tested boars were used as the dependent variable and weights were based on their reliabilities. Statistical significance of the associations was based on Bonferroni-corrected P-values. Results Deregressed estimated breeding values were available for 328 genotyped boars. Of the 62 163 SNP in the chip, 57 868 SNP had a call rate > 0.9 and 7 632 SNP were monomorphic. Statistically significant results (P-value < 2.0E-06) were obtained for total number of piglets born in first and later parities and piglet mortality between birth and weaning in later parity, and suggestive associations (P-value < 4.0E-06) for piglet mortality between birth and weaning in first parity, number of stillborn piglets in later parity, first farrowing interval and second farrowing interval. Two of the statistically significant regions for total number of piglets born in first and later parities are located on chromosome 9 around 95 and 79 Mb. The estimated SNP effect in these regions was approximately one piglet between the two homozygote classes. By combining the two most significant SNP in these regions, favourable double homozygote animals are expected to have 1.3 piglets (P-value = 1.69E-08) more than unfavourable double homozygote animals. A region on chromosome 9 (66 Mb) was statistically significant for piglet mortality between birth and weaning in later parity (0.44 piglets between homozygotes, P-value = 6.94E-08). Conclusions Three separate regions on chromosome 9 gave significant results for litter size and pig mortality. The frequencies of favourable alleles of the significant SNP are moderate in the Finnish Landrace population and these SNP are thus valuable candidates for possible marker-assisted selection.


Background
Estimated breeding values (EBV) based on best linear unbiased prediction (BLUP) for total numbers of piglets born and farrowing intervals have been available to Finnish pig breeders since 1991 [1]. The current multi-trait BLUP evaluation and fertility index also includes pig mortality and age at first farrowing [2]. During the last decade, a favourable genetic trend has been observed in the Finnish Landrace pig population for total litter size in terms of number of piglets weaned per litter (0.1 piglet) [3]. Selection based on quantitative genetic theory and the BLUP method has been successful in improving female reproduction traits. However, genetic improvement of reproduction traits, which have a low heritability and sex-limited expression, constitutes a real challenge for animal breeders and requires a better understanding of the genetic architecture of these traits to allow selection on genetic variants affecting these traits [4].
Over the past twenty years, several microsatellitebased linkage studies have been conducted to locate quantitative trait loci (QTL) that affect pig reproduction traits. The results are listed in the Pig Quantitative Trait Locus database (Pig QTLdb, http://www.genome.iastate. edu/cgi-bin/QTLdb/SS/index) [5,6]. Through a collaborative effort between the International Porcine SNP Chip Consortium and Illumina (CA, San Diego), researchers in pig breeding have now access to a wholegenome SNP panel, which makes it possible to study in greater detail the genetic architecture of reproduction traits in pigs [7].
The objective of this study was to identify SNP associated with female reproduction traits in the Finnish Landrace pig breed. Significant SNP can then be incorporated into the national evaluation and selection scheme. The analysis is based on 328 progeny-tested artificial insemination (AI) boars genotyped with the Illumina PorcineSNP60 BeadChip.

Animal material
The study included 328 Finnish Landrace AI boars born between 1996 and 2009. The average and maximum numbers of daughters per sire were 141 and 782, respectively. All boars were related to each other. The data included 114 sires with genotyped sons, for which the average and maximum numbers of sons per sire were 2.2 and 10, respectively.

DNA extraction and genotyping
DNA was extracted either from hair follicles or from semen using a DNeasy Blood & Tissue kit (Qiagen, Helsinki, Finland). For more information on extraction methods see Sironen et al. [8]. Expected DNA concentrations were 100 ng/μL for semen and 50 ng/μL for hair follicles. For each sample, 20 μL of DNA sample was sent out for genotyping at the FIMM (Institute for Molecular Medicine Finland, Helsinki, Finland) using the PorcineSNP60 BeadChip.

Phenotypes and statistical method
Nine female reproductive traits were studied: total number of piglets born in first (TNB1) and later parities (TNB2), number of stillborn piglets in first (NSB1) and later parities (NSB2), piglet mortality between birth and weaning in first (PM1) and later parities (PM2), age at first farrowing (AFF), first farrowing interval (FFI), and second farrowing interval (SFI). EBV for all nine traits were obtained for each genotyped AI boar from the national breeding value evaluation (multi-trait BLUP). The linear model for TNB1, TNB2, NSB1, NSB2, PM1, and PM2 used in the national evaluation includes herdyear, year-month, type of insemination, litter breed, and age at farrowing as fixed effects, and litter sire, permanent environmental effects, and additive genetic (animal) effects as random effects. Additionally, parity number was included as a fixed effect for TNB2, NSB2, and PM2. The linear model for AFF, FFI, and SFI included herd-year and herd-month as fixed effects and animal as a random effect. The models for FFI and SFI also included the effect of dam breed.
Prior to SNP association analysis, unstandardized EBV were deregressed and corresponding weights were calculated based on individual and parental EBV and reliabilities [9]. The proportion of genetic variance not explained by markers, due to partial marker coverage of the genome and incomplete linkage disequilibrium between markers and causal genes, was fixed at 0.5 (parameter c in Garrick et al. [9]).
The association of SNP with deregressed EBV was studied using a mixed linear model, for each SNP separately. The model included a fixed SNP effect and a random polygenic effect to account for residual genetic variance not explained by the SNP in the model and the relationship between animals in the data. The model used was: where y i is the deregressed EBV; x i is the number of minor alleles (0, 1, or 2); b is the corresponding regression coefficient; a i is a random polygenic effect with covariance structure a i~N (0, Aσ 2 a ), where A is the additive relationship matrix and σ 2 a is the polygenic variance; and e i is a random residual effect with e i~N (0, Iσ 2 e /w i ), where I is an identity matrix, σ 2 e is the residual variance, and w i is the weight. The analyses were performed using the AI-REML method in the DMU program package [10]. Variance components were estimated separately for each SNP. The estimated heritabilities based on deregressed EBV were generally smaller than the ones used in national breeding value estimation (e.g. h 2 = 0.07 vs. 0.10 for TNB1) but were quite constant across SNP.
Statistical significance of the associations was based on Bonferroni-corrected P-values. This method treats individual tests as independent and thus is very conservative for data for which the correlation (linkage disequilibrium) between tests (SNP) is high; the linkage disequilibrium (r 2 ) between adjacent SNP in the Finnish Landrace population is 0.43 [11]. Aiming for an overall false positive rate of 0.05 and considering 50 000 to 25 000 independent tests, the point-wise P-value should be between 1.0E-06 and 2.0E-06. In this article, individual SNP with a P-value of 2.0E-06 or less were considered statistically significant, and SNP with a P-value of 4.0E-06 or less as suggestive. More precise estimates of multiple-test-corrected P-values can be obtained by a permutation procedure but this was not possible for this research because of its high computation demand.

SNP quality
A total of 390 animals were originally genotyped; 366 animals had a call rate above the commonly used limit of 90% and five samples (DNA extracted from hair follicles) had a call rate of 0. Only the samples with a call rate equal or above to 0.90 and with available national EBV were used in the association analysis (328 animals).
The Illumina PorcineSNP60 BeadChip contains 62 163 SNP http://www.illumina.com. One quarter of these had a call rate of 1.0, and 57 868 SNP had a call rate equal to or above 0.9 and were used in the association analyses. A large proportion of the SNP (7 632) were monomorphic (for these, no estimate for SNP effect is available) and 7 642 SNP had a minor allele frequency below 0.05. Most of the SNP with a low frequency were evenly distributed across the genome, but regions with low polymorphism longer than 1 Mb were also detected on different chromosomes. These low-polymorphism regions could be the result of selection, random genetic drift, or a bottleneck effect. Otherwise, the distribution of minor allele frequency was uniform across SNP. The SNP were mapped to pig genome build 9 (Sscrofa9, http://www.ensembl.org).

Association results
For each trait, animals with a weight of the deregressed EBV less than 1.0 were removed from the analysis. The choice of this limit was arbitrary and subsequent analyses showed that inclusion or exclusion of these animals in the data had no real effect on the association results because of their large residual variance (small weight) in the linear mixed model equations. Figure 1 describes the distribution of deregressed EBV and weights for TNB1, and Table 1 gives the number of observations, mean, and standard deviation of the studied traits. Also the mean reliabilities of the original EBV are presented in Table 1.
Significant associations were observed for TNB1, TNB2, and PM2, and suggestive associations for PM1, NSB2, FFI, and SFI (Table 2 and Figure 2). Two chromosomal regions on chromosome 9 (around 79 Mb and 95 Mb, based on Sscrofa9) were statistically significant for litter size of first and later parities (TNB1 and TNB2). The estimated effect of SNP in these two regions of chromosome 9 was approximately 1 piglet between the two homozygote classes (e.g. SNP DRGA0009645; AA-genotypes vs. GG-genotypes). SNP ALGA0054078, H3GA0027863, MARC0003458, and MARC0027588 (79 Mb) were in complete linkage disequilibrium with each other and in moderate linkage disequilibrium (r 2 = 0.36) with DRGA0009645 (95 Mb) ( Figure 3). During the last 15 years, the frequency of favourable alleles increased from 0.10 to 0.19 for SNP in the 79 Mb region and from 0.14 to 0.22 for DRGA0009645 (Table 2). This positive trend in allele frequencies is in good agreement with the overall increase in litter size observed in the Finnish Landrace breed over the same period.
When the two regions were combined in the analysis and the regression variable corresponded to the number of favourable alleles (A) in H3GA0027863 and DRGA0009645 (possible values 0, 1, 2, 3, and 4), the estimated SNP effect was 0.32, with a P-value of 1.69E-08. Thus, animals that are double homozygotes AA for H3GA0027863 and DRGA0009645 are expected to have 1.28 (4*0.32) more piglets than animals that are double homozygotes GG. To test the sensitivity of the analysis, observations with a weight less than 5 were discarded. For this analysis, the estimated SNP effect was also 0.32, with a P-value of 1.08E-08. Figure 4 shows the distribution of the deregressed EBV of TNB1 for animals with different combinations of genotypes for H3GA0027863 and DRGA0009645 (deregressed EBV with a weight less than 5.0 were discarded).
Another region on chromosome 9 (around 66 Mb, based on Sscrofa9) was statistically significant for piglet mortality before weaning (PM1 and PM2, Table 2). SNP in this region were in strong linkage disequilibrium with each other (r 2 ranging from 0.85 to 0.94) but nearly in linkage equilibrium with the SNP in region 79 Mb on the same chromosome that were significant for litter size (r 2 < 0.01) ( Figure 3). Moreover, the SNP that gave suggestive evidence for PM2 at 95.4 Mb on chromosome 9 (MARC0023136) was in weak linkage disequilibrium (r 2 = 0.17) with a SNP in the same proximal area (DRGA0009645) that was significant for litter size ( Figure 3). P-values for    the association of MARC0023136 on TNB1 and TNB2 were 0.002 and 0.0005, respectively, reflecting the observed linkage disequilibrium between MARC0023136 and DRGA0009645. The low linkage disequilibrium between adjacent DRGA0009645 and MARC0023136, which were located 22 kb from each other, may be due to mapping errors in build 9 (Sscrofa9, http://www.ensembl.org). The frequencies of favourable alleles for pig mortality have increased during the last 15 years. This could be due either to random drift or, given the fact that the SNP are associated with the trait under selection, to selection force ( Table 2). Suggestive evidence for associations between SNP and female reproduction traits was also observed for number of stillborn piglets in later parities (NSB2) on chromosome 1, for pig mortality in later parities (PM2) on chromosome 7, for first farrowing interval (FFI) on chromosome 4, and for later farrowing interval (SFI) on chromosome 1 ( Table 2).

Discussion
Good genetic progress for pig reproduction traits has been achieved using a quantitative genetic-based multitrait BLUP evaluation system. Marker-assisted selection [12] was expected to boost selection efficiency to a new level [4]. As is well known, these promises have not been fulfilled, mainly due to the lack of markers strongly linked to QTL. High-density SNP genotyping technology provides a new tool to select elite animals for breeding. SNP are also more powerful to study the genetic background of traits than microsatellites because they have a better genomic coverage and can provide information on historical linkage disequilibrium with potential QTL, whereas microsatellite linkage studies were based on more recent linkage within families. High-density SNP genotypes are primarily incorporated into breeding programs through the use of genomic selection instead of traditional marker-assisted selection, particularly in dairy cattle breeding [13].
Marker-assisted selection on known marker-QTL associations and genomic selection on an overall sum of marker effects across the genome both rely on strong linkage disequilibrium between markers and QTL. Finnish pig breeds are very suitable for SNP association studies because of their high linkage disequilibrium over the typical distances between SNP in the PorcineSNP60 BeadChip [11] and the homogeneity of the population. Other important factors affecting the power of association studies, beyond the actual genetic architecture of the trait, are the number of genotyped animals and the reliability of the observations used in statistical analysis. In this study, over 300 AI boars with an average number of 141 daughters per sire were available for association analyses. Given that the number of genotyped animals is the same in both scenarios, for traits with a low heritability, a half-sib design, as used here is more powerful for association studies than a direct design, in which the same animals are both genotyped and phenotyped [14]. Given the number of genotyped boars and daughters per sire, a direct design would have required several hundreds to several thousands more genotyped animals to achieve the same power as the half-sib design used  here (e.g. with an average number of 40 daughters per sire, h 2 = 0.1, a QTL that explains 5% of the phenotypic variation, and a significance value of 2.0E-06, approximately 650 genotyped sows are needed to achieve the same power (0.8) as 130 genotyped AI boars [14]). Although a larger number of genotyped sires would have improved the power, the results show that the data size was sufficient to discover three new regions on chromosome 9 with effects on litter size and piglet mortality with very small P-values. The reported effects of these SNP are most likely overestimated due to the "winner's curse" effect [15] that is commonly observed in initial genome-wide association studies. A larger population study is still needed to precisely estimate the effects and gene actions of these chromosomal regions on litter size and pig mortality. Annotation of the SNP was based on the pig genome build 9 (Sscrofa9, http://www.ensembl.org). It is well known that build 9 contains errors in the actual position of the SNP and even in the order of the SNP, as suggested by the linkage disequilibrium pattern of the significant SNP on chromosome 9 in this study. However, single SNP association tests are not sensitive to mapping errors. Thus, results (P-values) presented here hold even with an updated pig genome build but the candidate genes that are in the proximity of the significant SNP may change from the ones presented here.
The three regions on chromosome 9 (around 65, 79 and 95 Mb) that were significant for female reproduction traits in this study are within the same region as that reported for ovulation rate by Rohrer et al. [16] (57 to 122 cM, with a peak at 67 cM), based on their whole-genome microsatellite study of Chinese Meishan × European White composite line crossbreds. A significant region on chromosome 9 was also reported for ovulation rate in another microsatellite-based genome scan of an F2 cross of two selected experimental lines [17]. However, this 1 cM region does not overlap with the region reported here or by Rohrer et al. [16]. Yet another microsatellite genome scan of Meishan × Large White F2 population revealed two significant regions on chromosome 9 for female reproduction traits: a region at 127 cM was significant for ovulation rate and one at 36 cM for number of viable embryos and embryo survival [18]. Neither of these regions overlaps with the significant regions in this study.
The suggestive region for NSB2 at the end of chromosome 1 that was found in this study has been reported to carry a QTL for age at puberty [18] but not for number of stillborn piglets. The estrogen receptor 1 (ESR1) gene on chromosome 1 (15186680 to 15450209 bp) is a good candidate gene for all reproduction traits. However, none of the SNP in this region gave significant or suggestive P-values for any of the studied traits here.
The best SNP (ALGA0000673 for SFI) is located 7 Mb from the ESR1 gene. A QTL for number of stillborn has been reported on chromosome 7 [19] and a QTL for ovulation rate on chromosome 4 [18].
It is interesting that the genome regions that reached statistical significance in our study are all on chromosome 9. Without further studies, it is difficult to say whether this is by chance or whether there is some biological or population-based mechanism that explains this result. No linkage disequilibrium was observed between the significant regions on chromosome 9 at the population level, which implies that the regions have segregated independently in the Finnish Landrace population and thus rules out the hitchhiking effect as the most likely explanation.
The region on chromosome 9 that is associated with TNB1 and TNB2 contains several potential candidate genes that may contribute to the physiology of variation in sow fertility. The aryl hydrocarbon receptor (AhR) gene between positions 81024216 and 81081616 bp is involved in folliculogenesis, gonadotrophin receptor expression, proliferation of granulosa cells, and intraovarian estrogen signalling [20]. Its well-balanced activity is necessary for normal ovarian function. Additionally, it has been found that AhR knockout mice have reduced fertility due to disturbed follicle development, with significantly fewer pre-antral and antral follicles and less ovulations compared with wild-type mice [21]. Another candidate gene for litter size is interleukin 6 (IL6), which is located on chromosome 9 between positions 85801970 and 85806347 bp. IL6 is a multifunctional cytokine that regulates various aspects of the immune response and is also expressed during the ovulation process [22]. IL6 has been shown to serve as a potent regulator of ovarian cumulus cell function and cumulus cell oocyte complex expansion, and it may mediate some of its effects [23]. Furthermore, the protein tyrosine phosphatase non-receptor type 12 (PTPN12, PTP-PEST) gene between positions 95427072 and 95467282 bp has been shown to play an essential role in early murine embryogenesis. PTPN12 functions in embryonic vascularization, mesenchyme formation, neurogenesis, and early liver development [24], and may thus affect embryo survival. However, identification of the causal genes for litter size and piglet mortality traits in the reported regions on chromosome 9 requires additional work.

Conclusions
To conclude, this whole-genome SNP association study using 328 Finnish Landrace AI boars revealed three highly significant regions on chromosome 9 with effects on litter size and piglet mortality. Suggestive P-values were also observed on chromosomes 1, 4, and 7 for second and first farrowing intervals and for piglet mortality in later parities, respectively. The frequencies of favourable alleles of the significant SNP are still moderate in the Finnish Landrace population. Thus, if these initial findings are confirmed, the specified SNP will be valuable in the national breeding program through their use in marker-assisted selection.