Weighted single-step GWAS and gene network analysis reveal new candidate genes for semen traits in pigs

Background In recent years, there has been increased interest in the study of the molecular processes that affect semen traits. In this study, our aim was to identify quantitative trait loci (QTL) regions associated with four semen traits (motility, progressive motility, number of sperm cells per ejaculate and total morphological defects) in two commercial pig lines (L1: Large White type and L2: Landrace type). Since the number of animals with both phenotypes and genotypes was relatively small in our dataset, we conducted a weighted single-step genome-wide association study, which also allows unequal variances for single nucleotide polymorphisms. In addition, our aim was also to identify candidate genes within QTL regions that explained the highest proportions of genetic variance. Subsequently, we performed gene network analyses to investigate the biological processes shared by genes that were identified for the same semen traits across lines. Results We identified QTL regions that explained up to 10.8% of the genetic variance of the semen traits on 12 chromosomes in L1 and 11 chromosomes in L2. Sixteen QTL regions in L1 and six QTL regions in L2 were associated with two or more traits within the population. Candidate genes SCN8A, PTGS2, PLA2G4A, DNAI2, IQCG and LOC102167830 were identified in L1 and NME5, AZIN2, SPATA7, METTL3 and HPGDS in L2. No regions overlapped between these two lines. However, the gene network analysis for progressive motility revealed two genes in L1 (PLA2G4A and PTGS2) and one gene in L2 (HPGDS) that were involved in two biological processes i.e. eicosanoid biosynthesis and arachidonic acid metabolism. PTGS2 and HPGDS were also involved in the cyclooxygenase pathway. Conclusions We identified several QTL regions associated with semen traits in two pig lines, which confirms the assumption of a complex genetic determinism for these traits. A large part of the genetic variance of the semen traits under study was explained by different genes in the two evaluated lines. Nevertheless, the gene network analysis revealed candidate genes that are involved in shared biological pathways that occur in mammalian testes, in both lines. Electronic supplementary material The online version of this article (10.1186/s12711-018-0412-z) contains supplementary material, which is available to authorized users.


Background
Artificial insemination (AI) pig industry focuses mainly on maximizing the number of insemination doses produced from each boar ejaculate. To achieve this goal, the ability of boars to produce high-quality semen (high motility and progressive motility with low levels of morphological defects) in sufficient quantity (large number of sperm cells per ejaculate) is decisive [1].
In recent years, with the fast advances in high-throughput genotyping and in molecular techniques in general, there is an increased interest in the study of the molecular processes and genetic mechanisms that affect semen traits. Genes and markers associated with pig semen traits have been described in the literature [2][3][4][5][6][7][8]. However, very few studies analyze large datasets to identify novel quantitative trait loci (QTL) and to provide a deeper knowledge of the genes that control boar semen traits. One reason for this is the genetic complexity of the process for the production and maturation of sperm cells. Mammalian spermatogenesis requires coordination among different genes and cell types (germ cells, Sertoli cells, and Leydig cells) [9] and occurs in the seminiferous tubules of the testes in three steps: mitotic phase, meiotic phase, and spermiogenesis [10]. In the first step (mitosis), spermatogonias produce primary spermatocytes, which enter the first stage of meiosis (meiosis I) and produce secondary spermatocytes. Then, the second step of meiosis (meiosis II) leads to the generation of haploid round spermatids. In the last phase, i.e. spermiogenesis, the spermatids undergo morphological transformations and acquire the spermatozoa shape. Then, the new preformed spermatozoa go through the epididymis to maturate and acquire motility [10]. Mutations and impaired expression of genes that control the whole process of spermatogenesis and sperm maturation can lead to problems in semen quality and fertility.
Genome-wide association studies (GWAS) are commonly used to identify single nucleotide polymorphisms (SNPs) that are associated with QTL with major effects [11]. The weighted single-step GWAS (WssGWAS), proposed by Wang et al. [12], is a method that allows estimation of SNP effects using genomic estimated breeding values (GEBV) from single-step genomic best linear unbiased prediction (ssGBLUP, [13]) based on all phenotyped, genotyped and pedigree-related animals. In addition, it allows unequal variances for SNPs, which results in improved precision of the estimation of SNP effects [12]. Therefore, when the number of animals with both phenotypes and genotypes is small and the traits are controlled by QTL with large effects, the WssGWAS may perform better than traditional GWAS methods. Recent studies have used this method for production, carcass and reproductive traits in livestock [14][15][16][17][18][19][20][21][22][23].
In a post-GWAS study, a gene network analysis can be performed for candidate genes related to QTL regions identified in GWAS. The gene network is used to investigate pathways and biological processes that are shared by these genes [24]. In addition, the biological information provided by these gene networks are helpful to understand genetic differences between populations for the same trait [25].
In this study, our aim was to identify QTL regions that are associated with four semen traits (motility, progressive motility, number of sperm cells per ejaculate and total morphological defects) in pigs. In addition, our aim was to identify candidate genes within those QTL regions that explained the highest proportions of genetic variance. To achieve our goal, we performed a WssGWAS in two commercial pig lines (L1: Large White type and L2: Landrace type), followed by gene network analyses to investigate the biological processes shared by genes that were identified for the same semen traits in these two lines.

Phenotypic, genotypic and pedigree data
Phenotypic data were available from two commercial pig lines, a Large White type line (L1) and a Landrace type line (L2), on ejaculate samples that were collected between January 2007 and October 2014. The evaluated traits were: (1) sperm motility (MOT), which is the proportion of moving sperm cells in an ejaculate; (2) sperm progressive motility (PROMOT), defined as the proportion of sperm cells that move in a straight line; (3) abnormal sperm cell number (ABN), which is the total number of sperm cells with morphological abnormalities; and (4) the total number of sperm cells in the ejaculate (Ncells per 10 6 sperm cells). MOT and PROMOT were evaluated using the UltiMateTM CASA system (Hamilton Thorne Inc., Beverly, MA, USA). Ncells was calculated as the product of the semen volume (mL) and concentration (10 6 mL −1 , measured by the CASA system). The values of this measure were not normally distributed, and thus logtransformed before further analyses (lnNcells). Ejaculates evaluated for ABN were analyzed microscopically at a 1000× magnification by a trained technician with a phase contrast microscope and a thermal plate (BH-2, Olympus, Tokyo, Japan), counting 100 sperm cells per assessment. All semen traits were assessed on the day of semen collection using fresh semen.
The phenotypic data for MOT, PROMOT and lnNcells consisted of 43 Table 1.
Genotyping data for 3737 animals (856 males and 2881 females) from L1 and 3307 animals (953 males and 2354 females) from L2 were available. A majority of the animals (2718 for L1 and 2394 for L2) were genotyped using the Illumina PorcineSNP60 BeadChip (Illumina, Inc., San Diego, CA) but part of the animals from L1 (n = 1019) and L2 (n = 913) were genotyped using the (Illumina, Inc.) GeneSeek Custom 80 K SNP chip (GeneSeek Inc., Lincoln, NE). Quality control was performed by excluding SNPs that had an unknown position on the pig genome build 10.2 [26], that were located on sex chromosomes, that had a call rate lower than 0.95 or a minor allele frequency lower than 0.01, or that were in strong deviation from Hardy-Weinberg equilibrium (χ 2 > 600). Animals for which the frequency of missing genotypes was higher than 0.05 were also excluded. After quality control, missing genotypes from animals genotyped with the SNP60 BeadChip were imputed with the software Beagle version 3.3.2 [27] to the set of SNPs on the SNP60 BeadChip that passed quality control. In addition, genotypes from the animals genotyped with the GeneSeek Custom 80 K SNP chip were imputed to the set of SNPs on the SNP60 BeadChip that passed quality control. After quality control, 39,945 and 41,478 SNPs remained for L1 and L2, respectively, and were used for the GWAS.
The complete pedigree included 8352 animals for L1 and 8271 animals for L2. The total number of animals that remained after pedigree pruning was 6724 for L1 and 6502 for L2. Most animals had either phenotypic or genotypic data. The number of animals with both phenotypes and genotypes was 349 for L1 and 446 for L2.

Statistical analyses
The weighted ssGBLUP analysis was conducted within line using the BLUPF90 software family [28] adapted for genomic analyses [29]. First, variance components were estimated using AIREMLF90, which were then used in BLUPF90 to predict GEBV. SNP effects were then calculated using postGSf90 software.
The single-trait animal model for ssGBLUP was as follows: where y is the vector of phenotypic observations; β is the vector of fixed effects (combined effects of AI station, year and month of semen collection; the laboratory where the samples were analyzed and the covariates of interval between two subsequent semen collections in days and age of the boar at the time of collection in months); a is the vector of random additive genetic effects; p is the vector of random permanent environmental effects; ε is the vector of random residuals; and X , Z and W are the incidence matrices of β , a and p , respectively.
It was assumed that a ∼ N 0, Hσ 2 a , p ∼ N 0, Iσ 2 p and ε ∼ N 0, Iσ 2 e , where σ 2 a , σ 2 p and σ 2 e are the additive genetic, permanent environmental and residual variances, respectively; H is the matrix that combines pedigree and genomic information [13], and I is an identity matrix. The inverse of H needed for mixed model equations is given by: where A is the numerator relationship matrix based on pedigree for all animals; A 22 is the numerator relationship matrix for genotyped animals; and G is the genomic relationship matrix [30]. This matrix was obtained as follows: where Z is a matrix of gene content adjusted for allele frequencies (0, 1 or 2 for aa, Aa and AA, respectively); D is a diagonal matrix of weights for SNP variances (initially D = I ); M is the number of SNPs, and p i is the minor allele frequency of i-th SNP.
Estimates of SNP effects and weights for WssGWAS were obtained according to Wang et al. [12] by the following steps: 1. In the first iteration ( t = 1 ): 2. GEBV were calculated for the entire dataset using ssGBLUP; 3. GEBV were converted to estimates of SNP effects ( û ): where â g is the GEBV of animals that were also genotyped; 4. The weight for each SNP to be used in the next iteration was calculated as: The SNP weights were normalized to keep the total genetic variance constant: This procedure was run for three iterations based on the realized accuracies of GEBV according to Legarra et al. [31] and performed by Wang et al. [14]. At each iteration, the weights for SNPs were updated (steps 4 and 5), used to construct the G matrices (step 6), update the GEBV (step 2) and, consequently, the estimated SNP effects (step 3). The percentage of genetic variance explained by the i-th set of consecutive SNPs ( i-th SNP window) was calculated as described by Wang et al. [14] as: where a i is the genetic value of the i-th SNP window that consists of a region of consecutive SNPs located within 0.4 Mb, which is the average haplotype block size in commercial pig lines [32], including the lines considered in the present study; σ 2 a is the total additive genetic variance; Z j is the vector of gene content of the j-th SNP for all individuals and û j is the effect of the j-th SNP within the i-th window. Manhattan plots showing these windows were created using the R software [33].

Selection of relevant SNP windows, search for candidate genes, and gene network analyses
The selection of relevant SNP windows and the search for genes within these QTL regions were performed in three steps. First, for the four traits (MOT, PROMOT, lnNcells and ABN), the SNP windows that explained 1% or more of the total genetic variance based on the Wss-GWAS were selected within each line (L1 and L2). The threshold of 1% was chosen based on the literature [19,20,34] and on the expected contribution of SNP windows [35]. For example, assuming an equal contribution of all windows (on average, we had 4223 and 4229 windows for each trait in L1 and L2, respectively), the expected proportion of genetic variance explained by each window was 0.02% for both lines (100/4223 for L1 and 100/4229 for L2). The threshold of 1% used in the present study is equal to 50 times the expected variance (0.02% × 50 = 1%) and considered a suitable threshold for our purposes. Then, after selecting the windows that explained more than 1% of the genetic variance, a search for overlapping windows for two or more traits of the same line was performed. Windows were considered to overlap if their midpoints were less than 0.4 Mb apart. Second, the three most important windows (that explained the highest proportion of genetic variance) for each trait and the 0.4 Mb region on either side of these windows midpoints were also identified. Third, based on the windows selected in the first step (> 1% of variance explained), overlapping windows for the same traits, but across lines, were investigated (also considering a maximum distance of 0.4 Mb between midpoints). Based on the start and end positions of each identified and selected QTL region in the third step, we identified the genes in these QTL regions based on the Gene database for Sus scrofa available at "National Center for Biotechnology Information" (NCBI, [36]). For all the identified genes, we manually searched the literature if they had a previously identified relationship with the traits under study. In addition, based on human genes with the same description, we carried out four gene network analyses that described the biological processes and relations between the L1 and L2 sets of genes that were identified for the same traits by using the ClueGO and CluePedia Cytoscape plug-ins [37,38]. The ClueGO plug-in combines Gene Ontology (GO) terms and KEGG/BioCarta pathways and develops a GO/pathway network. It also calculates enrichment and depletion tests for groups of genes based on the hypergeometric distribution and corrects the P-values for multiple testing [37]. Using the CluePedia plug-in, new associations between genes can be discovered with enrichments and added to the ClueGO pathways [38].

Results
Estimates of variance components for all semen traits in both lines are in Additional file 1: Table S1. Low to moderate heritabilities ranging from 0.10 to 0.31 were estimated (Table 2), with higher estimates in L1 than L2 for MOT, PROMOT and ABN.
We found no overlapping QTL regions between the two lines for the same traits in this study. Nevertheless, the gene network analysis for PROMOT revealed two genes in L1 (PLA2G4A and PTGS2) and one gene in L2 (HPGDS) that shared the biological processes of eicosanoid biosynthesis and arachidonic acid metabolism. The genes PTGS2 in L1 and HPGDS in L2 were also found to share the biological process of cyclooxygenase pathway (Fig. 3).

Discussion
In the past years, interest in investigating genomic regions that control boar semen traits has increased due to advances in molecular and genotyping techniques, statistical methods, and the ease of GWAS application. According to Wang et al. [12], two of the most used GWAS methods are (1) single-SNP GWAS, which fits each SNP separately as a fixed effect in a model that accounts for population stratification; and (2) Bayesian methods, which analyze all SNPs at the same time. Nonetheless, when the number of animals that are both phenotyped and genotyped is relatively small, the application of single-SNP GWAS and Bayesian methods is limited, since the calculation of pseudo-phenotypes (i.e. deregressed breeding values [39]) is necessary [14]. An alternative method is single-step GWAS (ssGWAS), which was proposed by Wang et al. [12] and converts the GEBV of genotyped animals obtained by ssGBLUP into SNP effects. However, the ssGWAS model, which assumes equal variance for all marker effects, may be limited when traits are affected by large QTL [12,40]. To overcome this limitation, Wang et al. [12] proposed the WssGWAS approach, in which SNP effects are weighted according to their importance for the trait of interest, which improves QTL detection [12].
To perform the WssGWAS, first, the H −1 matrix is calculated, by combining all known pedigree and genotype information, and then used in the ssGBLUP procedure to estimate GEBV for all animals. Then, the GEBV of the genotyped animals are used to estimate effects for the SNPs. Finally, SNP effects are used to calculate the percentage of genetic variance that is explained by sets of consecutive SNPs (SNP windows). In this approach, the SNP effects are not directly estimated from the model and there are no measures of uncertainty for the statistical tests. However, the WssGWAS provides information on the most important SNP windows, based on the proportion of explained genetic variance, which is a general concept that is widely accepted in QTL detection analyses [12][13][14] but does not allow for a formal testing of significance. In this study, we chose the WssGWAS approach because: (1) it can integrate all phenotypic, genotypic and pedigree data simultaneously, thus avoiding the need to calculate pseudo-phenotypes for genotyped animals to incorporate all phenotypic information; (2) it allows the use of different weights for SNPs according to their importance, which is a deviation from the non-realistic GBLUP assumption of the infinitesimal model and improves the precision of estimates of SNP effects; and (3) it provides the possibility to work with SNP windows, since a window of consecutive SNPs in the GWAS may be more successful in finding QTL regions compared to individual SNP analysis because of linkage disequilibrium (LD). In general, analyses that consider the association of individual SNPs may overestimate the number of detected QTL [41].
In spite of the small number of animals that were both genotyped and phenotyped (349 for L1 and 446 for L2), all the information of the 3737 and 3307 genotyped animals, the 866 and 900 phenotyped boars (849 and 886 for ABN), and the 6724 and 6502 animals in the pedigree file, for L1 and L2, respectively, were used to calculate GEBV, and consequently, to estimate SNP effects in the WssGWAS. Obtaining a large dataset of both genotyped and phenotyped animals is unlikely or difficult for semen traits because phenotypes can be recorded only for animals that are used for AI. The number of males and females genotyped in this study also shows the relatively small number of boars, 856 and 953 for L1 and L2, respectively, compared to the number of females, 2881 and 2354. In this context, the advantage of WssGWAS  is to supply additional information on relatives without genotypes in order to improve the statistical power of QTL detection [12].
In our previous study [7], a classical GWAS (single-SNP GWAS) was performed for sperm motility in a subset of the L1 and L2 populations that were used in the present study: 645 and 1886 phenotyped and genotyped animals for L1, respectively, and 760 and 1972 animals for L2. Due to the small number of animals that were both genotyped and phenotyped, deregressed breeding values [39] were calculated and included as response variables in the GWAS model, a polygenic effect was added in the model to account for population structure, and a genome-wide false discovery rate (FDR) was applied (q-values) to avoid false positives due to multiple testing. The results showed no SNPs with a significant association (q-value ≤ 0.05) with sperm motility for L1 but six SNPs associated with the trait for L2 (SSC1, from 117.26 to 119.50 Mb). In this QTL region on SSC1, a novel candidate gene (MTFMT) that affects translation efficiency of proteins in sperm cells, was identified. The number of relevant QTL regions that was identified for MOT in the present study was greater than in [7], which indicates that the WssGWAS was more successful in detecting QTL. All the recommended steps for the GWAS method used were performed in each study and reliable candidate genes with biological meaning were found. In addition, Marques et al. [42] reported high genetic correlations between MOT, PROMOT and ABN, which corroborate our findings regarding the overlapping QTL regions that explained more than 1% of the genetic variance for these traits. Therefore, we believe that the QTL results of the present and previous studies should not be considered as false positives and we conclude that the single-SNP method with pseudo-phenotypes and the amount of data in the previous study were not able to identify some of the relevant QTL related to MOT. For L1, we identified overlapping QTL regions for MOT and other semen    (Table 3), but they did not include the MTFMT gene described in our previous study [7], likely as a result of the larger number of animals used in the current study and the different statistical models used in the two studies, i.e. WssGWAS based on the H relationship matrix in the current study and single-SNP GWAS using the A relationship matrix and deregressed phenotypes in the previous study.
In the WssGWAS, we identified several relevant QTL regions associated with the traits under study, which confirms the assumption that these traits have a complex genetic determinism. In this study, the region used to search for candidate genes was not limited to the SNP window, but also included the upstream and downstream flanking regions. The use of a larger genomic region for the identification of genes is important because SNPs within a window may be in high LD with QTL in the surrounding area.
We detected several candidate genes for the semen traits in both L1 and L2 pig lines (Tables 3 and 4). For L1, the dynein axonemal intermediate chain 2 (DNAI2) gene, located on SSC12, was considered the best candidate gene for MOT in the QTL region. The DNAI2 gene encodes axonemal dyneins; the axoneme is a microtubular structure located in the center of all motile cilia and flagella, including sperm flagella [43]. Dyneins are large multisubunit ATPases that interact with microtubules to generate the driving force for flagellar motility [44]. Mutations in the human DNAI2 are involved in defects of the axoneme [45].
For L1, some overlapping QTL regions were identified between MOT and PROMOT. On SSC5, we identified one candidate gene, i.e. sodium voltage-gated channel alpha subunit 8 (SCN8A). Pinto et al. [46] described the expression of SCN8A in human sperm flagellum principal piece and showed that sodium channels contribute to the regulation of human sperm motility. Another candidate gene for both traits was identified on SSC13, i.e. IQ motif containing G (IQCG). Li et al. [10] showed that Iqcg knockout mice presented severe malformation and total immobility of their spermatozoa because of disorganized sperm flagellum axoneme. Harris et al. [47] reported that mice with mutations in the Iqcg gene presented spermiogenesis defects, with incomplete sperm tail formation.
We detected one overlapping QTL region in L1 for MOT, PROMOT and ABN on SSC14. This QTL region includes the gene LOC102167830, which is described as a spermatogenesis associated (SPATA) protein 31E1like. SPATA genes form a large gene family that plays a very important role in testis development and spermatogenesis [48]. In humans, SPATA31E1 is a subfamily of the SPATA31 large gene family. In Mus musculus, only the Spata31 gene has been described. Wu et al. [49] demonstrated that the protein encoded by this mouse gene is located in the acrosome of round and elongated spermatids and Spata31 knockout mice showed disorganized testis morphology and aberrant spermatogenic cells in seminiferous tubules.
The gene network analysis was very useful to investigate the shared biological processes between the candidate genes that were identified for the same traits between the two lines. For PROMOT, we found two genes in L1 (PLA2G4A and PTGS2) and one gene in L2 (HPGDS) that are involved in eicosanoid biosynthesis and arachidonic acid metabolism, of which PTGS2 and HPGDS are also involved in the cyclooxygenase pathway (Figs. 3 and 4). Kaewmala et al. [50] described the presence of PTGS2 (COX-2) protein in boar Leydig cells, spermatogonium and spermatids, which suggests that it may have a role in the spermatogenic process in pigs. They also showed that the levels of COX-2 mRNA and enzyme tended to be higher in animals with low sperm motility, which indicates that it has a negative effect on boar sperm quality. Frungieri et al. [51] showed that the COX-2 enzyme is abundant in the interstitial cells of male seminiferous tubules with impaired spermatogenesis. The COX-2 enzyme provides a precursor for the action of HPGDS in testes interstitial mast cells (Fig. 4), producing prostaglandin D 2 (PGD 2 ), which regulates the function of Leydig cells [52]. Yamamoto et al. [53] and Saharkhiz et al. [54] showed that, after treatment with mast cells blockers, sperm motility increased in men. HPGDS is also involved in negative regulation of male germ cell proliferation (Fig. 3), which is linked to PGD 2 production. Moniot et al. [55] identified the PGD 2 pathway as one of the earliest signaling pathways involved in male germ cell differentiation in fetal testes. In the cyclooxygenase pathway, PGH 2 can be converted into prostaglandin E 2 (PGE 2 ) or into prostaglandin F 2α (PGF 2α ) (Fig. 4, steps IV and V). Schlegel et al. [56] stated that seminal plasma is the richest natural source of prostaglandins, which are synthesized in the seminal vesicles. The authors showed that high concentrations of prostaglandins (especially PGF 2α ) in the human seminal fluid were associated with poor sperm motility. Rios et al. [57] demonstrated that low physiological levels of PGE 2 and PGF 2α were able to increase and prolong human progressive sperm motility.
For L2, the NME/NM23 family member 5 gene (NME5) on SSC2 was considered the best candidate for lnNcells (Table 4). According to Munier et al. [58], this gene is highly and specifically expressed in testis and the encoded protein is important for the initial stages of spermatogenesis (before meiosis I). Choi et al. [59] reported that, when expression of murine Nm23-M5 (which shares 86% identity with its human homolog NME5) is reduced, the round and elongated spermatids in the testes become more sensitive to oxidative stress, leading to DNA damage and reduced cell numbers, showing that this gene is a critical factor for spermiogenesis.
The antizyme inhibitor 2 gene (AZIN2) is located in a region on SSC6 that explained 4.7% of the genetic variance for ABN in L2 (Table 4). Lopez-Contreras [60] showed that AZIN2 expression is critical for spermiogenesis in mice. Its expression starts in the testis at the beginning of spermiogenesis and its mRNA level increases in more differentiated spermatids. During spermiogenesis, round spermatids elongate, develop an acrosome in the sperm head, form a flagellum, and dispose of the excessive cytoplasm [10]. Therefore, if a mutation in AZIN2 causes impaired spermiogenesis, it may lead to morphological defects in the spermatozoa.
The methyltransferase like 3 gene (METTL3) is located on SSC7, in an overlapping QTL region for MOT and PROMOT in L2. According to Liu et al. [61], the METTL3 protein catalyzes the methylation and formation of N 6 -methyladenosine (m 6 A), which is the most prevalent and reversible RNA epigenetic modification in mammalian mRNA. Yang et al. [62] detected increased m 6 A contents in sperm RNA from patients with reduced sperm progressive motility, which was related to a higher expression of METTL3.
An overlapping QTL region for MOT and ABN was detected for L2 on SSC7 (Table 4), which includes the spermatogenesis associated 7 gene (SPATA7). This gene was first identified in rat and human spermatocytes and may be involved in preparing chromatin for the initiation of meiotic recombination [63]. According to Ferguson et al. [64], meiotic recombination ties homologous chromosomes together and facilitates proper segregation of chromosomes during meiosis. Errors in the formation of crossovers can result in the production of aneuploid gametes. Sun et al. [65] reported a higher frequency of sperm aneuploidies for some chromosomes in men with sperm morphological defects compared to men with normal sperm concentrations. Thus, the SPATA7 gene in this QTL region on SSC7 is considered the best candidate gene for MOT and ABN.

Conclusions
We identified several QTL regions that are associated with semen traits in two pig lines using the weighted single-step GWAS, which allowed detection of QTL in spite Fig. 4 Graphic scheme of pathways shared by genes found in network analysis for progressive motility. Only part of the cyclooxygenase pathway is presented. Cytosolic phospholipase A2 group IVA (PLA2G4A) is involved in cleaving arachidonic acid from phospholipids, preferentially (I). Then, the free arachidonic acid is metabolized to produce eicosanoids (including prostaglandins) in the process known as cyclooxygenase pathway (II-V). The genes prostaglandin-endoperoxide synthase 2 (PTGS2/COX-2, number II) and hematopoietic prostaglandin D synthase (HPGDS, number III) are involved in this pathway. The COX-2 enzymes catalyze prostaglandin H 2 (PGH 2 ) synthesis from arachidonic acid (II), providing PGH 2 for the action of HPGDS (III) and production of prostaglandin D 2 (PGD 2 ) in testes interstitial mast cells. PGH 2 can also be converted into prostaglandin E 2 (PGE 2 , number IV) and prostaglandin F 2α (PGF 2α , number V) of the small number of animals having both phenotypes and genotypes. A large part of the genetic variance of the semen traits was explained by different genes in the two lines but the gene network analysis revealed candidate genes for these two lines that are involved in shared biological pathways in the mammalian testes. These results can be used to search for causative mutations and for marker-assisted selection to enhance the production and quality of semen for a more efficient use of AI in pig breeding and production.