Genome-wide association and genomic prediction for host response to porcine reproductive and respiratory syndrome virus infection

Background Host genetics has been shown to play a role in porcine reproductive and respiratory syndrome (PRRS), which is the most economically important disease in the swine industry. A region on Sus scrofa chromosome (SSC) 4 has been previously reported to have a strong association with serum viremia and weight gain in pigs experimentally infected with the PRRS virus (PRRSV). The objective here was to identify haplotypes associated with the favorable phenotype, investigate additional genomic regions associated with host response to PRRSV, and to determine the predictive ability of genomic estimated breeding values (GEBV) based on the SSC4 region and based on the rest of the genome. Phenotypic data and 60 K SNP genotypes from eight trials of ~200 pigs from different commercial crosses were used to address these objectives. Results Across the eight trials, heritability estimates were 0.44 and 0.29 for viral load (VL, area under the curve of log-transformed serum viremia from 0 to 21 days post infection) and weight gain to 42 days post infection (WG), respectively. Genomic regions associated with VL were identified on chromosomes 4, X, and 1. Genomic regions associated with WG were identified on chromosomes 4, 5, and 7. Apart from the SSC4 region, the regions associated with these two traits each explained less than 3% of the genetic variance. Due to the strong linkage disequilibrium in the SSC4 region, only 19 unique haplotypes were identified across all populations, of which four were associated with the favorable phenotype. Through cross-validation, accuracies of EBV based on the SSC4 region were high (0.55), while the rest of the genome had little predictive ability across populations (0.09). Conclusions Traits associated with response to PRRSV infection in growing pigs are largely controlled by genomic regions with relatively small effects, with the exception of SSC4. Accuracies of EBV based on the SSC4 region were high compared to the rest of the genome. These results show that selection for the SSC4 region could potentially reduce the effects of PRRS in growing pigs, ultimately reducing the economic impact of this disease.


Background
For decades the swine industry has been battling the economically important disease of porcine reproductive and respiratory syndrome (PRRS), caused by the PRRS virus (PRRSV) but success has been limited. Since the infection mechanisms of the virus are not fully understood, vaccine development is a challenge [1]. There has been no genetic selection of pigs for PRRS resistance or tolerance due to the lack of good DNA markers for marker-assisted selection and lack of a good indicator trait for indirect selection.
Recently, a quantitative trait locus (QTL) was identified on Sus scrofa (SSC) 4 that explains a considerable amount of the total genetic variance for serum viremia and weight gain of weaned piglets following experimental infection [2]. In subsequent work, the effect of this~1 Mb region was identified in three unrelated populations, encompassing five experimental challenge trials of 200 weaned pigs [3]. Many of the single nucleotide polymorphisms (SNPs) in this~1 Mb region are in very high linkage disequilibrium, which makes identification of the causative mutation difficult; a single tag SNP (SNP WUR10000125) appears to track the effect of the region with the B allele being associated with PRRS tolerance [2,3]. The fore mentioned studies did not thoroughly investigate the rest of the genome, which may contain pertinent information for host response to PRRSV infection. The addition of three trials to the data analyzed in Boddicker et al. [3] has potentially increased the power in order to find these, likely smaller, effects.
The objectives of the current study were to (1) reestimate genetic parameters for host response to experimental PRRSV infection in growing pigs by including three additional, unrelated populations, (2) investigate additional genomic regions associated with weight gain and viremia in response to PRRSV challenge, (3) determine if there is a smaller informative haplotype block within the~1 Mb region on SSC4 associated with host response to PRRSV infection, (4) determine which haplotypes within the SSC4 region are associated with the favorable effects of the QTL for viremia and weight gain, and (5) investigate accuracies of genomic estimated breeding values (GEBV) for host response to PRRS based on the SSC4 region and the rest of the genome, by using cross-validation.

Methods
The Kansas State University Institutional Animal Care and Use Committee approved all experimental protocols for this study.

Study design
A detailed description of the design, data collection, and molecular techniques used in the PRRS host genetic consortium trials has been previously published [4]. Briefly, each challenge trial of~200 commercial pigs involved transporting animals at weaning (18-28 days of age) to Kansas State University, where they were subjected to a PRRSV challenge. Within a trial, pigs were from the same high health farm, except for trials 5 and 8, which each included pigs from two farms. All farms were free of PRRSV, Mycoplasma hyopneumoniae, and swine influenza virus. Upon arrival, pigs were randomly placed into pens of 10 to 15 pigs. After a 7-day acclimation period, pigs between 25 and 35 days of age (day 0), were experimentally infected intramuscularly and intranasally with 10 5 (TCID50) of NVSL 97-7985, a highly virulent PRRSV isolate [5]. Blood samples were collected at −6, 0, 4, 7, 11, 14, 21, 28, 35, and 42 days post-infection (dpi). Body weight was measured at 0, 7, 14, 21, 28, 35, and 42 dpi. Pigs were euthanized at 42 dpi. Trials 7 and 8 were stopped at 35 dpi due to facility availability.
Viremia was measured using a semi-quantitative TaqMan PCR assay for PRRSV RNA, as described in Boddicker et al. [2]. Assay results were reported as the log 10 of PRRSV RNA copies per mL of serum. Ear tissue was collected from all pigs for DNA isolation. Tissue or genomic DNA from the sires of pigs in trials 1 through 3 and from available sires and dams for trials 4 through 8 was supplied by the breeding companies. Tissues or DNA samples were sent to GeneSeek Inc. (Lincoln, Nebraska) for genotyping with Illumina's Porcine SNP60 BeadChip (San Diego, California).
Data from eight trials of up to 200 pigs were analyzed (Table 1). Trials 1 through 3 were described in Boddicker et al. [2] and included pigs of the same cross from a single breeding company. Trials 4 and 5 were described in Boddicker et al. [3] and included two unrelated populations from different breeding companies. Trials 6 through 8 are unique to this paper and were sourced from three additional breeding companies, with pigs that were unrelated to those in Boddicker et al. [2,3]. Pigs in trial 7 were sourced from the same breeding company as those in trial 4 but were produced using different sire and dam lines. Table 1 provides an overview of the population structure by trial. See Boddicker et al. [2] for further details on trials 1 through 3 and Boddicker et al. [3] for further detail on trials 4 and 5. Pigs in trial 6 were purebred Landrace (LR). Pigs in trial 7 were from Pietrain sires and Large White (LW)/LR/Yorkshire dams. Trial 8 pigs were from Duroc sires and Yorkshire/LR dams. Across all eight trials, 175 pigs died before 42 dpi (Table 1). Dead pigs were necropsied and subsequent gross and microscopic pathology by a board-certified pathologist identified PRRS associated disease as the major source of mortality, except for trial 6. Death loss was high in trial 6, 46% by day 42, due to secondary bacterial infections, as identified by pathology, including Escherichia coli, Streptococcus suis, Staphylococcus aureus, and Mycoplasma hyopneumoniae.

Phenotypic traits and pedigree
Details on the phenotypic traits analyzed are in Boddicker et al. [2,3]. Briefly, VL was quantified as the area under the curve for log-transformed serum viremia at 0, 4, 7, 11, 14, and 21 dpi. Weight gain to 42 dpi (WG) was calculated as body weight (BW) at 42 dpi minus BW at day 0 dpi; for trials 7 and 8 WG35 was calculated. Mortality was defined as death prior to the end of the experiment. Edits for trials 1-3 are in Boddicker et al. [2] and in Boddicker et al. [3] for trials 4 and 5. Briefly, edits for VL removed 34 individuals from the first 3 trials and 15 for trials 4 and 5. For WG, 47 and 18 individuals were removed, respectively. Edits removed 88 individuals from trials 6 through 8, with 86 due to death prior to 21 dpi and two with missing viremia data. For WG42, 111 individuals were removed from trials 6 through 8, all due to death prior to 42 dpi. The total number of individuals available after edits is in Table 2.
Recorded pedigrees (sire and dam) were checked and corrected using SNP genotypes, as described in Boddicker et al. [3]. Briefly, parent-offsping mismatch frequencies were calculated as the number of SNPs for which the parent and offspring had opposing homozygous genotypes divided by the total number of polymorphic SNPs for which the parent and offspring were both homozygous. If a parent-offspring pair had a mismatch frequency of less than 2%, then the named parent was accepted. Otherwise, offspring genotypes were compared to all possible parents and the most likely parent was chosen. The original pedigrees provided by the breeding companies were mostly correct but a few full-and half-sib families were reassigned to different sires and dams.

Statistical analyses
Heritabilities and variances due to litter were estimated based on validated and corrected pedigree relationships, as determined from SNP genotypes (see Boddicker et al. [3] for details), with a single-trait animal model using the software ASREML [6]. Sex and the interaction of trial and parity of the sow were included as fixed factors and pen within trial, animal, and litter as random effects. Piglets were born from parities 1 to 7. Parities 3 through 7 were combined into one parity class. The effect of farm of origin for trials 5 and 8 was not significant (p > 0.43) and was excluded from the analyses. The genetic correlation between traits was estimated using a bivariate animal model with the same fixed and random factors as used in the single-trait models.

Genome-wide association analyses
Associations of SNP genotypes with phenotypes were analyzed by fitting all SNPs simultaneously using Bayesian genomic selection methods [7], as implemented in the software GenSel [8], using the following mixed model: where y = vector of phenotypic observations, X = incidence matrix relating fixed factors to phenotypes, b = vector of fixed factors of sex, pen within trial, and the interaction of trial and parity class, z j = vector of the genotype covariate for SNP j (j = 1 to k) based on the number of B alleles using Illumina's (San Diego, California) genotype calling (coded 0, 1, 2, or equal to the trial average for missing genotypes), α j = allele substitution effect for SNP j, and δ j = indicator for whether SNP j was included (δ j = 1) or excluded (δ j = 0) in the model for a given iteration of the Markov chain Monte Carlo (MCMC). Pen within trial  was included as a fixed factor, as opposed to a random effect in ASREML, because the current version of GenSel does not allow additional random effects. A total of 50 000 iterations were run for each analysis, with the first 5000 iterations discarded as burn-in. The probability of δ i = 0 was set equal to π = 0.99. The Bayesian model was implemented using method Bayes-B [7]. Genomic regions associated with traits were identified using 1 Mb, non-overlapping windows using build 10.2 of the swine genome (http://www.ncbi.nlm.nih.gov/nuccore?term=199 Sus%20scrofa%2C%20whole%20genome%20shotgun%20 sequence, accessed November 1, 2011).

Haplotype analyses
The QTL previously identified on SSC4 included 38 SNPs, two of which had 0 called genotypes [2]. The remaining 36 SNPs in the~1 Mb region on SSC4 were analyzed to remove uninformative groups of SNPs, with the goal of reducing the size of the~1 Mb region. Linkage disequilibrium (LD) between SNPs in the~1 Mb region was determined using Haploview [9]. Using a univariate animal model in ASREML that included sex, the interaction of trial and parity of the sow, and genotype for SNP WUR10000125 as fixed factors and pen within trial, animal, and litter as random effects, each of the remaining 35 SNPs was fitted as an additional covariate, one at a time, to determine if the additional SNP accounted for effects not captured by SNP WUR10000125. The threshold for discarding a SNP from the region was p < 0.05. Parental and offspring genotypes for the 36 informative SNPs in the~1 Mb region on SSC4 were ordered using the PHASE software [10], separately for each trial. Haplotypes with high probabilities, as identified by PHASE software, were analyzed with MEGA5 software [11] to establish a phylogenetic tree. The neighbor-joining method, with the p-distance option, was used to create the tree. Haplotypes identified in the~1 Mb region and for the reduced region were analyzed.
Using the results from the phylogenetic tree for the reduced region, haplotypes that carried the B allele at SNP WUR10000125 were allocated to two groups based on phylogenetic distance from one another. The main effect of haplotype group was fitted as a fixed class factor in the abovementioned ASREML analysis to determine which haplotypes were associated with the desirable phenotypes of VL and WG. Due to the apparent dominance effect of the B allele, an individual had to have at least one copy of a haplotype that carried the B allele to be placed in that group. If the two groups of haplotypes that carried the B allele were not significantly different from each other but were significantly different from the group of haplotypes that carried the A allele, both B haplotypes groups were assumed to carry the favorable allele of the causative mutation. A similar procedure was used for the A haplotypes to determine whether all A haplotypes were associated with an unfavorable phenotype.

Cross-validation
Accuracy of genomic predictions across populations was evaluated by cross-validation, which involved training on one population and validating on another population. Populations were defined by trial, except for trials 1 through 3, which were considered as one population, since pigs included in those three trials were crossbreds from the same lines and breeding company. Each population was validated twice, using a 'reduced' and a 'full' training population ( Table 3). The training populations included all populations except for the population that was being validated, with some exceptions. For the reduced training populations, only one of the first three trials was used. Trial 3 was used for this purpose because it had the highest estimates of heritability of the three trials for both VL and WG (0.45 and 0.50, respectively). For the full training populations, trials 1 through 3 were included when validating on trials 4 through 8. When validating on trials 1 and 2, trial 3 was included in the training population, which resulted in some of the animals in the validation population to be related to animals in the training population.
To allow predictive ability of the rest of the genome, excluding the SSC4 region, to be assessed, estimates of allele substitution effects for the training population were acquired using the additive Bayesian model described above but with genotype at SNP WUR10000125 included as a fixed class effect to account for the additive and non-additive effects of the QTL in this region on VL and WG [2]. Method Bayes-CPi was used rather than Bayes-B because it assumes a homogenous variance across all SNPs, which appeared appropriate given the results of the GWAS, and estimates parameter π from the data [7]. The starting value of π was set to 0.99. The GEBV of individual i in the validation population for the rest of the genome (excluding the SSC4 region) was predicted as: where k = number of SNPs included in the prediction (58 277 SNPs), z ij = genotype covariate of SNP j for animal i (coded 0, 1, 2, or trial average for missing genotypes), andα j = allele substitution effect estimate for SNP j based on analysis of the training population. To estimate the accuracy of predictions based on the SSC4 region alone, the estimates of the additive and dominance effects of SNP WUR10000125 from the GenSel analysis of the training population were used to estimate the genotypic value based on the observed genotype for SNP WUR10000125 of each individual in the validation population. Accuracy of prediction, defined as the correlation between the GEBV and the true BV, were estimated as the correlation between the GEBV and adjusted phenotypes in the validation population divided by the square root of heritability ( Table 2) obtained when analyzing all eight trials together [3]. Here, adjusted phenotypes were phenotypes adjusted for estimates of fixed effects (sex, pen within trial, and the interaction of trial and parity class) within the validation population. Estimates of accuracies of GEBV across all eight trials were then obtained by correlating adjusted phenotypes with GEBV deviated from their respective trial means, and dividing by the square root of heritability.

Genetic parameters
Estimates of heritability, litter effects, and phenotypic and genetic correlations for VL and WG are in Table 2.
Heritabilities for VL and WG were moderate at 0.44 and 0.29, respectively, and were of similar magnitude as those reported by Boddicker et al. [3] using a subset (trials 1-5) of the data analyzed here. In comparison to the estimates from the first five trials, the proportion of variance explained by litter effects decreased for VL (0.11 to 0.09) and increased for WG42 (0.09 to 0.12). These estimates of heritability provide solid evidence that genetic improvement of host response to PRRSV infection is possible. Phenotypic and genetic correlations between VL and WG were moderate and negative.
Further analysis of the SSC4 region SSC4 effects in trials 6-8 Results of single-marker analyses for the most significant SNP in the SSC4 region (WUR10000125) that was identified by Boddicker et al. [2] for VL and WG42 for all trials combined, and trials 6 through 8 individually, are in Figure 1. Results for SNP WUR10000125 for trials 1 through 3 are in Boddicker et al. [2] and in Boddicker et al. [3] for trials 4 and 5. This SNP was significant for WG42 ( Figure 1A) for trials 6 (p < 0.01) and 8 (p < 0.004), where susceptible AA animals gained 4.8 kg and 3.5 kg less, respectively, than AB animals. Although the effect was in the same direction for trial 7 for WG42, there was no statistical significance between the genotypes for this trial (p > 0.57). For VL ( Figure 1B), the SNP was significant for trials 6 (p < 0.005) and 7 (p < 0.001) but not for trial 8 (p > 0.56). For trial 8, the effects of the SNP on VL were, however, in the same direction as for the other trials (i.e. increased VL for AA animals compared to AB animals). The non-significant effects of the WUR SNP on WG in trial 7 and on VL in trial 8 are probably false negatives because the effect of the SNP was significant for the other trait in each of these trials; it is less likely that the significant effects in trials 7 and 8 are false positives because the effect is present in the other trials for both traits [3]. These results show that the effect of this region was present in six unrelated populations from five different breeding companies.

Reducing the region on SSC4
The region on SSC4 spans 1 284 081 base pairs and includes a total of 38 SNPs that are on the 60 k SNP panel, including two that are fixed and two for which no genotypes were called. Figure 2 shows a LD plot of the 34 polymorphic SNPs across trials 1 through 8. Haploview identified five haplotype blocks, including a block (Block 2) of 15 SNPs that spanned 487 kb that had very high LD amongst most SNPs. This block harbors SNP WUR10000125, which captured over 99% of the effects of the SSC4 region on VL and WG [2]. In general, LD is not expected to be the same between breeds and unrelated populations over such long distances. As an example, Amaral et al. [12] found very different LD patterns between a LW population, Ningxiang (a Chinese breed), and the European wild boar, across three different genomic regions. Differences in LD can result from mutation, recombination, selection, and drift. Block 2 is unique in the sense that LD was very high across six unrelated populations that consisted of different breeds (Figure 2). Table 4 shows the p-value for the effects of each SNP in the SSC4 region on VL and WG42, when the effect of SNP WUR10000125 is accounted for in the model. The SNPs located before and after Block 2 all had p-values greater than 0.05 for both VL and WG42, indicating that these SNPs do not capture a significant amount of additional variation in VL or WG42 that is not already accounted for by SNP WUR10000125. Therefore, these SNPs are probably not in high LD with the causative mutation and were excluded from the candidate region. Many of the SNPs within Block 2 also had p-values greater than 0.05 but, with the high LD present and confounding among SNPs, Block 2 was not further reduced in size. Given these results, the~1 Mb candidate region was reduced to a 487 kb region that probably harbors the causative mutation.

Haplotype analysis
A total of 77 unique haplotypes were identified in the original~1 Mb region across all eight trials, 11 of which carried the B allele at SNP WUR10000125 (B haplotypes). After reducing the region to 487 kb, only 19 unique haplotypes remained, of which four carried the B allele. Table 5 shows the frequency of each B haplotype, along with the trials each haplotype was present in. Eight of the original eleven B haplotypes were combined into one haplotype (haplotype 17) after the region was reduced. This was the most common haplotype and it was present in all trials and in every parental breed. Therefore, it is probably the ancestral haplotype, with the remaining B haplotypes originating from recombination or mutation since the breeds were established. Figure 3 shows the phylogenetic tree for haplotypes in the 487 kb region, generated using Mega 5 software. Using only the unique haplotypes, the tree clearly grouped the haplotypes based on the allele present at SNP WUR10000125, with the B haplotypes grouped at the bottom of the tree in Figure 3.
Starting with haplotype 17 in Figure 3, haplotypes were grouped by phylogenetic distance based on two cuts in the phylogenetic tree to determine whether all B haplotypes carried the favorable causative variant. If an individual had at least one of the haplotypes to the right of the cut, the individual was placed in the group to the right of the cut, because of the identified dominance mode of action of the QTL [3]. To create grouping 1, the tree was cut to the left of haplotype 17B and to the left of 18B, to create the following haplotype groups: haplotype 17B vs. 15B plus 18B vs. all A haplotypes. Only one copy of haplotype 19B was present across trials and that animal's other haplotype was 17B, so this animal was assigned to haplotype 17. Other groupings were created by moving the first cut up the tree, resulting in the groupings specified in Figure 4.
Least squares means of the effect of each haplotype group for each grouping are in Figure 4. Results showed that all B haplotypes were associated with the favorable phenotype, and were significantly different from haplotypes that carried the A allele (p < 0.001) for both VL and WG42, except for grouping 2. For grouping 2, haplotype 18 had significantly lower VL than the other B haplotypes but was not significantly different from B or A haplotypes for WG. Haplotype 18 had a total of 21 copies (20 individuals) and was specific to trial 5; therefore, the results for group 2 may represent effects specific to this trial and this may confound the estimates of the effects of this haplotype. Haplotype 14, which was at the base of the A haplotypes (Figure 3), was not significantly different from the other A haplotypes (p > 0.83) and tended towards significance from the B haplotypes (grouping 4, p < 0.09) for VL. For WG42, haplotype 14 was not significantly different from the other B and A haplotypes. Haplotype 14 only had 11 copies and the SE associated with the effect of haplotype 14 were large. Moving further up the tree, the group with A haplotypes 14, 13, and 11 was not significantly different from the other A haplotypes but was significantly different from the B haplotypes for both VL and WG.
In summary, the phylogenetic analysis segregated the haplotypes primarily by the allele at SNP WUR10000125, and there was a significant difference between the A versus B haplotype groups. Due to the high LD in the 487 kb region, relatively few haplotypes existed across the different populations for this region. This may indicate that there is little recombination in this region. In fact, across all~1600 animals evaluated, only one recombination could be identified within the~1 Mb region on SSC4. This recombination occurred between SNPs 31 and The bolded area represents Block 2 of Figure 2; 2 VL viral load calculated as area under the curve of log-transformed viremia between 0 and 21 days post-infection; 3 WG weight gain from 0 dpi to 42 dpi. Figure 2, outside the 487 kb region. Therefore, this is a region with very little recombination, explaining the relatively few haplotypes that were present across all trials.

Additional regions associated with VL
The results from the GWAS for VL are presented in Figure 5A. The top ten 1 Mb windows that explained the greatest percentage of genetic variance are in Table 6. The region on SSC4 explained the largest percentage of genetic variance at 13.2%. The remaining top 10 regions quickly dropped to less than 1% of the genetic variance and ranged from 1.24% to 0.39%. The second and third ranked regions will be further discussed. A region on SSCX, which included two consecutive 1 Mb regions, accounted for 1.73% of the genetic variance across all trials for VL but only 0.02% of the genetic variance for WG ( Figure 5A, Table 6). There have been no reports of QTL associated with traits related to growth within 3 Mb up or down stream of this region (http:// www.animalgenome.org/cgi-bin/QTLdb/SS/index). The region does include one reported QTL for blood pH in Meishan and Pietrain pigs infected with Sarcocystis miescheriana [13].
A region on SSC1 accounted for 0.70% of the genetic variance for VL across all trials and was the third highest region for VL ( Figure 5A, Table 6). This region also explained only a small percentage of genetic variance for WG (0.03%). Three QTL associated with health traits have been reported for this region, including C3c concentration, alkaline phosphatase activity, and white blood cell counts (http://www.animalgenome.org/cgi-bin/QTLdb/SS/ index). However, none of these QTL were identified under a PRRS challenge. Interesting candidate genes within 2 Mb on either side of this 1 Mb window include DBC1 (deleted in bladder cancer 1) and tumor necrosis factor (TNF) receptor-associated factor 1-like (www.animalgenome.org). The DBC1 gene is a tumor suppressor gene that is associated with programmed cell death [14]. The TNF receptor-associated factor 1-like gene is associated with antiviral activity [15].

Additional regions associated with WG
The results from the GWAS for WG are presented in Figure 5B and the top 10 windows that explained the greatest percentage of genetic variance are in Table 6. Again, the region on SSC4 explained the largest percentage of genetic variance, at 9.1%. The percentage of genetic variance explained by the remaining regions ranged from 0.43% to 2.61%.
A region on SSC5 was the second highest and accounted for 2.6% of the genetic variance across all trials ( Figure 5B, Table 6). This window does not appear to have an association with VL, as the percentage of genetic variance for VL was only 0.06%. Reports of QTL in this region associated with health in the pig include QTL for cholesterol level, haptoglobin concentration, alkaline phosphatase activity, interleukin-2 level, and interferon-gamma level (http://www.animalgenome.org/cgi-bin/QTLdb/SS/index). Interleukin-2 and interferon-gamma are both cytokines that respond to pathogen invasion of host cells. Interferongamma has been shown to inhibit PRRSV replication in macrophages [16,17]. There were two reports of QTL associated with average daily gain of healthy pigs that span the Table 5 Haplotypes for the 487 kb region (with arbitrary numbers corresponding to the phylogenetic tree in Figure 3) that contain the B allele for SNP WUR10000125 on chromosome 4, along with the SNP sequence and the number of copies present in trials 1 through 8 Letter in bold is SNP WUR10000125.   Table 6) but only 0.06% of the genetic variance for VL. This region is within a~5 Mb region (Mb 24 -29) that contains several Swine Leukocyte Antigen (SLA) genes within the swine Major Histocompatibility Complex (www.ensembl.org, accessed March, 2013). To date, there have been 19 reports of QTL associated with production traits in this region, 18 of which were associated with BW or average daily gain from birth to market weight (pig genome assembly 10.2, animalgenome.org, accessed March, 2013). However, these QTL were identified using healthy, non-challenged pigs. The QTL identified here is for WG under PRRSV challenge and could be a new QTL; alternately one of the previously reported QTL could also be associated with WG under PRRSV challenge. Additional work is required to understand the underlying mechanism.

Cross-validation
Accuracies of predictions based on the SSC4 region and based on the whole genome including the SSC4 region, were previously reported by [3], using trials 1 through 3 for training and trials 4 and 5 for validation. On average, accuracies of GEBV based only on the SSC4 region were higher (0.24 for VL and 0.33 for WG) than accuracies of GEBV based on the whole genome, including SSC4 (0.20 and 0.31, respectively). Given those results, the hypothesis here was that the rest of the genome has much lower predictive ability than the region on SSC4.
Accuracies of predictions based on SNP WUR10000125 and GEBV for the rest of the genome excluding the SSC4 region are shown in Figure 6A   Accuracies were, on average, much higher for SNP WUR10000125 than for the rest of the genome. Accuracies of predictions of genotypic values based on SNP WUR10000125 were very similar for the full and reduced training data but varied between validation trials from 0.08 to 0.51 for VL, with an accuracy of 0.48 and 0.54 when all trials were analyzed jointly, using the full and reduced training data, respectively, and from 0.06 to 0.75 for WG, with an accuracy of~0.34 for the joint analyses. For GEBV based on the rest of the genome and the reduced training data, accuracies centered around zero for VL, ranging from −0.23 to +0.18 for individual validation trials, with an accuracy of 0.10 and −0.04 for the joint analysis with the full and reduced training data, respectively. For WG, accuracies based on the rest of the genome tended to be positive, ranging from −0.14 to +0.31, with an accuracy of 0.21 and 0.07 for the joint analysis with the full and reduced training data. For the full training populations, accuracies increased substantially for trials 1 and 2, except for WG for trial 2, but had little impact on the other trials. These increases in accuracy for trials 1 and 2 were as expected because the full training data included data from the same crossbred genetics as used for validation.
Reasons for the generally observed poor predictive performance of GEBV across populations can be attributed to dominance, epistatis, and differences in linkage disequilibrium [18]. Accuracies of GEBV are generally lower when training and validation populations are more distantly related. For example, Saatchi et al. [19] partitioned data from Angus bulls into five groups using two methods; random allocation and k-means clustering, with the latter aiming at increasing relationships within groups and decreasing relationships between groups. On average, accuracies of GEBV when cross-validation in groups created by random allocation were higher than when groups were created by k-means clustering (0.65 vs 0.44). They concluded that this was due to closer  relationships between training and validation populations when using random allocation. In the current study, the reduced training populations were not related to the validation populations. However, when validating on trials 1 and 2, trial 3 was included in the full training population, which resulted in some relatedness between training and validation populations. Inclusion of trial 3 in the training population resulted in increased accuracies for trials 1 and 2 compared to the absence of trial 3 in the reduced training population, which agrees with the results of Saatchi et al. [19]. However, SE were large for all estimates of accuracy. Furthermore, the increase in relationships between training and validation populations was confounded with the increased size of the training population.

Conclusions
Traits associated with PRRS following experimental infection in growing pigs appear to have a strong genetic component, with moderate estimates of heritability for VL and WG, along with a large QTL on SSC4. Strong LD is present in the SSC4 region. However, combining all eight trials broke some of the LD in the~1 Mb region, resulting in a smaller haplotype. This reduced haplotype block could be useful to identify the causative mutation. A phylogenetic analysis separated the haplotypes by the allele at the SNP that explains nearly all the genetic variance for VL and WG that is contributed by the region, providing additional evidence that SNP WUR10000125 is the most informative SNP in the region (along with three other SNPs that were in complete LD Estimates of accuracy of genomic predictions in the validation population. Accuracy based on the correlation between predictions and phenotypes adjusted for fixed effects, divided by the square root of heritability, for predictions obtained using reduced or full training populations (see Table 3) with the WUR SNP) and is likely in very high LD with the causative mutation. The most common B haplotype was present in all breeds and lines represented in this study and was likely present before the segregation of the breeds. Cross-validation for the SSC4 QTL resulted in high estimates of accuracy when validating across populations for VL and WG, which can benefit selection strategies without infecting piglets with PRRSV. After accounting for the SSC4 region, the rest of the genome had little predictive ability for VL and WG across unrelated populations. Additional work is required to determine predictive ability within a population. With the effect of the SSC4 region present in all breeds and lines analyzed here, there is a good possibility that the effect is present in more populations and that genetic progress for PRRS tolerance or resistance can be made. However, additional work is needed to verify the effects of this region on challenges with different PRRSV strains and in the field. Other genomic regions associated with VL and WG were identified also but with small effects. In general, these regions will probably not play a major role in reducing the economic impact of PRRS.
The genomic region identified in this study can be used for marker-assisted selection for response to PRRS in growing pigs. The implementation of selection for these markers will not make pigs completely resistant to PRRS but will increase the well-being of the animals and reduce the financial impact of PRRSV infections by limiting reductions in growth rate and other negative effects of PRRS following infection. Further work is required to investigate the effects of these QTL against different PRRSV strains and in the field where pigs are subjected to many additional environmental stressors.