Fine mapping and replication of QTL in outbred chicken advanced intercross lines
© Besnier et al; licensee BioMed Central Ltd. 2011
Received: 11 August 2010
Accepted: 17 January 2011
Published: 17 January 2011
Skip to main content
© Besnier et al; licensee BioMed Central Ltd. 2011
Received: 11 August 2010
Accepted: 17 January 2011
Published: 17 January 2011
Linkage mapping is used to identify genomic regions affecting the expression of complex traits. However, when experimental crosses such as F2 populations or backcrosses are used to map regions containing a Quantitative Trait Locus (QTL), the size of the regions identified remains quite large, i.e. 10 or more Mb. Thus, other experimental strategies are needed to refine the QTL locations. Advanced Intercross Lines (AIL) are produced by repeated intercrossing of F2 animals and successive generations, which decrease linkage disequilibrium in a controlled manner. Although this approach is seen as promising, both to replicate QTL analyses and fine-map QTL, only a few AIL datasets, all originating from inbred founders, have been reported in the literature.
We have produced a nine-generation AIL pedigree (n = 1529) from two outbred chicken lines divergently selected for body weight at eight weeks of age. All animals were weighed at eight weeks of age and genotyped for SNP located in nine genomic regions where significant or suggestive QTL had previously been detected in the F2 population. In parallel, we have developed a novel strategy to analyse the data that uses both genotype and pedigree information of all AIL individuals to replicate the detection of and fine-map QTL affecting juvenile body weight.
Five of the nine QTL detected with the original F2 population were confirmed and fine-mapped with the AIL, while for the remaining four, only suggestive evidence of their existence was obtained. All original QTL were confirmed as a single locus, except for one, which split into two linked QTL.
Our results indicate that many of the QTL, which are genome-wide significant or suggestive in the analyses of large intercross populations, are true effects that can be replicated and fine-mapped using AIL. Key factors for success are the use of large populations and powerful statistical tools. Moreover, we believe that the statistical methods we have developed to efficiently study outbred AIL populations will increase the number of organisms for which in-depth complex traits can be analyzed.
In domestic animal populations, F2 crosses between divergently selected outbred lines are commonly used to map QTL [1–3]. However, only one generation of recombination occurs, in an F2 pedigree (gametes of the F1 generation) and linkage disequilibrium (LD) can be strong along the chromosomes. This long-range LD can be used to detect associations between QTL and markers even at a low marker density, e.g. one marker per 10 or 20 centiMorgans (cM) [1, 4]. However, because of the extensive LD, using an F2 design results in large confidence intervals for QTL locations  that potentially contain hundreds of genes.
To map QTL with a higher resolution, it is necessary to adopt a fine-mapping strategy that would ideally produce a QTL peak covering a chromosome region small enough to contain only a few genes. Such a strategy would facilitate identification of candidate mutations. Precision of fine-mapping relies on the use of dense SNP marker maps that provide genotypic information at cM or sub-cM intervals. However, using a high or medium high marker density in an F2 population provides only a moderate improvement in the resolution because the population has undergone only one generation of recombination . In such a population, most of the marker alleles are inherited together and share the same genetic information i.e. they are Identical By Descent (IBD) within the same haplotype block . Reducing the extensive linkage disequilibrium present in an F2 population requires breeding additional filial generations by repeated intercrossing, i.e. using F2 individuals to generate an F3 generation and so on, to form an Advanced Intercross Line (AIL) pedigree [6–8]. Each generation of breeding introduces new recombination events and thereby decreases LD between markers and QTL. Thus, an AIL makes it possible to re-analyze and fine-map QTL originally detected in the F2 generation.
Here, we developed new methods to fine-map QTL using data from an AIL produced from outbred lines. We used a nine-generation AIL pedigree produced from an intercross between two lines of chickens divergently selected for body weight at 56 days of age . All animals in the pedigree were genotyped using SNP markers located at approximately 1 cM intervals in nine chromosomal regions where significant or suggestive QTL had previously been identified in an independent F2 intercross between the lines [10, 11]. The nine regions were screened for QTL influencing body weight at 56 days of age (BW56), with the objective of replicating previous results and reducing the size of the confidence intervals containing the QTL.
To create the AIL used in this study, a large intercross pedigree was set up by crossing individuals from two divergent chicken lines, i.e. a High Weight Selected line (HWS) and a Low Weight Selected line (LWS), which were obtained as follows. A selection experiment initiated in 1957 was designed to select two chicken lines for high- and low-body weight from the same base population, which consisted of crosses between seven partly inbred lines of White Plymouth Rock chickens. Then, individuals with a high-body weight at 56 days of age (BW56) were selected as parents for the line HWS and chickens with a low BW56 were selected as parents for the line LWS .
Descriptive statistics for the AIL pedigree
Nb of males
contributing to the next
Average body weight
at 56 days of age
(g) ± SE
1522 ± 36
181 ± 5.4
663.9 ± 18.6
696.7 ± 8.1
594.6 ± 11.9
656.1 ± 14.0
770.1 ± 16.9
661.8 ± 17.2
373.6 ± 6.3
594.2 ± 5.7
Chromosome segments selected for the replication study
When detecting QTL by the variance component method, the covariance matrix (Π) of the random QTL effect must be estimated as an IBD matrix i.e. a n * n matrix that contains, at a given genomic position, the expected number of alleles IBD between all pairs of individuals in the studied population or pedigree . Therefore, to perform a genome-scan for QTL using a variance component based analysis, an IBD matrix is needed for each tested genome position τ.
IBD matrices can be estimated using methods based on descent tree likelihood , by Monte Carlo Marcov Chain methods , or deterministically . For the present study, we used a deterministic IBD estimation method that utilizes pedigree, genotype and haplotype information to infer IBD probabilities based on the approach described by Pong-Wong et al. . The use of haplotype information together with deterministic IBD estimation is, in the present case, motivated by its computational efficiency.
Most IBD estimation approaches [15–18] are based on genotype and pedigree information only, whereas marker-phases (or related measurement of the alleles inheritance pathway thorough the pedigree), which are needed to obtain the final IBD probabilities, are estimated alongside the IBD by the algorithm. Deterministic methods [17, 18] only use informative markers (where marker phase can be inferred without uncertainty) and therefore only uses part of the available information, whereas iterative methods [15, 16] provide better precision in IBD estimation, but also higher computational demand. However, if haplotypes are estimated by an independent routine, and included in the input data, it can increase the amount of information available for the deterministic algorithm. Here, we use an algorithm that first estimates haplotypes using a Genetic Algorithm based method , and subsequently includes this haplotype information to improve the accuracy of the deterministic IBD estimation. Because the present study involves analysing several times the same genomic region to test different hypotheses about the population structure, several versions of an IBD matrix are computed for the same locus. This is thus more efficient to isolate the computationally demanding part of the analysis (recursive haplotype estimation) in a preliminary step that only needs do be done once for each genomic region, and then to adopt a fast deterministic IBD estimation algorithm for the second step that is repeated several times for each locus.
Due to the recursive approach, the deterministic IBD estimation algorithm  is also flexible. IBD probabilities are computed recursively from the first (F0) to the last (F8) generation, which makes it possible to include a genetic covariance structure among the founder individuals of the pedigree. This covariance can be computed independently based on population history , or based on genetic data . The algorithm simply reads the matrix of covariance between founders of the pedigree as input data, and computes the IBD relationship of the last generations as a function of the given founder population structure.
Here, we estimated IBD probabilities recursively in the AIL pedigree as in , taking the covariance among the founders into account . Haplotypes were used as input if they were deemed robust  but individual marker genotypes were used when haplotype estimates were uncertain. The estimated IBD matrices at each tested location, τ, were then used as input in the variance component (VC) QTL analysis.
The classical VC approach to map QTL assumes that QTL alleles in the founder population are uncorrelated [22, 23]. Since the VC approach makes no distinction between the line origin of the alleles, this approach is not suitable to analyse outbred crosses between divergent lines [2, 21]. Therefore we used a VC approach  that accounts for correlation among QTL alleles within the founder lines.
where Π is the genotype IBD matrix calculated at chromosome position τ, is the genetic variance due to the QTL, I is the identity matrix, and is the residual variance.
where v * is the vector of the effects of m independent and identically distributed (iid) base generation QTL alleles, assumed normally distributed with variance , is the number of individuals in the base generation, Z is an incidence matrix of size n × m that relates individuals with the QTL alleles in the base generation, and e is the residual vector with variance .
The assumption of uncorrelated QTL allele effects in the base generation, implies that in model (2): V(v*) = where Im is an identity matrix of size m × m. Equivalently in model (1), the sub-matrix corresponding to the first rows and the first columns of Π (i.e. IBD relationships between the founders at locus τ) is an identity matrix.
The example given here illustrates a case where the founder population includes one individual from line A and three individuals from line B. The AIL population studied in the present article was obtained by crossing 30 LW with 29 HW animals. Since average body weights can vary considerably between generations (Table 1) and sexes, we fitted a mixed linear model with generation and sex as fixed effects.
We computed the score statistic at each marker as in  to test for QTL in the nine chromosome regions. Markers were assigned to their genomic locations using the dense consensus chicken genetic map [27, 28].
without including a polygenic effect in the model.
which differs from model A by including a random polygenic effect (a) in the model. The polygenic effects were calculated based upon the additive genetic relationship matrix between all animals in the pedigree.
When a significant QTL was detected in the chromosome segment scanned with either of the two alternative hypotheses (A or B), we tested whether the level of fixation within lines (ρ) was significantly different from zero (i.e. H0: ρ = 0). For simplicity, the correlation between effects of founder QTL alleles was assumed to be the same in both lines. Therefore, the same value of ρ was considered for the LWS and HWS lines. A likelihood ratio test was used to compare three alternative models that were defined based on assuming i) independence of alleles (ρ = 0), ii) fixation of alleles (ρ = 1), and iii) segregation of alleles (0 < ρ < 1). When model iii) was declared most likely (0 < ρ < 1), correlations within the LWS and HWS lines (ρ A and ρ B ) were estimated separately following the same approach as in the Flexible Intercross Analysis (FIA) described by Rönnegård et al. (2008) .
Significance thresholds for the scans were derived using randomization testing. Residuals estimated from the null model y = Xβ + e (model A) or y = Xβ + a + e (model B) were permuted. One thousand replicates of the phenotypic data were simulated with , (or for model B) where is the vector of permuted residuals, and ã are the estimated fixed and random effects obtained from the null model. For each replicate, the score statistic was calculated at every tested position in the selected region. As in Valdar et al. , the maximum score value from each replicate was then fitted to a generalized extreme value distribution using the evd library in R [30, 31]. 5% and 1% significance thresholds were then estimated respectively by the 95% and 99% quantiles of the fitted distribution.
Genomic location and genetic effect of the replicated QTL
-30.4 ± 6.7
36.4 ± 6.5
-0.32 ± 7.3
13.9 ± 7.2
22.4 ± 6.6
-14.6 ± 6.8
-7.2 ± 6.5
16.8 ± 6.7
-11.2 ± 10.3
44.0 ± 12.4
2.7 ± 7.129
-4.5 ± 7.0
-16.6 ± 6.9
20.8 ± 6.9
-15.2 ± 6.6
16.6 ± 6.7
-5.4 ± 6.7
11.5 ± 6.7
-11.5 ± 6.4
19.8 ± 6.2
Growth1 and Growth9.1 mapped to the same position in scans with model A or B. Growth4 was located within the same large interval (24 to 38 Mb) as previously, with a 4 Mb difference between estimates from scans using models A and B. Similarly, for Growth2 the same location was found with a 4 Mb difference between estimates from scans using models A (60 Mb) and B (64 Mb). For Growth12, two peaks were detected at 9 and 11 Mb with both models but the peak at 9 Mb was more significant in the scan with model B. Average effects of QTL alleles at Growth1, Growth2, Growth4, Growth6, Growth8, Growth9 and Growth12 were as expected: negative for the LWS alleles and positive for the HWS alleles. Effects of QTL alleles at Growth3 and Growth7 were cryptic, contrary to the original observation , with a positive effect for LWS alleles and a negative effect for HWS alleles.
Since QTL-mapping with AIL increases resolution compared to F2 designs, it is possible to test whether the studied segments contain one or several regions that contribute to the QTL effect. The QTL profiles initially obtained (Figure 1) suggested that several segments might contain more than a single signal. Therefore, a second scan was performed for the segments for which the detection of a QTL was replicated with model A. In this case, only the phenotypes of individuals from the last generation (F8, n = 400), with the lowest linkage disequilibrium, were included. IBD between these individuals were, however, computed using the genotypes from all individuals in the pedigree to obtain the best possible QTL genotype estimates. In this scan, a two-QTL model was fitted to evaluate the evidence for multiple linked QTL in these regions. In these analyses, the genetic variance of one of the two QTL in the region was included in the null model of the VC analysis while that of the other was included as a main effect. These analyses showed that there were two independent effects in the Growth9 region at approximately 20 Mb and 35 Mb and these were named Growth9.1 and Growth9.2, respectively. The two regions Growth9.1 and Growth9.2 are considered as different QTL in the rest of the manuscript (Table 3).
A preliminary analysis indicated that independence of the alleles was common in the present pedigree. We hereafter considered allele independence as the null hypothesis and then tested for possible fixation or segregation within the lines. When comparing the likelihood of these two alternative hypotheses of fixation (ρ = 1) or segregation (0 < ρ <1) of QTL alleles within founder lines, segregation was identified for Growth1 (P < 0.02) and for one (Growth9.1) of the two QTL on chromosome 7 (P < 0.05). For the other QTL, the model assuming independence (ρ = 0) of the alleles could not be rejected.
At Growth1, the estimated correlation of the allelic effects was 0.14 in the LWS line and 0.74 in the HWS line. For Growth9.1, the within-line correlation was 0.61 for LWS and 0.88 for HWS. For these two regions, the FIA model  indicates a higher level of fixation within the HWS line than within the LWS line.
Analysis of data obtained with an advanced intercross line originating from inbred founders is straightforward because alternative alleles of the markers are fixed in each founder line. In such designs, it is sufficient to collect the data from later generations in the pedigree and then use standard QTL mapping software developed for inbred intercross populations for the analysis. The major difference between the F2 and the following generations of the AIL is the increase in recombination events. However, QTL analysis with an AIL originating from outbred founders is not trivial because fixation of neither markers nor QTL can be assumed in the original lines. In order to maximize the power to replicate QTL detection and fine-mapping using an AIL produced from outbred founders, we propose that the genotypes and phenotypes of all the individuals in the pedigree and not just of those from later generations should be collected and analysed. In our work, we have applied this strategy to an experimental chicken dataset and analyzed the data for nine genomic regions for which significant or suggestive QTL had been previously identified with an F2 design between the same chicken lines [10, 11]. Two alternative models were used for QTL detection: (1) a model (A) without a random polygenic effect, which detected significant QTL in all nine regions and (2) a more stringent model (B) that included a random polygenic effect, which reduced the number of significant QTL to five regions. This difference in number of QTL detected is due to the fact that the covariance matrix of the polygenic effect included in model B is by definition very similar to the covariance matrix of the QTL effect when marker information is poor. The information content estimated from the IBD coefficients  appeared indeed to be lower in some regions where QTL was detected in model A but not in model B (Growth 3 Growth 7 Growth 8). However, one of the low information content regions (Growth 9.1) was nevertheless detected in both models. This makes it difficult to determine whether the loss of QTL with model B is due to false positive signals obtained with model A, or to the fact that marker information content is simply too low to distinguish between a QTL effect and a polygenic effect in a multigenerational intercross pedigree. Based on our results, it can be concluded that Growth1, Growth2, Growth4, Growth9.1 and Growth12 contain QTL that are strongly supported by both models. The allelic effects in these regions are positive for the HWS allele and negative for the LWS allele, as expected in an AIL resulting from a cross between two divergently selected lines. In addition to these five significant QTL regions detected with both models, the remaining regions (Growth3, Growth6, Growth7, Growth8, and Growth 9.2) are likely to contain QTL based on the analyses using model A. This may be resolved by further analyses with more informative markers.
Eight QTL acted in the same direction in both the F2 population and the AIL i.e. the effect of their HWS alleles were additive and led to higher BW56 as in , while two QTL acted in opposite directions. This difference can be explained by the lack of fixation of QTL alleles within the founder lines (as illustrated by the range of estimated allelic effects in Figure 3), where alleles with positive and negative effects were present in both the HWS and LWS lines. Since multiple alleles exist in both lines, the estimated difference between the average effects of alleles inherited from HWS and LWS animals is a mixture of high and low effect alleles. Thus, when analyzing a particular generation in a population, the results will depend on the actual set of alleles sampled from a limited number of ancestors. As the number of founders for each generation is rather small, genetic drift will have an influence on the results. It is worth noting that several of the QTL confirmed in our study were not detectable using methods that assume allelic fixation in the founder lines. Their detection relied on the use of a variance component approach that does not assume fixation. Another potential explanation for the difference in observed effects is epistasis, which is known to be strong among QTL in this pedigree . Therefore, the size of the genomic region containing the QTL and the direction of its effect depend on the genetic background at other loci. Differences in allele frequencies at interacting loci might influence the marginal effects of QTL and even lead to genetic effects that change direction depending on the allele frequency at the loci with which it interacts. Although an in-depth study of epistasis is beyond the scope of this paper, preliminary tests provide some strong evidence for epistasis in this pedigree, with, e.g., the LWS allele of Growth3 having a positive effect when combined with the HWS allele of Growth12 but a negative effect when combined with the LWS allele. A third possibility is that recombination in subsequent generations has disrupted linkage disequilibrium between linked QTL so that the QTL effect at the position tested in the current study deviates significantly from the one observed in the F2 generation.
The nine selected chromosome segments were first scanned for single QTL using a variance component approach. The assumption of a single QTL in each segment appears valid for all regions but Growth9. When including all phenotype data from the F2-F8 generations, the segment containing Growth9 had a complex QTL profile indicating multiple independent genetic effects. A two-QTL analysis was then performed including only the phenotypic data from the F8 generation. Since linkage disequilibrium is lower in this last generation, resolution should be higher when using this smaller dataset. In this analysis, the QTL region splits into two significant QTL at 20 and 35 Mb (Figure 2A). The peak observed between the two peaks at 30 Mb in the single-QTL scan is not significant, indicating that it is most likely a false "ghost" signal due to linkage with the two neighbouring regions.
Our scan permits the detection of QTL that segregate within the parental lines. Thus, it is a powerful approach to detect QTL in crosses produced from divergent outbred lines. It identified ten QTL in nine distinct chromosome regions. Two regions (Growth1 and Growth9.1) showed significant (p < 0.05) within-line correlations between allele effects. The estimated correlation of the within-line allele effects, calculated from the FIA model, was higher within the HWS line (0.74) then within the LWS line (0.14) for Growth1. This was consistent with the variability among allele substitution effects shown in Figure 3, which is larger among LWS alleles than among HWS alleles. For the remaining eight QTL, we could not reject the null-hypothesis of independent allelic effects (even within-line). Since the parental lines had been divergently selected and display highly divergent phenotypes, a stronger correlation of allelic effects in the founder lines was expected. However, segregation within lines is not unlikely, because the lines were not inbred, the QTL effects were rather small and the time of divergence between the lines was relatively short. This finding could, however, also be explained by a lack of power in the segregation analyses in this pedigree since it contains a large number of founder alleles with rather few observations for each inherited allele.
Here, we have produced, genotyped and analyzed a large AIL obtained from outbred parents. Most of the QTL originally detected in the F2 population were confirmed, which indicates that appropriately sized replication populations and powerful statistical tools are crucial to refine original QTL findings and dissect the genetics underlying complex phenotypes. Replicating the detection of the QTL and fine-mapping their location with an AIL strengthen the original findings, and validate AIL as a valuable tool to explore the genetic basis of complex traits. We also believe that the methods available now to analyze outbred intercross populations can be useful for in-depth genetic studies of a wider range of organisms and can provide answers to research questions that are not approachable using inbred model organisms.
OC was supported by grants from the Swedish Foundation for Strategic Research, the Swedish Research Council, the Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning and the European Science Foundation (EURYI). LA was supported by grants from the Swedish Foundation for Strategic Research and the Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning. The SNP technology platform in Uppsala was supported by the Knut and Alice Wallenberg Foundation (via Wallenberg Consortium North). We thank Tomas Axelsson and Kristina Larsson for assistance with genotyping in Uppsala. Genotyping was performed by the SNP&SEQ Technology platform in Uppsala, which is supported by Uppsala University and Uppsala University Hospital.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.