A gene-based radiation hybrid map of chicken microchromosome 14: comparison to human and alignment to the assembled chicken sequence.

We present a gene-based RH map of the chicken microchromosome GGA14, known to share conserved synteny with the human chromosomal regions HSA16p13.3 and HSA17p11.2. Microsatellite markers from the genetic map were used to check the validity of the RH map, and additional markers were developed from chicken EST data to yield comparative mapping data. A high rate of intra-chromosomal rearrangements was detected by comparison to the assembled human sequence. Finally, the alignment of the RH map to the assembled chicken sequence showed a small number of discordances, most of which involved the same region of the chromosome, spanning 40.5 to 75.9 cR6000 on the RH map.


INTRODUCTION
The considerable effort devoted to chicken genomics in recent years [2,12] stems from the importance of this species in agriculture and its great value for research in virology, developmental biology, oncology and immunology [6]. As a result, a large genomic toolset has been developed, including a detailed consensus linkage map of the genome comprising over two thousand markers [15,16]. Large collections of chicken expressed sequence tags (EST) were released [1,4,37] and bacterial artificial chromosome (BAC) libraries [9,21] were constructed and used to assemble both local and genome-wide chicken BAC contig maps [11,29]. BAC contigs are usually used as a platform on which full genome sequences are assembled. They also serve as a bridge between the genome sequence and the linkage map, the essential tool for QTL analysis [30]. In March 2004, a first draft assembly of the chicken genome sequence was released by the Washington University Genome Sequencing Center (WUGSC) and the National Human Genome Research Institute (http://www.genome.gov/11510730). Owing to a significantly lower proportion of interspersed repetitive elements, this draft chicken genome sequence is probably more accurate than the first draft human genome sequences originally published three years ago [18,38]. Nevertheless, the integration of all chicken genomic resources, such as the BAC contigs and the genetic and radiation hybrid (RH) maps, will be essential for turning the whole genome sequence data into a reliable and more informative resource. Thus, in addition to the BAC contig map, an RH map will provide an independent platform to assist the chicken genome sequence assembly process towards a finished-quality sequence. RH mapping is a powerful tool for locating genes since it relies on the simple polymerase chain reaction (PCR): unlike genetic markers, RH markers do not need to be polymorphic.
The whole genome radiation hybrid (WGRH) panel we produced [24] has already been used to build radiation hybrid maps for chicken microchromosome 15 [20] and macrochromosomes 4 and 7 [25,28], and maps of other chromosomes are under construction.
The first comparisons of chicken gene maps with those of humans revealed an unexpectedly high level of conserved synteny [8,14,27,32,33]. However, subsequent and more detailed mapping studies have revealed high levels of intra-chromosomal rearrangements within the conserved regions [7,10,19,34,36]. The number of segments of conserved gene order is therefore expected to increase with the number of genes mapped in the chicken.
To develop a dense RH map of chicken microchromosome GGA14, we adopted a strategy combining markers from the genetic map, used to check the validity of the RH map, with markers derived from EST or genes whose location on GGA14 could be predicted from known data on conserved syntenies. The human/chicken comparative data published in 2000 by Schmid et al. [31] showed that two genes localised on GGA14 (HBA and NTN2) were both located on HSA16p13.3, suggesting that this human chromosomal region should be used as a basis for developing gene-based markers. More recently, the gene SREBP1, orthologous to SREBF1 localised on HSA17p11.2 in humans, was also shown to be located on GGA14 by segregation analysis in the East Lansing reference family [3], providing another source for the development of markers.
The GGA14 RH map obtained was compared to the assembled human sequence to detect the chromosome rearrangements that have occurred in the lineages leading to humans and chickens. Finally, the RH map was compared to the newly available assembled chicken genome sequence in order to detect discordances pointing to potential assembly problems of the sequence.

MATERIALS AND METHODS
Radiation hybrid panel
The production of the RH panel has already been described [24]. Briefly, normal diploid female chicken fibroblasts irradiated at 6000 rads by gamma rays from a Cesium-137 source were fused to the hypoxanthine guanine phosphoribosyl transferase (HPRT)-deficient hamster cell line, Wg3hCl2 [13]. The hybrid cells were selected on HAT (hypoxanthine-aminopterin-thymidine) media, tested for marker retention and subjected to large-scale culture for DNA extraction. The final panel was composed of 90 clones with a mean marker retention frequency of 22%.

Markers from the genetic map
Nine microsatellite markers (ADL0118, ADL0263, LEI0066, LEI0098, MCW0123, MCW0136, MCW0225, MCW0296 and ROS0005) and two SSCP markers (GCT0903 and GCT0908) from the two known genetic linkage groups attributed to GGA14 were used. In the course of the whole genome RH mapping work under way in the laboratory, the microsatellite marker ADL0205 was also found to be linked to GGA14. More information on these 12 markers is available at http://www.thearkdb.org/browser or https://acedb.asg.wur.nl/.

EST (expressed sequence tag) and gene markers
All publicly available chicken EST (>420 000) from Genbank and other sources [1,4,37] were collected in a local database, after which EST selection and primer design were performed after comparison to the human genome sequence using the Iccare web server [26]. EST markers corresponding to the COQ7 and DREV1 genes had been previously developed in our laboratory and were found to be linked to the GGA14 RH linkage group. Finally, an additional gene marker was developed to map the chicken SREBP1 gene, orthologous to human SREBF1 [3] (Tab. I).

Markers from the sequence assembly
Eight markers (SEQ0168, SEQ0170, SEQ0171, SEQ0172, SEQ0173, SEQ0174, SEQ0175 and SEQ0177) were designed directly from a portion of the GGA14 sequence assembly for which no markers existed on the GGA14 RH map. Primers are given in Table II.

PCR conditions
PCR were performed in 15 µL reactions containing 25 ng hybrid DNA, MgCl2 at concentrations ranging from 1 to 3 mM as determined by test experiments, 0.3 U Taq DNA polymerase (Life Technologies, Carlsbad, CA, USA), 1X buffer (Life Technologies), 200 µM of each dNTP, 0.2 µM of each primer and 1X loading buffer composed of 350 mM sucrose and 0.2 mM cresol. After denaturation for 10 min at 94 °C, 33 PCR cycles of 30 s at 94 °C, 30 s at the specific annealing temperature and 30 s at 72 °C were performed, followed by a final elongation step of 10 min at 72 °C. PCR products were analysed on 2% agarose gels and visualised by ethidium bromide staining. Chicken DNA was used as the positive control, and Wg3hCl2 DNA and TE (Tris-EDTA) buffer as negative controls. Each marker was genotyped twice, and a third genotyping experiment was performed when discrepancies were found between the first two.

Map construction
All markers were scored as present or absent in each of the hybrids. Ambiguous results were also recorded. Pairwise and multipoint data analyses were carried out using the RH2PT and RHMAXLIK programmes of the RHMAP3.0 software package [5,22]. We assumed random breakage along the chromosomes and equiprobable retention of fragments. The RH map was constructed in three steps: (1) a two-point analysis identified markers linked together with a LOD score greater than 6, thus defining linkage groups; (2) multipoint analyses were done with RHMAXLIK to define a framework map with the markers from the largest linkage group, using a stepwise marker addition strategy and a LOD threshold of 3, and the resulting framework map was further tested by removing one marker at a time and calculating the LOD score for all of its possible positions; (3) a comprehensive map was built by calculating the location of additional markers relative to the framework markers. Finally, the map figure was created using MapChart, software for the graphical presentation of linkage maps and QTL developed by Voorrips and colleagues [39].
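The two-point step above can be illustrated with a minimal sketch. It is not the RH2PT implementation: it assumes the standard haploid equal-retention model (a single retention probability `r` shared by both markers and a breakage frequency `theta` between them), estimates `theta` by a simple grid search, and reports the LOD score against the unlinked hypothesis (`theta = 1`) together with the distance in centiRays, `-100 * ln(1 - theta)`. The function name, the grid search and the toy score vectors are illustrative assumptions.

```python
import math

def two_point_lod(a, b, steps=1000):
    """Return (lod, theta_hat, distance_cR) for two markers typed on the same hybrids.

    a, b: sequences with 1 = present, 0 = absent, None = ambiguous (pair skipped).
    Assumes a single retention probability r for both markers (equal-retention model).
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no hybrid scored unambiguously for both markers")
    n_both = sum(1 for x, y in pairs if x == 1 and y == 1)
    n_none = sum(1 for x, y in pairs if x == 0 and y == 0)
    n_disc = n - n_both - n_none           # hybrids retaining exactly one marker
    r = (2 * n_both + n_disc) / (2 * n)    # pooled retention estimate
    if not 0.0 < r < 1.0:
        raise ValueError("retention of 0 or 1 carries no linkage information")

    def loglik(theta):
        p_both = r * (1 - theta) + theta * r * r
        p_none = (1 - r) * (1 - theta) + theta * (1 - r) ** 2
        p_disc = theta * r * (1 - r)       # probability of each discordant pattern
        return (n_both * math.log(p_both)
                + n_none * math.log(p_none)
                + n_disc * math.log(p_disc))

    # Grid search over theta in (0, 1]; theta = 1 means the markers are unlinked.
    theta_hat = max((k / steps for k in range(1, steps + 1)), key=loglik)
    lod = (loglik(theta_hat) - loglik(1.0)) / math.log(10)
    distance = math.inf if theta_hat >= 1.0 else -100.0 * math.log(1.0 - theta_hat)
    return lod, theta_hat, distance

# Toy vectors (hypothetical, not data from the panel).
linked = [1] * 5 + [0] * 15
lod_linked, theta_linked, dist_linked = two_point_lod(linked, linked)
lod_unlinked, _, dist_unlinked = two_point_lod([1, 1, 0, 0] * 5, [1, 0, 1, 0] * 5)
```

Under this sketch, a pair of markers would enter the same linkage group when the returned LOD exceeds the chosen threshold (6 in the analysis described above).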

Comparative mapping data - Alignment to the chicken sequence assembly
Data on human gene order were obtained from the Ensembl (http://www.ensembl.org/) or NCBI Mapview (http://www.ncbi.nlm.nih.gov/mapview/) browsers. Data on the location of chicken genes were obtained by BLASTN or BLAT searches using the Ensembl (http://www.ensembl.org/) or Golden Path (http://genome.ucsc.edu/) browsers.

RESULTS
Development of EST (expressed sequence tag) markers
Genes located in the regions of HSA16 and HSA17 identified as having a conservation of synteny with GGA14 were used to search for chicken orthologous EST with the Iccare software. Primers were then selected according to the constraints of RH mapping: species-specific amplification of an exonic region. Eighty such chicken EST could thus be used to design PCR primers suitable for RH mapping and 63 (80%) led to successful amplifications.
Markers corresponding to the COQ7, DREV1 and SREBF1 genes were added to our study (see Materials and Methods). Altogether, 66 EST or gene markers, as shown in Table I, were used to build the GGA14 RH map. However, mapping data concerning a few human genes have evolved since we designed the primers. Hence, GCT1270 is similar to the open reading frame Res4-22C now localised on HSA4 and UBC (GCT1264) is now on HSA12. Therefore, only 64 of our EST markers were orthologous to genes located on HSA16p13.3 or HSA17p11.2.

Construction of the GGA14 RH map, marker retention
Genotyping data on the ChickRH6 panel for the 12 genetic and 66 gene markers were used to generate an RH map. After two-point analysis at a LOD threshold of 6, a large linkage group containing 60 markers (49 genes or EST, plus 11 genetic markers) was defined. By lowering the LOD threshold to 4, the microsatellite marker ADL0118 could be added. A framework map 342 cR6000 long, containing 23 markers and covering the entire chicken chromosome 14, was constructed (Fig. 1). Thirty-eight additional markers linked to GGA14, but whose positions were not supported by a LOD score greater than 3, are indicated on the side of the map to avoid size inflation and to keep track of different possible local orders.
Table I. Primer sequences for chicken markers corresponding to human genes and EST. Position on the genome sequence and chromosome information are from the Ensembl (http://www.ensembl.org/) or NCBI Mapview (http://www.ncbi.nlm.nih.gov/mapview/) genome browsers. Unknown: the genomic sequence exists but could not be assigned to a chromosome in the genome sequence assembly. No hits: the sequence was not found by BLASTN in the genome sequence.
The average retention rate for the markers was 23.7%, in accordance with the first estimates for the panel [24]. The retention frequency along the chromosome, as estimated by plotting the retention of the markers from the framework map against their positions, demonstrates that the higher retention rates were located around the microsatellite marker LEI0066, suggesting that the centromere could be located in this area (Fig. 1). The marker MCW0123, which was used for selecting the clones while constructing the panel [24], did not show a particularly high level of retention when compared to the surrounding markers.
Figure 1. [...] [15]. For clarity, only the framework markers and additional markers present both on the genetic and radiation hybrid (RH) maps are shown. The RH map is in the middle. Markers positioned on the vertical double bars of the RH map are the framework markers and additional markers are indicated to the right of the map. Microsatellite markers are underlined. The microsatellite marker ADL0118 in brackets was added to the map by lowering the two-point LOD threshold from 6 to 4. GCT0903 and GCT0908 (boxed) are located on the same microchromosome by FISH [23]. The chart to the right indicates the retention rate of the framework markers. The marker MCW0123, which was used for selecting the clones for the panel, is indicated by an arrow on the retention chart. The suggested centromeric region is indicated by a vertical bar to the right of the RH map.
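The retention profile described in this section can be sketched as follows. This is only an illustration under stated assumptions: scores are coded 1 (present), 0 (absent) and `None` (ambiguous, ignored), and the marker names, positions and score vectors in the example are hypothetical placeholders rather than genotypes from the ChickRH6 panel.

```python
def retention_frequency(scores):
    """Fraction of unambiguously scored hybrids that retain the marker."""
    informative = [s for s in scores if s in (0, 1)]
    if not informative:
        raise ValueError("marker has no unambiguous scores")
    return sum(informative) / len(informative)

def retention_profile(framework):
    """framework: list of (marker_name, position_cR, scores) in map order.

    Returns a list of (marker_name, position_cR, retention) tuples that can be
    plotted as retention against map position.
    """
    return [(name, pos, retention_frequency(scores)) for name, pos, scores in framework]

def peak_marker(profile):
    """Marker with the highest retention, a candidate for the pericentromeric region."""
    return max(profile, key=lambda row: row[2])

# Hypothetical toy framework of three markers typed on ten hybrids.
toy_framework = [
    ("marker_A", 0.0,   [1, 1, 1, 0, 0, 0, 0, 1, 0, None]),
    ("marker_B", 60.0,  [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]),
    ("marker_C", 150.0, [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]),
]
toy_profile = retention_profile(toy_framework)
toy_peak = peak_marker(toy_profile)
```

Excluding ambiguous scores from the denominator is one possible convention; counting them as absent would give slightly lower retention values.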

Colinearity between genetic and RH maps, resolution of the panel
Eleven loci (ADL0118, ADL0263, GCT0903, GCT0908, LEI0066, LEI0098, MCW0123, MCW0136, MCW0225, MCW0296 and ROS0005) were shared between the genetic and the RH maps (the microsatellite marker ADL0205 is not on the genetic map; MCW0225 corresponds to NTN2 on the genetic map).
Comparing both maps indicates a good overall agreement with several improvements in marker ordering. One major change comes from the localisation of the centromere around LEI0066, suggesting a reverse orientation for the GGA14 genetic map. Markers MCW0136, ADL0118 and GCT0903 are mapped with a higher precision on the RH map. Finally, the localisation of GCT0908 close to MCW0296 indicates that the genetic linkage group C37 is a part of GGA14.
The part of the RH map between LEI0066 and MCW0296 was 259.7 cR6000 long, while the genetic distance between these two markers was 72 cM. The ratio between the two maps was thus 3.6 cR6000 to 1 cM.

Comparison to the assembled GGA14 sequence
When compared, the GGA14 genome sequence assembly and the RH map presented an overall good colinearity (Fig. 2), although two major discrepancies could be detected.
The first concerned a region from positions 1.554 to 2.063 Mb, terminal on the sequence assembly, but found at positions 40.5 to 75.9 cR6000 on the RH map, shown in red in Figure 2. On the sequence map, this region includes six genes (DECR2, DKFZp434F054, NUBP2, UBN1, PPL and ABCA3) as well as the microsatellite marker ADL0205. On the RH map, it additionally contains Rab11, located on the sequence contig of GGA18, and the three markers STUB1, KIAA0643 and MGC15416, all of which match sequence data of unknown location in the chicken genome sequence assembly. The second main discrepancy concerns the region between 14.573 Mb (HSCARG) and 20.310 Mb (MCW0225) of the sequence assembly, for which no marker could be found on the RH map, despite the high marker density of the map. To test this region, we developed new markers directly from the genomic sequence: SEQ0168, SEQ0170, SEQ0171, SEQ0172, SEQ0173, SEQ0174, SEQ0175 and SEQ0177. Of these, SEQ0177 was the only marker linked to the GGA14 RH map, close to LEI0066. Markers SEQ0172, SEQ0173, SEQ0174 and SEQ0175 were linked by RH mapping to markers from GGA3, and markers SEQ0168, SEQ0170 and SEQ0171 were linked to markers not yet positioned on our RH maps.
Finally, the RH map improved on the sequence assembly for three markers corresponding to existing genomic sequence of unknown location. One (LLGL1) was at position 239.4 cR6000. The two others (MRPS34 and BM045), at position 0 cR6000, extended the RH map further than the sequence assembly.

DISCUSSION
Development of the EST markers
The first constraint on the choice of primers for RH mapping was to avoid introns, whose positions in the chicken were predicted on the basis of the human genomic sequence. The second was to design primers in the most divergent regions of the human-chicken sequence alignments, so as to avoid cross-amplification of the hamster DNA present in the hybrids. Using Iccare proved to be very efficient, with 80% of the primer pairs designed yielding usable RH mapping data, enabling a high number of EST and genes to be mapped on the GGA14 RH map. Moreover, developing EST markers with Iccare minimises the chances of selecting the wrong orthologous gene, since it performs a BLASTN comparison of all chicken EST against the complete set of human Unigene clusters.

Marker retention and position of the centromere
Preferential retention in RH clones of chromosome fragments from pericentromeric regions of donor cells has been shown in various species, including humans [17,35] and chickens [25]. The retention frequency of markers shown in Figure 1 drops from 35% to 15% over the first 200 cR6000 from one end of the map, after which the retention of the markers varies only slightly, with values between 15 and 20%. These data suggest a position for the centromere towards one end of the RH map and are therefore compatible with an acrocentric microchromosome. On the basis of these observations, and so as to position the centromere conventionally towards the top of the figures, we suggest reversing the orientation of the genetic map and of the Mb numbering of the sequence assembly.
A similar trend, with a drop from 45% down to 15% over a similar distance of 200 cR6000 from the centromere, was observed for GGA7, although this chromosome is twice the size of GGA14 [25]. This can explain the higher retention rate observed for microchromosome markers when constructing the panel [24], since they have a higher chance than macrochromosome markers of being close to the centromere.

Comparison to the genetic map
Fluorescent in situ hybridisation (FISH) experiments with the BAC clones P1-8 and P6-V11, from which the genetic SSCP markers GCT0903 and GCT0908 are derived, suggested that the small linkage group C37 could be linked to GGA14 [23]. Here we confirm this result by the inclusion of both GCT0903 and GCT0908 on the RH map. These two linkage groups are independent on the genetic map because GCT0908 and COM0079 were only mapped in the Compton population and the nearby marker MCW0296 only in the Wageningen population. Owing to the history of its development using three independent populations, the chicken genetic map still contains a number of small linkage groups whose chromosome assignment remains to be determined. The RH map also enables the two markers MCW0136 and ADL0118 to be mapped with greater precision. The ratio of 3.6 cR6000 to 1 cM was close to, although slightly lower than, the previous observation (4 cR6000 to 1 cM) for GGA7 [25]. Since the recombination rate of microchromosomes is higher than that of macrochromosomes, a lower cR6000 to cM ratio was expected for GGA14 than for GGA7.

Comparison to the genome sequence assembly
Although the genomic sequence assembly for GGA14 covers a total of 20.4 Mb (Fig. 2), a few discrepancies were found with the RH map, one of them suggesting that a 5 Mb portion of the sequence assembly (between HSCARG and MCW0225) in fact belongs largely to GGA3 and possibly to other chromosomes. In support of this, the microsatellite marker MCW0083, from the GGA3 genetic linkage group, was also found at position 19.2 Mb of the GGA14 sequence assembly. This brings the length of the GGA14 sequence down to 15 Mb instead of 20 Mb. Conversely, the RH map extends 19 cR further than the available sequence, with the addition of MRPS34 and BM045. The length of the RH map to be compared to the sequence was thus 324 cR6000 and not 343 cR6000. Using these figures, the ratio between the two maps was 46 kb/cR6000. The previously published figure of 61 kb/cR6000 for GGA7 used size estimates for this chromosome based on cytogenetic data [25]. The updated value for GGA7 using the genomic sequence assembly was thus 56 kb/cR6000. Similarly, a value of 37 kb/cR6000 can be calculated for GGA15 [20]. The breaking of chromosomes by radiation is a physical process, so similar ratios would be expected for the different chromosomes. Future studies will indicate whether the differences between chromosomes observed here are due to structural differences, or to errors in the RH maps and/or the sequence assembly. Given the mean retention frequency of 23.7% for GGA14 in the ChickRH6 panel containing 90 clones, we expect an average of 21.3 observations of such breaks per marker and thus a mean expected resolution power of 215 kb. However, since the retention frequency varies along the chromosome from 35% close to the centromere down to an average of 15% elsewhere, the expected resolution of the panel will vary accordingly from 146 to 340 kb.
It is noteworthy that the two main discrepancies between the RH map and the genome sequence assembly involve the region close to the centromere: a portion of GGA14 sequence from this region was moved to the telomere region and was replaced by sequence fragments from other chromosomes, mainly GGA3. It is also notable that the proportion of markers corresponding to existing sequence fragments of unknown location in the sequence assembly was higher in this region of the RH map than in the other regions, and that the only marker of the GGA14 RH map found on another chromosome in the genomic sequence assembly (Rab1 on GGA18) was also mapped there. One explanation for the difficulty in assembling the sequence in this region could be a similarity between subtelomeric and pericentromeric repeats, creating false joins between sequences.
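The arithmetic behind the kb/cR6000 ratio and the expected resolution can be retraced with a short sketch. The reading used here is an assumption, not a formula stated in the text: with `n` retaining clones per marker, the smallest observable breakage frequency is taken as one break in `n`, that is `100 / n` cR, which is then converted to kb with the rounded ratio of 46 kb/cR6000. Function names are illustrative.

```python
def kb_per_cr(sequence_mb, map_length_cr):
    """Physical-to-RH ratio in kb per centiRay."""
    return sequence_mb * 1000.0 / map_length_cr

def expected_resolution_kb(n_clones, retention, ratio_kb_per_cr):
    """Assumed reading: one observable break among the retaining clones.

    retaining clones = n_clones * retention
    minimum distance = 100 / retaining clones (in cR), converted to kb.
    """
    retaining = n_clones * retention
    min_distance_cr = 100.0 / retaining
    return min_distance_cr * ratio_kb_per_cr

# Figures quoted in the text: 15 Mb of sequence against 324 cR6000 of map.
ratio = round(kb_per_cr(15.0, 324.0))                   # rounded to whole kb/cR
mean_res = expected_resolution_kb(90, 0.237, ratio)     # mean retention
centro_res = expected_resolution_kb(90, 0.35, ratio)    # near the centromere
distal_res = expected_resolution_kb(90, 0.15, ratio)    # elsewhere on the map
```

Under these assumptions the three values come out at about 215.7, 146.0 and 340.7 kb, which match the quoted 215, 146 and 340 kb once truncated to whole kb.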

Comparative mapping
By increasing the number of genes assigned to GGA14 to 48, we greatly improved the comparative mapping data available for this chromosome. If we define a conserved gene segment as a group containing at least two genes, 10 such groups were found with the human region HSA16p13.3 and one with the human region HSA17p11.2, indicated by coloured boxes in Figure 3. This comparison also showed eight genes, indicated in black in Figure 3, which could not be assigned to the conserved gene segments we defined. Apart from MAP2K3, located on HSA17, the seven other genes (ABCA3, BM045, CLCN7, MGC15416, MRPS34, NUBP2 and UBE21) are located in the region between 1.2 and 4.9 Mb on HSA16, for which the highest density of gene and EST markers was developed. The higher level of resolution thus obtained may partly account for the detection of such small segments of conserved gene order, but it is also possible that this region has undergone a higher number of rearrangements.
The development of EST markers based solely on prior information on synteny conservation with HSA16p13.3 and HSA17p11.2 does not enable us to rule out the possibility that some small regions of GGA14 correspond to other HSA regions. However, as previously noted in other detailed comparative mapping studies, despite the high level of synteny conservation, a high number of intra-chromosomal rearrangements can be observed between the human and chicken genomes.
Owing to the lack of precision on the extent of the regions of conserved synteny with humans available at the beginning of our work, we extended beyond them when choosing EST markers and thus developed markers for other chicken chromosomes. This was particularly true for markers from HSA17, for which the region of conserved synteny appeared to be quite small. As a result, in addition to the conservation between HSA17 and GGA14 demonstrated by NT5M, SREBF1, LLGL1, PRPSAP2 and GRAP, small blocks homologous to regions located on GGA18 (in pink in Fig. 3) or GGA19 (in blue in Fig. 3) could be defined. Finally, ACCN1 is located on GGA27 (in green in Fig. 3) and no hits for GIT1 were found in the chicken genome sequence.
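One way to make the definition of conserved segments above operational is sketched below. The criterion is an assumption chosen for illustration, since the exact rule used to draw the segments is not spelled out here: genes are listed in chicken map order with their rank among the mapped genes on the human chromosome, and two neighbouring chicken genes join the same segment when they are also immediate neighbours in the human order (in either orientation). The gene names in the example are hypothetical.

```python
def conserved_segments(chicken_order, min_genes=2):
    """Split genes into segments of conserved order and unassigned genes.

    chicken_order: list of (gene, human_chromosome, human_rank) tuples in chicken
    map order, where human_rank is the gene's rank among the mapped genes on
    that human chromosome.
    Returns (segments, unassigned): segments is a list of gene-name lists with at
    least min_genes members; unassigned collects genes from shorter runs.
    """
    runs = []
    for gene, chrom, rank in chicken_order:
        if runs:
            _, prev_chrom, prev_rank = runs[-1][-1]
            # Extend the current run only if the two genes are adjacent in human too.
            if chrom == prev_chrom and abs(rank - prev_rank) == 1:
                runs[-1].append((gene, chrom, rank))
                continue
        runs.append([(gene, chrom, rank)])

    segments = [[g for g, _, _ in run] for run in runs if len(run) >= min_genes]
    unassigned = [g for run in runs if len(run) < min_genes for g, _, _ in run]
    return segments, unassigned

# Hypothetical toy order: two inverted or displaced blocks and two isolated genes.
toy_order = [
    ("g1", "HSA16", 3), ("g2", "HSA16", 4),
    ("g3", "HSA16", 9),
    ("g4", "HSA16", 2), ("g5", "HSA16", 1),
    ("g6", "HSA17", 1),
]
toy_segments, toy_unassigned = conserved_segments(toy_order)
```

A stricter or looser adjacency rule (for example, tolerating one intervening gene) would change the number of segments found, which is why the count of segments depends on marker density as discussed above.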

CONCLUSION
The first purpose of our work on the GGA14 RH map was to develop a dense map including a high number of genes, in order to validate the use of the ChickRH6 panel for a microchromosome and to provide detailed comparative mapping information. At the end of our project, the first draft chicken genome sequence was released and we used our data to test the sequence contig of GGA14. Although the sequence assembly is globally in good agreement with our data, we show that RH mapping can detect some errors, demonstrating its usefulness as a contribution towards a high-quality assembly of the sequence. Future developments towards a complete chicken RH framework map will now be based on the genomic sequence, using it to choose STS markers regularly spaced along the chromosomes.