Comparative analysis of vertebrate EIF2AK2 (PKR) genes and assignment of the equine gene to ECA15q24–q25 and the bovine gene to BTA11q12–q15

The structures of the canine, rabbit, bovine and equine EIF2AK2 genes were determined. Each of these genes has a 5' non-coding exon as well as 15 coding exons. All of the canine, bovine and equine EIF2AK2 introns have consensus donor and acceptor splice sites. In the equine EIF2AK2 gene, a unique single nucleotide polymorphism that encoded a Tyr329Cys substitution was detected. Regulatory elements predicted in the promoter region were conserved in ungulates, primates, rodents, Afrotheria (elephant) and Insectivora (shrew). Western clawed frog and fugu EIF2AK2 gene sequences were detected in the UCSC Genome Browser and compared to those of other vertebrate EIF2AK2 genes. A comparison of EIF2AK2 protein domains in vertebrates indicates that the kinase catalytic domains were evolutionarily more conserved than the nucleic acid-binding motifs. Nucleotide substitution rates were uniform among the vertebrate sequences with the exception of the zebrafish and goldfish EIF2AK2 genes, which showed substitution rates about 20% higher than those of other vertebrates. FISH was used to physically assign the horse and cattle genes to chromosome locations ECA15q24–q25 and BTA11q12–q15, respectively. Comparative mapping data confirmed conservation of synteny between ungulates, humans and rodents.


INTRODUCTION
The eukaryotic translation initiation factor 2-alpha kinase 2 (EIF2AK2, also known as PKR or PRKR) is an important component of the host innate immune antiviral response [20]. Double-stranded RNA (dsRNA) synthesized during viral infection binds to and activates EIF2AK2. Activation by dsRNA causes autophosphorylation of EIF2AK2 and allows this kinase to phosphorylate its natural substrate, the 1-alpha subunit of eukaryotic translation initiation factor-2 (EIF2S1, also known as eIF-2alpha). Phosphorylation of this initiation factor results in inhibition of protein translation and viral replication [25].
Previous reports described the structures of the human and mouse genes encoding the translation initiation factor 2-alpha kinase 2. The human (Homo sapiens; HSA) EIF2AK2 gene (HSAEIF2AK2) consists of 19 exons including four non-coding exons (1, 2, 18, and 19). Three alternative splicing acceptors in the second HSAEIF2AK2 exon have been reported [12]. Additional alternative splice variants that differ in their 3' non-coding regions were previously listed in the Celera database: the HSAEIF2AK2 mRNA BM473760 contains exon 17 extended with a significant portion of unspliced intron 17 and lacks exons 18 and 19, while the transcript AA639687 contains exons 17, 18 and 19 but not intron 17. The product of the HSAEIF2AK2 gene is a 68 kDa protein that is ubiquitously expressed at low levels. The N-terminal part of this protein contains two dsRNA binding motifs (dsRBM) [6,18]. The C-terminal EIF2S1 kinase domain includes 12 sub-domains that are conserved in several species [9,25].
We report here the characterization of the structures of the equine (Equus caballus; ECA), canine (Canis familiaris; CFA) and rabbit (Oryctolagus cuniculus; OCU) EIF2AK2 genes. In addition, we performed a phylogenetic comparison of the EIF2AK2 genes of the horse, dog, cattle, pig, primates, rodents, chicken and goldfish. The equine and bovine EIF2AK2 genes were localized on horse and cattle chromosomes, respectively, by fluorescent in situ hybridization (FISH) with several bacterial artificial chromosome (BAC) clones.

Animals
Necropsy tissue samples from 1 Arabian, 1 Hanoverian, 1 Paint, 4 Quarter, 3 Thoroughbred and 6 mixed-breed horses were used for genomic DNA extraction by the standard phenol and chloroform method [24].

Table I. Primers used to amplify full-length CFAEIF2AK2, OCUEIF2AK2 and partial ECAEIF2AK2 cDNA as well as to sequence exon/intron junction regions in the ECAEIF2AK2 gene.

cDNA sources
Using an RNeasy Mini Kit (Qiagen, Valencia, CA), total RNA was extracted from peripheral white blood cells (PWBC) from a Quarter horse and converted into first-strand cDNA with ThermoScript RNase H− Reverse Transcriptase (Invitrogen, Carlsbad, CA) using an oligo-dT primer. This single-stranded cDNA and a commercial dog kidney cDNA (BioChain, Hayward, CA) were utilized to amplify partial equine and canine EIF2AK2 cDNA sequences and extend them using a DNA Walking SpeedUp Kit (Seegene USA, Del Mar, CA) according to the manufacturer's protocol. Full-length EIF2AK2 cDNA was generated from rabbit kidney cDNA (Seegene) using the 3'RACE OcuPRKR-F and 5'RACE OcuPRKR-R primers (Tab. I) according to the manufacturer's protocol.

BAC clones
Four high-density filters for segment 1 of the CHORI-241 equine genomic BAC library were purchased from the Children's Hospital Oakland Research Institute (CHORI), Oakland, CA. These filters were screened using a 32P-labeled equine EIF2AK2 cDNA probe according to the supplier's protocol. Positive equine BAC clones as well as the bovine CH240_360J16 BAC clone were purchased from CHORI. Each of the BAC clones was grown individually in 500 mL of LB media. BAC DNA was isolated using the NucleoBond BAC Maxi Kit (BD Biosciences Clontech, Palo Alto, CA) and used as the template for direct partial sequencing with a BigDye terminator v1.1 Cycle Sequencing Kit on an ABI 3100 Genetic Analyzer according to the manufacturer's recommendations. The ends of the BAC clones were sequenced using universal SP6 or T7 primers. Additionally, the BAC clone inserts were removed by NotI digestion and the insert lengths were estimated by pulsed-field electrophoresis.

Exon sequencing
The sequences of the human EIF2AK2 exons (www.ensembl.org) were aligned with equine cDNA using the bl2seq program (www.ncbi.nlm.nih.gov) to predict a potential exon structure for the equine EIF2AK2 gene. Predicted exon sequences were used to design the exon-specific primer pairs listed in Table I that were utilized to partially sequence the BAC clones. Since ECAEIF2AK2 exons 5, 6 and 7 are located close to each other in genomic DNA, exon/intron junctions in exon 6 were sequenced using primers EcaPKR-ex5F and EcaPKR-ex7R. Similarly, the two closely located 3' terminal exons, 15 and 16, were amplified from the ECAEIF2AK2 gene and sequenced using primers EcaPKR-ex15F and EcaPKR-ex16R.

Analyses of DNA sequences
The TFSEARCH computer program (www.cbrc.jp) was used to search for potential promoters upstream of the EIF2AK2 genes. The GenBank gss database was searched against sequences of the equine and bovine EIF2AK2 cDNA using the blastn program [1]. The njtree program was used to construct a phylogenetic tree with distances calculated by standard methods [13,17] and tree topology was inferred by the Neighbor-Joining algorithm [23]. Pairwise distances between sequences were calculated according to Gu and Zhang [7] based on gamma distribution by assuming the heterogeneity of substitution rates. The bootstrap algorithm [30] with 1000 replications was used to estimate the confidence of each node. The programs used in this study are available on request.
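The bootstrap step described above resamples alignment columns with replacement and rebuilds the tree from each pseudo-replicate. The following Python fragment is only a minimal sketch of the resampling part, assuming a toy alignment and a hypothetical helper named bootstrap_alignment; it is not the njtree program used in this study, and the tree-building step is omitted.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate: alignment columns resampled with replacement.

    `alignment` maps sequence names to equal-length aligned strings.
    (Hypothetical helper, for illustration only.)
    """
    n_cols = len(next(iter(alignment.values())))
    # Draw n_cols column indices with replacement.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


# Toy aligned sequences (illustrative only, not EIF2AK2 data).
toy = {
    "seqA": "ATGGCCAAGT",
    "seqB": "ATGGCTAAGT",
    "seqC": "ATGACTAAGC",
}

rng = random.Random(0)  # fixed seed so the resampling is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
```

In a full analysis, a tree would be inferred from each of the 1000 replicates, and the support for a node would be the fraction of replicate trees that contain it.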

FISH mapping
DNA from equine BAC clones CHORI241_81I21 and CH241_117I21, as well as from bovine BAC clone CH240_360J16, was labeled with Spectrum Orange™-dUTP following the manufacturer's directions (Vysis, Downers Grove, IL). DNA from these clones was used for FISH to equine or bovine metaphase spreads, respectively, as described previously [3]. Briefly, the hybridization solution contained 100 ng of labeled probe, 6 µg of competitor DNA, 4 µg sonicated salmon sperm DNA in 50% formamide, 10% dextran sulfate, and 1X SSC. Hybridizations proceeded for approximately 16–18 h. Post-hybridization washes were done at 42 °C. The chromosomes were counterstained with DAPI prior to analysis. International cytogenetic nomenclature of the domestic horse (ISCNH1997) [11] and cattle current standard (ISCNDB2000) [10] were used to identify individual horse and cattle chromosomes, respectively. The identification of BTA11 was confirmed by S. Charter (Center for Reproduction of Endangered Species, San Diego Zoological Society).

Identification of the dog CFAEIF2AK2 gene
The GenBank canine database was searched using the HSAEIF2AK2 cDNA sequence and the Canis familiaris chromosome 17 genomic contig NW_876263, which contains the CFAEIF2AK2 gene, was identified. The blast alignment revealed that this contig includes all of the CFAEIF2AK2 coding exons. Gene-specific primers CfaPRKR-F and CfaPRKR-R (Tab. I) were used to amplify a 1739 bp PCR fragment from dog kidney cDNA. This fragment was sequenced directly and its 5' end was extended using a DNA Walking SpeedUp Kit. A full-length cDNA sequence of the CFAEIF2AK2 gene was deposited in GenBank under accession number AY906960. Alignment of this cDNA sequence and the NW_876263 genomic sequence revealed 17 exons in the CFAEIF2AK2 gene.

Identification of the rabbit OCUEIF2AK2 gene
The GenBank rabbit database was searched using the HSAEIF2AK2 cDNA sequence and several genomic contigs containing the OCUEIF2AK2 gene were identified. The sequence of the AAGW01139267 contig was used to design 3' and 5' RACE primers (Tab. I) that were utilized to amplify the full-length OCUEIF2AK2 cDNA. This cDNA was sequenced directly and the sequence obtained was deposited in GenBank under accession number DQ115394.

Identification and analysis of the horse ECAEIF2AK2 gene
Bovine, canine, human and porcine EIF2AK2 cDNA sequences were downloaded from GenBank and aligned using the MegAlign program. Based on this alignment, two degenerate primers EcaPRKR-F and EcaPRKR-R (Tab. I) were designed and used to amplify a middle portion of the equine EIF2AK2 transcript from a Quarter horse single-stranded cDNA prepared as described in Materials and Methods. The 1094 bp PCR fragment obtained was sequenced directly. An "A/G" single nucleotide polymorphism (SNP) was detected at position 824. To confirm this equine polymorphism, the SNP region was amplified and directly sequenced in an additional sixteen unrelated horses of various breeds. Ten horses were found to be AA homozygous and six were AG heterozygous.
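The genotype counts above translate directly into allele frequencies. The short Python calculation below is a sketch that assumes the ten AA and six AG animals are the sixteen additionally sequenced horses and that no GG homozygotes were observed; the variable names are illustrative.

```python
# Genotype counts taken from the text; GG = 0 is an assumption
# (no GG homozygotes are reported).
genotype_counts = {"AA": 10, "AG": 6, "GG": 0}

n_horses = sum(genotype_counts.values())   # 16 animals
n_alleles = 2 * n_horses                   # diploid: 32 allele copies

# Each heterozygote carries one G copy, each GG homozygote two.
g_copies = genotype_counts["AG"] + 2 * genotype_counts["GG"]
freq_g = g_copies / n_alleles              # 6 / 32 = 0.1875
freq_a = 1 - freq_g                        # 0.8125

print(f"G allele frequency: {freq_g:.4f}")
print(f"A allele frequency: {freq_a:.4f}")
```

Under these assumptions the G allele accounts for 6 of 32 allele copies (about 19%) in this small sample.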
This cDNA sequence was extended to full-length using a DNA Walking Kit and the sequence was submitted to GenBank under accession number AY850106. The "A/G" SNP detected encodes a unique Tyr329Cys substitution in the equine EIF2AK2 protein. A GenBank search of mammalian databases revealed only the "A" nucleotide at the same EIF2AK2 location in humans (27 subjects), African green monkeys (Cercopithecus aethiops; CAE), chimpanzees (Pan troglodytes; PTR), orangutans (Pongo pygmaeus; PPY), rhesus monkeys (Macaca mulatta; MML), dogs, cattle, pigs, rabbits, mice (9 subjects) and rats (4 subjects). Currently 13 EIF2AK2 SNP are listed in the mouse GenBank database and 187 in the human GenBank database, but the A/G SNP found in the horse gene was not found in either the mouse or human SNP databases, suggesting a recent origin of this polymorphism in horses.
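An A-to-G change can produce a Tyr-to-Cys substitution only at the second codon position, because both tyrosine codons (TAT, TAC) differ from the cysteine codons (TGT, TGC) at that base alone. The Python snippet below illustrates this with the standard genetic code; which of the two tyrosine codons the horse gene uses at residue 329 is not stated here, so both are shown.

```python
# Subset of the standard genetic code (DNA codons) relevant to Tyr and Cys.
CODON_TABLE_SUBSET = {
    "TAT": "Tyr",
    "TAC": "Tyr",
    "TGT": "Cys",
    "TGC": "Cys",
}


def substitute_second_base(codon, new_base):
    """Return the codon with its second (middle) base replaced."""
    return codon[0] + new_base + codon[2]


# The actual horse codon is not given in the text, so both Tyr codons are tried.
changes = {}
for codon in ("TAT", "TAC"):
    mutated = substitute_second_base(codon, "G")
    changes[codon] = (mutated, CODON_TABLE_SUBSET[mutated])
    print(f"{codon} ({CODON_TABLE_SUBSET[codon]}) -> "
          f"{mutated} ({CODON_TABLE_SUBSET[mutated]})")
```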
The 1094 bp fragment of ECAEIF2AK2 cDNA was next used to screen the CHORI-241 equine BAC library and four positive clones, 81I21, 117I21, 179F14 and 179H4, were identified. The exon/intron junctions of the equine EIF2AK2 gene in two of these BAC clones, 81I21 and 117I21, were directly sequenced using primers listed in Table I. Alignment of genomic and cDNA sequences revealed 16 exons in this gene, including a non-coding 5'-terminal exon. The donor and acceptor splice sites of all of the introns corresponded to the GU/AG rule. Comparison of bovine and canine EIF2AK2 cDNA sequences with genomic contigs available in GenBank (Fig. 1, published in electronic-only form at http://www.edpsciences.org/gse) revealed similar exon/intron structures for these genes. The lengths of the exons within the EIF2AK2 open reading frame (ORF) were compared for several mammalian species. Although the number of coding EIF2AK2 exons is conserved among several mammalian species, the numbers of non-coding exons differ between primates (HSA, PPY and PTR) and other mammals. Primates contain two 5' and two 3' non-coding

Analyses of promoter regions
Analysis of potential cis-acting elements in genomic sequences located 300 bp upstream of the 5'-terminal exons of seven mammalian EIF2AK2 genes revealed conservation of both the kinase conserved sequence (KCS) and the interferon-stimulated response element (ISRE) (Fig. 2), which were previously reported in the human and mouse promoters [15]. ISRE is involved in type I interferon inducibility, while KCS functions as a constitutive activation element. These two elements are separated from each other by four bp in all mammalian EIF2AK2 promoters studied to date (Fig. 3). These two regulatory elements are also located close to the EIF2AK2 transcription start in primate and ungulate species. A search of the current dog genome draft did not reveal either the first 5' EIF2AK2 exon or the KCS and ISRE promoter elements.

Comparison of avian, teleostean and amphibian EIF2AK2 genes
A limited number of EIF2AK2 sequences were recently reported for birds and fishes. These include chicken (Gallus gallus; GGA) [14], goldfish (Carassius auratus; CAU) [9] and zebrafish (Danio rerio; DRE) [22]. Amphibian EIF2AK2 gene sequences have not been described previously. Using the UCSC Genome Browser, the whole genome assemblies of Western clawed frog (Xenopus tropicalis; XTR) and fugu pufferfish (Takifugu rubripes; TRU) were searched with the BLAT program. In Xenopus, three tandemly duplicated copies of the EIF2AK2 gene were found in the genomic scaffold_41. One of these copies was an incomplete sequence due to gaps in the assembly. The two other copies were complete and were included in the phylogenetic analysis. The Takifugu genome contains a single copy of the EIF2AK2 gene located in the shotgun assembly scaffold_7150 (GenBank accession number CAAB01007140).
The 5'-terminal EIF2AK2 exons B and D encode two dsRNA binding domains in the mammals, birds (chicken) and amphibians (frog) studied to date (Fig. 2). This region is variable in fishes. The two dsRBM are conserved in the fugu EIF2AK2 protein, but the 5' ends of the zebrafish and goldfish EIF2AK2 genes encode two Z-DNA-binding domains [9,22]. Danio and Carassius EIF2AK2 genes contain a large kinase insertion in exon K. In all three fish genes, exons E and F are significantly shorter (9 codons) than the corresponding exons in mammals. Only three amino acid residues (2%) in the N-terminal domains (exons A through D) were invariant and 13 (8%) were functionally similar among the vertebrate EIF2AK2 genes analyzed to date. The kinase catalytic domain (exons I through O) contained 57 (20%) invariant and 98 (34%) functionally similar amino acid residues. This suggests that in vertebrates the kinase catalytic domains are evolutionarily more conserved than the nucleic acid (dsRNA or Z-DNA) binding motifs.
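Counts of invariant and functionally similar residues such as those above can be obtained by classifying the columns of a protein alignment. The Python sketch below shows one way to do this on a toy alignment; the amino acid similarity groups and the function name classify_columns are assumptions made for illustration, since the grouping used for the reported percentages is not specified here.

```python
# Hypothetical similarity groups (one-letter amino acid codes); the grouping
# behind the reported percentages is not specified, so this is an assumption.
SIMILARITY_GROUPS = [
    set("ILVM"),   # aliphatic / hydrophobic
    set("FWY"),    # aromatic
    set("KRH"),    # basic
    set("DE"),     # acidic
    set("STNQ"),   # polar uncharged
    set("AG"),     # small
]


def classify_columns(seqs):
    """Count invariant and functionally similar columns in aligned sequences.

    Columns containing a gap ('-') are skipped. A column is 'similar' when it
    is not invariant but all of its residues fall within one similarity group.
    """
    invariant = similar = 0
    for column in zip(*seqs):
        residues = set(column)
        if "-" in residues:
            continue
        if len(residues) == 1:
            invariant += 1
        elif any(residues <= group for group in SIMILARITY_GROUPS):
            similar += 1
    return invariant, similar


# Toy alignment (illustrative only, not EIF2AK2 sequences).
toy = ["MKDLV", "MRELI", "MKDFV"]
invariant, similar = classify_columns(toy)
length = len(toy[0])
print(f"invariant: {invariant} ({100 * invariant / length:.0f}%)")
print(f"similar:   {similar} ({100 * similar / length:.0f}%)")
```

Applying the same classification separately to the N-terminal and kinase domains of a real alignment would give per-domain counts comparable in form to those reported above.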

Phylogenetic analysis of vertebrate EIF2AK2 genes
ORF sequences of vertebrate EIF2AK2 genes were aligned (Fig. 4, published in electronic-only form at http://www.edpsciences.org/gse) to build a phylogenetic tree using the neighbor-joining clustering method with distances calculated by the two-parameter substitution model with a gamma distribution parameter a = 1.5. The partial PTR and MML EIF2AK2 gene sequences were not included in this tree. The structure of the tree (Fig. 5) corresponded to the conventional taxonomic order except in the horse, dog and rabbit branches, which were associated with low bootstrap values. The substitution rates were uniform among the vertebrate sequences except for the Danio rerio and Carassius auratus cluster, which showed substitution rates about 20% higher than those of other vertebrates.
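For readers unfamiliar with gamma-corrected distances, the Python sketch below computes a pairwise distance under the assumption that the "two-parameter substitution model" refers to the Kimura two-parameter model applied to nucleotide sequences, using the gamma-corrected form d = (a/2)[(1 − 2P − Q)^(−1/a) + (1/2)(1 − 2Q)^(−1/a) − 3/2], where P and Q are the proportions of transitions and transversions. The function name and toy sequences are illustrative, and this is not the program used in this study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_gamma_distance(seq1, seq2, a=1.5):
    """Gamma-corrected Kimura two-parameter distance between aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    `a` is the gamma shape parameter (1.5 in the text).
    """
    transitions = transversions = compared = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        # A<->G or C<->T is a transition; anything else is a transversion.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    p = transitions / compared      # proportion of transitions (P)
    q = transversions / compared    # proportion of transversions (Q)
    return (a / 2) * ((1 - 2 * p - q) ** (-1 / a)
                      + 0.5 * (1 - 2 * q) ** (-1 / a)
                      - 1.5)


# Toy sequences differing by one transition (C<->T) at the last site.
d = k2p_gamma_distance("ACGTACGTAC", "ACGTACGTAT")
print(f"distance: {d:.4f}")
```

The corrected distance exceeds the raw proportion of differing sites because the correction accounts for multiple substitutions at the same site and for rate variation among sites.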