Publication number: US20050244834 A1
Publication type: Application
Application number: US 10/831,997
Publication date: Nov 3, 2005
Filing date: Apr 26, 2004
Priority date: Sep 10, 1999
Also published as: US6727063
Inventors: Eric Lander, Michele Cargill, James Ireland, Stacey Bolk, George Daley, Jeanette McCarthy
Original Assignee: Whitehead Institute For Biomedical Research, Millennium Pharmaceuticals, Inc.
Single nucleotide polymorphisms in genes
US 20050244834 A1
Abstract
The invention provides nucleic acid segments of the human genome, particularly nucleic acid segments from a gene, including polymorphic sites. Allele-specific primers and probes hybridizing to regions flanking or containing these sites are also provided. The nucleic acids, primers and probes are used in applications such as phenotype correlations, forensics, paternity testing, medicine and genetic analysis. A role for the thrombospondin gene(s) in vascular disease is also disclosed. Use of single nucleotide polymorphisms in the thrombospondin gene(s) for diagnosis, prediction of clinical course and treatment response, development of therapeutics and development of cell-culture-based and animal models for research and treatment are disclosed.
Claims(33)
1. A method of diagnosing or aiding in the diagnosis of a vascular disease in an individual comprising
a) obtaining a nucleic acid sample from the individual; and
b) determining the nucleotide present at nucleotide position 2210 of the thrombospondin-1 gene,
wherein presence of a G at nucleotide position 2210 is indicative of increased likelihood of a vascular disease in the individual as compared with an individual having an A at nucleotide position 2210, and wherein presence of an A at nucleotide position 2210 is indicative of decreased likelihood of a vascular disease in the individual as compared with an individual having a G at nucleotide position 2210.
2. The method of claim 1, wherein the thrombospondin-1 gene has the nucleotide sequence of SEQ ID NO: 1.
3. The method of claim 1, wherein the vascular disease is selected from the group consisting of atherosclerosis, coronary heart disease, myocardial infarction, stroke, peripheral vascular diseases, venous thromboembolism and pulmonary embolism.
4. A method for predicting the likelihood that an individual will have a vascular disease, comprising the steps of:
a) obtaining a DNA sample from an individual to be assessed; and
b) determining the nucleotide present at nucleotide position 2210 of the thrombospondin-1 gene,
wherein presence of a G at nucleotide position 2210 is indicative of increased likelihood of a vascular disease in the individual as compared with an individual having an A at nucleotide position 2210.
5. The method according to claim 4, wherein the thrombospondin-1 gene has the nucleotide sequence of SEQ ID NO: 1.
6. The method according to claim 4, wherein the individual is an individual at risk for development of a vascular disease.
7. The method according to claim 4, wherein the vascular disease is selected from the group consisting of atherosclerosis, coronary heart disease, myocardial infarction, stroke, peripheral vascular diseases, venous thromboembolism and pulmonary embolism.
8. A nucleic acid molecule comprising all or a portion of the nucleic acid sequence of SEQ ID NO: 1 wherein said nucleic acid molecule is at least 10 nucleotides in length and wherein the nucleic acid sequence comprises a polymorphic site at nucleotide position 2210 of SEQ ID NO: 1.
9. The nucleic acid molecule according to claim 8, wherein the nucleotide at the polymorphic site is different from a nucleotide at the polymorphic site in a corresponding reference allele.
10. An allele-specific oligonucleotide that hybridizes to the nucleic acid molecule of claim 8.
11. A peptide of SEQ ID NO: 2 which is at least ten contiguous amino acids, wherein the peptide comprises the serine at amino acid position 700 of SEQ ID NO: 2.
12. A method of diagnosing or aiding in the diagnosis of a vascular disease in an individual comprising
a) obtaining a biological sample comprising thrombospondin-1 protein or relevant portion thereof from the individual; and
b) determining the amino acid present at amino acid position 700 of the thrombospondin-1 protein,
wherein presence of an asparagine at amino acid position 700 is indicative of increased likelihood of a vascular disease in the individual as compared with an individual having a serine at amino acid position 700, and wherein presence of a serine at amino acid position 700 is indicative of reduced likelihood of a vascular disease in the individual as compared with an individual having an asparagine at amino acid position 700.
13. The method of claim 12, wherein the thrombospondin-1 protein has the amino acid sequence of SEQ ID NO: 2.
14. The method of claim 12, wherein the vascular disease is selected from the group consisting of atherosclerosis, coronary heart disease, myocardial infarction, stroke, peripheral vascular diseases, venous thromboembolism and pulmonary embolism.
15. A nucleic acid molecule comprising all or a portion of the nucleic acid sequence of SEQ ID NO: 3 wherein said nucleic acid molecule is at least 10 nucleotides in length and wherein the nucleic acid sequence comprises a polymorphic site at nucleotide position 1186 of SEQ ID NO: 3.
16. The nucleic acid molecule according to claim 15, wherein the nucleotide at the polymorphic site is different from a nucleotide at the polymorphic site in a corresponding reference allele.
17. An allele-specific oligonucleotide that hybridizes to the nucleic acid molecule of claim 15.
18. A peptide of SEQ ID NO: 4 which is at least ten contiguous amino acids, wherein the peptide comprises the proline at amino acid position 387 of SEQ ID NO: 4.
19. A method of diagnosing or aiding in the diagnosis of a vascular disease in an individual comprising
a) obtaining a biological sample comprising thrombospondin-4 protein or relevant portion thereof from the individual; and
b) determining the amino acid present at amino acid position 387 of the thrombospondin-4 protein,
wherein presence of an alanine at amino acid position 387 is indicative of increased likelihood of a vascular disease in the individual as compared with an individual having a proline at amino acid position 387, and wherein presence of a proline at amino acid position 387 is indicative of reduced likelihood of a vascular disease in the individual as compared with an individual having an alanine at amino acid position 387.
20. The method of claim 19, wherein the thrombospondin-4 protein has the amino acid sequence of SEQ ID NO: 4.
21. The method of claim 19, wherein the vascular disease is selected from the group consisting of atherosclerosis, coronary heart disease, myocardial infarction, stroke, peripheral vascular diseases, venous thromboembolism and pulmonary embolism.
22. A nucleic acid molecule selected from the group consisting of the genes listed in the Table, wherein said nucleic acid molecule is at least 10 nucleotides in length and comprises a polymorphic site identified in the Table, wherein a nucleotide at the polymorphic site is different from a nucleotide at the polymorphic site in a corresponding reference allele.
23. A nucleic acid molecule according to claim 22, wherein said nucleic acid molecule is at least 15 nucleotides in length.
24. A nucleic acid molecule according to claim 22, wherein said nucleic acid molecule is at least 20 nucleotides in length.
25. A nucleic acid molecule according to claim 22, wherein the nucleotide at the polymorphic site is the variant nucleotide for the gene listed in the Table.
26. An allele-specific oligonucleotide that hybridizes to a portion of a gene selected from the group consisting of the genes listed in the Table, wherein said portion is at least 10 nucleotides in length and comprises a polymorphic site identified in the Table, wherein a nucleotide at the polymorphic site is different from a nucleotide at the polymorphic site in a corresponding reference allele.
27. An allele-specific oligonucleotide according to claim 26 that is a probe.
28. An allele-specific oligonucleotide according to claim 26, wherein a central position of the probe aligns with the polymorphic site of the portion.
29. An allele-specific oligonucleotide according to claim 26 that is a primer.
30. An allele-specific oligonucleotide according to claim 29, wherein the 3′ end of the primer aligns with the polymorphic site of the portion.
31. An isolated gene product encoded by a nucleic acid molecule according to claim 22.
32. A method of analyzing a nucleic acid sample, comprising obtaining the nucleic acid sample from an individual; and determining a base occupying any one of the polymorphic sites shown in the Table.
33. A method according to claim 32, wherein the nucleic acid sample is obtained from a plurality of individuals, and a base occupying one of the polymorphic positions is determined in each of the individuals, and wherein the method further comprises testing each individual for the presence of a disease phenotype, and correlating the presence of the disease phenotype with the base.
Description
RELATED APPLICATIONS

This application is a divisional of U.S. application Ser. No. 09/657,472, filed Sep. 7, 2000, which claims the benefit of U.S. Provisional Application Serial No. 60/153,357, filed Sep. 10, 1999, U.S. Provisional Application Serial No. 60/220,947, filed Jul. 26, 2000, and U.S. Provisional Application Serial No. 60/225,724, filed Aug. 16, 2000, the entire teachings of all of which are incorporated herein by reference.

BACKGROUND OF THE INVENTION

The genomes of all organisms undergo spontaneous mutation in the course of their continuing evolution, generating variant forms of progenitor nucleic acid sequences (Gusella, Ann. Rev. Biochem. 55, 831-854 (1986)). The variant form may confer an evolutionary advantage or disadvantage relative to a progenitor form, or may be neutral. In some instances, a variant form confers a lethal disadvantage and is not transmitted to subsequent generations of the organism. In other instances, a variant form confers an evolutionary advantage to the species and is eventually incorporated into the DNA of many or most members of the species and effectively becomes the progenitor form. In many instances, both progenitor and variant form(s) survive and co-exist in a species population. The coexistence of multiple forms of a sequence gives rise to polymorphisms.

Several different types of polymorphism have been reported. A restriction fragment length polymorphism (RFLP) is a variation in DNA sequence that alters the length of a restriction fragment (Botstein et al., Am. J. Hum. Genet. 32, 314-331 (1980)). The restriction fragment length polymorphism may create or delete a restriction site, thus changing the length of the restriction fragment. RFLPs have been widely used in human and animal genetic analyses (see WO 90/13668; WO 90/11369; Donis-Keller, Cell 51, 319-337 (1987); Lander et al., Genetics 121, 85-99 (1989)). When a heritable trait can be linked to a particular RFLP, the presence of the RFLP in an individual can be used to predict the likelihood that the individual will also exhibit the trait.

Other polymorphisms take the form of short tandem repeats (STRs) that include tandem di-, tri- and tetra-nucleotide repeated motifs. These tandem repeats are also referred to as variable number tandem repeat (VNTR) polymorphisms. VNTRs have been used in identity and paternity analysis (U.S. Pat. No. 5,075,217; Armour et al., FEBS Lett. 307, 113-115 (1992); Horn et al., WO 91/14003; Jeffreys, EP 370,719), and in a large number of genetic mapping studies.

Other polymorphisms take the form of single nucleotide variations between individuals of the same species. Such polymorphisms are far more frequent than RFLPs, STRs and VNTRs. Some single nucleotide polymorphisms (SNPs) occur in protein-coding nucleic acid sequences (coding sequence SNPs, or cSNPs), in which case one of the polymorphic forms may give rise to the expression of a defective or otherwise variant protein and, potentially, a genetic disease. Examples of genes in which polymorphisms within coding sequences give rise to genetic disease include β-globin (sickle cell anemia), apoE4 (Alzheimer's Disease), Factor V Leiden (thrombosis), and CFTR (cystic fibrosis). cSNPs can alter the codon sequence of the gene and therefore specify an alternative amino acid. Such changes are called “missense” when another amino acid is substituted, and “nonsense” when the alternative codon specifies a stop signal in protein translation. When the cSNP does not alter the amino acid specified, the cSNP is called “silent”.
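
The missense, nonsense and silent categories described above can be illustrated with a short sketch. It is not part of the disclosure: the function name classify_csnp and its parameters are hypothetical, and the sketch assumes the standard genetic code and a codon given on the coding strand.

```python
# Minimal sketch (illustrative only): classify a coding-sequence SNP as
# "silent", "missense" or "nonsense" using the standard genetic code.
# The names below are hypothetical and do not come from the application.

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering; "*" marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_csnp(ref_codon, offset, variant_base):
    """Classify a single-base change within one codon.

    ref_codon    -- three-letter reference codon (coding strand, DNA letters)
    offset       -- 0, 1 or 2: position of the changed base within the codon
    variant_base -- the base found in the variant allele
    """
    ref_codon = ref_codon.upper()
    if len(ref_codon) != 3 or offset not in (0, 1, 2):
        raise ValueError("expected a 3-base codon and an offset of 0, 1 or 2")
    var_codon = ref_codon[:offset] + variant_base.upper() + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    var_aa = CODON_TABLE[var_codon]
    if var_aa == ref_aa:
        return "silent"      # same amino acid (or stop) is specified
    if var_aa == "*":
        return "nonsense"    # variant codon specifies a stop signal
    return "missense"        # a different amino acid is substituted
```

For instance, classify_csnp("GAG", 1, "T") describes the well-known sickle cell change in β-globin (GAG to GTG, glutamic acid to valine) and returns "missense". A change that converts a stop codon into an amino acid codon falls outside the three categories named in the text; this sketch simply reports it as "missense".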

Other single nucleotide polymorphisms occur in noncoding regions. Some of these polymorphisms may also result in defective protein expression (e.g., as a result of defective splicing). Other single nucleotide polymorphisms have no phenotypic effects.

Single nucleotide polymorphisms can be used in the same manner as RFLPs and VNTRs, but offer several advantages. Single nucleotide polymorphisms occur with greater frequency and are spaced more uniformly throughout the genome than other forms of polymorphism. The greater frequency and uniformity of single nucleotide polymorphisms means that there is a greater probability that such a polymorphism will be found in close proximity to a genetic locus of interest than would be the case for other polymorphisms. The different forms of characterized single nucleotide polymorphisms are often easier to distinguish than other types of polymorphism (e.g., by use of assays employing allele-specific hybridization probes or primers).

Only a small percentage of the total repository of polymorphisms in humans and other organisms has been identified. The limited number of polymorphisms identified to date is due to the large amount of work required for their detection by conventional methods. For example, a conventional approach to identifying polymorphisms might be to sequence the same stretch of DNA in a population of individuals by dideoxy sequencing. In this type of approach, the amount of work increases in proportion to both the length of sequence and the number of individuals in a population and becomes impractical for large stretches of DNA or large numbers of persons.

SUMMARY OF THE INVENTION

Work described herein pertains to the identification of polymorphisms which can predispose individuals to disease, by resequencing large numbers of genes in a large number of individuals. Various genes from a number of individuals have been resequenced as described herein, and SNPs in these genes have been discovered (see the Table and FIG. 3). Some of these SNPs are cSNPs which specify a different amino acid sequence, some of the SNPs are silent cSNPs and some of these cSNPs specify a stop signal in protein translation. Some of the identified SNPs were located in non-coding regions.

The invention relates to a gene which comprises a single nucleotide polymorphism at a specific location. In a particular embodiment the invention relates to the variant allele of a gene having a single nucleotide polymorphism, which variant allele differs from a reference allele by one nucleotide at the site(s) identified in the Table and FIG. 3. Complements of these nucleic acid sequences are also included. The nucleic acid molecules can be DNA or RNA, and can be double- or single-stranded. Nucleic acid molecules can be, for example, 5-10, 5-15, 10-20, 5-25, 10-30, 10-50 or 10-100 bases long.

The invention further provides allele-specific oligonucleotides that hybridize to the reference or variant allele of a gene comprising a single nucleotide polymorphism or to the complement thereof. These oligonucleotides can be probes or primers.

The invention further provides a method of analyzing a nucleic acid from an individual. The method determines which base is present at any one of the polymorphic sites shown in the Table and/or FIG. 3. Optionally, a set of bases occupying a set of the polymorphic sites shown in the Table and/or FIG. 3 is determined. This type of analysis can be performed on a number of individuals, who are tested for the presence of a disease phenotype. The presence or absence of disease phenotype is then correlated with a base or set of bases present at the polymorphic site or sites in the individuals tested.

Thus, the invention further relates to a method of predicting the presence, absence, likelihood of the presence or absence, or severity of a particular phenotype or disorder associated with a particular genotype. The method comprises obtaining a nucleic acid sample from an individual and determining the identity of one or more bases (nucleotides) at polymorphic sites of genes described herein, wherein the presence of a particular base is correlated with a specified phenotype or disorder, thereby predicting the presence, absence, likelihood of the presence or absence, or severity of the phenotype or disorder in the individual.

The thrombospondins are a family of extracellular matrix (ECM) glycoproteins that modulate many cell behaviors including adhesion, migration, and proliferation. Thrombospondins (also known as thrombin sensitive proteins or TSPs) are large molecular weight glycoproteins composed of three identical disulfide-linked polypeptide chains. The results described herein also reveal an important association between alterations, particularly SNPs, in TSP genes, particularly TSP-1 and TSP-4, and vascular disease. In particular, SNPs in these genes which are associated with premature coronary artery disease (CAD) (or coronary heart disease) and myocardial infarction (MI) have been identified and represent a potentially vital marker of upstream biology influencing the complex process of atherosclerotic plaque generation and vulnerability.

Thus, the invention relates to the TSP gene SNPs identified as described herein, both singly and in combination, as well as to the use of these SNPs, and others in TSP genes, particularly those nearby in linkage disequilibrium with these SNPs, for diagnosis, prediction of clinical course and treatment response for vascular disease, development of new treatments for vascular disease based upon comparison of the variant and normal versions of the gene or gene product, and development of cell-culture based and animal models for research and treatment of vascular disease. The invention further relates to novel compounds and pharmaceutical compositions for use in the diagnosis and treatment of such disorders. In preferred embodiments, the vascular disease is CAD or MI.

The invention relates to isolated nucleic acid molecules comprising all or a portion of the variant allele of TSP-1 (e.g., as exemplified by SEQ ID NO: 1), and to isolated nucleic acid molecules comprising all or a portion of the variant allele of TSP-4 (e.g., as exemplified by SEQ ID NO: 3). Preferred portions are at least 10 contiguous nucleotides and comprise the polymorphic site, e.g., a portion of SEQ ID NO: 1 which is at least 10 contiguous nucleotides and comprises the “G” at position 2210, or a portion of SEQ ID NO: 3 which is at least 10 contiguous nucleotides and comprises the “C” at position 1186. The invention further relates to isolated gene products, e.g., polypeptides or proteins, which are encoded by a nucleic acid molecule comprising all or a portion of the variant allele of TSP-1 or TSP-4 (e.g., SEQ ID NO: 1 or SEQ ID NO: 3, respectively). The invention also relates to nucleic acid molecules which hybridize to and/or share identity with the variant alleles identified herein (or their complements) and which also comprise the variant nucleotide at the SNP site.

The invention further relates to isolated proteins or polypeptides comprising all or a portion of the variant amino acid sequence of TSP-1 (e.g., as exemplified by SEQ ID NO: 2), and to isolated proteins or polypeptides comprising all or a portion of the variant amino acid sequence of TSP-4 (e.g., as exemplified by SEQ ID NO: 4). Preferred polypeptides are at least 10 contiguous amino acids and comprise the polymorphic amino acid, e.g., a portion of SEQ ID NO: 2 which is at least 10 contiguous amino acids and comprises the serine at residue 700, or a portion of SEQ ID NO: 4 which is at least 10 contiguous amino acids and comprises the proline at residue 387. The invention further relates to isolated nucleic acid molecules encoding such proteins and polypeptides, as well as to antibodies which bind, e.g., specifically, to such proteins and polypeptides.

The invention further relates to a method of diagnosing or aiding in the diagnosis of a disorder associated with the presence of one or more of (a) a G at nucleotide position 2210 of SEQ ID NO: 1; or (b) a C at nucleotide position 1186 of SEQ ID NO: 3 in an individual. The method comprises obtaining a nucleic acid sample from the individual and determining the nucleotide present at one or more of the indicated nucleotide positions, wherein presence of one or more of (a) a G at nucleotide position 2210 of SEQ ID NO: 1; or (b) a C at nucleotide position 1186 of SEQ ID NO: 3 is indicative of increased likelihood of said disorder in the individual as compared with an appropriate control, e.g., an individual having the reference nucleotide at one or more of said positions. In a particular embodiment the disorder is a vascular disease selected from the group consisting of atherosclerosis, coronary heart or artery disease, MI, stroke, peripheral vascular diseases, venous thromboembolism and pulmonary embolism. In a preferred embodiment, the vascular disease is selected from the group consisting of CAD and MI.

The invention further relates to a method of diagnosing or aiding in the diagnosis of a disorder associated with one or more of (a) a G at nucleotide position 2210 of SEQ ID NO: 1; or (b) a C at nucleotide position 1186 of SEQ ID NO: 3 in an individual. The method comprises obtaining a nucleic acid sample from the individual and determining the nucleotide present at one or more of the indicated nucleotide positions, wherein presence of one or more of (a) an A at nucleotide position 2210 of SEQ ID NO: 1; or (b) a G at nucleotide position 1186 of SEQ ID NO: 3 is indicative of decreased likelihood of said disorder in the individual as compared with an appropriate control, e.g., an individual having the variant nucleotide at said position. In a particular embodiment the disorder is a vascular disease selected from the group consisting of atherosclerosis, coronary heart or artery disease, MI, stroke, peripheral vascular diseases, venous thromboembolism and pulmonary embolism. In a preferred embodiment, the vascular disease is selected from the group consisting of CAD and MI.

In one embodiment, the invention relates to a method for predicting the likelihood that an individual will have a vascular disease (or aiding in the diagnosis of a vascular disease), comprising the steps of obtaining a DNA sample from an individual to be assessed and determining the nucleotide present at one or more of nucleotide positions 2210 of SEQ ID NO: 1 or 1186 of SEQ ID NO: 3. The presence of the reference nucleotide at one or more of these positions indicates that the individual has a lower likelihood of having a vascular disease than an individual having the variant nucleotide at one or more of these positions, or a lower likelihood of having severe symptomology. In a particular embodiment, the individual is an individual at risk for development of a vascular disease.

The invention further relates to a method of diagnosing or aiding in the diagnosis of a disorder associated with the presence of one or more of (a) a serine at amino acid position 700 of SEQ ID NO: 2; or (b) a proline at amino acid position 387 of SEQ ID NO: 4 in an individual. The method comprises obtaining a biological sample containing the TSP-1 and/or TSP-4 protein or relevant portion thereof from the individual and determining the amino acid present at one or more of the indicated amino acid positions, wherein presence of one or more of (a) a serine at amino acid position 700 of SEQ ID NO: 2; or (b) a proline at amino acid position 387 of SEQ ID NO: 4 is indicative of increased likelihood of said disorder in the individual as compared with an appropriate control, e.g., an individual having the reference amino acid at one or more of said positions.

The invention further relates to a method of diagnosing or aiding in the diagnosis of a disorder associated with one or more of (a) a serine at amino acid position 700 of SEQ ID NO: 2; or (b) a proline at amino acid position 387 of SEQ ID NO: 4 in an individual. The method comprises obtaining a biological sample containing the TSP-1 and/or TSP-4 protein or relevant portion thereof from the individual and determining the amino acid present at one or more of the indicated amino acid positions, wherein presence of one or more of (a) an asparagine at amino acid position 700 of SEQ ID NO: 2; or (b) an alanine at amino acid position 387 of SEQ ID NO: 4 is indicative of decreased likelihood of said disorder in the individual as compared with an appropriate control, e.g., an individual having the variant amino acid at one or more of said positions.

In one embodiment, the invention relates to a method for predicting the likelihood that an individual will have a vascular disease (or aiding in the diagnosis of a vascular disease), comprising the steps of obtaining a biological sample comprising the TSP-1 and/or TSP-4 protein or relevant portion thereof from an individual to be assessed and determining the amino acid present at one or more of amino acid positions 700 of SEQ ID NO: 2 or 387 of SEQ ID NO: 4. The presence of the reference amino acid at one or more of these positions indicates that the individual has a lower likelihood of having a vascular disease than an individual having the variant amino acid at one or more of these positions, or a lower likelihood of having severe symptomology. In a particular embodiment, the individual is an individual at risk for development of a vascular disease.

In another embodiment, the invention relates to pharmaceutical compositions comprising a reference TSP-1 and/or TSP-4 gene or gene product, or active portion thereof, for use in the treatment of vascular diseases. The invention further relates to the use of agonists and antagonists of TSP-1 and TSP-4 activity for use in the treatment of vascular diseases. In a particular embodiment the vascular disease is selected from the group consisting of atherosclerosis, coronary heart or artery disease, MI, stroke, peripheral vascular diseases, venous thromboembolism and pulmonary embolism. In a preferred embodiment, the vascular disease is selected from the group consisting of CAD and MI.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-1D show the reference nucleotide (SEQ ID NO: 1) and amino acid (SEQ ID NO: 2) sequences for TSP-1.

FIGS. 2A-2C show the reference nucleotide (SEQ ID NO: 3) and amino acid (SEQ ID NO: 4) sequences for TSP-4.

FIG. 3 shows a table providing detailed information about the SNPs identified herein. Column one shows the internal polymorphism identifier. Column two shows the accession number for the reference sequence in the TIGR database which can be found on the world wide web at tigr.org/tdb/hgi/searching/hgigreports.html. Column three shows the nucleotide position for the SNP site. Column four shows the gene in which the polymorphism was identified. Column five shows the polymorphic site and additional flanking sequence on each side of the polymorphism. Column six shows the type of mutation produced by the polymorphism. Columns seven and eight show the reference and alternate (variant) nucleotides, respectively, for the SNP. Columns nine and ten show the reference and alternate (variant) amino acids, respectively, encoded by the alleles of the gene.

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a gene which comprises a single nucleotide polymorphism (SNP) at a specific location. The gene which includes the SNP has at least two alleles, referred to herein as the reference allele and the variant allele. The reference allele (prototypical or wild type allele) has been designated arbitrarily and typically corresponds to the nucleotide sequence of the gene which has been deposited with GenBank or TIGR under a given Accession number. The variant allele differs from the reference allele by one nucleotide at the site(s) identified in the Table. The present invention also relates to variant alleles of the described genes and to complements of the variant alleles. The invention also relates to nucleic acid molecules which hybridize to and/or share identity with the variant alleles identified herein (or their complements) and which also comprise the variant nucleotide at the SNP site.

The invention further relates to portions of the variant alleles and portions of complements of the variant alleles which comprise (encompass) the site of the SNP and are at least 5 nucleotides in length. Portions can be, for example, 5-10, 5-15, 10-20, 5-25, 10-30, 10-50 or 10-100 bases long. For example, a portion of a variant allele which is 21 nucleotides in length includes the single nucleotide polymorphism (the nucleotide which differs from the reference allele at that site) and twenty additional nucleotides which flank the site in the variant allele. These nucleotides can be on one or both sides of the polymorphism. Polymorphisms which are the subject of this invention are defined in the Table with respect to the reference sequence deposited in GenBank or TIGR under the Accession number indicated. For example, the invention relates to a portion of a gene (e.g., AT3) having a nucleotide sequence as deposited in GenBank (e.g., U11270) comprising a single nucleotide polymorphism at a specific position (e.g., nucleotide 11918). The reference nucleotide for AT3 is shown in column 8, and the variant nucleotide is shown in column 9 of the Table. The nucleotide sequences of the invention can be double- or single-stranded.
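
The idea of a portion that encompasses the SNP site plus flanking nucleotides, as described above, can be sketched as follows. This is an illustration under stated assumptions rather than a method taken from the application: the function name snp_portion and its parameters are hypothetical, the position is assumed to be 1-based as in the Table, and the reference sequence is assumed to be available as a plain string.

```python
# Minimal sketch (illustrative only): build a portion of a variant allele
# that contains the polymorphic site plus flanking nucleotides.
# Function and parameter names are hypothetical.

def snp_portion(sequence, position, variant_base, left_flank=10, right_flank=10):
    """Return the variant-allele portion around a polymorphic site.

    sequence     -- reference sequence as a string
    position     -- 1-based position of the polymorphic site
    variant_base -- nucleotide present in the variant allele
    left_flank   -- number of flanking nucleotides to keep on the 5' side
    right_flank  -- number of flanking nucleotides to keep on the 3' side
    """
    index = position - 1  # convert to 0-based indexing
    if not 0 <= index < len(sequence):
        raise ValueError("position lies outside the sequence")
    start = max(0, index - left_flank)
    end = min(len(sequence), index + right_flank + 1)
    # Substitute the variant nucleotide at the polymorphic site.
    return sequence[start:index] + variant_base + sequence[index + 1:end]
```

With the default flanks of ten nucleotides on each side, the result is the 21-nucleotide portion used as an example in the text. Setting left_flank=20 and right_flank=0 places all twenty flanking nucleotides on one side of the polymorphism, which the text also allows.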

The invention further provides allele-specific oligonucleotides that hybridize to the reference or variant allele of a gene comprising a single nucleotide polymorphism or to the complement thereof. These oligonucleotides can be probes or primers.

The invention further provides a method of analyzing a nucleic acid from an individual. The method determines which base is present at any one of the polymorphic sites shown in the Table and/or FIG. 3. Optionally, a set of bases occupying a set of the polymorphic sites shown in the Table and/or FIG. 3 is determined. This type of analysis can be performed on a number of individuals, who are tested for the presence of a disease phenotype. The presence or absence of disease phenotype is then correlated with a base or set of bases present at the polymorphic site or sites in the individuals tested.

Thus, the invention further relates to a method of predicting the presence, absence, likelihood of the presence or absence, or severity of a particular phenotype or disorder associated with a particular genotype. The method comprises obtaining a nucleic acid sample from an individual and determining the identity of one or more bases (nucleotides) at polymorphic sites of genes described herein, wherein the presence of a particular base is correlated with a specified phenotype or disorder, thereby predicting the presence, absence, likelihood of the presence or absence, or severity of the phenotype or disorder in the individual.
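
The correlation step described above, in which the base present at a polymorphic site is compared with the presence or absence of a disease phenotype across a number of individuals, could be sketched as below. The application does not specify a statistical procedure at this point; the two-by-two table and odds ratio shown here are one common choice, and all names are hypothetical.

```python
# Minimal sketch (illustrative only): tabulate a base at one polymorphic
# site against a disease phenotype and compute an odds ratio.
# The statistic is an assumption; the text does not prescribe one.
from collections import Counter


def base_phenotype_table(records, base):
    """Count individuals by (carries base, has phenotype).

    records -- iterable of (observed_base, has_phenotype) pairs,
               one pair per individual tested
    base    -- the base of interest at the polymorphic site
    """
    counts = Counter((observed == base, bool(affected))
                     for observed, affected in records)
    return {
        "base_affected": counts[(True, True)],
        "base_unaffected": counts[(True, False)],
        "other_affected": counts[(False, True)],
        "other_unaffected": counts[(False, False)],
    }


def odds_ratio(table):
    """Odds of the phenotype with the base versus without it.

    Returns None when the ratio is undefined (a zero in the denominator).
    """
    denominator = table["base_unaffected"] * table["other_affected"]
    if denominator == 0:
        return None
    return (table["base_affected"] * table["other_unaffected"]) / denominator
```

An odds ratio above 1 would suggest that the base is found more often in individuals with the phenotype. A real study would also need a significance test and a treatment of heterozygous genotypes, both of which this sketch leaves out.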

Definitions

A nucleic acid molecule or oligonucleotide can be DNA or RNA, and single- or double-stranded. Nucleic acid molecules and oligonucleotides can be naturally occurring or synthetic, but are typically prepared by synthetic means. Preferred nucleic acid molecules and oligonucleotides of the invention include segments of DNA, or their complements, which include any one of the polymorphic sites shown in the Table. The segments can be between 5 and 250 bases, and, in specific embodiments, are between 5-10, 5-20, 10-20, 10-50, 20-50 or 10-100 bases. For example, the segment can be 21 bases. The polymorphic site can occur within any position of the segment. The segments can be from any of the allelic forms of DNA shown in the Table.

As used herein, the terms “nucleotide”, “base” and “nucleic acid” are intended to be equivalent. The terms “nucleotide sequence”, “nucleic acid sequence”, “nucleic acid molecule” and “segment” are intended to be equivalent.

Hybridization probes are oligonucleotides which bind in a base-specific manner to a complementary strand of nucleic acid. Such probes include peptide nucleic acids, as described in Nielsen et al., Science 254, 1497-1500 (1991). Probes can be any length suitable for specific hybridization to the target nucleic acid sequence. The most appropriate length of the probe may vary depending upon the hybridization method in which it is being used; for example, particular lengths may be more appropriate for use in microfabricated arrays, while other lengths may be more suitable for use in classical hybridization methods. Such optimizations are known to the skilled artisan. Suitable probes and primers can range from about 5 nucleotides to about 30 nucleotides in length. For example, probes and primers can be 5, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 25, 26, 28 or 30 nucleotides in length. The probe or primer preferably overlaps at least one polymorphic site occupied by any of the possible variant nucleotides. The nucleotide sequence can correspond to the coding sequence of the allele or to the complement of the coding sequence of the allele.

As used herein, the term “primer” refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis under appropriate conditions (e.g., in the presence of four different nucleoside triphosphates and an agent for polymerization, such as DNA or RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature. The appropriate length of a primer depends on the intended use of the primer, but typically ranges from 15 to 30 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template, but must be sufficiently complementary to hybridize with a template. The term primer site refers to the area of the target DNA to which a primer hybridizes. The term primer pair refers to a set of primers including a 5′ (upstream) primer that hybridizes with the 5′ end of the DNA sequence to be amplified and a 3′ (downstream) primer that hybridizes with the complement of the 3′ end of the sequence to be amplified.

As used herein, linkage describes the tendency of genes, alleles, loci or genetic markers to be inherited together as a result of their location on the same chromosome. It can be measured by percent recombination between the two genes, alleles, loci or genetic markers.
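As a minimal sketch of the percent-recombination measure mentioned above, the following computes the share of recombinant offspring among all scored offspring. The function name and the example counts are hypothetical and are not taken from the specification.

```python
def percent_recombination(recombinant: int, non_recombinant: int) -> float:
    """Percent of scored offspring that are recombinant for two markers.

    Both arguments are hypothetical counts of offspring in which the two
    markers were, respectively, separated or inherited together.
    """
    total = recombinant + non_recombinant
    if recombinant < 0 or non_recombinant < 0 or total == 0:
        raise ValueError("counts must be non-negative and not both zero")
    return 100.0 * recombinant / total


# Example with made-up counts: 5 recombinants among 100 scored offspring.
print(percent_recombination(5, 95))
```

A low value indicates that the two markers tend to be inherited together.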

As used herein, polymorphism refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population. A polymorphic marker or site is the locus at which divergence occurs. Preferred markers have at least two alleles, each occurring at a frequency of greater than 1%, and more preferably greater than 10% or 20% of a selected population. A polymorphic locus may be as small as one base pair. Polymorphic markers include restriction fragment length polymorphisms, variable number of tandem repeats (VNTRs), hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion elements such as Alu. The first identified allelic form is arbitrarily designated as the reference form and other allelic forms are designated as alternative or variant alleles. The allelic form occurring most frequently in a selected population is sometimes referred to as the wildtype form. Diploid organisms may be homozygous or heterozygous for allelic forms. A diallelic or biallelic polymorphism has two forms. A triallelic polymorphism has three forms.

Work described herein pertains to the resequencing of large numbers of genes in a large number of individuals to identify polymorphisms which can predispose individuals to disease. For example, polymorphisms in genes which are expressed in liver may predispose individuals to disorders of the liver. By altering amino acid sequence, SNPs may alter the function of the encoded proteins. The discovery of the SNP facilitates biochemical analysis of the variants and the development of assays to characterize the variants and to screen for pharmaceuticals that would interact directly with one or another form of the protein. SNPs (including silent SNPs) also enable the development of specific DNA, RNA, or protein-based diagnostics that detect the presence or absence of the polymorphism in particular conditions.

A single nucleotide polymorphism occurs at a polymorphic site occupied by a single nucleotide, which is the site of variation between allelic sequences. The site is usually preceded by and followed by highly conserved sequences of the allele (e.g., sequences that vary in less than 1/100 or 1/1000 members of the populations).

A single nucleotide polymorphism usually arises due to substitution of one nucleotide for another at the polymorphic site. A transition is the replacement of one purine by another purine or one pyrimidine by another pyrimidine. A transversion is the replacement of a purine by a pyrimidine or vice versa. Single nucleotide polymorphisms can also arise from a deletion of a nucleotide or an insertion of a nucleotide relative to a reference allele. Typically the polymorphic site is occupied by a base other than the reference base. For example, where the reference allele contains the base “T” at the polymorphic site, the altered allele can contain a “C”, “G” or “A” at the polymorphic site.
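The transition/transversion distinction above can be illustrated with a short sketch. The function name is hypothetical, and the sketch assumes the four DNA bases only; it does not handle insertions or deletions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and variant bases are identical")
    # Purine <-> purine or pyrimidine <-> pyrimidine is a transition;
    # any change between the two classes is a transversion.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


# For the reference base T used in the example above:
for variant in ("C", "G", "A"):
    print("T ->", variant, classify_substitution("T", variant))
```

For a reference base "T", only the change to "C" is a transition; the changes to "G" and "A" are transversions.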

The invention also relates to nucleic acid molecules which hybridize to the variant alleles identified herein (or their complements) and which also comprise the variant nucleotide at the SNP site. Hybridizations are usually performed under stringent conditions, for example, at a salt concentration of no more than 1 M and a temperature of at least 25° C. For example, conditions of 5×SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30° C., or equivalent conditions, are suitable for allele-specific probe hybridizations. Equivalent conditions can be determined by varying one or more of the parameters given as an example, as known in the art, while maintaining a similar degree of identity or similarity between the target nucleotide sequence and the primer or probe used.

The invention also relates to nucleic acid molecules which share substantial sequence identity to the variant alleles identified herein (or their complements) and which also comprise the variant nucleotide at the SNP site. Particularly preferred are nucleic acid molecules and fragments which have at least about 60%, preferably at least about 70, 80 or 85%, more preferably at least about 90%, even more preferably at least about 95%, and most preferably at least about 98% identity with nucleic acid molecules described herein. The percent identity of two nucleotide or amino acid sequences can be determined by aligning the sequences for optimal comparison purposes (e.g., gaps can be introduced in the first sequence). The nucleotides or amino acids at corresponding positions are then compared, and the percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = (# of identical positions / total # of positions) × 100). In certain embodiments, the length of a sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 60%, and even more preferably at least 70%, 80% or 90% of the length of the reference sequence. The actual comparison of the two sequences can be accomplished by well-known methods, for example, using a mathematical algorithm. A preferred, non-limiting example of such a mathematical algorithm is described in Karlin et al., Proc. Natl. Acad. Sci. USA, 90:5873-5877 (1993). Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) as described in Altschul et al., Nucleic Acids Res., 25:3389-3402 (1997). When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., NBLAST) can be used. See the world wide web at ncbi.nlm.nih.gov. In one embodiment, parameters for sequence comparison can be set at score=100, wordlength=12, or can be varied (e.g., W=5 or W=20).
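The percent identity formula above can be sketched as follows for two sequences that have already been aligned. The function name is hypothetical. As assumptions of this sketch, gaps are written as "-", every alignment column (including gap columns) counts toward the total number of positions, and a gap never counts as identical. It does not perform the alignment itself and is not the BLAST algorithm.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: '-' marks a gap, and the total number of positions is the
    number of alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    identical = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


# Three of four aligned positions match in each of these toy alignments.
print(percent_identity("ACGT", "ACGA"))
print(percent_identity("AC-T", "ACGT"))
```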

The term “isolated” is used herein to indicate that the material in question exists in a physical milieu distinct from that in which it occurs in nature. For example, an isolated nucleic acid of the invention may be substantially isolated with respect to the complex cellular milieu in which it naturally occurs. In some instances, the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix. In other circumstances, the material may be purified to essential homogeneity, for example as determined by PAGE or column chromatography such as HPLC. Preferably, an isolated nucleic acid comprises at least about 50, 80 or 90 percent (on a molar basis) of all macromolecular species present.

I. Novel Polymorphisms of the Invention

Some of the novel polymorphisms of the invention are shown in the Table. Columns one and two show designations for the indicated polymorphism. Column three shows the Genbank or TIGR Accession number for the wild type (or reference) allele. Column four shows the location of the polymorphic site in the nucleic acid sequence with reference to the Genbank or TIGR sequence shown in column three. Column five shows common names for the gene in which the polymorphism is located. Column six shows the polymorphism and a portion of the 3′ and 5′ flanking sequence of the gene. Column seven shows the type of mutation; N, non-sense, S, silent, M, missense. Columns eight and nine show the reference and alternate nucleotides, respectively, at the polymorphic site. Columns ten and eleven show the reference and alternate amino acids, respectively, encoded by the reference and variant, respectively, alleles. Other novel polymorphisms of the invention are shown in FIG. 3.

II. Analysis of Polymorphisms

A. Preparation of Samples

Polymorphisms are detected in a target nucleic acid from an individual being analyzed. For assay of genomic DNA, virtually any biological sample (other than pure red blood cells) is suitable. For example, convenient tissue samples include whole blood, semen, saliva, tears, urine, fecal material, sweat, buccal, skin and hair. For assay of cDNA or mRNA, the tissue sample must be obtained from an organ in which the target nucleic acid is expressed. For example, if the target nucleic acid is a cytochrome P450, the liver is a suitable source.

Many of the methods described below require amplification of DNA from target samples. This can be accomplished by e.g., PCR. See generally PCR Technology: Principles and Applications for DNA Amplification (ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (eds. Innis, et al., Academic Press, San Diego, Calif., 1990); Mattila et al., Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Pat. No. 4,683,202.

Other suitable amplification methods include the ligase chain reaction (LCR) (see Wu and Wallace, Genomics 4, 560 (1989); Landegren et al., Science 241, 1077 (1988)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86, 1173 (1989)), self-sustained sequence replication (Guatelli et al., Proc. Nat. Acad. Sci. USA, 87, 1874 (1990)) and nucleic acid based sequence amplification (NASBA). The latter two amplification methods involve isothermal reactions based on isothermal transcription, which produce both single stranded RNA (ssRNA) and double stranded DNA (dsDNA) as the amplification products in a ratio of about 30 or 100 to 1, respectively.

B. Detection of Polymorphisms in Target DNA

There are two distinct types of analysis of target DNA for detecting polymorphisms. The first type of analysis, sometimes referred to as de novo characterization, is carried out to identify polymorphic sites not previously characterized (i.e., to identify new polymorphisms). This analysis compares target sequences in different individuals to identify points of variation, i.e., polymorphic sites. By analyzing groups of individuals representing the greatest ethnic diversity among humans and greatest breed and species variety in plants and animals, patterns characteristic of the most common alleles/haplotypes of the locus can be identified, and the frequencies of such alleles/haplotypes in the population can be determined. Additional allelic frequencies can be determined for subpopulations characterized by criteria such as geography, race, or gender. The de novo identification of polymorphisms of the invention is described in the Examples section. The second type of analysis determines which form(s) of a characterized (known) polymorphism are present in individuals under test. There are a variety of suitable procedures, which are discussed in turn.

1. Allele-Specific Probes

The design and use of allele-specific probes for analyzing polymorphisms is described by e.g., Saiki et al., Nature 324, 163-166 (1986); Dattagupta, EP 235,726, Saiki, WO 89/11548. Allele-specific probes can be designed that hybridize to a segment of target DNA from one individual but do not hybridize to the corresponding segment from another individual due to the presence of different polymorphic forms in the respective segments from the two individuals. Hybridization conditions should be sufficiently stringent that there is a significant difference in hybridization intensity between alleles, and preferably an essentially binary response, whereby a probe hybridizes to only one of the alleles. Some probes are designed to hybridize to a segment of target DNA such that the polymorphic site aligns with a central position (e.g., in a 15-mer at the 7 position; in a 16-mer, at either the 8 or 9 position) of the probe. This design of probe achieves good discrimination in hybridization between different allelic forms.

Allele-specific probes are often used in pairs, one member of a pair showing a perfect match to a reference form of a target sequence and the other member showing a perfect match to a variant form. Several pairs of probes can then be immobilized on the same support for simultaneous analysis of multiple polymorphisms within the same target sequence.
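A minimal sketch of building such a probe pair is given below. The function name, the toy sequence and the default offset are assumptions for illustration: the offset defaults to the middle index of the probe counted from zero, which may differ by one from the position numbering used above, and both probes are written in the sense of the supplied strand (a probe for the opposite strand would be the reverse complement).

```python
def allele_specific_probes(seq, snp_index, ref, alt, length=15, offset=None):
    """Return (reference_probe, variant_probe) centred on a polymorphic site.

    seq       : sequence containing the site (reference allele)
    snp_index : zero-based index of the polymorphic site in seq
    offset    : zero-based index of the site within the probe
                (assumption: defaults to the middle of the probe)
    """
    if offset is None:
        offset = length // 2
    start = snp_index - offset
    end = start + length
    if start < 0 or end > len(seq):
        raise ValueError("not enough flanking sequence for this probe length")
    if seq[snp_index] != ref:
        raise ValueError("sequence does not carry the reference base at the site")
    ref_probe = seq[start:end]
    # The variant probe differs from the reference probe only at the site.
    alt_probe = ref_probe[:offset] + alt + ref_probe[offset + 1:]
    return ref_probe, alt_probe


toy_sequence = "ACGTACGTACGTACGTACGTA"  # made-up sequence, G at index 10
print(allele_specific_probes(toy_sequence, 10, "G", "A"))
```

The two probes are identical except at the central position, which is what allows hybridization intensity to distinguish the alleles.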

2. Tiling Arrays

The polymorphisms can also be identified by hybridization to nucleic acid arrays, some examples of which are described in WO 95/11995. One form of such arrays is described in the Examples section in connection with de novo identification of polymorphisms. The same array or a different array can be used for analysis of characterized polymorphisms. WO 95/11995 also describes subarrays that are optimized for detection of a variant form of a precharacterized polymorphism. Such a subarray contains probes designed to be complementary to a second reference sequence, which is an allelic variant of the first reference sequence. The second group of probes is designed by the same principles as described in the Examples, except that the probes exhibit complementarity to the second reference sequence. The inclusion of a second group (or further groups) can be particularly useful for analyzing short subsequences of the primary reference sequence in which multiple mutations are expected to occur within a short distance commensurate with the length of the probes (e.g., two or more mutations within 9 to 21 bases).

3. Allele-Specific Primers

An allele-specific primer hybridizes to a site on target DNA overlapping a polymorphism and only primes amplification of an allelic form to which the primer exhibits perfect complementarity. See Gibbs, Nucleic Acids Res. 17, 2427-2448 (1989). This primer is used in conjunction with a second primer which hybridizes at a distal site. Amplification proceeds from the two primers, resulting in a detectable product which indicates the particular allelic form is present. A control is usually performed with a second pair of primers, one of which shows a single base mismatch at the polymorphic site and the other of which exhibits perfect complementarity to a distal site. The single-base mismatch prevents amplification and no detectable product is formed. The method works best when the mismatch is included in the 3′-most position of the oligonucleotide aligned with the polymorphism because this position is most destabilizing to elongation from the primer (see, e.g., WO 93/22456).
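The placement of the polymorphic base at the 3′-most position can be sketched as follows. The function name, primer length and toy sequence are hypothetical. The primers are written 5′ to 3′ in the sense of the supplied strand, so they would anneal to the complementary strand; selection of the distal second primer is not shown.

```python
def allele_specific_primers(seq, snp_index, alleles, length=20):
    """Return {allele: primer} with the polymorphic base at the 3' end.

    seq       : sequence containing the polymorphic site
    snp_index : zero-based index of the site in seq
    alleles   : bases to build primers for, e.g. ("G", "A")
    """
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    upstream = seq[start:snp_index]  # shared 5' portion of every primer
    # Each primer ends on its own allele, so a mismatch falls on the 3' base.
    return {allele: upstream + allele for allele in alleles}


toy_sequence = "ACGTACGTACGTACGTACGTA"  # made-up sequence, G at index 10
print(allele_specific_primers(toy_sequence, 10, ("G", "A"), length=8))
```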

4. Direct-Sequencing

The direct analysis of the sequence of polymorphisms of the present invention can be accomplished using either the dideoxy chain termination method or the Maxam-Gilbert method (see Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd Ed., CSHP, New York 1989); Zyskind et al., Recombinant DNA Laboratory Manual, (Acad. Press, 1988)).

5. Denaturing Gradient Gel Electrophoresis

Amplification products generated using the polymerase chain reaction can be analyzed by the use of denaturing gradient gel electrophoresis. Different alleles can be identified based on the different sequence-dependent melting properties and electrophoretic migration of DNA in solution. Erlich, ed., PCR Technology, Principles and Applications for DNA Amplification, (W. H. Freeman and Co, New York, 1992), Chapter 7.

6. Single-Strand Conformation Polymorphism Analysis

Alleles of target sequences can be differentiated using single-strand conformation polymorphism analysis, which identifies base differences by alteration in electrophoretic migration of single stranded PCR products, as described in Orita et al., Proc. Nat. Acad. Sci. 86, 2766-2770 (1989). Amplified PCR products can be generated as described above, and heated or otherwise denatured, to form single stranded amplification products. Single-stranded nucleic acids may refold or form secondary structures which are partially dependent on the base sequence. The different electrophoretic mobilities of single-stranded amplification products can be related to base-sequence differences between alleles of target sequences.

7. Single-Base Extension

An alternative method for identifying and analyzing polymorphisms is based on single-base extension (SBE) of a fluorescently-labeled primer coupled with fluorescence resonance energy transfer (FRET) between the label of the added base and the label of the primer. Typically, the method, such as that described by Chen et al., (PNAS 94:10756-61 (1997), incorporated herein by reference) uses a locus-specific oligonucleotide primer labeled on the 5′ terminus with 5-carboxyfluorescein (FAM). This labeled primer is designed so that the 3′ end is immediately adjacent to the polymorphic site of interest. The labeled primer is hybridized to the locus, and single base extension of the labeled primer is performed with fluorescently labeled dideoxyribonucleotides (ddNTPs) in dye-terminator sequencing fashion, except that no deoxyribonucleotides are present. An increase in fluorescence of the added ddNTP in response to excitation at the wavelength of the labeled primer is used to infer the identity of the added nucleotide.

III. Methods of Use

After determining polymorphic form(s) present in an individual at one or more polymorphic sites, this information can be used in a number of methods.

A. Forensics

Determination of which polymorphic forms occupy a set of polymorphic sites in an individual identifies a set of polymorphic forms that distinguishes the individual. See generally National Research Council, The Evaluation of Forensic DNA Evidence (Eds. Pollard et al., National Academy Press, DC, 1996). The more sites that are analyzed, the lower the probability that the set of polymorphic forms in one individual is the same as that in an unrelated individual. Preferably, if multiple sites are analyzed, the sites are unlinked. Thus, polymorphisms of the invention are often used in conjunction with polymorphisms in distal genes. Preferred polymorphisms for use in forensics are biallelic because the population frequencies of two polymorphic forms can usually be determined with greater accuracy than those of multiple polymorphic forms at multi-allelic loci.

The capacity to identify a distinguishing or unique set of forensic markers in an individual is useful for forensic analysis. For example, one can determine whether a blood sample from a suspect matches a blood or other tissue sample from a crime scene by determining whether the set of polymorphic forms occupying selected polymorphic sites is the same in the suspect and the sample. If the set of polymorphic markers does not match between a suspect and a sample, it can be concluded (barring experimental error) that the suspect was not the source of the sample. If the set of markers does match, one can conclude that the DNA from the suspect is consistent with that found at the crime scene. If frequencies of the polymorphic forms at the loci tested have been determined (e.g., by analysis of a suitable population of individuals), one can perform a statistical analysis to determine the probability that a match of suspect and crime scene sample would occur by chance.

p(ID) is the probability that two random individuals have the same polymorphic or allelic form at a given polymorphic site. In biallelic loci, four genotypes are possible: AA, AB, BA, and BB. If alleles A and B occur in a haploid genome of the organism with frequencies x and y, the probability of each genotype in a diploid organism is (see WO 95/12607):

    • Homozygote: p(AA) = x^2
    • Homozygote: p(BB) = y^2 = (1−x)^2
    • Single Heterozygote: p(AB) = p(BA) = xy = x(1−x)
    • Both Heterozygotes: p(AB+BA) = 2xy = 2x(1−x)

The probability of identity at one locus (i.e, the probability that two individuals, picked at random from a population will have identical polymorphic forms at a given locus) is given by the equation:
p(ID) = (x^2)^2 + (2xy)^2 + (y^2)^2.

These calculations can be extended for any number of polymorphic forms at a given locus. For example, the probability of identity p(ID) for a 3-allele system where the alleles have the frequencies in the population of x, y and z, respectively, is equal to the sum of the squares of the genotype frequencies:
p(ID) = x^4 + (2xy)^2 + (2yz)^2 + (2xz)^2 + z^4 + y^4

In a locus of n alleles, the appropriate binomial expansion is used to calculate p(ID) and p(exc).
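As a minimal sketch, the probability of identity can be computed for any number of alleles as the sum of the squared genotype frequencies, using the same genotype frequencies as above (homozygotes as the square of the allele frequency, each heterozygote class as twice the product of the two allele frequencies). The function name and the example frequencies are hypothetical.

```python
from itertools import combinations


def probability_of_identity(allele_freqs):
    """p(ID) at one locus: sum of the squared genotype frequencies."""
    freqs = list(allele_freqs)
    if any(f < 0 for f in freqs) or abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must be non-negative and sum to 1")
    homozygotes = [f * f for f in freqs]
    heterozygotes = [2 * a * b for a, b in combinations(freqs, 2)]
    return sum(g * g for g in homozygotes + heterozygotes)


# Biallelic locus with x = y = 0.5: 0.25^2 + 0.5^2 + 0.25^2 = 0.375
print(probability_of_identity([0.5, 0.5]))
# Triallelic locus with made-up frequencies
print(probability_of_identity([0.5, 0.3, 0.2]))
```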

The cumulative probability of identity (cum p(ID)) for each of multiple unlinked loci is determined by multiplying the probabilities provided by each locus.
cum p(ID) = p(ID_1) p(ID_2) p(ID_3) . . . p(ID_n)

The cumulative probability of non-identity for n loci (i.e. the probability that two random individuals will be different at 1 or more loci) is given by the equation:
cum p(nonID) = 1 − cum p(ID).
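The two cumulative quantities can be sketched as below, assuming the per-locus p(ID) values are already known and the loci are unlinked. The function names and the example values are hypothetical.

```python
from math import prod


def cumulative_identity(p_ids):
    """Product of the per-locus probabilities of identity (unlinked loci)."""
    return prod(p_ids)


def cumulative_non_identity(p_ids):
    """Probability that two random individuals differ at one or more loci."""
    return 1.0 - cumulative_identity(p_ids)


# Ten hypothetical biallelic loci, each with p(ID) = 0.375
loci = [0.375] * 10
print(cumulative_identity(loci))
print(cumulative_non_identity(loci))
```

Each additional unlinked locus multiplies the cumulative probability of identity by a factor below one, which is why analyzing more sites improves discrimination.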

If several polymorphic loci are tested, the cumulative probability of non-identity for random individuals becomes very high (e.g., one billion to one). Such probabilities can be taken into account together with other evidence in determining the guilt or innocence of the suspect.

B. Paternity Testing

The object of paternity testing is usually to determine whether a male is the father of a child. In most cases, the mother of the child is known and thus, the mother's contribution to the child's genotype can be traced. Paternity testing investigates whether the part of the child's genotype not attributable to the mother is consistent with that of the putative father. Paternity testing can be performed by analyzing sets of polymorphisms in the putative father and the child.

If the set of polymorphisms in the child attributable to the father does not match the set of polymorphisms of the putative father, it can be concluded, barring experimental error, that the putative father is not the real father. If the set of polymorphisms in the child attributable to the father does match the set of polymorphisms of the putative father, a statistical calculation can be performed to determine the probability of coincidental match.

The probability of parentage exclusion (representing the probability that a random male will have a polymorphic form at a given polymorphic site that makes him incompatible as the father) is given by the equation (see WO 95/12607):
p(exc)=xy(1−xy)
where x and y are the population frequencies of alleles A and B of a biallelic polymorphic site.

(At a triallelic site, p(exc)=xy(1−xy)+yz(1−yz)+xz(1−xz)+3xyz(1−xyz), where x, y and z are the respective population frequencies of alleles A, B and C.)

The probability of non-exclusion is
p(non-exc)=1−p(exc)

The cumulative probability of non-exclusion (representing the value obtained when n loci are used) is thus:
cum p(non-exc) = p(non-exc_1) p(non-exc_2) p(non-exc_3) . . . p(non-exc_n)

The cumulative probability of exclusion for n loci (representing the probability that a random male will be excluded) is:
cum p(exc) = 1 − cum p(non-exc).
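The exclusion calculation for biallelic sites can be sketched as follows, using the biallelic formula given above and taking y = 1 − x. The function names and the example frequencies are hypothetical.

```python
from math import prod


def p_exclusion_biallelic(x):
    """p(exc) = xy(1 - xy) for a biallelic site, with y = 1 - x."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    xy = x * (1.0 - x)
    return xy * (1.0 - xy)


def cumulative_exclusion(allele_freqs):
    """Cumulative probability of exclusion over several biallelic loci.

    allele_freqs holds the frequency x of allele A at each locus.
    """
    cum_non_exc = prod(1.0 - p_exclusion_biallelic(x) for x in allele_freqs)
    return 1.0 - cum_non_exc


# Single locus with x = y = 0.5: 0.25 * 0.75 = 0.1875
print(p_exclusion_biallelic(0.5))
# Ten hypothetical loci, each with x = 0.5
print(cumulative_exclusion([0.5] * 10))
```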

If several polymorphic loci are included in the analysis, the cumulative probability of exclusion of a random male is very high. This probability can be taken into account in assessing the liability of a putative father whose polymorphic marker set matches the child's polymorphic marker set attributable to his/her father.

C. Correlation of Polymorphisms with Phenotypic Traits

The polymorphisms of the invention may contribute to the phenotype of an organism in different ways. Some polymorphisms occur within a protein coding sequence and contribute to phenotype by affecting protein structure. The effect may be neutral, beneficial or detrimental, or both beneficial and detrimental, depending on the circumstances. For example, a heterozygous sickle cell mutation confers resistance to malaria, but a homozygous sickle cell mutation is usually lethal. Other polymorphisms occur in noncoding regions but may exert phenotypic effects indirectly via influence on replication, transcription, and translation. A single polymorphism may affect more than one phenotypic trait. Likewise, a single phenotypic trait may be affected by polymorphisms in different genes. Further, some polymorphisms predispose an individual to a distinct mutation that is causally related to a certain phenotype.

Phenotypic traits include diseases that have known but hitherto unmapped genetic components (e.g., agammaglobulinemia, diabetes insipidus, Lesch-Nyhan syndrome, muscular dystrophy, Wiskott-Aldrich syndrome, Fabry's disease, familial hypercholesterolemia, polycystic kidney disease, hereditary spherocytosis, von Willebrand's disease, tuberous sclerosis, hereditary hemorrhagic telangiectasia, familial colonic polyposis, Ehlers-Danlos syndrome, osteogenesis imperfecta, and acute intermittent porphyria). Phenotypic traits also include symptoms of, or susceptibility to, multifactorial diseases of which a component is or may be genetic, such as autoimmune diseases, inflammation, cancer, diseases of the nervous system, and infection by pathogenic microorganisms. Some examples of autoimmune diseases include rheumatoid arthritis, multiple sclerosis, diabetes (insulin-dependent and non-insulin-dependent), systemic lupus erythematosus and Graves disease. Some examples of cancers include cancers of the bladder, brain, breast, colon, esophagus, kidney, leukemia, liver, lung, oral cavity, ovary, pancreas, prostate, skin, stomach and uterus. Phenotypic traits also include characteristics such as longevity, appearance (e.g., baldness, obesity), strength, speed, endurance, fertility, and susceptibility or receptivity to particular drugs or therapeutic treatments.

The correlation of one or more polymorphisms with phenotypic traits can be facilitated by knowledge of the gene product of the wild type (reference) gene. The genes in which cSNPs of the present invention have been identified are genes which have been previously sequenced and characterized in one of their allelic forms.

Correlation is performed for a population of individuals who have been tested for the presence or absence of a phenotypic trait of interest and for polymorphic marker sets. To perform such analysis, the presence or absence of a set of polymorphisms (i.e. a polymorphic set) is determined for a set of the individuals, some of whom exhibit a particular trait, and some of whom lack the trait. The alleles of each polymorphism of the set are then reviewed to determine whether the presence or absence of a particular allele is associated with the trait of interest. Correlation can be performed by standard statistical methods such as a chi-squared test, and statistically significant correlations between polymorphic form(s) and phenotypic characteristics are noted. For example, it might be found that the presence of allele A1 at polymorphism A correlates with heart disease. As a further example, it might be found that the combined presence of allele A1 at polymorphism A and allele B1 at polymorphism B correlates with increased milk production of a farm animal.
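As one illustration of a standard statistical method for this kind of correlation, the sketch below runs a Pearson chi-squared test on a 2×2 table of allele presence against trait presence. The function name and the counts are hypothetical, no continuity correction is applied, and the p-value uses the closed form for one degree of freedom; it is a sketch under those assumptions, not a prescribed procedure.

```python
from math import erfc, sqrt


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (1 degree of freedom) on a 2x2 table.

    a: allele present, trait present    b: allele present, trait absent
    c: allele absent,  trait present    d: allele absent,  trait absent
    Returns (chi2, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up counts: allele A1 carriers (30 with trait, 20 without)
# versus non-carriers (10 with trait, 40 without).
chi2, p = chi_squared_2x2(30, 20, 10, 40)
print(chi2, p)
```

A small p-value suggests an association between the allele and the trait in the tested population; it does not by itself establish a causal role for the polymorphism.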

Such correlations can be exploited in several ways. In the case of a strong correlation between a set of one or more polymorphic forms and a disease for which treatment is available, detection of the polymorphic form set in a human or animal patient may justify immediate administration of treatment, or at least the institution of regular monitoring of the patient. Detection of a polymorphic form correlated with serious disease in a couple contemplating a family may also be valuable to the couple in their reproductive decisions. For example, the female partner might elect to undergo in vitro fertilization to avoid the possibility of transmitting such a polymorphism from her husband to her offspring. In the case of a weaker, but still statistically significant correlation between a polymorphic set and human disease, immediate therapeutic intervention or monitoring may not be justified. Nevertheless, the patient can be motivated to begin simple life-style changes (e.g., diet, exercise) that can be accomplished at little cost to the patient but confer potential benefits in reducing the risk of conditions to which the patient may have increased susceptibility by virtue of variant alleles. Identification of a polymorphic set in a patient correlated with enhanced receptiveness to one of several treatment regimes for a disease indicates that this treatment regime should be followed.

For animals and plants, correlations between characteristics and phenotype are useful for breeding for desired characteristics. For example, Beitz et al., U.S. Pat. No. 5,292,639 discuss use of bovine mitochondrial polymorphisms in a breeding program to improve milk production in cows. To evaluate the effect of mtDNA D-loop sequence polymorphism on milk production, each cow was assigned a value of 1 if variant or 0 if wildtype with respect to a prototypical mitochondrial DNA sequence at each of 17 locations considered. Each production trait was analyzed individually with the following animal model:
$$Y_{ijkpn} = \mu + YS_i + P_j + X_k + \beta_1 + \cdots + \beta_{17} + PE_n + a_n + e_p$$
where Y_ijkpn is the milk, fat, fat percentage, SNF, SNF percentage, energy concentration, or lactation energy record; μ is an overall mean; YS_i is the effect common to all cows calving in year-season i; X_k is the effect common to cows in either the high or average selection line; β_1 to β_17 are the binomial regressions of production record on mtDNA D-loop sequence polymorphisms; PE_n is the permanent environmental effect common to all records of cow n; a_n is the effect of animal n and is composed of the additive genetic contribution of sire and dam breeding values and a Mendelian sampling effect; and e_p is a random residual. It was found that eleven of the seventeen polymorphisms tested influenced at least one production trait. Bovines having the best polymorphic forms for milk production at these eleven loci are used as parents for breeding the next generation of the herd.

D. Genetic Mapping of Phenotypic Traits

The previous section concerns identifying correlations between phenotypic traits and polymorphisms that directly or indirectly contribute to those traits. The present section describes identification of a physical linkage between a genetic locus associated with a trait of interest and polymorphic markers that are not associated with the trait, but are in physical proximity with the genetic locus responsible for the trait and co-segregate with it. Such analysis is useful for mapping a genetic locus associated with a phenotypic trait to a chromosomal position, and thereby cloning gene(s) responsible for the trait. See Lander et al., Proc. Natl. Acad. Sci. (USA) 83, 7353-7357 (1986); Lander et al., Proc. Natl. Acad. Sci. (USA) 84, 2363-2367 (1987); Donis-Keller et al., Cell 51, 319-337 (1987); Lander et al., Genetics 121, 185-199 (1989). Genes localized by linkage can be cloned by a process known as positional cloning. See Wainwright, Med. J. Australia 159, 170-174 (1993); Collins, Nature Genetics 1, 3-6 (1992).

Linkage studies are typically performed on members of a family. Available members of the family are characterized for the presence or absence of a phenotypic trait and for a set of polymorphic markers. The distribution of polymorphic markers in an informative meiosis is then analyzed to determine which polymorphic markers co-segregate with a phenotypic trait. See, e.g., Kerem et al., Science 245, 1073-1080 (1989); Monaco et al., Nature 316, 842 (1985); Yamoka et al., Neurology 40, 222-226 (1990); Rossiter et al., FASEB Journal 5, 21-27 (1991).

Linkage is analyzed by calculation of LOD (log of the odds) values. A lod value is the relative likelihood of obtaining observed segregation data for a marker and a genetic locus when the two are located at a recombination fraction θ, versus the situation in which the two are not linked, and thus segregating independently (Thompson & Thompson, Genetics in Medicine (5th ed, W. B. Saunders Company, Philadelphia, 1991); Strachan, “Mapping the human genome” in The Human Genome (BIOS Scientific Publishers Ltd, Oxford), Chapter 4). A series of likelihood ratios is calculated at various recombination fractions (θ), ranging from θ=0.0 (coincident loci) to θ=0.50 (unlinked). Thus, the likelihood ratio at a given value of θ is the probability of the data if the loci are linked at θ divided by the probability of the data if the loci are unlinked. The computed likelihood ratio is usually expressed as its log10 (i.e., a lod score). For example, a lod score of 3 indicates 1000:1 odds against an apparent observed linkage being a coincidence. The use of logarithms allows data collected from different families to be combined by simple addition. Computer programs are available for the calculation of lod scores for differing values of θ (e.g., LIPED, MLINK (Lathrop, Proc. Nat. Acad. Sci. (USA) 81, 3443-3446 (1984))). For any particular lod score, a recombination fraction may be determined from mathematical tables. See Smith et al., Mathematical tables for research workers in human genetics (Churchill, London, 1961); Smith, Ann. Hum. Genet. 32, 127-150 (1968). The value of θ at which the lod score is the highest is considered to be the best estimate of the recombination fraction.

Positive lod score values suggest that the two loci are linked, whereas negative values suggest that linkage is less likely (at that value of θ) than the possibility that the two loci are unlinked. By convention, a combined lod score of +3 or greater (equivalent to greater than 1000:1 odds in favor of linkage) is considered definitive evidence that two loci are linked. Similarly, by convention, a negative lod score of −2 or less is taken as definitive evidence against linkage of the two loci being compared. Negative linkage data are useful in excluding a chromosome or a segment thereof from consideration. The search focuses on the remaining non-excluded chromosomal locations.
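The lod calculation and the +3 / −2 conventions described above can be illustrated with a minimal Python sketch. It assumes the simplest case of phase-known, fully informative meioses, where the likelihood at θ is θ^k (1−θ)^(n−k) for k recombinants in n meioses. The function names and the family counts are hypothetical, and the sketch is not a substitute for programs such as LIPED or MLINK.

```python
import math

def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """log10 of L(theta) / L(0.5) for phase-known, fully informative meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0.0 and 0.5")
    nonrecombinants = meioses - recombinants
    if theta == 0.0:
        # At theta = 0 any observed recombinant makes the data impossible.
        if recombinants > 0:
            return float("-inf")
        log_l_theta = 0.0
    else:
        log_l_theta = (recombinants * math.log10(theta)
                       + nonrecombinants * math.log10(1.0 - theta))
    log_l_unlinked = meioses * math.log10(0.5)
    return log_l_theta - log_l_unlinked

def combined_lod(families, theta):
    """Lod scores from different families are combined by simple addition."""
    return sum(lod_score(k, n, theta) for k, n in families)

def best_theta(families):
    """Scan theta = 0.00 ... 0.50 and return (theta, lod) at the maximum lod."""
    grid = [i / 100 for i in range(51)]
    return max(((t, combined_lod(families, t)) for t in grid),
               key=lambda pair: pair[1])

def interpret(lod: float) -> str:
    """Apply the conventional thresholds stated in the text."""
    if lod >= 3.0:
        return "linked"
    if lod <= -2.0:
        return "linkage excluded"
    return "inconclusive"

# Hypothetical data: (recombinants, informative meioses) per family.
families = [(1, 10), (0, 8), (2, 12)]
theta_hat, max_lod = best_theta(families)
print(theta_hat, round(max_lod, 2), interpret(max_lod))
```

Scanning a grid of θ values mirrors the text's description of computing likelihood ratios at a series of recombination fractions and taking the θ with the highest lod score as the best estimate.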

IV. Modified Polypeptides and Gene Sequences

The invention further provides variant forms of nucleic acids and corresponding proteins. The nucleic acids comprise one of the sequences described in the Table, column 5, in which the polymorphic position is occupied by one of the alternative bases for that position. Some nucleic acids encode full-length variant forms of proteins. Similarly, variant proteins have the prototypical amino acid sequences encoded by nucleic acid sequences shown in the Table, column 5, (read so as to be in-frame with the full-length coding sequence of which it is a component) except at an amino acid encoded by a codon including one of the polymorphic positions shown in the Table. That position is occupied by the amino acid coded by the corresponding codon in any of the alternative forms shown in the Table.

Variant genes can be expressed in an expression vector in which a variant gene is operably linked to a native or other promoter. Usually, the promoter is a eukaryotic promoter for expression in a mammalian cell. The transcription regulation sequences typically include a heterologous promoter and optionally an enhancer which is recognized by the host. The selection of an appropriate promoter, for example trp, lac, phage promoters, glycolytic enzyme promoters and tRNA promoters, depends on the host selected. Commercially available expression vectors can be used. Vectors can include host-recognized replication systems, amplifiable genes, selectable markers, host sequences useful for insertion into the host genome, and the like.

The means of introducing the expression construct into a host cell varies depending upon the particular construction and the target host. Suitable means include fusion, conjugation, transfection, transduction, electroporation or injection, as described in Sambrook, supra. A wide variety of host cells can be employed for expression of the variant gene, both prokaryotic and eukaryotic. Suitable host cells include bacteria such as E. coli, yeast, filamentous fungi, insect cells, mammalian cells, typically immortalized, e.g., mouse, CHO, human and monkey cell lines and derivatives thereof. Preferred host cells are able to process the variant gene product to produce an appropriate mature polypeptide. Processing includes glycosylation, ubiquitination, disulfide bond formation, general post-translational modification, and the like. As used herein, “gene product” includes mRNA, peptide and protein products.

The protein may be isolated by conventional means of protein biochemistry and purification to obtain a substantially pure product, i.e., 80, 95 or 99% free of cell component contaminants, as described in Jacoby, Methods in Enzymology Volume 104, Academic Press, New York (1984); Scopes, Protein Purification, Principles and Practice, 2nd Edition, Springer-Verlag, New York (1987); and Deutscher (ed), Guide to Protein Purification, Methods in Enzymology, Vol. 182 (1990). If the protein is secreted, it can be isolated from the supernatant in which the host cell is grown. If not secreted, the protein can be isolated from a lysate of the host cells.

The invention further provides transgenic nonhuman animals capable of expressing an exogenous variant gene and/or having one or both alleles of an endogenous variant gene inactivated. Expression of an exogenous variant gene is usually achieved by operably linking the gene to a promoter and optionally an enhancer, and microinjecting the construct into a zygote. See Hogan et al., “Manipulating the Mouse Embryo, A Laboratory Manual,” Cold Spring Harbor Laboratory. Inactivation of endogenous variant genes can be achieved by forming a transgene in which a cloned variant gene is inactivated by insertion of a positive selection marker. See Capecchi, Science 244, 1288-1292 (1989). The transgene is then introduced into an embryonic stem cell, where it undergoes homologous recombination with an endogenous variant gene. Mice and other rodents are preferred animals. Such animals provide useful drug screening systems.

In addition to substantially full-length polypeptides expressed by variant genes, the present invention includes biologically active fragments of the polypeptides, or analogs thereof, including organic molecules which simulate the interactions of the peptides. Biologically active fragments include any portion of the full-length polypeptide which confers a biological function on the variant gene product, including ligand binding, and antibody binding. Ligand binding includes binding by nucleic acids, proteins or polypeptides, small biologically active molecules, or large cellular structures.

Polyclonal and/or monoclonal antibodies that specifically bind to variant gene products but not to corresponding prototypical gene products are also provided. Antibodies can be made by injecting mice or other animals with the variant gene product or synthetic peptide fragments thereof. Monoclonal antibodies are screened as are described, for example, in Harlow & Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Press, New York (1988); Goding, Monoclonal antibodies, Principles and Practice (2d ed.) Academic Press, New York (1986). Monoclonal antibodies are tested for specific immunoreactivity with a variant gene product and lack of immunoreactivity to the corresponding prototypical gene product. These antibodies are useful in diagnostic assays for detection of the variant form, or as an active ingredient in a pharmaceutical composition.

V. Kits

The invention further provides kits comprising at least one allele-specific oligonucleotide as described herein. Often, the kits contain one or more pairs of allele-specific oligonucleotides hybridizing to different forms of a polymorphism. In some kits, the allele-specific oligonucleotides are provided immobilized to a substrate. For example, the same substrate can comprise allele-specific oligonucleotide probes for detecting at least 10, 100 or all of the polymorphisms shown in the Table. Optional additional components of the kit include, for example, restriction enzymes, reverse-transcriptase or polymerase, the substrate nucleoside triphosphates, means used to label (for example, an avidin-enzyme conjugate and enzyme substrate and chromogen if the label is biotin), and the appropriate buffers for reverse transcription, PCR, or hybridization reactions. Usually, the kit also contains instructions for carrying out the methods.

The thrombospondins are a family of extracellular matrix (ECM) glycoproteins that modulate many cell behaviors including adhesion, migration, and proliferation. Thrombospondins (also known as thrombin sensitive proteins or TSPs) are large molecular weight glycoproteins composed of three identical disulfide-linked polypeptide chains. TSPs are stored in the alpha-granules of platelets and secreted by a variety of mesenchymal and epithelial cells (Majack et al., Cell Membrane 3:57-77 (1987)). Platelets secrete TSPs when activated in the blood by physiological agonists such as thrombin. TSPs have lectin properties and a broad function in the regulation of fibrinolysis and as a component of the ECM, and are one of a group of ECM proteins which have adhesive properties. TSPs bind to fibronectin and fibrinogen (Lahav et al., Eur J Biochem 145:151-6 (1984)), and these proteins are known to be involved in platelet adhesion to substratum and platelet aggregation (Leung, J Clin Invest 74:1764-1772 (1986)).

Recent work has implicated TSPs in the response of cells to growth factors. Submitogenic doses of PDGF induce a rapid but transitory increase in TSP synthesis and secretion by rat aortic smooth muscle cells (Majack et al., J Biol Chem 101: 1059-70 (1985)). Responsiveness of TSP synthesis to PDGF in glial cells has also been shown (Asch et al., Proc Natl Acad Sci 83:2904-8 (1986)). TSP mRNA levels rise rapidly in response to PDGF (Majack et al., J. Biol Chem 262:8821-5 (1987)). TSPs act synergistically with epidermal growth factor to increase DNA synthesis in smooth muscle cells (Majack et al., Proc Natl Acad Sci 83:9050-4 (1986)), and monoclonal antibodies to TSPs inhibit smooth muscle cell proliferation (Majack et al., J Biol Chem 106:415-22 (1988)). TSPs modulate local adhesions in endothelial cells, and TSPs, particularly TSP-1 primarily derived from platelet granules, are known to be important activators of transforming growth factor beta-1 (TGFB-1) (Crawford et al., Cell 93:1159 (1998)) and appear to be a potential link between platelet thrombosis and the development of atherosclerosis.

To determine pivotal genes associated with premature coronary artery disease (CAD), we analyzed DNA from 347 patients with myocardial infarction (MI) or coronary revascularization before age 40 (men) or 45 (women) and from 422 general population controls. Cases were drawn (one per family) from a retrospective collection of sibling pairs with premature CAD. Controls were ascertained through random-digit dialing. Both cases and controls were Caucasian. A complete database of phenotypic and laboratory variables for the affected patients permitted logistic regression to control for age, diabetes, body mass index and gender.

SNPs in the thrombospondin (TSP) 4 and 1 genes emerged as being associated with premature CAD and MI. For CAD, 148 of 347 patients carried at least one copy of the TSP-4 variant compared with 142 of 422 control subjects; adjusted odds ratio 1.47, p=0.01. For premature MI, the association was even stronger: 91 of 187 cases vs. 142 of 422 controls had the variant; adjusted odds ratio 2.08, p=0.0003. The TSP-1 SNP was rare. Nonetheless, homozygosity for the variant allele gave an adjusted odds ratio of 9.5, p=0.04.
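As a rough illustration of how the carrier counts above relate to an odds ratio, the following Python sketch computes carrier percentages and the crude (unadjusted) odds ratio from the reported counts. The adjusted odds ratios quoted in the text come from logistic regression with covariates, which this sketch does not attempt to reproduce, so the crude values need not match them. The function names are hypothetical.

```python
def carrier_percent(carriers: int, total: int) -> float:
    """Percentage of individuals carrying at least one variant copy."""
    return 100.0 * carriers / total

def crude_odds_ratio(case_carriers: int, case_total: int,
                     control_carriers: int, control_total: int) -> float:
    """Unadjusted odds ratio from a 2x2 carrier-by-status table."""
    a = case_carriers                       # cases with the variant
    b = case_total - case_carriers          # cases without the variant
    c = control_carriers                    # controls with the variant
    d = control_total - control_carriers    # controls without the variant
    return (a * d) / (b * c)

# Counts for the TSP-4 variant as reported in the text.
cad_or = crude_odds_ratio(148, 347, 142, 422)
mi_or = crude_odds_ratio(91, 187, 142, 422)

print(f"CAD carriers: {carrier_percent(148, 347):.1f}%  "
      f"controls: {carrier_percent(142, 422):.1f}%  crude OR: {cad_or:.2f}")
print(f"MI carriers:  {carrier_percent(91, 187):.1f}%  "
      f"controls: {carrier_percent(142, 422):.1f}%  crude OR: {mi_or:.2f}")
```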

Specific reference nucleotide (SEQ ID NO: 1) and amino acid (SEQ ID NO: 2) sequences for TSP-1 are shown in FIGS. 1A-1D. Specific reference nucleotide (SEQ ID NO: 3) and amino acid (SEQ ID NO: 4) sequences for TSP-4 are shown in FIGS. 2A-2C. It is understood that the invention is not limited by these exemplified reference sequences, as variants of these sequences which differ at locations other than the SNP sites identified herein can also be utilized. The skilled artisan can readily determine the SNP sites in these other reference sequences which correspond to the SNP sites identified herein by aligning the sequence of interest with the reference sequences specifically disclosed herein, and programs for performing such alignments are commercially available. For example, the ALIGN program in the GCG software package can be used, utilizing a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4, for example.
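Once an alignment program has produced a gapped alignment, locating the corresponding SNP site in another reference sequence reduces to counting ungapped characters. The Python sketch below illustrates that step only; it assumes the two gapped strings have already been produced by an alignment tool, and the toy sequences and the function name are hypothetical rather than taken from SEQ ID NO: 1 or 3.

```python
from typing import Optional

def map_position(aligned_ref: str, aligned_other: str,
                 ref_pos: int) -> Optional[int]:
    """Map a 1-based position in the ungapped reference to the 1-based
    position in the other ungapped sequence, or None if it aligns to a gap."""
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    other_count = 0
    for ref_char, other_char in zip(aligned_ref, aligned_other):
        if ref_char != "-":
            ref_count += 1
        if other_char != "-":
            other_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            return other_count if other_char != "-" else None
    raise ValueError("ref_pos lies outside the reference sequence")

# Hypothetical gapped alignment ('-' marks a gap).
aligned_ref = "ACGT-ACGTA"
aligned_other = "A-GTTACGTA"

print(map_position(aligned_ref, aligned_other, 3))  # site present in both
print(map_position(aligned_ref, aligned_other, 2))  # site deleted in the other sequence
```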

Two SNPs have been specifically studied as described herein. The first (G334u4) is a change from A (reference nucleotide) to G (alternate or variant nucleotide) at nucleotide position 2210 of the nucleic acid sequence of TSP-1 (FIGS. 1A-1D), resulting in a missense amino acid mutation from asparagine (reference) to serine (alternate) at amino acid 700. The second SNP (G355u2) is a change from G (reference) to C (alternate) at nucleotide position 1186 of the nucleic acid sequence of TSP-4 (FIGS. 2A-2C), resulting in a missense amino acid alteration from alanine (reference) to proline (alternate) at amino acid 387. With respect to the G355u2 SNP, individuals with CAD carried at least one copy of the variant “C” allele more frequently than control individuals (43% as compared with 34%). With respect to the G355u2 SNP, individuals with MI carried at least one copy of the variant “C” allele more frequently than control individuals (49% as compared with 34%). With respect to the G334u4 SNP, individuals with CAD carried two copies of the variant “G” allele more frequently than control individuals (1.7% as compared with 0.2%). With respect to the G334u4 SNP, individuals with MI carried two copies of the variant “G” allele more frequently than control individuals (2% as compared with 0.2%).
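The missense consequences described above follow from the standard genetic code: an A-to-G change converts asparagine to serine only at the second codon position (AAT/AAC to AGT/AGC), and a G-to-C change converts alanine to proline only at the first codon position (GCN to CCN). The Python sketch below illustrates this with a partial codon table. The specific codons used (AAC and GCC) are assumptions chosen for illustration; the actual codons in SEQ ID NO: 1 and SEQ ID NO: 3 are not reproduced here.

```python
# Partial standard codon table covering only the four amino acids involved.
CODON_TABLE = {
    "AAT": "Asn", "AAC": "Asn",
    "AGT": "Ser", "AGC": "Ser",
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "CCT": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
}

def apply_snp(codon: str, offset: int, ref_base: str, alt_base: str) -> str:
    """Return the codon with the base at 0-based offset changed from
    ref_base to alt_base, checking that the reference base matches."""
    if codon[offset] != ref_base:
        raise ValueError(f"expected {ref_base} at offset {offset} of {codon}")
    return codon[:offset] + alt_base + codon[offset + 1:]

def missense_change(codon: str, offset: int, ref_base: str, alt_base: str):
    """Return (reference amino acid, variant amino acid) for the change."""
    variant_codon = apply_snp(codon, offset, ref_base, alt_base)
    return CODON_TABLE[codon], CODON_TABLE[variant_codon]

# Hypothetical codons; offsets are 0-based positions within the codon.
print(missense_change("AAC", 1, "A", "G"))  # A-to-G change (as in G334u4)
print(missense_change("GCC", 0, "G", "C"))  # G-to-C change (as in G355u2)
```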

As used herein, the term “polymorphism” refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population. A polymorphic marker or site is the locus at which divergence occurs. Preferred markers have at least two alleles, each occurring at frequency of greater than 1%, and more preferably greater than 10% or 20% of a selected population. A polymorphic locus may be as small as one base pair, in which case it is referred to as a single nucleotide polymorphism (SNP).

Thus, the invention relates to a method for predicting the likelihood that an individual will have a vascular disease, or for aiding in the diagnosis of a vascular disease, or predicting the likelihood of having altered symptomology associated with a vascular disease, comprising the steps of obtaining a DNA sample from an individual to be assessed and determining the nucleotide present at one or more of nucleotide positions 2210 of the TSP-1 gene or 1186 of the TSP-4 gene. In a preferred embodiment, the nucleotides present at both of these nucleotide positions are determined. In one embodiment the TSP-1 gene has the nucleotide sequence of SEQ ID NO: 1 and the TSP-4 gene has the nucleotide sequence of SEQ ID NO: 3. The presence of one or more of a G (the variant nucleotide) at position 2210 of SEQ ID NO: 1 or a C (the variant nucleotide) at position 1186 of SEQ ID NO: 3 indicates that the individual has a greater likelihood of having a vascular disease, or a greater likelihood of having severe symptomology associated with a vascular disease, than if that individual had the reference nucleotide at one or more of these positions. Conversely, the presence of one or more of an A (the reference nucleotide) at position 2210 of SEQ ID NO: 1 or a G (the reference nucleotide) at position 1186 of SEQ ID NO: 3 indicates that the individual has a reduced likelihood of having a vascular disease or a likelihood of having reduced symptomology associated with a vascular disease than if that individual had the variant nucleotide at one or more of these positions.
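The determination described above can be summarized as a lookup of the observed bases against the reference and variant nucleotides at each site. The following Python sketch shows one way to tabulate variant allele copies from a diploid genotype. The data structure, the function names and the two-letter genotype strings are assumptions made for illustration; the sketch only counts alleles and does not compute a likelihood of disease.

```python
# Reference and variant nucleotides at the two sites named in the text.
SITES = {
    "TSP-1": {"position": 2210, "reference": "A", "variant": "G"},
    "TSP-4": {"position": 1186, "reference": "G", "variant": "C"},
}

def variant_copies(gene: str, genotype: str) -> int:
    """Count variant alleles (0, 1 or 2) in a two-letter genotype such as 'AG'."""
    site = SITES[gene]
    allowed = {site["reference"], site["variant"]}
    bases = [base.upper() for base in genotype]
    if len(bases) != 2 or any(base not in allowed for base in bases):
        raise ValueError(f"unexpected genotype {genotype!r} for {gene}")
    return bases.count(site["variant"])

def summarize(genotypes: dict) -> dict:
    """Map each gene to the number of variant copies observed."""
    return {gene: variant_copies(gene, gt) for gene, gt in genotypes.items()}

def carries_variant(genotypes: dict) -> bool:
    """True if a variant nucleotide is present at one or more of the sites."""
    return any(count > 0 for count in summarize(genotypes).values())

# Hypothetical individual: reference homozygote at TSP-1, heterozygote at TSP-4.
sample = {"TSP-1": "AA", "TSP-4": "GC"}
print(summarize(sample), carries_variant(sample))
```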

In a particular embodiment, the individual is an individual at risk for development of a vascular disease. In another embodiment the individual exhibits clinical symptomology associated with a vascular disease. In one embodiment, the individual has been clinically diagnosed as having a vascular disease. Vascular diseases include, but are not limited to, atherosclerosis, coronary heart disease, myocardial infarction (MI), stroke, peripheral vascular diseases, venous thromboembolism and pulmonary embolism. In preferred embodiments, the vascular disease is CAD or MI.

The genetic material to be assessed can be obtained from any nucleated cell from the individual. For assay of genomic DNA, virtually any biological sample (other than pure red blood cells) is suitable. For example, convenient tissue samples include whole blood, semen, saliva, tears, urine, fecal material, sweat, skin and hair. For assay of cDNA or mRNA, the tissue sample must be obtained from a tissue or organ in which the target nucleic acid is expressed.

Many of the methods described herein require amplification of DNA from target samples. This can be accomplished by e.g., PCR. See generally PCR Technology: Principles and Applications for DNA Amplification (ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (eds. Innis, et al., Academic Press, San Diego, Calif., 1990); Mattila et al., Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Pat. No. 4,683,202.

Other suitable amplification methods include the ligase chain reaction (LCR) (see Wu and Wallace, Genomics 4, 560 (1989); Landegren et al., Science 241, 1077 (1988)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86, 1173 (1989)), self-sustained sequence replication (Guatelli et al., Proc. Nat. Acad. Sci. USA, 87, 1874 (1990)) and nucleic acid based sequence amplification (NASBA). The latter two amplification methods involve isothermal reactions based on isothermal transcription, which produce both single stranded RNA (ssRNA) and double stranded DNA (dsDNA) as the amplification products in a ratio of about 30 or 100 to 1, respectively.

The nucleotide which occupies the polymorphic site of interest (e.g., nucleotide position 2210 in TSP-1 and/or nucleotide position 1186 in TSP-4) can be identified by a variety of methods, such as Southern analysis of genomic DNA; direct mutation analysis by restriction enzyme digestion; Northern analysis of RNA; denaturing high pressure liquid chromatography (DHPLC); gene isolation and sequencing; hybridization of an allele-specific oligonucleotide with amplified gene products; and single base extension (SBE). In a preferred embodiment, determination of the allelic form of TSP is carried out using SBE-FRET methods as described herein, or using chip-based oligonucleotide arrays as described herein.

The invention also relates to a method for predicting the likelihood that an individual will have a vascular disease, or for aiding in the diagnosis of a vascular disease, or predicting the likelihood of having altered symptomology associated with a vascular disease, comprising the steps of obtaining a biological sample comprising TSP-1 and/or TSP-4 protein or relevant portion thereof from an individual to be assessed and determining the amino acid present at one or more of amino acid positions 700 of the TSP-1 gene product (e.g., as exemplified by SEQ ID NO: 2) or 387 of the TSP-4 gene product (e.g., as exemplified by SEQ ID NO: 4). In a preferred embodiment, the amino acids present at both of these amino acid positions are determined. As used herein, the term “relevant portion” of the TSP-1 and TSP-4 proteins is intended to encompass any portion of the protein which comprises the polymorphic amino acid positions. The presence of one or more of a serine (the variant amino acid) at position 700 of SEQ ID NO: 2, or a proline (the variant amino acid) at position 387 of SEQ ID NO: 4 indicates that the individual has a greater likelihood of having a vascular disease, or a greater likelihood of having severe symptomology associated with a vascular disease, than if that individual had the reference amino acid at one or more of these positions. Conversely, the presence of one or more of an asparagine (the reference amino acid) at position 700 of SEQ ID NO: 2, or an alanine (the reference amino acid) at position 387 of SEQ ID NO: 4 indicates that the individual has a reduced likelihood of having a vascular disease or a likelihood of having reduced symptomology associated with a vascular disease, than if that individual had the variant amino acid at one or more of these positions.

In a particular embodiment, the individual is an individual at risk for development of a vascular disease. In another embodiment the individual exhibits clinical symptomology associated with a vascular disease. In one embodiment, the individual has been clinically diagnosed as having a vascular disease.

In this embodiment of the invention, the biological sample contains protein molecules from the test subject. In vitro techniques for detection of protein include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence. Furthermore, in vivo techniques for detection of protein include introducing into a subject a labeled anti-protein antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. Polyclonal and/or monoclonal antibodies that specifically bind to variant gene products but not to corresponding reference gene products, and vice versa, are also provided. Antibodies can be made by injecting mice or other animals with the variant gene product or synthetic peptide fragments thereof comprising the variant portion. Monoclonal antibodies are screened as are described, for example, in Harlow & Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Press, New York (1988); Goding, Monoclonal antibodies, Principles and Practice (2d ed.) Academic Press, New York (1986). Monoclonal antibodies are tested for specific immunoreactivity with a variant gene product and lack of immunoreactivity to the corresponding prototypical gene product. These antibodies are useful in diagnostic assays for detection of the variant form, or as an active ingredient in a pharmaceutical composition.

The polymorphisms of the invention may be associated with vascular disease in different ways. The polymorphisms may exert phenotypic effects indirectly via influence on replication, transcription, and translation. Additionally, the described polymorphisms may predispose an individual to a distinct mutation that is causally related to a certain phenotype, such as susceptibility or resistance to vascular disease and related disorders. The discovery of the polymorphisms and their correlation with CAD and MI facilitates biochemical analysis of the variant and reference forms and the development of assays to characterize the variant and reference forms and to screen for pharmaceutical agents that interact directly with one or another form of the protein.

Alternatively, these particular polymorphisms may belong to a group of two or more polymorphisms in the TSP gene(s) which contributes to the presence, absence or severity of vascular disease. An assessment of other polymorphisms within the TSP gene(s) can be undertaken, and the separate and combined effects of these polymorphisms, as well as alterations in other, distinct genes, on the vascular disease phenotype can be assessed.

Correlation between a particular phenotype, e.g., the CAD or MI phenotype, and the presence or absence of a particular allele is performed for a population of individuals who have been tested for the presence or absence of the phenotype. Correlation can be performed by standard statistical methods such as a Chi-squared test and statistically significant correlations between polymorphic form(s) and phenotypic characteristics are noted. This correlation can be exploited in several ways. In the case of a strong correlation between a particular polymorphic form, e.g., the variant allele for TSP-1 and/or TSP-4, and a disease for which treatment is available, detection of the polymorphic form in an individual may justify immediate administration of treatment, or at least the institution of regular monitoring of the individual. Detection of a polymorphic form correlated with a disorder in a couple contemplating a family may also be valuable to the couple in their reproductive decisions. For example, the female partner might elect to undergo in vitro fertilization to avoid the possibility of transmitting such a polymorphism from her husband to her offspring. In the case of a weaker, but still statistically significant correlation between a polymorphic form and a particular disorder, immediate therapeutic intervention or monitoring may not be justified. Nevertheless, the individual can be motivated to begin simple life-style changes (e.g., diet modification, therapy or counseling) that can be accomplished at little cost to the individual but confer potential benefits in reducing the risk of conditions to which the individual may have increased susceptibility by virtue of the particular allele. Furthermore, identification of a polymorphic form correlated with enhanced receptiveness to one of several treatment regimes for a disorder indicates that this treatment regimen should be followed for the individual in question.
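As an illustration of the Chi-squared test mentioned above, the Python sketch below applies the standard 2x2 formula to the TSP-4 carrier counts reported earlier for CAD cases and controls (148 of 347 and 142 of 422). It is a minimal sketch that assumes no continuity correction and one degree of freedom, for which the p-value can be obtained from the complementary error function. It is an unadjusted test and is not the covariate-adjusted logistic regression reported in the text; the function name is hypothetical.

```python
import math

def chi_squared_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared statistic and p-value (1 degree of freedom)
    for the 2x2 table [[a, b], [c, d]], without continuity correction."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Rows: CAD cases, controls. Columns: variant carriers, non-carriers.
cases_with, cases_without = 148, 347 - 148
controls_with, controls_without = 142, 422 - 142

chi2, p_value = chi_squared_2x2(cases_with, cases_without,
                                controls_with, controls_without)
print(f"chi-squared = {chi2:.2f}, p = {p_value:.4f}")
```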

Furthermore, it may be possible to identify a physical linkage between a genetic locus associated with a trait of interest (e.g., CAD or MI) and polymorphic markers that are or are not associated with the trait, but are in physical proximity with the genetic locus responsible for the trait and co-segregate with it. Such analysis is useful for mapping a genetic locus associated with a phenotypic trait to a chromosomal position, and thereby cloning gene(s) responsible for the trait. See Lander et al., Proc. Natl. Acad. Sci. (USA) 83, 7353-7357 (1986); Lander et al., Proc. Natl. Acad. Sci. (USA) 84, 2363-2367 (1987); Donis-Keller et al., Cell 51, 319-337 (1987); Lander et al., Genetics 121, 185-199 (1989). Genes localized by linkage can be cloned by a process known as positional cloning. See Wainwright, Med. J. Australia 159, 170-174 (1993); Collins, Nature Genetics 1, 3-6 (1992). Linkage studies are discussed in more detail above.

In another embodiment, the invention relates to pharmaceutical compositions comprising a reference TSP-1 and/or TSP-4 gene or gene product for use in the treatment of vascular disease, e.g., CAD and MI. As used herein, a reference TSP gene product is intended to mean gene products which are encoded by the reference allele of the TSP gene. In addition to substantially full-length polypeptides expressed by the genes, the present invention includes biologically active fragments of the polypeptides, or analogs thereof, including organic molecules which simulate the interactions of the peptides. Biologically active fragments include any portion of the full-length polypeptide which confers a biological function on the variant gene product, including ligand binding, and antibody binding. Ligand binding includes binding by nucleic acids, proteins or polypeptides, small biologically active molecules, or large cellular structures.

For instance, the polypeptide or protein, or fragment thereof, of the present invention can be formulated with a physiologically acceptable medium to prepare a pharmaceutical composition. The particular physiological medium may include, but is not limited to, water, buffered saline, polyols (e.g., glycerol, propylene glycol, liquid polyethylene glycol) and dextrose solutions. The optimum concentration of the active ingredient(s) in the chosen medium can be determined empirically, according to procedures well known to medicinal chemists, and will depend on the ultimate pharmaceutical formulation desired. Methods of introduction of exogenous peptides at the site of treatment include, but are not limited to, intradermal, intramuscular, intraperitoneal, intravenous, subcutaneous, oral and intranasal. Other suitable methods of introduction can also include rechargeable or biodegradable devices and slow release polymeric devices. The pharmaceutical compositions of this invention can also be administered as part of a combinatorial therapy with other agents and treatment regimens.

The invention further pertains to compositions, e.g., vectors, comprising a nucleotide sequence encoding reference or variant TSP-1 and/or TSP-4 gene products. For example, reference genes can be expressed in an expression vector in which a reference gene is operably linked to a native or other promoter. Usually, the promoter is a eukaryotic promoter for expression in a mammalian cell. The transcription regulation sequences typically include a heterologous promoter and optionally an enhancer which is recognized by the host. The selection of an appropriate promoter, for example trp, lac, phage promoters, glycolytic enzyme promoters and tRNA promoters, depends on the host selected. Commercially available expression vectors can be used. Vectors can include host-recognized replication systems, amplifiable genes, selectable markers, host sequences useful for insertion into the host genome, and the like.

The means of introducing the expression construct into a host cell varies depending upon the particular construction and the target host. Suitable means include fusion, conjugation, transfection, transduction, electroporation or injection, as described in Sambrook, supra. A wide variety of host cells can be employed for expression of the variant gene, both prokaryotic and eukaryotic. Suitable host cells include bacteria such as E. coli, yeast, filamentous fungi, insect cells, mammalian cells, typically immortalized, e.g., mouse, CHO, human and monkey cell lines and derivatives thereof. Preferred host cells are able to process the variant gene product to produce an appropriate mature polypeptide. Processing includes glycosylation, ubiquitination, disulfide bond formation, general post-translational modification, and the like.

It is also contemplated that cells can be engineered to express the reference allele of the invention by gene therapy methods. For example, DNA encoding the reference TSP gene product, or an active fragment or derivative thereof, can be introduced into an expression vector, such as a viral vector, and the vector can be introduced into appropriate cells in an animal. In such a method, the cell population can be engineered to inducibly or constitutively express active reference TSP gene product. In a preferred embodiment, the vector is delivered to the bone marrow, for example as described in Corey et al. (Science 244:1275-1281 (1989)).

The invention further relates to the use of compositions (i.e., agonists) which enhance or increase the activity of the reference (or variant) TSP (e.g., TSP-1 or TSP-4) gene product, or a functional portion thereof, for use in the treatment of vascular disease. The invention also relates to the use of compositions (i.e., antagonists) which reduce or decrease the activity of the variant (or reference) TSP (e.g., TSP-1 or TSP-4) gene product, or a functional portion thereof, for use in the treatment of vascular disease.

The invention also relates to constructs which comprise a vector into which a sequence of the invention has been inserted in a sense or antisense orientation. For example, a vector comprising a nucleotide sequence which is antisense to the variant TSP-1 or TSP-4 allele may be used as an antagonist of the activity of the TSP-1 or TSP-4 variant allele. Alternatively, a vector comprising a nucleotide sequence of the TSP-1 or TSP-4 reference allele may be used therapeutically to treat vascular diseases. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a “plasmid”, which refers to a circular double stranded DNA loop into which additional DNA segments can be ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors, expression vectors, are capable of directing the expression of genes to which they are operably linked. In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids (vectors). However, the invention is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses) that serve equivalent functions.

Preferred recombinant expression vectors of the invention comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell. This means that the recombinant expression vectors include one or more regulatory sequences, selected on the basis of the host cells to be used for expression, which are operably linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, “operably linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). The term “regulatory sequence” is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Such regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cell and those which direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc.

The expression vectors of the invention can be introduced into host cells to thereby produce proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein. The recombinant expression vectors of the invention can be designed for expression of a polypeptide of the invention in prokaryotic or eukaryotic cells, e.g., bacterial cells such as E. coli, insect cells (using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, supra. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

Another aspect of the invention pertains to host cells into which a recombinant expression vector of the invention has been introduced. The terms “host cell” and “recombinant host cell” are used interchangeably herein. It is understood that such terms refer not only to the particular subject cell but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein. A host cell can be any prokaryotic or eukaryotic cell. For example, a nucleic acid of the invention can be expressed in bacterial cells (e.g., E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in the art.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation. Suitable methods for transforming or transfecting host cells can be found in Sambrook, et al. (supra), and other laboratory manuals.

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, can be used to produce (i.e., express) a polypeptide of the invention. Accordingly, the invention further provides methods for producing a polypeptide using the host cells of the invention. In one embodiment, the method comprises culturing the host cell of the invention (into which a recombinant expression vector encoding a polypeptide of the invention has been introduced) in a suitable medium such that the polypeptide is produced. In another embodiment, the method further comprises isolating the polypeptide from the medium or the host cell.

The host cells of the invention can also be used to produce nonhuman transgenic animals. For example, in one embodiment, a host cell of the invention is a fertilized oocyte or an embryonic stem cell into which a nucleic acid of the invention has been introduced. Such host cells can then be used to create non-human transgenic animals in which exogenous nucleotide sequences have been introduced into their genome or homologous recombinant animals in which endogenous nucleotide sequences have been altered. Such animals are useful for studying the function and/or activity of the nucleotide sequence and polypeptide encoded by the sequence and for identifying and/or evaluating modulators of their activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, etc. A transgene is exogenous DNA which is integrated into the genome of a cell from which a transgenic animal develops and which remains in the genome of the mature animal, thereby directing the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal. As used herein, an “homologous recombinant animal” is a non-human animal, preferably a mammal, more preferably a mouse, in which an endogenous gene has been altered by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

A transgenic animal of the invention can be created by introducing a nucleic acid of the invention into the male pronuclei of a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the oocyte to develop in a pseudopregnant female foster animal. The sequence can be introduced as a transgene into the genome of a non-human animal. Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to the transgene to direct expression of a polypeptide in particular cells. Methods for generating transgenic animals via embryo manipulation and microinjection, particularly animals such as mice, have become conventional in the art and are described, for example, in U.S. Pat. Nos. 4,736,866 and 4,870,009, U.S. Pat. No. 4,873,191 and in Hogan, Manipulating the Mouse Embryo (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar methods are used for production of other transgenic animals. A transgenic founder animal can be identified based upon the presence of the transgene in its genome and/or expression of mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying the transgene can further be bred to other transgenic animals carrying other transgenes.

The invention also relates to the use of the variant and reference gene products to guide efforts to identify the causative mutation for vascular diseases or to identify or synthesize agents useful in the treatment of vascular diseases, e.g., CAD and MI. Amino acids that are essential for function can be identified by methods known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham et al., Science, 244:1081-1085 (1989)). The latter procedure introduces single alanine mutations at every residue in the molecule. The resulting mutant molecules are then tested for biological activity in vitro. Sites that are critical for polypeptide activity can also be determined by structural analysis such as crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith et al., J. Mol. Biol., 224:899-904 (1992); de Vos et al. Science, 255:306-312 (1992)).

Another aspect of the invention pertains to monitoring the influence of agents (e.g., drugs, compounds) on the expression or activity of proteins of the invention in clinical trials. An exemplary method for detecting the presence or absence of proteins or nucleic acids of the invention in a biological sample involves obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting the protein, or nucleic acid (e.g., mRNA, genomic DNA) that encodes the protein, such that the presence of the protein or nucleic acid is detected in the biological sample. A preferred agent for detecting mRNA or genomic DNA is a labeled nucleic acid probe capable of hybridizing to mRNA or genomic DNA sequences described herein, preferably in an allele-specific manner. The nucleic acid probe can be, for example, a full-length nucleic acid, or a portion thereof, such as an oligonucleotide of at least 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to appropriate mRNA or genomic DNA. Other suitable probes for use in the diagnostic assays of the invention are described herein.

The invention also encompasses kits for detecting the presence of proteins or nucleic acid molecules of the invention in a biological sample. For example, the kit can comprise a labeled compound or agent (e.g., nucleic acid probe) capable of detecting protein or mRNA in a biological sample; means for determining the amount of protein or mRNA in the sample; and means for comparing the amount of protein or mRNA in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect protein or nucleic acid.

The following Examples are offered for the purpose of illustrating the present invention and are not to be construed to limit the scope of this invention. The teachings of all references cited herein are hereby incorporated herein by reference.

EXAMPLES

Identification of Single Nucleotide Polymorphisms

The polymorphisms shown in the Table were identified by resequencing of target sequences from individuals of diverse ethnic and geographic backgrounds by hybridization to probes immobilized to microfabricated arrays. The strategy and principles for design and use of such arrays are generally described in WO 95/11995.

A typical probe array used in this analysis has two groups of four sets of probes that respectively tile both strands of a reference sequence. A first probe set comprises a plurality of probes exhibiting perfect complementarity with one of the reference sequences. Each probe in the first probe set has an interrogation position that corresponds to a nucleotide in the reference sequence. That is, the interrogation position is aligned with the corresponding nucleotide in the reference sequence, when the probe and reference sequence are aligned to maximize complementarity between the two. For each probe in the first set, there are three corresponding probes from three additional probe sets. Thus, there are four probes corresponding to each nucleotide in the reference sequence. The probes from the three additional probe sets are identical to the corresponding probe from the first probe set except at the interrogation position, which occurs in the same position in each of the four corresponding probes from the four probe sets, and is occupied by a different nucleotide in the four probe sets. In the present analysis, probes were 25 nucleotides long. Arrays tiled for multiple different reference sequences were included on the same substrate.
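The tiling scheme described above can be illustrated with a minimal sketch. It assumes the interrogation position sits at the center of each 25-nucleotide probe, which the text does not state, and the function names (`reverse_complement`, `tile_probes`, `tile_both_strands`) are hypothetical. It is not the array design software actually used.

```python
BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def tile_probes(reference, probe_len=25):
    """Yield (position, probes) for every nucleotide that can be interrogated.

    `probes` maps each possible target base (A, C, G, T) to the probe that
    would be perfectly complementary if that base were present at `position`.
    The probe keyed by the actual reference base is the "first probe set"
    member; the other three differ from it only at the interrogation position.
    Assumption: the interrogation position is the central base of the probe.
    """
    half = probe_len // 2
    for center in range(half, len(reference) - half):
        window = reference[center - half:center + half + 1]
        probes = {}
        for base in BASES:
            target = window[:half] + base + window[half + 1:]
            probes[base] = reverse_complement(target)
        yield center, probes


def tile_both_strands(reference, probe_len=25):
    """Tile the reference strand and its reverse complement (two groups)."""
    return {
        "forward": list(tile_probes(reference, probe_len)),
        "reverse": list(tile_probes(reverse_complement(reference), probe_len)),
    }


if __name__ == "__main__":
    # Made-up 40-nt sequence used purely for illustration.
    ref = "ACGTTGCAAGCTTAGGCTAACGTTAGCCATGGATCCGATC"
    position, probes = next(tile_probes(ref))
    print("interrogated position:", position, "reference base:", ref[position])
    for base, probe in probes.items():
        print(base, probe)
```

In this sketch each interrogated nucleotide yields four probes per strand, matching the "four probes corresponding to each nucleotide" in the description.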

Publicly available sequences for a given gene were assembled into Gap4, which can be found on the world wide web at biozentrum.unibas.ch/biocomp/staden/Overview.html. PCR primers covering each exon were designed using Primer 3, which can be found on the world wide web at genome.wi.mit.edu/cgi-bin/primer/primer3.cgi. Primers were not designed in regions where there were sequence discrepancies between reads. Genomic DNA was amplified in at least 50 individuals using 2.5 pmol each primer, 1.5 mM MgCl2, 100 μM dNTPs, 0.75 μM AmpliTaq GOLD polymerase, and 19 ng DNA in a 15 μl reaction. Reactions were assembled using a PACKARD MultiPROBE robotic pipetting station and then put in MJ 96-well tetrad thermocyclers (96° C. for 10 minutes, followed by 35 cycles of 96° C. for 30 seconds, 59° C. for 2 minutes, and 72° C. for 2 minutes). A subset of the PCR assays for each individual were run on 3% NuSieve gels in 0.5×TBE to confirm that the reaction worked.
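As a quick check on the reaction conditions above, the following sketch converts the stated amounts into final concentrations and sums the programmed hold times of the cycling protocol. The helper names are hypothetical, and the total ignores ramp times between temperatures, which the text does not give.

```python
def final_concentration_uM(amount_pmol, volume_ul):
    """pmol per microliter is numerically equal to micromolar."""
    return amount_pmol / volume_ul


def amount_pmol(concentration_uM, volume_ul):
    """Amount of a component present at a given final concentration."""
    return concentration_uM * volume_ul


def cycling_hold_minutes(initial_min, cycles, step_seconds):
    """Total programmed hold time, excluding temperature ramping."""
    return initial_min + cycles * sum(step_seconds) / 60


if __name__ == "__main__":
    # 2.5 pmol of each primer in a 15 ul reaction
    print("primer, uM:", round(final_concentration_uM(2.5, 15), 3))
    # 100 uM dNTPs in a 15 ul reaction
    print("dNTPs, pmol:", amount_pmol(100, 15))
    # 10 min initial step, then 35 cycles of 30 s, 2 min and 2 min
    print("hold time, min:", cycling_hold_minutes(10, 35, [30, 120, 120]))
```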

For a given DNA, 5 μl (about 50 ng) of each PCR or RT-PCR product were pooled (final volume = 150-200 μl). The products were purified using QiaQuick PCR purification from Qiagen. The samples were eluted once in 35 μl sterile water and 4 μl 10× One-Phor-All buffer (Pharmacia). The pooled samples were digested with 0.2 U DNaseI (Promega) for 10 minutes at 37° C. and then labeled with 0.5 mmols biotin-N-6-ddATP and 15 U Terminal Transferase (GibcoBRL Life Technology) for 60 minutes at 37° C. Both fragmentation and labeling reactions were terminated by incubating the pooled sample for 15 minutes at 100° C.

Low-density DNA chips (Affymetrix, Calif.) were hybridized following the manufacturer's instructions. Briefly, the hybridization cocktail consisted of 3M TMACl, 10 mM Tris pH 7.8, 0.01% Triton X-100, 100 mg/ml herring sperm DNA (Gibco BRL), 200 pM control biotin-labeled oligo. The processed PCR products were denatured for 7 minutes at 100° C. and then added to prewarmed (37° C.) hybridization solution. The chips were hybridized overnight at 44° C. Chips were washed in 1×SSPET and 6×SSPET followed by staining with 2 μg/ml SARPE and 0.5 mg/ml acetylated BSA in 200 μl of 6×SSPET for 8 minutes at room temperature. Chips were scanned using a Molecular Dynamics scanner.

Chip image files were analyzed using Ulysses (Affymetrix, Calif.) which uses four algorithms to identify potential polymorphisms. Candidate polymorphisms were visually inspected and assigned a confidence value: high confidence candidates displayed all three genotypes, while likely candidates showed only two genotypes (homozygous for reference sequence and heterozygous for reference and variant). Some of the candidate polymorphisms were confirmed by ABI sequencing. Identified polymorphisms were compared to several databases to determine if they were novel. Results are shown in the Table.
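The confidence rule described above can be written as a small sketch. This is only an illustration of the stated rule, not the Ulysses software or its four algorithms. The genotype labels and the "unclassified" fallback are assumptions introduced here, since the text defines only the "high confidence" and "likely" categories.

```python
# Hypothetical genotype labels; the text does not prescribe a notation.
HOM_REF = "ref/ref"
HET = "ref/alt"
HOM_ALT = "alt/alt"


def classify_candidate(genotypes_observed):
    """Assign a confidence label to a candidate polymorphism.

    High confidence: all three genotypes were observed among individuals.
    Likely: only homozygous-reference and heterozygous genotypes were seen.
    Anything else falls into a fallback category not defined in the text.
    """
    observed = set(genotypes_observed)
    if {HOM_REF, HET, HOM_ALT} <= observed:
        return "high confidence"
    if observed == {HOM_REF, HET}:
        return "likely"
    return "unclassified"


if __name__ == "__main__":
    print(classify_candidate([HOM_REF, HET, HOM_ALT, HOM_REF]))
    print(classify_candidate([HOM_REF, HET, HOM_REF]))
```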

Association of Thrombospondin Gene Polymorphisms with Vascular Disease

To identify genes associated with premature coronary artery disease (CAD), we analyzed DNA from 347 patients with MI or coronary revascularization before age 40 (men) or 45 (women) and 422 general population controls. Cases were drawn (one per family) from a retrospective collection of sibling pairs with premature CAD. Controls were ascertained through random-digit dialing. Both cases and controls were Caucasian. A complete database of phenotypic and laboratory variables for the affected patients allowed logistic regression to be used to control for age, diabetes, body mass index and gender.

SNPs in the thrombospondin (TSP) 4 and TSP-1 genes emerged as important markers associated with premature CAD and MI. For CAD, 148 of 347 patients carried at least one copy of the TSP-4 variant compared with 142 of 422 control subjects (adjusted odds ratio 1.47, p=0.01). For premature MI, the association was even stronger: 91 of 187 cases vs. 142 of 422 controls had the variant (adjusted odds ratio 2.08, p=0.0003). The TSP-1 SNP was rare. Nonetheless, homozygosity for the variant allele gave an adjusted odds ratio of 9.5, p=0.04.
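For orientation, the crude (unadjusted) odds ratios implied by the carrier counts above can be computed directly. This sketch is not the logistic regression used in the study: it ignores the covariates (age, diabetes, body mass index, gender), so its results need not match the adjusted values reported. The function name is hypothetical.

```python
def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Crude odds ratio from a 2x2 table of carrier status by case status."""
    case_noncarriers = case_total - case_carriers
    control_noncarriers = control_total - control_carriers
    return (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers
    )


if __name__ == "__main__":
    # TSP-4 variant carriers: premature CAD cases vs. controls
    print("CAD crude OR:", round(odds_ratio(148, 347, 142, 422), 2))
    # TSP-4 variant carriers: premature MI cases vs. controls
    print("MI crude OR:", round(odds_ratio(91, 187, 142, 422), 2))
```

With these counts the crude odds ratio is about 1.47 for CAD and about 1.87 for MI; the reported adjusted value for MI (2.08) is higher because of the covariate adjustment.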

Poly ID | WIAF ID | Genbank or TIGR Accession Number | Position in Sequence | Gene Description | Flanking Seq | Mutation Type | Ref NT | Alt NT | Ref AA | Alt AA
AT3a7 WIAF-13246 U11270 11918 AT3, CTGCAGGAGT[G/A]GCTGGATGAA N G A W *
antithrombin III
DRD5u22 WIAF-12913 M67439 310 DRD1, CATCTGGACC[C/T]TGCTGGGCAA S C T L L
dopamine receptor
D1
DRD5u23 WIAF-12914 M67439 332 DRD1, GTGCTGGTGT[G/C]CGCAGCCATC M G C C S
dopamine receptor
D1
DRD5u24 WIAF-12915 M67439 369 DRD1, TGCGCGCCAA[C/G]ATGACCAACG M C G N K
dopamine receptor
D1
DRD5u25 WIAF-12916 M67439 522 DRD1, TGTGCTCCAC[T/C]GCCTCCATCC S T C T T
dopamine receptor
D1
DRD5u26 WIAF-12917 M67439 953 DRD1, GCAGAGCACG[C/T]GCAGAGCTGC M C T A V
dopamine receptor
D1
DRD5u27 WIAF-12918 M67439 635 DRD1, ATGGTCGGCC[T/C]GGCATGGACC M T C L P
dopamine receptor
D1
DRD5u28 WIAF-12919 M67439 606 DRD1, GCAAGATGAC[T/C]CAGCGCATGG S T C T T
dopamine receptor
D1
DRD5u29 WIAF-12920 M67439 845 DRD1, TCGCTCATCA[G/A]CTTCTACATC M G A S N
dopamine receptor
D1
DRD5u30 WIAF-12921 M67439 720 DRD1, CGGGCCGGCT[G/T]GACCTGCCAA S G T L L
dopamine receptor
D1
DRD5u31 WIAF-12922 M67439 1044 DRD1, AGACCCTGTC[G/A]GTGATCATGG S G A S S
dopamine receptor
D1
DRD5u32 WIAF-12923 M67439 766 DRD1, GGAGGAGGAC[T/G]TTTGGGAGCC M T G F V
dopamine receptor
D1
DRD5u33 WIAF-12924 M67439 777 DRD1, TTTCCGAOCC[C/T]GACCTCAATG S C T P P
dopamine receptor
D1
DRD5u34 WIAF-12925 M67439 786 DRD1, CCGACGTGAA[T/C]GCACACAACT M T G N K
dopamine receptor
D1
DRD5u35 WIAF-12926 M67439 881 DRD1, ACCTACACGC[G/A]CATCTACCGC M G A R H
dopamine receptor
D1
DRD5u36 WIAF-12927 M67439 1279 DRD1, GTGCACCCAC[T/G]TCTGCTCCCG M T G F V
dopamine receptor
D1
DRD5u37 WIAF-12928 M67439 1370 DRD1, GAAATCGCAG[C/T]TGCCTACATC M C T A V
dopamine receptor
D1
DRD5u38 WIAF-12929 M67439 1500 DRD1, ACCCTGTTGC[T/A]GAGTCTGTCT S T A A A
dopamine receptor
D1
DRD5u39 WIAF-12930 M67439 1338 DRD1, TCTCCTACAA[C/T]CAAGACATCG S C T N N
dopamine receptor
D1
DRD5u40 WIAF-12931 M67439 1215 DRD1, CACTCAACCC[C/A]CTCATCTATG S C A P P
dopamine receptor
D1
DRD5u41 WIAF-12932 M67439 1242 DRD1, ACGCCGACTT[T/C]CAGAAGGTGT S T C P F
dopamine receptor
D1
DRD5u42 WIAF-12933 M67439 1441 DRD1, CGAGGAGGAC[G/A]GTCCTTTCGA M G A G S
dopamine receptor
D1
DRD5u43 WIAF-12934 M67439 1460 DRD1, GATCGCATGT[T/C]CCAGATCTAT M T C F S
dopamine receptor
D1
DRD5u44 WIAF-12960 M67439 399 DRD1, TGTCTCTGGC[C/T]GTGTCTGACC S C T A A
dopamine receptor
D1
DRD5u45 WIAF-12961 M46743 9162 DRD1, TGCCGCCAGC[C/G]AGcAAcOCCA S C G G G
dopamine receptor
D1
DRD5u46 WIAF-12962 M67439 195 DRD1, GGCAGTTCGC[T/G]CTATACCAGC S T G A A
dopamine receptor
D1
DRD5u47 WIAF-12963 M67439 264 DRD1, TGGGGCCCTC[A/C]CAGCTCCTCA S A G S S
dopamine receptor
D1
DRD5u48 WIAF-12964 M67439 465 DRD1, TGGCCGGTTA[C/T]TGCCCCTTTC S C T Y Y
dopamine receptor
D1
DRD5u49 WIAF-12965 M67439 511 DRD1, CTTCGACATC[A/T]TGTGCTCCAC M A T M L
dopamine receptor
D1
DRD5u50 WIAF-12966 M67439 557 DRD1, ATCAGCGTGG[A/C]CCCCTACTGG M A G D G
dopamine receptor
D1
DRD5u51 WIAF-12967 M67439 476 DRD1, TGGCCCTTTG[C/A]AGCGTTCTGC M G A G E
dopamine receptor
D1
DRD5u52 WIAF-12968 M67439 1004 DRD1, AGCCTGCGCG[C/T]TTCCATCAAC M C T A V
dopamine receptor
D1
DRD5u53 WIAF-12969 M67439 1036 DRD1, GGTTCTCAAG[A/C]CCCTGTCGGT M A C T P
dopamine receptor
D1
DRD5u54 WIAF-12970 M67439 859 DRD1, CTACATCCCC[G/A]TTGCCATCAT M G A V I
dopamine receptor
D1
DRD5u55 WIAF-12971 M67439 931 DRD1, GATTTCCTCC[C/T]TGGAGAGGGC S C T L L
dopamine receptor
D1
G10u1 WIAF-10234 J04111 1308 JUN, v-jun avian CCCTCAACGC[C/T]TCGTTCCTCC S C T A A
sarcoma virus 17
oncogene homolog
G10u2 WIAF-10235 J04111 1471 JUN, v-jun avian GCTGCTCAAG[C/T]TGGCGTCGCC S C T L L
sarcoma virus 17
oncogene homolog
G10u3 WIAF-10253 J04111 2010 JUN, v-jun avian TGGAGTCCCA[G/A]GAGCCGATCA S G A Q Q
sarcoma virus 17
oncogene homolog
G1001u1 WIAF-13746 D26135 993 DGKG, diacyl- CCCCAGTCGT[C/A]TACCTGAAGG S G A V V
glycerol kinase,
gamma (90 kD)
G1001u2 WIAF-13764 D26135 2313 DGKG, diacyl- ATGTGATGAG[A/T]GAGAAACATC M A T R S
glycerol kinase,
gamma (90 kD)
G1002u1 WIAF-13918 X57206 334 ITPKB, inositol CCCCAACATC[A/C]GGACAAGCCT M A C Q P
1,4,5-trisphosphate
3-kinase B
G1002u2 WIAF-13925 X57206 575 ITPKB, inositol CCAACTCAGC[T/C]TTCCTGCATA S T C A A
1,4,5-trisphosphate
3-kinase B
G1004u1 WIAF-13567 L36151 1854 PIK4CA, phospha- GCCGCTCAGA[C/T]TCCGAGGATG S C T D D
tidylinositol 4-
kinase, catalytic,
alpha polypeptide
G1006u1 WIAF-12375 HT2690 858 PRKCA, protein GGTACAAGTT[G/A]CTTAACCAAG S G A L L
kinase C, alpha
G1008u1 WIAF-12397 HT2136 300 PRKCZ, protein CTGGCCTGCC[A/C]TCTCCCCGAC S A G P P
kinase C, zeta
G1008u2 WIAF-12398 HT2136 246 PRKCZ, protein AGTGCAGGGA[T/C]GAAGGCCTCA S T C D D
kinase C, zeta
G1008u3 WIAF-12399 HT2136 504 PRKCZ, protein GCTCCCACCC[C/T]CTCGTCCCCC S C T G G
kinase C, zeta
G1008u4 WIAF-12403 HT2136 807 PRKCZ, protein AGAAGAATGA[C/T]CAAATTTACG S C T D D
kinase C, zeta
G1008u5 WIAF-12404 HT2136 1514 PRKCZ, protein GGATTTTCTG[A/T]CATCAACTCC M A T D V
kinase C, zeta
G1008u6 WIAF-12412 HT2136 166 PRKCZ, protein CAAGTGGGTC[C/A]ACACCGAAGG M G A D N
kinase C, zeta
G1008u7 WIAF-12418 HT2136 560 PRKCZ, protein TCCCAACAGC[C/T]TCCACTACAC M C T P L
kinase C, zeta
G1009u1 WIAF-12396 L05186 2495 PTK2, PTK2 protein TCATCAACAA[G/A]ATGAAACTCG S G A K K
tyrosine kinase 2
G1011u1 WIAF-11988 X07876 1250 WNT2, wingless-type TCCCATCTCA[C/A]CCGCATGACC M C A T N
MMTV integration
site family member
2
G1011u2 WIAF-11997 X07876 788 WNT2, wingless-type CACTATGGGA[T/C]CAAATTTGCC M T C I T
MMTV integration
site family member
2
G1011u3 WIAF-12014 X07876 1338 WNT2, wingless-type TGCACACATG[C/A]AAGGCCCCCA N C A C *
MMTV integration
site family member
2
G1011u4 WIAF-13475 X07876 856 WNT2, wingless-type CCTGATGAAT[C/T]TTCACAACAA M C T L F
MMTV integration
site family member
2
G1011u5 WIAF-13476 X07876 958 WNT2, wingless-type GACATGCTGG[C/T]TGGCCATGGC S C T L L
MMTV integration
site family member
2
G1011u6 WIAF-13477 X07876 789 WNT2, wingless-type ACTATCCGAT[C/T]AAATTTCCCC S C T I I
MMTV integration
site family member
2
G1011u7 WIAF-13478 X07876 823 WNT2, wingless-type TGCAAAGGAA[A/C]GCAAACCAAA M A G R G
MMTV integration
site family member
2
G1012u1 WIAF-12408 HT48910 1574 WNT2B, wingless-type ATACTTGCAA[A/C]GCCCCCAAGA S A G K K
MMTV integration
site family, member
2B
G1016a1 WIAF-12125 Z22534 793 ACVR1, activin A CCCAAGCCGA[A/C]AATGTTCCCG S A G E E
receptor, type I
G1016u2 WIAF-12392 Z22534 373 ACVR1, activin A CTGGCCAACC[T/C]GTGGACTGCT S T C A A
receptor, type I
G1018u1 WIAF-12413 X74210 1150 ADCY2, adenylate CAAATTCCGA[C/T]TGCGTATTAA M G T V L
cyclase 2 (brain)
G1019u1 WIAF-12394 U83867 5475 SPTAN1, spectrin, GCGACCTAAC[T/C]CGCCTCCACA S T C T T
alpha, nonerythro-
cytic 1 (alpha-
fodrin)
G1019u2 WIAF-12406 U83867 1223 SPTAN1, spectrin, GCCCTCATCA[A/G]TGCACATCAC M A G N S
alpha, nonerythro-
cytic 1 (alpha-
fodrin)
G1019u3 WIAF-12409 U83867 3555 SPTAN1, spectrin, CTCAAGCTCT[T/C]ATCCCACACC S T C L L
alpha, nonerythro-
cytic 1 (alpha-
fodrin)
G1019u4 WIAF-12415 U83867 3369 SPTAN1, spectrin, TCCCTCAACC[C/A]AATGAACTAC S G A A A
alpha, nonerythro-
cytic 1 (alpha-
fodrin)
G1019u5 WIAF-12417 U83867 5839 SPTAN1, spectrin, TCACACACAC[T/A]TCACCCTCCA M T A F I
alpha, nonerythro-
cytic 1 (alpha-
fodrin)
G1022u1 WIAF-12393 U45945 631 ATP1B2, ATPase, CATCAATCTT[A/C]CCTCTCCTCC M A G T A
Na+/K+
transporting, beta
2 polypeptide
G1022u2 WIAF-12400 U45945 432 ATP1B2, ATPase, GCCGCCCTGG[G/A]CGCTATTACG S G A G G
Na+/K+
transporting, beta
2 polypeptide
G1023u1 WIAF-12401 D89722 395 ARNTL, aryl hydro- AACATTAAGA[C/C]GTGCCACCAA M G C G R
carbon receptor
nuclear
translocator-like
G1023u2 WIAF-12407 D89722 681 ARNTL, aryl hydro- CTCATAGATC[C/T]AAAAACTGGA M C T A V
carbon receptor
nuclear
translocator-like
G1024u1 WIAF-12410 U85946 731 Homo sapiens brain CATACATTTT[C/T]ACAACTTAAA M C T S L
secretory protein
hSec10p (HSEC10)
mRNA, complete cds.
G1027u1 WIAF-12402 L47647 1135 CKB, creatine TCGAGATGGA[A/G]CAGCGGCTGG S A G E E
kinase, brain
G1027u2 WIAF-12405 L47647 499 CKB, creatine CCGAGCCCCG[A/C]GCCATCGACA S A C R R
kinase, brain
G103u1 WIAF-10427 HT2269 335 ERCC5, excision GGGATCCCCA[T/C]CCGAACTCAA S T C H H
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103u2 WIAF-10429 HT2269 1221 ERCC5, excision CCCTCCTTCT[C/T]CAAGAACTTT M C T P S
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103u3 WIAF-10431 HT2269 1783 ERCC5, excision TCTCCAACTT[C/C]TACAAATTCT M G C C S
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103u4 WIAF-10432 HT2269 2077 ERCC5, excision ACTGAATCTG[C/A]AGGCCAGGAT M C A A E
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103u5 WIAF-10446 HT2269 3338 ERCC5, excision AATTTGAGCT[A/T]CTTGATAAGG S A T L L
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103u6 WIAF-10447 HT2269 3487 ERCC5, excision TCAGAATCAT[C/T]TGATGGATCT M C T S F
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103u7 WIAF-10448 HT2269 3507 ERCC5, excision TTCAAGTGAA[C/G]ATGCTGAAAG M C G H D
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103u8 WIAF-10457 HT2269 1388 ERCC5, excision CTCTTCACGAE[T/G]CACCAAGATC M T G D E
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103u9 WIAF-10458 HT2269 1362 ERCC5, excision CCGGACTCTT[T/C]CAGCCATTAA M T C S P
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103u10 WIAF-10459 HT2269 2357 ERCC5, excision CTGAGAAAGA[T/C]GCCGAACATT S T C D D
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103u11 WIAF-10462 HT2269 3109 ERCC5, excision TGGAACAGAA[C/T]GAAGACAGAT M C T T M
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103u12 WIAF-10463 HT2269 3138 ERCC5, excision GTTTCCTGTA[T/C]TAAAGCAACT S T C L L
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103u14 WIAF-10484 HT2269 3553 ERCC5, excision AGAACAGCTG[C/T]GAAAGAGCCA M C T A V
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103u15 WIAF-10485 HT2269 1429 ERCC5, excision CATCTCCACA[C/T]CCGACCGCCA M C T T M
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G103a16 WIAF-12097 HT2269 3335 ERCC5, excision AACAATTTGA[G/T]CTACTTCATA M G T E D
repair cross-
complementing
rodent repair
deficiency,
complementation
group 5 (xeroderma
pigmentosum,
complementation
group G (Cockayne
syndrome))
G1030u1 WIAF-12411 U07358 203 ZPK, zipper ACACTTCTGA[C/T]TGCACTCCCG S C T D D
(leucine) protein
kinase
G1030u2 WIAF-12416 U07358 1806 ZPK, zipper GCCACCCCAT[G/T]AACCTGGACG N G T E *
(leucine) protein
kinase
G1031a1 WIAF-12124 U87460 2825 GPR37, G protein- GAGTCACCAC[C/T]TTCACCTTAT S C T T T
coupled receptor 37
(endothelin
receptor type
B-like)
G1032u1 WIAF-12381 U57911 926 C11ORF8, chromosome ACGTACATCA[A/C]TGCCTCGACG M A C N T
11 open reading
frame 8
G1033u1 WIAF-12437 M65188 431 GJA1, gap junction TCTGTACCCA[C/T]ACTCTTCTAC M C T T I
protein, alpha 1,
43 kD (connexin 43)
G1033u2 WIAF-12438 M65188 169 GJA1, gap junction ACGCAACATG[G/C]GTGACTGGAG M G C G R
protein, alpha 1,
43 kD (connexin 43)
G1033u3 WIAF-12439 M65188 467 GJA1, gap junction TATCTCATGC[G/A]AAAGGAACAG M G A R Q
protein, alpha 1,
43 kD (connexin 43)
G1033u4 WIAF-12440 M65188 263 GJA1, gap junction TTCATTTTCC[C/A]AATCCTGCTG M C A R Q
protein, alpha 1,
43 kD (connexin 43)
G1033u5 WIAF-12441 M65188 218 GJA1, gap junction CAAGCCTACT[C/T]AACTGCTCGA M C T S L
protein, alpha 1,
43 kD (connexin 43)
G1033u6 WIAF-12442 M65188 498 GJA1, gap junction AGAAAGAGGA[A/G]GAACTCAAGC S A G E E
protein, alpha 1,
43 kD (connexin 43)
G1033u7 WIAF-12465 M65188 550 GJA1, gap junction GCACTTGAAG[C/A]AGATTGAGAT M C A Q K
protein, alpha 1,
43 kD (connexin 43)
G1033u8 WIAF-12466 M65188 548 GJA1, gap junction ATGCACTTGA[A/G]GCACATTGAC M A G K R
protein, alpha 1,
43 kD (connexin 43)
G1033u9 WIAF-12486 M65188 933 GJA1, gap junction CCCTGAGCCC[T/C]GCCAAACACT S T C P P
protein, alpha 1,
43 kD (connexin 43)
G1033u10 WIAF-12487 M65188 990 GJA1, gap junction CCTCACCAAC[C/T]GCTCCCCTCT S C T T T
protein, alpha 1,
43 kD (connexin 43)
G1033u11 WIAF-12488 M65188 1034 GJA1, gap junction AACCTGCTTA[C/A]TCGCGACAGA M C A T N
protein, alpha 1,
43 kD (connexin 43)
G1033u12 WIAF-12489 M65188 1158 GJA1, gap junction CTAACTCCCA[T/C]CCACAGCCTT S T C H H
protein, alpha 1,
43 kD (connexin 43)
G1033u13 WIAF-12490 M65188 1222 GJA1, gap junction TGGACATGAA[T/C]TACAGCCACT S T C L L
protein, alpha 1,
43 kD (connexin 43)
G1033u14 WIAF-12491 M65188 1069 GJA1, gap junction CCGCAATTAC[A/G]ACAAGCAAGC M A G N D
protein, alpha 1,
43 kD (connexin 43)
G1033u15 WIAF-12492 M65188 1250 GJA1, gap junction CTCCACCACC[G/A]ACCTTCAAGC M G A R Q
protein, alpha 1,
43 kD (connexin 43)
G1033u16 WIAF-12496 M65188 423 GJA1, gap junction TATTTCTCTC[T/C]GTACCCACAC S T C S S
protein, alpha 1,
43 kD (connexin 43)
G1033u17 WIAF-12503 M65188 880 GJA1, gap junction CCTTAAGGAT[C/T]GGGTTAACCG M C T R W
protein, alpha 1,
43 kD (connexin 43)
G1033u18 WIAF-12504 M65188 855 GJA1, gap junction AACTCTTCTA[T/C]GTTTTCTTCA S T C Y Y
protein, alpha 1,
43 kD (connexin 43)
G1033u19 WIAF-12505 M65188 576 GJA1, gap junction AGTTCAAGTA[C/T]GGTATTGAAG S C T Y Y
protein, alpha 1,
43 kD (connexin 43)
G1033u20 WIAF-12512 M65188 1255 GJA1, gap junction CCACCCACCT[T/G]CAACCACACC M T G S A
protein, alpha 1,
43 kD (connexin 43)
G1033u21 WIAF-12513 M65188 1078 GJA1, gap junction CAACAAGCAA[G/A]CAAGTGACCA M G A A T
protein, alpha 1,
43 kD (connexin 43)
G1033u22 WIAF-12514 M65188 1097 GJA1, gap junction CAAAACTCCG[C/G]TAATTACACT M C G A G
protein, alpha 1,
43 kD (connexin 43)
G1034u1 WIAF-12443 J03544 1201 PYGB, phosphory- AGACCTGTGC[A/G]TACACCAACC S A G A A
lase, glycogen;
brain
G1034u2 WIAF-12469 J03544 771 PYGB, phosphory- GACACCCCAG[T/C]CCCCGGCTAC M T C V A
lase, glycogen;
brain
G1034u3 WIAF-12470 J03544 1465 PYGB, phosphory- TCCACTCCGA[G/C]ATCGTCAAAC M G C E D
lase, glycogen;
brain
G1034u4 WIAF-12471 J03544 1583 PYGB, phosphory- CCCGCTCCCC[G/A]ATACCATCCT M G A D N
lase, glycogen;
brain
G1034u5 WIAF-12472 J03544 1774 PYGB, phosphory- CCATGTTCGA[T/C]GTGCATGTGA S T C D D
lase, glycogen;
brain
G1034u6 WIAF-12474 J03544 2449 PYGB, phosphory- AGGTGGACCA[G/A]CTGTACCGGA S G A Q Q
lase, glycogen;
brain
G1034u7 WIAF-12508 J03544 718 PYGB, phosphory- CCCCCGACGG[C/T]GTGAAGTGGC S C T G G
lase, glycogen;
brain
G1035u1 WIAF-12484 U97105 1962 DPYSL2, dihydro- GCAGAGGAGC[A/G]GCAGACGATC M A G Q R
pyrimidinase-like 2
G1035u2 WIAF-12485 U97105 2842 DPYSL2, dihydro- ATGACGGACC[T/C]GTGTGTGAAG S T C P P
pyrimidinase-like 2
G1035u3 WIAF-12511 U97105 2062 DPYSL2, dihydro- CCATCACCAT[C/T]GCCAACCAGA S C T I I
pyrimidinase-like 2
G1036u1 WIAF-12444 D88460 311 WASL, Wiskott- ACGTGGGGTC[C/T]CTGTTGCTCA S C T S S
Aldrich syndrome
like
G1038u1 WIAF-12445 HT2746 994 PCTK2, PCTAIRE TAGAAGAAAG[G/A]TATTGCATCG M G A V I
protein kinase 2
G1039u1 WIAF-12429 HT2747 955 serine/threonine ATCCAAGAGT[C/T]GCATGTCAGC M C T R C
kinase, PCTAIRE-3
G1039u2 WIAF-12458 HT2747 808 serine/threonine CACAGAAGAG[A/T]CGTGGCCCGG M A T T S
kinase, PCTAIRE-3
G1041u1 WIAF-12459 X72886 544 H.sapiens TYRO3 CAAGTGGCTG[G/C]CCCTGGAGAG M G C A P
mRNA.
G1041u2 WIAF-12460 X72886 693 H.sapiens TYRO3 TTGGCGGGAA[C/T]CGCCTGAAAC S C T N N
mRNA.
G1041u3 WIAF-12502 X72886 561 H.sapiens TYRO3 AGAGCCTGGC[C/T]GACAACCTGT S C T A A
mRNA.
G1043u1 WIAF-12448 M94055 5481 Human voltage-gated CTCTGAGTGA[G/A]GATGACTTTG S G A E E
sodium channel
mRNA, complete cds.
G1043u2 WIAF-12449 M94055 5205 Human voltage-gated TTGACACCTT[T/C]GGCAACAGCA S T C F F
sodium channel
mRNA, complete cds.
G1043u3 WIAF-12450 M94055 5224 Human voltage-gated CATGATCTGC[C/T]TGTTCCAAAT S C T L L
sodium channel
mRNA, complete cds.
G1043u4 WIAF-12451 M94055 5514 Human voltage-gated AGGTTTGGGA[C/A]AACTTTCATC S C A E E
sodium channel
mRNA, complete cds.
G1043u5 WIAF-12452 M94055 5217 Human voltage-gated CCAACAGCAT[G/C]ATCTCCCTGT M G C M I
sodium channel
mRNA, complete cds.
G1043u6 WIAF-12453 M94055 5334 Human voltage-gated CCTCACTTAA[A/G]CCAGACTCTG S A G K K
sodium channel
mRNA, complete cds.
G1043u7 WIAF-12454 M94055 5424 Human voltage-gated TGTACATCGC[G/C]GTCATCCTGG S G C A A
sodium channel
mRNA, complete cds.
G1043u8 WIAF-12455 M94055 5322 Human voltage-gated ATCACCCTGG[A/C]AGCTCAGTTA S A C G G
sodium channel
mRNA, complete cds.
G1043u9 WIAF-12456 M94055 1200 Human voltage-gated ATGGCTACAC[G/A]AGCTTTGACA S G A T T
sodium channel
mRNA, complete cds.
G1043u10 WIAF-12499 M94055 1170 Human voltage-gated TCTGTGTGAA[G/T]GCTCGTAGAA M G T K N
sodium channel
mRNA, complete cds.
G1046a1 WIAF-13187 U50352 267 ACCN1, amiloride- TCCCACCTGT[G/A]ACCCTCTCTA S G A V V
sensitive cation
channel 1, neuronal
(degenerin)
G1046a2 WIAF-13188 U50352 282 ACCN1, amiloride- TCTGTAACCT[C/g]AATGGCTTCC S C g L L
sensitive cation
channel 1, neuronal
(degenerin)
G1046a3 WIAF-13189 U50352 315 ACCN1, amiloride- TCACCACCAA[C/t]CACCTGTACC S C t N N
sensitive cation
channel 1, neuronal
(degenerin)
G1046a4 WIAF-13190 U50352 386 ACCN1, amiloride- CCCCATCTGG[C/a]TGACCCCTCC M C a A D
sensitive cation
channel 1, neuronal
(degenerin)
G1046a5 WIAF-13191 U50352 417 ACCN1, amiloride- CCCTCCGGCA[G/A]AACCCCAACT S G A Q Q
sensitive cation
channel 1, neuronal
(degenerin)
G1048u1 WIAF-12641 HT5174S 3214 REST, RE1-silencing CAGTCAAACC[G/A]CCTAAGGCAC S G A A A
transcription
factor
G1048u2 WIAF-12642 HT5174S 3199 REST, RE1-silencing CAAAGGAAGC[C/G]TTGGCAGTCA S C G A A
transcription
factor
G1048u3 WIAF-12657 HT5174S 2125 REST, RE1-silencing CTCCCATCGA[G/T]ACTCCTCAGA M G T E D
transcription
factor
G1048u4 WIAF-12660 HT5174S 2333 REST, RE1-silencing CGAACCTCTT[A/C]ACATACACCT M A C K Q
transcription
factor
G1051u1 WIAF-12431 HT28321 658 SCNN1G, sodium ATGACACCTC[C/T]GACTGTGCCA S C T S S
channel, non-
voltage gated 1,
gamma
G1051u2 WIAF-12434 HT28321 1735 SCNN1G, sodium AAGCCAAGGA[G/A]TGGTCCGCCT S G A E E
channel, non-
voltage gated 1,
gamma
G1051u3 WIAF-12473 HT28321 409 SCNN1G, sodium AGTCCCTGTA[T/C]GCCTTTCCAC S T C Y Y
channel, non-
voltage gated 1,
gamma
G1051u4 WIAF-12475 HT28321 953 SCNN1G, sodium AGTCATTTTG[T/C]ACATAAACGA M T C Y H
channel, non-
voltage gated 1,
gamma
G1051u5 WIAF-12476 HT28321 975 SCNN1G, sodium GAGCAATACA[A/G]CCCATTCCTC M A G N S
channel, non-
voltage gated 1,
gamma
G1051u6 WIAF-12477 HT28321 1192 SCNN1G, sodium CTGCCTACTC[G/A]CTCCAGATCT S G A S S
channel, non-
voltage gated 1,
gamma
G1053a1 WIAF-13192 HT2201 4085 SCN5A, sodium CGTCCTCTGA[G/A]AGCTCTCTCA M G A R K
channel, voltage-
gated, type V,
alpha polypeptide
(long (electro-
cardiographic)
QT syndrome 3)
G1053a2 WIAF-13193 HT2201 5607 SCN5A, sodium ACTTTCCCCA[C/T]CCCCTGTCTG S C T D D
channel, voltage-
gated, type V,
alpha polypeptide
(long (electro-
cardiographic)
QT syndrome 3)
G1053a3 WIAF-13194 HT2201 5828 SCN5A, sodium GACCCCATCA[C/T]CACCACACTC M C T T I
channel, voltage-
gated, type V,
alpha polypeptide
(long (electro-
cardiographic)
QT syndrome 3)
G1053a4 WIAF-13202 HT2201 713 SCN5A, sodium GCGTTCACTT[T/A]CCTTCCGGAC M T A F Y
channel, voltage-
gated, type V,
alpha polypeptide
(long (electro-
cardiographic)
QT syndrome 3)
G1053a5 WIAF-13203 HT2201 6148 SCN5A, sodium CCACACTGAA[G/T]ATCTCGCCGA M G T D Y
channel, voltage-
gated, type V,
alpha polypeptide
(long (electro-
cardiographic)
QT syndrome 3)
G1053a6 WIAF-13204 HT2201 6217 SCN5A, sodium GCCCTCGCTC[C/T]CCACGACACA G T
channel, voltage-
gated, type V,
alpha polypeptide
(long (electro-
cardiographic)
QT syndrome 3)
G1053a7 WIAF-13205 HT2201 6324 SCN5A, sodium AATCCCCCTC[G/A]CCCCCGCCCA G A
channel, voltage-
gated, type V,
alpha polypeptide
(long (electro-
cardiographic)
QT syndrome 3)
G1054u1 WIAF-12419 HT2202 2252 SCN4A, sodium TTGGCAAGAG[C/T]TACAAGGAGT S C T S S
channel, voltage-
gated, type IV,
alpha polypeptide
G1054u2 WIAF-12423 HT2202 4559 SCN4A, sodium TGGTCATGTT[C/T]ATCTACTCCA S C T F F
channel, voltage-
gated, type IV,
alpha polypeptide
G1054u3 WIAF-12424 HT2202 4856 SCN4A, sodium TCAACATGTA[C/G]ATCGCCATCA N C G Y *
channel, voltage-
gated, type IV,
alpha polypeptide
G1054u4 WIAF-12425 HT2202 4777 SCN4A, sodium GTCAAGGCTC[A/G]CTGCGGCAAC M A G D G
channel, voltage-
gated, type IV,
alpha polypeptide
G1054u5 WIAF-12426 HT2202 4863 SCN4A, sodium GTACATCGCC[A/G]TCATCCTGGA M A G I V
channel, voltage-
gated, type IV,
alpha polypeptide
G1054u6 WIAF-12427 HT2202 4566 SCN4A, sodium GTTCATCTAC[T/G]CCATCTTCGG M T G S A
channel, voltage-
gated, type IV,
alpha polypeptide
G1054u7 WIAF-12428 HT2202 4923 SCN4A, sodium TGGTGAAGAT[G/T]ACTTTGAGAT M G T D Y
channel, voltage-
gated, type IV,
alpha polypeptide
G1054u8 WIAF-12446 HT2202 3595 SCN4A, sodium TTCTGGCTGA[T/C]CTTCAGCATC M T C I T
channel, voltage-
gated, type IV,
alpha polypeptide
G1054u9 WIAF-12447 HT2202 4203 SCN4A, sodium GGAGACAGAC[G/A]ACCAGAGCCA M G A D N
channel, voltage-
gated, type IV,
alpha polypeptide
G1054u10 WIAF-12495 HT2202 4811 SCN4A, sodium TCTGCTTCTT[C/A]TGCAGCTATA M C A F L
channel, voltage-
gated, type IV,
alpha polypeptide
G1054u11 WIAF-12497 HT2202 5555 SCN4A, sodium CAGGGCAGAC[T/G]GTGCGCCCAG S T G T T
channel, voltage-
gated, type IV,
alpha polypeptide
G1054u12 WIAF-12498 HT2202 5480 SCN4A, sodium CACGGGACGC[C/T]GGACCCACTA S C T A A
channel, voltage-
gated, type IV,
alpha polypeptide
G1059u1 WIAF-12432 HT33704 112 APLP1, amyloid beta CGCTGCTGCT[G/A]CCACTATTGC S G A L L
(A4) precursor-like
protein 1
G1059u2 WIAF-12433 HT33704 140 APLP1, amyloid beta TCTGCGCGCG[C/T]AGCCCGCCAT N C T Q *
(A4) precursor-like
protein 1
G1059u3 WIAF-12435 HT33704 1344 APLP1, amyloid beta CACCATGTGG[C/T]CGCCCTGGAT M C T A V
(A4) precursor-like
protein 1
G1059u4 WIAF-12457 HT33704 1687 APLP1, amyloid beta ATCACCGAAA[C/A]CTGAATGCGT S C A K K
(A4) precursor-like
protein 1
G1059u5 WIAF-12500 HT33704 976 APLP1, amyloid beta CGTTCCTGAG[A/G]GCCAAGATGG S A G R R
(A4) precursor-like
protein 1
G1059u6 WIAF-12501 HT33704 1786 APLP1, amyloid beta GTCAGGCTCT[A/G]TCGGGTCTGC S A G V V
(A4) precursor-like
protein 1
G1060u1 WIAF-12436 HT1418 1744 APLP2, amyloid beta CCAAGAAATT[C/G]AAGAGGAAAT M C G Q E
(A4) precursor-like
protein 2
G1060u2 WIAF-12467 HT1418 2213 APLP2, amyloid beta ATCACCCTGC[T/G]GATGCTGACC M T G V G
(A4) precursor-like
protein 2
G1060u3 WIAF-12468 HT1418 2256 APLP2, amyloid beta GCCACGGGAT[C/T]CTGGAGGTTG S C T I I
(A4) precursor-like
protein 2
G1066a1 WIAF-13195 HT3538 566 CCKBR, cholecysto- CTTTGGCACC[G/A]TCATCTGCAA M G A V I
kinin B receptor
G1066a2 WIAF-13196 HT3538 607 CCKBR, cholecysto- GGGTGTCTGT[G/A]AGTGTGTCCA S G A V V
kinin B receptor
G1066a3 WIAF-13206 HT3538 864 CCKBR, cholecysto- CTGCTGCTTC[T/A]GCTCTTGTTC M T A L Q
kinin B receptor
G1067u1 WIAF-12478 HT0830 684 KCNA1, potassium AAACGCTGTG[C/T]ATCATCTGGT S C T C C
voltage-gated
channel, shaker-
related subfamily,
member 1 (episodic
ataxia with
myokymia)
G1067u2 WIAF-12479 HT0830 722 KCNA1, potassium GTGCGCTTCT[T/C]CGCCTGCCCC M T C F S
voltage-gated
channel, shaker-
related subfamily,
member 1 (episodic
ataxia with myo-
kymia)
G1067u3 WIAF-12480 HT0830 804 KCNA1, potassium ATTTCATCAC[C/G]CTCGCCACCG S C G T T
voltage-gated
channel, shaker-
related subfamily,
member 1 (episodic
ataxia with myo-
kymia)
G1067u4 WIAF-12509 HT0830 690 KCNA1, potassium TGTGCATCAT[C/T]TGGTTCTCCT S C T I I
voltage-gated
channel, shaker-
related subfamily,
member 1 (episodic
ataxia with myo-
kymia)
G1068u1 WIAF-12493 HT0831 774 KCNA2, potassium TGAACATCAT[T/A]GACATTGTGG S T A I I
voltage-gated
channel, shaker-
related subfamily,
member 2
G1070a1 WIAF-13197 HT27728 522 KCNJ6, potassium CACAGTGACC[T/C]GGCTCTTTTT M T C W R
inwardly-rectifying
channel, subfamily
J, member 6
G1070a2 WIAF-13201 HT27728 1244 KCNJ6, potassium CCCTGGAGGA[T/C]GGGTTCTACG S T C D D
inwardly-rectifying
channel, subfamily
J, member 6
G1070a3 WIAF-13207 HT27728 707 KCNJ6, potassium ATAAATGCCC[G/A]GACCGAATTA S G A P P
inwardly-rectifying
channel, subfamily
J, member 6
G1071u1 WIAF-12422 HT48672 1534 KCNJ3, potassium TTCCGGGCAA[C/T]TCAGAACAAA S C T N N
inwardly-rectifying
channel, subfamily
J, member 3
G1073u1 WIAF-12461 HT4556 1127 KCNJ1, potassium CACTGTGCCA[T/C]GTGCCTTTAT M T C M T
inwardly-rectifying
channel, subfamily
J, member 1
G1074u1 WIAF-12462 HT27804 289 KCNAB2, potassium ACCTCTTCGA[T/C]ACACCAGAAG S T C D D
voltage-gated
channel, shaker-
related subfamily,
beta member 2
G1079u1 WIAF-12463 HT27383 1130 potassium channel, ACCTGGCCGA[T/A]GAGATCCTGT M T A D E
inwardly rectifying
(GB:D50582)
G1079u2 WIAF-12464 HT27383 1192 potassium channel, CCTTACTCTG[T/G]GGACTACTCC M T G V G
inwardly rectifying
(GB:D50582)
G1079u3 WIAF-12481 HT27383 708 potassium channel, CCTTCCCTCC[A/G]TCTTCATCAA M A G I V
inwardly rectifying
(GB:D50582)
G1079u4 WIAF-12482 HT27383 779 potassium channel, CGGTCATCGC[T/C]CTCCCCCACG S T C A A
inwardly rectifying
(GB:D50582)
G1079u5 WIAF-12483 HT27383 276 potassium channel, GCACCCTGCC[G/A]ACCCCACCTA M G A E K
inwardly rectifying
(GB:D50582)
G1079u6 WIAF-12510 HT27383 489 potassium channel, CTGCCTCATC[G/A]CCTTCCCCCA M G A A T
inwardly rectifying
(GB:D50582)
G1080u1 WIAF-12536 HT4412 1099 KCNJ4, potassium TCGACTACTC[A/G]CGTTTTCACA S A G S S
inwardly rectifying
channel, subfamily
J, member 4
G1080u2 WIAF-12537 HT4412 1050 KCNJ4, potassium GGCCACCGCT[T/A]TGAGCCTGTG M T A F Y
inwardly-rectifying
channel, subfamily
J, member 4
G1081u1 WIAF-12538 HT27724 1090 KCNJ2, potassium GGCCACCGCT[A/T]TGAGCCTGTG M A T Y F
inwardly-rectifying
channel, subfamily
J, member 2
G1082u1 WIAF-12662 HT28319 768 potassium channel, CGCGGGTCAC[C/T]GACGAGGGCG S C T T T
inwardly rectify-
ing, high
conductance,
alpha subunit
G1082u2 WIAF-12663 HT28319 854 potassium channel, CTGGTGTCGC[C/T]CATCACCATC M C T P L
inwardly rectify-
ing, high
conductance, alpha
subunit
G1082u3 WIAF-12679 HT28319 471 potassium channel, TCTCCATCGA[G/C]ACGCAGACCA M G C E D
inwardly rectify-
ing, high
conductance, alpha
subunit
G1084a1 WIAF-13198 HT0383 2028 KCNB1, potassium CACTCCCCAG[C/A]AAGACTCCGG M C A S R
voltage-gated
channel, Shab-
related subfamily,
member 1
G1084a2 WIAF-13199 HT0383 2033 KCNB1, potassium CCCAGCAAGA[C/G]TGGGCGCAGC M C G T S
voltage-gated
channel, Shab-
related subfamily,
member 1
G1084a3 WIAF-13200 HT0383 2321 KCNB1, potassium GAGTGTGCCA[C/A]GCTTTTGGAC M C A T K
voltage-gated
channel, Shab-
related subfamily,
member 1
G1084a4 WIAF-13208 HT0383 870 KCNB1, potassium ACAACCCCCA[G/A]CTGGCCCACG S G A Q Q
voltage-gated
channel, Shab-
related subfamily,
member 1
G1088u1 WIAF-12516 HT0522 1503 KCNA5, potassium TCCTGGGCAA[G/A]ACCTTCCAGC S G A K K
voltage-gated
channel, shaker-
related subfamily,
member 5
G1088u2 WIAF-12519 HT0522 1249 KCNA5, potassium CGAGCTGCTC[G/A]TGCGCTTCTT M G A V M
voltage-gated
channel, shaker-
related subfamily,
member 5
G1088u3 WIAF-12520 HT0522 973 KCNA5, potassium CTCTGGGTCC[G/A]CGCGGGCCAT M G A A T
voltage-gated
channel, shaker-
related subfamily,
member 5
G1088u4 WIAF-12521 HT0522 1013 KCNA5, potassium GTTATCCTCA[T/C]CTCCATCATC M T C I T
voltage-gated
channel, shaker-
related subfamily,
member 5
G1090u1 WIAF-12651 HT1497 1836 KCNA6, potassium CAACCAGCCA[G/A]TGGAGGAGGC M G A S N
voltage-gated
channel, shaker-
related subfamily,
member 6
G1091u1 WIAF-12714 HT0222 843 KCNA3, potassium CATCATCTGG[T/C]TCTCCTTCGA M T C F L
voltage-gated
channel, shaker-
related subfamily,
member 3
G1094a1 WIAF-13218 HT27381 1280 KCNJ8, potassium GTGTATTCTG[T/A]GGATTACTCC M T a V E
inwardly-rectifying
channel, subfamily
J, member 8
G1095u1 WIAF-12532 HT2629 765 KCNMA1, potassium TTCTCTACTT[C/T]GGCTTGCGGT S C T F F
large conductance
calcium-activated
channel, subfamily
M, alpha member 1
G1095u2 WIAF-12533 HT2629 2441 KCNMA1, potassium GTGGTCTGCA[T/C]CTTTGCCCAC M T C I T
large conductance
calcium-activated
channel, subfamily
M, alpha member 1
G1095u3 WIAF-12534 HT2629 2714 KCNMA1, potassium GATGATACTT[C/G]GCTCCACCAC M C G S W
large conductance
calcium-activated
channel, subfamily
M, alpha member 1
G1095u4 WIAF-12535 HT2629 2439 KCNMA1, potassium TCGTGGTCTG[C/T]ATCTTTGGCG S C T C C
large conductance
calcium-activated
channel, subfamily
M, alpha member 1
G1095u5 WIAF-12539 HT2629 3048 KCNMA1, potassium CACTCATGAG[C/T]GCGACGTACT S C T S S
large conductance
calcium-activated
channel, subfamily
M, alpha member 1
G1095u6 WIAF-12544 HT2629 2352 KCNMA1, potassium GGATGTTTCA[C/T]TGGTGTGCAC S C T H H
large conductance
calcium-activated
channel, subfamily
M, alpha member 1
G1095u7 WIAF-12545 HT2629 2392 KCNMA1, potassium CATCCTGACT[C/T]GAAGTGAAGC N C T R *
large conductance
calcium-activated
channel, subfamily
M, alpha member 1
G1095u8 WIAF-12546 HT2629 2295 KCNMA1, potassium CTGGCAATGA[T/C]CAGATTGACA S T C D D
large conductance
calcium-activated
channel, subfamily
M, alpha member 1
G1095u9 WIAF-12548 HT2629 2949 KCNMA1, potassium AGTTTTTGGA[C/T]CAAGACGATG S C T D D
large conductance
calcium-activated
channel, subfamily
M, alpha member 1
G1095u10 WIAF-12549 HT2629 2865 KCNMA1, potassium TGCACGGCAT[G/A]TTACGTCAAC M G A M I
large conductance
calcium-activated
channel, subfamily
M, alpha member 1
G1095u1 WIAF-12547 L26318 930 PRKM8, protein TGCTGGTAAT[A/T]CATCCATCTA S A T I I
kinase mitogen
activated 8
(MAP kinase)
G1098u1 WIAF-12515 L19711 2650 DAG1, dystroglycan TCTACCTGCA[C/T]ACAGTCATTC S C T H H
1 (dystrophin-
associated
glycoprotein 1)
G110u1 WIAF-10385 HT27392 230 meiosis-specific CAAAGGTATA[C/T]AGATGACAAC N C T Q *
recA homolog,
HsLim15
G110u2 WIAF-10397 HT27392 1050 meiosis-specific CCTGAAAATG[A/G]AGCCACCTTC M A G E G
recA homolog,
HsLim15
G110u3 WIAF-10399 HT27392 674 meiosis-specific TGAACATCAG[A/G]TGGACCTACT M A G M V
recA homolog,
HsLim15
G1106u1 WIAF-12647 HT5073 5781 MAP1B, microtubule- ACTATGAGAA[G/A]ATAGACAGAA S G A K K
associated protein
1B
G1106u2 WIAF-12648 HT5073 5916 MAP1B, microtubule- CTGAACAGCG[C/T]GGGTACTCAT S C T G G
associated protein
1B
G1106u3 WIAF-12650 HT5073 1837 MAP1B, microtubule- AGACAAGCCA[G/A]TAAAAACAGA M G A V I
associated protein
1B
G1106u4 WIAF-12653 HT5073 2476 MAP1B, microtubule- CACCACACCA[G/A]CTGTCATGGC M G A A T
associated protein
1B
G1106u5 WIAF-12656 HT5073 3913 MAP1B, microtubule- GCCCAATGAG[A/G]TTAAACTCTC M A G I V
associated protein
1B
G1106u6 WIAF-12667 HT5073 559 MAP1B, microtubule- GATTTTCACC[G/A]ATCAAGAGAT M G A D N
associated protein
1B
G1106u7 WIAF-12668 HT5073 570 MAP1B, microtubule- ATCAAGAGAT[C/T]CCGGAGTTAC S C T I I
associated protein
1B
G1106u8 WIAF-12669 HT5073 6175 MAP1B, microtubule- TACTTCCACA[T/C]ACTCTTACCA M T C Y H
associated protein
1B
G1106u9 WIAF-12670 HT5073 1215 MAP1B, microtubule- TCACTCTCCA[G/C]TACCTAAACA M G C Q H
associated protein
1B
G1106u10 WIAF-12672 HT5073 1821 MAP1B, microtubule- AGGTAATGGT[G/A]AAAAAAGACA S G A V V
associated protein
1B
G1106u11 WIAF-12673 HT5073 2727 MAP1B, microtubule- CTCCTGCCGA[G/T]TCCCCTGATG M G T E D
associated protein
1B
G1106u12 WIAF-12674 HT5073 2739 MAP1B, microtubule- CCCCTCATGA[G/A]GGAATCACTA S G A E E
associated protein
1B
G1106u13 WIAF-12676 HT5073 3643 MAP1B, microtubule- ACATGCCACT[G/A]ATCCCAAGCA M G A D N
associated protein
1B
G1106u14 WIAF-12677 HT5073 3609 MAP1B, microtubule- CACCCCTCAA[C/T]CCATTTTCTG S C T N N
associated protein
1B
G1106u15 WIAF-12682 HT5073 4752 MAP1B, microtubule- TTCCACACCC[A/T]ACAACAGATG S A T P P
associated protein
1B
G1110u1 WIAF-12517 HT1096 1527 myelin associated GCCCCCTCGT[G/C]CTCACCAGCA S G C V V
glycoprotein
G1110u2 WIAF-12518 HT1096 1678 myelin associated TGTGCGCCCC[G/T]TGGTCGCCTT M G T V L
glycoprotein
G1110u3 WIAF-12522 HT1096 1271 myelin associated GCCGTGTCAC[C/T]CCACGATGAT M C T P L
glycoprotein
G1113u1 WIAF-12523 HT2242 353 myelin transcrip- AATTCCGATC[C/T]GATCCTCACC M C T R L
tion factor 1
G1116a1 WIAF-13217 HT28451 417 myelin oligodendro- CAACCTTATC[G/A]ACACCCTCTC S G A S S
cyte glycoprotein
(MOG)
G1116a2 WIAF-13219 HT28451 913 myelin oligodendro- GCAGATCACT[C/G]TTGGCCTCGT M C G L V
cyte glycoprotein
(MOG)
G1116a3 WIAF-13220 HT28451 922 myelin oligodendro- TCTTGGCCTC[G/A]TCTTCCTCTG M G A V I
cyte glycoprotein
(MOG)
G1120u1 WIAF-12525 HT3695 1200 neurofilament, TAGAGATAGC[T/C]GCTTACAGAA S T C A A
subunit H
G1123u1 WIAF-12542 HT2569 2269 OMG, oligodendro- CAGCTGCAAC[T/C]CTAACTATTC S T C T T
cyte myelin
glycoprotein
G1126u1 WIAF-12526 HT28354 626 PSEN2, presenilin 2 GAGCGAAGCA[T/C]GTGATCATGC S T C H H
(Alzheimer disease
4)
G1126u2 WIAF-12527 HT28354 494 PSEN2, presenilin 2 ATGGAGAGAA[T/C]ACTGCCCAGT S T C N N
(Alzheimer disease
4)
G1126u3 WIAF-12528 HT28354 434 PSEN2, presenilin 2 TAATGTCGGC[C/T]GAGAGCCCCA S C T A A
(Alzheimer disease
4)
G1126u4 WIAF-12543 HT28354 550 PSEN2, presenilin 2 GACCCTGACC[G/A]CTATGTCTGT M G A R H
(Alzheimer disease
4)
G117u1 WIAF-10391 HT27765 156 GTBP, G/T mismatch- ACTTCTCACC[A/G]GGAGATTTGG S A G P P
binding protein
G117u2 WIAF-10392 HT27765 420 GTBP, G/T mismatch- AACGTGCAGA[T/C]GAAGCCTTAA S T C S S
binding protein
G117u3 WIAF-10407 HT27765 939 GTBP, G/T mismatch- CCCACGTTAG[T/C]GGAGGTGGTG S T C S S
binding protein
G117u4 WIAF-10411 HT27765 1622 GTBP, G/T mismatch- CATTGTTCGA[G/A]ATTTAGGACT M G A R K
binding protein
G117u5 WIAF-10412 HT27765 2405 GTBP, G/T mismatch- GACAGCAGGG[C/T]TATAATGTAT M C T A V
binding protein
G117u6 WIAF-10413 HT27765 2387 GTBP, G/T mismatch- AAGAGTCAGA[A/T]CCACCCAGAC M A T N I
binding protein
G125u1 WIAF-10371 HT28632 1999 ATM, ataxia CAGTAATTTT[C/T]CTCATCTTGT M C T P S
telangiectasia
mutated (includes
complementation
groups A, C and D)
G125u2 WIAF-10372 HT28632 2631 ATM, ataxia TAATGAATGA[C/A]ATTGCAGATA M C A D E
telangiectasia
mutated (includes
complementation
groups A, C and D)
G125u3 WIAF-10373 HT28632 3084 ATM, ataxia CAATGGAAGA[T/G]GTTCTTGAAC M T G D E
telangiectasia
mutated (includes
complementation
groups A, C and D)
G125u5 WIAF-10375 HT28632 4767 ATM, ataxia CACTTATACC[C/T]CTTGTGTATG S C T P P
telangiectasia
mutated (includes
complementation
groups A, C and D)
G125u6 WIAF-10383 HT28632 8713 ATM, ataxia ATTCTTGGAT[C/T]CAGCTATTTG M C T P S
telangiectasia
mutated (includes
complementation
groups A, C and D)
G125u7 WIAF-10396 HT28632 1825 ATM, ataxia CACTTTGGCA[C/G]TGACCACCAG M C G L V
telangiectasia
mutated (includes
complementation
groups A, C and D)
G125u8 WIAF-10398 HT28632 2924 ATM, ataxia ACTACTGCTC[A/G]GACCAATACT M A G Q R
telangiectasia
mutated (includes
complementation
groups A, C and D)
G125u9 WIAF-10405 HT28632 8967 ATM, ataxia TTCAACGTGT[C/T]TTCACAAGAT S C T V V
telangiectasia
mutated (includes
complementation
groups A, C and D)
G125u10 WIAF-10408 HT28632 6954 ATM, ataxia CCAAACACCT[T/C]GTACAACTCT S T C L L
telangiectasia
mutated (includes
complementation
groups A, C and D)
G125u11 WIAF-10409 HT28632 6855 ATM, ataxia TTCACCACCC[T/C]ATCATCGCTC S T C P P
telangiectasia
mutated (includes
complementation
groups A, C and D)
G125u12 WIAF-10410 HT28632 6801 ATM, ataxia TATATATTAA[G/T]TGGCAGAAAC M G T K N
telangiectasia
mutated (includes
complementation
groups A, C and D)
G125u13 WIAF-10421 HT28632 335 ATM, ataxia CATTCAGATT[C/G]CAAACAAGCA M C G S C
telangiectasia
mutated (includes
complementation
groups A, C and D)
G125u14 WIAF-11607 HT28632 3966 ATM, ataxia TTCCACATCT[G/A]GTCATTAGAA S G A L L
telangiectasia
mutated (includes
complementation
groups A, C and D)
G125a15 WIAF-13130 HT28632 8642 ATM, ataxia CAGAAATATC[A/C]ACTCTTCATG M A C E A
telangiectasia
mutated (includes
complementation
groups A, C and D)
G136u1 WIAF-10388 HT3337 535 MLH1, mutL AGGAGAAAAG[C/T]TTTAAAAAAT M C T A V
(E. coli) homolog 1
(colon cancer, non-
polyposis type 2)
G136u2 WIAF-10389 HT3337 769 MLH1, mutL TTCAAAATGA[A/G]TGGTTACATA M A G N S
(E. coli) homolog 1
(colon cancer, non-
polyposis type 2)
G144u1 WIAF-11638 HT3625 1129 FOS, v-fos FBJ CCTCTGCACT[C/T]CGGTCGTCAC M C T P S
murine osteo-
sarcoma viral
oncogene homolog
G1461u1 WIAF-12562 HT0329 684 pRB-binding protein TTGCCAAGAA[G/A]TCCAAGAACC S G A K K
G1466u1 WIAF-12571 HT27849 2128 API2, apoptosis ATGATCCATG[G/C]GTAGAACATG M G C W C
inhibitor 2
G1468u1 WIAF-12563 HT4986 1928 apoptosis CCACCAGACC[A/T]GACGAGGGGC S A T P P
inhibitor, neuronal
G1468u2 WIAF-12564 HT4986 3057 apoptosis TTTGCAATTC[C/G]TTCAAGGGAG M C G L V
inhibitor, neuronal
G1472u1 WIAF-12565 HT28478 242 BAK1, BCL2- GGCACCAGTC[C/T]GGAGAGCCTG S C T C C
antagonist/killer 1
G1472u2 WIAF-12572 HT28478 509 BAK1, BCL2- TGCACCCCAC[G/A]GCAGAGAATG S G A T T
antagonist/killer 1
G1473u1 WIAF-12568 HT28606 394 CASP6, caspase 6, GGTGTCAACT[G/C]TTAGCCACGC M G C V L
apoptosis-related
cysteine protease
G1473u2 WIAF-12576 HT28606 411 CASP6, caspase 6, ACGCAGATGC[C/T]GATTGCTTTG S C T A A
apoptosis-related
cysteine protease
G1479u1 WIAF-12550 Y09077 711 ATR, ataxia ACTTTATTAA[T/C]GGTTCTTACT M T C M T
telangiectasia and
Rad3 related
G1479u2 WIAF-12551 Y09077 4303 ATR, ataxia TTGCGTATGC[T/C]GATAATAGCC S T C A A
telangiectasia and
Rad3 related
G1479u3 WIAF-12552 Y09077 1894 ATR, ataxia ATTCTGATGA[T/C]CCCTGTTTAA S T C D D
telangiectasia and
Rad3 related
G1479u4 WIAF-12553 Y09077 1855 ATR, ataxia ATTTATGTGG[T/A]ATGCTCTCAC S T A G G
telangiectasia and
Rad3 related
G1479u5 WIAF-12558 Y09077 5287 ATR, ataxia TCATTCATTA[T/C]CATGGTCTAG S T C Y Y
telangiectasia and
Rad3 related
G1479u6 WIAF-12559 Y09077 5539 ATR, ataxia CAGCTTTTTA[T/C]GACTCACTGA S T C Y Y
telangiectasia and
Rad3 related
G1479u7 WIAF-12569 Y09077 1540 ATR, ataxia ATCCTGTTAT[T/C]GAGATGTTAG S T C I I
telangiectasia and
Rad3 related
G1479u8 WIAF-12570 Y09077 2521 ATR, ataxia ATTTAATGGA[A/G]GATCCAGACA S A G E E
telangiectasia and
Rad3 related
G1482u1 WIAF-12560 HT27870 3176 BLM, Bloom syndrome AAAATATAAC[G/A]GAATCCAGGA S G A T T
G1482u2 WIAF-12561 HT27870 3605 BLM, Bloom syndrome GAAATAAAGC[C/A]CAAACTGTAC S C A A A
G1482u3 WIAF-12573 HT27870 2677 BLM, Bloom syndrome TATCTATTAC[C/T]GAAAAACCCT M C T P L
G1483u1 WIAF-12597 HT1470 1910 MYBL2, v-myb avian GGATGAGGAT[G/A]TGAAGCTGAT M G A V M
myeloblastosis
viral oncogene
homolog-like 2
G1483u2 WIAF-12610 HT1470 244 MYBL2, v-myb avian ATGAGGAGGA[C/T]GAGCAGCTGA S C T D D
myeloblastosis
viral oncogene
homolog-like 2
G1483u3 WIAF-12611 HT1470 1406 MYBL2, v-myb avian CACTCAGAAT[A/G]GCACCAGTCT M A G S G
myeloblastosis
viral oncogene
homolog-like 2
G1485u1 WIAF-12581 HT1432 1941 BCR, breakpoint TGGAGATGAG[A/G]AAATGGGTCC S A G R R
cluster region
G1485u2 WIAF-12582 HT1432 3144 BCR, breakpoint TGACCATCAA[T/C]AAGGAAGATG S T C N N
cluster region
G1485u3 WIAF-12583 HT1432 3777 BCR, breakpoint ATAACAAGGA[T/C]GTGTCGGTGA S T C D D
cluster region
G1485u4 WIAF-12603 HT1432 2831 BCR, breakpoint CAGATCAAGA[G/A]TGACATCCAG M G A S N
cluster region
G1485u5 WIAF-12608 HT1432 4217 BCR, breakpoint ATCCCTGCCC[C/T]GGACAGCAAG M C T P L
cluster region
G1486u1 WIAF-12578 HT33770 1909 BRCA2, breast ATTGATAATG[G/A]AAGCTGGCCA M G A G E
cancer 2, early
onset
G1486u2 WIAF-12579 HT33770 3623 BRCA2, breast AGTTTAGAAA[A/G]CCAAGCTACA S A G K K
cancer 2,
early onset
G1486u3 WIAF-12586 HT33770 1341 BRCA2, breast AAATGTAGCA[A/C]ATCAGAAGCC M A C N H
cancer 2, early
onset
G1486u4 WIAF-12594 HT33770 446 BRCA2, breast CTTATAATCA[G/A]CTGGCTTCAA S G A Q Q
cancer 2, early
onset
G1486u5 WIAF-12598 HT33770 3013 BRCA2, breast ACCATGGTTT[T/C]ATATGGAGAC M T C L S
cancer 2, early
onset
G1486u6 WIAF-12599 HT33770 3187 BRCA2, breast GAAAAAAATA[A/T]TGATTACATG M A T N I
cancer 2, early
onset
G1486u7 WIAF-12604 HT33770 4971 BRCA2, breast AGCATGTGAG[A/C]CCATTGAGAT M A C T P
cancer 2, early
onset
G1486u8 WIAF-12607 HT33770 4034 BRCA2, breast ATGATTCTGT[C/T]GTTTCAATGT S C T V V
cancer 2, early
onset
G1487u1 WIAF-12584 HT27632 2536 BRCA1, breast AGTCAGTGTG[C/G]AGCATTTGAA M C G A G
cancer 1, early
onset
G1487u2 WIAF-12587 HT27632 4697 BRCA1, breast CATCTCAAGA[G/C]GAGCTCATTA M G C E D
cancer 1, early
onset
G1487u3 WIAF-12595 HT27632 469 BRCA1, breast TCTCCTGAAC[A/G]TCTAAAAGAT M A G H R
cancer 1, early
onset
G1487u4 WIAF-12600 HT27632 3667 BRCA1, breast AGCGTCCAGA[A/G]AGGAGAGCTT M A G K R
cancer 1, early
onset
G1487u5 WIAF-12601 HT27632 3537 BRCA1, breast TATGGGAAGT[A/G]GTCATGCATC M A G S G
cancer 1, early
onset
G1487u6 WIAF-12602 HT27632 4956 BRCA1, breast ATCTGCCCAG[A/G]GTCCAGCTGC M A G S G
cancer 1, early
onset
G1487u7 WIAF-12605 HT27632 2090 BRCA1, breast AGTACAACCA[A/G]ATGCCAGTCA S A G Q Q
cancer 1, early
onset
G1487u8 WIAF-12614 HT27632 233 BRCA1, breast TCTCCACAAA[G/A]TGTGACCACA S G A K K
cancer 1, early
onset
G1492u1 WIAF-12585 HT3506 3912 cell death- TCCAGGTCCG[T/C]GGCCTGGAGA S T C R R
associated kinase
G1492u2 WIAF-12593 HT3506 4352 cell death- TACAACACCA[A/G]TAACGGGGCT M A G N S
associated kinase
G1492u3 WIAF-12606 HT3506 2127 cell death- GCAATTTGGA[C/T]ATCTCCAACA S C T D D
associated kinase
G1492u4 WIAF-12612 HT3506 1605 cell death- TCAAATTTCT[C/T]ACTGACAACA S C T L L
associated kinase
G1494u1 WIAF-12589 HT28507 366 cell death-inducing TTCACCACAC[T/C]TAAGGAGAAC M T C L P
protein Bik
G1495u1 WIAF-12580 HT27803 759 CSE1L, chromosome TTTCTTCCCT[G/C]ATCCTGATCT S G C L L
segregation 1
(yeast homolog)-
like
G1501u1 WIAF-13502 HT1949 1181 MCC, mutated in CAGCAATGAC[A/C]TTCCCATCGC M A C I L
colorectal cancers
G1501u2 WIAF-13503 HT1949 1753 MCC, mutated in CAGCTGAGAA[C/T]GCTGCCAAGG S C T N N
colorectal cancers
G1501u3 WIAF-13504 HT1949 2344 MCC, mutated in TGTCCCTAGC[T/C]GAACTCAGGA S T C A A
colorectal cancers
G1501u4 WIAF-13521 HT1949 445 MCC, mutated in AGCGAACGAC[G/A]CTTCGCTATG S G A T T
colorectal cancers
G1501u5 WIAF-13522 HT1949 1504 MCC, mutated in AAAGCAATGC[T/C]GAGAGGATGA S T C A A
colorectal cancers
G1501u6 WIAF-13527 HT1949 2511 MCC, mutated in TTCGTGAATG[A/G]TCTAAAGCGG M A G D G
colorectal cancers
G1502u1 WIAF-12633 HT1547 870 CCND1, cyclin D1 AGTGTGACCC[A/G]GACTGCCTCC S A G P P
(PRAD1: para-
thyroid adeno-
matosis 1)
G1503u1 WIAF-13741 U37022 1151 CDK4, cyclin- CATGCCAATT[G/A]CATCGTTCAC M G A C Y
dependent kinase 4
G1503u2 WIAF-13742 U37022 1410 CDK4, cyclin- CTGAAGCCCA[C/T]CAGTTGGGCA S C T D D
dependent kinase 4
G1503u3 WIAF-13743 U37022 1328 CDK4, cyclin- TATGCAACAC[C/T]TGTGGACATG M C T P L
dependent kinase 4
G1503u4 WIAF-13780 U37022 1194 CDK4, cyclin- TTCTGGTCAC[A/G]AGTGGTCCAA S A G T T
dependent kinase 4
G1503u5 WIAF-13781 U37022 1443 CDK4, cyclin- TGATTGGGCT[G/A]CCTCCAGAGG S G A L L
dependent kinase 4
G1503u6 WIAF-13787 U37022 1633 CDK4, cyclin- CTCTTATCTA[C/T]ATAAGGATGA M C T H Y
dependent kinase 4
G1517u1 WIAF-12618 HT1132 3894 ERBB3, v-erb-b2 CAGACCTCAG[T/C]GCCTCTCTGG S T C S S
avian erythro-
blastic leukemia
viral oncogene
homolog 3
G152u1 WIAF-11608 HT3854 1673 HSPA1L, heat shock GTGAGTGATG[A/C]AGGTTTCAAG M A C E A
70 kD protein like
1
G152u2 WIAF-11629 HT3854 1683 HSPA1L, heat shock AAGGTTTGAA[G/A]GGCAAGATTA S G A K K
70 kD protein like
1
G152u3 WIAF-11609 HT3854 1478 HSPA1L, heat shock GTCACAGCCA[C/T]GGACAAGAGC M C T T M
70 kD protein like
1
G152u4 WIAF-11610 HT3854 1443 HSPA1L, heat shock TGACGTTTCA[C/T]ATTGATGCCA S C T D D
70 kD protein like
1
G1520u1 WIAF-12162 HT1175 2211 DNA excision repair TGACCGTGGA[C/T]GAGGGTGTCC S C T D D
protein ERCC2, 5′
end
G1520u2 WIAF-12166 HT1175 546 DNA excision repair CCCACTGCCG[A/C]TTCTATGAGG S A C R R
protein ERCC2, 5′
end
G1527u1 WIAF-12168 HT0086 577 GSTM2, glutathione TCATCTCCCG[A/C]TTTGAGGGCT S A C R R
S-transferase M2
(muscle)
G1527u2 WIAF-12169 HT0086 644 GSTM2, glutathione ACCTGTGTTC[A/T]CAAAGATGGC M A T T S
S-transferase M2
(muscle)
G1527u3 WIAF-12171 HT0086 100 GSTM2, glutathione ACTCAAGCTA[C/T]GAGGAAAAGA S C T Y Y
S-transferase M2
(muscle)
G1527u4 WIAF-12172 HT0086 41 GSTM2, glutathione GGGGTACTGG[A/G]ACATCCGCGG M A G N D
S-transferase M2
(muscle)
G1527u5 WIAF-12173 HT0086 215 GSTM2, glutathione GATTGATGGG[A/G]CTCACAAGAT M A G T A
S-transferase M2
(muscle)
G1527u6 WIAF-12194 HT0086 238 GSTM2, glutathione CCCAGAGCAA[T/C]GCCATCCTGC S T C N N
S-transferase M2
(muscle)
G1528u1 WIAF-11950 HT1811 529 GSTM3, glutathione GTATATTTGA[C/G]CCCAAGTGCC M C G D E
S-transferase M3
(brain)
G1528u2 WIAF-11951 HT1811 674 GSTM3, glutathione CAACAAGCCT[G/A]TATGCTGAGC M G A V I
S-transferase M3
(brain)
G1528u3 WIAF-11989 HT1811 572 GSTM3, glutathione GGCTTTCATG[T/G]GCCGTTTTGA M T G C G
S-transferase M3
(brain)
G1528u4 WIAF-13470 HT1811 240 GSTM3, glutathione CAGAGCAATG[C/A]CATCTTGCGC M C A A D
S-transferase M3
(brain)
G1529u1 WIAF-14146 HT2006 797 GSTM4, glutathione TGGACGCCTT[C/T]CCAAATCTGA S C T F F
S-transferase M4
G153u1 WIAF-12163 HT3856 1212 HSPA1B, heat shock TGGGGCTGGA[G/A]ACGGCCGGAG S G A E E
70 kD protein 1
G153u2 WIAF-12182 HT3856 676 HSPA1B, heat shock GGCCGGGGAC[A/G]CCCACCTGGG M A G T A
70 kD protein 1
G153u3 WIAF-12183 HT3856 1695 HSPA1B, heat shock TCAGCGAGGC[C/G]GACAAGAAGA S C G A A
70 kD protein 1
G153u4 WIAF-12189 HT3856 330 HSPA1B, heat shock ACAAGGGGGA[G/C]ACCAAGGCAT M G C E D
70 kD protein 1
G153u5 WIAF-12190 HT3856 1053 HSPA1B, heat shock AGCTGCTGCA[A/G]GACTTCTTCA S A G Q Q
70 kD protein 1
G1530u1 WIAF-11964 HT3010 673 GSTM5, glutathione ATTCCTCCGA[G/A]GTCTTTTGTT M G A G S
S-transferase M5
G1530u2 WIAF-11995 HT3010 593 GSTM5, glutathione GACGCCTTCC[T/C]AAACTTGAAG M T C L P
S-transferase M5
G1530u3 WIAF-13473 HT3010 693 GSTM5, glutathione TTGGAAAGTC[A/G]GCTACATGGA S A G S S
S-transferase M5
G1533u1 WIAF-13458 HT27460 543 GSTT2, glutathione CTCTCGGCTA[C/T]GAACTGTTTG S C T Y Y
S-transferase
theta 2
G1533u2 WIAF-13460 HT27460 417 GSTT2, glutathione GGACTGCCAT[G/A]GACCAGGCCC M G A M I
S-transferase
theta 2
G1533u3 WIAF-13461 HT27460 359 GSTT2, glutathione CAGGTGTTGG[G/A]GCCACTCATT M G A G E
S-transferase
theta 2
G1533u4 WIAF-13462 HT27460 363 GSTT2, glutathione TGTTGGGGCC[A/C]CTCATTGGGG S A C P P
S-transferase
theta 2
G1533u5 WIAF-13463 HT27460 385 GSTT2, glutathione CCAGGTGCCC[G/A]AGGAGAAGGT M G A E K
S-transferase
theta 2
G1535u1 WIAF-11952 HT0436 517 HCK, hemopoietic CCGCGTTGAC[T/C]CTCTGGACAC M T C S P
cells kinase
G1535u2 WIAF-12013 HT0436 783 HCK, hemopoietic TGGACCACTA[C/T]AAGAAGGGGA S C T Y Y
cells kinase
G1535u3 WIAF-13464 HT0436 357 HCK, hemopoietic TCATCGTGGT[T/C]GCCCTGTATG S T C V V
cells kinase
G1535u4 WIAF-13465 HT0436 387 HCK, hemopoietic CCATTCACCA[C/T]GAAGACCTCA S C T H H
cells kinase
G1535u5 WIAF-13466 HT0436 471 HCK, hemopoietic CCCTGGCCAC[C/G]CGGAAGGACG S C G T T
cells kinase
G1535u6 WIAF-13467 HT0436 240 HCK, hemopoietic CCAGCGCCAG[C/T]CCACACTCTC S C T S S
cells kinase
G1535u7 WIAF-13468 HT0436 394 HCK, hemopoietic CCACGAAGAC[C/T]TCAGCTTCCA M C T L F
cells kinase
G1537u1 WIAF-12020 U04045 1514 MSH2, mutS GTGAATTAAG[A/G]GAAATAATCA S A G R R
(E. coli) homolog 2
(colon cancer, non-
polyposis type 1)
G1537u2 WIAF-12044 U04045 599 MSH2, mutS GACTGTGTGA[A/T]TTCCCTGATA M A T E D
(E. coli) homolog 2
(colon cancer, non-
polyposis type 1)
G1537u3 WIAF-12045 U04045 1452 MSH2, mutS AGATATGGAT[C/T]AGGTGGAAAA N C T Q *
(E. coli) homolog 2
(colon cancer, non-
polyposis type 1)
G1537u4 WIAF-12076 U04045 938 MSH2, mutS GACACTTTGA[A/T]CTGACTACTT M A T E D
(E. coli) homolog 2
(colon cancer, non-
polyposis type 1)
G1537u5 WIAF-12077 U04045 1878 MSH2, mutS TCAGCTAGAT[G/A]CTGTTGTCAG M G A A T
(E. coli) homolog 2
(colon cancer, non-
polyposis type 1)
G1543u1 WIAF-13856 J00119 553 MOS, v-mos Moloney GAGTTTCTGG[G/T]CTGAGCTCAA M G T A S
murine sarcoma
viral oncogene
homolog
G1543u2 WIAF-13857 J00119 621 MOS, v-mos Moloney GCACGCGCAC[G/A]CCCGCAGGGT S G A T T
murine sarcoma
viral oncogene
homolog
G1544u1 WIAF-12018 U59464 3821 PTCH, patched CATCCCGAAT[C/T]CAGGCATCAC M C T S F
(Drosophila)
homolog
G1544u2 WIAF-12019 U59464 3618 PTCH, patched GCGTGGTCCG[C/T]TTCGCCATGC S C T R R
(Drosophila)
homolog
G1544u3 WIAF-12027 U59464 1761 PTCH, patched ATTTTGCCAT[G/T]GTTCTGCTCA M G T M I
(Drosophila)
homolog
G1544u4 WIAF-12029 U59464 4074 PTCH, patched CTGCCATGGG[C/T]AGCTCCGTGC S C T G G
(Drosophila)
homolog
G1544u5 WIAF-12043 U59464 3845 PTCH, patched CCCTCGAACC[C/T]GAGACAGCAG M C T P L
(Drosophila)
homolog
G1544u6 WIAF-12056 U59464 1433 PTCH, patched CTGCTGGTTG[C/T]ACTGTCAGTG M C T A V
(Drosophila)
homolog
G1544u7 WIAF-12058 U59464 3298 PTCH, patched CACCGTTCAC[G/C]TTGCTTTGGC M G C V L
(Drosophila)
homolog
G1544u8 WIAF-12062 U59464 3986 PTCH, patched TCTACTGAAG[G/A]GCATTCTGGC M G A G E
(Drosophila)
homolog
G1544u9 WIAF-13489 U59464 1665 PTCH, patched CCATCAGCAA[T/C]GTCACAGCCT S T C N N
(Drosophila)
homolog
G1544u10 WIAF-13490 U59464 2396 PTCH, patched AAATACTTTT[C/T]TTTCTACAAC M C T S F
(Drosophila)
homolog
G1544u11 WIAF-13491 U59464 2199 PTCH, patched GGACACTCTC[A/G]TCTTTTGCTG S A G S S
(Drosophila)
homolog
G1544u12 WIAF-13492 U59464 2222 PTCH, patched AAGCACTATG[C/T]TCCTTTCCTC M C T A V
(Drosophila)
homolog
G1544u13 WIAF-13500 U59464 1686 PTCH, patched TCTTCATGGC[C/T]GCGTTAATCC S C T A A
(Drosophila)
homolog
G1545u1 WIAF-12032 HT0473 1835 RAG1, recombina- GGACATGGAA[G/A]AAGACATCTT M G A E K
tion activating
gene 1
G1545u2 WIAF-12035 HT0473 2519 RAG1, recombina- TGACATTGGC[A/G]ATGCAGCTGA M A G N D
tion activating
gene 1
G1545u3 WIAF-12046 HT0473 3045 RAG1, recombina- CGGAAAATGA[A/G]TGCCAGGCAG M A G N S
tion activating
gene 1
G1545u4 WIAF-12047 HT0473 3146 RAG1, recombina- TCATAATGCA[T/C]TAAAAACCTC S T C L L
tion activating
gene 1
G1545u5 WIAF-12075 HT0473 2513 RAG1, recombina- CCACTGTGAC[A/T]TTGGCAATGC M A T I F
tion activating
gene 1
G1545u6 WIAF-13484 HT0473 1322 RAG1, recombina- GTCGCTGACT[C/T]GGAGAGCTCA M C T R W
tion activating
gene 1
G1545u7 WIAF-13494 HT0473 2571 RAG1, recombination GAAGTGTATA[A/G]GAATCCCAAT M A G K R
activating
gene 1
G1545u8 WIAF-13498 HT0473 1018 RAG1, recombina- TTCTGGCTGA[C/A]CCTGTGGAGA M C A D E
tion activating
gene 1
G1545u9 WIAF-13499 HT0473 2782 RAG1, recombina- ATCTTTACCT[G/C]AAGATGAAAC S G C L L
tion activating
gene 1
G1548u1 WIAF-12015 HT4999 133 IFI27, interferon, CTCTGCCGTA[G/A]TTTTGCCCCT M G A V I
alpha-inducible
protein 27
G1548u2 WIAF-13482 HT4999 380 IFI27, interferon, ATCCTGGGCT[C/T]CATTGGGTCT M C T S F
alpha-inducible
protein 27
G1548u3 WIAF-13483 HT4999 135 IFI27, interferon, CTGCCGTAGT[T/C]TTGCCCCTGG S T C V V
alpha-inducible
protein 27
G155u1 WIAF-11634 HT3962 991 CHC1, chromosome AGCTGGATGT[G/A]CCTGTGGTAA S G A V V
condensation 1
G155u2 WIAF-11635 HT3962 1271 CHC1, chromosome CGGCTTCGGC[C/T]TCTCCAACTA M C T L F
condensation 1
G155u3 WIAF-11636 HT3962 1192 CHC1, chromosome GCCGGGGCCA[C/T]GTGAGATTCC S C T H H
condensation 1
G155u4 WIAF-11637 HT3962 1267 CHC1, chromosome TGTACGGCTT[C/T]GGCCTCTCCA S C T F F
condensation 1
G155u5 WIAF-11649 HT3962 1657 CHC1, chromosome TGATGGGCAA[A/G]CAGCTGGAGA S A G K K
condensation 1
G1550u1 WIAF-12057 M16038 611 LYN, v-yes-1 Yama- GCAAAGTCCC[T/G]TTTAACAAAA M T G L R
guchi sarcoma viral
related oncogene
homolog
G1550u2 WIAF-12061 M16038 1371 LYN, v-yes-1 Yama- TGGCATACAT[C/T]GAGCGGAAGA S C T I I
guchi sarcoma viral
related oncogene
homolog
G1550u3 WIAF-12080 M16038 1059 LYN, v-yes-1 Yama- AAAGGCTTGG[C/T]GCTGGGCAGT S C T G G
guchi sarcoma viral
related oncogene
homolog
G1550u4 WIAF-12081 M16038 996 LYN, v-yes-1 Yama- AGCCACAGAA[G/A]CCATGGGATA S G A K K
guchi sarcoma viral
related oncogene
homolog
G1552u1 WIAF-12030 HT4578 2355 PMS1, postmeiotic CCTGCTATTT[A/T]AAAGACTTCT N A T K *
segregation
increased
(S. cerevisiae) 1
G1552u2 WIAF-12031 HT4578 2231 PMS1, postmeiotic ACAAAGTTGA[C/T]TTAGAAGAGA S C T D D
segregation
increased
(S. cerevisiae) 1
G1552u3 WIAF-12040 HT4578 617 PMS1, postmeiotic TCATGAGCTT[T/C]GGTATCCTTA S T C F F
segregation
increased
(S. cerevisiae) 1
G1552u4 WIAF-12063 HT4578 1723 PMS1, postmeiotic TCATGTAACA[A/G]AAAATCAAAT M A G K R
segregation
increased
(S. cerevisiae) 1
G1552u5 WIAF-12064 HT4578 1732 PMS1, postmeiotic AAAAAATCAA[A/G]TGTAATAGAT M A G N S
segregation
increased
(S. cerevisiae) 1
G1552u6 WIAF-12065 HT4578 1660 PMS1, postmeiotic TTACCATGTA[A/G]AGTAAGTAAT M A G K R
segregation
increased
(S. cerevisiae) 1
G1552u7 WIAF-12066 HT4578 1975 PMS1, postmeiotic GAACGATACA[A/G]TAGTCAAATG M A G N S
segregation
increased
(S. cerevisiae) 1
G1552u8 WIAF-12067 HT4578 1881 PMS1, postmeiotic TTTAGAGGAT[G/T]CAACACTACA M G T A S
segregation
increased
(S. cerevisiae) 1
G1552u9 WIAF-12068 HT4578 2454 PMS1, postmeiotic TTTAGACGTT[T/A]TATATAAAAT M T A L I
segregation
increased
(S. cerevisiae) 1
G1552u10 WIAF-12069 HT4578 2457 PMS1, postmeiotic AGACGTTTTA[T/C]ATAAAATGAC M T C Y H
segregation
increased
(S. cerevisiae) 1
G1552u11 WIAF-12082 HT4578 2557 PMS1, postmeiotic ATACCAGGAC[T/C]TTCAATTACT M T C V A
segregation
increased
(S. cerevisiae) 1
G1552u12 WIAF-12083 HT4578 971 PMS1, postmeiotic TTTTCTTTCT[G/T]AAAATCGATG S G T L L
segregation
increased
(S. cerevisiae) 1
G1554u1 WIAF-12028 HT4161 1500 ELK3, ELK3, ETS- CTCAGAAATC[C/T]TGATGACCTC S C T S S
domain protein (SRF
accessory protein
2) NOTE: Symbol
and name
provisional.
G1554u2 WIAF-12059 HT4161 1380 ELK3, ELK3, ETS- CTGCCAGGCT[G/A]CAAGGGCCAA S G A L L
domain protein (SRF
accessory protein
2) NOTE: Symbol
and name
provisional.
G1554u3 WIAF-12060 HT4161 1436 ELK3, ELK3, ETS- CACATGCCAG[T/C]GCCAATCCCC M T C V A
domain protein (SRF
accessory protein
2) NOTE: Symbol
and name
provisional.
G1562u1 WIAF-12024 HT28220 804 PDCD1, programmed GGGGCTCAGC[T/C]GACGGCCCTC S T C A A
cell death 1
G1562u2 WIAF-13488 HT28220 644 PDCD1, programmed GACCCCTCAG[C/T]CGTGCCTCTG M C T A V
cell death 1
G1563u1 WIAF-13493 HT1187 1748 EGFR, epidermal CCGGAGCCCA[G/A]GGACTGCGTC M G A R K
growth factor
receptor (avian
erythroblastic
leukemia viral
(v-erb-b) oncogene
homolog)
G1563u2 WIAF-13497 HT1187 2073 EGFR, epidermal ACGGATGCAC[T/A]GGGCCAGGTC S T A T T
growth factor
receptor (avian
erythroblastic
leukemia viral
(v-erb-b) oncogene
homolog)
G1566u1 WIAF-12016 HT27594 235 PDCD2, programmed GCGCCGCTGC[C/G]TGGCCGCCCG M C G P R
cell death 2
G1566u2 WIAF-12033 HT27594 904 PDCD2, programmed TTGGAATTCC[A/G]GGTCATGCCT M A G Q R
cell death 2
G1566u3 WIAF-12041 HT27594 331 PDCD2, programmed AATCAACTAC[C/T]CAGGAAAAAC M C T P L
cell death 2
G1566u4 WIAF-12071 HT27594 649 PDCD2, programmed CCTGAGGTTG[T/C]GGAAAAGGAA M T C V A
cell death 2
G1566u5 WIAF-12072 HT27594 633 PDCD2, programmed AGAAGATGAG[A/T]TTATGCCTGA M A T I F
cell death 2
G1567u1 WIAF-12042 M95936 293 AKT2, v-akt murine GAGAGGCCGC[G/A]ACCCAACACC M G A R Q
thymoma viral
oncogene homolog 2
G1572u1 WIAF-12212 HT3998 1894 proto-oncogene c- TGTTCCAGGA[A/G]TCCAGTATCT S A G E E
abl, tyrosine
protein kinase,
alt. transcript 2
G1572u2 WIAF-12233 HT3998 3694 proto-oncogene c- AGCTTCAGAT[C/T]TGCCCGGCGA S C T I I
abl, tyrosine
protein kinase,
alt. transcript 2
G1572u3 WIAF-12234 HT3998 3721 proto-oncogene c- GCAGTGGTCC[G/A]GCGGCCACTC S G A P P
abl, tyrosine
protein kinase,
alt. transcript 2
G1573u1 WIAF-12021 HT0642 343 CBL, Cas-Br-M TCATGGACAA[G/C]CTGGTGCGGT M G C K N
(murine) ecotropic
retroviral
transforming
sequence
G1573u2 WIAF-12022 HT0642 363 CBL, Cas-Br-M TTGTGTCAGA[A/T]CCCAAAGCTG M A T N I
(murine) ecotropic
retroviral
transforming
sequence
G1573u3 WIAF-12034 HT0642 2364 CBL, Cas-Br-M AATATTCAGT[C/T]CCAGGCGCCA M C T S P
(murine) ecotropic
retroviral
transforming
sequence
G1573u4 WIAF-12049 HT0642 387 CBL, Cas-Br-M CTAAAGAATA[G/A]CCCACCTTAT M G A S N
(murine) ecotropic
retroviral
transforming
sequence
G1573u5 WIAF-12050 HT0642 947 CBL, Cas-Br-M AACTCATCCT[G/A]GCTACATGGC M G A G S
(murine) ecotropic
retroviral
transforming
sequence
G1573u6 WIAF-12070 HT0642 2740 CBL, Cas-Br-M TCGAGAACCT[C/T]ATGAGTCAGG S C T L L
(murine) ecotropic
retroviral
transforming
sequence
G1573u7 WIAF-12073 HT0642 661 CBL, Cas-Br-M TCTTTCCAAG[T/C]GGACTCTTTC S T C S S
(murine) ecotropic
retroviral
transforming
sequence
G1573u8 WIAF-12074 HT0642 2569 CBL, Cas-Br-M CTCTGGATGG[T/C]GATCCTACAA S T C G G
(murine) ecotropic
retroviral
transforming
sequence
G1573u9 WIAF-13486 HT0642 2006 CBL, Cas-Br-M CCGGCACTCA[C/T]TTCCATTTTC M C T L F
(murine) ecotropic
retroviral
transforming
sequence
G1574u1 WIAF-12037 HT1508 2493 FES, feline sarcoma AGCGGCCCAG[C/T]TTCAGCACCA S C T S S
(Snyder-Theilen)
viral (v-fes)/
Fujinami avian
sarcoma (PRCII)
viral (v-
fps) oncogene
homolog
G1574u2 WIAF-12051 HT1508 189 FES, feline sarcoma CCCAGCGGGT[C/T]AAGAGTGACA S C T V V
(Snyder-Theilen)
viral (v-fes)/
Fujinami avian
sarcoma (PRCII)
viral (v-
fps) oncogene
homolog
G1574u3 WIAF-12052 HT1508 1441 FES, feline sarcoma GAAGCCCCTG[C/T]ATGAGCAGCT M C T H Y
(Snyder-Theilen)
viral (v-fes)/
Fujinami avian
sarcoma (PRCII)
viral (v-
fps) oncogene
homolog
G1574u4 WIAF-12053 HT1508 2202 FES, feline sarcoma GAGAGGAAGC[C/T]GATGGGGTCT S C T A A
(Snyder-Theilen)
viral (v-fes)/
Fujinami avian
sarcoma (PRCII)
viral (v-fps)
oncogene homolog
G1574u5 WIAF-12054 HT1508 2088 FES, feline sarcoma CTGCTGGCAT[G/T]GAGTACCTGG M G T M I
(Snyder-Theilen)
viral (v-fes)/
Fujinami avian
sarcoma (PRCII)
viral (v-fps)
oncogene homolog
G1574u6 WIAF-12078 HT1508 1577 FES, feline sarcoma GATGGTCTGC[C/T]CCGGCACTTC M C T P L
(Snyder-Theilen)
viral (v-fes)/
Fujinami avian
sarcoma (PRCII)
viral (v-fps)
oncogene homolog
G1574u7 WIAF-13495 HT1508 579 FES, feline sarcoma GTGACAAGGC[T/C]AAGGACAAGT S T C A A
(Snyder-Theilen)
viral (v-fes)/
Fujinami avian
sarcoma (PRCII)
viral (v-fps)
oncogene homolog
G1575u1 WIAF-12079 HT1052 963 FGR, Gardner- TGGGCACCGG[C/T]TGCTTCGGGG S C T G G
Rasheed feline
sarcoma viral
(v-fgr) oncogene
homolog
G1575u2 WIAF-13487 HT1052 232 FGR, Gardner- CAGAAGCTAC[G/A]GGGCAGCAGA M G A G R
Rasheed feline
sarcoma viral
(v-fgr) oncogene
homolog
G1585u1 WIAF-12017 HT1675 996 CRK, v-crk avian TGGATCAACA[G/A]AATCCCGATG S G A Q Q
sarcoma virus CT10
oncogene homolog
G1585u2 WIAF-12036 HT1675 446 CRK, v-crk avian ACTACAACGT[T/C]GATAGAACCA M T C L S
sarcoma virus CT10
oncogene homolog
G1587u1 WIAF-12023 HT0590 1473 proto-oncogene dbl GGCCAATCCA[A/G]TTTGTGGTAC S A G Q Q
G1587u2 WIAF-12025 HT0590 2549 proto-oncogene dbl GTCCAGGCTT[C/T]TAATGTAGAT M C T S F
G1587u3 WIAF-12026 HT0590 2828 proto-oncogene dbl GCATCACAAT[C/T]TGCAGAAATC M C T S F
G1587u4 WIAF-12038 HT0590 982 proto-oncogene dbl AAATTCTCAG[G/C]AGCTATTATC M G C E Q
G1587u5 WIAF-12039 HT0590 2343 proto-oncogene dbl AACCAATGCA[G/T]CGACACCTTT M G T Q H
G1587u6 WIAF-12048 HT0590 683 proto-oncogene dbl GACACTGAAG[G/A]AGCTGTCAGT M G A G E
G1587u7 WIAF-12055 HT0590 2686 proto-oncogene dbl TTCTCTTCAG[C/T]AGAATGATGA N C T Q *
G1587u8 WIAF-13485 HT0590 2136 proto-oncogene dbl ACTGTGAAGG[T/A]TCTGCTCTGT S T A G G
G1587u9 WIAF-13496 HT0590 1566 proto-oncogene dbl AAAATCAGAG[C/T]AACTTAAAAA S C T S S
G159u1 WIAF-11616 HT4209 1059 RAD23B, RAD23 AGTACTGGGG[C/T]TCCTCAGTCT M C T A V
(S. cerevisiae)
homolog B
G1590u1 WIAF-13897 HT2455 1257 ETS2, v-ets avian GCCAGTCTCT[C/G]TGCCTCAATA S C G L L
erythroblastosis
virus E26 oncogene
homolog 2
G1590u2 WIAF-13913 HT2455 1107 ETS2, v-ets avian ATTCTGGGAC[T/G]CCCAAAGACC S T G T T
erythroblastosis
virus E26 oncogene
homolog 2
G1590u3 WIAF-13914 HT2455 1314 ETS2, v-ets avian GGAGTGACCC[A/G]GTGGAGCAAG S A G P P
erythroblastosis
virus E26 oncogene
homolog 2
G1591u1 WIAF-13924 HT2333 417 HRAS, v-Ha-ras TCCAGAACCA[T/C]TTTGTGGACG S T C H H
Harvey rat sarcoma
viral oncogene
homolog
G1595u1 WIAF-12262 HT33778 1302 proto-oncogene GCATACCTCA[G/C]TGGCTACTAA M G C S T
L-myc, alt.
transcript 1
G1597u1 WIAF-12243 HT0410 900 MAS1, MAS1 oncogene CCATCTTGGT[C/T]GTGAAGATCC S C T V V
G160u1 WIAF-11630 HT4247 690 RAD23A, RAD23 AGAGCCAGGT[A/G]TCGGAGCAGC S A G V V
(S. cerevisiae)
homolog
G1602u1 WIAF-14180 HT1903 1321 proto-oncogene GTCGCCGGGG[C/A]CCAGCAAATA M C A P T
pim-1
G1604u1 WIAF-12319 HT2788 1182 REL, v-rel avian CCTCCCAAAG[T/C]GCTGGGATTA S T C S S
reticuloendo-
theliosis viral
oncogene homolog
G1609u1 WIAF-12358 HT33646 348 RIPK1, receptor GACGCACGGT[C/T]TCCCATGACC S C T V V
(TNFRSF) interact-
ing serine-
threonine kinase
1
G161u1 WIAF-11654 HT4251 1522 DNA repair and TATGATCCAT[C/T]TTAACTGAGG M C T S F
recombination
homolog RAD52
G1610a1 WIAF-12101 HT27727 501 replication TGCAACTCCT[G/A]CTATTAAGAC M G A A T
protein Rpa4,
30 kDa
G1610a2 WIAF-12102 HT27727 554 replication TACCGTGTAA[C/T]GTGAACCAGC S C T N N
protein Rpa4,
30 kDa
G1610u3 WIAF-12307 HT27727 450 replication TTCTGCTGCT[G/A]ATGGAGCGAG M G A D N
protein Rpa4,
30 kDa
G1610u4 WIAF-12320 HT27727 1037 replication TGATTCATGA[G/C]TGTCCTCATC M G C E D
protein Rpa4,
30 kDa
G1610u5 WIAF-12321 HT27727 857 replication TAGAGGACAT[G/A]AACGAGTTCA M G A M I
protein Rpa4,
30 kDa
G1610u6 WIAF-12343 HT27727 539 replication GAATTCAGGA[C/T]GTTGTACCGT S C T D D
protein Rpa4,
30 kDa
G1630u1 WIAF-12302 HT3563 4312 DCC, deleted in ACTCATGAAG[C/T]AGCTTAATGC N C T Q *
colorectal
carcinoma
G1632u1 WIAF-13572 HT27355 742 tumor suppressor, TTTATGACAT[G/C]AAGCGGGGCT M G C M I
PDGF receptor
beta-like
G1632u2 WIAF-13584 HT27355 1102 tumor suppressor, TGGAAGACTT[C/T]GAGACGATTG S C T F F
PDGF receptor
beta-like
G1632u3 WIAF-13601 HT27355 258 tumor suppressor, AAGACGCAGT[C/T]TATCATGATG M C T S F
PDGF receptor
beta-like
G1633u1 WIAF-13957 HT1778 1263 FER, fer (fps/ TTCAGGCAAA[T/C]GAGATCATGT S T C N N
fes related)
tyrosine kinase
(phosphoprotein
NCP94)
G1633u2 WIAF-13958 HT1778 2407 FER, fer (fps/ TATGTTGTAT[C/T]TCGAGAGTAA M C T L F
fes related)
tyrosine kinase
(phosphoprotein
NCP94)
G1634u1 WIAF-13505 HT3216 1569 ELK1, ELK1, member TCTCGACCCC[C/T]GTGGTGCTCT S C T P P
of ETS oncogene
family
G1634u2 WIAF-13858 HT3216 456 ELK1, ELK1, member GGCTGTGGGG[A/G]CTACGCAAGA S A G G G
of ETS oncogene family
G1634u3 WIAF-13859 HT3216 745 ELK1, ELK1, member AGGCCCAGGC[G/A]GTTTGGCACG M G A G S
of ETS oncogene
family
G1638u1 WIAF-14172 HT1224 98 uracil-DNA GCTGGGACCT[G/C]TTCCACAAAT G C
glycosylase
G1643u1 WIAF-13517 HT3751 629 DXS648E, DNA seg- TACATCCCCA[G/A]TCGTGGCCCT M G A S N
ment on chromosome
X (unique) 648
expressed sequence
G1645u1 WIAF-14087 D21089 363 XPC, xeroderma AAAACCTCAA[G/A]GTTATAAAGG S G A K K
pigmentosum, com-
plementation group
C
G1645u2 WIAF-14088 D21089 2166 XPC, xeroderma TGCATTCCAG[G/A]CACACGTGGC S G A R R
pigmentosum, com-
plementation group
C
G1645u3 WIAF-14089 D21089 1580 XPC, xeroderma GGGAGCCATC[G/A]TAAGGACCCA M G A R H
pigmentosum, com-
plementation group
C
G1645u4 WIAF-14090 D21089 1601 XPC, xeroderma AGCTTGCCAG[T/C]GGCATCCTCA M T C V A
pigmentosum, com-
plementation group
C
G1645u5 WIAF-14091 D21089 2920 XPC, xeroderma CCCATTTGAG[A/C]AGCTGTGAGC M A C K Q
pigmentosum, com-
plementation group
C
G1645u6 WIAF-14103 D21089 405 XPC, xeroderma ATGACCTCAG[G/A]GACTTTCCAA S G A R R
pigmentosum, com-
plementation group
C
G1645u7 WIAF-14104 D21089 151 XPC, xeroderma GGGACGCGAA[C/G]TGCGCAGCCA M C G L V
pigmentosum, com-
plementation group
C
G1645u8 WIAF-14105 D21089 2133 XPC, xeroderma AAGCGGTCTA[C/T]TCCAGGGATT S C T Y Y
pigmentosum, com-
plementation group
C
G167u1 WIAF-11632 HT4579 83 PMS2L8, postmeiotic CCTATTCATC[G/A]GAAGTCAGTC M G A R Q
segregation
increased 2-like 8
G167u2 WIAF-11633 HT4579 219 PMS2L8, postmeiotic GAGTGGATCT[T/C]ATTGAAGTTT S T C L L
segregation
increased 2-like 8
G167u3 WIAF-11644 HT4579 768 PMS2L8, postmeiotic TGCCCCCTAG[T/C]GACTCCGTGT S T C S S
segregation
increased 2-like 8
G167u4 WIAF-11622 HT4579 1645 PMS2L8, postmeiotic GAAAGCGCCT[G/A]AAACTGACGA M G A E K
segregation
increased 2-like 8
G167u5 WIAF-11645 HT4579 1512 PMS2L8, postmeiotic ACTCGGGGCA[C/T]GGCAGCACTT S C T H H
segregation
increased 2-like 8
G167u6 WIAF-11646 HT4579 1619 PMS2L8, postmeiotic TCGCAGGAAC[A/G]TGTGGACTCT M A G H R
segregation
increased 2-like 8
G167u7 WIAF-11647 HT4579 1432 PMS2L8, postmeiotic CGTCCTGAGA[C/T]CTCAGAAAGA M C T P S
segregation
increased 2-like 8
G167u8 WIAF-11625 HT4579 2490 PMS2L8, postmeiotic GGACTGCTCT[T/C]AACACAAGCG S T C L L
segregation
increased 2-like 8
G167u9 WIAF-11619 HT4579 804 PMS2L8, postmeiotic TGAGCTGTTC[G/C]GATGCTCTGC S G C S S
segregation
increased 2-like 8
G167u10 WIAF-11623 HT4579 1555 PMS2L8, postmeiotic CATCCCAGAC[A/G]CGGGCAGTCA M A G T A
segregation
increased 2-like 8
G167u11 WIAF-11624 HT4579 2364 PMS2L8, postmeiotic CCTTCGGACC[C/T]CAGGACGTCG S C T P P
segregation
increased 2-like 8
G167u12 WIAF-11626 HT4579 2348 PMS2L8, postmeiotic ACTAGTAAAA[A/G]CTGGACCTTC M A G N S
segregation
increased 2-like 8
G181u1 WIAF-11697 HT48793 311 ERCC4, excision ATATTTGCGA[C/T]AAGTAGGATA M C T T I
repair cross-
complementing
rodent repair
deficiency,
complementation
group 4
G181u2 WIAF-11698 HT48793 295 ERCC4, excision CACACAAGGT[G/C]GTGTTATATT M G C G R
repair cross-
complementing
rodent repair
deficiency,
complementation
group 4
G181u3 WIAF-11699 HT48793 234 ERCC4, excision TTGAACACCT[C/T]CCTCCCCCTC S C T L L
repair cross-
complementing
rodent repair
deficiency,
complementation
group 4
G181u4 WIAF-11704 HT48793 808 ERCC4, excision TTTGTGGCAC[C/T]AGCTTGGAGC N C T Q *
repair cross-
complementing
rodent repair
deficiency,
complementation
group 4
G181u5 WIAF-11705 HT48793 640 ERCC4, excision TTCTATGACA[C/T]CTACCATGCT M C T P S
repair cross-
complementing
rodent repair
deficiency,
complementation
group 4
G181u6 WIAF-11670 HT48793 1117 ERCC4, excision AGAAAGCAAC[C/T]CAAAGTGGGA M C T P S
repair cross-
complementing
rodent repair
deficiency,
complementation
group 4
G185u1 WIAF-11668 HT5122 319 ACVR2B, activin A TCTGCAACGA[G/A]CGCTTCACTC S G A E E
receptor, type IIB
G185u2 WIAF-11707 HT5122 70 ACVR2B, activin A AGACACGGGA[G/C]TGCATCTACT M G C E D
receptor, type IIB
G185u3 WIAF-11672 HT5122 812 ACVR2B, activin A CCTCACGGAT[T/C]ACCTCAAGGG M T C Y H
receptor, type IIB
G185u4 WIAF-13542 X77533 1109 ACVR2B, activin A GGCTCCTGAG[G/A]TGCTCGAGGG M G A V M
receptor, type IIB
G185u5 WIAF-13558 X77533 997 ACVR2B, activin A TGCTGAAGAG[C/T]GACCTCACAG S C T S S
receptor, type IIB
G187u1 WIAF-11669 HT97400 183 androgen CCAGAGACAG[C/T]GCGACCCGGA M C T R C
G191u1 WIAF-10176 AF025375 414 CXCR4, chemokine ACCTGGCCAT[C/T]GTCCACGCCA S C T I I
(C-X-C motif),
receptor 4 (fusin)
G193u1 WIAF-10178 D29984 231 CCR2, chemokine AGTGCTTGAC[T/A]GACATTTACC S T A T T
(C-C motif)
receptor 2
G193u2 WIAF-10179 D29984 190 CCR2, chemokine CATGCTGGTC[G/A]TCCTCATCTT M G A V I
(C-C motif)
receptor 2
G194u1 WIAF-10211 D43767 121 SCYA17, small ACATCCACCC[A/C]GCTCGAGGGA S A C A A
inducible cytokine
subfamily A
(Cys-Cys), member
17
G197u1 WIAF-10167 D50403 1515 NRAMP1, natural GGTGCTAGTC[T/C]GCGCCATCAA M T C C R
resistance-
associated
macrophage protein
1 (might include
Leishmaniasis)
G197u2 WIAF-10173 D50403 1629 NRAMP1, natural CACCTACCTG[G/C]TCTGGACCTG M G C V L
resistance-
associated
macrophage protein
1 (might include
Leishmaniasis)
G20u1 WIAF-10249 U14722 896 ACVR1B, activin A CGGTACACAG[T/C]GACAATTGAG M T C V A
receptor, type IB
G20u2 WIAF-10250 U14722 866 ACVR1B, activin A GAGCACGGGT[C/T]CCTGTTTGAT M C T S F
receptor, type IB
G20u3 WIAF-10251 U14722 1391 ACVR1B, activin A CAGAGTTATG[A/T]GGCACTGCGG M A T E V
receptor, type IB
G20u4 WIAF-10252 U14722 1236 ACVR1B, activin A TATATTGGGA[G/C]ATTGCTCGAA M G C E D
receptor, type IB
G20u5 WIAF-10261 U14722 518 ACVR1B, activin A GAGATGTGTC[T/C]CTCCAAAGAC M T C L P
receptor, type IB
G207a1 WIAF-10516 L25259 866 Human CTLA4 AGCTGTACTT[C/T]CAACAGTTAT M C T P S
counter-receptor
(B7-2) mRNA,
complete cds.
G208u1 WIAF-10204 L31581 85 CCR7, chemokine GGGGAAACCA[A/G]TGAAAAGCGT M A G M V
(C-C motif)
receptor 7
G211u1 WIAF-10213 M24545 174 SCYA2, small, TCACCTGCTG[T/C]TATAACTTCA S T C C C
inducible cytokine
A2 (monocyte
chemotactic protein
1, homologous to
mouse Sig-je)
G214u1 WIAF-10191 M27533 452 CD80, CD80 antigen TGAAAGAAGT[G/A]GCAACGCTGT S G A V V
(CD28 antigen
ligand 1, B7-1
antigen)
G215u1 WIAF-11659 M28393 822 PRF1, perforin 1 GCATCTCTGC[C/T]GAAGCCAAGG S C T A A
(pore forming
protein)
G215u2 WIAF-11723 M28393 159 PRF1, perforin 1 TGACCAGCCT[C/T]CGCCGCTCGG S C T L L
(pore forming
protein)
G215u3 WIAF-11724 M28393 96 PRF1, perforin 1 CAGAGTGCAA[G/A]CGCAGCCACA S G A K K
(pore forming
protein)
G215u4 WIAF-11725 M28393 1377 PRF1, perforin 1 ATAACAACCC[C/T]ATCTGGTCAG S C T P P
(pore forming
protein)
G215u5 WIAF-11726 M28393 1326 PRF1, perforin 1 TGAAGCTCTT[C/T]TTTGGTGGCC S C T F F
(pore forming
protein)
G215u6 WIAF-11727 M28393 1076 PRF1, perforin 1 CGGCGGGAGG[C/T]ACTGAGGAGG M C T A V
(pore forming
protein)
G217u1 WIAF-11691 M31932 649 FCGR2B, Fc fragment GCAGCTCTTC[A/G]CCAATGGGGA S A G S S
of IgG, low
affinity IIb,
receptor for (CD32)
G217u2 WIAF-11692 M31932 625 FCGR2B, Fc fragment TCACTGTCCA[A/G]GTGCCCAGCA S A G Q Q
of IgG, low
affinity IIb,
receptor for (CD32)
G217u3 WIAF-11712 M31932 332 FCGR2B, Fc fragment GACTGGCCAG[A/C]CCAGCCTCAG M A C T P
of IgG, low
affinity IIb,
receptor for (CD32)
G217u4 WIAF-11713 M31932 101 FCGR2B, Fc fragment GGCTTCTGCA[G/T]ACAGTCAAGC M G T D Y
of IgG, low
affinity IIb,
receptor for (CD32)
G218u1 WIAF-10184 M36712 677 CD8B1, CD8 antigen, TTTTACAAAT[A/G]AGCAGAGAAT N A G * *
beta polypeptide 1
(p37)
G218u2 WIAF-10188 M36712 326 CD8B1, CD8 antigen, GCTGTGTTTC[G/C]GGATGCAAGC M G C R P
beta polypeptide 1
(p37)
G218u3 WIAF-10189 M36712 196 CD8B1, CD8 antigen, CAGTAACATG[C/T]GCATCTACTG M C T R C
beta polypeptide 1
(p37)
G218u4 WIAF-10190 M36712 225 CD8B1, CD8 antigen, AGCGCCAGGC[A/C]CCGAGCAGTG S A C A A
beta polypeptide 1
(p37)
G218u5 WIAF-10194 M36712 583 CD8B1, CD8 antigen, GGTGGCTGGC[G/A]TCCTGGTTCT M G A V I
beta polypeptide 1
(p37)
G218u6 WIAF-10208 M36712 372 CD8B1, CD8 antigen, TGAAGCCGGA[A/G]GACAGTGGCA S A G E E
beta polypeptide 1
(p37)
G218u7 WIAF-10209 M36712 400 CD8B1, CD8 antigen, CTGCATGATC[G/T]TCGGGAGCCC M G T V F
beta polypeptide 1
(p37)
G218u8 WIAF-10210 M36712 270 CD8B1, CD8 antigen, TCTGGGATTC[C/T]GCAAAAGGGA S C T S S
beta polypeptide 1
(p37)
G218a9 WIAF-10518 M36712 618 CD8B1, CD8 antigen, GAGTGGCCAT[C/G]CACCTGTGCT M C G I M
beta polypeptide 1
(p37)
G218a10 WIAF-13223 M36712 556 CD8B1, CD8 antigen, TTGTAGCCCC[A/G]TCACCCTTGG M A G I V
beta polypeptide 1
(p37)
G218a11 WIAF-13224 M36712 836 CD8B1, CD8 antigen, CTGTGTGTGA[T/C]GTGCATGGGA T C
beta polypeptide 1
(p37)
G22u1 WIAF-10301 U86136 6719 Human telomerase- GGTGGTAACC[G/A]TCGGGCTAGA M G A V I
associated protein
TP-1 mRNA, complete
cds.
G22u2 WIAF-10302 U86136 7537 Human telomerase- CTGATGGGAT[C/G]CTATGGAACC M C G I M
associated protein
TP-1 mRNA, complete
cds.
G22u3 WIAF-10311 U86136 1798 Human telomerase- ATGATGCCAT[T/C]GATGCCCTCG S T C I I
associated protein
TP-1 mRNA, complete
cds.
G22u4 WIAF-10312 U86136 2397 Human telomerase- CTGTCTCTGG[C/T]TGGCCAAAGG M C T A V
associated protein
TP-1 mRNA, complete
cds.
G22u5 WIAF-10313 U86136 3289 Human telomerase- AGAAAGGGAT[A/C]ACCTGCCGCA S A C I I
associated protein
TP-1 mRNA, complete
cds.
G22u6 WIAF-10314 U86136 3242 Human telomerase- AGAGGCCGCA[T/C]GTCGGATCTC M T C C R
associated protein
TP-1 mRNA, complete
cds.
G22u7 WIAF-10315 U86136 4482 Human telomerase- CCGTTTGCCT[G/A]CCTCGTCCAG M G A C Y
associated protein
TP-1 mRNA, complete
cds.
G22u8 WIAF-10316 U86136 4363 Human telomerase- GTTTGACTGT[G/A]GACCAGCTGC S G A V V
associated protein
TP-1 mRNA, complete
cds.
G22u9 WIAF-10317 U86136 4230 Human telomerase- GTGTCTGAGA[G/A]ACTCCGGACC M G A R K
associated protein
TP-1 mRNA, complete
cds.
G22u10 WIAF-10318 U86136 4419 Human telomerase- GGGACTAAGA[G/C]CTGGGAAGAA M G C S T
associated protein
TP-1 mRNA, complete
cds.
G22u11 WIAF-10319 U86136 5269 Human telomerase- TCTCCGATGA[T/C]ACACTCTTTC S T C D D
associated protein
TP-1 mRNA, complete
cds.
G22u12 WIAF-10320 U86136 5015 Human telomerase- GCTGCTCTCC[C/T]GGAGATGGCA M C T R W
associated protein
TP-1 mRNA, complete
cds.
G22u13 WIAF-10321 U86136 5133 Human telomerase- GTGGCCTTCT[C/T]CACCAATGGG M C T S F
associated protein
TP-1 mRNA, complete
cds.
G22u14 WIAF-10322 U86136 7764 Human telomerase- ACAGCCCTCC[A/G]TGTCCTACCT M A G H R
associated protein
TP-1 mRNA, complete
cds.
G22u15 WIAF-10323 U86136 7884 Human telomerase- TGCCTGGAAC[C/T]TTGCCTGGGC M C T P L
associated protein
TP-1 mRNA, complete
cds.
G22u16 WIAF-10324 U86136 7744 Human telomerase- AGATTCACTC[G/A]GCCTCTGTCA S G A S S
associated protein
TP-1 mRNA, complete
cds.
G22u17 WIAF-10337 U86136 1018 Human telomerase- CCATTGCTGC[T/C]TTCTTGCCGG S T C A A
associated protein
TP-1 mRNA, complete
cds.
G22u18 WIAF-10338 U86136 1000 Human telomerase- TGGCCAATAA[C/A]ATCTTGGCCA M C A N K
associated protein
TP-1 mRNA, complete
cds.
G22u19 WIAF-10339 U86136 1182 Human telomerase- ATGACGGACA[A/G]ATTTGCCCAG M A G K R
associated protein
TP-1 mRNA, complete
cds.
G22u20 WIAF-10340 U86136 1939 Human telomerase- AGCAGCTTCG[T/G]ATGGCAATGA S T G R R
associated protein
TP-1 mRNA, complete
cds.
G22u21 WIAF-10341 U86136 2227 Human telomerase- TCACGAGGGC[G/A]GAGCAGGTGG S G A A A
associated protein
TP-1 mRNA, complete
cds.
G22u22 WIAF-10342 U86136 2776 Human telomerase- GGCGCAGCAT[C/T]CGGCTTTTCA S C T I I
associated protein
TP-1 mRNA, complete
cds.
G22u23 WIAF-10343 U86136 2877 Human telomerase- GCCCCTCACC[G/A]TATCAGCCTT M G A R H
associated protein
TP-1 mRNA, complete
cds.
G22u24 WIAF-10344 U86136 3087 Human telomerase- TCAGGGCGCT[C/T]TGTGACAGAG M C T S F
associated protein
TP-1 mRNA, complete
cds.
G22u25 WIAF-10345 U86136 3662 Human telomerase- CAAGGTGGCA[C/T]CATTAGTCTT M C T P S
associated protein
TP-1 mRNA, complete
cds.
G22u26 WIAF-10346 U86136 4762 Human telomerase- TTTCGAAGTT[C/T]CTTACCAACC S C T F F
associated protein
TP-1 mRNA, complete
cds.
G22u27 WIAF-10351 U86136 1737 Human telomerase- CTCCAGCATG[G/C]GAAGTCGGTG M G C G A
associated protein
TP-1 mRNA, complete
cds.
G22u28 WIAF-10352 U86136 3543 Human telomerase- ACAGTGCAAC[A/G]GCTGATGCTG M A G Q R
associated protein
TP-1 mRNA, complete
cds.
G22u29 WIAF-10353 U86136 4232 Human telomerase- GTCTGAGAGA[C/T]TCCGGACCCT M C T L F
associated protein
TP-1 mRNA, complete
cds.
G22u30 WIAF-10354 U86136 4523 Human telomerase- GGAGGGCCCT[C/T]TGGAGCGCCC S C T L L
associated protein
TP-1 mRNA, complete
cds.
G22u31 WIAF-10355 U86136 5333 Human telomerase- TGGTTGTCGG[G/T]TGCTGCAGAC M G T V L
associated protein
TP-1 mRNA, complete
cds.
G22u32 WIAF-10356 U86136 6208 Human telomerase- AGCTGCTGAC[G/A]CGGCCACACA S G A T T
associated protein
TP-1 mRNA, complete
cds.
G22u33 WIAF-10357 U86136 7703 Human telomerase- TAGTCAGCCA[A/G]CACCACATCT M A G T A
associated protein
TP-1 mRNA, complete
cds.
G22u34 WIAF-10360 U86136 3881 Human telomerase- CATCGATGGG[G/A]CTGATAGGTT M G A A T
associated protein
TP-1 mRNA, complete
cds.
G222u1 WIAF-11700 M57230 697 IL6ST, interleukin TGAGTGGGAT[G/C]GTGGAAGGGA M G C G R
6 signal transducer
(gp130, oncostatin
M receptor)
G222u2 WIAF-11701 M57230 708 IL6ST, interleukin GTGGAAGGGA[A/G]ACACACTTGG S A G E E
6 signal transducer
(gp130, oncostatin
M receptor)
G222u3 WIAF-11702 M57230 677 IL6ST, interleukin GAGGGGAACA[A/G]AATGAGGTGT M A G K R
6 signal transducer
(gp130, oncostatin
M receptor)
G222u4 WIAF-11706 M57230 1616 IL6ST, interleukin AAGAAATATA[T/C]ACTTGAGTGG M T C I T
6 signal transducer
(gp130, oncostatin
M receptor)
G222u5 WIAF-11667 M57230 1444 IL6ST, interleukin TGATCGCTAT[C/G]TAGCAACCCT M C G L V
6 signal transducer
(gp130, oncostatin
M receptor)
G222u6 WIAF-11708 M57230 981 IL6ST, interleukin TCTTAAAATT[G/C]ACATGGACCA M G C L F
6 signal transducer
(gp130, oncostatin
M receptor)
G226u1 WIAF-11714 M85079 869 TGFBR2, transform- CACTGGGAGT[T/C]GCCATATCTG S T C V V
ing growth factor,
beta receptor II
(70-80 kD)
G226u2 WIAF-11715 M85079 1749 TGFBR2, transforming ACATTATCAC[C/T]CTCCATTTCC M C T P S
growth factor,
beta receptor II
(70-80 kD)
G226u3 WIAF-11716 M85079 1601 TGFBR2, transforming TCCCAACTCC[A/G]ACATACATCC S A G A A
growth factor,
beta receptor II
(70-80 kD)
G226u4 WIAF-11721 M85079 1256 TGFBR2, transforming TACTCCACTT[C/G]CTCACCCCTC M C G F L
growth factor,
beta receptor II
(70-80 kD)
G226u5 WIAF-11722 M85079 1502 TGFBR2, transform- TCCTCAACAA[C/T]CACCTAACCT S C T N N
ing growth factor,
beta receptor II
(70-80 kD)
G226u6 WIAF-11671 M85079 888 TGFBR2, transform- TCTCATCATC[A/C]TCTTCTACTC M A C I L
ing growth factor,
beta receptor II
(70-80 kD)
G226u7 WIAF-11674 M85079 1425 TGFBR2, transform- CCTCCACAGT[G/A]ATCACACTCC M G A D N
ing growth factor,
beta receptor II
(70-80 kD)
G227u1 WIAF-10197 M86511 685 CD14, CD14 antigen CCTGTCTGAC[A/G]ATCCTGGACT M A G N D
G227u2 WIAF-10212 M86511 497 CD14, CD14 antigen GAAGCCACAG[G/A]ACTTGCACTT M G A G E
G2278u1 WIAF-14117 AF034611 959 CUBN, cubilin ACATAAATAA[T/C]CGCCCCTGTT S T C N N
(intrinsic factor-
cobalamin receptor)
G2278u2 WIAF-14118 AF034611 781 CUBN, cubilin CCGTGGATGT[C/T]TTCACCCAAC M C T S F
(intrinsic factor-
cobalamin receptor)
G2278u3 WIAF-14119 AF034611 641 CUBN, cubilin CTCAGACGTA[C/T]CCACCCCAGT S C T Y Y
(intrinsic factor-
cobalamin receptor)
G2278u4 WIAF-14121 AF034611 1185 CUBN, cubilin TCCTTATCCG[C/A]CAAATGCATG M C A P T
(intrinsic factor-
cobalamin receptor)
G2278u5 WIAF-14133 AF034611 1532 CUBN, cubilin TCTGCGTTAT[C/G]AAAACTGAAA M C G I M
(intrinsic factor-
cobalamin receptor)
G2278u6 WIAF-14134 AF034611 2208 CUBN, cubilin GCCTTTCACT[C/T]ACACCAGGCA M C T H Y
(intrinsic factor-
cobalamin receptor)
G228u1 WIAF-10199 U00672 586 IL10RA, interleukin CCAACGTCCC[G/A]CCAAACTTCA S G A P P
10 receptor, alpha
G228u2 WIAF-10200 U00672 731 IL10RA, interleukin AGAGCAGTGC[A/G]TCTCCCTCAC M A G I V
10 receptor, alpha
G2280u1 WIAF-13970 AJ001515 1747 RYR3, ryanodine CAGGTATCTT[G/A]GAAGTTTTGC S G A L L
receptor 3
G2280u2 WIAF-13974 AJ001515 8593 RYR3, ryanodine TAGAAGCCAT[T/C]GTCAGCAGTG S T C I I
receptor 3
G2282u1 WIAF-12694 D00726 263 FECH, ACATGGGAGG[C/T]CCTGAAACTG S C T G G
ferrochelatase
(protoporphyria)
G2282u2 WIAF-12695 D00726 514 FECH, TACTATATTG[G/A]ATTTCGGTAC M G A G E
ferrochelatase
(protoporphyria)
G2285u1 WIAF-12688 D16611 673 CPO, copropor- AGAAGACGCT[G/A]TCCATTTTCA M G A V I
phyrinogen oxidase
(coproporphyria,
harderoporphyria)
G2285u2 WIAF-12689 D16611 783 CPO, copropor- ATCGTGGAGA[G/A]CGGCGGGGCA S G A E E
phyrinogen oxidase
(coproporphyria,
harderoporphyria)
G2287u1 WIAF-12687 D28472 502 PTGER4, prosta- GGGCCTCACG[C/T]TCTTTGCAGT M C T L F
glandin E receptor
4 (subtype EP4)
G2287u2 WIAF-12691 D28472 1309 PTGER4, prosta- TGAAAATGGC[C/T]TTGGAGGCAG M C T L F
glandin E receptor
4 (subtype EP4)
G2287u3 WIAF-12707 D28472 243 PTGER4, prosta- AGGAGACGAC[C/T]TTCTACACGC S C T T T
glandin E receptor
4 (subtype EP4)
G2287u4 WIAF-12710 D28472 1342 PTGER4, prosta- GGTGTGCCTG[G/A]CATGGGCCTG M G A G D
glandin E receptor
4 (subtype EP4)
G229u1 WIAF-10185 U16752 202 SDF1, stromal cell- CATGTTGCCA[G/A]AGCCAACCTC M G A R K
derived factor 1
G2295u1 WIAF-12727 D89079 613 LTB4R, leukotriene CTATGTCTGC[G/C]GAGTCAGCAT M G C G R
b4 receptor (chemo-
kine receptor-like
1)
G2295u2 WIAF-12728 D89079 1248 LTB4R, leukotriene AGGGCACGGG[T/C]TCCGAGGCGT S T C G G
b4 receptor (chemo-
kine receptor-like
1)
G2295u3 WIAF-12753 D89079 1348 LTB4R, leukotriene CCTCACTGCC[T/G]CCAGCCCTCT M T G S A
b4 receptor (chemo-
kine receptor-like
1)
G230u1 WIAF-10201 U31628 627 IL15RA, interleukin ACAGCCAAGA[A/C]CTGGGAACTC M A C N T
15 receptor, alpha
G2300u1 WIAF-12735 J02959 102 LTA4H, leukotriene ACCTGCACCT[C/T]CGCTGCACGC S C T L L
A4 hydrolase
G2300u2 WIAF-12738 J02959 1380 LTA4H, leukotriene CCTGGCTCTA[C/T]TCTCCTGGAC S C T Y Y
A4 hydrolase
G2302u1 WIAF-12741 J03037 627 CA2, carbonic TCCTGAATCC[C/T]TGGATTACTG S C T L L
anhydrase II
G2302u2 WIAF-12742 J03037 819 CA2, carbonic GCCACTGAAG[A/G]ACAGGCAAAT M A G N D
anhydrase II
G2303u1 WIAF-12751 J03571 304 ALOX5, arachidonate CGCTGAAGAC[G/A]CCCCACGGGG S G A T T
5-lipoxygenase
G2303u2 WIAF-12752 J03571 794 ALOX5, arachidonate AGAGCTGCCC[G/A]AGAAGCTCCC M G A E K
5-lipoxygenase
G2304u1 WIAF-12772 J03575 840 PDHA1, pyruvate TCCGAGAGGC[A/G]ACAAGGTTTG S A G A A
dehydrogenase
(lipoamide) alpha 1
G2304u2 WIAF-12779 J03575 1044 PDHA1, pyruvate CCAGTGTGGA[A/C]GAACTAAAGG M A C E D
dehydrogenase
(lipoamide) alpha 1
G2305u1 WIAF-12763 J03576 456 PDHB, pyruvate TCTTCAGGGG[A/G]CCCAATGGTG S A G G G
dehydrogenase
(lipoamide) beta
G2305u2 WIAF-12764 J03576 650 PDHB, pyruvate GTTCCTTTTG[A/C]ATTTCTCCCG M A C E A
dehydrogenase
(lipoamide) beta
G231u1 WIAF-10202 U32324 734 IL11RA, interleukin CCAGGGCCTG[C/T]GGGTAGAGTC M C T R W
11 receptor, alpha
G2312u1 WIAF-12762 J05096 3726 ATP1A2, ATPase, TCAAGAACCA[C/T]ACAGAGATCG S C T H H
Na+/K+
transporting,
alpha 2 (+)
polypeptide
G2313u1 WIAF-12760 J05200 6141 RYR1, ryanodine TGCAATTCAA[A/G]GATGGTACAG S A G K K
receptor 1
(skeletal)
G2313u2 WIAF-12767 J05200 3048 RYR1, ryanodine CGGCGCAGAC[A/G]ACACTGGTGG S A G T T
receptor 1
(skeletal)
G2313u3 WIAF-12768 J05200 3084 RYR1, ryanodine ATGGCCACAA[C/T]GTGTGGGCCC S C T N N
receptor 1
(skeletal)
G2313u4 WIAF-12777 J05200 5667 RYR1, ryanodine GCATCTTTGG[C/T]GATGAGGATG S C T G G
receptor 1
(skeletal)
G2313u5 WIAF-12780 J05200 6600 RYR1, ryanodine GCTCGCTGCT[C/T]ATCGTGCAGA S C T L L
receptor 1
(skeletal)
G2313u6 WIAF-12781 J05200 7191 RYR1, ryanodine AGCCTGAGTG[C/T]TTCGHACCCG S C T C C
receptor 1
(skeletal)
G2313u7 WIAF-12782 J05200 7602 RYR1, ryanodine ACCACAAGGC[G/A]TCCATGGTGC S G A A A
receptor 1
(skeletal)
G2313u8 WIAF-12784 J05200 9288 RYR1, ryanodine CAGACGCCCC[A/G]GCTCTGGTCA S A G P P
receptor 1
(skeletal)
G2313u9 WIAF-12786 J05200 13690 RYR1, ryanodine TCCAAAGAAG[G/A]AGGAAGCTGG M G A E K
receptor 1
(skeletal)
G2313u10 WIAF-12789 J05200 3147 RYR1, ryanodine ACATCCCAGC[G/A]CGCCCAAACC S G A A A
receptor 1
(skeletal)
G2314u1 WIAF-12771 J05272 1920 IMPDH1, IMP TGAAGATCGC[A/G]CAGGGTGTCT S A G A A
(inosine mono-
phosphate)
dehydrogenase 1
G2319u1 WIAF-12814 K03191 651 CYP1A1, cytochrome CCCCTACAGG[T/C]ATGTGGTGGT M T C Y H
P450, subfamily I
(aromatic compound
inducible), poly-
peptide 1
G232u1 WIAF-11657 U58917 1490 Homo sapiens IL-17 TGAACATGAT[C/T]CTCCCGGACT S C T I I
receptor mRNA,
complete cds.
G232u2 WIAF-11677 U58917 1293 Homo sapiens IL-17 GCAGGCCATC[T/C]CGGAGGCAGG M T C S P
receptor mRNA,
complete cds.
G232u3 WIAF-11658 U58917 1132 Homo sapiens IL-17 GGCCTGCCTG[C/T]GGCTCACCTG M C T A V
receptor mRNA,
complete cds.
G232u4 WIAF-11679 U58917 905 Homo sapiens IL-17 GCAGCTGCCT[C/T]AATGACTGCC S C T L L
receptor mRNA,
complete cds.
G232u5 WIAF-11682 U58917 1794 Homo sapiens IL-17 GTTCGAATGTE[G/T]AGAACCTCTA N G T E *
receptor mRNA,
complete cds.
G232u7 WIAF-11660 U58917 743 Homo sapiens IL-17 TGACCAGTTT[T/C]CCGCACATGG S T C F F
receptor mRNA,
complete cds.
G2322u1 WIAF-12853 L01406 1316 GHRHR, growth CTGACATCTA[T/C]GTGCTAGGCT M T C M T
hormone releasing
hormone receptor
G2328u1 WIAF-12845 L20316 1285 GCGR, glucagon TGCGGGCACG[G/C]CAGATGCACC S G C R R
receptor
G2329u1 WIAF-12850 L22214 713 ADORA1, adenosine TGCTGGCAAT[T/C]GCTGTGGACC S T C I I
A1 receptor
G2329u2 WIAF-12851 L22214 716 ADORA1, adenosine TGGCAATTGC[T/G]GTGGACCGCT S T G A A
A1 receptor
G2335a1 WIAF-12136 L32961 265 ABAT, 4-amino- CCTAGATCTC[A/G]GGAGTTAATG M A G Q R
butyrate amino-
transferase
G2335a2 WIAF-12137 L32961 407 ABAT, 4-amino- TCTCCTCTGT[T/C]CCCATAGGTT S T C V V
butyrate amino-
transferase
G2335u3 WIAF-12838 L32961 365 ABAT, 4-amino- TTGATGTGGA[C/T]GGCAACCGAA S C T D D
butyrate amino-
transferase
G2335u4 WIAF-12839 L32961 583 ABAT, 4-amino- ATCACCATGG[C/T]CTGCGGCTCC M C T A V
butyrate amino-
transferase
G2335u5 WIAF-12841 L32961 1082 ABAT, 4-amino- TGGACGAGGT[C/A]CAGACCGGAG S C A V V
butyrate amino-
transferase
G2335u6 WIAF-12852 L32961 227 ABAT, 4-amino- ATTATGATGG[G/A]CCTCTGATGA S G A G G
butyrate amino-
transferase
G2337u1 WIAF-13577 L34820 149 ALDH5A1, aldehyde TGTTCTCGAA[A/G]GAATGCCAAG M A G K R
dehydrogenase 5
family, member A1
(succinate-semi-
aldehyde
dehydrogenase)
G2342a1 WIAF-12138 M12530 1602 TF, transferrin GCCTAAACCT[G/C]TGTGAACCCA S G C L L
G2342a2 WIAF-12139 M12530 1795 TF, transferrin TACCAGGAAA[C/T]CTGTGGAGGA M C T P S
G2346u1 WIAF-12829 M13928 234 ALAD, aminolevuli- TGGCCAGGTA[T/C]GGTGTGAAGC S T C Y Y
nate, delta-,
dehydratase
G2346u2 WIAF-12830 M13928 529 ALAD, aminolevuli- TGAGGTGGCA[T/C]TGGCGTATGC S T C L L
nate, delta-,
dehydratase
G2346u3 WIAF-12843 M13928 480 ALAD, aminolevuli- TGAGTGAAAA[C/T]GGAGCATTCC S C T N N
nate, delta-,
dehydratase
G2348u1 WIAF-12835 M14016 621 UROD, uroporphyrino- CTCTGGTCCC[A/G]TATCTGGTAG S A G P P
gen decarboxylase
G235u1 WIAF-11678 U83171 100 SCYA22, small CAGGCCCCTA[C/T]GGCGCCAACA S C T Y Y
inducible cytokine
subfamily A (Cys-
Cys), member 22
G2363a1 WIAF-10519 M37435 596 CSF1, colony GACAAGGACT[G/T]GAATATTTTC M G T W L
stimulating factor
1 (macrophage)
G2363a2 WIAF-13225 M37435 498 CSF1, colony AAGAGCATGA[C/T]AAGGCCTGCG S C T D D
stimulating factor
1 (macrophage)
G2363a3 WIAF-13226 M37435 712 CSF1, colony CAGTGACCCG[G/T]CCTCTGTCTC M G T A S
stimulating factor
1 (macrophage)
G2369u1 WIAF-12854 M30773 857 PPP3R1, protein TTGATTTGCA[C/T]AATTCTCGTT S C T D D
phosphatase 3
(formerly 2B),
regulatory subunit
B (19 kD), alpha
isoform
(calcineurin B,
type I)
G2369u2 WIAF-12855 M30773 1274 PPP3R1, protein ATGTGTGACT[C/T]TTATCAGAGA - C T - -
phosphatase 3
(formerly 2B),
regulatory subunit
B (19 kD), alpha
isoform
(calcineurin B,
type I)
G237u1 WIAF-11662 U86358 311 SCYA25, small CACCACAACA[T/C]GCAGACCTTC M T C M T
inducible cytokine
subfamily A (Cys-
Cys), member 25
G237u2 WIAF-11680 U86358 134 SCYA25, small GTGCTCCGGC[G/A]CGCCTGGACT M G A R H
inducible cytokine
subfamily A (Cys-
Cys), member 25
G237u3 WIAF-11681 U86358 133 SCYA25, small TGTGCTCCGG[C/T]GCGCCTGGAC M C T R C
inducible cytokine
subfamily A (Cys-
Cys), member 25
G237u5 WIAF-11661 U86358 302 SCYA25, small GCAAAGCTCC[A/G]CCACAACATG M A G H R
inducible cytokine
subfamily A (Cys-
Cys), member 25
G237u6 WIAF-11663 U86358 378 SCYA25, small AGTTATCATC[A/G]TCCAAGTTTA S A G S S
inducible cytokine
subfamily A (Cys-
Cys), member 25
G2373u1 WIAF-12870 M36035 500 BZRP, benzo- GCTGGCCTTC[G/A]CGACCACACT M G A A T
diazepine receptor
(peripheral)
G2376u1 WIAF-13025 M57414 979 TACR2, tachykinin CTGCTGCCCA[T/C]GGGTCACACC M T C W R
receptor 2
G238u1 WIAF-10177 X01394 239 TNF, tumor necrosis GCTCCAGGCG[G/T]TGCTTGTTCC S G T R R
factor (TNF super-
family, member 2)
G2381u1 WIAF-12894 M59941 730 CSF2RB, colony CAGAGGTTTG[C/T]TGGGACTCCC S C T C C
stimulating factor
2 receptor, beta,
low-affinity
(granulocyte-
macrophage)
G2381u2 WIAF-12896 M59941 1306 CSF2RB, colony GGATCTGGAG[C/T]GAGTGGAGTG S C T S S
stimulating factor
2 receptor, beta,
low-affinity
(granulocyte-
macrophage)
G2381u3 WIAF-12900 M59941 1972 CSF2RB, colony CGATGGGACCC[G/A]GGACAGGCCG S G A P P
stimulating factor
2 receptor, beta,
low-affinity
(granulocyte-
macrophage)
G2381u4 WIAF-12901 M59941 1982 CSF2RB, colony GGGACAGGCC[G/A]TGGAACTGGA M G A V M
stimulating factor
2 receptor, beta,
low-affinity
(granulocyte-
macrophage)
G2381u5 WIAF-12942 M59941 773 CSF2RB, colony CCAGAACCTG[G/C]AGTGCTTCTT M G C E Q
stimulating factor
2 receptor, beta,
low-affinity
(granulocyte-
macrophage)
G2381u6 WIAF-12946 M59941 2458 CSF2RB, colony CCCCACAGCC[C/A]GAGGGCCTCC S C A P P
stimulating factor
2 receptor, beta,
low-affinity
(granulocyte-
macrophage)
G2384u1 WIAF-12908 M61831 1000 AHCY, S-adenosyl- GCCGTGGAGA[A/C]GGTGAACATC M A C K T
homocysteine
hydrolase
G2387u1 WIAF-12910 M63967 2585 ALDH5, aldehyde CTGCTGAACC[T/G]CCTGGCAGAC M T G L R
dehydrogenase 5
G2387u2 WIAF-12911 M63967 2996 ALDH5, aldehyde TATGGCCCAA[C/G]AGCAGGTGCG M C G T R
dehydrogenase 5
G2387u3 WIAF-12954 M63967 2522 ALDH5, aldehyde GCCCGGGAAG[C/T]CTTCCGCCTG M C T A V
dehydrogenase 5
G2387u4 WIAF-12955 M63967 2448 ALDH5, aldehyde ACCCTACCAC[C/T]GGGGAGGTCA S C T T T
dehydrogenase 5
G2387u5 WIAF-12956 M63967 2460 ALDH5, aldehyde GGGAGGTCAT[C/T]GGGCACGTGG S C T I I
dehydrogenase 5
G2387u6 WIAF-12957 M63967 2991 ALDH5, aldehyde CGGGGTATGG[C/T]CCAACAGCAG S C T G G
dehydrogenase 5
G2387u7 WIAF-12958 M63967 3022 ALDH5, aldehyde CGCCCAGCAC[A/G]TGGATGTTGA M A G M V
dehydrogenase 5
G2387u8 WIAF-12959 M63967 2943 ALDH5, aldehyde CCCTCATCAA[G/C]GAGGCAGGCT M G C K N
dehydrogenase 5
G2388u1 WIAF-12888 M64590 588 GLDC, glycine TGCCACAGAC[G/A]ATTTTGCCGA S G A T T
dehydrogenase
(decarboxylating;
glycine decarboxyl-
ase, glycine cleav-
age system protein
P)
G2388u2 WIAF-12889 M64590 651 GLDC, glycine ACCAGCCTGA[G/A]GTGTCTCAGG S G A E E
dehydrogenase
(decarboxylating;
glycine decarboxyl-
ase, glycine cleav-
age system protein
P)
G2388u3 WIAF-12890 M64590 698 GLDC, glycine CAGACCATGG[T/C]GTGTGACATC M T C V A
dehydrogenase
(decarboxylating;
glycine decarboxyl-
ase, glycine cleav-
age system protein
P)
G2388u4 WIAF-12891 M64590 557 GLDC, glycine TATATTGGCA[T/C]GGGCTATTAT M T C M T
dehydrogenase
(decarboxylating;
glycine decarboxyl-
ase, glycine cleav-
age system protein
P)
G2388u5 WIAF-12938 M64590 587 GLDC, glycine GTGCCACAGA[C/G]GATTTTGCGG M C G T R
dehydrogenase
(decarboxylating;
glycine decarboxyl-
ase, glycine cleav-
age system protein
P)
G2388u6 WIAF-12939 M64590 518 GLDC, glycine CTGCATGCCA[T/C]TTCAAGCAAA M T C I T
dehydrogenase
(decarboxylating;
glycine decarboxyl-
ase, glycine cleav-
age system protein
P)
G2388u7 WIAF-12940 M64590 810 GLDC, glycine GGAAATTTCT[C/T]GTTGATCCCC S C T L L
dehydrogenase
(decarboxylating;
glycine decarboxyl-
ase, glycine cleav-
age system protein
P)
G2388u8 WIAF-12941 M64590 1481 GLDC, glycine CATTGTGGCT[G/A]CTCAGTGAAG M G A C Y
dehydrogenase
(decarboxylating;
glycine decarboxyl-
ase, glycine cleav-
age system protein
P)
G2388u9 WIAF-12947 M64590 1841 GLDC, glycine AAACTGAACA[G/A]TTCGTCTGAA M G A S N
dehydrogenase
(decarboxylating;
glycine decarboxyl-
ase, glycine cleav-
age system protein
P)
G2388u10 WIAF-12948 M64590 2325 GLDC, glycine GACAGGTCTA[C/T]CTACACCGGG S C T Y Y
dehydrogenase
(decarboxylating;
glycine decarboxyl-
ase, glycine cleav-
age system protein
P)
G2388u11 WIAF-12949 M64590 2362 GLDC, glycine GGTGGGAATC[T/A]GTCCCCCTGG M T A C S
dehydrogenase
(decarboxylating;
glycine decarboxyl-
ase, glycine cleav-
age system protein
P)
G2388u12 WIAF-12950 M64590 3220 GLDC, glycine TTAGTCCTCT[C/G]TCCCTAAGTT - C G - -
dehydrogenase
(decarboxylating;
glycine decarboxyl-
ase, glycine cleav-
age system protein
P)
G2391u1 WIAF-12998 M69238 623 ARNT, aryl hydro- TGGTGTATGT[G/C]TCTGACTCCG S G C V V
carbon receptor
nuclear
translocator
G2391u2 WIAF-13002 M69238 1072 ARNT, aryl hydro- TGCCTAGTGG[C/T]CATTGGCAGA M C T A V
carbon receptor
nuclear
translocator
G2391u3 WIAF-13021 M69238 966 ARNT, aryl hydro- ACCTCACTTC[G/A]TGGTGGTCCA M G A V M
carbon receptor
nuclear
translocator
G2394u1 WIAF-13003 M73747 2061 TSHR, thyroid TTGCTCGTAC[T/A]CTTCTATCCA M T A L H
stimulating hormone
receptor
G2394u2 WIAF-13004 M73747 2248 TSHR, thyroid TTACCCACGA[C/G]ATGAGGCAGG M C G D E
stimulating hormone
receptor
G2396u1 WIAF-12995 M74542 1027 ALDH3, aldehyde CCCCCAGTCC[C/G]CGGTGATGCA M C G P A
dehydrogenase 3
G2396u2 WIAF-13019 M74542 1295 ALDH3, aldehyde GGCAACAACA[G/A]CTTCGAGACT M G A S N
dehydrogenase 3
G2403u1 WIAF-13583 M83670 280 CA4, carbonic TACGATAAGA[A/T]GCAAACGTGG M A T K M
anhydrase IV
G2409u1 WIAF-10010 HT2156 1268 AGTR1, angiotensin CCACTCAAAC[C/T]TTTCAACAAA M C T L F
receptor 1
G2411u1 WIAF-13541 M97759 210 ADORA2B, adenosine TGGCGGGCAA[C/T]GTGCTGGTGT S C T N N
A2b receptor
G2422u1 WIAF-14077 S90469 375 POR, P450 GCAGCCTGCC[A/G]GAGATCGACA S A G P P
(cytochrome)
oxidoreductase
G2422u2 WIAF-14078 S90469 852 POR, P450 TCCTGGCTGC[A/G]GTCACCACCA S A G A A
(cytochrome)
oxidoreductase
G2422u3 WIAF-14082 S90469 1496 POR, P450 AAGGAGCCTG[T/C]CGCCGAGAAC M T C V A
(cytochrome)
oxidoreductase
G2422u4 WIAF-14099 S90469 1443 POR, P450 AGACCAAGGC[C/T]GGCCGCATCA S C T A A
(cytochrome)
oxidoreductase
G2422u5 WIAF-14100 S90469 1704 POR, P450 GCCGCCGCTC[G/A]GATGAGGACT S G A S S
(cytochrome)
oxidoreductase
G2427u1 WIAF-14079 U07919 1369 ALDH6, aldehyde ACTATGGACT[C/T]ACAGCAGCCG S C T L L
dehydrogenase 6
G2427u2 WIAF-14096 U07919 1347 ALDH6, aldehyde ATAAAAAGAG[C/T]GAATACCACC M C T A V
dehydrogenase 6
G243u1 WIAF-11684 X57522 926 TAP1, transporter ATAGCCAGTG[C/G]AGTGCTGGAG M C G A G
1, ABC (ATP
binding cassette)
G243u2 WIAF-11685 X57522 627 TAP1, transporter ACCCTACCGC[C/T]TTCGTTGTCA S C T A A
1, ABC (ATP
binding cassette)
G243u3 WIAF-11686 X57522 538 TAP1, transporter CCTGCCGGGA[C/G]TTCCCTTGTT M C G L V
1, ABC (ATP
binding cassette)
G243u4 WIAF-11687 X57522 798 TAP1, transporter TGGTGGTCCT[C/G]TCCTCTCTTG S C G L L
1, ABC (ATP
binding cassette)
G243u5 WIAF-11689 X57522 1465 TAP1, transporter TAGTATTTCA[G/T]GTATCCTGCT M G T G C
1, ABC (ATP
binding cassette)
G243u6 WIAF-11690 X57522 177 TAP1, transporter AGAGTCCCAG[A/G]CCCGGCCGGG S A G R R
1, ABC (ATP
binding cassette)
G243u7 WIAF-11693 X57522 1067 TAP1, transporter AACATCATGT[C/T]TCGGGTAACA M C T S F
1, ABC (ATP
binding cassette)
G243u8 WIAF-11665 X57522 1207 TAP1, transporter GGTCACCCTG[A/G]TCACCCTGCC M A G I V
1, ABC (ATP
binding cassette)
G243u9 WIAF-11664 X57522 1757 TAP1, transporter CCAAACCCCC[C/T]ACATGTCTTA M C T P L
1, ABC (ATP
binding cassette)
G244u1 WIAF-10174 X60592 239 TNFRSF5, tumor CTTGCGGTCA[A/G]AGCGAATTCC S A G E E
necrosis factor
receptor super-
family, member 5
G2441u1 WIAF-13682 U30246 1355 SLC12A2, solute TGCTTAAGGA[A/G]CATTCCATAC S A G E E
carrier family 12
(sodium/potassium/
chloride trans-
porters), member 2
G2441u2 WIAF-13714 U30246 2691 SLC12A2, solute AGCCAAATAT[C/G]AGCCATGGCT M C G Q E
carrier family 12
(sodium/potassium/
chloride trans-
porters), member 2
G2443u1 WIAF-14004 U37143 1456 CYP2J2, cytochrome CTGAAGTTTA[G/A]AATGGGTATC M G A R K
P450, subfamily IIJ
(arachidonic acid
epoxygenase)
polypeptide 2
G2443u2 WIAF-14032 U37143 376 CYP2J2, cytochrome TTTAAGAAAA[A/G]TGGATTGATT M A G N S
P450, subfamily IIJ
(arachidonic acid
epoxygenase)
polypeptide 2
G2443u3 WIAF-14033 U37143 1502 CYP2J2, cytochrome TCTGCGCTGT[T/A]CCTCAGGTGT S T A V V
P450, subfamily IIJ
(arachidonic acid
epoxygenase)
polypeptide 2
G2444u1 WIAF-14065 U37519 771 ALDH3, aldehyde CCCGCACGGA[A/G]TTGCCTCGTG M A G N S
dehydrogenase 3
G2444u2 WIAF-14066 U37519 1698 ALDH3, aldehyde AAGGAGATCC[G/A]CTACCCACCC M G A R H
dehydrogenase 3
G2445u1 WIAF-14114 U38178 236 CNP, 2′,3′-cyclic TGCCGGGCGC[G/A]CCTCTCGCTG M G A R H
nucleotide 3′
phosphodiesterase
G2445u2 WIAF-14115 U38178 849 CNP, 2′,3′-cyclic GTGCCGCCGA[A/G]GAAAAAGTGC S A G E E
nucleotide 3′
phosphodiesterase
G2445u3 WIAF-14122 U38178 1655 CNP, 2′,3′-cyclic GTTATCTTGC[A/T]GAGATCTCTG M A T Q L
nucleotide 3′
phosphodiesterase
G2445u4 WIAF-14241 X95520 941 CNP, 2′,3′-cyclic TGCAAAATAT[T/C]CAGGAGACCG ? T C ? ?
nucleotide 3′
phosphodiesterase
G2445u5 WIAF-14242 X95520 1057 CNP, 2′,3′-cyclic TCGAGTTGAT[C/T]TTTCAGTGCT ? C T ? ?
nucleotide 3′
phosphodiesterase
G2445u6 WIAF-14243 X95520 1583 CNP, 2′,3′-cyclic TCTACTGGCT[C/G]TCTAACTAAT ? C G ? ?
nucleotide 3′
phosphodiesterase
G2448u1 WIAF-13973 U46689 1895 ALDH10, aldehyde TTGTCAAGGC[A/T]GAATATTACT S A T A A
dehydrogenase 10
(fatty aldehyde
dehydrogenase)
G2457u1 WIAF-13898 U90277 1304 GRIN2A, glutamate GGTCCCGATG[C/T]ACACCTTGCA M C T H Y
receptor, iono-
tropic, N-methyl
D-aspartate 2A
G2457u2 WIAF-13899 U90277 1934 GRIN2A, glutamate AAGAAGTAAT[G/T]GCACCGTCTC M G T G C
receptor, iono-
tropic, N-methyl
D-aspartate 2A
G2457u3 WIAF-13900 U90277 2230 GRIN2A, glutamate TCGCTGTCAT[A/G]TTCCTGGCTA M A G I M
receptor, iono-
tropic, N-methyl
D-aspartate 2A
G2457u4 WIAF-13902 U90277 2916 GRIN2A, glutamate GGCATCTACA[G/A]CTGCATTCAT M G A S N
receptor, iono-
tropic, N-methyl
D-aspartate 2A
G2457u5 WIAF-13903 U90277 3251 GRIN2A, glutamate CTATGTATTC[C/T]AGGCACAACA N C T Q *
receptor, iono-
tropic, N-methyl
D-aspartate 2A
G2457u6 WIAF-13917 U90277 2756 GRIN2A, glutamate GGACATTGAC[A/G]ACATCCCGGG M A G N D
receptor, iono-
tropic, N-methyl
D-aspartate 2A
G2468u1 WIAF-13642 X04011 1017 CYBB, cytochrome AGGTGTCCAA[G/A]CTGGAGTGGC S G A K K
b-245, beta poly-
peptide (chronic
granulomatous
disease)
G2473u1 WIAF-13670 X06990 1417 ICAM1, inter- GGTCACCCGC[G/A]AGGTGACCGT M G A E K
cellular adhesion
molecule 1 (CD54),
human rhinovirus
receptor
G2473u2 WIAF-13695 X06990 179 ICAM1, inter- GACCAGCCCA[A/T]GTTGTTGGGC M A T K M
cellular adhesion
molecule 1 (CD54),
human rhinovirus
receptor
G2480u1 WIAF-14148 X55330 800 AGA, aspartylgluco- TTGGCATGGT[T/G]GTAATCCATA S T G V V
saminidase
G2480u2 WIAF-14149 X55330 852 AGA, aspartylgluco- AAATGGTATA[A/T]AATTCAAAAT N A T K *
saminidase
G2480u3 WIAF-14158 X55330 616 AGA, aspartylgluco- TTATCTACCA[G/C]TGCTTCTCAA M G C S T
saminidase
G2485u1 WIAF-13612 X59543 2301 RRM1, ribo- ATTGATCAAA[G/A]CCAATCTTTG M G A S N
nucleotide
reductase M1
polypeptide
G2485u2 WIAF-13613 X59543 2410 RRM1, ribo- ATTTAAGGAC[G/A]AGACCAGCAG S G A T T
nucleotide
reductase M1
polypeptide
G2485u3 WIAF-13651 X59543 548 RRM1, ribo- CAAGTCAACA[T/C]TGGATATTGT S T C L L
nucleotide
reductase M1
polypeptide
G2485u4 WIAF-13652 X59543 199 RRM1, ribo- TGCATGTGAT[C/T]AAGCGAGATG S C T I I
nucleotide
reductase M1
polypeptide
G2485u5 WIAF-13653 X59543 1037 RRM1, ribo- CAACACAGCT[C/A]GATATGTGGA S C A R R
nucleotide
reductase M1
polypeptide
G2485u6 WIAF-13660 X59543 1955 RRM1, ribo- GAAGATTGCA[A/C]ADTATGGTAT M A C K Q
nucleotide
reductase M1
polypeptide
G2485u7 WIAF-13877 X59543 860 RRM1, ribo- GAGTATGAAA[G/C]ATGACAGCAT M G C E Q
nucleotide
reductase M1
polypeptide
G2486u1 WIAF-14075 X59618 543 RRM2, ribo- TCAGCACTGG[G/C]AATCCCTGAA M G C E Q
nucleotide
reductase M2
polypeptide
G2486u2 WIAF-14076 X59618 189 RRM2, ribo- TCGCTGCGCC[T/G]CCACTATGCT - T G - -
nucleotide
reductase M2
polypeptide
G2486u3 WIAF-14092 X59618 524 RRM2, ribo- TTGACCTCTC[C/G]AACGACATTC S C G S S
nucleotide
reductase M2
polypeptide
G2488u1 WIAF-13585 X63563 1633 POLR2B, polymerase CCTTGATGGC[G/A]TATATTTCAG S G A A A
(RNA) II (DNA
directed) poly-
peptide B (140 kD)
G2488u2 WIAF-13586 X63563 2452 POLR2B, polymerase CTGTAGACCG[C/T]GGCTTCTTCA S C T R R
(RNA) II (DNA
directed) poly-
peptide B (140 kD)
G2488u3 WIAF-13587 X63563 2740 POLR2B, polymerase TCAGAACTAG[T/C]GAGACCGGCA S T C S S
(RNA) II (DNA
directed) poly-
peptide B (140 kD)
G2488u4 WIAF-13602 X63563 1411 POLR2B, polymerase GGGGTGATCA[A/G]AAGAAAGCTC S A G Q Q
(RNA) II (DNA
directed) poly-
peptide B (140 kD)
G2488u5 WIAF-13603 X63563 2386 POLR2B, polymerase CAATTGTGGC[C/T]ATTGCATCAT S C T A A
(RNA) II (DNA
directed) poly-
peptide B (140 kD)
G2489u1 WIAF-14181 X63564 1346 POLR2A, polymerase TGGTGGACAA[T/C]GAGCTGCCTG S T C N N
(RNA) II (DNA
directed) poly-
peptide A (220 kD)
G2489u2 WIAF-14236 X63564 1647 POLR2A, polymerase TGAATCTTAG[C/T]GTGACAACTC ? C T ? ?
(RNA) II (DNA
directed) poly-
peptide A (220 kD)
G2489u3 WIAF-14237 X63564 2678 POLR2A, polymerase CTGAATACAA[C/T]AACTTCAAGT ? C T ? ?
(RNA) II (DNA
directed) poly-
peptide A (220 kD)
G2489u4 WIAF-14238 X63564 3059 POLR2A, polymerase AGCTGCGCTA[C/T]GGCGAACACG ? C T ? ?
(RNA) II (DNA
directed) poly-
peptide A (220 kD)
G2489u5 WIAF-14239 X63564 3827 POLR2A, polymerase TGGGCCAGTC[C/T]GCTCGAGATG ? C T ? ?