CA2271720A1 - Streptococcus pneumoniae polynucleotides and sequences - Google Patents

Publication number
CA2271720A1
Authority
CA
Canada
Prior art keywords
pneumoniae
streptococcus pneumoniae
sequence
gene
fragments
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002271720A
Other languages
French (fr)
Inventor
Charles A. Kunsch
Gil H. Choi
Patrick J. Dillon
Craig A. Rosen
Steven C. Barash
R. Michael Fannon
Brian A. Dougherty
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Human Genome Sciences Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/315 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Streptococcus (G), e.g. Enterococci
    • C07K14/3156 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Streptococcus (G), e.g. Enterococci from Streptococcus pneumoniae (Pneumococcus)
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/04 Antibacterial agents
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P37/00 Drugs for immunological or allergic disorders
    • A61P37/02 Immunomodulators
    • A61P37/04 Immunostimulants
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/315 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Streptococcus (G), e.g. Enterococci
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies

Abstract

The present invention provides polynucleotide sequences of the genome of Streptococcus pneumoniae, polypeptide sequences encoded by the polynucleotide sequences, corresponding polynucleotides and polypeptides, vectors and hosts comprising the polynucleotides, and assays and other uses thereof. The present invention further provides polynucleotide and polypeptide sequence information stored on computer readable media, and computer-based systems and methods which facilitate its use.

Description

Streptococcus pneumoniae Polynucleotides and Sequences
FIELD OF THE INVENTION
The present invention relates to the field of molecular biology. In particular, it relates to, among other things, nucleotide sequences of Streptococcus pneumoniae, contigs, ORFs, fragments, probes, primers and related polynucleotides thereof, peptides and polypeptides encoded by the sequences, and uses of the polynucleotides and sequences thereof, such as in fermentation, polypeptide production, assays and pharmaceutical development, among others.
BACKGROUND OF THE INVENTION
Streptococcus pneumoniae has been one of the most extensively studied microorganisms since its first isolation in 1881. It was the object of many investigations that led to important scientific discoveries. In 1928, Griffith observed that when heat-killed encapsulated pneumococci and live strains constitutively lacking any capsule were concomitantly injected into mice, the nonencapsulated pneumococci could be converted into encapsulated pneumococci with the same capsular type as the heat-killed strain. Years later, the nature of this "transforming principle," or carrier of genetic information, was shown to be DNA. (Avery, O.T., et al., J. Exp. Med. 79:137-157 (1944)).
In spite of the vast number of publications on S. pneumoniae, many questions about its virulence are still unanswered, and this pathogen remains a major causative agent of serious human disease, especially community-acquired pneumonia. (Johnston, R.B., et al., Rev. Infect. Dis. 13(Suppl. 6):S509-517 (1991)). In addition, in developing countries, the pneumococcus is responsible for the death of a large number of children under the age of 5 years from pneumococcal pneumonia. The incidence of pneumococcal disease is highest in infants under 2 years of age and in people over 60 years of age. Pneumococci are the second most frequent cause (after Haemophilus influenzae type b) of bacterial meningitis and otitis media in children. With the recent introduction of conjugate vaccines for H. influenzae type b, pneumococcal meningitis is likely to become increasingly prominent. S. pneumoniae is the most important etiologic agent of community-acquired pneumonia in adults and is the second most common cause of bacterial meningitis behind Neisseria meningitidis.
The antibiotic generally prescribed to treat S. pneumoniae is benzylpenicillin, although resistance to this and to other antibiotics is found occasionally. Pneumococcal resistance to penicillin results from mutations in its penicillin-binding proteins. In uncomplicated pneumococcal pneumonia caused by a sensitive strain, treatment with penicillin is usually successful unless started too late. Erythromycin or clindamycin can be used to treat pneumonia in patients hypersensitive to penicillin, but strains resistant to these drugs exist. Broad spectrum antibiotics (e.g., the tetracyclines) may also be effective, although tetracycline-resistant strains are not rare. In spite of the availability of antibiotics, the mortality of pneumococcal bacteremia in the last four decades has remained stable between 25 and 29%. (Gillespie, S.H., et al., J. Med. Microbiol. 28:237-248 (1989)).
S. pneumoniae is carried in the upper respiratory tract by many healthy individuals. It has been suggested that attachment of pneumococci is mediated by a disaccharide receptor on fibronectin present on human pharyngeal epithelial cells. (Anderson, B.J., et al., J. Immunol. 142:2464-2468 (1989)). The mechanisms by which pneumococci translocate from the nasopharynx to the lung, thereby causing pneumonia, or migrate to the blood, giving rise to bacteremia or septicemia, are poorly understood. (Johnston, R.B., et al., Rev. Infect. Dis. 13(Suppl. 6):S509-517 (1991)).
Various proteins have been suggested to be involved in the pathogenicity of S. pneumoniae; however, only a few of them have actually been confirmed as virulence factors. Pneumococci produce an IgA1 protease that might interfere with host defense at mucosal surfaces. (Kornfield, S.J., et al., Rev. Inf. Dis. 3:521-534 (1981)). S. pneumoniae also produces neuraminidase, an enzyme that may facilitate attachment to epithelial cells by cleaving sialic acid from the host glycolipids and gangliosides. Partially purified neuraminidase was observed to induce meningitis-like symptoms in mice; however, the reliability of this finding has been questioned because the neuraminidase preparations used were probably contaminated with cell wall products. Other pneumococcal proteins besides neuraminidase are involved in the adhesion of pneumococci to epithelial and endothelial cells. These pneumococcal proteins have as yet not been identified. Recently, Cundell et al. reported that peptide permeases can modulate pneumococcal adherence to epithelial and endothelial cells. It was, however, unclear whether these permeases function directly as adhesins or whether they enhance adherence by modulating the expression of pneumococcal adhesins. (DeVelasco, E.A., et al., Micro. Rev. 59:591-603 (1995)). A better understanding of the virulence factors determining its pathogenicity will need to be developed to cope with the devastating effects of pneumococcal disease in humans.
Ironically, despite the prominent role of S. pneumoniae in the discovery of DNA, little is known about the molecular genetics of the organism. The S. pneumoniae genome consists of one circular, covalently closed, double-stranded DNA and a collection of so-called variable accessory elements, such as prophages, plasmids, transposons and the like. Most physical characteristics and almost all of the genes of S. pneumoniae are unknown. Among the few that have been identified, most have not been physically mapped or characterized in detail. Only a few genes of this organism have been sequenced. (See, for instance, current versions of GENBANK and other nucleic acid databases, and references that relate to the genome of S. pneumoniae such as those set out elsewhere herein.)
It is clear that the etiology of diseases mediated or exacerbated by S. pneumoniae infection involves the programmed expression of S. pneumoniae genes, and that characterizing the genes and their patterns of expression would add dramatically to our understanding of the organism and its host interactions. Knowledge of S. pneumoniae genes and genomic organization would improve our understanding of disease etiology and lead to improved and new ways of preventing, ameliorating, arresting and reversing diseases. Moreover, characterized genes and genomic fragments of S. pneumoniae would provide reagents for, among other things, detecting, characterizing and controlling S. pneumoniae infections. There is a need to characterize the genome of S. pneumoniae and for polynucleotides of this organism.
SUMMARY OF THE INVENTION
The present invention is based on the sequencing of fragments of the Streptococcus pneumoniae genome. The primary nucleotide sequences which were generated are provided in SEQ ID NOS:1-391.
The present invention provides the nucleotide sequence of several hundred contigs of the Streptococcus pneumoniae genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one embodiment, the present invention is provided as contiguous strings of primary sequence information corresponding to the nucleotide sequences depicted in SEQ ID NOS:1-391.
The present invention further provides nucleotide sequences which are at least 95% identical to the nucleotide sequences of SEQ ID NOS:1-391.
The nucleotide sequence of SEQ ID NOS:1-391, a representative fragment thereof, or a nucleotide sequence which is at least 95% identical to the nucleotide sequence of SEQ ID NOS:1-391 may be provided in a variety of mediums to facilitate its use. In one application of this embodiment, the sequences of the present invention are recorded on computer readable media. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media.
The present invention further provides systems, particularly computer-based systems which contain the sequence information herein described stored in a data storage means. Such systems are designed to identify commercially important fragments of the Streptococcus pneumoniae genome.
Another embodiment of the present invention is directed to fragments of the Streptococcus pneumoniae genome having particular structural or functional attributes. Such fragments of the Streptococcus pneumoniae genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter referred to as open reading frames or ORFs, fragments which modulate the expression of an operably linked ORF, hereinafter referred to as expression modulating fragments or EMFs, and fragments which can be used to diagnose the presence of Streptococcus pneumoniae in a sample, hereinafter referred to as diagnostic fragments or DFs.
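As an illustration of what a fragment "which encodes a peptide" looks like computationally, the following is a minimal sketch of an open reading frame scan. It assumes the simplest textbook definition (an ATG start codon followed in frame by a TAA, TAG or TGA stop codon) and scans the forward strand only; the function name `find_orfs` and the `min_codons` parameter are hypothetical and are not taken from this disclosure, which does not specify the ORF criteria at this point.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq: str, min_codons: int = 3) -> List[Tuple[int, int, int]]:
    """Return (frame, start, end) for ATG...stop ORFs on the forward strand.

    start and end are 0-based, end is exclusive, and the span includes
    the stop codon. Only the first ATG before each stop is reported.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None
    return orfs


example = "CCATGAAATTTTAGGG"
for frame, start, end in find_orfs(example):
    print(frame, start, end, example[start:end])
```

A production gene finder would also scan the reverse complement and handle alternative start codons; those steps are omitted here to keep the sketch short.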
Each of the ORFs in fragments of the Streptococcus pneumoniae genome disclosed in Tables 1-3, and the EMFs found 5' to the ORFs, can be used in numerous ways as polynucleotide reagents. For instance, the sequences can be used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in a sample, to selectively control gene expression in a host and in the production of polypeptides, such as polypeptides encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.
The present invention further includes recombinant constructs comprising one or more fragments of the Streptococcus pneumoniae genome of the present invention. The recombinant constructs of the present invention comprise vectors, such as a plasmid or viral vector, into which a fragment of the Streptococcus pneumoniae genome has been inserted.
The present invention further provides host cells containing any of the isolated fragments of the Streptococcus pneumoniae genome of the present invention. The host cells can be a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell such as a bacterial cell.
The present invention is further directed to isolated polypeptides and proteins encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and proteins of the present invention from cells which have been altered to express them.
The invention further provides methods of obtaining homologs of the fragments of the Streptococcus pneumoniae genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention. Specifically, by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.
CA 02271720 1999-04-29 WO 98/18931 PCT/US97/19588
The invention further provides antibodies which selectively bind polypeptides and proteins of the present invention. Such antibodies include both monoclonal and polyclonal antibodies.
The invention further provides hybridomas which produce the above-described antibodies. A hybridoma is an immortalized cell line which is capable of secreting a specific monoclonal antibody.
The present invention further provides methods of identifying test samples derived from cells which express one of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one or more of the antibodies of the present invention, or one or more of the DFs of the present invention, under conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.
In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry out the above-described assays. Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers which comprises: (a) a first container comprising one of the antibodies, or one of the DFs of the present invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting presence of bound antibodies or hybridized DFs.
Using the isolated proteins of the present invention, the present invention further provides methods of obtaining and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present invention.
Specifically, such agents include, as further described below, antibodies, peptides, carbohydrates, pharmaceutical agents and the like. Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.
The present genomic sequences of Streptococcus pneumoniae will be of great value to all laboratories working with this organism and for a variety of commercial purposes. Many fragments of the Streptococcus pneumoniae genome will be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value to Streptococcus pneumoniae researchers and of immediate commercial value for the production of proteins or the control of gene expression.
The methodology and technology for elucidating extensive genomic sequences of bacterial and other genomes have greatly enhanced, and will continue to enhance, the ability to analyze and understand chromosomal organization. In particular, sequenced contigs and genomes will provide the models for developing tools for the analysis of chromosome structure and function, including the ability to identify genes within large segments of genomic DNA, the structure, position, and spacing of regulatory elements, the identification of genes with potential industrial applications, and the ability to do comparative genomic and molecular phylogeny.
DESCRIPTION OF THE FIGURES
FIGURE 1 is a block diagram of a computer system (102) that can be used to implement computer-based systems of the present invention.
FIGURE 2 is a schematic diagram depicting the data flow and computer programs used to collect, assemble, edit and annotate the contigs of the Streptococcus pneumoniae genome of the present invention. Both Macintosh and Unix platforms are used to handle the AB 373 and 377 sequence data files, largely as described in Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, 585, IEEE Computer Society Press, Washington D.C. (1993). Factura (AB) is a Macintosh program designed for automatic vector sequence removal and end-trimming of sequence files. The program Loadis runs on a Macintosh platform and parses the feature data extracted from the sequence files by Factura into the Unix-based Streptococcus pneumoniae relational database. Assembly of contigs (and whole genome sequences) is accomplished by retrieving a specific set of sequence files and their associated features using Extrseq, a Unix utility for retrieving sequences from an SQL database. The resulting sequence file is processed by seq_filter to trim portions of the sequences with more than 2% ambiguous nucleotides. The sequence files were assembled using TIGR Assembler, an assembly engine designed at The Institute for Genomic Research (TIGR) for rapid and accurate assembly of thousands of sequence fragments. The collection of contigs generated by the assembly step is loaded into the database with the lassie program. Identification of open reading frames (ORFs) is accomplished by processing contigs with zorf or GenMark. The ORFs are searched against S. pneumoniae sequences from GenBank and against all protein sequences using the BLASTN and BLASTP programs, described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Results of the ORF determination and similarity searching steps were loaded into the database. As described below, some results of the determination and the searches are set out in Tables 1-3.
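The trimming step attributed to seq_filter can be illustrated with a short sketch. The text states only the 2% ambiguity threshold; the sliding-window approach, the window size, and the function names `ambiguous_fraction` and `trim_ambiguous` below are assumptions made for illustration and are not a description of how seq_filter itself works.

```python
def ambiguous_fraction(seq: str) -> float:
    """Fraction of bases that are not an unambiguous A, C, G or T."""
    if not seq:
        return 0.0
    return sum(1 for base in seq.upper() if base not in "ACGT") / len(seq)


def trim_ambiguous(seq: str, window: int = 50, max_frac: float = 0.02) -> str:
    """Trim both ends until the terminal windows pass the ambiguity threshold.

    Hypothetical rule: slide inward one base at a time while the first
    (or last) `window` bases contain more than `max_frac` ambiguous calls.
    """
    start, end = 0, len(seq)
    # Trim the 5' end.
    while end - start >= window and ambiguous_fraction(seq[start:start + window]) > max_frac:
        start += 1
    # Trim the 3' end.
    while end - start >= window and ambiguous_fraction(seq[end - window:end]) > max_frac:
        end -= 1
    return seq[start:end]


read = "NNN" + "ACGT" * 10 + "NN"
print(trim_ambiguous(read, window=20))
```

With a 20-base window a single ambiguous call already exceeds 2%, so the example read is trimmed back to its unambiguous core.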
DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
The present invention is based on the sequencing of fragments of the Streptococcus pneumoniae genome and analysis of the sequences. The primary nucleotide sequences generated by sequencing the fragments are provided in SEQ ID NOS:1-391. (As used herein, the "primary sequence" refers to the nucleotide sequence represented by the IUPAC nomenclature system.) In addition to the aforementioned Streptococcus pneumoniae polynucleotide and polynucleotide sequences, the present invention provides the nucleotide sequences of SEQ ID NOS:1-391, or representative fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan.
As used herein, a "representative fragment of the nucleotide sequence depicted in SEQ ID NOS:1-391" refers to any portion of the SEQ ID NOS:1-391 which is not presently represented within a publicly available database. Preferred representative fragments of the present invention are Streptococcus pneumoniae open reading frames (ORFs), expression modulating fragments (EMFs) and fragments which can be used to diagnose the presence of Streptococcus pneumoniae in a sample (DFs). A non-limiting identification of preferred representative fragments is provided in Tables 1-3. As discussed in detail below, the information provided in SEQ ID NOS:1-391 and in Tables 1-3 together with routine cloning, synthesis, sequencing and assay methods will enable those skilled in the art to clone and sequence all "representative fragments" of interest, including open reading frames encoding a large variety of Streptococcus pneumoniae proteins.
While the presently disclosed sequences of SEQ ID NOS:1-391 are highly accurate, sequencing techniques are not perfect and, in relatively rare instances, further investigation of a fragment or sequence of the invention may reveal a nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID NOS:1-391. However, once the present invention is made available (i.e., once the information in SEQ ID NOS:1-391 and Tables 1-3 has been made available), resolving a rare sequencing error in SEQ ID NOS:1-391 will be well within the skill of the art. The present disclosure makes available sufficient sequence information to allow any of the described contigs or portions thereof to be obtained readily by straightforward application of routine techniques. Further sequencing of such polynucleotides may proceed in like manner using manual and automated sequencing methods which are employed ubiquitously in the art. Nucleotide sequence editing software is publicly available. For example, Applied Biosystems' (AB) AutoAssembler can be used as an aid during visual inspection of nucleotide sequences. By employing such routine techniques potential errors readily may be identified and the correct sequence then may be ascertained by targeting further sequencing effort, also of a routine nature, to the region containing the potential error.
Even if all of the very rare sequencing errors in SEQ ID NOS:1-391 were corrected, the resulting nucleotide sequences would still be at least 95% identical, nearly all would be at least 99% identical, and the great majority would be at least 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-391.
As discussed elsewhere herein, polynucleotides of the present invention readily may be obtained by routine application of well known and standard procedures for cloning and sequencing DNA. Detailed methods for obtaining libraries and for sequencing are provided below, for instance. A wide variety of Streptococcus pneumoniae strains that can be used to prepare S. pneumoniae genomic DNA for cloning and for obtaining polynucleotides of the present invention are available to the public from recognized depository institutions, such as the American Type Culture Collection (ATCC). While the present invention is enabled by the sequences and other information herein disclosed, the S. pneumoniae strain that provided the DNA of the present Sequence Listing, Strain 7/87 14.8.91, has been deposited in the ATCC, as a convenience to those of skill in the art. As a further convenience, a library of S. pneumoniae genomic DNA, derived from the same strain, also has been deposited in the ATCC. The S. pneumoniae strain was deposited on October 10, 1996, and was given Deposit No. 55840, and the cDNA library was deposited on October 11, 1996 and was given Deposit No. 97755. The genomic fragments in the library are 15 to 20 kb fragments generated by partial Sau3AI digestion and they are inserted into the BamHI site in the well-known lambda-derived vector lambda DASH II (Stratagene, La Jolla, CA). The provision of the deposits is not a waiver of any rights of the inventors or their assignees in the present subject matter.
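The reason a partial digest is used can be seen from the enzyme's recognition site. Sau3AI cuts at the four-base sequence GATC, which under the simplifying assumption of equal base frequencies would occur about once every 4^4 = 256 bp, so a complete digest would give fragments far shorter than 15 to 20 kb. The sketch below locates GATC sites and reports complete-digest fragment lengths for a linear sequence; the function names are hypothetical, the example sequence is made up, and the code does not model partial digestion or size selection.

```python
import re
from typing import List


def sau3ai_sites(seq: str) -> List[int]:
    """0-based positions of GATC sites; Sau3AI cuts immediately 5' of the G."""
    return [m.start() for m in re.finditer("GATC", seq.upper())]


def fragment_lengths(seq_len: int, cuts: List[int]) -> List[int]:
    """Fragment lengths for a complete digest of a linear molecule."""
    bounds = [0] + sorted(cuts) + [seq_len]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


seq = "AAGATCTTTTGATCCC"  # illustrative sequence, not from the Sequence Listing
sites = sau3ai_sites(seq)
print(sites, fragment_lengths(len(seq), sites))
```

Because BamHI (GGATCC) leaves the same GATC overhang, Sau3AI fragments can be ligated into a BamHI-cut vector, which is consistent with the cloning strategy described above.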
The nucleotide sequences of the genomes from different strains of Streptococcus pneumoniae differ somewhat. However, the nucleotide sequences of the genomes of all Streptococcus pneumoniae strains will be at least 95% identical, in corresponding part, to the nucleotide sequences provided in SEQ ID NOS:1-391. Nearly all will be at least 99% identical and the great majority will be 99.9% identical.
Thus, the present invention further provides nucleotide sequences which are at least 95%, preferably 99% and most preferably 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-391, in a form which can be readily used, analyzed and interpreted by the skilled artisan.
Methods for determining whether a nucleotide sequence is at least 95%, at least 99% or at least 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-391 are routine and readily available to the skilled artisan. For example, the well known FASTA algorithm described in Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85:2444 (1988) can be used to generate the percent identity of nucleotide sequences. The BLASTN program also can be used to generate an identity score of polynucleotides compared to one another.
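To make the identity thresholds concrete, the following sketch computes percent identity from two sequences that have already been aligned (equal length, with "-" marking gaps). This is a simplified stand-in, not the FASTA or BLASTN calculation: those programs build their own alignments and may count gaps and alignment length differently. The function names and the choice to count gap columns as mismatches are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical bases.

    Columns where both sequences have a gap are skipped; a gap opposite
    a base counts as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

The same function can be called with thresholds of 99.0 or 99.9 to check the narrower ranges recited above.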
COMPUTER RELATED EMBODIMENTS
The nucleotide sequences provided in SEQ ID NOS:1-391, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a polynucleotide sequence of SEQ ID NOS:1-391 may be "provided" in a variety of mediums to facilitate use thereof. As used herein, "provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence of the present invention; i.e., a nucleotide sequence provided in SEQ ID NOS:1-391, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a polynucleotide of SEQ ID NOS:1-391. Such a manufacture provides a large portion of the Streptococcus pneumoniae genome and parts thereof (e.g., a Streptococcus pneumoniae open reading frame (ORF)) in a form which allows a skilled artisan to examine the manufacture using means not directly applicable to examining the Streptococcus pneumoniae genome or a subset thereof as it exists in nature or in purified form.
In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled artisan can readily appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising computer readable medium having recorded thereon a nucleotide sequence of the present invention. Likewise, it will be clear to those of skill how additional computer readable media that may be developed also can be used to create analogous manufactures having recorded thereon a nucleotide sequence of the present invention.
As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the nucleotide sequence information of the present invention. A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of data-processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.
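The two storage styles named above, an ASCII text file and a database application, can be sketched as follows. The FASTA layout and the SQLite database are stand-ins chosen for illustration; the text names DB2, Sybase and Oracle, not SQLite, and the table name `contig`, the record name `contig_1` and all function names are hypothetical.

```python
import sqlite3
from typing import Dict


def to_fasta(records: Dict[str, str], width: int = 60) -> str:
    """Serialize named sequences as FASTA-formatted ASCII text."""
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text back into a name-to-sequence mapping."""
    chunks: Dict[str, list] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def store_in_db(conn: sqlite3.Connection, records: Dict[str, str]) -> None:
    """Store sequences in a single-table relational layout."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contig (name TEXT PRIMARY KEY, sequence TEXT NOT NULL)"
    )
    conn.executemany("INSERT OR REPLACE INTO contig VALUES (?, ?)", list(records.items()))
    conn.commit()


records = {"contig_1": "ACGT" * 20}  # placeholder data, not from the Sequence Listing
conn = sqlite3.connect(":memory:")
store_in_db(conn, records)
print(to_fasta(records), end="")
```

As the text notes, the choice between the two is driven by how the data will be accessed: a flat file suits sequential reading, while a database suits retrieval of individual records by name.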
Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium. Thus, by providing in computer readable form the nucleotide sequences of SEQ ID NOS:1-391, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a sequence of SEQ ID NOS:1-391, the present invention enables the skilled artisan routinely to access the provided sequence information for a wide variety of purposes.
The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system was used to identify open reading frames (ORFs) within the Streptococcus pneumoniae genome which contain homology to ORFs or proteins from both Streptococcus pneumoniae and from other organisms. Among the ORFs discussed herein are protein encoding fragments of the Streptococcus pneumoniae genome useful in producing commercially important proteins, such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.
The present invention further provides systems, particularly computer-based systems, which contain the sequence information described herein. Such systems are designed to identify, among other things, commercially important fragments of the Streptococcus pneumoniae genome.
As used herein, "a computer-based system" refers to the hardware means, software means, and data storage means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available computer-based systems is suitable for use in the present invention.
As stated above, the computer-based systems of the present invention comprise a data storage means having stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means for supporting and implementing a search means.
As used herein, "data storage means" refers to memory which can store nucleotide sequence information of the present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide sequence information of the present invention.
As used herein, "search means" refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the present genomic sequences which match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety of commercially available software packages for conducting search means are, and can be, used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan can readily recognize that any one of the available algorithms or implementing software packages for conducting homology searches can be adapted for use in the present computer-based systems.
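To make the notion of a search means concrete, here is a minimal sketch that compares a target sequence against stored sequences with a sliding window and a mismatch allowance. It is not BLAST or MacPattern; the function name `find_matches`, the dictionary used as the data storage means, and the toy sequences are assumptions for illustration only.

```python
def find_matches(target, stored, max_mismatches=0):
    """Return (contig_id, offset, mismatches) for every window of a stored
    sequence that differs from `target` at no more than `max_mismatches` positions."""
    hits = []
    n = len(target)
    for contig_id, seq in stored.items():
        for i in range(len(seq) - n + 1):
            mismatches = sum(a != b for a, b in zip(target, seq[i:i + n]))
            if mismatches <= max_mismatches:
                hits.append((contig_id, i, mismatches))
    return hits


# Hypothetical stored sequences (0-based offsets are reported).
stored = {
    "contig_1": "GGATCCATGAAATTTGGG",
    "contig_2": "TTTATGAACTTTCCC",
}
exact = find_matches("ATGAAATTT", stored)
near = find_matches("ATGAAATTT", stored, max_mismatches=1)
```

Sorting the returned hits by mismatch count would give the kind of ranked output that the text describes as a preferred output format.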
As used herein, a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that searches for commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
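The relationship between target length and chance matches can be illustrated with a rough calculation. The sketch below assumes a single strand and equal, independent base frequencies, which real genomes do not have, so the numbers are indicative only; the function name and the database sizes are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of exact chance matches of one target in a random
    sequence, assuming uniform and independent residues (an idealization)."""
    windows = max(database_length - target_length + 1, 0)
    return windows * alphabet_size ** (-target_length)


# A 6-nucleotide target is expected once per 4**6 = 4096 windows.
short_target = expected_random_hits(6, 4101)
# A 30-nucleotide target in an assumed 2,000,000-base database.
long_target = expected_random_hits(30, 2_000_000)
```

Under these assumptions a six-base target recurs by chance many times in any genome-sized database, while a 30-base target is effectively never expected at random, which is consistent with the preferred lengths given above.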
As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzymic active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).
A variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention. A preferred format for an output means ranks fragments of the Streptococcus pneumoniae genomic sequences possessing varying degrees of homology to the target sequence or target motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the target sequence or target motif and identifies the degree of homology contained in the identified fragment.
A variety of comparing means can be used to compare a target sequence or target motif with the data storage means to identify sequence fragments of the Streptococcus pneumoniae genome. In the present examples, software which implements the BLAST and BLAZE algorithms, described in Altschul et al., J. Mol. Biol. 215:403-410 (1990), is used to identify open reading frames within the Streptococcus pneumoniae genome. A skilled artisan can readily recognize that any one of the publicly available homology search programs can be used as the search means for the computer-based systems of the present invention.
Of course, suitable proprietary systems that may be known to those of skill also may be employed in this regard.
Figure 1 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104 are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage devices 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes appropriate software for reading the control logic and/or the data from the removable storage medium 116 once it is inserted into the removable medium storage device 114.
A nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108, any of the secondary storage devices 110, and/or a removable storage medium 116. During execution, software for accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory 108, in accordance with the requirements and operating parameters of the operating system, the hardware system and the software program or programs.

BIOCHEMICAL EMBODIMENTS
Other embodiments of the present invention are directed to isolated fragments of the Streptococcus pneumoniae genome. The fragments of the Streptococcus pneumoniae genome of the present invention include, but are not limited to, fragments which encode peptides and polypeptides, hereinafter open reading frames (ORFs), fragments which modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs), and fragments which can be used to diagnose the presence of Streptococcus pneumoniae in a sample, hereinafter diagnostic fragments (DFs).
As used herein, an "isolated nucleic acid molecule" or an "isolated fragment of the Streptococcus pneumoniae genome" refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification means to reduce, from the composition, the number of compounds which are normally associated with the composition. Particularly, the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS:1-391, to representative fragments thereof as described above, and to polynucleotides at least 95%, preferably at least 99% and especially preferably at least 99.9% identical in sequence thereto, also as set out above.
A variety of purification means can be used to generate the isolated fragments of the present invention. These include, but are not limited to, methods which separate constituents of a solution based on charge, solubility, or size.
In one embodiment, Streptococcus pneumoniae DNA can be enzymatically sheared to produce fragments of 15-20 kb in length. These fragments can then be used to generate a Streptococcus pneumoniae library by inserting them into lambda clones as described in the Examples below. Primers flanking, for example, an ORF, such as those enumerated in Tables 1-3, can then be generated using nucleotide sequence information provided in SEQ ID NOS:1-391. Well known and routine techniques of PCR cloning then can be used to isolate the ORF from the lambda DNA library or Streptococcus pneumoniae genomic DNA. Thus, given the availability of SEQ ID NOS:1-391, the information in Tables 1, 2 and 3, and the information that may be obtained readily by analysis of the sequences of SEQ ID NOS:1-391 using methods set out above, those of skill will be enabled by the present disclosure to isolate any ORF-containing or other nucleic acid fragment of the present invention.

The isolated nucleic acid molecules of the present invention include, but are not limited to single stranded and double stranded DNA, and single stranded RNA.
As used herein, an "open reading frame," ORF, means a series of triplets coding for amino acids without any termination codons and is a sequence translatable into protein.
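The definition above, a run of triplets uninterrupted by termination codons, can be sketched as a scan of one forward reading frame. This is only an illustration of the definition, not the GeneMark method used to build the tables; it assumes the standard stop codons TAA, TAG and TGA, and the function name and example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumed standard termination codons


def stop_free_stretches(seq, frame=0, min_codons=1):
    """Return 0-based, half-open (start, end) coordinates of every run of
    codons in one forward frame that contains no termination codon."""
    stretches = []
    run_start = frame
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            if (i - run_start) // 3 >= min_codons:
                stretches.append((run_start, i))
            run_start = i + 3
    # Close the final run at the last complete codon of the frame.
    end = frame + ((len(seq) - frame) // 3) * 3
    if (end - run_start) // 3 >= min_codons:
        stretches.append((run_start, end))
    return stretches


# Codons: ATG AAA TAG GGG CCC TGA
runs = stop_free_stretches("ATGAAATAGGGGCCCTGA")
```

Scanning all three frames of both strands and raising `min_codons` would give the stop-to-stop candidates from which gene-finding software selects likely coding regions.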
Tables 1, 2, and 3 list ORFs in the Streptococcus pneumoniae genomic contigs of the present invention that were identified as putative coding regions by the GeneMark software using organism-specific second-order Markov probability transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical methods, such as those discussed herein, to generate more inclusive, more restrictive, or more selective lists.
Table 1 sets out ORFs in the Streptococcus pneumoniae contigs of the present invention that over a continuous region of at least 50 bases are 95% or more identical (by BLAST analysis) to a nucleotide sequence available through GenBank in October, 1997.
Table 2 sets out ORFs in the Streptococcus pneumoniae contigs of the present invention that are not in Table 1 and match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through GenBank in October, 1997.
Table 3 sets out ORFs in the Streptococcus pneumoniae contigs of the present invention that do not match significantly, by BLASTP analysis, a polypeptide sequence available through GenBank in October, 1997.
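The three table definitions amount to a simple decision rule, sketched below. The function and parameter names are hypothetical, and the sketch assumes that "do not match significantly" for Table 3 means failing the same 0.01 BLASTP cutoff used for Table 2, which the text implies but does not state outright.

```python
from typing import Optional


def assign_table(
    nt_identity_pct: Optional[float],
    nt_match_length: int,
    blastp_probability: Optional[float],
) -> int:
    """Return 1, 2 or 3 according to the table criteria described above.

    `None` means that no match was found for that kind of search."""
    # Table 1: at least 95% identical over a continuous region of >= 50 bases.
    if (
        nt_identity_pct is not None
        and nt_identity_pct >= 95.0
        and nt_match_length >= 50
    ):
        return 1
    # Table 2: not in Table 1, BLASTP probability score of 0.01 or less.
    if blastp_probability is not None and blastp_probability <= 0.01:
        return 2
    # Table 3: no significant polypeptide match (assumed: same cutoff).
    return 3
```

For example, `assign_table(97.0, 120, None)` falls in Table 1, while an ORF with a 97% match over only 40 bases and a BLASTP probability of 0.001 falls in Table 2.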
In each table, the first and second columns identify the ORF by, respectively, contig number and ORF number within the contig; the third column indicates the first nucleotide of the ORF (actually the first nucleotide of the stop codon immediately preceding the ORF), counting from the 5' end of the contig strand; and the fourth column, "stop (nt)," indicates the last nucleotide of the stop codon defining the 3' end of the ORF.
In Tables 1 and 2, column five lists the Reference for the closest matching sequence available through GenBank. These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be familiar with their designations. Descriptions of the nomenclature are available from the National Center for Biotechnology Information. Column six in Tables 1 and 2 provides the gene name of the matching sequence; column seven provides the BLAST identity score and column eight the BLAST similarity score from the comparison of the ORF and the homologous gene; and column nine indicates the length in nucleotides of the highest scoring segment pair identified by the BLAST identity analysis.
Each ORF described in the tables is defined by "start (nt)" (5') and "stop (nt)" (3') nucleotide position numbers. These position numbers refer to the boundaries of each ORF and provide orientation with respect to whether the forward or reverse strand is the coding strand and in which reading frame the coding sequence is contained. The "start" position is the first nucleotide of the triplet encoding a stop codon just 5' to the ORF and the "stop" position is the last nucleotide of the triplet encoding the next in-frame stop codon (i.e., the stop codon at the 3' end of the ORF). Those of ordinary skill in the art appreciate that preferred fragments within each ORF described in the table include fragments of each ORF which include the entire sequence from the delineated "start" and "stop" positions excepting the first and last three nucleotides since these encode stop codons. Thus, polynucleotides set out as ORFs in the tables but lacking the three (3) 5' nucleotides and the three (3) 3' nucleotides are encompassed by the present invention. Those of skill also appreciate that particularly preferred are fragments within each ORF that are polynucleotide fragments comprising polypeptide coding sequence. As defined herein, "coding sequence" includes the fragment within an ORF beginning at the first in-frame ATG (triplet encoding methionine) and ending with the last nucleotide prior to the triplet encoding the 3' stop codon.
Preferred are fragments comprising the entire coding sequence and fragments comprising the entire coding sequence, excepting the coding sequence for the N-terminal methionine. Those of skill appreciate that the N-terminal methionine is often removed during post-translational processing and that polynucleotides lacking the ATG can be used to facilitate production of N-terminal fusion proteins which may be beneficial in the production or use of genetically engineered proteins. Of course, due to the degeneracy of the genetic code many polynucleotides can encode a given polypeptide. Thus, the invention further includes polynucleotides comprising a nucleotide sequence encoding a polypeptide sequence itself encoded by the coding sequence within an ORF described in Tables 1-3 herein. Further, polynucleotides at least 95%, preferably at least 99% and especially preferably at least 99.9% identical in sequence to the foregoing polynucleotides, are contemplated by the present invention.
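The coordinate conventions described above can be sketched as follows: take the span between the tabulated "start" and "stop" positions, drop the three nucleotides at each end (the flanking stop codons), and then locate the first in-frame ATG. The sketch assumes 1-based positions and assumes that a "start" greater than "stop" marks an ORF on the reverse strand; both the function name and the toy contig are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def orf_fragments(contig, start_nt, stop_nt):
    """Return (stop-trimmed ORF, coding sequence) for 1-based table positions.

    The coding sequence is empty if no in-frame ATG is present."""
    if start_nt < stop_nt:  # forward strand
        region = contig[start_nt - 1:stop_nt]
    else:  # assumed convention for the reverse strand
        region = reverse_complement(contig[stop_nt - 1:start_nt])
    trimmed = region[3:-3]  # remove the 5' and 3' stop codons
    coding = ""
    for i in range(0, len(trimmed) - 2, 3):
        if trimmed[i:i + 3] == "ATG":  # first in-frame methionine codon
            coding = trimmed[i:]
            break
    return trimmed, coding


# Toy contig: CC TAA GGG ATG AAA CCC TGA CC (stop codons at 3-5 and 18-20)
contig = "CCTAAGGGATGAAACCCTGACC"
trimmed, coding = orf_fragments(contig, 3, 20)
```

Here `trimmed` is the twelve nucleotides between the two stop codons and `coding` begins at the ATG; dropping the first three nucleotides of `coding` would give the variant lacking the N-terminal methionine codon.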

Polypeptides encoded by polynucleotides described above and elsewhere herein are also provided by the present invention, as are polypeptides comprising an amino acid sequence at least about 95%, preferably at least 97% and even more preferably 99% identical to the amino acid sequence of a polypeptide encoded by an ORF shown in Tables 1-3. These polypeptides may or may not comprise an N-terminal methionine.
The concepts of percent identity and percent similarity of two polypeptide sequences are well understood in the art. For example, two polypeptides 10 amino acids in length which differ at three amino acid positions (e.g., at positions 1, 3 and 5) are said to have a percent identity of 70%. However, the same two polypeptides would be deemed to have a percent similarity of 80% if, for example at position 5, the amino acid moieties, although not identical, were "similar" (i.e., possessed similar biochemical characteristics). Many programs for analysis of nucleotide or amino acid sequence similarity, such as fasta and BLAST, specifically list percent identity of a matching region as an output parameter. Thus, for instance, Tables 1 and 2 herein enumerate the percent identity of the highest scoring segment pair in each ORF and its listed relative. Further details concerning the algorithms and criteria used for homology searches are provided below and are described in the pertinent literature highlighted by the citations provided below.
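The worked example in the preceding paragraph can be reproduced with a short calculation. The grouping of "similar" amino acids below is an illustrative assumption (programs such as BLAST use substitution matrices instead), and the two peptides are invented so that they differ at positions 1, 3 and 5 with only position 5 being a similar pair.

```python
# Illustrative similarity classes; not the scoring scheme of any real program.
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def percent_identity_and_similarity(a, b):
    """Return (percent identity, percent similarity) for two aligned,
    equal-length peptides; similarity counts identical plus similar positions."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    identical = sum(x == y for x, y in zip(a, b))
    similar = sum(
        x != y and any(x in g and y in g for g in SIMILAR_GROUPS)
        for x, y in zip(a, b)
    )
    n = len(a)
    return 100.0 * identical / n, 100.0 * (identical + similar) / n


# Differences at positions 1 (M/G), 3 (T/P) and 5 (E/D); only E/D is "similar".
identity, similarity = percent_identity_and_similarity("MKTLEAGSPW", "GKPLDAGSPW")
```

With seven identical positions out of ten the identity is 70%, and counting the one similar pair raises the similarity to 80%, matching the figures in the text.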
It will be appreciated that other criteria can be used to generate more inclusive and more exclusive listings of the types set out in the tables. As those of skill will appreciate, narrow and broad searches both are useful. Thus, a skilled artisan can readily identify ORFs in contigs of the Streptococcus pneumoniae genome other than those listed in Tables 1-3, such as ORFs which are overlapping or encoded by the opposite strand of an identified ORF in addition to those ascertainable using the computer-based systems of the present invention.
As used herein, an "expression modulating fragment," EMF, means a series of nucleotide molecules which modulates the expression of an operably linked ORF or EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs include, but are not limited to, promoters and promoter modulating sequences (inducible elements). One class of EMFs are fragments which induce the expression of an operably linked ORF in response to a specific regulatory factor or physiological event.
EMF sequences can be identified within the contigs of the Streptococcus pneumoniae genome by their proximity to the ORFs provided in Tables 1-3. An intergenic segment, or a fragment of the intergenic segment, from about 10 to to nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate the expression of an operably linked ORF in a fashion similar to that found with the naturally linked ORF sequence. As used herein, an "intergenic segment" refers to fragments of the Streptococcus pneumoniae genome which are between two ORF(s) herein described. EMFs also can be identified using known EMFs as a target sequence or target motif in the computer-based systems of the present invention. Further, the two methods can be combined and used together.
The presence and activity of an EMF can be confirmed using an EMF trap vector. An EMF trap vector contains a cloning site linked to a marker sequence. A marker sequence encodes an identifiable phenotype, such as antibiotic resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap vector is placed within an appropriate host under appropriate conditions. As described above, an EMF will modulate the expression of an operably linked marker sequence. A more detailed discussion of various marker sequences is provided below. A sequence which is suspected as being an EMF is cloned in all three reading frames in one or more restriction sites upstream from the marker sequence in the EMF trap vector. The vector is then transformed into an appropriate host using known procedures and the phenotype of the transformed host is examined under appropriate conditions.
As used herein, a "diagnostic fragment," DF, means a series of nucleotide molecules which selectively hybridize to Streptococcus pneumoniae sequences. DFs can be readily identified by identifying unique sequences within contigs of the Streptococcus pneumoniae genome, such as by using well-known computer analysis software, and by generating and testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which determines amplification or hybridization selectivity.
The sequences falling within the scope of the present invention are not limited to the specific sequences herein described, but also include allelic and species variations thereof. Allelic and species variations can be routinely determined by comparing the sequences provided in SEQ ID NOS:1-391, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to SEQ ID NOS:1-391, with a sequence from another isolate of the same species. Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated. Any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment, such as an ORF, in both directions (i.e., sequence both strands). Alternatively, error screening can be performed by sequencing corresponding polynucleotides of Streptococcus pneumoniae origin isolated by using part or all of the fragments in question as a probe or primer.
Preferred DFs of the present invention comprise at least about 17, preferably at least about 20, and more preferably at least about 50 contiguous nucleotides within an ORF set out in Tables 1-3. Most highly preferred DFs specifically hybridize to a polynucleotide containing the sequence of the ORF from which they are derived. Specific hybridization occurs even under stringent conditions defined elsewhere herein.
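One computational way to shortlist candidate DFs is to keep only those contiguous stretches of an ORF that do not occur, on either strand, in a set of comparison sequences. The sketch below does this with exact k-mer matching; it is a simplification, since real selectivity depends on hybridization conditions and must be confirmed experimentally as described above. The function names and the toy sequences are assumptions, and k defaults to the minimum preferred length of 17.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Set of all contiguous subsequences of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_diagnostic_kmers(target_orf, other_sequences, k=17):
    """Return the k-mers of `target_orf` that are absent from both strands
    of every sequence in `other_sequences` (exact matching only)."""
    background = set()
    for seq in other_sequences:
        background |= kmers(seq, k)
        background |= kmers(reverse_complement(seq), k)
    return sorted(kmers(target_orf, k) - background)


# Toy example with k=4 so that the result is easy to inspect by hand.
candidates = candidate_diagnostic_kmers("ACGTTGCA", ["ACGTAAAA"], k=4)
```

In the toy example the shared stretch `ACGT` is discarded and the remaining four 4-mers are retained as candidates.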
Each of the ORFs of the Streptococcus pneumoniae genome disclosed in Tables 1, 2 and 3, and the EMFs found 5' to the ORFs, can be used as polynucleotide reagents in numerous ways. For example, the sequences can be used as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe in a sample, particularly Streptococcus pneumoniae. Especially preferred in this regard are ORFs such as those of Table 3, which do not match previously characterized sequences from other organisms and thus are most likely to be highly selective for Streptococcus pneumoniae. Also particularly preferred are ORFs that can be used to distinguish between strains of Streptococcus pneumoniae, particularly those that distinguish medically important strains, such as drug-resistant strains.

In addition, the fragments of the present invention, as broadly described, can be used to control gene expression through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a polynucleotide sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Information from the sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides. Polynucleotides suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition. Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991). Antisense techniques in general are discussed in, for instance, Okano, J. Neurochem. 56:560 (1991) and Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988).
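The antisense design rule stated above (an oligonucleotide of 20 to 40 bases complementary to the mRNA) reduces to taking the reverse complement of a chosen mRNA window. The sketch below returns a DNA oligonucleotide; the function name, the choice of window, and the example mRNA are hypothetical, and no claim is made about which windows would be effective in practice.

```python
# Map each RNA base to the complementary DNA base.
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligonucleotide (written 5'->3') complementary to
    mrna[start:start + length]; length is limited to the 20-40 base range."""
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    region = mrna[start:start + length]
    if len(region) < length:
        raise ValueError("window extends past the end of the mRNA")
    return region.translate(_RNA_TO_DNA_COMPLEMENT)[::-1]


# Invented 30-nucleotide mRNA fragment beginning at the initiation codon.
mrna = "AUGGCUAAAGGUCCUGAAUUCGGAUCCAAA"
oligo = antisense_oligo(mrna, 0, 20)
```

The resulting oligonucleotide pairs base-for-base with the first 20 nucleotides of the message, including the AUG.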
The present invention further provides recombinant constructs comprising one or more fragments of the Streptococcus pneumoniae genomic fragments and contigs of the present invention. Certain preferred recombinant constructs of the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the Streptococcus pneumoniae genome has been inserted, in a forward or reverse orientation. In the case of a vector comprising one of the ORFs of the present invention, the vector may further comprise regulatory sequences, including for example, a promoter, operably linked to the ORF. For vectors comprising the EMFs of the present invention, the vector may further comprise a marker sequence or heterologous ORF operably linked to the EMF.
Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially available for generating the recombinant constructs of the present invention. The following vectors are provided by way of example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK, pBS KS, pNH8a, pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available from Pharmacia). Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene); pSVK3, pBPV, pMSG, pSVL (available from Pharmacia).
Promoter regions can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art.
The present invention further provides host cells containing any one of the isolated fragments of the Streptococcus pneumoniae genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the host cell using known methods. The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic host cell, such as a yeast cell, or a procaryotic cell, such as a bacterial cell.
A polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such as calcium phosphate transfection, DEAE-dextran mediated transfection and electroporation, which are described in, for instance, Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986).
A host cell containing one of the fragments of the Streptococcus pneumoniae genomic fragments and contigs of the present invention, can be used in conventional manners to produce the gene product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF. The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present invention or by degenerate variants of the nucleic acid fragments of the present invention. By "degenerate variant" is intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to the degeneracy of the Genetic Code, encode an identical polypeptide sequence.
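Whether one fragment is a degenerate variant of another can be checked by translating both and comparing the polypeptides. The sketch below assumes the standard genetic code and in-frame sequences of complete codons; the helper names and the example sequences are invented for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA sequence, ignoring any trailing partial codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def is_degenerate_variant(a, b):
    """True if the two sequences differ in nucleotides but encode the same polypeptide."""
    return a != b and translate(a) == translate(b)


original = "ATGAAAGGT"  # Met-Lys-Gly
synonymous = "ATGAAGGGC"  # Met-Lys-Gly using different Lys and Gly codons
altered = "ATGAATGGT"  # Met-Asn-Gly
```

Here `synonymous` is a degenerate variant of `original`, whereas `altered` changes the encoded lysine to asparagine and is not.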
Preferred nucleic acid fragments of the present invention are the ORFs and subfragments thereof depicted in Tables 2 and 3 which encode proteins.

A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially available peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides. Such short fragments as may be obtained most readily by synthesis are useful, for example, in generating antibodies against the native polypeptide, as discussed further below.
In an alternative method, the polypeptide or protein is purified from bacterial cells which naturally produce the polypeptide or protein. One skilled in the art can readily employ well-known methods for isolating polypeptides and proteins to isolate and purify polypeptides or proteins of the present invention produced naturally by a bacterial strain, or by other methods. Methods for isolation and purification that can be employed in this regard include, but are not limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and immuno-affinity chromatography.
The polypeptides and proteins of the present invention also can be purified from cells which have been altered to express the desired polypeptide or protein.
As used herein, a cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally does not produce or which the cell normally produces at a lower level. Those skilled in the art can readily adapt procedures for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides or proteins of the present invention.
Any host/vector system can be used to express one or more of the ORFs of the present invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis. The most preferred cells are those which do not normally express the particular polypeptide or protein or which express the polypeptide or protein at a low natural level.

"Recombinant," as used herein, means that a polypeptide or protein is derived from recombinant (e.g., microbial or mammalian) expression systems.
"Microbial" refers to recombinant polypeptides or proteins made in bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" defines a polypeptide or protein essentially free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells.
"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides.
Generally, DNA segments encoding the polypeptides and proteins provided by this invention are assembled from fragments of the Streptococcus pneumoniae genome and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a microbial or viral operon.
"Recombinant expression vehicle or vector" refers to a plasmid or phage or virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. The expression vehicle can comprise a transcriptional unit comprising an assembly of (1) genetic regulatory elements necessary for gene expression in the host, including elements required to initiate and maintain transcription at a level sufficient for suitable expression of the desired polypeptide, including, for example, promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a structural or coding sequence which is transcribed into mRNA and translated into protein, and (3) appropriate signals to initiate translation at the beginning of the desired coding region and terminate translation at its end. Structural units intended for use in yeast or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated protein by a host cell.
Alternatively, where recombinant protein is expressed without a leader or transport sequence, it may include an N-terminal methionine residue. This residue may or may not be subsequently cleaved from the expressed recombinant protein to provide a final product.
"Recombinant expression system" means host cells which have stably integrated a recombinant transcriptional unit into chromosomal DNA or carry the recombinant transcriptional unit extra chromosomally. The cells can be prokaryotic or eukaryotic. Recombinant expression systems as defined herein will express heterologous polypeptides or proteins upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed.
Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which is hereby incorporated by reference in its entirety.
Generally, recombinant expression vectors will include origins of replication and selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and the S. cerevisiae TRP1 gene, and a promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence. Such promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), alpha-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable of directing secretion of translated protein into the periplasmic space or extracellular medium.
Optionally, the heterologous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a desired protein together with suitable translation initiation and termination signals in operable reading phase with a functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication to ensure maintenance of the vector and, when desirable, provide amplification within the host.
Suitable prokaryotic hosts for transformation include strains of E. coli, B. subtilis, Salmonella typhimurium and various species within the genera Pseudomonas and Streptomyces. Others may also be employed as a matter of choice.
As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 (available from Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (available from Promega Biotec, Madison, WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence to be expressed.
Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter, where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product. Thereafter cells are typically harvested, generally by centrifugation, disrupted to release expressed protein, generally by physical or chemical means, and the resulting crude extract is retained for further purification.
Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:175 (1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and BHK cell lines.
Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required nontranscribed genetic elements.
Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can be employed for final purification steps.

The present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are substantially equivalent to those herein described.
As used herein, substantially equivalent can refer both to nucleic acid and amino acid sequences (for example, a mutant sequence) that vary from a reference sequence by one or more substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity between reference and subject sequences. For purposes of the present invention, sequences having equivalent biological activity and equivalent expression characteristics are considered substantially equivalent. For purposes of determining equivalence, truncation of the mature sequence should be disregarded.
The invention further provides methods of obtaining homologs, from other strains of Streptococcus pneumoniae, of the fragments of the Streptococcus pneumoniae genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention. As used herein, a sequence or protein of Streptococcus pneumoniae is defined as a homolog of one of the Streptococcus pneumoniae fragments or contigs, or of a protein encoded by one of the ORFs of the present invention, if it shares significant homology to one of the fragments of the Streptococcus pneumoniae genome of the present invention or a protein encoded by one of the ORFs of the present invention. Specifically, by using the sequence disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.
As used herein, two nucleic acid molecules or proteins are said to "share significant homology" if the two contain regions which possess greater than 85% sequence (amino acid or nucleic acid) homology. Preferred homologs in this regard are those with more than 90% homology. Especially preferred are those with 93% or more homology. Among especially preferred homologs those with 95% or more homology are particularly preferred. Very particularly preferred among these are those with 97% homology, and even more particularly preferred among those are homologs with 99% or more homology. The most preferred homologs among these are those with 99.9% homology or more. It will be understood that, among measures of homology, identity is particularly preferred in this regard.
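The preference tiers above can be made concrete with a short sketch. The Python below is illustrative only and is not part of the disclosed methods. It assumes the two sequences have already been aligned to equal length, uses percent identity as the homology measure (the text notes that identity is particularly preferred), and treats the 97% tier as inclusive. The names percent_identity, homology_tier and TIERS are hypothetical.

```python
# Illustrative sketch only; all names here are hypothetical.
# Assumes the two sequences are already aligned to the same length,
# with "-" marking alignment gaps (gaps never count as matches).

def percent_identity(a: str, b: str) -> float:
    """Return percent identity between two pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(x == y and x != "-" for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)

# (threshold, inclusive?, label), highest tier first, following the text.
# "greater than 85%" and "more than 90%" are strict; the others are "or more".
# Treating 97% as inclusive is an assumption.
TIERS = [
    (99.9, True, "most preferred"),
    (99.0, True, "even more particularly preferred"),
    (97.0, True, "very particularly preferred"),
    (95.0, True, "particularly preferred"),
    (93.0, True, "especially preferred"),
    (90.0, False, "preferred"),
    (85.0, False, "significant homology"),
]

def homology_tier(pct: float) -> str:
    """Map a percent homology value onto the tiers described in the text."""
    for threshold, inclusive, label in TIERS:
        if pct > threshold or (inclusive and pct == threshold):
            return label
    return "below the significant-homology threshold"
```

For instance, homology_tier(percent_identity(seq_a, seq_b)) would classify a pair of aligned regions; a real comparison would first need an alignment step, which this sketch deliberately leaves out.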
Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS:1-391 or from a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ ID NOS:1-391 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have been described in great detail in many publications such as, for example, Innis et al., PCR Protocols, Academic Press, San Diego, CA (1990).
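As a rough illustration of how region specific primers relate to a target sequence, the sketch below takes the two ends of a region as a forward primer and a reverse primer. It is a simplification under stated assumptions: the toy sequence is invented for the example and is not taken from SEQ ID NOS:1-391, the function names are hypothetical, and practical primer design would also weigh melting temperature, GC content and self-complementarity, which are ignored here.

```python
# Illustrative sketch only; names and the toy sequence are hypothetical.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def flanking_primers(region: str, length: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) primers taken from the ends of a region.

    The forward primer is the first `length` bases of the given strand;
    the reverse primer is the reverse complement of the last `length` bases,
    so that it anneals to the given strand and primes toward the forward primer.
    """
    if len(region) < 2 * length:
        raise ValueError("region is too short for two non-overlapping primers")
    return region[:length], reverse_complement(region[-length:])

# Toy example with short primers so the result is easy to check by eye.
forward, reverse = flanking_primers("ATGCCCGGGTTTAAAC", length=4)
```

Here forward is the first four bases of the toy region and reverse is the reverse complement of its last four bases.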
When using primers derived from SEQ ID NOS:1-391 or from a nucleotide sequence having an aforementioned identity to a sequence of SEQ ID NOS:1-391, one skilled in the art will recognize that by employing high stringency conditions (e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only sequences which are greater than 75% homologous to the primer will be amplified. By employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences which are greater than 40-50% homologous to the primer will also be amplified.
When using DNA probes derived from SEQ ID NOS:1-391, or from a nucleotide sequence having an aforementioned identity to a sequence of SEQ ID NOS:1-391, for colony/plaque hybridization, one skilled in the art will recognize that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC), sequences having regions which are greater than 90% homologous to the probe can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences having regions which are greater than 35-45% homologous to the probe will be obtained.
Any organism can be used as the source for homologs of the present invention so long as the organism naturally expresses such a protein or contains genes encoding the same. The most preferred organisms for isolating homologs are bacteria which are closely related to Streptococcus pneumoniae.
ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION
Each ORF provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide. As a result, one skilled in the art can use the polypeptides of the present invention for commercial, therapeutic and industrial purposes consistent with the type of putative identification of the polypeptide. Such identifications permit one skilled in the art to use the Streptococcus pneumoniae ORFs in a manner similar to the known type of sequences for which the identification is made; for example, to ferment a particular sugar source or to produce a particular metabolite. A variety of reviews illustrative of this aspect of the invention are available, including the following reviews on the industrial use of enzymes, for example, BIOCHEMICAL ENGINEERING AND BIOTECHNOLOGY HANDBOOK, 2nd Ed., MacMillan Publications, Ltd. NY (1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds., Elsevier Science Publishers, Amsterdam, The Netherlands (1985). A variety of exemplary uses that illustrate this and similar aspects of the present invention are discussed below.
1. Biosynthetic Enzymes
Open reading frames encoding proteins involved in mediating the catalytic reactions involved in intermediary and macromolecular metabolism, the biosynthesis of small molecules, cellular processes and other functions can be used for industrial biosynthesis. These include enzymes involved in the degradation of the intermediary products of metabolism, enzymes involved in central intermediary metabolism, enzymes involved in respiration, both aerobic and anaerobic, enzymes involved in fermentation, enzymes involved in ATP proton motive force conversion, enzymes involved in broad regulatory function, enzymes involved in amino acid synthesis, enzymes involved in nucleotide synthesis, and enzymes involved in cofactor and vitamin synthesis.
The various metabolic pathways present in Streptococcus pneumoniae can be identified based on absolute nutritional requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS:1-391.
Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as non-macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase.
Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a number of industrial processes including the processing of flax and other vegetable fibers, in the extraction, clarification and depectinization of fruit juices, in the extraction of vegetable oil and in the maceration of fruits and vegetables to give unicellular fruits. A detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts et al., Symbiosis 21:79 (1986) and Voragen et al. in Biocatalysts In Agricultural Biotechnology, Whitaker et al., Eds., American Chemical Society Symposium Series 389:93 (1989).
The metabolism of sugars is an important aspect of the primary metabolism of Streptococcus pneumoniae. Enzymes involved in the degradation of sugars, such as, particularly, glucose, galactose, fructose and xylose, can be used in industrial fermentation. Some of the important sugar transforming enzymes, from a commercial viewpoint, include sugar isomerases such as glucose isomerase.
Other metabolic enzymes have found commercial use, such as glucose oxidases, which produce ketogulonic acid (KGA). KGA is an intermediate in the commercial production of ascorbic acid using Reichstein's procedure, as described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag Press, Weinheim, Germany (1984).
Glucose oxidase (GOD) is commercially available and has been used in purified form as well as in an immobilized form for the deoxygenation of beer. See, for instance, Hartmeir et al., Biotechnology Letters 1:21 (1979). The most important application of GOD is the industrial scale fermentation of gluconic acid. There is a market for gluconic acids, which are used in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as described, for example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI, Benett et al., Eds., Academic Press, New York (1985). In addition to industrial applications, GOD has found applications in medicine for quantitative determination of glucose in body fluids and, recently, in biotechnology for analyzing syrups from starch and cellulose hydrolysates. This application is described in Owusu et al., Biochem. et Biophysica. Acta. 872:83 (1986), for instance.
The main sweetener used in the world today is sugar which comes from sugar beets and sugar cane. In the field of industrial enzymes, the glucose isomerase process shows the largest expansion in the market today. Initially, soluble enzymes were used and later immobilized enzymes were developed (Krueger et al., Biotechnology, The Textbook of Industrial Microbiology, Sinauer Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of glucose-produced high fructose syrups is by far the largest industrial business using immobilized enzymes. A review of the industrial use of these enzymes is provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus represent one of the largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a large body of published and unpublished information regarding the use of these enzymes in industrial processes. (See Faultman et al., Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and Godfrey et al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al., Report Industrial Enzymes by 1990, Hel Hepner & Associates, London (1986)).
Another class of commercially usable proteins of the present invention are the microbial lipases, described by, for instance, Macrae et al., Philosophical Transactions of the Chiral Society of London 310:227 (1985) and Poserke, Journal of the American Oil Chemist Society 61:1758 (1984). A major use of lipases is in the fat and oil industry for the production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides. Applications of lipases include use as a detergent additive to facilitate the removal of fats from fabrics in the course of the washing procedures.
The use of enzymes, and in particular microbial enzymes, as catalysts for key steps in the synthesis of complex organic molecules is gaining popularity at a great rate. One area of great interest is the preparation of chiral intermediates. Preparation of chiral intermediates is of interest to a wide range of synthetic chemists, particularly those scientists involved with the preparation of new pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al., Recent Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press, Boca Raton, Florida (1990)). The following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters, phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides, reduction of alkanones and oxoalkanates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to sulfoxides, and carbon bond forming reactions such as the aldol reaction.
When considering the use of an enzyme encoded by one of the ORFs of the present invention for biotransformation and organic synthesis it is sometimes necessary to consider the respective advantages and disadvantages of using a microorganism as opposed to an isolated enzyme. The pros and cons of using a whole cell system on the one hand or an isolated partially purified enzyme on the other hand have been described in detail by Bud et al., Chemistry in Britain (1987), p. 127.
Amino transferases, enzymes involved in the biosynthesis and metabolism of amino acids, are useful in the catalytic production of amino acids. The advantage of using microbial based enzyme systems is that the amino transferase enzymes catalyze the stereo-selective synthesis of only L-amino acids and generally possess uniformly high catalytic rates. A description of the use of amino transferases for amino acid production is provided by Roselle-David, Methods of Enzymology 136:479 (1987).
Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in nucleic acid synthesis, repair, and recombination.
2. Generation of Antibodies
As described here, the proteins of the present invention, as well as homologs thereof, can be used in a variety of procedures and methods known in the art which are currently applied to other proteins. The proteins of the present invention can further be used to generate an antibody which selectively binds the protein. Such antibodies can be either monoclonal or polyclonal antibodies, as well as fragments of these antibodies, and humanized forms.
The invention further provides antibodies which selectively bind to one of the proteins of the present invention and hybridomas which produce these antibodies. A hybridoma is an immortalized cell line which is capable of secreting a specific monoclonal antibody.
In general, techniques for preparing polyclonal and monoclonal antibodies as well as hybridomas capable of producing the desired antibody are well known in the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory Techniques In Biochemistry And Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21 (1980), Kohler and Milstein, Nature 256:495-497 (1975)), the trioma technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today 4:72 (1983), pgs. 77-96 of Cole et al., in Monoclonal Antibodies And Cancer Therapy, Alan R. Liss, Inc. (1985)). Any animal (mouse, rabbit, etc.) which is known to produce antibodies can be immunized with the pseudogene polypeptide. Methods for immunization are well known in the art. Such methods include subcutaneous or intraperitoneal injection of the polypeptide. One skilled in the art will recognize that the amount of the protein encoded by the ORF of the present invention used for immunization will vary based on the animal which is immunized, the antigenicity of the peptide and the site of injection.
The protein which is used as an immunogen may be modified or administered in an adjuvant in order to increase the protein's antigenicity. Methods of increasing the antigenicity of a protein are well known in the art and include, but are not limited to, coupling the antigen with a heterologous protein (such as globulin or galactosidase) or the inclusion of an adjuvant during immunization.
For monoclonal antibodies, spleen cells from the immunized animals are removed, fused with myeloma cells, such as SP2/0-Ag14 myeloma cells, and allowed to become monoclonal antibody producing hybridoma cells.
Any one of a number of methods well known in the art can be used to identify the hybridoma cell which produces an antibody with the desired characteristics. These include screening the hybridomas with an ELISA assay, western blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 175:109-124 (1988)).
Hybridomas secreting the desired antibodies are cloned and the class and subclass are determined using procedures known in the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984)).
Techniques described for the production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce single chain antibodies to proteins of the present invention.
For polyclonal antibodies, antibody-containing antiserum is isolated from the immunized animal and is screened for the presence of antibodies with the desired specificity using one of the above-described procedures.
The present invention further provides the above-described antibodies in detectably labelled form. Antibodies can be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels (such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.), paramagnetic atoms, etc. Procedures for accomplishing such labeling are well known in the art; for example, see Sternberger et al., J. Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308 (1979); Engval, E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol. Meth. 13:215 (1976).
The labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells or tissues in which a fragment of the Streptococcus pneumoniae genome is expressed.
The present invention further provides the above-described antibodies immobilized on a solid support. Examples of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose, acrylic resins such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such solid supports are well known in the art (Weir, D. M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34 Academic Press, N.Y. (1974)). The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for immunoaffinity purification of the proteins of the present invention.
3. Diagnostic Assays and Kits
The present invention further provides methods to identify the expression of one of the ORFs of the present invention, or homolog thereof, in a test sample, using one of the DFs or antibodies of the present invention.
In detail, such methods comprise incubating a test sample with one or more of the antibodies or one or more of the DFs of the present invention and assaying for binding of the DFs or antibodies to components within the test sample.
Conditions for incubating a DF or antibody with a test sample vary. Incubation conditions depend on the format employed in the assay, the detection methods employed, and the type and nature of the DF or antibody used in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification or immunological assay formats can readily be adapted to employ the DFs or antibodies of the present invention. Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1985).
The test samples of the present invention include cells, protein or membrane extracts of cells, or biological fluids such as sputum, blood, serum, plasma, or urine. The test sample used in the above-described method will vary based on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane extracts of cells are well known in the art and can readily be adapted in order to obtain a sample which is compatible with the system utilized.
In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry out the assays of the present invention. Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers which comprise: (a) a first container comprising one of the DFs or antibodies of the present invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting presence of a bound DF or antibody.
In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one compartment to another. Such containers will include a container which will accept the test sample, a container which contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody or DF.
Types of detection reagents include labelled nucleic acid probes, labelled secondary antibodies, or in the alternative, if the primary antibody is labelled, the enzymatic or antibody binding reagents which are capable of reacting with the labelled antibody. One skilled in the art will readily recognize that the disclosed DFs and antibodies of the present invention can be readily incorporated into one of the established kit formats which are well known in the art.
4. Screening Assay for Binding Agents
Using the isolated proteins of the present invention, the present invention further provides methods of obtaining and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the Streptococcus pneumoniae fragments and contigs herein described.
In general, such methods comprise steps of:
(a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention, or an isolated fragment of the Streptococcus pneumoniae genome; and
(b) determining whether the agent binds to said protein or said fragment.
The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives, or other pharmaceutical agents.
The agents can be selected and screened at random or rationally selected or designed using protein modeling techniques.
For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention.
Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen based on the configuration of the particular protein. For example, one skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides, A User's Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.
In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above, such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or multiple ORFs which rely on the same EMF for expression control.

One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.
Agents suitable for use in these methods usually contain 20 to 40 bases and are designed to be complementary to a region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated to be effective in model systems. Information contained in the sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides, and other DNA binding agents.
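The complementarity requirement for an antisense agent can be illustrated with a minimal sketch. The code below is an assumption-laden example, not a disclosed design procedure: it takes an mRNA written 5' to 3', selects a target window of 20 to 40 bases (the size range stated above), and returns the complementary DNA oligonucleotide, also written 5' to 3'. The function name antisense_oligo and the toy input are hypothetical, and issues such as target accessibility, specificity and backbone chemistry are not modeled.

```python
# Illustrative sketch only; the function name and inputs are hypothetical.

_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def antisense_oligo(mrna: str, start: int, length: int) -> str:
    """Return a DNA oligo (5'->3') complementary to mrna[start:start+length].

    `mrna` is given 5'->3' in the RNA alphabet (A, C, G, U).
    `length` must fall in the 20 to 40 base range mentioned in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    if start < 0 or start + length > len(mrna):
        raise ValueError("target window falls outside the mRNA")
    # Write the target in the DNA alphabet, then take its reverse complement.
    target = mrna[start:start + length].upper().replace("U", "T")
    return target.translate(_DNA_COMPLEMENT)[::-1]

# Toy target: a start codon followed by a run of C, chosen so the
# complement is easy to verify by inspection.
toy_oligo = antisense_oligo("AUG" + "C" * 17, start=0, length=20)
```

The returned oligonucleotide base-pairs with the chosen window in antiparallel orientation, which is the property the text relies on when it says the agent is complementary to the mRNA itself.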
5. Pharmaceutical Compositions and Vaccines
The present invention further provides pharmaceutical agents which can be used to modulate the growth or pathogenicity of Streptococcus pneumoniae, or another related organism, in vivo or in vitro. As used herein, a "pharmaceutical agent" is defined as a composition of matter which can be formulated using known techniques to provide a pharmaceutical composition. As used herein, the "pharmaceutical agents of the present invention" refers to the pharmaceutical agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are identified using the herein described assays.
As used herein, a pharmaceutical agent is said to "modulate the growth or pathogenicity of Streptococcus pneumoniae or a related organism, in vivo or in vitro," when the agent reduces the rate of growth, rate of division, or viability of the organism in question. The pharmaceutical agents of the present invention can modulate the growth or pathogenicity of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth by binding to an important protein, thus blocking the biological activity of the protein, while other agents may bind to a component of the outer surface of the organism, blocking attachment or rendering the organism more prone to the action of the body's natural immune system. Alternatively, the agent may comprise a protein encoded by one of the ORFs of the present invention and serve as a vaccine. The development and use of a vaccine based on outer membrane components is well known in the art.
As used herein, a "related organism" is a broad term which refers to any organism whose growth can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As such, related organisms do not need to be bacterial but may be fungal or viral pathogens.
The pharmaceutical agents and compositions of the present invention may be administered in a convenient manner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal routes. The pharmaceutical compositions are administered in an amount which is effective for treatment and/or prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body weight daily, taking into account the routes of administration, symptoms, etc.
The agents of the present invention can be used in native form or can be modified to form a chemical derivative. As used herein, a molecule is said to be a "chemical derivative" of another molecule when it contains additional chemical moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological half life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources, REMINGTON'S PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein.
For example, such moieties may change an immunological character of the functional derivative, such as affinity for a given antibody. Such changes in immunomodulation activity are measured by the appropriate assay, such as a competitive type immunoassay. Modifications of such protein properties as redox or thermal stability, biological half life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers also may be effected in this way and can be assayed by methods well known to the skilled artisan.
The therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient by any suitable means (e.g., inhalation, intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single or multiple injections.
In providing a patient with one of the agents of the present invention, the dosage of the administered agent will vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about 1 pg/kg to 10 mg/kg (body weight of patient), although a lower or higher dosage may be administered. The therapeutically effective dose can be lowered by using combinations of the agents of the present invention or another agent.
As used herein, two or more compounds or agents are said to be administered "in combination" with each other when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can be measured at the same time. The composition of the present invention can be administered concurrently with, prior to, or following the administration of the other agent.
The agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to decrease the rate of growth (as defined above) of the target organism.
The administration of the agent(s) of the invention may be for either a "prophylactic" or "therapeutic" purpose. When provided prophylactically, the agent(s) are provided in advance of any symptoms indicative of the organism's growth. The prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of any subsequent infection. When provided therapeutically, the agent(s) are provided at (or shortly after) the onset of an indication of infection. The therapeutic administration of the compound(s) serves to attenuate the pathological symptoms of the infection and to increase the rate of recovery.
The agents of the present invention are administered to a subject, such as a mammal, or a patient, in a pharmaceutically acceptable form and in a therapeutically effective concentration. A composition is said to be "pharmacologically acceptable" if its administration can be tolerated by a recipient patient. Such an agent is said to be administered in a "therapeutically effective amount" if the amount administered is physiologically significant. An agent is physiologically significant if its presence results in a detectable change in the physiology of a recipient patient.
The agents of the present invention can be formulated according to known methods to prepare pharmaceutically useful compositions, whereby these materials, or their functional derivatives, are combined in a mixture with a pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins, e.g., human serum albumin, are described, for example, in REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed., Osol, A., Ed., Mack Publishing, Easton PA (1980). In order to form a pharmaceutically acceptable composition suitable for effective administration, such compositions will contain an effective amount of one or more of the agents of the present invention, together with a suitable amount of carrier vehicle.
Additional pharmaceutical methods may be employed to control the duration of action. Controlled-release preparations may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention. The controlled delivery may be effectuated by a variety of well known techniques, including formulation with macromolecules such as, for example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate, methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired time course of release. Another possible method to control the duration of action by controlled-release preparations is to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of incorporating these agents into polymeric particles, it is possible to entrap these materials in microcapsules prepared, for example, by coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or gelatine microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example, liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules, or in macroemulsions. Such techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES (1980).
The invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration.
In addition, the agents of the present invention may be employed in conjunction with other therapeutic compounds.
6. Shot-Gun Approach to Megabase DNA Sequencing
The present invention further demonstrates that a large sequence can be sequenced using a random shotgun approach. This procedure, described in detail in the examples that follow, has eliminated the up-front cost of isolating and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols.
Certain aspects of the present invention are described in greater detail in the examples that follow. The examples are provided by way of illustration. Other aspects and embodiments of the present invention are contemplated by the inventors, as will be clear to those of skill in the art from reading the present disclosure.
ILLUSTRATIVE EXAMPLES
LIBRARIES AND SEQUENCING
1. Shotgun Sequencing Probability Analysis
The overall strategy for a shotgun approach to whole genome sequencing follows from the Lander and Waterman (Lander and Waterman, Genomics 2:231 (1988)) application of the equation for the Poisson distribution.
According to this treatment, the probability, P, that any given base in a sequence of size L, in nucleotides, is not sequenced after a certain amount, n, in nucleotides, of random sequence has been determined can be calculated by the equation P = e^{-m}, where m is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m = 1 when 2.8 Mb of sequence has been randomly generated (1X coverage). At that point, P = e^{-1} = 0.37. The probability that any given base has not been sequenced is the same as the probability that any region of the whole sequence L has not been determined and, therefore, is equivalent to the fraction of the whole sequence that has yet to be determined. Thus, at one-fold coverage, approximately 37% of a polynucleotide of size L, in nucleotides, has not been sequenced. When 14 Mb of sequence has been generated, coverage is 5X for a 2.8 Mb sequence and the unsequenced fraction drops to 0.0067 or 0.67%. 5X coverage of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both insert ends with an average sequence read length of 410 bp.
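Similarly, the total gap length, G, is determined by the equation G = Le^{-m}, and the average gap size, g, follows the equation g = L/n, where n in this expression is the number of random sequence reads (about 34,000 for 17,000 clones sequenced from both ends) rather than the amount of sequence in nucleotides. Thus, 5X coverage leaves about 240 gaps averaging about 82 bp in size in a polynucleotide 2.8 Mb long.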
The treatment above is essentially that of Lander and Waterman, Genomics 2:231 (1988).
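By way of illustration only, the coverage arithmetic of this section can be sketched in Python. The function name and dictionary keys are hypothetical and do not appear in the disclosure. The sketch assumes that fold coverage is the total amount of random sequence divided by the genome size, and that the average gap size is the genome size divided by the number of reads. The values it computes are approximations and need not reproduce the rounded figures quoted above exactly.

```python
import math


def shotgun_stats(genome_bp, read_len_bp, n_reads):
    """Poisson (Lander-Waterman style) estimates for random shotgun sequencing.

    All names here are illustrative assumptions, not part of the disclosure.
    """
    total_bp = read_len_bp * n_reads        # amount of random sequence, in nucleotides
    m = total_bp / genome_bp                # fold coverage
    p_unsequenced = math.exp(-m)            # P = e^{-m}
    total_gap_bp = genome_bp * p_unsequenced  # G = L e^{-m}
    average_gap_bp = genome_bp / n_reads    # g = L / (number of reads), an assumption
    expected_gaps = total_gap_bp / average_gap_bp
    return {
        "fold_coverage": m,
        "unsequenced_fraction": p_unsequenced,
        "total_gap_bp": total_gap_bp,
        "average_gap_bp": average_gap_bp,
        "expected_gaps": expected_gaps,
    }


if __name__ == "__main__":
    # 17,000 clones sequenced from both ends at 410 bp per read, 2.8 Mb genome
    stats = shotgun_stats(genome_bp=2_800_000, read_len_bp=410, n_reads=2 * 17_000)
    for name, value in stats.items():
        print(f"{name}: {value:.4g}")
```

Varying n_reads in such a sketch shows how quickly the unsequenced fraction falls as coverage rises, which is the basis for choosing a coverage target before closure work begins.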
2. Random Library Construction
In order to approximate the random model described above during actual sequencing, a nearly ideal library of cloned genomic fragments is required. The following library construction procedure was developed to achieve this end.
Streptococcus pneumoniae DNA is prepared by phenol extraction. A mixture containing 200 µg DNA in 1.0 ml of 300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 50% glycerol is processed through a nebulizer (IPI Medical Products) with a stream of nitrogen adjusted to 35 kPa for 2 minutes. The nebulized DNA is ethanol precipitated and redissolved in 500 µl TE buffer.
To create blunt ends, a 100 µl aliquot of the resuspended DNA is digested with 5 units of BAL31 nuclease (New England BioLabs) for 10 min at 30°C in 200 µl BAL31 buffer. The digested DNA is phenol-extracted, ethanol-precipitated, redissolved in 100 µl TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting temperature agarose gel. The section containing DNA fragments 1.6-2.0 kb in size is excised from the gel, the LGT agarose is melted, and the resulting solution is extracted with phenol to separate the agarose from the DNA. DNA is ethanol precipitated and redissolved in 20 µl of TE buffer for ligation to vector.
A two-step ligation procedure is used to produce a plasmid library with 97% inserts, of which >99% are single inserts. The first ligation mixture (50 µl) contains 2 µg of DNA fragments, 2 µg pUC18 DNA (Pharmacia) cut with SmaI and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and is incubated at 14°C for 4 hr. The ligation mixture then is phenol extracted and ethanol precipitated, and the precipitated DNA is dissolved in 20 µl TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete bands in a ladder are visualized by ethidium bromide staining and UV illumination and identified by size as insert (i), vector (v), v+i, v+2i, v+3i, etc. The portion of the gel containing v+i DNA is excised and the v+i DNA is recovered and resuspended into 20 µl TE. The v+i DNA then is blunt-ended by T4 polymerase treatment for 5 min. at 37°C in a reaction mixture (50 µl) containing the v+i linears, 500 µM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs), under recommended buffer conditions. After phenol extraction and ethanol precipitation the repaired v+i linears are dissolved in 20 µl TE. The final ligation to produce circles is carried out in a 50 µl reaction containing 5 µl of v+i linears and 5 units of T4 ligase at 14°C overnight. After 10 min. at 70°C the following day, the reaction mixture is stored at -20°C.
This two-stage procedure results in a molecularly random collection of single-insert plasmid recombinants with minimal contamination from double-insert chimeras (<1%) or free vector (<3%).
Since deviation from randomness can arise from propagation of the DNA in the host, E. coli host cells deficient in all recombination and restriction functions (A. Greener, Strategies 3(1):5 (1990)) are used to prevent rearrangements, deletions, and loss of clones by restriction. Furthermore, transformed cells are plated directly on antibiotic diffusion plates to avoid the usual broth recovery phase which allows multiplication and selection of the most rapidly growing cells.
Plating is carried out as follows. A 100 µl aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene 200152) is thawed on ice and transferred to a chilled Falcon 2059 tube on ice. A 1.7 µl aliquot of 1.42 M beta-mercaptoethanol is added to the aliquot of cells to a final concentration of 25 mM. Cells are incubated on ice for 10 min. A 1 µl aliquot of the final ligation is added to the cells and incubated on ice for 30 min. The cells are heat pulsed for 30 sec. at 42°C and placed back on ice for 2 min. The outgrowth period in liquid culture is eliminated from this protocol in order to minimize the preferential growth of any given transformed cell. Instead the transformation mixture is plated directly on a nutrient rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar: 20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5 ml bottom layer is supplemented with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4 per 100 ml SOB agar. The 15 ml top layer is poured just prior to plating. Our titer is approximately 100 colonies/10 µl aliquot of transformation. All colonies are picked for template preparation regardless of size. Thus, only clones lost due to "poison" DNA or deleterious gene products are deleted from the library, resulting in a slight increase in gap number over that expected.
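As an informal check on the reagent concentrations quoted in the plating protocol above, a simple dilution calculation can be written as follows. The function name is hypothetical, the ampicillin stock is assumed to be 50 mg/ml, and the volume of the added stock is counted in the final volume. The beta-mercaptoethanol result comes out slightly under the nominal 25 mM stated in the protocol, consistent with that figure being a rounded value.

```python
def final_concentration(stock_conc, stock_vol, diluent_vol):
    """Concentration after adding stock_vol of stock to diluent_vol of diluent.

    Uses C_final = C_stock * V_stock / (V_stock + V_diluent); volumes must share
    one unit, and the result is in the unit of stock_conc.
    """
    return stock_conc * stock_vol / (stock_vol + diluent_vol)


# 1.7 ul of 1.42 M (1420 mM) beta-mercaptoethanol added to 100 ul of cells
bme_mM = final_concentration(stock_conc=1420.0, stock_vol=1.7, diluent_vol=100.0)

# 0.4 ml of 50 mg/ml (50,000 ug/ml) ampicillin added to 100 ml of SOB agar
amp_ug_per_ml = final_concentration(stock_conc=50_000.0, stock_vol=0.4, diluent_vol=100.0)

if __name__ == "__main__":
    print(f"beta-mercaptoethanol: {bme_mM:.1f} mM")
    print(f"ampicillin: {amp_ug_per_ml:.0f} ug/ml")
```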
3. Random DNA Sequencing
High quality double stranded DNA plasmid templates are prepared using a "boiling bead" method developed in collaboration with Advanced Genetic Technology Corp. (Gaithersburg, MD) (Adams et al., Science 252:1651 (1991); Adams et al., Nature 355:632 (1992)). Plasmid preparation is performed in a well-plate format for all stages of DNA preparation from bacterial growth through final DNA purification. Template concentration is determined using Hoechst Dye and a Millipore Cytofluor. DNA concentrations are not adjusted, but low-yielding templates are identified where possible and not sequenced.
Templates are also prepared from two Streptococcus pneumoniae lambda genomic libraries. An amplified library is constructed in the vector Lambda GEM-12 (Promega) and an unamplified library is constructed in Lambda DASH II (Stratagene). In particular, for the unamplified lambda library, Streptococcus pneumoniae DNA (>100 kb) is partially digested in a reaction mixture (200 µl) containing 50 µg DNA, 1X Sau3AI buffer, and 20 units Sau3AI for 6 min. at 23°C. The digested DNA is phenol-extracted and electrophoresed on a 0.5% low melting agarose gel at 2 V/cm for 7 hours. Fragments from 15 to 25 kb are excised and recovered in a final volume of 6 µl. One µl of fragments is used with 1 µl of DASH II vector (Stratagene) in the recommended ligation reaction. One µl of the ligation mixture is used per packaging reaction following the recommended protocol with the Gigapack II XL Packaging Extract (Stratagene, #227711). Phage are plated directly without amplification from the packaging mixture (after dilution with 500 µl of recommended SM buffer and chloroform treatment). Yield is about 2.5 x 10^3 pfu/µl. The amplified library is prepared essentially as above except the lambda GEM-12 vector is used. After packaging, about 3.5 x 10^4 pfu are plated on the restrictive NM539 host. The lysate is harvested in 2 ml of SM buffer and stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1 x pfu/ml.
Liquid lysates (100 µl) are prepared from randomly selected plaques (from the unamplified library) and template is prepared by long-range PCR using T7 and T3 vector-specific primers.
Sequencing reactions are carried out on plasmid and/or PCR templates using the AB Catalyst LabStation with Applied Biosystems PRISM Ready Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers (Adams et al., Nature 368:474 (1994)). Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits. T7 and SP6 primers are used to sequence the ends of the inserts from the Lambda GEM-12 library and T7 and T3 primers are used to sequence the ends of the inserts from the Lambda DASH II library.
Sequencing reactions are performed by eight individuals using an average of fourteen AB 373 DNA Sequencers per day. All sequencing reactions are analyzed using the Stretch modification of the AB 373, primarily using a 34 cm well-to-read distance. The overall sequencing success rate is approximately 85% for M13-21 and M13RP1 sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences, 445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.
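The success rates and read lengths stated above can be combined into a rough planning estimate, sketched below in Python. The helper names are hypothetical, and the 445 bp figure is attributed here to the M13RP1 (reverse) primer. The calculation treats every successful reaction as yielding the average usable read length and ignores redundancy between reads, so it is an approximation only.

```python
from math import ceil

# (success rate, average usable read length in bp), as quoted in the text
CHEMISTRIES = {
    "M13-21 dye-primer": (0.85, 485),
    "M13RP1 dye-primer": (0.85, 445),
    "dye-terminator": (0.65, 375),
}


def expected_usable_bp(success_rate, read_len_bp):
    """Average usable bases obtained per attempted sequencing reaction."""
    return success_rate * read_len_bp


def reactions_for_coverage(genome_bp, fold, success_rate, read_len_bp):
    """Attempted reactions needed to generate fold x genome_bp of usable sequence."""
    return ceil(genome_bp * fold / expected_usable_bp(success_rate, read_len_bp))


if __name__ == "__main__":
    for name, (rate, length) in CHEMISTRIES.items():
        per_rxn = expected_usable_bp(rate, length)
        needed = reactions_for_coverage(2_800_000, 5, rate, length)
        print(f"{name}: {per_rxn:.1f} bp per attempt, {needed} attempts for 5X of 2.8 Mb")
```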
Richards et al., Chapter 28 in AUTOMATED DNA SEQUENCING AND ANALYSIS, M. D. Adams, C. Fields, J. C. Venter, Eds., Academic Press, London (1994), described the value of using sequence from both ends of sequencing templates to facilitate ordering of contigs in shotgun assembly projects of lambda and cosmid clones. We balance the desirability of both-end sequencing (including the reduced cost of lower total number of templates) against shorter read-lengths for sequencing reactions performed with the M13RP1 (reverse) primer compared to the M13-21 (forward) primer. Approximately one-half of the templates are sequenced from both ends. Random reverse sequencing reactions are done based on successful forward sequencing reactions. Some M13RP1 sequences are obtained in a semi-directed fashion: M13-21 sequences pointing outward at the ends of contigs are chosen for M13RP1 sequencing in an effort to specifically order contigs.
4. Protocol for Automated Cycle Sequencing
The sequencing is carried out using ABI Catalyst robots and AB 373 Automated DNA Sequencers. The Catalyst robot is a publicly available sophisticated pipetting and temperature control robot which has been developed specifically for DNA sequencing reactions. The Catalyst combines pre-aliquoted templates and reaction mixes consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing primers, and reaction buffer. Reaction mixes and templates are combined in the wells of an aluminum 96-well thermocycling plate. Thirty consecutive cycles of linear amplification (i.e., one-primer synthesis) steps are performed, including denaturation, annealing of primer and template, and extension (i.e., DNA synthesis). A heated lid with rubber gaskets on the thermocycling plate prevents evaporation without the need for an oil overlay.
Two sequencing protocols are used: one for dye-labelled primers and a second for dye-labelled dideoxy chain terminators. The shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four terminator nucleotides. Each dye-primer is labelled with a different fluorescent dye, permitting the four individual reactions to be combined into one lane of the DNA Sequencer for electrophoresis, detection, and base-calling. ABI currently supplies pre-mixed reaction mixes in bulk packages containing all the necessary non-template reagents for sequencing. Sequencing can be done with both plasmid and PCR-generated templates with both dye-primers and dye-terminators with approximately equal fidelity, although plasmid templates generally give longer usable sequences.
Thirty-two reactions are loaded per AB 373 Sequencer each day, for a total of 960 samples. Electrophoresis is run overnight following the manufacturer's protocols, and the data is collected for twelve hours. Following electrophoresis and fluorescence detection, the ABI 373 performs automatic lane tracking and base-calling. The lane-tracking is confirmed visually. Each sequence electropherogram (or fluorescence lane trace) is inspected visually and assessed for quality. Trailing sequences of low quality are removed and the sequence itself is loaded via software to a Sybase database (archived daily to 8mm tape). Leading vector polylinker sequence is removed automatically by a software program. Average edited lengths of sequences from the standard ABI 373 are around 400 bp and depend mostly on the quality of the template used for the sequencing reaction. ABI 373 Sequencers converted to Stretch Liners provide a longer electrophoresis path prior to fluorescence detection and increase the average number of usable bases to 500 bp.
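The two editing steps mentioned above, removal of leading vector polylinker sequence and removal of trailing low-quality sequence, might be sketched as follows. This is a hypothetical illustration and not the software used in the examples: the function names, the window sizes, and the use of ambiguous "N" base calls as a stand-in for visually assessed low quality are all assumptions.

```python
def trim_leading_vector(read, vector_tail, min_match=8, window=100):
    """Remove leading vector sequence from a read.

    Looks in the first `window` bases of the read for the longest suffix of
    `vector_tail` (at least `min_match` bases) and cuts just after it.
    """
    head = read[:window]
    for k in range(len(vector_tail), min_match - 1, -1):
        pos = head.find(vector_tail[-k:])
        if pos != -1:
            return read[pos + k:]
    return read  # no vector found; leave the read unchanged


def trim_trailing_low_quality(read, window=10, max_n=2):
    """Cut the read at the first window holding more than `max_n` 'N' calls.

    Any 'N' calls left at the new 3' end are stripped as well.
    """
    for start in range(0, max(len(read) - window + 1, 1)):
        if read[start:start + window].count("N") > max_n:
            return read[:start].rstrip("N")
    return read


if __name__ == "__main__":
    vector_tail = "TTTTGGATCCCC"  # hypothetical polylinker end adjacent to the insert
    raw = "GGATCCCC" + "ACGTACGTAC" * 3 + "ANNTNNANNC"
    edited = trim_trailing_low_quality(trim_leading_vector(raw, vector_tail))
    print(edited, len(edited))
```

A window-based cutoff of this kind is deliberately conservative: it may discard a few good bases just upstream of a noisy region rather than retain ambiguous calls.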
INFORMATICS
1. Data Management
A number of information management systems for a large-scale sequencing lab have been developed. (For review see, for instance, Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, IEEE Computer Society Press, Washington D. C., 585 (1993)). The system used to collect and assemble the sequence data was developed using the Sybase relational database management system and was designed to automate data flow wherever possible and to reduce user error. The database stores and correlates all information collected during the entire operation from template preparation to final analysis of the genome. Because the raw output of the Sequencers was based on a Macintosh platform and the data management system chosen was based on a Unix platform, it was necessary to design and implement a variety of multi-user, client-server applications which allow the raw data as well as analysis results to flow seamlessly into the database with a minimum of user effort.
2. Assembly
An assembly engine (TIGR Assembler) developed for the rapid and accurate assembly of thousands of sequence fragments was employed to generate contigs. The TIGR Assembler simultaneously clusters and assembles fragments of the genome. In order to obtain the speed necessary to assemble more than 10^4 fragments, the algorithm builds a hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements. Beginning with a single seed sequence fragment, TIGR Assembler extends the current contig by attempting to add the best matching fragment based on oligonucleotide content. The contig and candidate fragment are aligned using a modified version of the Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S., Methods in Enzymology 164:765 (1988)). The contig is extended by the fragment only if strict criteria for the quality of the match are met. The match criteria include the minimum length of overlap, the maximum length of an unmatched end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal coverage and raised in regions with a possible repetitive element. The number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements. Fragments representing the boundaries of repetitive elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of alignments and excluded from the current contig.
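The two computational ideas named above, a hash table of 12 bp subsequences for finding candidate overlaps and a Smith-Waterman local alignment for scoring them, can be illustrated with the following minimal Python sketch. It is not the TIGR Assembler code: the function names and scoring parameters are hypothetical, the alignment is the textbook algorithm with a linear gap penalty rather than the modified version referred to in the text, and reverse-complement matches are ignored.

```python
from collections import defaultdict


def build_kmer_index(fragments, k=12):
    """Map each k-mer to the set of fragment ids that contain it."""
    index = defaultdict(set)
    for frag_id, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index


def candidate_overlaps(fragments, k=12):
    """Count distinct shared k-mers for every pair of fragments sharing any."""
    shared = defaultdict(int)
    for ids in build_kmer_index(fragments, k).values():
        ordered = sorted(ids)
        for pos, first in enumerate(ordered):
            for second in ordered[pos + 1:]:
                shared[(first, second)] += 1
    return dict(shared)


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best


if __name__ == "__main__":
    f1 = "ATGCCGTAAGCTTGGACCTAGGATCCATTG"
    frags = {
        "f1": f1,
        "f2": f1[12:] + "CCGATTACGGTTAAC",   # overlaps the last 18 bases of f1
        "f3": "GGGGGGCCCCCCAAAAAATTTTTT",    # unrelated fragment
    }
    for (x, y), n_shared in candidate_overlaps(frags).items():
        print(x, y, n_shared, smith_waterman_score(frags[x], frags[y]))
```

In a sketch of this kind, a fragment that shares k-mers with an unusually large number of other fragments would be a candidate for lying in a repetitive element, as the text describes.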
TIGR Assembler is designed to take advantage of clone size information coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from two ends of the same template point toward one another in the contig and are located within a certain range of base pairs (definable for each clone based on the known clone size range for a given library).
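The both-end constraint described above can be expressed as a small check, sketched here in Python under stated assumptions. The class and function names are hypothetical, contig coordinates are taken to be zero-based with an exclusive end, and the 1.6-2.0 kb range used in the example is the insert size range of the plasmid library described earlier.

```python
from dataclasses import dataclass


@dataclass
class PlacedRead:
    start: int     # leftmost contig coordinate covered by the read
    end: int       # rightmost contig coordinate (exclusive)
    forward: bool  # True if the read runs left to right along the contig


def mates_consistent(a, b, min_insert, max_insert):
    """Check that two end reads of one template satisfy the clone constraint.

    The reads must point toward one another, and the distance spanned from the
    outer end of one read to the outer end of the other must fall within the
    known clone size range.
    """
    left, right = (a, b) if a.start <= b.start else (b, a)
    if not (left.forward and not right.forward):
        return False
    span = right.end - left.start
    return min_insert <= span <= max_insert


if __name__ == "__main__":
    fwd = PlacedRead(start=100, end=500, forward=True)
    rev = PlacedRead(start=1500, end=1900, forward=False)
    print(mates_consistent(fwd, rev, min_insert=1600, max_insert=2000))
```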
The process resulted in 391 contigs as represented by SEQ ID NOs: 1-391.
3. Identifying Genes
The predicted coding regions of the Streptococcus pneumoniae genome were initially defined with the program GeneMark, which finds ORFs using a probabilistic classification technique. The predicted coding region sequences were used in searches against a database of all nucleotide sequences from GenBank (October, 1997), using the BLASTN search method to identify overlaps of 50 or more nucleotides with at least a 95% identity. Those ORFs with nucleotide sequence matches are shown in Table 1. The ORFs without such matches were translated to protein sequences and compared to a non-redundant database of known proteins generated by combining the Swiss-Prot, PIR and GenPept databases. ORFs that matched a database protein with BLASTP probability less than or equal to 0.01 are shown in Table 2. The table also lists assigned functions based on the closest match in the databases. ORFs that did not match protein or nucleotide sequences in the databases at these levels are shown in Table 3.
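The sorting of ORFs into Tables 1-3 follows a simple decision rule, which can be sketched as below. The function name and the input representation (lists of already-parsed search hits) are hypothetical; the thresholds are those stated in the text. Running the searches themselves is outside the scope of this sketch.

```python
def classify_orf(blastn_hits, blastp_probabilities,
                 min_overlap=50, min_identity=95.0, max_probability=0.01):
    """Assign an ORF to Table 1, 2 or 3 using the thresholds stated in the text.

    blastn_hits: iterable of (overlap_length_nt, percent_identity) tuples.
    blastp_probabilities: iterable of BLASTP probabilities for protein hits.
    """
    # Table 1: nucleotide match of >= 50 nt at >= 95% identity
    if any(length >= min_overlap and identity >= min_identity
           for length, identity in blastn_hits):
        return "Table 1"
    # Table 2: no such nucleotide match, but a protein match with P <= 0.01
    if any(p <= max_probability for p in blastp_probabilities):
        return "Table 2"
    # Table 3: no match at either level
    return "Table 3"


if __name__ == "__main__":
    print(classify_orf([(120, 98.3)], []))        # nucleotide match
    print(classify_orf([(40, 99.0)], [1e-20]))    # protein match only
    print(classify_orf([(60, 80.0)], [0.5]))      # no qualifying match
```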

ILLUSTRATIVE APPLICATIONS
1. Production of an Antibody to a Streptococcus pneumoniae Protein
Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the methods known in the art. The protein can also be produced in a recombinant prokaryotic expression system, such as E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the protein can then be prepared as follows.
2. Monoclonal Antibody Production by Hybridoma Fusion
Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or modifications of the methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular Biology, Elsevier, New York, Section 21-2 (1989).

3. Polyclonal Antibody Production by Immunization
Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. For example, small molecules tend to be less immunogenic than others and may require the use of carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate or excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis, J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971).
Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology, Weir, D., ed., Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM).
Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher, D., Chap. 42 in: Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer. Soc. For Microbiology, Washington, D. C. (1980).
Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively or qualitatively to identify the presence of antigen in a biological sample. In addition, antibodies are useful in various animal models of pneumococcal disease as a means of evaluating the protein used to make the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic or immunoprophylactic reagent.

4. Preparation of PCR Primers and Amplification of DNA
Various fragments of the Streptococcus pneumoniae genome, such as those of Tables 1-3 and SEQ ID NOS:1-391 can be used, in accordance with the present invention, to prepare PCR primers for a variety of uses. The PCR primers are preferably at least 15 bases, and more preferably at least 18 bases in length.
When selecting a primer sequence, it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are approximately the same.
The PCR primers and amplified DNA of this Example find use in the Examples that follow.
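
The primer selection criteria above (minimum length, matched G/C ratio) can be sketched as a simple check. This is a minimal illustration, not part of the disclosed method: the function names, the 10% G/C tolerance, and the use of the Wallace rule (2 °C per A/T, 4 °C per G/C) as a rough melting-temperature estimate are all assumptions introduced here.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer."""
    s = primer.upper()
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule (assumption:
    a short-oligo approximation chosen for illustration only)."""
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def check_primer_pair(forward: str, reverse: str,
                      min_len: int = 15, max_gc_diff: float = 0.10) -> list:
    """Return a list of problems; an empty list means the pair passes.

    min_len follows the 'at least 15 bases' preference in the text;
    max_gc_diff is a hypothetical tolerance, not a value from the text.
    """
    problems = []
    for name, primer in (("forward", forward), ("reverse", reverse)):
        if len(primer) < min_len:
            problems.append(f"{name} primer is shorter than {min_len} bases")
    if abs(gc_fraction(forward) - gc_fraction(reverse)) > max_gc_diff:
        problems.append("G/C content of the two primers differs too much")
    return problems
```

Passing min_len=18 applies the stricter "more preferably at least 18 bases" criterion.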

5. Gene expression from DNA Sequences Corresponding to ORFs
A fragment of the Streptococcus pneumoniae genome provided in Tables 1-3 is introduced into an expression vector using conventional technology. Techniques to transfer cloned sequences into expression vectors that direct protein translation in mammalian, yeast, insect or bacterial expression systems are well known in the art. Commercially available vectors and expression systems are available from a variety of suppliers including Stratagene (La Jolla, California), Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to enhance expression and facilitate proper protein folding, the codon context and codon pairing of the sequence may be optimized for the particular expression organism, as explained by Hatfield et al., U.S. Patent No. 5,082,767, incorporated herein by this reference.

The following is provided as one exemplary method to generate polypeptide(s) from cloned ORFs of the Streptococcus pneumoniae genome fragment. Bacterial ORFs generally lack a poly A addition signal. The addition signal sequence can be added to the construct by, for example, splicing out the poly A addition sequence from pSG5 (Stratagene) using BglI and SalI restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1 (Stratagene) for use in eukaryotic expression systems. pXT1 contains the LTRs and a portion of the gag gene of Moloney Murine Leukemia Virus. The positions of the LTRs in the construct allow efficient stable transfection. The vector includes the Herpes Simplex thymidine kinase promoter and the selectable neomycin gene. The Streptococcus pneumoniae DNA is obtained by PCR from the bacterial vector using oligonucleotide primers complementary to the Streptococcus pneumoniae DNA and containing restriction endonuclease sequences for PstI incorporated into the 5' primer and BglII at the 5' end of the corresponding Streptococcus pneumoniae DNA 3' primer, taking care to ensure that the Streptococcus pneumoniae DNA is positioned such that it is followed by the poly A addition sequence. The purified fragment obtained from the resulting PCR reaction is digested with PstI, blunt ended with an exonuclease, digested with BglII, purified and ligated to pXT1, now containing a poly A addition sequence and digested with BglII.
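
The primer design step in the preceding paragraph (a PstI site on the 5' primer, a BglII site at the 5' end of the 3' primer) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names, the 18-base annealing length, and the two-base "GC" clamp placed ahead of each site are hypothetical choices, not details from the text. The recognition sequences used are the known ones for PstI (CTGCAG) and BglII (AGATCT).

```python
PSTI_SITE = "CTGCAG"   # PstI recognition sequence
BGLII_SITE = "AGATCT"  # BglII recognition sequence

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def design_cloning_primers(orf: str, anneal_len: int = 18, clamp: str = "GC"):
    """Return (five_prime_primer, three_prime_primer) for an ORF.

    The ORF must not contain an internal PstI or BglII site, since the
    later digests would cut the insert. Both sites are palindromic, so
    scanning one strand is sufficient.
    """
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the annealing length")
    for name, site in (("PstI", PSTI_SITE), ("BglII", BGLII_SITE)):
        if site in orf:
            raise ValueError(f"ORF contains an internal {name} site")
    # 5' primer: clamp + PstI site + start of the coding strand
    five_prime = clamp + PSTI_SITE + orf[:anneal_len]
    # 3' primer: clamp + BglII site + reverse complement of the ORF end
    three_prime = clamp + BGLII_SITE + reverse_complement(orf[-anneal_len:])
    return five_prime, three_prime
```

Rejecting ORFs with internal sites is a design choice of this sketch; in practice a different enzyme pair could be substituted for such inserts.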
The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc., Grand Island, New York) under conditions outlined in the product specification. Positive transfectants are selected after growing the transfected cells in 600 µg/ml G418 (Sigma, St. Louis, Missouri).
The protein is preferably released into the supernatant. However, if the protein has membrane binding domains, the protein may additionally be retained within the cell or expression may be restricted to the cell surface. Since it may be necessary to purify and locate the transfected product, synthetic 15-mer peptides synthesized from the predicted Streptococcus pneumoniae DNA sequence are injected into mice to generate antibody to the polypeptide encoded by the Streptococcus pneumoniae DNA.
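
Deriving candidate 15-mer peptides from a predicted DNA sequence, as mentioned above, can be sketched like this. It is a minimal illustration that assumes the standard genetic code, reads the sequence in frame from its first base, and stops at the first stop codon; the function names are hypothetical, and the text does not say how the 15-mers are chosen, so the sketch simply enumerates every overlapping window.

```python
from itertools import product

# Standard genetic code in compact form, codons ordered with bases T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna: str) -> str:
    """Translate in frame from the first base, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def candidate_peptides(dna: str, length: int = 15) -> list:
    """All overlapping peptides of the given length from the translated ORF."""
    protein = translate(dna)
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]
```

An ORF encoding fewer than 15 residues yields an empty list.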

Alternatively, and if antibody production is not possible, the Streptococcus pneumoniae DNA sequence is additionally incorporated into eukaryotic expression vectors and expressed as, for example, a globin fusion. Antibody to the globin moiety then is used to purify the chimeric protein. Corresponding protease cleavage sites are engineered between the globin moiety and the polypeptide encoded by the Streptococcus pneumoniae DNA so that the latter may be freed from the former by simple protease digestion. One useful expression vector for generating globin chimerics is pSG5 (Stratagene). This vector encodes a rabbit globin. Intron II of the rabbit globin gene facilitates splicing of the expressed transcript, and the polyadenylation signal incorporated into the construct increases the level of expression. These techniques are well known to those skilled in the art of molecular biology. Standard methods are published in methods texts such as Davis et al., cited elsewhere herein, and many of the methods are available from the technical assistance representatives from Stratagene, Life Technologies, Inc., or Promega. Polypeptides of the invention also may be produced using in vitro translation systems such as the In vitro Express™ Translation Kit (Stratagene).
While the present invention has been described in some detail for purposes of clarity and understanding, one skilled in the art will appreciate that various changes in form and detail can be made without departing from the true scope of the invention.
All patents, patent applications and publications referred to above are hereby incorporated by reference.

TABLE 1
S. pneumoniae - Coding regions containing known sequences

Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | percent ident | HSP nt length | ORF nt length

________,____ y_______,_______,_______________ _,____________________________________________________________________________, _ _______,_ ________,_________, vp 1 ~ ~ ~ ~gb~U41735~ Streptococcus 1 437 1003 pneumoniae ~ ~
peptide methionine sulfoxide reductase (msrA) and ~

~ ~ ~

homoserine kinase homolog (thrB) genes com lete cds , p ________,____ ,_______,_______,______._________ _y____________________________________________________________________________, _ _______i_ ________~_________, ~D

2 ~ ~ ~ (gb~U04047~ Streptococcus 6169 5720 pneumoniae ~
~
SSZ
dextran glucosidase gene and insertion ~

sequence transposase gene, complete cds f y________+____ y_______~_______+_______________ _+____________________________________________________________________________+
_ ____-__+_ ________y_________y 2 ~ ~ ~ ~emb~283335~SPZ8 ~S.pneumoniae 6 6S92 6167 dexB, ~
~
capl[A,H,C,D,E,F,G,H,I,J,K]

genes, dTDP-rhamnose ~

biosynthesis genes and aliA
gene ,________,____ ,_______+_______+_______________ _+____________________________________________________________________________+
_ _______,_ ________+_________+

3 (11 ~ ~ ~emb~283335~SPZ8 ~S.pneumoniae 9770 9147 dex8, ~ ~

capl(A,B,C,D,E,F,G,H,I,J,K]

genes, dTDP-rhamnose ~

biosynthesis genes and aliA
gene ________v____ v_______,_______,_______________ _y______________________________________________________~__ __ _______y_ ________+_________+

__y_ 3 ~12 A ~ ~emb~283335~SP28 ~S.pneumoniae 0489 9671 dexB, capl(A,B,C,D,E,F,G,H,I,J,K]

genes, dTUP-rhamnose i j i i biosynthesis genes and aliA
gene , ,____ ,_______,_______,_______________ _+______________________________________________________-_____________________+_ _______,_ ________,______-__y _______~13 A 12019~gb~U43526~ Streptococcus 3 1546 pneumoniae ~ ~
neuraminidase B
InanB) gene, complete cds, and ~

neuraminidase (naM) gene, partial cds ________,____ ,_______,_______,_______________ _,____________________________________________________________________________, _ _______y_ ________,_________, 3 ~14 A 13375~gb~U43526~ Streptococcus 2017 pneumoniae ~
~
neuraminidase B
(nanB) gene, complete cds) and ~

~ ~ ~ ~

neuraminidase W
anA1 gene) partial cds ________y____ ,_______,_______,_______________ _,____________________________________________________________________________, _ _______,_ ________,_________y 3 ~IS A 14338~gb~U43526~ Streptococcus 3421 pneumoniae ~
~
neuraminidase B
(nanB) gene, complete cds, and ~

neuraminidase (nanA) gene, partial cds ,________,____ ,_______,_______,_______________ _,__________________________________________________________________________-_y_ _______,_ ________,_________, 3 Q16 A432915171~gb~U43526~ Streptococcus pneumoniae ~ ~
neuraminidase B
(nanB1 gene, complete cds, and ~

neuraminidase (nanA) gene) partial cds +________,____ +_______,_______,_______________ _,____________________________________________________________________________y _ _______,_ ________y_________y 3 (17 A A7282~gb~U43526~ Streptococcus 5132 pneumoniae ~
~
neuraminidase B
(nanB) gene) complete cds, and ~

neuraminidase (nanA) gene) partial cds ________,____ ,_______+_______,_______________ _+____________________________________________________________________________y _ _______y_ ________y_________y 3 Q18 A e18397~9b~U93526~ Streptococcus 7267 pneumoniae ~

neuraminidase H
(nanB) gene, complete cds, and ~

neuraminidase ________,____ +_______,_______,__________ (nanAl _ gene) partial cds _ _,__________________________________________-_______________________________ 4 ~ ~ ~ ___ __,_ _______,_ ________v_________, 1 46 1188~emb~Y11463~SPDN (Streptococcus pneumoniae ~ ( dnaG, rpoD, cpoA
genes and and ORFS
~

y________,____ +_______,_______,_______________ _y____________________________________________________________________________y _ _______y_ ________y_________y 4 ~ ~ ~ ~emb~Y11463~SPDN ~Streptococtus 2 119B 2S29 pneumoniae ~ ~
dnaG) rpoD) cpoA
genes and and ORFS
~

________+____ ,_______,_______+_______________ _,____________________________________________________________________________y _ _______y_ ________+_________+

5 ~ 1129711473~gb~U41735~ Streptococcus 7 pneumoniae peptide methionine sulfoxide reductase (msrA) and ~ i i homoserine kinase homolog (thrB1 genes) complete cds ,________;____ ,_______,_______+_______________ _y_______________________________________________ _______y_ ________y_________+
6 ~ ~ ~ ~ _ 7 7125 7369b~Z77726 _ SPIS __y_ S
i i em ~ 93 238 240 ~ .pneumon ~ ~

ae DNA
for nsertion sequence (1372 bp) ~

,________,____ ,_______,_______,_______________ _+____________________________________________________________________________y _ _______y_ ________+_________+

6 ~ ~ ~ ~emb~277725~SPIS ~S.pneumoniae B 7322 7S70 DNA ~ ~

for insertion sequence (966 bp) ~

________,____ y_______,_______+________________y____________________________ ________y_________, __ __,________+_ -6 ~ ~ ~ ~emb~Z77725~SPIS ~S.pneumoniae 99 q53 453 H
9 7533 7985 DNA ~ ~
~
for insertion sequence (966 bp) ~

,________y____ ,_______f_______,_______________ _y__ _______y_ ________f_________y ____________ __y_ 6 ~23 2019719733~emb~Z83335~SPZB ~S.pneumoniae dexB, ~ ~
capl(A,H,C,D,E,F,G,H,I,J,K]

genes, dTDP-rhamnose ~

biosynthesis genes and aliA
gene ,________+____ ,_______,_______+_______________ _y____________________________________________________________________________y _ _______+_ ________+_________+

( ~10 ~ ~ ~emb~Z83335~SPZ8 ~S.pneumoniae 7 8305 7682 dexB, ~ ~
capl(A,B,C,D,E,F,G,Fi,I,J,K]

genes) dTDP-rhamnose ~

biosynthesis genes and aliA
gene ~

+ ____ ______________, ______y_________y _____-___y _______t t r ________________ -____________________________________________________________________________,__ TABLC 1 S, pneumoniae - Coding regions containing known sequences +________+____+_______+_______+________________+_______________________________ _____________________________________________+________+_________y_________+

( (ORF ( t ( match ( match gene name ( percent( HSP ( ( Contig Star ( nt ORF
Stop nt ( (ID ( ( ( acession( ID (nt) (nt) dent ( length( i length( ________+____ y_____ __+______ _y________________+____________________________________________________________ ______________ 0~0 ------__l ___ 7 (I1 ( ( (emb(Z83335(SPZB(S.pneumoniae dex8, capl[A,B,C,D,E,F,G,H,I.J,KI ( 95 ( 819 ( 9024 8206 genes) dTDP-rhamnose ( ( ( ( ( ( ( biosynthesis genes and aliA gene ( ( ( ( +________a____ y_____ __y______ _+________________y____________________________________________________________ ________________+________ +_________y_________+ W

i13 i i igb(L29323(iStreptococcus pneumoniae methyl transferase 93 9304 8078 (mtr) gene cluster, complete i i i i cds +________+____ +_____ __+______ _~________________+____________________________________________________________ ________________w________+_________y_________y ( ( ( ( (emb(279691(SOOR(S.pneumoniae yorf[A,B,C,D,E], ftsL, pbpX
( 11 2 548 919 and regR genes 99 ( ( ( +________+____ y_____ __+______ _y________________y____________________________________________________________ ________________y________y_________+_________, ( ( ( ( (emb(279691(SOOR(S.pneumoniae yorf[A.B,C,O;EI. ftsL, pbpX
( 11 1 892 1980 and regR genes ~

~

( +________+____ ~_______+______ _+________________+____________________________________________________________ ________________+________+_________+_________+

( ( ( ( (emb(279691(SOOR(S.pneumoniae yorf[A,B,C,D,EI) ftsL, pbpX
( I1 5 3040 3477 and regR genes ( ( ( +________+____ +_______+______ _+________________+______________________________________________________~_____ ________________+________+_________+_________+

( ( ( ( (emb(Z79691(SOOR(S.pneumoniae yorf(A,B.C,D,E], ftsL, pbpX
( 11 6 3480 3247 and regR genes ( ( ( +________+____ +_______y______ _~________________+____________________________________________________________ ________________+________+______.__y_________y 11 ( ( ( (emb(Z79691(SOOR(S.pneumoniae yorf[A,B,C,D,EI, ftsL, pbpX
( 7 3601 4557 and regR genes 98 ( ( ( +________+____ +,.____ __+______ _+________________+____________________________________________________________ ________________+________+_________+_________y ( ( ( ( (emb(Z79691(SOOR(S.pneumoniae yorf(A,B,C,p,E), ftsL, pbpX
( 11 8 4506 48A6 and regR genes ( ( ( (________,____ ,_____ __+______ _,________________,____________________________________________________________ ________________+________+_________+_______,_+ o 11 9 48B4 7142 emb X16367Stre tococcus pneumoniae pbpX gene for g ( ( ( ( ( SPPB p 99 ( ( ( ( P penicillin bindin rotein 2X ( ( ( +________+____ +_______+______ _y________________+____________________________________________________________ ________________~________~_________+_________+

( (10 ( ( (emb(X16367~SPP0(Streptococcus pneumoniae pbpX gene for ( 11 7132 8124 penicillin binding protein 2X

( ( ( ________,____ +_______,______ _~_____________________________________________________________________________ ______________ v _ __ _ __ __+__ ______+_________y +

( ( ( ~ (gb(M31296((S.pneumoniae recP gene, complete cds ( o ( ( ( y________+____ +_____ __~______ _+________________,____________________________________________________________ ________________,________~_________y_________y ( ( ( ( (emb(Z83335(SP28(S.pneumoniae dexB, capl/A,B,C,D,E,F,G,H,I,J,K)87 312 14 3 1B37 2148 genes, dTDP-rhamnose i i i ( ( ( ( ( ( biosynthesis genes and aliA gene i +________+____ ~_____ __+______ _+________________~____________________________________________________________ ________________+________~_________+_________+

( ( ~ (2108(gb~M36180((Streptococcus pneumoniae transposase, (comAse( 98 ( 411 ( ( 14 4 2518 and coma) and SAICAR syntheta o ( ( ( ~ ( ( (purCl genes) complete cds ( ( ( ( ________+____ ,_______+______ _+________________+____________________________________________________________ ________________+________ ~_________+_________+

( ~ ( (851l(gb(U09239~(Streptococcus pneumoniae type 29F capsular 89 340 432 ( 9 8942 polysaccharide biosynthesis ( ( ( ( ( ( operon, (cpsl9fABCDEFGHIJKLHNO) genes, i i complete cds) and eliA gene, ( ( ( ( ( ( partial cds ( ( ( +________+____ +_____ __+______ _+________________~_______________________________________________..___________ _________________+________+_________+_________ +

( ( ( ( (emb(277726(SPIS(S.pneumoniae DNA for insertion sequence ( 17 7 3910 3458 I51318 (1372 bp) ( ( ( ~________+____ y_____ __,______ _~________________+____________________________________________________________ ________________E________+_________t_________+

( ( ( ( (emb(Z77727(SPIS(S.pneumonise DNA for insertion sequence ( 17 8 4304 3873 I51318 (823 bp) ( ( ( ~________+____ +_____ __+______ _~________________y____________________________________________________________ ________________+________+_________y_________+

( ( ( ( (emb~X94909~SPIG(S.pneumoniae iga gene ( ( ( ( +________~____ ~_____ __+______ _~________________~____________________________________________________________ ________________+________+_________+_________+

( ( ( ( (gb(L07752((Streptococcus pneumoniae attachment site ( 19 2 5S4 757 (atte), DNA sequence 99 ( ( ( +________+____ +_____ __+______ _+________________+____________________________________________________________ ______________ __ _ ___ __+__ _ __~__ ____~_________y ( ( ( ( (gb(L07752((Streptococcus pneumoniae attachment sate ( 19 3 946 1827 (att8l, DNA sequence ( ( ( +________+____ +_____ __+_______+________________+___________________________________________________ _________________________+________y_________+_________+
H
( ( ( (182 (gb(U33315((Streptococcus pneumoniae orfL gene, partial( 1 937 cds, competence stimulating 99 ( ( ( ( ~ ( ( ( peptide precursor (comC), histidine protein kinase (comDl and response ; i ( ( ( ( i ( regulator (comE) genes, complete cds, tRNA-Arg and tRNA-Gln genes +________+____ +_____ __+______ _+________________~___________________________________________________________.
.________________y________+_________~_________+

( ( ( (93I (gb(U33315((Streptococcus pneumoniae orfL gene, partial( 98 20 2 2271 cds) competence stimulating ( ( ( ( ( ( ( ~ ( peptide precursor (comC), histidine protein kinase (comb) and response i ( ( ( ( ( ( regulator (comE) genes, complete cds, tRNA-Arg and tRNA-Gln genes _____+____ f_______+______ _+________________+____________________________________________________________ ________________+________+_________y_________y TABLE 1 S. pneumoniae - Coding regions containing known sequence ;________;____ _______f_______________________~__________________________________-____________._____________________________y________4_ _________________f Contig~ORF~ ~ ~ match ~ match gene name ~
percentsHSP ~ORF ( 0 StartStop nt nt ID SID~ ~ ~ acession ~ ~
identlength~len Intllntl ~ th ________,____y______________________________f__________________________________ __________________________________________;________y_________y_________y pp 20 ~ ~ ~ ~9b~U76218~Streptococcus pneumoniae competence stimulating~

3 31752684 peptide precursor ComC ~

(comC)) histidine kinase homolog Comb (comb)) and response regulator homolog ComE (comf:) genes, complete cds ~ ~ ~ W

y________i____y_______;_______________________y________________________________ ____________________________________________y________;_________f_________ 20 ~ ~ ~ ~gb~AF000658~Streptococcus pneumoniae R801 tRNA-Arg gene, ~
99 I206~
4 33224527 partial sequence. and putative ~

( serine protease (sphtra), SPSpoJ (spspoJ), initiator protein (spdnaa) and beta subunit of DNA polymerise III (spdnan) genes) complete cds ___________________y___________________________________________________________ ________________________________________y________,__________________ , 20 ~ ~ ( (gb~AF000658~(Streptococcus pneumoniae R801 tRNA-Arg gene)( 99 771 ~
45735343 partial sequence, and putative ~ 77l serine protease (sphtra), SPSpoJ (spspoJ), initiator protein (spdnaa) and beta subunit of DNA polymerise III (spdnan) genes, complete cds ________,____,_______,_______,________________,________________________________ ______________________\_____________________________y_________,_________t 20 ~ ~ ~ ~gb~AF000658~Streptococcus pneumoniae R801 tRNA-Arg gene, ~
99 1386~
6 55326917 partial sequence, and putative ~

serine protease (sphtra)) SPSpoJ (spspoJ)) initiator protein (spdnaa) and beta subunit of DNA polymerise III (spdnan) genes, complete cds ________y____;_______y_______,________________~________________________________ ____________________________________________,________,_ ________~_________ y 20 ~ ~ ~ ~gb~AF000658~Streptococcus pneumoniae R801 tRNA-Arg gene, ~
99 1218~ ~
7 6995H212 partial sequence and putative ) o ( ~ serine protease (sphtra), SPSpoJ (spspoJl. ~ ~ ~
N
initiator protein (spdnaa) and beta subunit of DNA of N
p ymerase III (spdnan) genes, complete cds ~ ( ~ ~

,J
________,____,_______,_______________________;_________________________________ ___________________________________________________f__________________ ( 20 ~ ~ ~ ~gb~AF000658~Streptococcus pneumoniae R801 tRNA-Arg gene, 9B
258 ~ 'J
8 82148471 partial sequence, and putative ~

~

f ~ ~ serine protease Isphtra), SPSpoJ IspspoJ), ' ~ ~ ~
o initiator protein (spdnaa) and beta subunit of DNA polymerise III (spdnan) genes) complete cds ;________;____-_____________-_______________________________________________________________________________ ____________;________;_________f_________ ~ Ir 20 9 85349670b AF000658 ~ 99 134 ~
~9 ~ ~ ~ P pneumonfae R801 tRNA-Ar ene artial se ~ 1137 Stre tococcus g g , p quence) and putative serine protease Isphtra), SPSpoJ (spspoJ)) ~ ~ ~ yp initiator protein (spdnaa) and ~ ~ beta subunit of DNA polymerise III (spdnan) genes, complete cds ________,____,_______,_______,________________,________________________________ ____________________________________________,________,____ _ o _ 22 l4 118A712Z67emb Z77726 S.pneumoniae DNA for insertion sequence IS1318 ___ _, SPIS (l372 bp1 99 226 ________) f ~

;________;____;_______;_______________________;________________________________ ____________________________________________y________;__________________ N

22 ~15A2708A ~emb~277727~SPIS~S.pneumoniae DNA for insertion sequence IS1318~ 97 353 ~
2256 (823 bp) ~ 453 ____________4__________________________________________________________________ ________________________________________,_________________a_________y 22 ~16A3165A2662~emb~277726~SPIS~S.pneumoniae DNA for insertion sequence IS1318~ 98 504 ~
(1372 bp) ~ 504 ________y____f____________________________________--____________________________________________________________________;________y_ ________y_________ 22 ~23A 18910~emb~Z86112~SPZ8~S.pneumoniae genes encoding galacturonosyl 95 463 5l3 8398 transferase and transposase and i i i ( insertion sequence IS1515 i _____ y____y_______y_______~__.._____________;_______________________________________ _____________________________________f.._______y_________y _________;

22 ~2441882919299(emb~286112~SPZ8(S.pneumoniae genes encoding galacturonosyl ~ 99 443 ' f transferase and transposase and ' 471 insertion sequence IS1515 (________~____;_______t_______y________________,_______________________________ _____________________________________________,________,_________y_________ , 23 ~ ~ ~ ~emb~X5247d~SPPL~S.pneumoniae ply gene for pneumolysin ~ 99 1422~
5 56244203 ~

;__________________________f________________y__________________________________ _________________________________________________________ _ _ _______ 23 6 60635629b M17717 S. neumoniae p pneumolysin gene, complete cds ~ 98 197 ~
, 435 ________,____,_______,_______,________________,________________________________ ____________________________________________f________+_________y_________ 26 ~ ~ ( ~emb~X94909~SPIG~S.pneumoniae iga gene ~ 87 3d87, 1 55002 ~

________,____y_________________________________________________________________ _________________________________________,_________________,_________y 26 , ~ ~ ~gb~U47687~Streptococcus pneumoniae immunoglobulin A1 ~ 99 151 ~
2 58235584 protease (iga) gene, complete ~

cds w __ y_______;______________________________________________________________________ _______ __ _ _ __ ________y_________f_______ 26 3 68785685gb~U47687~ _ ~ 100 50 ~
__ ~ 1194 _____________ Streptococcus pneumoniae immunoglobulin A1 protease (iga) gene, complete i i ; i ~ i cds ;________f__________________f__________________________________________________ ___________________________________________________________f_________ TAI3LF 1 S. pneumoniae - Coding regions containing known sequences y________~____y_______,_______,________________y_______________________________ _____________________________________________ y________y_________y___ ______a ( Contig(ORF( ( ( match ( match gene name ( HSP ORF ( StartStop percent(nt nt ( ID (ID( ( ( acession ( ( ( ~ ( (nt) (nt) identlength length ________,____,_______,______ _,________________y____________________________________________________________ ________________ f________,_________ y_________f 00 ( 26 ( (14498(14854(emb(283335(SPZB(S.pneumoniae dex8, capl[A,B,C,b,E,F,G,H,I,J,KJ( ( ( 357 8 genes, dTDP-rhamnose 99 338 ( ( ( ( ( ( biosynthesis genes and aliA gene ( ( ( ,________,____y_______,______ _f________________,____________________________________________________________ ________________ y________,_________ ,_________y W

( 26 ( (14753(14924(emb(Z83335(SPZB(S.pneumoniae dexH) capl)A,H,C,D,E,F,G,H,I,J,K)( 94 ( 162 9 genes, dTDP-rhamnose 100 ( ( ( ( ( ( biosynthesis genes and aliA gene ( ~ ( -________,____,_______ ,_______,________________,_____________________________________________________ _________________._____ y________,_________ a_________f ( 26 (10(14922(15173(gb(U04047((Streptococcus pneumoniae SSZ dextran glucosidase( ( ( 252 gene and insert::.n 97 242 ( ( ( ( ( ( ( sequence IS1202 transposase gene, complete ( ( ( ( cds ________,____,_______ ,_______,________________,_____________________________________________________ ____.__________________,________y_________,___ ______f ( 28 ( ( ( (emb(283335(SP28(S.pneumoniae dexB) capl(A,B,C,D,E,F,G,H,I,J,K199 426 426 1 80 505 genes, dTDP-rhamnose ' f ~ 1 ( ( ( ( ( ( biosynthesis genes and aliA gene f________y____f_______,______ _y________________y______________________.~_______________________~_______~y___ _____y_________ y_________y ____________________ ( 28 ( 503 ( (gb(U04047((Streptococcus pneumoniae SSZ dextran glucosidase( ( ( 450 ( 2 ~ 952 gene and inseztion 97 450 ( ( ( ( ( sequence IS1202 transposase gene, complete cds ( ( ( ( y________y____i_______y_______f________________y_______________________________ ___________________________________________ ____ _____ _ ____ ( 28 ( (,780( (gb(U04047 Streptococcus 3 1298 ( pneumoniae SS2 dextcan qlucosidase gene ( ( ( 519 and insertion 96 1B1 ( ( ( ( ( ( ( sequence IS1202 transposase gene, complete ( ( ( ( cds o ,________,____y_______y_______y________.._______y______________________________ ______________________________________________y________y_________ y_________f N

( 34 ~ ( ( (gb~L08611((Streptococcus pneumoniae maltose/maltodextrin( ( ( 1317( 1 207 1523 uptake (malX) and two 99 1317 ( ( ( ( ( maltodextrin permease (malC and malD) genes)( ( ( ( complete cds (________,____,_______,_______,________________,_______________________________ _____________________________________________ ,________,_________ ,_________, ,J

( 34 ( ( ( (gb(L08611((Streptococcus pneumoniae maltoseJmaltodextrin( ( ( 891 ( N
2 1477 2367 uptake (malX) and two 96 79S

( ( ( ( ( ( maltodextrin permease (malC and malD) genes,( ( ( ( o complete cds ________y____f_______y_______f________________f________________________________ ____________________________________________ f________i_________ y_________y t!~

( 34 ( ( ( (gb(L21856((Streptococcus pneumoniae malA gene, complete( ( 828 3 259J 3420 cds; malR gene, complete cds 96 496 ( ( ,______y____,_ y _ __ _ _ _ _ _ _f________________f____________________________________________________________ ________________,________y_________y___ ______y __ ____ ( 39 ( ( ( (gb(L21856((Streptococcus pneumoniae malA gene) complete( ( 144 4 2790 2647 cds; malR gene, complete cds 98 137 ( ( (________,____,_______,_______,________________,_______________________________ _____________________________________________,________,_________,___ ______f ( 34 ( ( ( (gb(L21856((Streptococcus pneumoniae malA

3418 4416 comp)ete ( ( cdst 96 999 malR ( gene, complete cds gene, ________,____,_______,_______,________________-y________y_________y___ ______y N
-' _ _ ____________________________ f - ___________________ ( 34 ( ( ( (gb(U41735((Streptococcus pneumoniae peptide methionine ( ( 258 ( 9 7764 7S07 sulfoxide reductase (msrA) and 93 201 ( ( ( ( ( ( ( homoserine kinase homolog (thrB) genes. ( ( ( complete cds ~

________,____,_______,_______y________________y________________________________ ____________________________________________,________,_________y___ ______y ( 34 (16(1056210257(emb~X63602(SPBO(S.pneumoniae mmsA-Box ~ ( ( ,________y____,_______y_______y _______________________________________________________________________________ ____________y y ___ ________y _________ ___ ( 35 ( ( ( (emb(283335(SP28(S.pneumoniee dexH, capl(A,B,C,D,E,F,G,H,I,J,KJ( ( ( 264 4 1176 1439 genes, dTDP-rhamnose 87 248 ( ( ( ( ( ( biosynthesis genes and aliA gene ( ( ( y________,_..__y_______f_______y________________y_.____________________________ _______________________________________________y________f_________y___ ______i ( 35 ( ( ( (gb~U09239~(Streptococcus pneumoniae type 19F capsular ( ( ( 504 5 I458 1961 polysaccharide biosynthesis 98 264 ( ( ( ( ( ( ( operon, (cpsl9fABCDEFGHIJKLHNO) genes) complete cds) and aliA gene, ( ( ( ( ( ( partial cds ________~____,_______y_______y________________a________________________________ ____________________________________________,________,_________ ,_________y ( 35 (1716172(15477(emb(X85787(SPCP(S.pneumoniae dexB, cpsl4A, cpsl4B, cpsl4C, ( ( 696 cpsl4D, cpsl4E, cpsl4F, cpsl4G, 97 696 ( ( ( ( ( cpsl4H, cpsl4I, cpsl4J, cpsl4K, cpsl4L, ( ( ( rj tasA genes ( y________y____y_______y_______,________________,_______________________________ _____________________________________________y________y_________f _________y ( 35 (18(16961(16170(emb(283335~SPZ8(S.pneumoniae dex8) capl(A,B,C,D,E,F,G,H,I,J,KJ( ( 792 C!~
genes) dTDP-rhamnose 86 792 ( ( ( ( ( ( biosynthesis genes and aliA gene ( ( y________~____~_______,_______y________________,_______________________________ ______________ _ _ ___ ~________y_________, _________y ( 35 (19(17620(16871(gb(U09239(_ _________________________ ( ( 750 (Streptococcus pneumoniae type 19F capsular 83 750 ( pol saccharide biosnthesis ( y }

( ( ( ( ( ( operon, (cpsl9fABCDEFGHIJKt.HNO) genes, complete cds and aliA gene , ( ( ( ( ( , ( S ( pp ( partial cds , ( ________y____,_______y_______,________________y___________..___________________ _____________________________________________,________,_________, _________, TABLE

S. pneumoniae- Coding regions containing known sequences ________,____,_______,_____.._,________________y_______________________________ _____________________________________________y________ y_________y_________y ContiORE StartSto match match ( HSP ( ORE ( ( ( ~ ( ( ~ gene name percentnt nt 9 p ( (ID ~ ( ~ acession ~ ~
~ ~ length ID (nt)(nt) identlength (________,____y______________,______..________ _____________________________________________________________________________y_ _______y_________,_________, OD

35 ~20 A (17604(emb(R85787(SPCP(S.pneumoniae dexB, cpsl4A, cpsl4B, cpsl4C, ( ~ ( 145B ( 9061 cpsl4D, cpsl4E, cpsl4F 94 1458 cpsl4G

( ~ ~ ~ ( , ~
~ ~ I ~p , ~ cpsl4H, cpsl9I, cpsl4J, cpsl4R) cpsl4L, tasA genes ________,____,_______,_______,________________,_______________________ W
36 | 19 | 18960 | 18352 | gb|U40786| | Streptococcus pneumoniae surface antigen A variant precursor (psaA) and 18 kDa protein genes, complete cds, and ORF1 gene, partial cds | 99 | 609 | 609
36 | 20 | 19934 | 18966 | gb|U53509| | Streptococcus pneumoniae surface adhesin A precursor (psaA) gene, complete cds | 99 | 969 | 969
37 | 1 | 2743 | 179 | emb|Z67739|SPPA | S.pneumoniae parC, parE and transposase genes and unknown orf | 99 | 2565 | 2565
37 | 2 | 2985 | 2824 | emb|Z67739|SPPA | S.pneumoniae parC, parE and transposase genes and unknown orf | 100 | 162 | 162
37 | 3 | 5034 | 3070 | emb|Z67739|SPPA | S.pneumoniae parC, parE and transposase genes and unknown orf | 99 | 1965 | 1965
37 | 4 | 5134 | 5790 | emb|Z67739|SPPA | S.pneumoniae parC, parE and transposase genes and unknown orf | 99 | [illegible] | 657
37 | 5 | 6171 | 5833 | emb|Z67739|SPPA | S.pneumoniae parC, parE and transposase genes and unknown orf | 96 | 339 | 339
38 | 19 | 12969 | 13268 | gb|M28679| | S.pneumoniae promoter region DNA | 100 | 64 | 300
39 | 2 | 1256 | 2137 | gb|U41735| | Streptococcus pneumoniae peptide methionine sulfoxide reductase (msrA) and homoserine kinase homolog (thrB) genes, complete cds | 99 | 882 | 882
39 | 3 | 2905 | 3370 | gb|U41735| | Streptococcus pneumoniae peptide methionine sulfoxide reductase (msrA) and homoserine kinase homolog (thrB) genes, complete cds | 99 | [illegible] | 466
40 | 9 | 5253 | 7208 | gb|M29686| | S.pneumoniae mismatch repair (hexB) gene, complete cds | 99 | [illegible] | 1956
41 | 1 | 3 | 1037 | emb|Z17307|SPRE | S.pneumoniae recA gene encoding RecA | 99 | 1027 | 1035

41 | 2 | 1328 | 2713 | emb|Z34303|SPCI | Streptococcus pneumoniae cin operon encoding the cinA, recA, dinF, lytA genes, and downstream sequences | 99 | [illegible] | 1386
41 | 3 | 3083 | 4045 | gb|M13812| | S.pneumoniae autolysin (lytA) gene, complete cds | 99 | [illegible] | 963
41 | 4 | 3272 | 3096 | gb|M13812| | S.pneumoniae autolysin (lytA) gene, complete cds | 100 | [illegible] | 177
41 | 5 | 3603 | 3860 | gb|M13812| | S.pneumoniae autolysin (lytA) gene, complete cds | 100 | 258 | 258
41 | 6 | [illegible] | [illegible] | gb|L36660| | Streptococcus pneumoniae ORF, complete cds | [illegible] | 408 | 408
41 | 7 | 5270 | 5716 | gb|L36660| | Streptococcus pneumoniae ORF, complete cds | 98 | 447 | 447
41 | 8 | 6112 | 6918 | gb|L36660| | Streptococcus pneumoniae ORF, complete cds | 98 | 431 | 807

41 | 9 | 6916 | 7119 | gb|L36660| | Streptococcus pneumoniae ORF, complete cds | 100 | 204 | 204
41 | 10 | [illegible] | [illegible] | gb|L36660| | Streptococcus pneumoniae ORF, complete cds | [illegible] | [illegible] | 579
41 | 11 | 7680 | 7979 | gb|L36660| | Streptococcus pneumoniae ORF, complete cds | 98 | 81 | 300
41 | 12 | [illegible] | [illegible] | emb|Z77727|SPIS | S.pneumoniae DNA for insertion sequence IS1318 (823 bp) | 97 | 353 | 453

41 | 13 | 9533 | 9132 | emb|Z77725|SPIS | S.pneumoniae DNA for insertion sequence IS1381 (966 bp) | 95 | 160 | 402
41 | 14 | 9669 | 9475 | emb|Z82001|SPZ8 | S.pneumoniae pcpA gene and open reading frames | 100 | 189 | 195
44 | 5 | 7190 | 7555 | emb|Z82001|SPZ8 | S.pneumoniae pcpA gene and open reading frames | 99 | 366 | 366
44 | 6 | 8059 | 7607 | emb|Z77726|SPIS | S.pneumoniae DNA for insertion sequence IS1318 (1372 bp) | 97 | 453 | 453
44 | 7 | 8423 | 8022 | emb|Z77725|SPIS | S.pneumoniae DNA for insertion sequence IS1381 (966 bp) | 95 | 160 | 402
44 | 8 | 8559 | 8365 | emb|Z82001|SPZ8 | S.pneumoniae pcpA gene and open reading frames | 100 | [illegible] | [illegible]
48 | 9 | 6480 | 4687 | gb|L39074| | Streptococcus pneumoniae pyruvate oxidase (spxB) gene, complete cds | 99 | 1794 | 1794
49 | 2 | 231 | 2603 | gb|L20561| | Streptococcus pneumoniae Exp7 gene, partial cds | 100 | 216 | 2373

53 | 6 | 2407 | 2156 | gb|U04047| | Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion sequence IS1202 transposase gene, complete cds | 97 | 242 | 252
53 | 7 | 2566 | 2405 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 100 | 94 | 162
53 | 8 | 2831 | 2475 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 99 | 338 | 357
54 | 13 | 12409 | 11105 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 67 | 591 | 1305
55 | 22 | 20488 | 19949 | emb|Z84379|SPZ8 | S.pneumoniae dfr gene (isolate 92) | 99 | 540 | 540

61 | 11 | 11864 | 9900 | emb|Z16082|PNAL | Streptococcus pneumoniae aliB gene | 98 | 1965 | 1965
63 | 1 | 3 | 239 | gb|M18729| | S.pneumoniae mismatch repair protein (hexA) gene, complete cds | 100 | 237 | 237
63 | 2 | 233 | 2611 | gb|M18729| | S.pneumoniae mismatch repair protein (hexA) gene, complete cds | 99 | 2330 | 2379
63 | 3 | 2557 | 2823 | gb|M18729| | S.pneumoniae mismatch repair protein (hexA) gene, complete cds | 99 | 266 | 267
63 | 4 | 2958 | 4664 | gb|M18729| | S.pneumoniae mismatch repair protein (hexA) gene, complete cds | 95 | 69 | 1707
67 | 6 | 3770 | 3399 | gb|L20670| | Streptococcus pneumoniae hyaluronidase gene, complete cds | 96 | 372 | 372
67 | 7 | 7161 | 4171 | gb|L20670| | Streptococcus pneumoniae hyaluronidase gene, complete cds | 99 | 2938 | 2991

70 | 1 | 1 | 702 | gb|M14340| | S.pneumoniae DpnI gene region encoding dpnC and dpnD, complete cds | 100 | 693 | 702
70 | 2 | 678 | 1160 | gb|M14340| | S.pneumoniae DpnI gene region encoding dpnC and dpnD, complete cds | 100 | 483 | 483
70 | 3 | 2490 | 1210 | gb|M14339| | S.pneumoniae DpnII gene region encoding dpnM, dpnA, dpnB, complete cds | [illegible] | 462 | 1281
70 | 7 | 4230 | 4424 | gb|J04234| | S.pneumoniae exodeoxyribonuclease (exoA) gene, complete cds | [illegible] | [illegible] | 195
70 | 8 | 5197 | 4316 | gb|J04234| | S.pneumoniae exodeoxyribonuclease (exoA) gene, complete cds | 99 | 881 | 882

70 | 13 | 8108 | 9874 | gb|L20562| | Streptococcus pneumoniae Exp8 gene, partial cds | 93 | [illegible] | [illegible]
71 | 22 | 27964 | 28341 | emb|X63602|SPBO | S.pneumoniae mmsA-Box | 93 | 233 | 378
72 | [illegible] | 4607 | 3552 | emb|Z26850|SPAT | S.pneumoniae (M222) genes for ATPase a subunit, ATPase b subunit and ATPase c subunit | 97 | 102 | 1056
73 | 1 | 471 | 133 | emb|X63602|SPBO | S.pneumoniae mmsA-Box | 91 | 193 | 339
73 | 3 | 3658 | 977 | gb|J04479| | S.pneumoniae DNA polymerase I (polA) gene, complete cds | 99 | [illegible] | [illegible]
73 | 8 | 4864 | 5379 | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds | 98 | [illegible] | [illegible]
77 | 3 | 2622 | 1999 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 95 | [illegible] | 624
77 | 4 | 3341 | [illegible] | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 91 | [illegible] | [illegible]

78 | 1 | 341 | 3 | emb|X77249|SPR6 | S.pneumoniae (R6) ciaR/ciaH genes | 99 | 339 | [illegible]
78 | 2 | 1095 | 325 | emb|X77249|SPR6 | S.pneumoniae (R6) ciaR/ciaH genes | 99 | 771 | 771
82 | 10 | 11436 | 10816 | gb|U90721| | Streptococcus pneumoniae signal peptidase I (spi) gene, complete cds | 97 | [illegible] | [illegible]
82 | 11 | 12402 | [illegible] | gb|U93576| | Streptococcus pneumoniae ribonuclease HII (rnhB) gene, complete cds | 98 | [illegible] | [illegible]
82 | 12 | 12381 | 12704 | gb|U93576| | Streptococcus pneumoniae ribonuclease HII (rnhB) gene, complete cds | [illegible] | [illegible] | [illegible]
83 | 8 | 3212 | 3550 | emb|Z77727|SPIS | S.pneumoniae DNA for insertion sequence IS1318 (823 bp) | 97 | [illegible] | 339
83 | 10 | 4662 | 6851 | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds | 99 | 2190 | 2190
83 | 11 | [illegible] | [illegible] | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds | 99 | [illegible] | [illegible]
83 | 12 | 8236 | 9090 | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds | 99 | [illegible] | [illegible]
83 | 13 | 9283 | 13017 | gb|L15190| | Streptococcus pneumoniae SAICAR synthetase (purC) gene, complete cds | 100 | 107 | 3735
83 | 23 | 22147 | 23313 | gb|L36923| | Streptococcus pneumoniae beta-N-acetylhexosaminidase (strH) gene, complete cds | 98 | 218 | 1167
83 | 24 | 23268 | 23450 | gb|L36923| | Streptococcus pneumoniae beta-N-acetylhexosaminidase (strH) gene, complete cds | 98 | [illegible] | [illegible]
83 | 25 | 27527 | 23505 | gb|L36923| | Streptococcus pneumoniae beta-N-acetylhexosaminidase (strH) gene, complete cds | 99 | 3826 | 4023

83 | 26 | 28472 | 27771 | gb|L36923| | Streptococcus pneumoniae beta-N-acetylhexosaminidase (strH) gene, complete cds | 99 | 416 | 702
84 | 4 | 4554 | 6173 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 98 | 697 | 1620
87 | 6 | 5951 | 5316 | emb|Z77725|SPIS | S.pneumoniae DNA for insertion sequence IS1381 (966 bp) | [illegible] | [illegible] | [illegible]
88 | [illegible] | 2957 | 3511 | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds | 94 | [illegible] | [illegible]
88 | 6 | 3466 | 4269 | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds | 94 | [illegible] | [illegible]
89 | 13 | 9878 | 10093 | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds | 97 | 211 | 216
89 | 14 | 10062 | 10412 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 97 | 335 | 351

93 | 10 | 5303 | 4941 | emb|X63602|SPBO | S.pneumoniae mmsA-Box | [illegible] | [illegible] | [illegible]
97 | 4 | 1708 | 1520 | gb|U41735| | Streptococcus pneumoniae peptide methionine sulfoxide reductase (msrA) and homoserine kinase homolog (thrB) genes, complete cds | 91 | 140 | 189
99 | 1 | 89 | 700 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 97 | 592 | 612

99 | 2 | 1773 | 775 | emb|X17337|SPAM | Streptococcus pneumoniae ami locus conferring aminopterin resistance | [illegible] | [illegible] | [illegible]
99 | 3 | 2794 | 1712 | emb|X17337|SPAM | Streptococcus pneumoniae ami locus conferring aminopterin resistance | [illegible] | [illegible] | [illegible]
99 | 4 | 3732 | 2788 | emb|X17337|SPAM | Streptococcus pneumoniae ami locus conferring aminopterin resistance | [illegible] | [illegible] | [illegible]
99 | 5 | 5249 | 3714 | emb|X17337|SPAM | Streptococcus pneumoniae ami locus conferring aminopterin resistance | [illegible] | [illegible] | [illegible]
99 | 6 | 7262 | 5277 | emb|X17337|SPAM | Streptococcus pneumoniae ami locus conferring aminopterin resistance | [illegible] | [illegible] | [illegible]
101 | 1 | 216 | 1538 | emb|X54225|SPEN | S.pneumoniae epuA and endA genes for 7 kDa protein and membrane endonuclease | 99 | 146 | 1323
101 | 2 | 1492 | 1719 | emb|X54225|SPEN | S.pneumoniae epuA and endA genes for 7 kDa protein and membrane endonuclease | [illegible] | [illegible] | [illegible]
101 | 3 | 1694 | 1855 | emb|X54225|SPEN | S.pneumoniae epuA and endA genes for 7 kDa protein and membrane endonuclease | [illegible] | [illegible] | [illegible]
101 | 4 | 1701 | 2582 | emb|X54225|SPEN | S.pneumoniae epuA and endA genes for 7 kDa protein and membrane endonuclease | [illegible] | [illegible] | [illegible]
103 | 7 | 5556 | 5041 | emb|Z95914|SPZ9 | Streptococcus pneumoniae sodA gene | [illegible] | [illegible] | [illegible]
104 | 2 | 1347 | 1556 | emb|Z77727|SPIS | S.pneumoniae DNA for insertion sequence IS1318 (823 bp) | [illegible] | [illegible] | [illegible]
105 | [illegible] | 5381 | 5028 | emb|Z67739|SPPA | S.pneumoniae parC, parE and transposase genes and unknown orf | 98 | 353 | 354

105 | 6 | 6089 | 5379 | emb|Z67739|SPPA | S.pneumoniae parC, parE and transposase genes and unknown orf | 98 | 84 | 711
107 | 4 | 2785 | 1880 | emb|X16022|SPPE | S.pneumoniae penA gene | 98 | 72 | 906
107 | 5 | 2913 | 4988 | emb|X16022|SPPE | S.pneumoniae penA gene | 99 | 1692 | 2076
107 | 6 | 4981 | 5595 | emb|X13136|SPPE | Streptococcus pneumoniae penA gene for penicillin binding protein 2B lacking N-term. (penicillin resistant strain) | 91 | 107 | 615
108 | 9 | 9068 | 8718 | emb|Z67739|SPPA | S.pneumoniae parC, parE and transposase genes and unknown orf | 95 | 342 | 351
108 | 12 | 11308 | 10922 | emb|Z67739|SPPA | S.pneumoniae parC, parE and transposase genes and unknown orf | 99 | 199 | 387
109 | 3 | 2768 | 2291 | emb|Z77725|SPIS | S.pneumoniae DNA for insertion sequence IS1381 (966 bp) | 96 | 61 | 528
109 | 4 | 2688 | 2855 | emb|Z77726|SPIS | S.pneumoniae DNA for insertion sequence IS1318 (1372 bp) | 96 | 148 | 168
109 | 5 | 2862 | 3269 | emb|Z77727|SPIS | S.pneumoniae DNA for insertion sequence IS1318 (823 bp) | 97 | 353 | 408
109 | 6 | 5320 | 3584 | gb|M18729| | S.pneumoniae mismatch repair protein (hexA) gene, complete cds | 100 | 371 | 1737

113 | 1 | 431 | 3 | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds | 95 | 429 | 429
113 | 10 | 9788 | 8532 | emb|X99400|SPDA | S.pneumoniae dacA gene and ORF | 99 | 1257 | 1257
113 | 11 | 10985 | 9870 | emb|X99400|SPDA | S.pneumoniae dacA gene and ORF | 99 | 1116 | 1116
114 | 3 | 2530 | 2030 | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds | 95 | 481 | 501
115 | 11 | 11303 | 10932 | gb|U04047| | Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion sequence IS1202 transposase gene, complete cds | 97 | 372 | 372
117 | [illegible] | [illegible] | [illegible] | emb|X72967|SPNA | S.pneumoniae nanA gene | 99 | [illegible] | [illegible]
117 | 2 | 3277 | 3831 | emb|X72967|SPNA | S.pneumoniae nanA gene | 99 | 237 | 555
117 | 3 | 4327 | 3899 | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds | 98 | 429 | 429
121 | 2 | 1369 | 1941 | gb|U72720| | Streptococcus pneumoniae heat shock protein 70 (dnaK) gene, complete cds and DnaJ (dnaJ) gene, partial cds | 99 | [illegible] | 573
121 | 3 | 2412 | 4253 | gb|U72720| | Streptococcus pneumoniae heat shock protein 70 (dnaK) gene, complete cds and DnaJ (dnaJ) gene, partial cds | 99 | 1842 | 1842
122 | 8 | 5066 | 5587 | gb|U04047| | Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion sequence IS1202 transposase gene, complete cds | 64 | 451 | 522
S. pneumoniae - Coding regions containing known sequences
Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | percent ident | HSP nt length | ORF nt length

y________y____y_______y_______y________________y_______________________________ _____________________________________________y________y _________y_________y 125 ~ ~ ~ ~gb~H36180~Streptococcus pneumoniae transposase, (comAtase92 1 1811 189 and comb) and SAICAR synthe i ~ i (putt) genes, complete cds i ~________y____y_______y_______y________________~_______________________________ _____________________________________________y________y_________f-________+
128 | 15 | 12496 | 11204 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 91 | 705 | 1293

134 | 1 | 1 | 492 | emb|Y10818|SPY1 | S.pneumoniae spsA gene | 99 |  | 492
134 | 2 | 556 | 2652 | gb|AF019904| | Streptococcus pneumoniae choline binding protein A (cbpA) gene, partial cds |  |  | 
134 |  |  |  | emb|Y10818|SPY1 | S.pneumoniae spsA gene |  |  | 
134 | 4 | 3952 | 2882 | gb|AF019904| | Streptococcus pneumoniae choline binding protein A (cbpA) gene, partial cds |  |  | 1071
134 | 8 | 7992 | 9848 | gb|U12567| | Streptococcus pneumoniae P13 glycerol-3-phosphate dehydrogenase (glpD) gene, partial cds, and glycerol uptake facilitator (glpF) and ORF3 genes, complete cds |  |  | 
134 | 9 | 9846 | 10622 | gb|U12567| | Streptococcus pneumoniae P13 glycerol-3-phosphate dehydrogenase (glpD) gene, partial cds, and glycerol uptake facilitator (glpF) and ORF3 genes, complete cds | 99 | 570 | 777
134 | 10 | 10805 | 11122 | gb|U12567| | Streptococcus pneumoniae P13 glycerol-3-phosphate dehydrogenase (glpD) gene, partial cds, and glycerol uptake facilitator (glpF) and ORF3 genes, complete cds | 100 | 318 | 318

137 | 13 | 7970 | 8443 | gb|U09239| | Streptococcus pneumoniae type 19F capsular polysaccharide biosynthesis operon, (cps19fABCDEFGHIJKLMNO) genes, complete cds, and aliA gene, partial cds | 90 |  | 
137 | 14 | 8590 | 8775 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 94 | 174 | 186
137 | 15 | 8773 | 8967 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 98 |  | 195
137 | 16 | 9223 | 9687 | emb|Z77726|SPIS | S.pneumoniae DNA for insertion sequence IS1318 (1372 bp) | 96 |  | 465
137 | 17 | 9641 | 10051 | emb|Z77727|SPIS | S.pneumoniae DNA for insertion sequence IS1318 (823 bp) | 96 |  | 411
139 | 10 | 12998 | 12702 | emb|X63602|SPBO | S.pneumoniae mmsA-Box | 90 | 234 | 297
141 | 8 | 7805 | 8938 | emb|Z49988|SPMM | Streptococcus pneumoniae mmsA gene | 99 |  | 1134
141 | 9 | 8936 | 10972 | emb|Z49988|SPMM | Streptococcus pneumoniae mmsA gene | 99 |  | 2037
141 | 10 | 11472 | 12467 | emb|Z49988|SPMM | Streptococcus pneumoniae mmsA gene | 100 |  | 996
142 | 2 | 257 | 814 | gb|M80215| | Streptococcus pneumoniae uvs402 protein gene, complete cds | 98 | 174 | 558

142 | 3 | 787 | 957 | gb|M80215| | Streptococcus pneumoniae uvs402 protein gene, complete cds | 100 |  | 171
142 | 4 | 980 | 3022 | gb|M80215| | Streptococcus pneumoniae uvs402 protein gene, complete cds | 95 |  | 1043

( ( ( ( ( (Stre ( 142 5 3020 3595 b(M80215~ tococcus I00 neumoniae uvs402 ( ene 153 rotein ( com 576 lete cds ( g p p p g , p ___ _y____y_______y_______y________________ y____________________________~_______________________________________________y_ _______;_________;_________y 145 1 1 219 emb 235135neumoniae aliA

( ( ~ ( SPAL ( ~

( ( ( ene for amiA-like 97 ene A ( ( ( .
P g g ________,____,_______,_______y________________ ,___________________________________________________________~________________,_ _______y_________,_________, ( ~ ( ( (gb(L20556((Streptococcus pneumoniae plpA gene, partial ( 145 2 171 1994 cds 99 ( ~
i824 ___ _;____;_______ y_______;__________-_____~_________________________________________________________________________ ___ f________y_________;_________;

( ~ ( ( (emb(Z47210(SPDE(S.pneumonfae dex8) cap3A, cap3H and cap3C
( 145 3 22B7 7599 genes and orfs 99 ( ( ( ________f____,_______y_______,________________ ,____________________________________________________________________________,_ _______y________..;_________, i i i i igb(H90527(Streptococcus pneumoniae penicillin-binding 99 145 4 9934 7766 protein IpOnA) gene, complete i i i cds i ________,____~_______y_______;________________;________________________________ ____________________________________________;________y_________;_________;

14S i ;10488' ;gb~M90527(iStreptococcus pneumoniae penicillin-binding ( 9922 protein (pdM) gene, complete 99 cds ( ___ _y____y_______y_______y________________y________________________-.___________________________________________________y________y_________;_______ __;

( ( ('159( (emb~282002(SPZB(S.pneumoniae pcpB and pcpC genes ( y ( ( ( ___ _y____~_______y_______y________________y_____________-________________________________.._____________________________y________y______ ___f_________;

( ( ( ( (emh(Z82002~SPZ8(S.pneumoniae pcpB end pcpC genes ( ( ( ( ___.._________y________________________________________________________________ ____________y________~_________~_________y N

( (16(11795(10794(emb(282002(SPZB(S.pneumoniae pcpB and pcpC genes ( w.
( ( ( ~________;____y_______y_______y________________y_______________________________ _____________________________________________4________f_________y_________y J
( (11(1067A(10202(emb(Z21702(SPUN(S.pneumoniae ung gene and mutX genes encoding( N
I47 uracil-DNA glycosylase and 8- 98 ( ( ( ( ( ( ( ( ( oxodGTP nucleoside triphosphatase ( ( ( O
( ________~____y_______,_______,________________~________________________________ ____________________________________________y________y_________y______ i i121133810676(emb~221702(SPUN(S.pneumoniae ung gene and mutX genes encoding~
147 uracil-DNA glycosylase and 8- 99 ~

~ i ( ~ ~ oxodGTP nucleoside triphosphatase ( ( ________,____y_______,_______,________________;________________________________ ____________________________________________;________;_________y_________;

148 (12( ( (gb(U41735~(Streptococcus pneumoniae peptide methionine 90 9009 8815 sulfoxide reductase (msrAl and ( ( ( ( ( ( homoserine kinase homolog (thrB1 genes, i i i complete cds i ~________?____y_______~_______;________________y_______________________________ _____________________________________________y________y_________f_________y N

( ( ( ( ~emb(X63602(SPBO(S.pneumoniae mmsA-Box 156 9 1154 1402 ( ( ( ( ________~____,_______,_______,________________y________________________________ _____________________._______________________~________,_________y_________y ( ~13( ( ~gb~M36180(Streptococcus pneumoniae transposase) (comA ( 526 ( S28 159 9048 8521 and coma) and SAICAR synthetase 98 ( ( ( ( ( ~ ( ( (purCl genes, complete cds ( ( ( ~

y________;____y_______~_______ ~________________,_____________________________________________________________ _______________y________~_________;_________;

160 ( ~ ( (emb~Z26851~SPAT(S.pneumoniae IR61 genes for ATPase a aubunit,( 142 147 1 1 147 ATPase b subunit and ATPase c 100 ( ( ( ( ( ( ( ( subunit ' ( ( ( i ~________y____y_______,_______~________________a_______________________________ _____________________________________________f________y_________~_________+

( ( ( ( (emb(Z26851(SPAT(S.pneumoniae (R6) genes for ATPase a subunit)( 720 720 160 2 179 B98 ATPase b subunit and ATPase c 99 ~
( ( ( ~ ( ~ ~ subunit ( y________i____y_______y_______y________________y_______________________________ _______.._____________________________________i________~_________;_________;

( ~ ( ( ~emb(Z26B50(SPATS.pneumonfae 4M2221 genes for ATPase a subunit,( 501 501 160 3 906 1406 ATPase b subunit and ATPase 95 ( ( ( ( ( ( ( ( i ( c subunit y________y____y_______y_______,________________f_______________________________ _____________________________________________________+_________y_________f i i i 1992 iemb(226850~SPATS.pneumaniae (M222) genes for ATPase a subunit,( 306 570 l60 4 1373 ATPase b subunit and ATPase 87 ( ( ( i i ( ( c subunit ( ________,____,_______,_______y________________,________________________________ ____________________________________________,________,_________,_________a ( ( ( ( ~emb(X77249~SPR6(S
( J
161 1 1 984 pneumoniae (R61 ciaR/ciaH genes 99 . ( ( ( ~________y____y_______y_______!___________-____;__________________________________..______________________________________ ___y________+_________y_________y ( ( ( ( (emb(X83917(SPGY(S.pneumoniae orflgyrB and gyrB gene encoding( 161 7 6910 7497 DNA 9yrase B subunit 99 ( ( ( ________,____ ,_______,_______,________________y_____________________________________________ ___________________________,________y_________y_________y __ ( ( ( ( (emb~X83917(SPGY(S.pneumoniae orflgyrB and gyrB gene encoding( 161 8 7443 9386 DNA gyrase B subunit 98 ( 19l2 ( ( y________y____ ;_______~__..____y________________4____________________________________________ ________________________________a________y_________;_________;

l63 ( ( ( (gb(L20559((Streptococcus pneumoniae ExpS gene, partial 98 1 2 21S5 cds ( ( ( ( y________y____ ;_______t_______y________________y____________.._______________________________ ________________________________;________;_________;_________;

S. pneumoniae - Coding regions containing known sequences ________,____y_______,_______,________________+________________________________ ____________________________________________,________y_________y_________y Contig~ORF ~ ~
HSP ~ ORF
ID ~ Stop ________percentnt nt y pp y________Start ~
( ~ lengthlength ~ID match ident__________________ ~ ~ y________ (nt) match y____y_______ gene name ~

(nt) ~

acession ~

y_______,________________,_____________________________________________________ _______________ 165 1 ~ ~gb~J01796~ ~
~ 1587~ 1S87j 32 (S.pneumoniae ________99 W
y________4____1618 malX and y_________y_________4 malts y________ y_______y_______ genes encoding membrane protein and amylomaltase, complete cds, and male gene encoding phosphorylase ,________________y_____________________________________________________________ _______ 165 2 16083902 b J01796 S.pneumoniae malX and malts g protein ~ ~
280 ~ 2295 ~9 ~ ~ and 100 ~ genes encodin membrane amylomaltase. complete cds, and male gene encoding phasphorylase y________y____y_______ y_______y________________y_____________________________________________________ _______________ ________y________ y_________y_________y 166 ~ ~ ~ ~emb~Y11463~SPDNStreptococcus pneumoniae dnaG) rpoD, ( 1 378 4 cpoA genes and ORF3 and ORFS 100 ~

~

y________y____,_______i_______y________________y_______________________________ _____________________________________ ________,________;_________,_________y 166 ~ ~ ~ ~emb~Y11463~SPDNStreptococcus pneumoniae dnaG, rpoD, ~

2 1507320 cpoA genes and DRF3 and ORES 99 ~

~

________y____y_______ f_______y________________y_____________________________________________________ _______________________y________y_________y_________y 166 ~ ~ ~ ~emb~Y1I463~SPDNStreptococcus pneumonfae dnaG, rpoD, ~

3 3240I432 cpoA genes and ORF3 and ORFS 99 ~

~

y________y____a_______,_______y________________,_______________________________ _______________________~____________________a________y_________y_________y 167 ~ ~ ~ ~emb~271552~SPADStreptococcus pneumoniae adcCBA operon ~

~

~

________y____y_______,_______,________________ y____________________________________________________________________________y_ _______y_________y_________y I67 ~ ~ ~ ~emb~Z71552~SPADStreptococcus pneumoniae adcCBA operon ~

y____2 1844999 98 __ _ yi _ ~

~

_ _________y__ y_____~.__________y____________________________________________________________ _____.___________y______..._y_________y_____~~_-y _ y ~ ____ ~emb~Z71552~SPAD~SCreptococcus pneumoniae adcCBA operon ( ' ~ 27l4~ g7 167 3 1B42 ~

~

________,___________,_______y________________,_________________________________ ___________________________________________,______.._y_________s_________y ( ~ S ~ ~emb~Z?i552~SPADStreptococcus pneumoniae adcCBA operon ~
to ~

~

~

________,____y_______,_______,________________,________________________________ ____________________________________ .._______,________,_________a_________y ~1 168 ~ ~ ~ ~gb~L20558~ ~
J
1 1 2259 Streptococcus 99 _ pneumoniae ~

Exp4 gene, 282 partial ~
cds 2259 ~

________y____y________ y________________y_____________________________________________________________ _______ ________y________y_________y_________y N
( ~10~ y_____~emb~277726~SPIS ( o 170 733A~ ~S.pneumoniae 95 7685 DNA foc ~

insertion 315 sequence ~

(1372 ~
bp/

________,____,_______,_______,________________,________________________________ ____________________________________________,________,_________,_________, 172 ~ ~ ~ ~gb~U47625)Streptococcus pneumoniae formate acetyltransferasetial ~ Wp 6 246249B1 (exp72) gene) par 97 ( ~

~

cds ________,____y_______y_______________________y_________________________________ ___________________________________________s________y_________ f_________4 175 ~ ~ ~ ~gb~M36180jStreptococcus pneumoniae transposase, chetase~
~ 354 ~ o 1 373 20 (comA and come) and SAICAR syn 89 ~

~ ~ ~ ~ ~ IpurC1 genes) complete cds ________,____y_______y_______y________________y________________________________ ____________________________________________y________y_________4_________y N
( ~ ~ ~ ~emb~247210~SPDE(S.pneumoniae dexB, cap3A, cap3B and ~

175 4 18433621 cap3C genes and orfs 95 ~

~

~___________________y_______~________________y_________________________________ ___________________________________________y________y_________~_________y 176 ~ ~ ~ ~emb~Z67739~SPPA~S.pneumoniae parC, pare and transposase~

J9842980 genes and unknown orf 100 ~

~

____________,_______,_______,________________,_________________________________ ___________________________________________y________y_________,_________y 178 ~ ~ ( ~emb~Z67739~SPPA~S.pneumoniae parC, pare and transposase~

1 3 425 genes and unknown orf 95 ~

~

_________y_______y_______y_____________-__y__________~_________________________________________________________________ y________y_________y_________y I79 ~ ~ ~ ~emb~283335~5PZ8(S.pneumoniae dex8, capllA,B.C,D,E,F,G,H,I,J,K/ 99 338 357 1 426 70 genes, dTDP-rhamnose i i j i biosynthesis genes and aliA gene ________y____4_______y_______y________________y________________________________ _____________________________..______________y________y_________y_________y 180 ( ( ~ ~emb~x95718~SPGY~S.pneumoniae gyrA gene ~

~

~

y________y____y_______y_______y________________y_______________________________ ___________~_________________________________y________y_________y______ 186 ~ ~ ~ ~emb~Z79691~SOOR~S.pneumoniae yorf(A,B,C,D,E/, ttsL) ~

________1 714 4 _ pbpX and regR genes 98 .____._ ._ , ~

_ _ 59 ~

___ ___ ___ ,____________________________________________________________________________,_ _______._________,_________, n 186 ~ __ __ ____________~S.pneumoniae yorf(A,B.C,D,EI) ftsL, ~

2 ~ ~ ~emb~Z79691~SOORpbpX and regR genes 98 2254608 ~

~

y________y____y_______i_______________________a.______________.________________ _____________________________________________y_______y_________y_________y 186 ~ ~ ~ ~emh~279691~SOOR~S.pneumoniae yorf[A,H,C,D,E/) ftsL, ~

3 707 880 pbpX and regR genes 98 ~

~

189 | 1 | 2 | 259 | gb|U72720| | Streptococcus pneumoniae heat shock protein 70 (dnaK) gene, complete cds, and DnaJ (dnaJ) gene, partial cds |  |  | 
189 | 2 | 600 | 385 | gb|U72720| | Streptococcus pneumoniae heat shock protein 70 (dnaK) gene, complete cds, and DnaJ (dnaJ) gene, partial cds |  |  | 
189 | 3 | 1018 | 851 | gb|U72720| | Streptococcus pneumoniae heat shock protein 70 (dnaK) gene, complete cds, and DnaJ (dnaJ) gene, partial cds | 99 |  | 
189 | 4 | 1012 | 2154 | gb|U72720| | Streptococcus pneumoniae heat shock protein 70 (dnaK) gene, complete cds, and DnaJ (dnaJ) gene, partial cds | 99 |  | 
191 | 9 | 7829 | 7524 | emb|X63602|SPBO | S.pneumoniae mmsA-Box | 95 |  | 
194 | 1 | 1 | 729 | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds |  |  | 
199 | 2 | 1117 | 881 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 96 | 211 | 237
199 | 4 | 1499 | 1762 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 89 |  | 264
199 | 5 | 1781 | 2284 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 98 | 504 | 504
203 | 1 | 1977 | 337 | gb|L20567| | Streptococcus pneumoniae Exp9 gene, partial cds | 99 |  | 
204 | 1 | 1145 | 3 | gb|L36131| | Streptococcus pneumoniae exp10 gene, complete cds, recA gene, 5' end | 99 |  | 
208 | 1 | 59 | 2296 | gb|U89711| | Streptococcus pneumoniae pneumococcal surface protein A PspA (pspA) gene, complete cds | 90 | 471 | 2238
213 | 3 | 2455 | 2123 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 96 | 332 | 333
216 | 1 | 368 | 12 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 99 | 338 | 357
216 | 3 | 2650 | 2327 | gb|M28678| | S.pneumoniae promoter sequence DNA | 98 |  | 
222 | 1 | 417 | 4 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 94 | 414 | 414
227 | 3 | 5266 | 4238 | emb|AJ000336|SP | Streptococcus pneumoniae ldh gene |  |  | 
239 | 1 | 1 | 804 | gb|M31296| | S.pneumoniae recP gene, complete cds | 95 |  | 
247 | 3 | 1625 | 1807 | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds | 94 | 178 | 183
249 | 3 | 921 | 1364 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 94 | 443 | 444
253 | 1 | 362 | 3 | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds |  |  | 

j ~ ~ j ~emb~283335~SPZ8~S.pneumoniae dexB, capl(A,B,C,D,E,F,G,H,I,J,K)dTDP-rhamnose 95 420 813 253 5 1238 2050 genes, i i ~ i j ~ j j biosynthesis genes and aliA

gene ,________s____i_______,_______,________________y_______________________________ _____________________________________________,________,_________y_________, TABLE 1 S. pneumoniae - Coding regions containing known sequences ,________y____y_______,_______y________________+____________________________.._ ______________________________________________y__.._____;_________y_________y ( Contig(ORF( ( ( match ( match gene name ( percent ( HSP ( ORF
StartStop nt nt ( ( ID (ID( ( ( acession ( ( ident ( length( length( (nt) (nt) ________,____,_______,_______,________________y________________________________ ____________________________________________,________,_________y_________y ( 253 ( ( ( (emb(Z83335(SP28(S.pneumoniae dexH, capl(A.H.C,D,E,F.G.H.I,J,K] 97 504 504 6 2069 2572 genes) dTDP-rhamnose i i ~ i ( ( ( ( ( ( biosyntheses genes and aliA gene ,________y-___y_______,_______;___,.____________y___..____________________________________ ____________________________________,________y_________y_________y ( 255 ( ( ( (emb(282002(SPZB ( 97 ( 531 1 3 B00 (S.pneumoniae ( 798 ( pcpB and pcpC genes ________y____,_______,_______y________________,________________________________ ____________________________________________,________y_________,_________, ( 25S ( ( ( (emb(282002(SP28 ( 97 ) 672 2 i98 1841(S.pneumoniae ( 104d pcpB and ( pcpC genes ,________a____y_______,_______y________________,_______________________________ ___________________________ __________________,________y_________y_________, ( Z55 ( ( ( (emb(267739(SPPA orf ( 92 3 2493 1969(S.pneumoniae ( 4)5 ( part, parE 52S ( and transposase genes and unknown ________,____y_______,_______,________________y________________________________ __________________________ __________________i________,_________,_________, ( 257 ( ( ( (emb(X17337(SPAH resistance 2 98S 770 (Streptococcus ( 96 ( pneumoniae 117 ( 216 ami locus ( conferring aminopterin ________,____,_______,_______y________________,________________________________ ____________________________________________,________,_________,_________, ( 257 ( ( ( (gb(H36180((Streptococcus pneumoniae transposase)SAICAR
synthetase( ( 339 ( 339 ( 3 1245 907 (comA and coma) and 97 ( ( ( ( ( ( (purC1 genes, complete cds ( ( ( ( y________y____y_____~._p______y________________,_______________________________ _____________________________________________,________y_________,_________y ( 267 ( ( ( (gb(U16156(Streptococcus pneumoniae dihydropteroatedihydrofolate( ( 714 ( 2 495 120B( synthase (sulAl 95 ( ( ( (, ( ( ( , (sulC), synthetase (sulB), guanosine triphosphatealdolase-cyclohydrolase i ( ( ( ( ( ( ( ( pyrophosphokinase IsulD) genes, ( complete cds ________,____,_______,_______,________________,________________________________ ____________________________________________,________y_________ ,_________, N

( 267 ( ( ( (gb(U16156(Streptococcus pneumoniae dihydropteroatedihydrofolate( ( 755 ( 987 N
3 1291 2277( synthase (sulA), 97 ( ( ( ( ~ ~ synthetase (sul8), guanosine triphosphate(suiCl.
( ( ~ "I
cyclohydrolase aldolase-( ( ( ( pyrophosphokinase IsulD) genes, ( ~ ( ( J
complete cds ~________,____y_______y_______,________________,_______________________________ ___________________________ __________________4________y_________y_________f N

( 267 ( ( ( (gb(U16156(Streptococcus pneumoniae dihydropteroatedihydrofolate ( 1341( 1341( 4 2261 3601( synthase (sulA), ( 98 O

( ( ( ( ( ( synthetase (sulB), guanosine triphosphate(sulC), ( ( p1 ,r cyclohydrolase aldolase-( ( ( ( ( ( ( pyrophosphokinase (sulD1 genes, ( ( ( J ~o complete cds y________y____y_______s_______,________________,__~____________________________ _____________________________________________y________ y_________,_________, ( 267 ( ( ( ~gb~U16156( dihydrofolate( ( 576 ( S76 ( 3561 d136(Streptococcus 99 pneumoniae dlhydropteroate synthase (sulA), ( ( ( ( ( 5ynthetase (sulC), o IsulB)) aldolase-guanosine triphosphate cyclohydrolase ( ( ( ( ( ( pyrophosphokinase ; i i i (sulD) .
genes, complete cds ,________,____,_______,_______a________________,_______________________________ _____________________________________________,________ ,_________+_________y N

| 267 | 6 | 4164 | 4949 | gb|U16156| | Streptococcus pneumoniae dihydropteroate synthase (sulA), dihydrofolate synthetase (sulB), guanosine triphosphate cyclohydrolase (sulC), aldolase-pyrophosphokinase (sulD) genes, complete cds | 99 | 748 | 786 |
| 267 | 7 | 5140 | 5594 | gb|U16156| | Streptococcus pneumoniae dihydropteroate synthase (sulA), dihydrofolate synthetase (sulB), guanosine triphosphate cyclohydrolase (sulC), aldolase-pyrophosphokinase (sulD) genes, complete cds | 100 | 186 | 405 |
| 268 | 4 | 1793 | 1990 | emb|X63602|SPBO | S.pneumoniae mmsA-Box | 89 | 194 | 198 |
| 271 | 1 | 562 | 104 | gb|M29686| | S.pneumoniae mismatch repair (hexB) gene, complete cds | 93 | 160 | 459 |
| 291 | 1 | 75 | 524 | gb|U04047| | Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion sequence IS1202 transposase gene, complete cds | 96 | | |
| 291 | 2 | 1001 | 525 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 87 | 205 | 477 |
| 291 | 3 | 807 | 559 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 90 | 249 | |
| 291 | 4 | 1374 | 1099 | gb|M36180| | Streptococcus pneumoniae transposase, (comA and comB) and SAICAR synthetase (purC) genes, complete cds | 85 | 264 | 276 |

TABLE 1. S. pneumoniae - Coding regions containing known sequences

| Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | percent ident | HSP nt length | ORF nt length |
| 293 | 1 | 3 | 1673 | emb|Z67740|SPGY | S.pneumoniae gyrB gene and unknown orf | 98 | 553 | 1671 |
| 296 | 1 | 1434 | 151 | emb|Z47210|SPDE | S.pneumoniae dexB, cap3A, cap3B and cap3C genes and orfs | 99 | 430 | 1284 |
| 317 | 1 | 157 | 510 | emb|Z67739|SPPA | S.pneumoniae parC, parE and transposase genes and unknown orf | 89 | 353 | 354 |
| 325 | 2 | 485 | 1237 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 91 | 299 | 753 |
| 326 | 1 | 1 | 462 | emb|Z82001|SPZ8 | S.pneumoniae pcpA gene and open reading frames | 100 | 233 | 462 |
| 327 | 1 | 603 | 69 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 94 | 89 | 540 |
| 334 | 1 | 153 | 545 | gb|U41735| | Streptococcus pneumoniae peptide methionine sulfoxide reductase (msrA) and homoserine kinase homolog (thrB) genes, complete cds | 87 | 91 | 393 |
| 336 | 1 | 308 | 93 | emb|Z26850|SPAT | S.pneumoniae (M222) genes for ATPase a subunit, ATPase b subunit and ATPase c subunit | 97 | 102 | 216 |
| 360 | 1 | 1 | 519 | emb|Z67739|SPPA | S.pneumoniae parC, parE and transposase genes and unknown orf | 95 | 435 | 519 |
| 360 | 4 | 1598 | 1960 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 94 | 353 | 363 |
| 362 | 1 | 673 | 2 | emb|Z83335|SPZ8 | S.pneumoniae dexB, cap1[A,B,C,D,E,F,G,H,I,J,K] genes, dTDP-rhamnose biosynthesis genes and aliA gene | 95 | 63 | 672 |
| 362 | 2 | 1168 | 728 | gb|U04047| | Streptococcus pneumoniae SSZ dextran glucosidase gene and insertion sequence IS1202 transposase gene, complete cds | 96 | 441 | 441 |
| 384 | 1 | 347 | 111 | emb|X85787|SPCP | S.pneumoniae dexB, cps14A, cps14B, cps14C, cps14D, cps14E, cps14F, cps14G, cps14H, cps14I, cps14J, cps14K, cps14L, tasA genes | 94 | 54 | 237 |

TABLE 2. S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

| Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt) |
________1____,_______1_______,_______________ _,_____________________________ ___________________________________..__________,________,_________,_________t Op ( ( ( ( (pir(F60663(F606(translation elongation factor Tu -StreptococcusJ 100 228 2 17601942 oralis J

( J

,___-.____1____,_______,_______1_______________ _,_________________-_____________________________--__________________-_______-,________,_____-___,_________;

J ( J ( Jgi~984927 Jneomycin phosphotransferase [Cloning vector ( 100 319 1 2 205 pBSL991 ( ( J

1________,____1_______,_______1_______________ _,____________________________________________________________________________, ________,_________1_________, ( ( ( J (pir(F60663(F606(translation elongation factor Tu -Streptococcus( 99 260 1 2 1138 oralis ( J

J

1________,____,_______,_______,_______~-______ _,____________________________________________________-______________________..,________,_________,____--___y ( ( ( ~ Jgi(1574495(hypothetical (Haemophilus influenzae] J 98 25 2 486 1394 ( ( J

,____..-__1____,_______,_______,-_______________~_._____-_______-_________________________________________________--_______--_,________,_________,_________, J J J J (giJ310627 phosphoenolpyruvate:sugar phosphotransferase ( 98 94 2 685 1002 system HPr (Streptococcus ( ( =

J J ( J J mutansl J ~
J

________,____,_______,_______,________________,________________________________ ____________________________________________,________~_________,_________, ( ( J ( (gi(347999 (ATP-dependent protease proteolytic subunit ( 98 312 1 190 2 [Streptocochus salivariusl ( J

( ________,____,_______,_______,________________+________________________________ ____________________________________________,________,_________,_________1 ( ( ( ( (9i(924848 (inosine monophosphate dehydrogenase [Streptococcus( 98 329 1 1 807 pyogenes] ( J

J

,________1____,_______,_______,________________,_______________________________ _____________________________________________,________,_________,_________1 ( ( J ( i 987050 lac2 ( 336 2 290 589 (g ( J gene product (unidentified cloning vectorl ~ 98 ( J

1________1____,_______,_______,________________,_______________________________ ______________..___-_________;_~______________f________1_________,_________y .

( J ( ( (gi~153755 (phospho-beta-D-galactosidase (EC 3 J 97 85) (Lactococcus lactis cremoris) J

. q19 . J
.

________1____,_______,_______,________________,________________________________ ____________________________________________ i________i_________,_________, ( ( ( ( (9i(347998 (uracil phosphoribosyltransferase (StreptococcusJ

312 2 1044361 salivarius] ( ( ( ,________,____,_______1_______,________________,_______________________________ _____________________________________________v________~_________a_________, J ( ( ( (sp(P37214(ERA_S(GTP-BINDING PROTEIN ERA HOMOLOG. J

( ( 1________1____1_______,_______,________________,______________________-_____________________________________________________ f________4_________,_______-_, J ( J J Jgi(153615 (phosphoenolpyruvate:sugar phosphotransferase( 96 92 1791 O1 ,r 94 3 951 2741 system enzyme I [Streptococcus J
( ( J J ( ( ( J salivarius) ( ( ( ( ~O ~o ,________1____,_______,_______,________________,__ ______________________________________________________ _____ _____ __,__ __,__ __, ( ( J ( (9i(581299 Jinitiation factor IF-1 (Lactococcus laccis) ~ 96 127 1 1 168 ( ( ( ,_____-__1____1_______,_______,________________,______________________________________ ______________________________________ y________,_________,_________, O

J J19(10438J11154(9i(1276873JDeoD [Streptococcus thermophilus) ( J

( ________,____,_______,_______1_______________ _,____________________________________________________________________________1 ________,_________,_________, J ( J J Jgi(46606 JlacD polypeptlde (AA 1-326) (Staphylococcus J 96 181 4 13621598 aureusl ( J

J

,________,____,_______,_______,_______________ _,____________________________________________________________________________, ________,_________,_________, J ~ ( ~ JgiJ1743856~intrageneric coaggregation-relevant adhesin J 96 218 1 1 834 (Streptococcus gordonii] ~

~
g34 J

1________,____,_______,_______1________________t_______________________________ _____________________________________________ ,___-____1_________4_____-___;

( ( ( ( Jgi(208225 Jheat-shock protein 82/neomcyn phosphotransferaseJ

319 2 115 441 fusion protein (hsp82-neo) J
~ J

( ~ ( ( ( ( [unidentified cloning vector] ( J
( ( (________,____1_______,_______,________________,_______________________________ _____________________________________________ ,________4______-__~_________, ( J12~ J10967(gnl(PID(d100972JPyruvate formate-lyase [Streptococcus mutans]( 95 54 B622 ( ( ( 1________1____1_______1_______1________________,_______________________________ _____________________________________________ 1________,_________1______.___, ( ( J ( (9i(149396 JlacD (Lactococcus lactisl 181 2 606 1289 ( 95 ( ( ( ,________,____,_______,_______,________________,____-_______________________________________________________________________,_______ _,_________,_________, ( ( ( J Jgi(1850606(YlxH [Streptococcus mutans] J 94 ( J

,________,____,_______,_______,_____________...__,___--_________________________________________________________________________ "d __.-_ n _-___ _ ( J10( ( Jgi~703442 Jthymidine kinase [Streptococcus gordonii]

89 79727337 ( 94 r J j J

( ,________,____a_______,_______,________________1-_____________.._.______________________________________________________________ _ _ __,__ ___ __,_________y J ( ( ( (9i(995767 ~UDP-glucose pyrophosphorylase [Streptococcus( 94 148 9 64317354 pyogenes] J

( ( ,________,____t_______1_______1________________,_______________________________ _____________________________________________,________,_________,_________, ~p J ( ( ( JgiJ153573 (H~ ATPase [Enterocoecus faecalis] J 94 s 160 7 44305B48 ~

J

( 1________~____1_______1_______,________________1_______-__________________________________________________________________ _____ __ rr _ _____4_________, J ( ( ( ~giJ153763 Jplasmin receptor (Streptococcus pyogenesl ( 93 2 3 45983513 ( J

J

,________,____,_______1_______1________________,____________~__________________ _______________-__________________ _ ,________+_________,_________, __ ' ( J J ' (giJ1103865~formyl-tetrahydrofolate synthetase [StreptococcusJ

12 8 78776204 mutans] J

J

J

| | 11 | | | gi|40150 | L14 protein (AA 1-122) [Bacillus subtilis] | | | |

+________~____+_______+_______+________________+_______________________________ _____________________________________ ________+________+_________,_ ________+

( ( ( I (g1(47341 lantitumor protein (Streptococcus pyogenes]( 93 I

I I

r.
~________,____+_______+_______+________________+_______________________________ ____-________________________________ ________,________+_________+_ ________+

I ( I 1 IgnIIPIDId101166(ribosomal protein S7 (Bacillus subtilis]I 93 ( ( +________+____,_______+_______+________________+_______________________________ _____________________________________ ________,________+_________+_ ________+

( I I 1 19i1142462(ribosomal protein S11 [Bacillus subtilis)( 93 ( ( ( ,___-____+____,_______,_______+________________+____________-___________________________________________________--__ ________+_-______+_________+_ ________, ( I ( I (9i11773264IATPase, alpha subunit [Streptococcus 1 93 I

160 5 1924 3962 mutans]
( I

,________,____,_______+_______+________________+_______________________________ ____________--__--___________________________+________+_________+_ ________+

( 1 1 1 (9i1535273(aminopeptidase C [Streptococcus thermophilus)1 93 I I

,________,____+_______+______-,________________,______________________________________________________~______ _______ ________+________,_________+_ ________, I I I 1 19i1149394IlacB [Lactococcus lactis] 1 93 1 ( 1 +________,____+_______+_______+________________+_______________________________ ______________________-______________________+________+___ ______+_________+

( I I I (9i1295259Itryptophan synthase beta subunit [Synechocystis( 366 1 197 3 sp.l I ( _ ' (___ ,____+
,_______+________________,_____________________________________________________ _______________ ______ ____ ___,_ ____ _______ ____ _ __,_ _ __+__ __,__ __+

I I ( I (9i11574496(hypothetical [Haemophilus influenzae] I 92 ( I I

________,____,_______,_______+________________f________________________________ ____________________________________________,________+_________,_ ________+

I 121I20781119927( (h 36 i1310632 d o hobic emb [St t i t d ii 9 y I 92 I 86 8S5 p I ( r m rane pro n ococcus gor e rep on ]

,________,____,_______+___..___,________________,_____________________________.
.________________.._____________________________+________,_________+_ _.,.______+ J

I ( I I (9i1149396IlacD [Lactococcus lactis]

181 3 1265 1539 ( 92 I

I I

________,____,_______,_______,________________+________________________________ ____________________________________________,________+___ ______,_________+

N

I I I I 19i(149410lenzyme III [Lactococcus lacy isl I 92 I
B3 399 o I I

,________,____,_______,_______,________________,_______________________________ _____________________________________ ______ _____ ____ _____ _ J
__+__ __+__ __+
__+_ ( ( ( ( IgnIIPiD1e294090Ifibronectin-binding protein-like protein( 91 32 4 5631 3937 A [Streptococcus gordoniil I ( ________,____+_______,_______,________________+________________________________ ___________________________________-________,________+___ ______,_________+
~o 1 I I 1 19i11850607(signal recognition particle Ffh [StreptococcusI

46 2 3054 1462 mutans]
I I

~________,____+_______+_______+________________+_______________________________ _____________________________________ ________+________~___ ______+_________+ p I I10I I IpirIS178651S17B(ribosomal protein S17 - Bacillus stearothermophilus( 91 I 80 2B5 ( ( +________,____,_______,_______+________________+_______________________________ _____________________________________________+________~_________,_ ________+ N

1 I I I (9i1287871IgroEL gene product [Lactococcus lactis) ( 91 ( 1 ( ,________,____,_______+_______,________________,_______________________________ _____________________________________ ________+________,___ ______+_________+

I I I I (9i1871784IClp-like ATP-dependent protease binding I 91 I

84 1 2 20S6 subunit [Bos taurus]
I ( ,________+____y_______+_______,________-_______,_______________________________________________________________________ _____+________+_____..__+_ ________+

( I 110750I 19i1153740(sucrose phosphorylase [Streptococcus 1 91 ( 99 8 9272 mutansl ( 1 ________+____~_______+_______,________________+__________:_____________________ ____________________________________ ________+________+___ ______,_________, I I 11194711107219i1153739(membrane protein [Streptococcus mutans] I 91 I

( 1 (________,____+_______,_______+________________,_______________________________ _____________________________________________+________+___ ______+_________+

I I I I IpirIS072231R5BS(ribosomal protein L17 - Bacillus stearothermophilusI 91 I 78 405 ( I

,________,____+_______,_______+________________+_______________________________ _____________________________________________+________+__- ______+_________, I I 1 1 (9i1143065Ihubst [Bacillus stearothermophilus] I 91 1 I I

+________+____+_______+_______,________________,_______________________________ _____________________________________________,_._______+___ ______,_________+

| 137 | 8 | 4765 | 6153 | gnl|PID|d100347 | Na+ -ATPase beta subunit [Enterococcus hirae] | 91 | 79 | 1389 |
( ( ,________+____,_______,_______+________________+_______________________________ ___________________________________________ ____ _____ I ( I111191 (9i11815634Iglutamine synthetase type 1 [Streptococcus____ _ 151 7 9734 agalactiae[ + 91 '+
+ +
I I I I

________,____,_______,_______+________________+________________________________ __________________________________________ ___ ______+_______-_+
__+__ ___+___ I 1 I I 1g112208998Idextran glucosidase DexS [Streptococcus I 91 ( 201 2 1798 278 suis]
I I

,________,_-__,_______,_______+________________+___________________________________________ _________________________________,________+___ ______+_________+

1 1 ( ( (9i1153741(ATP-binding protein (Streptococcus mutans]I 91 ( I l167 I

+________,--__,_______+_______,________________+__________________.______._________________ _______ _ _ __,________,___ .._____+_________+

( I I ( (9i11196921(unknown protein [Insertion sequence IS861]( 91 ( 71 288 pp I ( +________+____,_______+_______+________________,_______________________________ _____________________________________________+________+___ ______+_________+

| 32 | 7 | 6166 | 6570 | pir|A36933|A369 | diacylglycerol kinase homolog - Streptococcus mutans | 90 | 77 | 405 |

,________,____,_______,_______+________________+_______________________________ _____________________________________________,________+___ ______+_________+

TABLE 2. S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

| Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt) |
| 33 | 2 | 841 | 527 | gi|1196921 | unknown protein [Insertion sequence IS861] | 90 | 70 | 315 |
| 48 | 27 | 20908 | 19757 | gnl|PID|e274705 | lactate oxidase [Streptococcus iniae] | 90 | 80 | 1152 |
| 55 | 21 | 19777 | 18515 | gnl|PID|e221213 | ClpX protein [Bacillus subtilis] | 90 | 75 | 1263 |
| 56 | 2 | 717 | 977 | gi|1710133 | flagellar filament cap [Borrelia burgdorferi] | 90 | 50 | 261 |
| 65 | 1 | 1 | 606 | gi|1165303 | L3 [Bacillus subtilis] | 90 | 75 | 606 |
| 114 | 1 | 2 | 988 | gi|153562 | aspartate beta-semialdehyde dehydrogenase (EC 1.2.1.11) [Streptococcus mutans] | 90 | 80 | 987 |
| 120 | 1 | 1345 | 827 | gi|407880 | ORF1 [Streptococcus equisimilis] | 90 | | 519 |
| 159 | 12 | 7690 | 8298 | gi|143012 | GMP synthetase [Bacillus subtilis] | 90 | 84 | 609 |
| 166 | 4 | 4076 | 3282 | gi|1661179 | high affinity branched chain amino acid transport protein [Streptococcus mutans] | 90 | 78 | |
| 183 | 1 | 28 | 1395 | gi|308858 | ATP:pyruvate 2-O-phosphotransferase [Lactococcus lactis] | 90 | 76 | 1368 |
| 191 | 3 | 2891 | 1662 | gi|149521 | tryptophan synthase beta subunit [Lactococcus lactis] | 90 | 78 | 1230 |
| 198 | 2 | 1551 | 436 | gi|2323342 | (AF014460) CcpA [Streptococcus mutans] | 90 | 76 | 1116 |
| 305 | 1 | 37 | 783 | gi|1573551 | asparagine synthetase (asnA) [Haemophilus influenzae] | 90 | 80 | |
| 8 | 3 | 2285 | 3343 | gi|149434 | putative [Lactococcus lactis] | 89 | 78 | 1059 |
| 46 | 8 | 7577 | 7362 | pir|A45434|A454 | ribosomal protein - Bacillus stearothermophilus | 89 | 76 | |
| 49 | 9 | 8363 | 10392 | gi|153792 | recP peptide [Streptococcus pneumoniae] | 89 | | 1980 |
| 51 | 14 | 8410 | 9447 | gi|308857 | ATP:D-fructose 6-phosphate 1-phosphotransferase [Lactococcus lactis] | 89 | 81 | 1038 |
| 57 | 11 | 10669 | 9686 | gnl|PID|d100932 | H2O-forming NADH Oxidase [Streptococcus mutans] | 89 | 77 | 984 |
| 65 | | 2418 | 2786 | gi|1165307 | S19 [Bacillus subtilis] | 89 | 81 | 369 |
| 65 | 8 | 3806 | 4225 | sp|P14577|RL16_ | 50S RIBOSOMAL PROTEIN L16. | 89 | | 420 |
| 65 | 18 | 8219 | 8719 | gi|143417 | ribosomal protein S5 [Bacillus stearothermophilus] | 89 | 76 | 501 |
| 73 | 9 | 6331 | 5315 | gi|532204 | prs [Listeria monocytogenes] | | | |
| 76 | 3 | 3360 | 1465 | gnl|PID|e200671 | lepA gene product [Bacillus subtilis] | 89 | 76 | 1896 |
| 99 | 10 | 12818 | 11919 | gi|153738 | membrane protein [Streptococcus mutans] | 89 | 73 | 900 |
| 120 | 2 | 3552 | 1300 | gi|407881 | stringent response-like protein [Streptococcus equisimilis] | 89 | 79 | 2253 |
| 122 | 5 | 4512 | 2791 | gnl|PID|e280490 | unknown [Streptococcus pneumoniae] | 89 | 81 | 1722 |

S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

SID Stop match _____,____ ~ gene name ~ ~ 4'--------~---------r (nt) acession ~ ~ ~ B

(nt) sim ~

~ t ident ,_______,_______, ~ length I ( ~ (nt) _____ ___________,________ 176 ! ~
______________________________________________________________ ________1 669 _______ 177 ,____~ ~gi~47394 ~ 4 ~5-oxoprolyl-peptidase 6 ,_______,______ (Streptococcus ~ pyogenes]

30S0 , ~ 89 3934 ~

~

_,________________,____________________________________________________________ ________________~________,_________,_________, ~gi~912423 putative (Lactococcus lactis]

( ~

~

______________~_________________________._____________________ 1A1 ~ ~ __4________~_________+_________E

~ 40335751 ~gi~149411 8 enzyme III

(Lactococcus lactis]

~

~

~

| 211 | 4 | 3149 | 2793 | gi|535273 | aminopeptidase C [Streptococcus thermophilus] | 89 | 83 | 357 |
| 361 | 1 | 431 | 838 | gi|1196922 | unknown protein [Insertion sequence IS861] | 89 | 70 | 408 |
| 34 | 17 | 11839 | 10535 | sp|P30053|SYH_S | HISTIDYL-TRNA SYNTHETASE (EC 6.1.1.21) (HISTIDINE--TRNA LIGASE) (HISRS). | 88 | 78 | 1305 |

________,____ ,_______,_______,_______________ _i_________________________________________________ _____ ~_____________ _______~________~_________ 38 ~ ~ putative ABC transporter subunit ComYA ~ 88 ~ 78 ~ g78 ~ 16462623 [Streptococcus gordonii) 3 ~gi~2058544 ~________,____~_______,_______f________________a_______________________________ ______________________________________ _______i________y_________~_________, 54 ~ ~' ~ ~ 88 ~ 66 ~ 225 _____ 1 3 227 _______,________~_________ 57 ,____,_______~gnI~PID~d101320 ~ 88 ~ 75 ~ 858 ~ ~ ~YqgU

2 611 (Bacillus subtilis) ,_______,________________,_____________________________________________________ ________________ ~

~gnl~PID~e134943 putative reductase (Saccharomyces cerevisiae]

________,____ ,_______,_______,________________,_____________________________________________ ________________________ _______,________,_________,_________, 65 ~ ~ ~ 88 ~ 75 ~ 573 ~13 54976069 __ ,_______~pir~A29102~R5BS
_______~________~_________t_________;
ribosomal ~ 88 ~ 83 ~ 471 65 ( protein ________,_ -bacillus steerothermophilus ,_______,________________,_____________________________________________________ ________ ________ ~

~gi~2078381 ribosomal protein (Staphylococcus aureusl ___,_______,_______,________________,__________________________________________ _____________-_____________ _______,________,_________,_________, 78 ~ ( ~ ~ B8 ~ 80 ~ 2529 ________3 36J61108 N
,____,_______~gnl~PID~d100781 _______,________,_________,_________, 106 ~lysyl-aminopeptidase ~ 88 ~ 72 ( 912 Q12A (Lactococcus 2965lactis]

,_______,________________,_____________________________________________________ _______ ___ _ _____ ~gi~2407215 ~(AF017421) putative heat shock protein HtpX

[Streptococcus gordonii]

__ ,_______,_______,________________ ,____________________________________________________________________________~_ _______E_________ 107 ~ ~ ~ putative acylneuraminate lyase (Clostridium~ 88 ~
75 ~ 744 2 2i9 962 tertium) (gnl~PID~e339862 ________,____,_______,_______,________________,________________________________ _____________________________________ _______,________,_________,_________, 111 ~ A 10420~gi~402363 ~ 88 ~ 74 ~ 3654 polymerase beta-subunit (Bacillus subtilis]

,________~____,_______,_______~________________~_______________________ _______~________~_________y______ 126 ~ 13096A2062__-_-_____________________________________ ~ 88 ~ 74 ~ 1035 9 ~gnl~PID~e311468 unknown (Bacillus subtilis) ________,____,_______,_______,________________ ,____________________________________________________________________________,_ _______~_________~_________, 140 Q17A 18B74~gi~1573659 ~N. influenzae predicted coding region ~ 88 ~ 61 ~ 270 9143 W 0659 (Haemophilus influenzae) ,________,____,_______,_______,________________ ,______________________________________________________,_______y________~______ ___ 144 ~ ~ ~ ~gnl~PID~e274705 ______________ ~ 88 ~ 75 ~ 162 1 394 555 lactate oxidase (Streptococcus iniae) _____ ,____~_______,_______~________________,__________ 148 ~ ~ __ 4 2723______________ 160 __ ,______________________y________+_________t______ ~ ~ ~

_ ~gi~1591672 phosphate transport system ATP-binding protein lMethanococcus jannaschii) ~

~

~

,_______~________________~___ __i________~_________~_________y ~

~gi~1773267 ~ATPase, epsilon subunit (Streptococcus mutans]

~

~

~

_ p______,_______,________________ ~__..____________________________________________________ 177 ~ ~ ~ ______________ _______,________,_________ 4 17702885 putative (Lactococcus lactis] ~ 88 ~ 72 ~

________ ~gi ,____,______199926 _ ,_______,________________ ~_______________________ _____________________________________~________~_________ 211 ~ ~ ~ ~aminopeptidase C (Streptococcus thermophilus]~ 88 ~ 74 ~ 528 ________6 41403613 ,__________________ ,____,_______~gi~535273 _ ,_______,________________ __ _______a________,_________,_________, 231 ~ ~ ~ ~gi~40186 ________________________________________________~ 88 ~ 7g ~ 37g 4 580 957 homologous to E.coli ribosomal protein L27 [bacillus subtilis]

,________~____y_______,_______ ,________________ ,___________________________________________________________ 260 ~ ~ ~gi~1196922 _ __~________y_________ ,________5 23B7 ,________________ unknown protein (Insertion sequence IS861]
,____~ ~ 88 ~ 69 ~ 612 291 2998 ~gnl~PID~d100571 ~___ _ _ _ ___ _ _ _ _______________________ pp ~ ,_______,_______ ____f_________~_________y ________6 ,________________ ~adenylosuccinate synthetase (Bacillus I 319 ( ~gi~603578 subtilis) ,____2017 88 ~ 75 ~ 1359 i ~ ~ ~____________ _ __,________~_________~_________, 4 3375 ~serine/threonine kinase (Phytophthora capsici] ~ 88 ~ 88 ~ 342 ~
,_______,_______ ~

~

________,____,_______,_______,________________ ~____________________________________________________________________________t_ _______~_________~_________~
40 | 5 | 9353 | 9514 | gi|153672 | lactose repressor [Streptococcus mutans] | 87 | 56 | 162
S. pneumoniae - Putative coding regions of novel proteins similar to known proteins
Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)

,________,____,_______ ,_______,________________,_____________________________________________________ _______________________,________1_________,_________, pp 1 I10I1066011092919i11196921lunknown protein [Insertion sequence 1 I

________,____,_______,_______,________________f________________________________ ____________________________________ ________1________,_________a_________, 1 1 I 1 19i111653091S3 (Bacillus subtilis/ I

I

,________,____,_______f______ _,________________,_________________________________________.__________________ _________ ________1________,_________,_________, I 115( 1 19i11044978(ribosomal protein SB [Bacillus subtilis)I

4i7 I

,________;____,_______,_______,________________ ,_____________________________.,______________________________________________+
75 | 8 | 5411 | 6625 | gi|1877422 | galactokinase [Streptococcus mutans] | 87 | |
| | | | gnl|PID|d101166 | elongation factor G [Bacillus subtilis] | | |

I

82 | 1 | 541 | 248 | gi|1196921 | unknown protein [Insertion sequence IS861] | | |

140 | 23 | 25033 | 23897 | gnl|PID|e254999 | phenylalanyl-tRNA synthetase beta subunit [Bacillus subtilis] | 87 | | 1137
214 | 14 | 10441 | 8516 | gi|2281305 | glucose inhibited division protein homolog GidA [Lactococcus lactis cremoris] | 87 | |

________,____,_______._______,________________,________________________________ ___________________________________________,________,_________,_________, I I 1 I IgnllPiD1e324358(product highly similar to elongation 1 220 2 2742874 factor EF-G (Bacillus subtilis) 87 I

,________,____,_______,_______,________________,_______..______________________ ______________________________________________,________1_________1_________1 N

260 | 4 | 2096 | 2389 | gi|1196921 | unknown protein [Insertion sequence IS861] | 87 | |

I

w.
,________,____,_______ ,_______,____________-___,___________________________________________________________________________ _F________1_________,_________, I I I 1 1g11897795 1305 ribosomal protein (Pediococcus acidilactici/( N

I

________,____,_______,_______,________________,______..________________________ _____________________________________________,________1_________,_________, I 1 ( 1 19i11044978(ribosomal protein 58 [Bacillus subtilis) w] w..

( I

,________,____,_______,_______,________________,_______________________________ _____________________________________________,________,_________,_________1 W ~O

1 11111092711194519i11196922lunknown protein [Insertion sequence I

I

________,____,_______,_______,________________,________________________________ ____________________________________________,________,_________,_________v 1 112I 1 19i1951051 Irelaxase [Streptococcus pneumoniae/ 1 ________,____,_______f_______a________________?________________________________ ____________________________________________,________,_________,_________, I I I 1 Ipir1A027591R5BS(ribosomal protein L2 - Bacillus stearothermophilus1 la I

,___.____,____,_______a_______,________________,_______________________________ _____________________________________________,________1_________,_________, 1 1231i095711l6101g1144074 ladenylate kinase /Lactococcus lactis] I

I

,________,____,_______,_______,_______~.________+________________________-________-__-__________-____________________________,________1_________,_________, 1 I I 1 19i1153745 Imannitol-specific enzyme III [StreptococcusI

82 4 43744856 mutans/ g6 I

________,..___,______._______,________________,________________________________ ____________________________________________,________,_________y_________, 1 1 1 1 IgnIIPID1e264705IOMP decarboxylase [Lactococcus lactisl 1 ,________,____,_______,_______,________________,_______________________________ _____________________________________________4________,_________,_________, I ( 1 1 IgnIIPIDle137598laspartate transcarbamylase [Lactobacillus1 106 6 782468B0 leichmannii/ 86 I

I

,________,____,_______,_______a________________,__,.___________________________ ______________________________________________,________1_________f_________, ( 1 1 1 IgnIIPIDle339862(putative acylneuraminate lyase [Clostridium( 10? 1 1 273 tertium/ 86 ________,____,_______,_______,________________,________________________________ ____________________________________________r________1_________,_________1 1 I 110432I IgnIIPIDle228283(DNA-dependent RNA polymerase [Streptococcus1 ,b I11 7 6710 pyogenes) 86 I

________,____,_______,_______,________________,________________________________ ___________________________..________________~________,_________,_________, I 1 1 ( 19i11661193Ipolipoprotein diacylglycerol transferase1 I31 9 57044892 (Streptococcus mutans) 86 ~ 1 8i3 I

,________, ,_______a_______,________________,_____________________________________________ _______________________ ____ ___ __,________,_________1_________, 1 1 1 1 1 ( 134 7 643079B0i12388637 l cerol kinase (Enteroc f li i 9 g I

y 86 occus I
aeca 73 s 1 .________,____,_______,_______1________________1____________ ___ ____ ____________________________________,________,_________,_______--, I 111I 1 19i11591731Imelvalonate kinase [Methanococcus jannaschii/1 (________,____,_______,_______,________________,_______________________________ _____________________________________________,________,_________,_________, 1 1 I 1 19i12160707Idipeptidase [Lactococcus lactisl I

________,____f_______,_______,________________ ,____________________________________________________________________________,_ _______,_________,_________~

I I 1 1 (9i11857246(6-phosphogluconate dehydrogenase [Lactococcus1 l54 I 2 I435 lactis/ 86 ( I

161 | | | | gi|47529 | Unknown [Streptococcus salivarius] | | 66 | 126D

4________4____4_______4_______4_______________ _4____________________________________________________________________________;
184 | 1 | 2 | 1483 | gi|642667 | NADP-dependent glyceraldehyde-3-phosphate dehydrogenase [Streptococcus mutans] | 86 | 73 | 1482

210 | 8 | 3659 | 6571 | gi|153661 | translational initiation factor IF2 [Enterococcus faecium] | 86 | 76 | 2913
250 | 1 | 2 | 187 | gi|1573551 | asparagine synthetase A (asnA) [Haemophilus influenzae] | 86 | 68 | 186
36 | | | | gi|2149909 | cell division protein [Enterococcus faecalis] | | 73 | 1266
38 | 4 | 2475 | 3587 | gi|2058545 | putative ABC transporter subunit ComYB [Streptococcus gordonii] | 85 | 72 | 1113
38 | 5 | 3577 | 3915 | gi|2058546 | ComYC [Streptococcus gordonii] | 85 | 80 | 339

57 | | | | gnl|PID|d101316 | YqfJ [Bacillus subtilis] | | 72 | 993
82 | 5 | 4915 | 6054 | gi|153746 | mannitol-phosphate dehydrogenase [Streptococcus mutans] | 85 | 68 | 1140

87 | 15 | 14690 | 15793 | gi|143371 | phosphoribosyl aminoimidazole synthetase (PUR-M) [Bacillus subtilis] | | 69 | 1104

87 | 2 | 1417 | 2388 | gi|1184967 | ScrR [Streptococcus mutans] | 85 | 69 | 972

108 | | | | gi|153566 | ORF (19K protein) [Enterococcus faecalis] | | 67 | 489

127 | | | | gi|1044989 | ribosomal protein S13 [Bacillus subtilis] | | 72 | 381

4________4____4_______,_______4_______________ _4___________________________________________________________________-________4________ 4_________4_________4 1Z8 ~ ~ ~ ~ (tetrah 85 ~ 7 3 15342409 i~1685110drofolate deh dro enase/c cloh drolase (Str to u hil th ) g y ~ 1 ~ 876 o g ~
y y y ep cocc s ermop us 4________4____4_______4_______4________________4_______________________________ _____________________________________________i________ 4_________4_________4 1I7 ~ ~ ~ ~gnI~PID~d100347~Na4 -ATPase alpha subunit (Enterococcus hirae)~ ~ 74 ~ 1806 4________4____4_______4_______4________________4____-_______________________________________________________________________4_______ _ 4_________4_________4 170 ~ 4 ~ ~gnl~FID~d102006~fA80014881 FUNCTION UNXNOWN. SIMILAR PRODUCT

2 2622709 IN E.COLI. H. INFLUEN2AE AND

i ~ i NEISSERIA MENINGITIDIS. [Bacillus subtilisj i 4________4__.._4_______4_______4________________4._____________________________ _______________________________________________4________4_________4_________;

187 | | | | gi|727436 | putative 20-kDa protein [Lactococcus lactis] | | 65 | 627

233 | | | | gi|1163116 | ORF-5 [Streptococcus pneumoniae] | | 67 | 1146
| | | | gi|2293155 | (AF008220) YtiA [Bacillus subtilis] | | 61 | 294
240 | 1 | 309 | 1931 | gi|143597 | CTP synthetase [Bacillus subtilis] | 85 | 70 | 1623
6 | | | | gi|508979 | GTP-binding protein [Bacillus subtilis] | | 72 | 1323

4________4____4_______4_______4________________4__________-_________________________________________________________________4________ 4_________4_________4 ~ ( ~ ~gnI~PID~e339862putative acylneuraminate lyase [Clostridium ~
~ 70 ~ 933 4________4 43753443 4_______________tertium) 4____4_______4_______ _4____________________________________________________ ___ ___ _ __4________ 4_________4_________4 __ _____________ 14 ~ ~ ~ (gi~520753DNA topoisomerase I [Bacillus subtilisj ~
( 69 4 2031 , ________4____4_______4_______4________________4________________._______________ _____________________________________________4________ 4_________4_________+

19 | | | | gi|2352484 | (AF005098) RNaseH II [Lactococcus lactis] | | 68 | 801
| 17 | | 19687 | gnl|PID|d100584 | cell division protein [Bacillus subtilis] | | 71 | 1968
22 | 28 | 21723 | 20884 | gi|299163 | alanine dehydrogenase [Bacillus subtilis] | | 68 | 840

30 | 10 | | 6792 | gnl|PID|d100296 | fructokinase [Streptococcus mutans] | | |

~

~

,________y____i_______y.______y________________,_________________________ _______,________y_________,_________, ___________________________________________ ~O
~

33 | | | | gi|147194 | phM protein [Escherichia coli] | | |
W

~

~

y________,____,_______,_______y________________y_______________________________ _____________________________________________,________;_________,_________, 36 Q22Q21551(20772~gi~310631ATP binding protein [Streptococcus gordonii]~

~

48 | 4 | 2837 | 2505 | gi|882609 | 6-phospho-beta-glucosidase [Escherichia coli] | 84 | |

58 | | | | gi|450849 | amylase [Streptococcus bovis] | | |

~

~

59 | 10 | | | gi|951053 | ORF10, putative [Streptococcus pneumoniae] | | |

~

~
62 | | | | gi|806487 | ORF211; putative [Lactococcus lactis] | | |

~

~

65 | 17 | | | gi|1044980 | ribosomal protein L18 [Bacillus subtilis] | | |

~

~

65 | 21 | 9507 | 10397 | gi|44073 | SecY protein [Lactococcus lactis] | 84 | 68 | 891

106 | 4 | 5474 | 2262 | gnl|PID|e199387 | carbamoyl-phosphate synthase [Lactobacillus plantarum] | 84 | |

~

, ________,____,_______,_______,_________-______,___ __,________,_________,_________, .__ ___.___ __ ' ' 159 ~ ' ~ jgi~806487~ORF211: putative [Lactococcus lactic) ~
J

~

~

~

(________~____,_______,_______,________________y_______________________________ _______________________________________ ______y________ _ N
_____ ___ _ _ __y__ _ __, o 163 ~ ~ ( (gi~2293164~(AF008220) SAH synthase [Bacillus subtilisl~

~

( ________,____,_______,_______,________________,____________________ _ __ __ _ __ ______,________,_________,_________) _ Ch ___ __ ____________________________________ 192 ~ ~ ~ ~gi~4950d6~tripeptidase (Lactococcus lactic] ~

~

~

348 | 1 | 671 | 6 | gi|1787753 | (AE000245) f346; 79 pct identical to 336 amino acids of ADH1_ZYMMO SW: P20368 but has 10 additional N-ter residues [Escherichia coli] | 84 | |

________,____,_______,_______,________________,________________________________ _,.__________________________________________,________~_________,_________, 3 ~ ~ ~ ~gi~113766~IthrSvl (EC 6.1.I.3) [Bacillus subtilis] ~
N

~

( ~

9 | 6 | 3893 | 3417 | gnl|PID|d100576 | single strand DNA binding protein [Bacillus subtilis] | 83 | |

~

17 | 15 | | | gi|520738 | comA protein [Streptococcus pneumoniae] | | |

~

~

t________,____,_______y_______f________________y_______________________________ _______________________________________ ______~________,_________,_________y 20 Q12A 14144~gnl~PID~d100583unknown (Bacillus subtilis] ~

~

~

23 | 4 | 3358 | 2606 | gi|1788294 | (AE000290) o238; This 238 aa orf is 40 pct identical (5 gaps) to 231 residues of an approx. 248 aa protein YEBC_ECOLI SW: P24237 [Escherichia coli] | 83 | |
28 | 6 | 3304 | 3005 | gi|1573659 | H. influenzae predicted coding region HI0659 [Haemophilus influenzae] | 83 | |
35 | 7 | 5108 | 3867 | gi|311707 | hypothetical nucleotide binding protein [Acholeplasma laidlawii] | | |

_y____,_______,______ ,______________________________________________________________________~

( ______4________~_________,_________, 55 (191793217528~gi~537085~ORF_f141 [Escherichia cold ~

~

~

________ , y y ' .

t _ ______________ _______________________________________________________________________________ _____________ v ________ ___ _________,_________, 55 Q20A A ~gi~496558~orfx [Bacillus subtilis) ~
v ~

~

( ___ ,_______,________________,_____________________________________________________ _______________________y________y_________,_________y 65 ~ ~ ~ ~gi~1165308L22 [Bacillus subtilis) ~
U

68 | 6 | 6877 | 6683 | gi|1213494 | immunoglobulin A1 protease [Streptococcus pneumoniae] | 83 | |

___ ~____,_______ ,_______,_______________ _i____________________________________________________________________________, _________________,_________, 00 _____ 87 ~15A511214771~gnl~PID~e323522 putative rpo2 protein (Bacillus subtilis]~ ~

( a________~____a_______a_____ __a_______________ _a___________________________________________________--________________-___-_~______ __~______-__~--___-,__~

96 | 12 | 8963 | 9631 | gi|47394 | 5-oxoprolyl-peptidase [Streptococcus pyogenes] | 83 | 73 |

________~____a_______a_______a_______________ _~____________________________________________________________________________~
______ __~_________i_________~

98 | | | | gi|1183885 | glutamine-binding subunit [Bacillus subtilis] | | |

~

| | | | gi|310630 | zinc metalloprotease [Streptococcus gordonii] | | |

~

127 | 7 | 2998 | 4347 | gi|1500567 | M. jannaschii predicted coding region MJ1665 [Methanococcus jannaschii] | 83 | 72 |
137 | | | | gi|472918 | v-type Na-ATPase [Enterococcus hirae] | | |

~

________~____~_______,_______a_______________ _~____________________________________________________________________________~
160 | 6 | 3466 | 4356 | gi|1773265 | ATPase, gamma subunit [Streptococcus mutans] | 83 | 67 |

a________a____~_______r_______~_______________ _~____________________________________________________________________________f ________t_________i_________~

214 | | | | gi|663279 | transposase [Streptococcus pneumoniae] | | |

~

~________a____a-______a_______a_______________ _a____________________________________________________________________________a ________t_________a_________~ iy ( ~ ~ ~ ~gi~142154 ~thioredoxin [Synechococcus PCC6301] ~
~

I
~

__ , o ____ __________________ _ __________________________..___________________._____________________~
N
a ~ v __.____________ a ______,_ a ____.___ _________v_________~

303 | 1 | 3 | 1049 | gi|40046 | phosphoglucose isomerase A (AA 1-449) [Bacillus stearothermophilus] | 83 | 67 |

~

303 | 2 | 1155 | 1931 | gi|289282 | glutamyl-tRNA synthetase [Bacillus subtilis] | 83 | 67 |

~

____________ _~____________________________________________________________________________~
________~_________~_________~ N

6 ~17A537014318(gi~633147 ribose-phosphate pyrophosphokinase (Bacillus( ~ o caldolyticus] 82 64 ~

~

7 | | | | gi|143648 | ribosomal protein L28 [Bacillus subtilis] | | |
~

~

________~____~_______r_______;_______________ _i____________________________________________________________________________~
9 | | | | gi|385178 | unknown [Bacillus subtilis] | | |
~

~

________,____a_______a_______a_______________ _a____________________________________________________________________________, _______ _a_________t_________~

9 | | | | gnl|PID|d100576 | ribosomal protein S6 [Bacillus subtilis] | | |

~

12 | 6 | 4688 | 3942 | gnl|PID|d100571 | unknown [Bacillus subtilis] | 82 | 68 |

a________a____r_______~_______a_______________ _a____________________________________________________________________________, ______ __,_________~_________, f f17'13422A4837~gi~520754 putative [Bacillus subtilis]
~ i ~

________t____a_______a_______a_______________ _~____________________________________________________________________________a ________,_________a_________~

22 ~18,148971S658,gnl~PI0~d101929 (uridine monophosphate kinase (Synechocystis~ ~
sp,] 82 62 ~
i62 __ ,_______a_______a_______________ _y____________________________________________________________________________~
________a_________,_______ 33 Q16A 10641~gnl~PID~d101190 ~OAF4 [Streptococcus mutans]
~ ( ~

________a____a_-_____~_______~_______________ _~__.._________________________________________________________________________ ~________~_________a_________~

35 | 9 | 7400 | 6255 | gi|1881543 | UDP-N-acetylglucosamine-2-epimerase [Streptococcus pneumoniae] | 82 | 68 |

40 | 10 | 8003 | 7533 | gi|1173519 | riboflavin synthase beta subunit [Actinobacillus pleuropneumoniae] | 82 | 68 |

48 | 32 | 23159 | 23437 | gi|1930092 | outer membrane protein [Campylobacter jejuni] | 82 | 61 |

~________a____a_______a_______a_______________ _~_______________________________________________________..____________________ t_______ _~-_______~_________;

52 (14A3833A4765~gi~192521 ~deoxyribodipyrimidine photolyase [Bacillus~ ~
subtilis] 82 61 ( ________a____a_______,_______~_______________ _~____________________________________________________________________________, _______ _a_________t_________a 60 ~ ~ ~ ~gnl~PID'd102221 1A8001610) uvrA [Deinococcus radiadurans]~ ~
4 4737 l849 82 66 ~

J

62 | 4 | 2131 | 1457 | gi|2246749 | (AF009622) thioredoxin reductase [Listeria monocytogenes] | 82 | 63 |

________,____,_______a_______~_______________ _,____________________________________________________________________________, _______ _,_________y_______ 71 ~11A 17518~gnl~PID~e322063 ~ss-1,4-galactosyltransferase (Streptococcus~ ~
6586 pneumoniae) 82 60 ~

~________~____~_______,_______~_______________ _,____________________________________________________________________________~
_______ _a_________a_________f 73 Q13~ ~ ~gnl~PiD~d100586 unknown (Bacillus subtilis]

9222 7837 82 ~

~

________a____a_______a_______~_______________ _a____________________________________________________________________________a _______ _f_________i_________~

74 | 1 | 1 | 3771 | gnl|PID|d101199 | alkaline amylopullulanase [Bacillus sp.] | 82 | | 3771

~

83 | 9 | 3696 | 3983 | gnl|PID|e305362 | unnamed protein product [Streptococcus thermophilus] | 82 | | 288

~

86 | 11 | 10776 | 9394 | gi|683583 | 5-enolpyruvylshikimate-3-phosphate synthase [Lactococcus lactis] | 82 | | 1383

89 | 12 | 8295 | 9752 | gi|40025 | homologous to E.coli 50K [Bacillus subtilis] | 82 | | 1458

~

115 | 9 | 8812 | 10347 | gnl|PID|d102090 | (AB003927) phospho-beta-galactosidase 1 [Lactobacillus gasseri] | 82 | | 1536

118 | 1 | 1 | 1332 | gnl|PID|d100579 | seryl-tRNA synthetase [Bacillus subtilis] | 82 | | 1332

~

151 | 3 | 4657 | 6246 | pir|S06097|5060 | type I site-specific deoxyribonuclease (EC 3.1.21.3) CfrA chain S - Citrobacter freundii | 82 | 66 | 1590
173 | 6 | 4183 | 3503 | gi|2313836 | (AE000584) conserved hypothetical protein [Helicobacter pylori] | 82 | |

~

--_--__- _ __a__ _________,________a__-______f_ ________a ( Q12~ ~
___a___________________________________________________________________ 177 S481 7442~

nl~PID~d101999 g (( ~ 82 1962 1 NcrB (ESCherichia colij ~

~

193 | 2 | 178 | 576 | pir|S08564|R3BS | ribosomal protein S9 - Bacillus stearothermophilus | 82 | | 399

~

245 | 2 | 258 | 845 | gi|146402 | EcoA type I restriction-modification enzyme S subunit [Escherichia coli] | 82 | |

~

9 | 5 | 3400 | 3146 | gnl|PID|d100576 | ribosomal protein S18 [Bacillus subtilis] | 81 | | 255

~

16 | 7 | 7484 | 8413 | gi|1100074 | tryptophanyl-tRNA synthetase [Clostridium longisporum] | 81 | 70 | 930
| | | 0308 | gnl|PID|d100583 | transcription-repair coupling factor [Bacillus subtilis] | 81 | 63 | 3513

a________a-___a_______a_______a________________a__________________________________________ _______________________.._ _________a________,_________a- ________a 38 ~ ~ ~ ~gi~20SB543 putative DNA binding protein [Streptococcus( 81 375 2 I232 1606 gordoniij ~

~

_ ________a____a_______a______________________ __,________a_________a_ ________a __ y __a________________a________________ 4S ~ ~ ~ ~gi~460259 ~enolase [Bacillus 2 3061 17S1 btili j su ~ 81 1311 s ~

~

,________,____,_______,_______a________________ a____________________________________________________________________________,_ _______a_________,_ ________a 46 ~ ~ ~ ~gi~431231 ~uracil permease (Bacillus caldolyticusj( 1 2 1267 ~

~

________,____,_______,_______a________________a___________________ __,________a_________a_ ________a __ v _ __ 48 ~ ~ ~ ~gnl~PID~d100453 ~Hannosephosphate Isomerase (Streptococcus~ 81 1014 3 2453 1440 mutansj ~

~

________,____a_______,_______a________________a________________________________ ______________________ __ _________a________a_________a_ ________a _ __________ S9 ~ ~ ~ ~gi~1S47S2 transport protein [Agrobacterium tumefaciens)~ 81 771 2 1106 336 ~

~

a________a____a______-a_______a________________a_____________________________________________________ ______________ _________a________a_________a_ ________a 65 Q22A A ~gi~44073 ~SecY protein [Lactococcus lactisj ~

0306 0821 ~

~

,________,____a_______a_______a________________a_______________________________ ____________________________________ _________a________a_________a_ ________i 89 ~ ~ ~ ~gi~SS6886 ~serine hydroxymethyltransferase [Bacillus~ 81 1272 4 3874 2603 subtilis) ~
~

~

b ~________,____,_______a_______a________________ a____________________________________________________________________________;_ _______a_________a_ ________a 99 Q16A A ~gi~2313526 ~(AE000557) H, tylori predicted coding 9126 8929 region HP0411 [Helicobacter pylori] ~

~ 81 ~ 7S ~

________a____a_______a_______a________________ a____________________________________________________________________________,_ ____-__a_________a_ ________a 106 ~ ~ ~ ~gnl~PID~e199384 ~pyrR (Lactobacillus plantarumJ
~ 81 552 CJ~
7 8373 7822 ~
~

~

________,____,_______a_______,________________ ,____________________________________________________________._______________ ~O
_ _______a_________,_________a 108 ~ ~ ~ ~gi~1469939 group B oligopeptidase PepH
(Streptococcus~ 81 1824 6 S054 6877 agalactiaej ( ~

a________a____a_______a_______a________________ a____________________________________________________________________________a_ _______,_________a_________, 113 Q15A 18283~pir~S09411~5094 ~spoIIIE protein - Bacillus subtilis ~ 81 238S
5899 ~

~

a________a____a___ a ___ _ _______a________________ a____________________________________________________________________________a_ _______,_________,__ _______a pp l28 ~ ~ ~ ~gi~1685111 (orf1091 [Stre S 33S9 3639 tococcus therm hil ) p ~ 81 276 a_ op ~

us 69 ~

S. pneumoniae - Putative coding regions of novel proteins similar to known proteins
| Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt) |
|  |  |  |  | gi|304896 | EcoE type I restriction-modification enzyme R subunit [Escherichia coli] | 81 | 59 | 2382 |
| 159 | 11 | 6722 | 7837 | gi|2239288 | GMP synthetase [Bacillus subtilis] | 81 |  |  |
| 170 | 1 | 739 | 458 | gnl|PID|d102006 | (AB001488) FUNCTION UNKNOWN. [Bacillus subtilis] | 81 | 55 | 282 |
| 191 | 2 | 1759 | 893 | gi|149522 | tryptophan synthase alpha subunit [Lactococcus lactis] | 81 | 65 | 867 |
| 214 | 3 | 2290 | 1994 | gi|157587 | reverse transcriptase endonuclease [Drosophila virilis] | 81 | 93 | 297 |
| 217 | 4 | 4415 | 4008 | gi|466473 | cellobiose phosphotransferase enzyme II'' [Bacillus stearothermophilus] | 81 |  |  |
| 262 | 2 | 569 | 868 | gi|153675 | tagatose 6-P kinase [Streptococcus mutans] | 81 |  |  |
| 299 | 1 | 663 | 4 | gnl|PID|e301154 | StySKI methylase [Salmonella enterica] | 81 | 60 | 660 |
| 366 | 2 | 376 | 83 | gi|149521 | tryptophan synthase beta subunit [Lactococcus lactis] | 81 | 65 | 294 |

| 12 | 10 | 8766 | 9242 | gi|1216490 | DNA/pantothenate metabolism flavoprotein [Streptococcus mutans] | 80 |  |  |
| 17 | 11 | 6050 | 5748 | gnl|PID|e305362 | unnamed protein product [Streptococcus thermophilus] | 80 | 67 | 303 |
| 17 | 16 | 8455 | 9066 | gi|703126 | leucocin A translocator [Leuconostoc gelidum] | 80 | 59 | 612 |
| 18 | 3 | 2440 | 1613 | gi|1591672 | phosphate transport system ATP-binding protein [Methanococcus jannaschii] | 80 |  |  |
| 27 | 3 | 4248 | 1579 | gi|452309 | valyl-tRNA synthetase [Bacillus subtilis] | 80 | 69 | 2670 |
| 28 | 7 | 3671 | 3288 | gi|1573660 | H. influenzae predicted coding region HI0660 [Haemophilus influenzae] | 80 | 63 | 384 |
| 32 | 2 | 902 | 1933 | gnl|PID|e264499 | dihydroorotate dehydrogenase B [Lactococcus lactis] | 80 | 66 | 1032 |
| 39 | 1 | 1 | 1266 | gnl|PID|e234078 | hom [Lactococcus lactis] | 80 | 63 | 1266 |
| 52 |  | 4363 | 3593 | gi|1183884 | ATP-binding subunit [Bacillus subtilis] | 80 |  |  |
| 54 | 5 | 4550 | 4744 | gi|2198820 | (AF004225) Cux/CDP(1B1); Cux/CDP homeoprotein [Mus musculus] | 80 | 60 | 195 |
| 59 | 11 | 7109 | 7486 | gi|951052 | ORF9, putative [Streptococcus pneumoniae] | 80 |  |  |
| 65 | 3 | 1230 | 1550 | pir|A02815|R5BS | ribosomal protein L23 - Bacillus stearothermophilus | 80 | 69 | 321 |
| 65 | 12 | 5174 | 5503 | pir|A02819|R5BS | ribosomal protein L24 - Bacillus stearothermophilus | 80 |  |  |
| 66 | 9 | 9884 | 10687 | gi|2313836 | (AE000584) conserved hypothetical protein [Helicobacter pylori] | 80 | 66 |  |
| 82 |  |  |  | gi|622991 | mannitol transport protein [Bacillus stearothermophilus] | 80 | 65 | 1791 |
| 85 | 1 | 950 | 630 | gi|528995 | polyketide synthase [Bacillus subtilis] | 80 |  |  |
| 89 | 8 | 6870 | 5779 | gi|857776 | peptide chain release factor 1 [Bacillus subtilis] | 80 |  |  |
| 93 | 12 | 8718 | 7438 | gnl|PID|d101959 | hypothetical protein [Synechocystis sp.] | 80 | 60 | 1281 |

S. pneumoniae - Putative coding regions of novel proteins similar to known proteins
| Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt) |
| 106 | 5 | 6854 | 5751 | gnl|PID|e199386 | glutaminase of carbamoyl-phosphate synthase [Lactobacillus plantarum] | 80 | 65 | 1104 |
| 109 | 2 | 2160 | 1450 | gi|40056 | phoP gene product [Bacillus subtilis] | 80 |  | 711 |
| 124 | 9 | 4246 | 3953 | gnl|PID|d102254 | 30S ribosomal protein S16 [Bacillus subtilis] | 80 | 65 | 294 |
| 128 | 8 | 5148 | 6428 | gi|2281308 | phosphopentomutase [Lactococcus lactis cremoris] | 80 |  | 1281 |
| 137 | 19 | 12665 | 11376 | gi|359109 | NADP-dependent glutamate dehydrogenase [Giardia intestinalis] | 80 | 68 | 1290 |
| 140 | 19 | 19699 | 19457 | gi|517210 | putative transposase [Streptococcus pyogenes] | 80 | 70 | 243 |
| 158 | 2 | 2474 | 984 | gi|1877423 | galactose-1-P-uridyl transferase [Streptococcus mutans] | 80 |  | 1491 |
| 171 | 10 | 7474 | 7728 | gi|397800 | cyclophilin C-associated protein [Mus musculus] | 80 | 60 | 255 |
|  |  |  |  | gi|149395 | lacC [Lactococcus lactis] | 80 |  | 618 |
| 313 | 1 | 27 | 539 | gi|143467 | ribosomal protein S4 [Bacillus subtilis] | 80 |  | 513 |
| 329 | 2 | 1652 | 858 | gi|533080 | RecF protein [Streptococcus pyogenes] | 80 | 63 | 795 |

| 371 | 1 | 2 | 958 | gi|442360 | ClpC adenosine triphosphatase [Bacillus subtilis] | 80 | 58 | 957 |
| 8 | 7 | 4312 | 5580 | gi|149435 | putative [Lactococcus lactis] | 79 | 64 | 1269 |
| 23 | 1 | 1175 | 135 | gi|1542975 | AbcB [Thermoanaerobacterium thermosulfurigenes] | 79 | 61 | 1041 |
| 33 | 14 | 9244 | 8201 | gnl|PID|e253891 | UDP-glucose 4-epimerase [Bacillus subtilis] | 79 | 62 | 1044 |
| 36 | 3 | 1242 | 2633 | gnl|PID|e324218 | ftsA [Enterococcus hirae] | 79 | 58 | 1392 |
| 38 | 13 | 7155 | 8378 | gi|405134 | acetate kinase [Bacillus subtilis] | 79 |  | 1224 |
| 55 | 7 | 9011 | 8229 | gi|1146234 | dihydrodipicolinate reductase [Bacillus subtilis] | 79 | 56 | 783 |
| 65 | 19 | 8661 | 8915 | gi|2078380 | ribosomal protein L30 [Staphylococcus aureus] | 79 | 68 | 255 |
| 69 | 4 | 3678 | 2128 | gnl|PID|e311452 | unknown [Bacillus subtilis] | 79 | 64 | 1551 |
|  | 9 | 7881 | 7279 | gi|677850 | hypothetical protein [Staphylococcus aureus] | 79 | 69 | 603 |
| 72 | 10 | 8491 | 9781 | gnl|PID|d101091 | hypothetical protein [Synechocystis sp.] | 79 | 62 | 1293 |
| 80 | 3 | 2906 | 7300 | gi|143342 | polymerase III [Bacillus subtilis] | 79 | 65 | 4395 |
| 82 | 14 | 13326 | 15689 | gnl|PID|e255093 | hypothetical protein [Bacillus subtilis] | 79 | 65 | 2364 |
| 86 | 13 | 12237 | 11118 | gi|683582 | prephenate dehydrogenase [Lactococcus lactis] | 79 | 58 | 1116 |
| 92 | 3 | 910 | 1734 | gi|537286 | triosephosphate isomerase [Lactococcus lactis] | 79 | 65 | 795 |
| 98 | 6 | 4023 | 4742 | gnl|PID|d100262 | Live protein [Salmonella typhimurium] | 79 | 63 | 720 |

S. pneumoniae - Putative coding regions of novel proteins similar to known proteins
| Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt) |

| 99 | 12 | 16315 | 14150 | gi|153736 | a-galactosidase [Streptococcus mutans] | 79 | 64 | 2166 |
| 107 | 7 | 5684 | 6406 | gi|460080 | D-alanine:D-alanine ligase-related protein [Enterococcus faecalis] | 79 |  | 723 |
| 113 |  |  |  | gi|466882 | pps1; B1496_C2_189 [Mycobacterium leprae] |  |  |  |
| 151 | 10 | 13424 | 12213 | gi|450686 | 3-phosphoglycerate kinase [Thermotoga maritima] | 79 |  | 1212 |
| 162 | 2 | 1158 | 3017 | gi|506700 | CapD [Staphylococcus aureus] | 79 |  | 1860 |
| 177 |  | 2876 | 3052 | gi|912423 | putative [Lactococcus lactis] | 79 |  | 177 |
| 177 |  |  |  | gi|149429 | putative [Lactococcus lactis] |  |  |  |
| 187 | 3 | 2907 | 2728 | gnl|PID|d102002 | (AB001488) FUNCTION UNKNOWN. [Bacillus subtilis] | 79 |  | 180 |
| 189 | 7 | 3589 | 4350 | gnl|PID|e183449 | putative ATP-binding protein of ABC-type [Bacillus subtilis] | 79 |  | 762 |
| 191 | 5 | 4249 | 3449 | gi|149519 | indoleglycerol phosphate synthase [Lactococcus lactis] | 79 |  | 801 |
| 211 | 3 | 1805 | 2737 | gi|147404 | mannose permease subunit II-M-Man [Escherichia coli] | 79 | 57 | 933 |
| 212 | 3 | 3863 | 3621 | gnl|PID|e209004 | glutaredoxin-like protein [Lactococcus lactis] | 79 | 58 | 243 |
| 215 | 1 | 987 | 715 | gi|2293242 | (AF008220) arginine succinate synthase [Bacillus subtilis] | 79 |  | 273 |
| 323 | 2 | 530 | 781 | gi|897795 | 30S ribosomal protein [Pediococcus acidilactici] | 79 | 67 | 252 |

| 380 | 1 | 694 | 2 | gi|1184680 | polynucleotide phosphorylase [Bacillus subtilis] | 79 |  | 693 |
| 384 | 2 | 655 | 239 | gi|143328 | phoP protein (put.); putative [Bacillus subtilis] | 79 |  | 417 |
| 6 | 3 | 2820 | 4091 | gi|853767 | UDP-N-acetylglucosamine 1-carboxyvinyltransferase [Bacillus subtilis] | 78 | 62 | 1272 |
| 8 | 1 | 50 | 1786 | gi|149432 | putative [Lactococcus lactis] | 78 |  | 1737 |
| 9 | 1 | 351 | 124 | gi|897793 | y98 gene product [Pediococcus acidilactici] | 78 |  | 228 |
|  | 8 | 7364 | 8314 | gnl|PID|d100585 | cysteine synthetase A [Bacillus subtilis] | 78 |  | 951 |
|  | 10 | 9738 | 10310 | gnl|PID|d100583 | stage V sporulation [Bacillus subtilis] | 78 |  | 573 |
| 20 | 16 | 17165 | 17713 | gi|49105 | hypoxanthine phosphoribosyltransferase [Lactococcus lactis] | 78 |  | 549 |
| 22 | 22 | 17388 | 18416 | gnl|PID|d101315 | YqfE [Bacillus subtilis] | 78 |  | 1029 |
| 22 | 27 | 20971 | 20612 | gi|299163 | alanine dehydrogenase [Bacillus subtilis] | 78 |  | 360 |
| 34 | 8 | 7407 | 7105 | gi|41015 | aspartate-tRNA ligase [Escherichia coli] | 78 | 55 | 303 |
| 35 | 8 | 6257 | 5196 | gi|1657644 | Cap8E [Staphylococcus aureus] | 78 |  | 1062 |

TABLE 2
S. pneumoniae - Putative coding regions of novel proteins similar to known proteins
| Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt) |
| 40 | 11 | 9287 | 8001 | gi|1173518 | GTP cyclohydrase II/3,4-dihydroxy-2-butanone-4-phosphate synthase [Actinobacillus pleuropneumoniae] | 78 | 58 | 1287 |
| 48 | 31 | 22422 | 23183 | gi|2314330 | (AE000623) glutamine ABC transporter, ATP-binding protein (glnQ) [Helicobacter pylori] | 78 | 58 | 762 |

| 52 |  |  |  | gi|1183887 | integral membrane protein [Bacillus subtilis] |  |  |  |
| 55 | 14 |  | 12712 | gnl|PID|d102026 | (AB002150) YbDP [Bacillus subtilis] |  |  | 894 |
| 55 | 17 |  | 15612 | gnl|PID|e313027 | hypothetical protein [Bacillus subtilis] |  |  |  |
| 71 | 14 | 19756 | 19598 | gi|179764 | calcium channel alpha-1D subunit [Homo sapiens] | 78 |  |  |
| 74 | 11 | 15031 | 14018 | gi|1573279 | Holliday junction DNA helicase (ruvB) [Haemophilus influenzae] | 78 | 57 | 1014 |
| 75 | 9 | 6623 | 7972 | gi|1877423 | galactose-1-P-uridyl transferase [Streptococcus mutans] | 78 |  |  |
| 81 | 12 | 12125 | 13906 | gi|1573607 | L-fucose isomerase (fucI) [Haemophilus influenzae] | 78 |  |  |
| 82 |  |  |  | gi|153744 | ORF X; putative [Streptococcus mutans] |  |  |  |

| 87 | 18 | 16926 | 18500 | gi|143373 | phosphoribosyl aminoimidazole carboxy formyl formyltransferase/inosine monophosphate cyclohydrolase (PUR-H(J)) [Bacillus subtilis] | 78 |  |  |
| 83 | 20 | 20212 | 20775 | gi|143364 | phosphoribosyl aminoimidazole carboxylase I (PUR-E) [Bacillus subtilis] | 78 |  |  |
| 92 |  |  |  | gnl|PID|d101190 | ORF2 [Streptococcus mutans] |  |  |  |
| 98 |  |  |  | gi|2331287 | (AF013188) release factor 2 [Bacillus subtilis] |  |  |  |
| 113 |  |  |  | gi|580914 | dnaZX [Bacillus subtilis] |  |  |  |
| 127 | 4 | 1133 | 2071 | gi|142463 | RNA polymerase alpha-core-subunit [Bacillus subtilis] | 78 |  |  |
| 132 | 1 |  |  | gi|1561763 | pullulanase [Bacteroides thetaiotaomicron] | 78 | 58 |  |
| 135 | 4 | 2698 | 3537 | gi|1788036 | (AE000269) NH3-dependent NAD synthetase [Escherichia coli] | 78 |  |  |
| 140 | 24 | 26853 | 25423 | gi|1100077 | phospho-beta-glucosidase [Clostridium longisporum] |  |  |  |
| 150 | 5 | 4690 | 4514 | gi|149964 | amino peptidase [Lactococcus lactis] | 78 |  | 177 |
| 152 | 1 | 1 | 795 | gi|639915 | NADH dehydrogenase subunit [Thunbergia alata] | 78 |  |  |
| 162 |  |  |  | gnl|PID|e323528 | putative YhaP protein [Bacillus subtilis] |  |  |  |
| 181 | 10 | 8651 | 7947 | gi|149402 | lactose repressor (lacR; alt.) [Lactococcus lactis] | 78 |  |  |
| 200 |  |  |  | gnl|PID|d100172 | invertase [Zymomonas mobilis] |  |  |  |
| 203 | 3 |  |  | gi|1174237 | CycK [Pseudomonas fluorescens] | 78 |  |  |

S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)
210 | 9 | 6789 | 7172 | gi|580902 | ORF6 gene product [Bacillus subtilis] | 78 | 42 | 384
214 | 6 | 3810 | 2797 | gnl|PID|d102049 | P. haemolytica o-sialoglycoprotein endopeptidase; P36175 (660) transmembrane [Bacillus subtilis] | 78 | 60 | 1019
214 | 13 | 6322 | 8163 | gi|1377831 | unknown [Bacillus subtilis] | 78 | 62 | 1842
217 | 1 | 9 | 2717 | gi|488430 | alcohol dehydrogenase 2 [Entamoeba histolytica] | 78 | 64 | 2709
222 | 3 | 2316 | 3098 | gi|1573047 | spore germination and vegetative growth protein (gerC2) [Haemophilus influenzae] | 78 | 65 | 783
268 | 1 | 742 | 8 | gi|517210 | putative transposase [Streptococcus pyogenes] | 78 | 65 | 735
276 | 1 | 223 | 753 | gnl|PID|d100306 | ribosomal protein L1 [Bacillus subtilis] | 78 | 65 | 531
312 | 3 | 1567 | 1079 | gi|289261 | comE ORF2 [Bacillus subtilis] | 78 | 54 |

339 | 1 | 117 | 794 | gi|1916729 | CadD [Staphylococcus aureus] | 78 | 53 | 678
342 | 2 | 762 | 265 | gi|1842439 | phosphatidylglycerophosphate synthase [Bacillus subtilis] | 78 | 59 | 498
383 | 1 | 737 | 3 | gi|1184680 | polynucleotide phosphorylase [Bacillus subtilis] | 78 | 64 | 735
7 | 15 | 11923 | 11018 | gi|1399855 | carboxyltransferase beta subunit [Synechococcus PCC7942] | 77 | 63 | 906
8 | 2 | 1698 | 2255 | gi|149433 | putative [Lactococcus lactis] | 77 | 59 | 558
17 | 14 | 6948 | 7550 | gi|520738 | comA protein [Streptococcus pneumoniae] | 77 | 60 | 601
30 | 12 | 9761 | 8967 | gi|1000451 | TreP [Bacillus subtilis] | 77 | 43 | 795
36 | 14 | 11421 | 12131 | gi|1573766 | phosphoglyceromutase (gpmA) [Haemophilus influenzae] | 77 | 64 | 711
55 | 3 | 3836 | 4096 | gi|1?08640 | YeaB [Bacillus subtilis] | 77 | 55 | 261
61 | 8 | 8377 | 8054 | gi|1890649 | multidrug resistance protein LmrA [Lactococcus lactis] | 77 | 51 | 324

65 | 2 | 607 | 1254 | gi|40103 | ribosomal protein L4 [Bacillus stearothermophilus] | 77 | 63 | 648
68 | 8 | 7509 | 7240 | gi|47551 | MRP [Streptococcus suis] | 77 | 68 | 270
69 | 1 | 1083 | 118 | gnl|PID|e311493 | unknown [Bacillus subtilis] | 77 | 57 |

________~____t_______f_______~______________.._i_______________________________ _____________________________________________~________i_________i_________~
7 | 5 | 4583 | 4026 | gnl|PID|e281578 | hypothetical 12.2 kd protein [Bacillus subtilis] | 77 | 60 | 558
83 | 14 | 13104 | 14552 | gi|1590947 | amidophosphoribosyltransferase [Methanococcus jannaschii] | 77 | 56 | 1449
94 | 4 | 3006 | 5444 | gnl|PID|e329895 | (AJ000496) cyclic nucleotide-gated channel beta subunit [Rattus norvegicus] | 77 | 66 | 2439
96 | 11 | 8518 | 8880 | gi|551879 | ORF 1 [Lactococcus lactis] | 77 | 62 | 363

99 | 11 | 14082 | 12799 | gi|153737 | sugar-binding protein [Streptococcus mutans] | 77 | 61 | 1284

TABLE 2 S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)
106 | 2 | 361 | 1176 | gi|148921 | LicD protein [Haemophilus influenzae] | | 51 | 816

108 | 4 | 3152 | 4030 | gi|1574730 | tellurite resistance protein (tehB) [Haemophilus influenzae] | | 58 | 879
118 | 4 | 3520 | 3131 | gi|1573900 | D-alanine permease (dagA) [Haemophilus influenzae] | | 57 | 390
124 | 4 | 1796 | 1071 | gi|1573162 | tRNA (guanine-N1)-methyltransferase (trmD) [Haemophilus influenzae] | | 58 |
126 | 4 | 5909 | 4614 | gnl|PID|d101163 | Srb [Bacillus subtilis] | | | 1296
128 | 2 | 630 | 1373 | gnl|PID|d101328 | YqiZ [Bacillus subtilis] | | | 744
130 | 1 | 1 | 1287 | gnl|PID|e325013 | hypothetical protein [Bacillus subtilis] | | 61 | 1287
139 | | 4388 | 3639 | gi|2293302 | (AF008220) YtqA [Bacillus subtilis] | | 59 | 750

____________y_______ ~_______ ~________________y_____________________________________________________________ ____ ___________y__, ____y_________f_________y J J11J10931 J JgiJ289284 110 9S82 Jcysteinyl-tRNA
J
synthetase 64 (Bacillus J

subtilis] 1350 J

~____________y_______ y_______y________________y_____________________________________________________ ____________ ___________y________~_________y_________y O

J J18J19451 J19263 JgiJ517210 140 Jputative J

transposase 66 (Streptococcus J

pyogenes] 189 J

y________y____~_______ y_______ y________________y_____________________________________________________________ ____ ___________~________y_________y_________y ,J

J J J J JgnIJPIDJe157887 (aa 50 1-573) J

IDrosophila 70B

yakuba] ( y________y____y_______ y_______~________________y____________________________,._______________________ _____________ ___________y________ _ N
_____ ____ y _ __~___ __y J J J J JgiJ556258 J

1d1 4 2775 5293 JsecA
J
[Listeria 59 monocytogenesl J

J

y________y____y_______ y_______y________________y________________ _____.._____y________y_________y_________y _______________________ J J J J JgnIJPIDJd100585 144 2 671 2173 Jlysyl-tRNA
J
thynthetase 61 [Bacillus J

subtilis) 1503 J

y________y____y_______ y_______ y________________y_____________________________________________________________ ___..
_______..___y________y__..______y_________y J J J J (giJ511015 J

163 S 6d12 7398 Jdihydroorotate J
dehydrogenase 62 A J

[Lactacoccus 987 Iactls) J

164 | 10 | 7841 | 7074 | gnl|PID|d100964 | homologue of iron dicitrate transport ATP-binding protein FecE of E. coli [Bacillus subtilis] | 77 | 52 | 768

N
________y____y_______ y_______ y________________y_____________________________________________________________ ____ ___________y________y_________y _________y J ( J J JgiJ149516 J

191 8 7257 5791 Janthranilate J
synthase 57 alpha J

subunit I467 [Lactococcus ( lactis) ________~____y_______ y_______ y________________y_____________________________________________________________ ____ ___________y________y_________y_________y J J J J JgiJ1573856 198 8 5377 5177 Jhypothetical J
IHaemophilus 66 influenzee] J

J

y________y____y_______ y_______ ~_______________~______________________________________________________________ ___ ___________~________~_________~_________y J J J J JgiJ1743860 213 1 202 462 JBrca2 J

[Hus'musculus) SO

J

J

y________y____y_______ y_______ y________________y_____________________________________________________________ ____ ___________y________y_________y_________y J J J J JgnIJPIDJe334776 250 2 231 509 JYlbH J

protein 60 (Bacillus J

subtilis] 279 ( y________y____y_______ y_______ y________________y_____________________________________________________________ ____ ___________y________y_________y_________~

J J J J JgnIJPIDJd100947 2B9 3 1737 1276 JRibosomal J
Protein 62 (Bacillus q62 subtilis) J

y________y____y_______ ,_______ y________________y_____________________________________________________________ ____ ___________y________~_________y_________y ( J J J JgiJ143004 J

292 2 1399 668 Jtransfer J
RNA-Gln 58 synthetase J

(Bacillus 732 stearothermophilus) J

________y____y_______ y_______ y________________y_____________________________________________________________ ____ ___________y________+_________y_________y J J ( J JgnIJPIDJd101824 ( 76 7 3 2734 I166 Jpeptide-chain-release J
factor 53 (Synechocystis I569 sp.) J

y_ y y_ _ ____ _ __________ y_______ y________________y_____________.._________________..___________________________ ______ ___________y________y_________y_________y _ ( J23J18474 J18235 JgiJ455157 7 Jacyl J

carrier 57 protein J

(Cryptomonas 2d0 phi) J

________y____y_______ ~_______ ,________________y_____________________________________________________________ ____ ___________y________y_________~_________y J J J J JgiJ1146247 9 8 5706 4342 Jasparaginyl-tRNA
J
synthetase 61 (Bacillus J

subtilis) I365 J

_____y____~_______ y_______ y________________y_____________________________..______________________________ _____ ___________y________y_________y_________y J J J J JgnIJPIDJe314495 5 4531 4385 Jhypothetical J
protein 53 (Clostridium ( perfringens) 147 J

y________y____y_______ y_______ y________________y,____________________________________________________________ ____ ___________y________y__ ____y_________y 00 J J J J JgiJ1591672 18 2 1615 842 Jphosphate transport system ATP-binding protein [Hethanococcus jannaschii) J

J

( J

y________y____y_______ y_______ y________________y_____________________________________________________________ _______________,________~_________y_________y TABLC 2 S. pneumoniae - Putative coding regions of novel protelne- siimilar to known proteins y________y____y_______y_______y________________y_______________________________ _____________________________________________y___ _____y_________y_________, 1 IORF1 j I match I match gene name 1 !
sim 1 t I length Contig StartStop ident I IID1 I 1 aceasion1 I
I 1 (nt) 3D (nt)(nt) ,_______..,____y_______,______ _y________________y____________________________________________________________ ________________y___ _____,_________ y_________y 1 j37I27796I28173IgnlIPIDje133H9jtranslation initiation factor IF3 (AA 1-171) 761 r 22 (Bacillus stearothermoph)lus) 1 64 ( ________,____,_______~_______,_______________ _,____________________________________________________________________________y ___ _____y_________,_________y 00 1 I j I Igi11773346ICapSG (Staphylococcus aureus) I

j ,_______..,____y_______ ,_______,________________,_____________________________________________________ _______________________,___ _____~_________~_________, 1 I28121113j21787Igi12314328j(AE000623) glutamine ABC transporter, permease 76 52 ( 675 1 48 protein (glnP) [Helicobacter 1 I ( I I I ( PYloril I
I I

________y____y_______,_______y_______________ _y____________________________________________________________________________, ___ _____,_________ y_________y 1 (t2(12881113786Igi1142521Ideoxytibodipyrimidine photolyase (Bacillus 76j 52 subtilis/ ( 58 I

,________,____,_______y_______,_______________ _y____________________________________________________________________________, ___ _____,_________y_________y I I10I11521I10571IgnIIPIDIe283110IfemD (Staphylococcus aureus) 1 j ,________,____,_______ ,_______,_______________ _,____________________________________________________________________________, ___ _____y_________y_________, I I j I IgiI290561(0188 [Escherichia cola) 1 761 I

I

________y____,_______,_______,________________y________________________________ ____________________________________________y___ _____,_________f_________~

j I I j IgnIjPIDje313024(hypothetical protein [Bacillus subtilis) 76j 62 5 2d062095 j 59 I

j ________,____,_______,_______y________________,________________________________ ____________________________________________y___ _____y_________y_________, I ( 1' I jgi140148 (L29 protein (AA 1-66) (Bacillus subtilis) 761 65 9 4223444l 1 58 I

,________,____,_______,_______,_______________ _y____________________________________________________________________________y ___ _____,_________,______ ( ( I I IgnIIPiDIe2H4233(anabolic orn)thine carbamoyltransferase [Lactobacillus 761 68 2 13282371 plantarum/ 1 6I

I

,________y____,_______y_______,________________y_______________________________ _____________________________________________,___ _____y_________y_________y N

I I 1 I IgnIIPIDId101420IPyrimidine nucleoside phosphorylase (Bacillus 76I "I
69 8 72976005 stearothermophilus) I 61 I

I

,________,___.y_______,_______y________________,_______________________________ _____________________________________________,___ _____,_________,_________, J

( j12I ( IgnIIPID1e243629(unknown [Mycobacterium tuberculosis/ j j j ,________y____,_______y_______,________________,_______________________________ _____________________________________________y___ _____,_________y_________y I j I I IgnIIPIDId101048IC. thermocellum beta-glucosidase; P26208 74 5 843J7039 I9851 (bacillus subtilis) I 60 I

I

________,____y_______y_______y________________y________________________________ ____________________________________________y___ _____y_________,_________y I I I I (9i(2314030IIAE000599) conserved hypothetical protein BO 5 76437936 (Hel)cobacter pylori) I 61 I

I

,________,____y_______,_______,________________,_______________________________ _____________________________________________,___ _____,_________y_________y j (15116019j16996jgi1157390D(D-alanine pecmease (dagA) [Haemophilus influenzael 76j o 82 j 56 I

y________y____,_______,_______,______.._________,______________________________ ______________________________________________,___ _____,_________y_________y 83 19 1861619BB4gil143374 phosphoribosyl glycinamide synthetase IPUR-D;

gtg start codon) IHacillus I 60 1 i i i i i i N
subtilis/ I I I ' I

________,____,_______,_______y_______________, _____,_________ y_________, _,____________________________________________________________________________, ___ j I14(13409112231(9i(143806IAroF (Bacillus subtilisl ( 86 ,__,_______,____,____________ _____ __ ____ ______ ( __ 11Z9 __ ( _ _ y ___ __ _ __ ____ _____y_________,_________y I I I I (9i(153804__ 76I

87 1 3 1442 _ 59 ___ I
___ 1440 ____________ I
____ _________________________________,___ (sucrose-6-phosphate hydrolase (Streptococcus mutans) I

y________c____,_______y_______y________________y__________.____________________ _____________________________________________,___ _____,_________,_________, 1 I16I15754I15110IgnIIPIDIe323500(putative Gmk protein (Bacillus subtilis) ( j y________y____y_______,_______y________________,_______________________________ _____________________________________________,___ _____y_________,_________y I 1 I I Igij1574820I1,4-alpha-glucan branching enzyme (9l98) 76( 93 4 17691539 [Haemophilus inEluenzae) 1 46 ________,____y_______,_______,________________y_____________________,__________ ____________________________________________,___ _____,_________y______ 1 I I I (9i1144313I6.0 kd ORF [Plasmid ColEl/ 1 76( I

I

y________y____t_______y_______,________________,_______________________________ _____________________________________________,___ _____y_________y_________f b I j I 1 (9i(153841Ipneumococcal surface protein A [Streptococcus 76j 116 2 21511678 pneumoniae( 1 59 j I

.________y____,_______,_______y________________y__ ______________________________ __ __,___ _____y_________a_________, 1 1 1 I (9i(1314297IClpC ATPase (Listeria monocytogenes/ 1 y________4____,_______y_______,________________,_______________________________ _______________________________________ -__ ___ __y_________y_________, __y___ 1 1 j I IgnIIPIDId101328(YqiZ (Bacillus subtilisl I

j ________,____,_______,_______,________________,________________________________ ____________________________________________,_______ _y_________y_________y I I101 I 19i(944944Ipurine nucleoside phosphorylase [Bacillus 76I

128 69737797 subtilis) I 60 I

I

,________,____y_______,_______,________________,_______________________________ _____________________________________________y___ _____y_________y_________y 1 1111 I (9i(1674310IIAE000058) Mycoplasma pneumoniae, HG085 homolog, 131 61865812 from M. genitalium 1 47 ( I I I I I 1 IMycoplasma pneumoniae) I I
I
I

,________y____,_______y_______,________________y_______________________________ _____________________________________________,___ _____, _________v_________y S. pneumoniae - Putative coding regions of novel proteins similar to known proteins y________y____ ,_______,_______y ________________y ____________________________________________________________________________,__ ______y_________y_________y ti jORF St t h C j t g ar j j j j i sim on S matc match ~ B
ident op gene j length name j ~IDj j j ~ ( ~ ~
ID (nt) (nt) acession (nt) ,________,____y_______y_______J________________y_______________________________ ______________________________ _______________y___-____y_________y_________, 1J9 , , ~ ~gi~2293302 ~(AF008220) j 76 ' 4 3641 3192 YtqA 53 ~

[Bacillus 4S0 subtllis) y________,____,-______ ,______ _,________________y____________________________________________________________ ________________y________y_________y_________y "'r j j j jgi~1184680 ~polynucleotide j 76 ~
phosphorylase 62 ~

[Bacillus 2337 subtilis]

W
y________,____,_______,______ _,________________ y_____________________________________________________________ _______________y________,_________,_________y 143 j j j )91j143795 ~txansfer ~ 76 ~
2 2583 390S RNA-Tyr 61 ~

synthetase l323 l8acillus subtilis) ________,____y_______ ,______ _,________________~____________________________________________________________ _ _______________y________y_________y_________y 170 j j j jgnljPIDjd100959 ~ycgQ
~ 76 ~
6 509S 61l4 [Bacillus 44 subtilis] 1020 y________y____y_______,______ _,__-_____________y___________________________-_________________________________ _______________y________y_________y_________y 1H0 j j j jgi~40019 jORF j 76 j 2 1927 557 B21 53 j (aa 1371 1-821) [Bacillus subtilis) y________y____y_______y_______,________________y_______________________________ ______________________________ _______________y________y_________y_________y j j j j (91j551880 janthranilate j 76 ~
191 7 5815 S228 synthase 61 j beta 588 subunit fLactococcus lactic) y________y____y_______,______ _y________________y_______________________-______________________________T______ _______________y________y_-_______y_________y j j j j )9i)2149905 jD-glutamic j 76 j 195 3 3829 2444 acid 60 adding j j enzyme (Enterococcus faecalis) y________y____,_______ ,______ _,________________y____________________________________________________________ _ _______________y________y_________,_________J

j j j j )91j431272 jlysis j 76 j 200 3 1914 3629 protein 58 j [Bacillus 1716 subtilis) y________y____E_-_____ y______ _,________________ y_____________________________________________________________ _____________ _ _ __y__ _ ___y_________y_________, j j j j )9i)2208998 jdextran j 76 ~
201 1 431 207 glucosidase 57 ~
DexS 225 [Streptococcus cuts]

,________,____,_______ y______ _,________________ y_____________________________________________________________ _______________y________y_________y_________y o j j j ~ jgi~663278 jtransposase j 76 ( 214 2 1283 23A0 [Streptococcus 55 ~
pneumoniae) 109B

~

N

y________y____y_______ ,______ _y________________ y_____________________________________________________________ _______________y_~______y_________J____----_y j ~ ~ j )g1)1552775 )ATP-binding j 76 j 'r 225 3 2338 3411 protein 56 ~
[Escherichia 1074 cola) j y________,____y_______ ,______ _y________________ y_____________________________________________________________ _______________y________y_________y_____ J
N

233 j j j jgi~1163115 jneuraminidase j 76 ( o 1 2 724 B 60 ( [Streptococcus 723 ~

pneumoniae) y________y____y_______,______ _y______.._________ y_____________________________________________________________ _______________,________y_________y_________y 347 j j j ~gij537033 jORF j 76 ~
1 S23 38 f356 60 j [Eacherichia 486 j colt) ,________,____y_______ ,______ _y________________ _ _______-_-_____y________y_______-_y_________y y__________________________-__________________________________ j j j j ~gi~2149905 jD-glutamic j 76 j 356 2 B42 165 acid 61 j adding 678 j enzyme [Enterococcus faecalis]

y________y____y_______ y-_-___ _,______-_________ y___________________-______________-_________________-________ _______________y________ o _ _____ -____ y _ __y__ __y j j j j ~gij1d9520 jphosphoribosyl j 76 ~
366 3 734 348 anthranilate 69 ~
isomerase 387 [Lactococcus lactic) y________y____y_______y______ _y________________ y_____________________________________________________________ _______________y________y_________y_________y ~ 12599 j11484 jgi~1574293 ~fimbrial us influenzae]
8 transcription j 75 regulation ~ 61 repressor ~ 1116 ipilB) j [Haemophil y________,____y_______ y______ _,________________ y_____________________________________________________________ _______________y________y_________y_________, j j13(12553 (11894 jgnljPl0jd102050 jydiH
~ 75 ~
6 (Bacillus 51 j subtilis] 660 y________y____y_______ y______ _y________________ y_____________________________________________________________ _______________y__.._____y_________y_________J

j j10j j )9i)142538 ~aspartate j 75 j 9 7282 6062 aminotransferase 55 j (Bacillus 1221 sp.) j ,________y____y_______ y______ _y________________ y__________v__________________________________________________ _______________,________y_________y_________y j j12j j )91j149493 ~SCRFI
j 75 ~
B080 7940 methylase 56 j [Lactococcus 141 lactic) ,________,____,_______ y______ _J________________ y_____________________________________________________________ _______________~________y_________y_________y j j j j jgnljPIDjd101319 ~YqgH
~ 75 ( 18 5 4266 3301 (Bacillus 52 ( subtilis] 966 j y________,____,_______ ,______ _y________________ y_____________________________________________________________ _______________y________y_________y_________y j ~ j ~ ~gij1373157 orf-X;
supplied 75 62 B91 22 4 l838 2728 hypothetical by protein;
Method:
conceptual translation ~

~ ~ i i j ~ ~ j j author (Bacillus ~abtilis) y________y____y_______ ,______ _,________________ y_______.______________________________________________________ _______________y________y_________y_________y b 30 Q11j j ~gi~153H01 enzyme j 75 j n 9015 7828 scr-II 64 j (Streptococcus 1188 mutansl j y________,____y_______ ,______ _y________________ y_____________________________________________________________ _______________y________y_________y_________y H

31 ~ ( ~ ~gi~2293211 j(AF008220) .
5 2362 2030 putative ~ 75 ~
thioredoxin 53 ~

(Bacillus 333 subtilis]

y________J____y_______ ,______ _y________________ y_____________________________________________________________ _______________y________y_________y_________y VI

j ~ ~ ~ ~ ~formamid J
32 9 7484 8359 nl~PIDjd100560 ri idi -DNA
l l (St g opy ( 75 j m 61 ~
ne 876 ~
g ycosy ase reptococcus mutans) y________y____y_______ y______ _y________________ y_____________________________________________ __ ___ __ _ _ ____ ~ __ _____ __ ______ _ s ~

33 | 4 | 1735 | 1448 | gi|413976 | ipa-52r gene product [Bacillus subtilis] | 75 | 53 | 288
33 | 10 | 6470 | 5769 | gi|533105 | unknown [Bacillus subtilis] | 75 | 56 | 702

S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)
33 | 12 | 6878 | 7183 | pir|A00205|FECL | ferredoxin [4Fe-4S] - Clostridium thermaceticum | 75 | 56 | 306
36 | 1 | 181 | 2 | gi|2088739 | (AF003141) strong similarity to the FABP/P2/CRBP/CRABP family of transporters [Caenorhabditis elegans] | 75 | 43 | 180
38 | 22 | 14510 | 15379 | gi|1574058 | hypothetical [Haemophilus influenzae] | 75 | 56 | 870
48 | 33 | 23398 | 24066 | gi|1930092 | outer membrane protein [Campylobacter jejuni] | 75 | 56 | 669
51 | 1 | 2 | 319 | gi|43985 | nifS-like gene [Lactobacillus delbrueckii] | 75 | 55 | 318

________,____,_______,_______,________________ ,____________________________________________________________________________f_ _______~___ ______~_________+

51 Q10~ 11683~gi~537192 ~CG Site No. 620; alternate gene names meshift75 SO 3366 8318 hs, hsp, hsr, rmx apparent fra ( ~

in Geneank Accession Number X06545 [ESCherichia ~
colil ________,____~_______~_______,________________ ~____________________________________________________________________________,_ _______~___ ______,_________, ( ~18e1956620759~gi~666069 ~orf2 gene product [Lactobacillus lelchmannii]~ 58 1194 ~
~

,________a____,_______t_______,________________ ~____________________________________________________________________________,_ _______,___ _____ 57 ~ I ~ ~gi~290561 ~olBB (Escherichia colil ~
50 627 o 9 8448 7822 75 ~
~
~

,________,____,_______,_______,________________ ,____________________________________________________________________________,_ _______,___ ______,_________, N

65 | 14 | 6072 | 6356 | gi|606241 | 30S ribosomal subunit protein S14 [Escherichia coli] | 75 | 64 | 285
70 | 4 | 3071 | 2472 | gi|1256617 | adenine phosphoribosyltransferase [Bacillus subtilis] | 75 | 57 | 600
71 | 24 | 30399 | 29404 | gi|1574390 | C4-dicarboxylate transport protein [Haemophilus influenzae] | 75 | 57 | 996

________,____,_______,___..___,________________ ,____________________________________________________________________________,_ _______,___ ______,______ _, DD

73 ~ ~ ~ ~gnl~PID~e249656 ~YneT (Bacillus subtilis]
~ 57 __ 2 910 455 75 ~

~

~________s____,_______~_______~________________ t____________________________________________________________________________,_ _______~___ ______~_________, 79 ~ ~ ~ ~gi~1146219 28.28 of identity to the Escherichia coli 1 1B10 491 GTP-binding protein Era; putative ( ~ ~

(Bacillus subtilisl ________,____,_______,_______~________________ ~____________________________________________________________________________,_ _______~___ ______~_________, 82 ~ ~ ~ ~gi~1655715 ~BztD (Rhodobacter capsulatus]
~ 55 177 N
6 6360 6S36 75 ~
~
~

| 83 | 6 | 1938 | 2975 | gnl|PID|e323529 | putative PlsX protein [Bacillus subtilis] | 75 | 56 | 1038 |

| 93 |  | 7368 | 5317 | gi|39989 | methionyl-tRNA synthetase [Bacillus stearothermophilus] | 75 | 58 | 2052 |

| 93 | 13 | 9409 | 8699 | gi|1591493 | glutamine transport ATP-binding protein Q [Methanococcus jannaschii] | 75 | 54 | 711 |

| 95 | 1 | 1795 | 47 | gnl|PID|e323510 | YloV protein [Bacillus subtilis] | 75 | 57 | 1749 |

| 103 | 2 | 362 | 1186 | gnl|PID|e266928 | unknown [Mycobacterium tuberculosis] | 75 | 64 | 825 |

|  |  |  |  | gi|460026 | repressor protein [Streptococcus pneumoniae] |  | 54 | 225 |

| 113 |  | 2951 | 3883 | gnl|PID|d101119 | ABC transporter subunit [Synechocystis sp.] | 75 | 55 | 933 |

| 121 | 1 | 320 | 1390 | gi|2145131 | repressor of class I heat shock gene expression HrcA [Streptococcus mutans] |  | 58 | 1071 |

| 127 | 6 | 2614 | 3000 | gi|1500451 | M. jannaschii predicted coding region MJ1558 [Methanococcus jannaschii] |  | 44 | 387 |

| 137 | 18 | 10687 | 10082 | gi|393116 | P-glycoprotein 5 [Entamoeba histolytica] | 75 | 52 | 606 |

| 149 | 11 | 8499 | 9338 | gnl|PID|d100582 | unknown [Bacillus subtilis] | 75 | 55 | 840 |

TABLE 2
S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

| Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt) |

| 151 | 6 | 9100 | 7673 | gi|40467 | HsdS polypeptide, part of CfrA family [Citrobacter freundii] | 75 | 57 | 1428 |
| 158 | 1 | 986 | 3 | gnl|PID|e253891 | UDP-glucose 4-epimerase [Bacillus subtilis] | 75 | 63 | 984 |
| 172 | 8 | 5653 | 6774 | gi|142978 | glycerol dehydrogenase [Bacillus stearothermophilus] | 75 | 56 | 1122 |
| 172 | 9 | 7139 | 9730 | gnl|PID|e268456 | unknown [Mycobacterium tuberculosis] | 75 | 58 |  |

| 173 | 1 | 261 | 79 | gnl|PID|e236469 | C10C5.6 [Caenorhabditis elegans] | 75 | 50 |  |

| 185 | 3 | 3066 | 2014 | gi|1574806 | spermidine/putrescine transport ATP-binding protein (potA) [Haemophilus influenzae] |  |  |  |

| 191 | 6 | 5235 | 4213 | gi|149518 | phosphoribosyl anthranilate transferase [Lactococcus lactis] | 75 | 61 | 1023 |
| 226 | 2 | 1774 | 1181 | gi|2314588 | (AE000642) conserved hypothetical protein [Helicobacter pylori] | 75 | 65 | 594 |
| 231 | 1 | 1 | 153 | gi|40173 | homolog of E.coli ribosomal protein L21 [Bacillus subtilis] | 75 | 57 | 153 |

| 234 | 1 | 2 | 418 | gi|2293259 | (AF008220) YtqI [Bacillus subtilis] | 75 | 59 |  |

| 279 | 1 | 552 | 151 | gi|1119198 | unknown protein [Bacillus subtilis] | 75 | 50 | 402 |

,________,-____________-____,________________y__--______________________________________________________________ __________,_________________,________-, C

29I ~ ~ ~ ~gi~40011 ~ 75 pp ,r 7 355S3B27 ~ORF17 ~

IAA 1-161) 18 (Bacillus ~

subtilis) 270 ~

| 375 | 2 | 137 | 628 | gi|410137 | ORFX13 [Bacillus subtilis] | 75 | 58 |  |

,________,____,_______,_______,________________,_______________________________ ___________________________________ __________,___-___-~_________,_________, 6 Q20A A ~gi~2293323 ~ 74 67217560 ~(AF008220) ~

YtdI [Bacillus 53 subtilisl ~

| 7 | 6 | 4682 | 6052 | gi|1354211 | PET112-like protein [Bacillus subtilis] | 74 | 60 | 1371 |

| 18 | 4 | 3341 | 2427 | gnl|PID|d101319 | YqgI [Bacillus subtilis] | 74 | 54 |  |

| 21 | 6 | 5885 | 4800 | gi|1072381 | glutamyl-aminopeptidase [Lactococcus lactis] | 74 | 59 |  |

| 24 | 2 | 739 | 548 | gi|2314762 | (AE000655) ABC transporter, permease protein (yaeE) [Helicobacter pylori] | 74 | 46 | 192 |
| 25 | 1 | 2 | 367 | gnl|PID|d100932 | H2O-forming NADH Oxidase [Streptococcus mutans] | 74 | 63 | 366 |
| 38 | 18 | 12964 | 1432 | gi|537034 | ORF_o488 [Escherichia coli] | 74 | 57 |  |

| 48 | 10 | 8924 | 6669 | gi|1513069 | P-type adenosine triphosphatase [Listeria monocytogenes] | 74 | 53 | 2256 |
| 55 | 11 | 11964 | 11401 | gnl|PID|e283110 | femD [Staphylococcus aureus] | 74 |  |  |
| 61 | 2 | 1782 | 42? | gi|2293216 | (AF008220) putative UDP-N-acetylmuramate-alanine ligase [Bacillus subtilis] | 74 | 55 |  |

| 76 | 10 | 9414 | 8065 | gnl|PID|d101325 | YqiB [Bacillus subtilis] | 74 | 54 |  |

| 83 | 2 | 666 | 926 | pir|C33496|C334 | hisC homolog - Bacillus subtilis | 74 | 55 |  |

| 86 | 9 | 8985 | 8080 | gi|683585 | prephenate dehydratase [Lactococcus lactis] | 74 | 55 | 906 |


| 102 |  |  |  | gi|143394 | OMP-PRPP transferase [Bacillus subtilis] |  |  |  |

| 103 |  |  |  | gnl|PID|e323524 | YloN protein [Bacillus subtilis] |  |  | 1098 |

| 108 |  |  |  | gnl|PID|e257631 | methyltransferase [Lactococcus lactis] |  |  | 729 |

| 131 |  |  |  | gnl|PID|d101320 | YqgZ [Bacillus subtilis] |  |  | 333 |

| 133 |  |  |  | gnl|PID|e313025 | hypothetical protein [Bacillus subtilis] |  |  | 462 |

| 137 |  |  |  | gnl|PID|d100479 | Na+ -ATPase subunit D [Enterococcus hirae] |  |  | 621 |

| 149 | 4 | 3008 | 3883 | gnl|PID|d100581 | high level kasgamycin resistance [Bacillus subtilis] | 74 | 55 | 876 |

________,____,_______,_______,________________+________________________________ ________________________________________ ____y________/_________y_ ________/

157 ~ ~ ~ ~gi~157J373~methylated-DNA--protein-cysteine methyltransferase~ 74 ~ 582 2 243 824 (datl) [Haemophilus 48 ~

~ ( influenzael ( ~
y ~

| 164 |  |  |  | gi|410131 | ORFX7 [Bacillus subtilis] |  |  |  |

| 167 |  |  |  | gi|413927 | ipa-3r gene product [Bacillus subtilis] |  |  |  |

| 171 |  |  |  | gnl|PID|d102251 | beta-galactosidase [Bacillus circulans] |  |  | 1818 |
| 172 | 4 | 1064 | 2392 | gi|466474 | cellobiose phosphotransferase enzyme II" [Bacillus stearothermophilus] | 74 | 50 |  |

/________,____+_______y_______,_______________y________________________________ ____________________________________________/,.______ _y_________y_ ___-____y 185 ~ ~ ~ ~gi~1573646~Mg(2) transport ATPase protein C (mgtC) 74 1 326 3 (SP:P220J7) [Haemophilus i ~ i influenzae) ~

| 188 | 2 | 1089 | 2018 | gi|1573008 | ATP dependent translocator homolog (msbA) [Haemophilus influenzae] | 74 | 44 | 930 |

| 189 | 11 | 6491 | 7174 | gi|1661199 | sakacin A production response regulator [Streptococcus mutans] | 74 | 60 | 684 |

| 210 | 2 | 520 | 1287 | gi|2293207 | (AF008220) YtmQ [Bacillus subtilis] | 74 | 60 | 768 |

| 261 |  |  |  | gi|666983 | putative ATP binding subunit [Bacillus subtilis] |  |  | 645 |

| 263 | 3 | 1619 | 3655 | gi|663232 | Similarity with S. cerevisiae hypothetical 137.7 kD protein in subtelomeric Y' repeat region [Saccharomyces cerevisiae] | 74 | 42 |  |

| 265 | 2 | 844 | 1227 | gi|49272 | Asparaginase [Bacillus licheniformis] | 74 | 64 | 384 |

| 368 |  |  |  | gi|603998 | unknown [Saccharomyces cerevisiae] |  |  | 942 |
| 7 | 16 | 13357 | 11921 | gnl|PID|d101324 | YqhX [Bacillus subtilis] |  |  | 1437 |

| 17 | 10 |  |  | gnl|PID|e305362 | unnamed protein product [Streptococcus thermophilus] |  |  | 258 |

| 31 | 2 | 522 | 244 | gnl|PID|d100576 | single strand DNA binding protein [Bacillus subtilis] | 73 | 55 | 279 |

________,____+_______y_______y________________y________________________________ ____________________________________________,_______ _+_________,_ ________+ (D

32 ~ ( ~ ~gnl~PID~d101315~YyfG (Bacillus subtilis] ~
~ 5 ~ 28 ________y____+_______,_______,________________+________________________________ ____________________________________________+_______ _+_________+_________+
pip 34 ~15A0281~ ~gnI~PID~d102151((AB001684) ORF42c [Chlorella wlgaris) ( ~ 492 ~

+________+____/_______/_______+________________y_______________________________ _____________________________________________+_______ _+_________+_ ________+

| 40 | 12 | 9876 | 9226 | gi|1173517 | riboflavin synthase alpha subunit [Actinobacillus pleuropneumoniae] | 73 |  |  |

| 55 | 2 | 3592 | 839 | gnl|PID|d101887 | cation-transporting ATPase PacL [Synechocystis sp.] | 73 |  |  |

________,____ ,_______,_______,________________,_____________________________________________ _______________________________,________~_________t_________+
r 55 Q18 A 16586~gnl~PID~e265580unknown [Mycobacterium tuberculosis) ~

~

~

| 65 | 16 |  |  | gi|143419 | ribosomal protein L6 [Bacillus stearothermophilus] |  |  |  |

| 66 |  |  |  | gnl|PID|e269883 | LacF [Lactobacillus casei] |  |  |  |

| 70 | 10 | 5557 | 5733 | gi|857631 | envelope protein [Human immunodeficiency virus type 1] | 73 |  |  |

________,____ ,_______i_______,________________,_____________________________________________ ________~_____________________,________y_________,_________, 71 ~ ~ ~ ~gnl~PID~e322063ass-1,4-galactosyltransferase [Streptococcus, 4 6133 8262 pneumoniae] 73 ~

~

| 72 |  |  |  | gi|2293177 | (AF008220) transporter [Bacillus subtilis] |  |  |  |

| 76 |  |  |  | gnl|PID|d101325 | YqiF [Bacillus subtilis] |  |  |  |

o ,________~____ ,_______,_______~________________,___-_-______________________________________________________________________,________ ~_________,_________f 76 Q12 A ~ ~gi~1573086~uridine kinase (uridine monophosphokinase) ~
to 0009 9533 (udkl (Haemophilus influenzae) 73 ~

~

~

| 80 |  |  |  | gi|1377823 | aminopeptidase [Bacillus subtilis] |  |  |  |

| 97 |  | 3389 | 1668 | gnl|PID|d101954 | dihydroxyacid dehydratase [Synechocystis sp.] | 73 |  |  |

| 98 |  |  |  | gnl|PID|e314991 | FtsE [Mycobacterium tuberculosis] |  |  |  |

________,____ ,_______,_______,________________,_____________________________________________ _______________________________,________,_________,_________, ~O

108 iii A A (gi~388109regulatory protein [Enterococcus faecalis] ( ~

~

| 128 |  |  |  | gi|1685111 | orf1091 [Streptococcus thermophilus] |  |  |  |

| 138 |  |  |  | gi|147326 | transport protein [Escherichia coli] |  |  |  |

,________,____ ,_______f_______,________________+__.._________________________________________ ________________-_______________,________f_________,_________, ( (13 A2538A ~pir~E53902~E534~serine O-acetyltransferase (EC 2.3.1.30) ~
140 1903 - Bacillus stearothermophilus 73 ~

~

,________,____ ,_______,_______,_..______________,____________________________________________ _____________________________ ___,________f_________,_________, ( ~ ~ ~ ~gnI~PID~e323511putative YhaQ protein (bacillus subtilis] ~

~

~

| 164 | 4 | 2323 | 2790 | gi|1592076 | hypothetical protein (SP:P25768) [Methanococcus jannaschii] | 73 |  |  |

________,____ ,_______,_______,________________,_____________________________________________ _______________________________,________,_________~_________, ( ~ ~ ~ ~gi~410137~ORFX13 [Bacillus subtilisl ~

~

~

| 170 | 5 | 4394 | 5302 | gnl|PID|d100959 | homologue of unidentified protein of E. coli [Bacillus subtilis] | 73 |  |  |

( ,________,____ ,_______,_______,________________4___ _ _ ____________ _____,_________,_________, ________ l78 ~ ~ ~ ~gi~46242 modulation protein B, 5'end [Rhizobium lots]~

~

~

________,____ ,_______,_______,________________,_____________________________________________ _______________________________,________,_________,_________f ( ~ ~ ~ ~gnl~PlD~e214719~PlcR protein i~acillus thuringiensis) ~

204 6 5096 4278,________________,______________ ________,____ ~_______,_______ ______ ~
__ 41 ~

| 213 | 2 | 832 | 2037 | gi|1565296 | ribosomal protein S1 homolog; sequence specific DNA-binding protein [Leuconostoc lactis] | 73 | 55 |  |
| 231 | 2 | 84 | 287 | gi|40173 | homolog of E.coli ribosomal protein L21 [Bacillus subtilis] | 73 |  |  |

| 237 | 1 | 2 | 505 | gi|1773151 | adenine phosphoribosyltransferase [Escherichia coli] | 73 |  |  |

,________,____ ,_______~_______,________________+_____________________________________________ _____________________________ __f________i_________,_________, S. pneumoniae - Putative coding regions of novel proteins similar to known proteins y________,____y_______f _______,________________,______________________________________________________ ___..__________________ ,________y _________y_________y J JORF J J J match J match gene name J
J length Contig Start Stop % % J
sim ident J

J JID J ~ J acession~ ~
~ Int) J
ID (nt) (nt) ( ________,____ ,______ _,______ _,_________________ ________________________________________________________________________ y _ __ ,_________,_ ________y _ __y__ _ __ 269 J J J JgnIJPIDJd101328JYqiX (Bacillus subtilisl ~ 36 J

| 289 | 2 | 1272 | 832 | pir|A02771|R7MC | ribosomal protein L7/L12 - Micrococcus luteus | 73 |  | 441 |

| 343 | 1 | 14 | 484 | gi|1788125 | (AE000276) hypothetical 30.4 kD protein in manZ-cspC intergenic region [Escherichia coli] | 73 |  |  |

| 356 | 1 | 222 | 4 | gi|2149905 | D-glutamic acid adding enzyme [Enterococcus faecalis] | 73 |  |  |

_____,____ ,______ _,______ _a________________,____________________________________________________________ ________________,________ ,_________y_ ________, J ~ J ~ JgnIJPIDJd101833Jamidase [Synechocystis sp.) _ 52 J
_ J

,__ ~______ _,______ _,________________,_________________________________________.._____,________ y_________y_ ________y _____,____ J J .JgiJ146976____________________________ ~

J 7195 7647 JnusB (Escherichia coli) 7 _,________._____ ~
J
J ,______ _,______ _______ _____ 9 ___ ,________,____ _ __ _ ,. ____,________y_________,_ ________, J J17 J13743 (13300 IgnIJPIDJe289141_____ 7 ________ ___ ___ ____________________________________ Jsimilar to hydroxymyristoyl-(acyl carrier protein) dehydratase (Bacillus ~ i J J ~ J J subtilis) J
| 22 | 19 | 15637 | 16224 | gnl|PID|d101929 | ribosome releasing factor [Synechocystis sp.] | 72 |  | 588 |

________,____ ,______ _,______ _,________________,____________________________________~_______________________ ________________,________ ,_________,_ ________, o J J17 J12111 11425 JgnIJPIDJd101190JORF3 [Streptococcus mutans]

55 ~
~

N

_____,____ ,______ _,______ _,________________,____________________________________________________________ ________________y________ ,_________,_ ________y J

J J ~ J JgiJ396501Jaspartyl-tRNA synthetase ]Thermus thermophilus)~

J

J
| 38 | 23 | 15372 | 16085 | pir|H64108|H641 | L-ribulose-phosphate 4-epimerase (araD) homolog - Haemophilus influenzae (strain Rd KW20) | 72 | 54 | 714 |

________,____ ,______ _,______ _,________________,____________________________________________________________ ________________,________ ,_________,_ ________, 39 ~ ~ ~ JgnIJPIDJe254877unknown (Mycobacterium tuberculosis) ~ J 1812 J

,________,____ y______ _,______ _,________________y____________________________________________________________ ________________,________ ,_________y_ ________y J J J ~ JgiJ153672Jlactose repressor (Streptococcus mutans) ~
~ 168 o ~

,________,____ ,______ _,______ _,________________y____________________________________________________________ ________________,________ ,_________,_ ________, 48 J ~ J JgiJ3103B0~inhibin beta-A-subunit (ovis cries) J

~

| 48 | 29 | 21729 | 22424 | gi|2319329 | (AE000623) glutamine ABC transporter, permease protein (glnP) [Helicobacter pylori] | 72 |  | 696 |
,________v____ ,______ _,______ _,________________y____________________________________________________________ ________________,________y_________,_ ________, J ~ ~ J JgiJ1750108JYnbA [Bacillus subtilis) J ~ 1242 ~

,________,____ y______ _,______ _,________________,_________________________________.._________________________ _________________y________ ,_________,_ ________, J J J J JgiJ2293230J(AF0082201 YtbJ (Bacillus subtilis) J
~ 1239 J

| 52 | 13 | 13681 | 13938 | gi|142521 | deoxyribodipyrimidine photolyase [Bacillus subtilis] | 72 |  | 258 |

________y____ ,______ _,______ _,________________,____________________________________________________________ ________________,________ ,_________,_ ________y J ~ J ~ JgiJ882518JORF_o304; GTG start [Escherichia coli) ~

J

| 75 | 5 | 2832 | 3191 | gnl|PID|e209886 | mercuric resistance operon regulatory protein [Bacillus subtilis] |  |  | 360 |

________,____ ,______ _,______ _,________________,___ ______________________________________________________________________,________ ,_________,_ ________, J J ~ ~ JgiJ142450JahrC protein [Bacillus subtilis) J
~ 4S9 ~

|  |  |  |  | gi|2293279 | (AF008220) YtcG [Bacillus subtilis] |  |  |  |
| 87 | 14 | 14726 | 12309 | gnl|PID|e323502 | putative PriA protein [Bacillus subtilis] |  |  | 2418 |

________,____ ,______ _,______ _,________________, ____ ___ __ _____ +_________,_ ________, __ ____ ________________________________________________________+________ J J J J JgiJ500691JMY01 gene product [Saccharomyces cerevisiae)J

J

,________,____ ,______ _,______ _y________________f____________________________________________________________ ________________,________ y_________,_________;

J ~ J ~ JgiJ829615skeletal muscle sodium channel alpha-subunitJ

91 7 4516 4764 [Equus caballus) ~

TABLE 2 S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)
95 | 2 | 2004 | 1717 | gnl|PID|e323527 | putative Asp23 protein [Bacillus subtilis] | 72 | 40 | 288
109 | 1 | 1452 | 118 | gi|143331 | alkaline phosphatase regulatory protein [Bacillus subtilis] | 72 | 52 | 1335
126 | 1 | 3 | 2192 | gnl|PID|d101831 | glutamine-binding periplasmic protein [Synechocystis sp.] | 72 | 46 | 2190
130 | 3 | 1735 | 2478 | gi|2415396 | (AF015775) carboxypeptidase [Bacillus subtilis] | 72 | 53 | 744
137 | 6 | 2585 | 2929 | gi|472922 | v-type Na-ATPase [Enterococcus hirae] | 72 | 46 | 345
140 | 10 | 9601 | 9203 | gi|49224 | URF [Synechococcus sp.] | 72 | 48 | 399
146 | 5 | 1906 | 1247 | gnl|PID|e324945 | hypothetical protein [Bacillus subtilis] | 72 | 45 | 660
147 | 2 | 2084 | 1083 | gnl|PID|e325016 | hypothetical protein [Bacillus subtilis] | 72 | 56 | 1002

147 |  |  |  | gi|472327 | TPP-dependent acetoin dehydrogenase beta-subunit [Clostridium magnum] |  |  | 
148 | 8 | 5381 | 6433 | gi|974332 | NAD(P)H-dependent dihydroxyacetone-phosphate reductase [Bacillus subtilis] | 72 | 54 | 1053

148 | 14 | 10256 | 9675 | gnl|PID|d101319 | YqgN [Bacillus subtilis] | 72 | 50 | 582

____ y_______y________________~_____________________________________________________ __________ ___-_________y________y_________y_________y N

( ( 8 ( 4949 9i(1788770(AE0003301 o463: 24 pct identical 159 4005 ( I 144 gaps) to 338 residues from ( ( ( ( ; ' illin-binding protein d, PBPE BACSU (Escherichia( SW: P32959 (451 aa) ( ( ( ( ( ( ~ ( colij ( ( ( ( y________y____ y_______y_______y________________y_____________________________________________ __________________ _____________y________y_________y_________y ( (10 ( (10620 ( 72 172 9907 (9i(763387 ( (unknown ( 7l4 (Sacchazomyces ( cerevisiael ~

220 | 3 | 2862 | 3602 | gi|1574175 | hypothetical [Haemophilus influenzae] | 72 | 50 | 741
267 | 1 | 3 | 449 | gi|290513 | f470 [Escherichia coli] | 72 | 48 | 447

y________y____ y_______y_______y________-_______y_______________________________________________________________ _____________y________~_________y_________y ( ( 2 ( ( gnl(PID(d10096dhomologue of aspartokinase 2 alpha subtilis72 45 360 281 899 540 ( and beta subunits LysC of B.

( ( ( ( ( ( (Bacillus subtilis] i ~
i ( y________y____ y_______y_______y________________y_____________________________________________ __________________ _____________y________y _________y_________y ( ( 1 ( ( 9i(474195 This ORF is homologous to a 10.0 kd htr8 72 290 1018 14 ( hypothetical protein in the 3' ( ( ( ( ( ( ( ( region from E. cola. Accession Numberorganism)~
~ ( ( X61000 (Mycoplasmia-like ( y________,____ y_______y_______y________________y_____________________________________________ __________________ _____________y________y _________y_____ ( ( ( ( ( 72 300 1 63 5B7 ( 50 (9i(746399 ( 525 (transcription ( elongation factor (Escherichia cola]

316 | 1 | 1326 | 4 | gi|158127 | protein kinase C [Drosophila melanogaster] | 72 | 40 | 1323

342 | 1 | 227 | 3 | gnl|PID|d101164 | unknown [Bacillus subtilis] | 72 | 54 | 225

y________y____ y_______y_______y___-____________y_______________________________________________________________ _____________y________t_________y_________y ( ( ( ( isj ( 354 1 I 1005 72 ( (gnl(PID(d1D2048 52 ( (C. 1005 thermocellum ( beta-glucosidase;

(985) [Bacillus subtil ______y____ ~_______y_______y________________y____________________________.._______________ ___________________ _____________y________y____.-____y______ ( (10 ( (10467 ( ?1 6 8134 (gnl(PID(e264229 ( 57 (unknown ( 2334 (Mycobacterium ( tuberculosis) y________y____ y_______y_______y________________~_____________________________________________ __________________ _____________y________y_________y_________y J

( (20 (16231(15464 ( 71 7 (9i(18046 ( 52 (3-oxoacyl-(acyl-carrier ( protein) ( reductase (Cuphea lanceolataj y________y____ y_______y_______t________________y_____________________________________________ __________________ _________ __y________y_________y_________y U

( ( ( ( ( 71 O
1 1297 2 ( 51 O
(gnl~PIO(d100571 ( 1296 (replfcative ( DNA

helicase (Bacillus aubtilisl y________y____ y_______,_______,________________y_____________________________________________ __________________ _____________y________y_________y_________y ( ( ( ( ( 71 15 4 9435 3869 ( 47 ~gi~499384 ( 567 (orf189 ( [Bacillus subtilis) y________y____ y_______y_______y________________y_____________________________________________ __________________ _____________y________y_________y_________y TABLE 2 S, neumoniae - Putative codin re ions of novel P 9 g proteins similar to known proteins ,________f____,_______,_______,________________,__ __________________________________________________________ Contig ~ORF

~

Start ~

Stop ~

match ~

match gene name '-r----"

"

i-----w' ~

!

id ~

$

si t l h ID ~ID ~ ~ ~ acession~ m en engt _ (nt) (nt) ~ lnt) _______,____ ,_______,_______,________________,_____________________________________________ _______________________________,_______ _,_________,_________, 18 ~

~ S120 6 ~

~

nl~PID~d101318 ~Y

G

[B

ill b ili g qg ~ ~ 51 903 ac 71 ~
us su t s]

,________ ,____,_______,_______,________________,_________________________.._____________ _____________________________________,_______ _,_________4_________, 29 1 ~ 54D ~gi~1773192similar H' ( 1 to ~ the 2kd t i i TETB
EXOA
i f . ~ ~ 56 540 pro 71 e n n -reg on o B, subtilis (Escherichia ~ ~ ' W
coli) ________,____ ~_______,_______ ,________________,_____________________________________________________________ _______________,_______ _,_________,_________t ( ~20 A3327A ~

3B 3830 i~537036 ~ORF

E
h i hi g _o ~ ~ 48 504 ( 71 ~
SC
er c a coli]

,________,____ ,_______,_______ ,________________~_____________________________________________________________ __________ _____~_______ _,_________,_________, 51 Q12 A5015A2676 ~gi~149528 ~di e tid l e tidase IV (La t l i p ~ ~ 55 2340 p 71 ~
y p p c ococcus act sl ,________,____ ,_______,_______ ,________________,_______________________________________~,_-~-________________________~_______y_______ _,_____~___~_________;
55 Q23 Q21040Q20585 ~gi~2343285 ~ 58 456 ~(AF015453) ~
surface located protein [Lactobacillus rhamnosus]
~ 71 ,________,____ ,_______,_______ ,________________,_____________________________________________________________ _______________,_______ _f_________a_________, 60 ~ ~ ~ ~gnl~PID~d101320~YqgZ

2 7D5 265 [Bacillus subtilisl ~ ~ 44 441 ,________,____ ,_______,_______ ,________________,_________--_______-___ 1 ~

______________________________________________________,_______ _,_____--__,_________, 71 Q18 Q2467926226 ~gi~580920~rodD
~ 44 1548 (gtaA) ~
polypeptide (AA
1-673) (Bacillus subtilis) ~

________,____ ,_______,____-__ ,________________, ____________________________________________________________________________,__ _____ _+________-,_________, 71 Q25 e3058730360 ~ ORF

i~606028 414 ( l g _o ~ ~ SO 228 ; 71 ~
Genep ot suggests frameshift near start but none found (Escherichia coli) ,________,____ ,_______,_____.._ ,________________,_____________________________________________________________ __________ _____,________,_________y___ ______, 72 ( ~ ~ ~gi~580835lysine ~
~ 48 1491 ________6 5239 6729 ,________________decarboxylase 71 ~
,____ ,_______,_______ [Bacillus subCilis]
,__________________ __________________________________________________________ ,________,_________,_________, 72 Q14 A t2878 ~gi~624085similar 71 ~ 54 B88 1991 to ~

rat beta-alanine synthetase encoded by GenBank Accession Number S27881; ~
contains ATP/GTP
binding motif [Paramecium bursaria Chlorella I
virus 1]

________,____ ,_______,_______ ,________________,_____________________________________________________________ _______________ ,________i_________, _________, 73 ~11 ~ ~ ~gi~1906594~PN1 ~
~ 42 237 7269 70J3 IRattus 71 ~
norvegicusl ________,____ ,_______,_______ ,________________,_____________________________________________________________ _______________,_______ _,_________,_________, 74 ~ 10385~ ~gi~1573733~prolyl-tRNA
~ ~ 52 1869 ________6 ,_______8517 ,________________synthetase 71 ~
,____ f_______ (pros) [Haemophilus influenzae) ,_____________________________________ __________________________________ _____,_______ _,_________,_________, B1 ~ ~ ~ ~gi~147404~mannose ~ ~ 45 807 ________9 5772 6578 ,________________permease 71 ~
,____ ,_______,_______ subunit II-H-Han (Escherichia coli]
,________________________________ _______________________________________ _____,_______ _,_________,_________i 86 ~ ~ ~ ~gnl~PID~e322063~ss-1,4-galactosyltransferase ~ ~ 53 999 4602 3604 (Streptococcus 71 ~
pneumoniae]

,________,____ ,_______,_______ ,________________,_____________________________________________________________ _______________i_______ _,_________i_________, 105 ~ ~ ~ ~gi~2323341~(AF0144601 9 3619 4707 PepQ

(Streptococcus mutans]
~

( 10B9 ,________,____ ,_______y_______ ,________________,____________________________ ~

________________________________________________,_______ _,_________f_________, 106 Q13 A355712955 ~gi~1519287~LemA

[Listeria monocytogenes]
~

~ 603 ,________,____ ,_______,_______ ,________________,__________~___________________________ ~

114 | 2 | 1029 | 1979 | gi|310303 | mosA [Rhizobium meliloti] | 71 | 55 | 951
__ ( ~ ~ ~gi~1649037~glutamine _ 2 564 1205 transport ( ATP-bindin 122 rotein GLNQ
[S
l ll hi i g ~ ~ 50 642 p 71 ~
mone a a typ mur um) ________,____ ,_______,_______ ,________________,_________________-_ __,________,_________, _________, 132 5 9018 7063 gnl~pID~d102049_ ' ____________________ H
influenzae hypothetical ABC
trans orter (974) [B
ill i i i i i i 71 i S1 . i p ;
ac us subtilis) ,________y____ ,_______,_______ ,________________,___________________ _____f________,_________,_________, 1~
___________________________________________________ 140 ~ ~ ~ ~gi~1673788~(AE000015) ar 71 ~ 49 915 1 114l 227 Mycoplasma ~
~
pneumoniae) fructose-bisphosphate aldolase;
simil to Swiss-Prot Accession Number P13243, from B.
subtilis (Mycoplasma pneumoniae]

,________,____ ,_______,_______ ,________________y____________________________________________________ _____,________,_________,_________, (/~
Z40 ~ ~ ~ ~gnl~PID~d100964_____________ f 71 5 5635 4973 homologue ~

of hypothetical protein in a rapamycin synthesis ene cluster g o ~ 663 8 ~

Streptomyces hygroscopicus [Bacillus subtilisl ________,____ ,_______,_______ ,________________,_____________________________________________________________ __________ _____,________~_________,_________, 141 ~ ~ ~ ~gnl~PID~d102005~/AB001488) ~ 71 ~ S1 ~D

UNxNOWN, SIHILAR
PRODUCT
IN
E
COLT
AND
MYCOPLASHA

. ~ 77 PNEUMONIAE. I I ' ( pp (Bacillus subtilis]

________,____a_______ ,_______ ,________________,_____________________________________________________________ __________ _____;________ ,_________,_________, TABLE 2 S. pneumoniae - Putative coding regions of novel proteins; similar to known proteins ,________,____ ,_______,_______,________________,_____________________________________________ _______________________________,________,_________,_________, ( IORF( ( ( ( ~ 1 ~
t length ContigIID StartStopmatch match sim ident ( I ( 1 I gene ( (nt) Ib (nt)(nt)acession name I I

________,____ ,_______,_______,_______________ _,________________________________________________________________________ ____i_______ _,_________,_________, ( ( ( (ribosomal ( ( 59 pp 193 1 165 protein 71 ( 165 ( ,_______(9i(46912 L13 ____y_______ I
1 ( ,_______,_______________ [Staphylococcus 1 _~_________,_________, ________,____ 22051 carnosus) _,________________________________________________________________________ 194 (g11535351 (CodY
I
1 [Bacillus 3 subtilis) ,________,____,_______,_______,_______________ _,____________________________________________________________________....__ ____,_______ _f_________,_________, Hr 1 1 ( 1 (9i12182574 ((AE000090) 199 3 15101319 Y4pE 71 [Rhizobium I
sp.
NGR234) ,________,____,_______,_______,_______________ _,________________________________________________________________________ ____,_______ _,_________,_.________, ( ( ( ( (9i(1787378 ((AE000213) ( ( 57 :208 2 26163752 hypothetical 71 ( 1137 protein ( in purB
5' region (Escherichia cola]

________,____,_______~_______,_______________ _,________________________________________________________________________ ____,_______ _,_________,_________, 1 ( 1 ( (g1141432 IfepC

209 2 20221141 gene 71 ( 8g2 product 1 (Escherichia coli) ________,____,_______,_______,_______________ _,________________________________________________________________________ ____,_______ _,_________,_________, ( 1 1 1 (9i149316 IORF2 210 S 19113071 gene 71 product ( [Bacillus subtilis) ,________~____,_______,_______,__________..____ _y___________________________________..____________________________________ ____,_______ _,_________,_____ ( 1 1 1 19i(580900 (ORF3 ( ( 48 210 6 3069l386 gene 71 ( 318 product ( lBacillus subtilis) ,________,____,_______+_______,_______________ _,___________________.,.________________________--__________________________ ____,_______ _,_________,__._______, ( ( ( 1 Iribonucleotide 1 ( 53 212 2 35611381 reductase ~________,____p_______19i(557567 R1 ____,_______ I
( ( 1 ,_______,__________..____ subunit ( _,______..__,_________, 2l3 3 20031 [Mycobacterium 2920 tuberculosis) 1 Ignl(PID(d101320 _~________________________________________________________________________ I
IYqgR
[Bacillus subtilis) O
,________,____,_______,_______,________________t_______________________________ _________________________________________ ____,_______ _,_________,_________, N
( 1 1 1 71 24d 1 13 1053 I ( ( (gnllPID(d100964 I ~
homologue of aspartokinase alpha and beta subunits LysC

of B.

subtilis ( I

I

[Bacillus subtilis) ___..__,____,_______,_______,_______________ _,____________________________________.______________________________-_____ ____,_______ _,_________,_________, w.
1 ( ( 1 lunknown ( 2S1 2 100B1874 [Bacillus 19i(755601 subtilis) ,________,____v_______?_______,_______________ _,________________________________________________________________________ ____y_______ _~_________,_________, O

1 ( ( ( ( ( 2A2 2 906 712 71 ( ,________,____,_______(9i(1353874 ____,_______ 1 I ( ( lunknown I
_,_________,_________, J12 4 2137[Rhodobacter capsulatusl ( 573 ,_______,________________,_____________________________________________________ ___________________ I
( IgnlIPiDId102245 ([A80055541 yxbF

(Bacillus subtllis) ,________,____,_______,_______,_______________ _,___________________________________..____________________________________ ____,_______ _,_________,_________, ( 1 ( 1 19i11591045 (hypothetical 1 ( 4A o 338 1 3 683 protein 71 ( 681 (SP:P31466) ( [Methanococcus jannaschiij ,________,____,_______,_______,_______________ _,________________________________________________________________________ ____y_______ _,_________,_________;

( ( 1 1 (9i(1591234 ( 1 36 346 1 3 164 (hypothetical 71 ( 162 ________,____,_______,______protein ____,_______ ( (SP:P42297) _,_________,_________, [Methanococcus jannaschii) _,________________,____________________________________________________________ ____________ ( ( ( ( 19i1397526 (clumping ( ( 23 374 1 619 2 factor 71 ( 6l8 [Staphylococcus I

aureus) ,________,____,_______,_______,_______________ _,________________________________________________________________________ ____,_______ _,_________,_________, I ( ( ( 1g1(397526 (clumping 1 ( 23 377 1 6A8 2 factor 71 ( 687 [Staphylococcus I

aureus( ,________,____,_______,_______;_______________ _,________________________________________________________.._______________ ____,_______ _,_________,_________, ( ( I 1 (gnllpID(e269486 (Unknown ( ( 42 3 8 741969S8 [8aci11us subtilis) ( ________,____,_______,_______,________________,________________________________ ________________________________________ ____,_______ _,_________,_________, ( I10 ( ( ( I

3 83959075 70 ( IgnlIPID(e255543 I

(putative iron dependant repressor [Staphylococcus epidermidis) ,________,____,_______,_______,________________,_______________________________ _________________________________________ ____,_______ _,_________,_________, ( 114 (11024110254 1 7 IgnIIPIDId100290 lundefined ( open reading frame [Bacillus stearothermophilus) ,________,____,_______,_______~ ________________, ____________________________________________________________________________,__ _____ _,_________,_________, ( (18 I14213(13719 gnl(PIDId101090 biotin carboxyl caariez protein of acetyl-CoA( 56 7 I ( carboxylase [Synechocystis 70 i I I ( ~
'b sp.] ~

________,____,_______I
____,_______ ' 1 1 ( I ( _ ______ 9 2 1057I 70 _ ________,____,_______,_______,________________,________________________________ ________________________________________ ____,_______ ( 52 ( 1 ( 1 ( ' 12 4 2610287 70 ' IgnllpID(d100581 _,_________!_________I
lunknown ( 52 (Bacillus ( 822 subtilis) ( ;_______,________________,_____________________________________________________ ___________________ ( (gnl(PIO(d101195 (yycJ

[Bacillus subtilis) ,________,____+_______,_______,________________,_______________________________ _____________________________________________,________,_________,_______.._~

( 1 I ( ,________+____,_______(9i12293447 IIAF008930) 1 I13 (10955ATPase 22 ,____,_______[Bacillus ,________ subtilis) I

( ( ~_______,________________,______________________________________________.._____ ______________________ ____ _____ __,__ __,__ __~_________, 19i(1165295 IYdr540cp [Saccharomyces cecevisiael I

( ,_______,________________,_____________________________________________________ _______________________,________,_________,_________, 1 ( ( ( (9i(39478 (ATP
( ( 51 30 6 93153980 binding 70 ( 336 protein ( of transport ATPases [Bacillus firmusl ,________,____,_______,_______,________________ ,____________________________________________________________________________,_ _______,_________,_________, TABLE 2 S. pneumoniae - Putative coding regions of novel proteins s~imilar to known proteins ________, ____,_______,_______, ________________1 ____________________________________________________________________________,__ ______,___ ______,_________, C h i J JORF J J J J J
identlength ont StartStop matc match ! J J
g gene sim name 8 .

J JID ~ J ~ J ~

ID (ntl [nt) acession J ( [nt) J

________,____ ,_______,______ _,_______________ _+________________________________________________________________________ ____,________,___ ______,_________, J J J ~ JgiJ662792 Jsingle-stranded J J
binding J
protein [unidentified eubacterium) ,________1____ ,_______,______ _,_______________ _+________________________________________________________________________ ____,________~___ ______,_________1 J J15 J10639~ JgiJ1161219 Jhomolgous 33 9521 to 70 J
D-amino ~
acid dehydrogenase enzyme [PSeudomonas aeruginosa) ,________,____ ,_______,______ _,_______________ _~________________________________________________________________________ ____,________,___ ______~_________, J ~ ~ J JgiJ2058547 JComYD
~ 48 501 38 6 3812 43I2 (Streptococcus 70 ~ J
gordonii] ~

38 | 25 | 17986 | 18477 | gi|537033 | ORF_f356 [Escherichia coli] | 70 | 58 | 492

1________1____ 1_______,______ _~_______________ _1________________________________________________________________________ ____1________1___ ______4_________t ( J13 J11054J Jg1J1173516 Jriboflavin-specific 40 9846 deaminase [Actinobacillus J

pleuropneumoniael ,________,____ ,_______1______ _,_______________ _,________________________________________________________________________ ____,________1___ ______y_________1 J ~ J J JgiJ1146183 Jputative 42 2 722 1954 [Bacillus subtilisl J

,________1____ 1_______,______ _,_______________ _,_____________________________________________________' ____1________1___ ______y_________y _________________ J J J J JgiJ1591493 Jglutamine 43 3 2373 1612 transport ATP-binding J

protein Q
[Methanococcus jannaschii) 1________p____ 1_______,______ _,_______________ _1________________________________________________________________________ ____,________,___ ______,_________~

J J J J JgnIJPIDJd102036 (subunit 45 8 9197 8049 of 70 J J
ADP-glucose J

pyrophosphorylase [Bacillus stearothermophilus]

1________,____ p______,______ _1_______________ _1________________________________________________________________________ ____y________1___ ______1________,_, J J J J JgnIJPIDJd100302 Jneopullulanase 59 2 S67 956 )Bacillus sp.J J

(________,____ ,_______,______ _,_______________ _,________________________________________________________________________ ____,________,___ ______,_________, o J ~ ( J JgnIJPIDJe276466 Jaminopeptidase J J
[Lactococcus J

lactls) 1________,____ ,_______1______ _,_______________ _,________________________________________________________________________ ____,________ _ ______1_________1 N

, __ J J J J JgnIJPIDJe275074 JSNF

61 4 5553 2437 [Bacillus cereus) J J J

________,____ ,_______,______ _,_______________ _,________________________________________________________________________ ____,________,___ ______,_________, J J J J JgiJ1573037 Jcystathionine 61 7 7914 6802 gamma-synthase (metB) J
INaemophilus influenzael o ,________,____ ,_______,__-___ _,_______________ _,________________________________________________________________________ ____,________y___ ______,_________, 63 J J ( JgnIJPIDJd100974 Junknown 7 5372 7222 [Bacillus subtilis/ J

________,____ ,_______,______ _,_______________ _,______________________________________________ _____ ____ _ __ ____ ____ J ~ J ~ JgiJ1263014 __ ____ __1_ _ ~o 68 7 7126 6962 ______________ _ 37 ___, Jemm18.1 __,__ J 165 gene __y__ ~
product ~
[Streptococcus 70 pyogenes) J

________,____ ,_______,______ _,_______________ _i________________________________________________________________________ ____1________t___ ______f_________y 72 J12 J10081J10911 JgiJ2313093 J[AE000524) i) 6 carboxynorspermidine ~

decarbox 70 lase (ns C) [Heli b t l y ( J 831 p 5 J
co ac er py or ________,____ ,_______,______ _,_______________ _,________________________________________________________________________ ____,________1___ ______,_________1 J J10 J J JgiJ1877423 Jgalactose-1-P-uridyl 75 7888 B124 transferase [Streptococcus J

mutansl ________i____ ,_______1______ _1_______________ _,________________________________________________________________________ _.__,________1___ ______,_________~

J J J J JgiJ39881 JORF

J J
[AA J

(Bacillus subtilis) ________,____ ,_______,______ _,_______________ _~________________________________________________________________________ ____,________1___ ______,_________, J J10 J J JgnIJPIDJe323506 Jputative 87 9369 7324 Pkn2 protein J
(Bacillus subtilis) ________,____ ,_______,______ _,_______________ _a__________~_____________________________________________________________ ____,________,___ ______,_________, J J14 J10640J11788 JgiJ1573209 JtRNA-guanine 96 transglycosylase (tgt) ( [Maemophilus influenzae) ________1____ 1_______,______ _,_______________ _,________________________________________________________________________ ____1________,___ ______1_________1 J J J J JgiJ433630 JA180 ( 59 5I3 113 2 574 1086 [Saccharomyces 70 J ( cerevisiael J

,________,____ ,_______,______ _1_______________ _+__________________________-_________________________________-___________ ____,________y___ ______1_______-_~

J J J J JgnIJPIDJd100585 Junknown 123 5 290l 346i [Bacillus subtilis) J

1________,____ 1_______,______ _,________________i____________________________________________________________ ____________ ____,________~___ ______1_________, J J J J JgnIJPIDJe276974 Jcapacitative J 35 312 "d 125 5 4593 4282 calcium entry J
channel (Bos taurus]

________,____ v_______,______ _,_______________ _,________________________________________________________________________ ____,________~___ ______,_________, J J J J JgnIJPIDJd101314 JYqeT

129 S 4500 3454 [Bacillus 70 ( J
subtilisl J

1________,____ 1_______1______ _t________________1____________________________________________________________ ____________ ____,________y___ ______1_________~

J J J J JgiJ2293312 J(AF008220) ( 50 1215 133 3 2608 1394 YtfP

[Bacillus ( subtilisl 1________,____ ,_______r______ _1________________1____________________________________________________________ ____________ ____1________,___ ______1_________i J J J J JgnIJPIDJe265530 JyorfE

135 1 420 662 [Streptococcus pneumoniae) J

,________,____ ,_______,_______,_______________ _,_______________._________________________________________----________________,--______,_________,______---) ~O
~

J J J J JgiJ472919 (v-type 137 3 438 932 Na-ATPase iEnterococcus J

hirae!

________,____ ~_______,______ _,________________1____________________________________________________________ ________________,________f_________1_________~

J J J J (giJ147336 Jtransmembrane 138 1 440 3 protein (Escherichia J

coli) ________1____ v_______,______ _,_______________ _,________________________________________________________________________ ____,________1_________f_________~

S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)

,________,____;_______,_______ ,________________;_____________________________________________________________ _______________,________,_________i_________ pp i i16i18796i16364igi1976441 NS-methyltetrahydrofolate homocysteine ( 53 24l3 y.~.
140 methyltransfezase (Saccharomycea 70 i cerevisiae] ~ ~
I ~

,________,___________y_______, ________________,______________________________________________________________ ______________,________,_________i_________+ W

( I10( I Igi(149535 ID-alanine activating enzyme (4actobacillus( 70 I
I67 8263 6695 casei) 52 I

I

________,___________,_______,__________________________________________________ ____________________________ ______________,________,_________4_________ ( ( ( ( (gnl(PID(d102049 (E. cola hypothetical protein:
P31805( 70 ( 204 4 3Z26 2747 l267) (Bacillus subtilisl 51 ( 4B0 ( ,____-___,____ ,_______,_______y._______________,______________~________________________..____ __________________ ______________,________,_________,_____--__, ( I ( ( IgnlIPID(e309213 (racGAP [Dictyostelium discoideum]
( 70 I
207 3 2627 2869 95 ( 243 ( ________,____,_______,_______,________________,________________________________ ______________________________ ______________,________,_________,_________ ( I I ( (9i11353874 lunknown [Rhodobacter capsulatus] I 70 ( 255 ( ________,____,_______,_______,________________ ,_______________________..____________________________________________________, ________,_________,_________, 6 I21(1755418453IgnlIPTDIe233879 (hypothetical protein [Bacillus subtilis]I 69 I

________,____,_______,_______,________________,________________________________ ______________________________ ______________,_________________,_________, ( (22(18482(19471(9i(580883 (ipa-88d gene product (Bacillus subtilis]( 69 ( 53 ( 990 ( (____________,_______,_______,________________,________________________________ ______________________________ ____________ ____ _____ __ __,__ __+__ __,___ ____, I ( I' I (9i12209379 I[AF006720) ProJ (Bacillus subtilisl ( 69 ( y 22 6 4682 5824 IB ( ( .._______,____,_______,_______,________________,_______________________________ _______________________________ ______________,________,_________,_________, ( I ( ( IgnLIPIDId100580 (unknown (Bacillus subtilis]
I 69 ( I N

,________,____,_______y_______,________________,_______________________________ _______________________________ ______________,________,_________,_________, N

( I12( 110767IgnIIPIDId100581 (unknown [Bacillus subtilis]
I 69 ( w.

________,____,_______,_______,________________,________________________________ ______________________________ ______________,________,_________f_________ ( I I ( IgnllPiDId102012 I(AB001488) FUNCTION UNKNOWN.
[BacillusI 69 I N
27 7 58S7 5348 aubtilis] 28 ( 510 ( ,________,___________,_______,_______-____________.._________________________________________________________ ______________y________+__________________, ( (10~ (101t6(gi(43791b (isoleucyl-tRNA synthetase (Staphylococcus( 69 ( 36 7294 aureus] 53 ( ( ________,____,_______,_______,________________,________________________________ ______________________________ ______________,________,__________________, ( I ( ( (9i1141900 lalcohol dehydrogenase [EC 1.1.1.1) ( 69 I
l8 1 2 1090 (Alcaligenes eutrophus] 1B ( l089 ( ,________,____,_______,_______,_..______________ ,____________________________________________--______________________________________+_________,_________, n 40 111(11333I11944(9i(1573280 (Holliday junction DNA helicase (ruvA)I 69 I o (Haemophilus influenzae[ 44 I

________,____,____..__,_______,________________ ,____________________________________________________________________________,_ _______,__________________, I I15(11942I12517( (DNA-3-meth 69 40 i11573653 ladenin l osidase I (t I) [H
hil fl i 9 y ael I
N
e g I 50 ( yc 576 I
ag aemop us n uenz ________,____,_______,_______,__________.._____ f____________________________________________________________________________,_ _______,__________________, ~O

( ( ( ( (9i1580887 (starch (bacterial glycogen) synthaseI b9 ( 45 6 6917 S490 [Bacillus subtilis) 47 ( I

________,____,_______,_______,________________,________________________________ ______________________________ ______________,________,_________,_________, I I34124932124153IgnIIPIDIe233870 (hypothetical protein [Bacillus subtilis)( 69 I
48 36 ( 7B0 ( ________,____,_______,_______,__________.._____,_______________________________ _______________________________ ______________,________,_________,_________, I ( I I Igi1396297 laimllar to'phosphotransferase systemla] I

49 6 6183 6521 enzyme II (Escherichia co 69 ( ,________,____~_______,_______,________________,_______________________________ _______________________________ ______________,________,_________,_________, I I 7586 I 19i(396420 (similar to Alcaligenes eutrophus epimerase69 753 49 B i 8338 pHGI D-ribulose-5-phosphate 3 I ( I

( I I I I I [Escherichia coli) I I
I
I

________,___________,_______________________ ,____________________________________________________________________________,_ ___..___,_________, _________, I I I I (9i11146238 (poly[A) polymerase [Bacillus subtilis)I

55 6 8262 7033 50 ( I

,________,____,_______,_______,________________ ,____________________________________________________________________________y_ _______,_________,_________a I ( I I IgnlIPIpIe313038 (hypothetical protein [Bacillus aubtilis]I 69 I

I

________,____,_______,_______,________________ ,____________________________________________________________________________,_ _______ _ _____ ___ _ __y__ ____, 62 3 1170 1418Ignl I yp P yn Y p I 69 I

I ( I ) PID h othetical rotein [S echoc stis s 49 I

d101915 _] 249 I

I

I

,________+____,_______,_______,________________ ,____________________________________________________________________________,_ _______,_________,_________, ( ( ( ( (9i(293017 IORF3 (put.); putative (Lactocaccus ( 69 ( 63 8 7298 7762 lactis] 42 ( 465 ( ,________,____,_______y_______,________________ ,___________________________________ y _ ( I I I 19i1153755 ___ ______________,________4_________,_________, 66 4 3657 5081 _______________________ rem Iphospho-beta-D-galactosidase (EC ris] I

85) [Lactococcus lactis . c . o . ( ________,____,_______,_______,________________ ,____________________________________________________________________________,_ _______,_________v_________ ( ( I I (9i1433809 lenzyme II [Streptococcus mutans] 69 I
I

I

,________,____,_______,______ _,________________ ,____________________________________________________________________________,_ _______,__________________~

71 | 6 | 10017 | 10664 | gnl|PID|e322063 | beta-1,4-galactosyltransferase [Streptococcus pneumoniae] | 69 | 39 |

TABLE 2. S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)

71 | 21 | 27730 | 27966 | gnl|PID|d100649 | DE-cadherin [Drosophila melanogaster] | 69 | 30 | 237
77 | 1 | 1 | 237 | gi|287870 | groES gene product [Lactococcus lactis] | 69 | 44 | 237

B1 ~ ~ ~ ~gi~1573605~fucose operon protein (fucUl IHaemophilus~ 69 j 3622 4101 influenzae] ~ 180 j ;________,____;_______,_______;________________;_______________________________ _____________________________________________;________,_________,_________;

83 | 1 | 40 | 714 | pir|C33496|C334 | hisC homolog - Bacillus subtilis | 69 | 46 | 675

B3 j16A A ~gi~143372jphosphoribosyl glycinamide formyltransfecaselus subtilis]
5742 6335 (PUR-NI IHacil ~ 69 ( 16 ~ 594 ;________;____ ,_______,_______;________________,_____________________________________________ _______________________________;_______,;_________;_________, j ~ ~ ~ ]9i]194097jIFN-response element binding factorj 69 j 48 85 2 121Z 916 1 (MUS musculus] j 297 j ,________;____,_______,_______;________________,_______________________________ ______________________~______________________,________;,________;______.__;

91 ~ j ~ jgi~1574712anaerobic ribonuleoside-triphosphaterotein (nrdG)69 44 597 5 3678 4274 reductase activating p j ~

j j j ~ j j [Haemophilus influenzae] j ' j j ________,____;_______;_______;________________,________________________________ __________________________________________ ____ _____ __,__ __,__ __;_________, 98 ~ (~ ~ jgnl~PIDjd100262jLivF protein (Salmonella typhimurium)( 69 j 5 3247 4032 j 786 ( ;________,____;_______,_______,________________f_______________________________ _____________________________~_______________;________,_________;_________, 10A ~ ~ ~ jgnl~PID~e257629~transcrlption factor [Lactococcus ~ 69 j 49 5 4085 50S6 lactis] ~ 972 j ~________;____,_______,_______,________________,_______________________________ _____________________________________________;________;_________,_________, N

126 j ~ ~ ~9nljPID~d101329~YqjJ [Bacillus subtilisl ~ 69 ~ 19 "1 3 3078 4568 ~ 1d91 ~

(________,____,_______;_______,_______________ _______ _______________ _;_____________________________________________________,________;____,____;____ _____;

j j j ~ ~gnljPIDjd10131d(YqeR (Bacillus subtilisl ' 69 j 4?
N
l31 6 4121 2889 j 12JJ

136 | 2 | 1505 | 2299 | gnl|PID|d100581 | unknown [Bacillus subtilis] | 69 | 47 | 795
149 | 5 | 3852 | 4763 | gnl|PID|e323525 | YloQ protein [Bacillus subtilis] | 69 | 50 | 912
149 | 12 | 9336 | 10655 | gi|151571 | Homology with E.coli and P.aeruginosa lysA gene; product of unknown function; putative [Pseudomonas syringae] | | |

153 ~ j ~ ~g1~1710373jBrnQ (Bacillus subtilis]

4 3191 3B29 j ~ 69 j 44 j 639 ,________,____,_______;_______;_..______________,______________________________ ______________________________________________,________,_________,_________, j j ~ ~ (gnljPIDjd100582temperature sensitive cell divisionj 69 j 49 169 3 849 2l24 (Bacillus subtilis] j 14T6 j ________,____,_______,_______,________________,________________________________ ____________________________________________,________i_________,_________, 180 ~ ~ ~ ~gij488339]alpha-amylase [unidentified cloningj 69 ~ 50 1 566 3 vector) j 564 ;________,____;_______s_______;________________;_______________________________ _____________________________________________,________,_________,_________, 212 j ( ~ jgi~1395209jribonucleotide reductase R2-2 smalltuberculosis) I 1196 231 subunit [Mycobacterium j 69 ~

53 j 966 ;________,____,_______;_______y________________;_______________________________ _____________________________________________,________,_________,_________, J j J j ~pirjJQ2285jJQ22jnodulin-26 - soybean j 69 ( 41 226 1 2 66t ( 660 ________;____,_______;_______,________________;________________________________ ___________________________ _________________4________,_________;_________, j j j j ]91j472918~v-type Na-ATPase [enterococcus j 69 j 56 233 5 3249 4766 hirae] j 1518 j ________.____,_______,_______,________________.________________________________ ____________________________________________;________,_________,_________, j ~ j j ~gi~148945~methylase [Haemophilus influenzae]( 69 ~ 43 235 3 660 1766 j 1107 ,________,____,_______,_______;________________,_______________________________ _____________________________________________,________;_________,_________, 243 ~ j ~ jgnl~PID~d100225~OHFS [Barley yellow dwarf virus] j 69 ~ 69 ________2 865 2361_;________________;__________________________ ~ 1497 ;____,_______;______ ____ ..__ ___ ___ _ _ _________________;________,_________;_________, _ __ __ __ ___________ j j j ~ ~gi~2289231jmacrolide-efflux protein [Streptococcusj 69 j 51 2S1 3 2899 1967 agalactiae] j 933 j (________,____,_______;_______,________________,_______________________________ _____________________________________________;________;_________;_________;
310 | 1 | 1 | 282 | gnl|PID|e322442 | peptide deformylase [Clostridium beijerinckii] | 69 | 55 | 282

369 | 1 | 868 | 2 | gi|397526 | clumping factor [Staphylococcus aureus] | 69 | 22 | 867
370 | 1 | 749 | 3 | gi|397526 | clumping factor [Staphylococcus aureus] | 69 | 21 | 747

379 | 1 | 44 | 280 | gnl|PID|d100649 | DE-cadherin [Drosophila melanogaster] | 69 | 30 | 237

388 | 1 | 260 | 72 | gi|1787524 | (AE000225) hypothetical 32.7 kD protein in trpL-btuR intergenic region [Escherichia coli] | 69 | |

1 | 2 | 2006 | 3040 | gnl|PID|d101809 | ABC transporter [Synechocystis sp.] | 68 | 43 | 1035
12 | 5 | 3958 | 2600 | gi|2182992 | histidine kinase [Lactococcus lactis cremoris] | 68 | 45 | 1359
15 | 2 | 1790 | 1311 | pir|S16974|R5BS | ribosomal protein L9 - Bacillus stearothermophilus | 68 | 56 | 480

16 | 6 | 7353 | 5701 | gi|1787041 | (AE000184) o530; This 530 aa orf is 33 pct identical (14 gaps) to 525 residues of an approx. 690 aa protein YHES_HAEIN SW: P44808 [Escherichia coli] | 68 | |

17 | 12 | 6479 | 6805 | gi|553165 | acetylcholinesterase [Homo sapiens] | 68 | 68 | 327

I 11II14128I14505(911i42700 ve [Bacillus 20 IP competence subtilis) protein I 68 1 10 (ttg start I 378 I

codon) (put l: putati .

;______.._;____y_______;_______;________________;______________________________ _______________________ _______________________;________;_________;_________; N

22 | 32 | 24612 | 25397 | gi|289262 | comE ORF3 [Bacillus subtilis] | 68 | 36 | 786

30 | 7 | 4548 | 4288 | gi|311388 | ORF1 [Azorhizobium caulinodans] | 68 | 46 | 261

36 | 5 | 3911 | 4585 | gi|1573041 | hypothetical [Haemophilus influenzae] | 68 | 54 | 675
~1 I I I I (91I1790131(AE0004461 hypothetical 29.7 intergenic 68 46 6 5219 6040 kD protein in ibpA-gyrB region i i i ;
1 I I I I ~ [ESCherichia coli) y________;____;_______y_______;________________;_______________________________ ______________________ _______________________;________;_________;_________; O

54 | 10 | 6235 | 7086 | gi|882579 | CG Site No. 29739 [Escherichia coli] | 68 | 55 | 852
55 | 5 | 7069 | 5165 | gnl|PID|d101914 | ABC transporter [Synechocystis sp.] | 68 | 45 | 1905
71 | 3 | 6134 | 5613 | gi|1573353 | outer membrane integrity protein (tolA) [Haemophilus influenzae] | 68 | 50 | 522
71 | 10 | 15342 | 16613 | gi|580866 | ipa-12d gene product [Bacillus subtilis] | 68 | 31 | 1272

I I12117560118792(g1144073 I 68 I 35 i1 ISecY protein I 1237 I

(Lactococcus lactic) ________;____y_______y_______;________________;________________________________ _____________________ _______________________y_,______;_________;_________;

71 | 17 | 22295 | 24703 | gi|1762349 | involved in protein export [Bacillus subtilis] | 68 | 50 | 2409
73 | 16 | 10208 | 9729 | gi|1353537 | dUTPase [Bacteriophage r1t] | 68 | 51 | 480
86 | 18 | 17198 | 16011 | gi|413943 | ipa-19d gene product [Bacillus subtilis] | 68 | 53 | 1188
87 | 17 | 17491 | 15866 | gi|150209 | ORF 1 [Mycoplasma mycoides] | 68 | 43 | 1626

89 | 6 | 5139 | 4354 | gi|1498824 | M. jannaschii predicted coding region [Methanococcus jannaschii] | 68 | 40 | 786
89 | 11 | 8021 | 8242 | gi|150974 | 4-oxalocrotonate tautomerase [Pseudomonas putida] | 68 | 43 | 222

1 ( I 1 (91I2367358tAE000491) hypothetical 52.9 97 8 675S 5J94I kD intergenic 68 protein in aid8-rpsF region I I I 1 I I I I I ( tEscherichia coli) 1 ( I I
________;____;_______;_______,________________;________________________________ ____________________________________________;________;_________;_________;


98 ~ ~ ~ ~gnl~PID~d100261~LivA protein (Salmonella typhimurium) ~

~

~

a________>____>_______ >_______>_______________ _,____________________________________________________________________________>
________>_________>_________> a0 99 Q13A A7280(gi~455363regulatory protein (Streptococcus mutans) , ~

~

rr ,________,____>_______,_______>________________>_______________________________ _____________________________________________y________y_________>_________>

115 | 3 | 5054 | 3693 | gi|466479 | cellobiose phosphotransferase enzyme II [Bacillus stearothermophilus] | 68 | 44 | 1362

124 ~ ( ( ~gnl~PID~d100702(cutl4 protein (Schizosaccharomyces pombe)~

~

~

>________>____>_______a_______,________________>_______________________________ _________________________________-___________a________>_________>______.__>

125 ~ ~ ~ ~gi~450566(transmembrane protein [Bacillus subtilia)~

~

~

,________,____>_______,_______,________________>______________.._______________ ______________________________________________~________>_________>_________>

132 ( ( , ~gnL,PID~d101732(ONA ligase (Synechocystis sp.) ~

~

~

>________,____>_______>_______>________________>_______________________________ _____________________~______________________i________y_________y_________>

140 ~ ~ ~ ~gi~1209711unknown [Saccharomyces cerevisiaej ~

~

~

>________,____>_______a_______>________________,_______________________________ _______________________________.,_____________>________>_________~_________>

L50 ~ ~ ~ ~gi~402490ADP-ribosylarg(nine hydrolase (ltus musculusl( ~

' __-_____a____t_______,_______,________________>___________________________________ _________________________________________>________>_________>_________>

164 ~ ~ ~ ~gnI~PID~e255114glutamate racemase [Bacillus subtllis) ~
o ~

~

~

,________>____>_______,_______,________________>_______________________________ _____________________________________________y________>_________>_________, N

( ( ( ~ ~gnI~PID~e255117(hypothetical protein (Bacillus subtilisl ~
J

~

~

~

,________>____>_______,_______>________________,__________________________ _ ____ _ ____ _ _ _ _ ______>________>_________>_________>
169 | 7 | 3946 | 4104 | pir|B54545|B545 | hypothetical protein - Lactococcus lactis subsp. lactis plasmid pSL2 | 68 | 40 | 159

N

170 ~ ~ ~ ~gi~304146spore coat protein (Bacillus subtilis[ ( 4 4247 d396 68 ~

~

f ________>____>_______>_______>________________>________________________________ ____________________________________________a________>_________>_________>

( ~ ~ ~ ~gi~38722 precursor laa -20 to 381) (ACinetobacter ~

171 8 6002 7054 calcoaceticusl 68 ~

~

a________a____>_______,_______>________________>_______________________________ _____________________________________________,________,_________>_________>
vp 198 ~ ( ~ ~gnl~PID~e313075hypothetical protein (Bacillus subtilis) , o ( ~

( ,________>____>_______>_______>________-______.a.____-______________________________________________________________________>________ >_________>____-____>

2I1 ~ ~ ~ ~gt~1439528~EIIC-man [Lactobacillus curvatusl ~
' ~

~

~

>________>____>_______>_______>________________>_______________________________ _____________________________.._______________>________>_________>_________, N

214 8 d926 4231 ( yp ( 9n1 PID H, influenzae h othetical protein; P43990 68 d102049 (1A21 (Bacillus subtilis) ~

~ ~ ' ( , >________,____>_______>_______>________________>_______________________________ _____________________________________________>________>_________>_________>

217 ~ ~ ~ ~gnI~PID~e326966~stmilar to B. wlgaris CBS-associated mitochondria) ~ 36 216 6 4955 5170 ... (reverse 68 ~

~

transcriptase) [Arabidopsis thaliana) J

>________>____>_______>_______>________________>_______________________________ ________..____________________________________>________>_________>
_________>

218 ~ ~ ~ ~gi~2293198~(AFOOB220)'YtgP [Bacillus subtilis) ~

~

~

a________>____>_______,_______>________________>_______________________________ _____________________________________________,________>_________,_________, 220 ~ ~ ~ ~gnl~PID~e325791~(AJ000005) orfl [Bacillus megaterium) ~

~

~

>________>____>_______>_______>________________>_______________________________ _____________________________________________>________>_________>_________>

236 ~ ~ ~ ~gi~910137~ORFX13 (Bacillus subtilis) ~

~

~

>________>____>_______>_______>________________>_______________________________ _____________________________________________>________>_________>_________>

237 ~ ~ ~ ~gi~396348~homoserlne transsuccinylase [ESCherichia ~
b 2 675 14S1 cola) 68 ~

i ~

,________a____>_______>_______>________________>_______________________________ _____________________________________________>________>_________>_________y 250 ~ ~ ~ ~gi~310859~ORF2 [Synechococcus sp.l ~

~

________,____,_______,_______>________________y________________________________ ____________________________________________>________>_________>_________, 2S4 ~ ~ ~ ~gi~1787105~(AE000189) o648 was o669; This 669 as to 68 1 5I7 1S5 orf is 40 pct identical I1 gaps) ~

217 residues of an approx. 232 as protein ~
YBBA_HAE1N SW: P45247 (Escherichia cola) r.
,________>____>_______>_______>________________>_______________________________ _____________________________________________,________>_________>_________>

337 ~ ~ ~ ~ tative orf (Bacillus subtilis) 68 N
1 I 774 nl~PID~e261990 7 g pu ( p ~ p ~

~

>________>____>_______,_______>________________>_______________________________ _____________________________________________>________~_________>_________>
p0 345 ~ ~ ~ ~9i~149513~thymidylate synthase IEC 2.1.1.45) (Lactococcus~

1 3 653 lactis) 68 ~

~

,________>____,_______,_______>________________>_______________________________ _______________________________________ ______y________>_________>________..>


386 | 2 | 417 | 4 | gi|1573353 | outer membrane integrity protein (tolA) [Haemophilus influenzae] | 68 | 51 | 414
2 | 4 | 5722 | 4697 | gi|1592141 | M. jannaschii predicted coding region [Methanococcus jannaschii] | 67 | 26 | 1026
3 | 6 | 5397 | 4591 | gi|2293175 | (AF008220) signal transduction regulator [Bacillus subtilis] | | 44 | 807
 | 2 | 2301 | 574 | gi|2313385 | (AE000547) para-aminobenzoate synthetase (pabB) [Helicobacter pylori] | | 48 | 1728

(________,____,_______,______ _,________________,____________________________________________________________ _____________ ___,________,_________,_________, 6 Q19A A ~gi~413931 ~
6063 6758 ~ipa-7d 67 gene ~

product I1 (Bacillus ~

subtilis) 696 ,________,____,_______,______ _,________________,____________________________________________________________ _____________ ___,________,_________,_________, 22 ~ ~ ~ ~gi~1928962 ~
8 7094 7897 ~pyrroline-5-carboxylate reductase ~

(Actinidia 51 deliciosa] ~

80d ________,____,_______ ,______ _,________________,____________..________________________________________ _ ~__________________ ____ _____ _____ __,__ __,__ __,__ __, 29 Q10~ ~ ~gi~468745 ( 8335 9072 ~gtcR 6T

gene ~

product I1 (Bacillus ~

brevis] 73B

________,____,_______ i______ _,________________,____________________________________________________________ _____________ ___,________y_________,_________, 31 ~ ~ ~ ~gi~2425123 ( 3 1379 585 ~(AF019986) PksB ~

[Dictyostelium 49 discoideuml ~

,________,____~_______ ,______ _,________________,_________-_____________________.________________ __,.._______,_____-___,_________, ______ _ 32 Q11~ A ~gi~42029 ~
B849 0150 ~ORF1 67 gene ~

product 97 [Escherichia ~

col 1302 d ~

' ________,____,_______ ,______ _,________________,____________________________________~_______________________ _____________ ___,________,_________,_________, o 36 Q16A 15546 (gi~1592142 ~ N
4830 (ABC 67 transporter, ( probable 43 ATP-binding ~

subunit 717 (Methanococcus ~

jannaschii]

_ _~_ N
,________,____,_______ ,______ _ _ _ _____ _,________________,_____ _ _________________-___________________________ _ __,__ _ _ ____,_________, J8 ~ I ~ ~gnl~PID~e214803 ~
9 d958 5392 (T22B3 3 ~

(Caenorhabditis 47 eiegans) ~

(________,____,_______,______ _,________________,____________________________________________________________ _____________ ___,________,_________,_________, J

38 21 13775 14512 i N
537037 ~

o216 67 [Escherichia ~

colt) 52 ~9 ( ~ 738 ~ORF ~

_ o ________,____,_______ ,______ _,________________,______________________________________ ___ _ ___ ___,________,_________,__,______, 45 ~ 10428 ( _______ ~
9 9181 _____________________ ~gi~551710 ~

branching 51 entyme ( IgigBl 12I8 (EC

2.4.1.I8) (Bacillus stearothermophilus) ________,____,_______ ,______ _,________________,______________________________________ ____ __ ___ ___,________,_________,_________, ______ ____________________ 48 Q23A 17514 ~gi~413949 ~ ~o 8744 ~ipa-25d 67 gene ~

product 50 (Bacillus ~

subtilis) 831 ~

,________,____,_______ ,______ _a________________,____________________________________________________________ _____________ ___,________,_________,_________, 50 ~ ~ ~ (gnI~PID~d101330 2 1773 952 ~Y

jQ

(Bacillus subtilis]

q ~

~

,________,____,_______ ,______ _,________________y____________________________________________________________ _____________ ___,________,_________,_________, 53 ~ ( ~ ~gi~1574291 e) N
1 431 3 ~fimbrial ~

transcription 67 regulation ~

repressor 10 (pil8) ~

[Haemophilus 429 influenza ~

________,____,_______ ,______ _,________________,____________________________________________________________ _____________ ___,________,_________,_________, 55 Q13A 11946 ~gnl~PID~e252990 ~
2740 (ORF 67 YDL037c ~

[Saccharomyces 51 cerevisiae) ~

________,____,_______ ,______ _,________________,____________________________________________________________ _____________ ___,________,_________,_________, 61 ~ , ~ ~gnl,PID~e264711 ~
9 9210 8329 ,ATP-binding 5i cassette ~

transporter 50 A ~

(Staphylococcus 88Z

aureus]

________,____,_______ ?______ _,________________,__________._________________________________________________ _____________ ___,________+_________,_________, 71 ~ ~ ~ ~gi~1197667 ' 2 561d 6117 ~vitellogenin [Anolis ~

pulchellus) 36 ~

i ________,____,_______ ,______ _,________________,____________________________________________________________ _____________ ___,________,_________,_________ 81 ~ ~ ~ ~gi~1142714 7 4489 9983 ~phosphoenolpyruvate:mannose phosphotransferase element IIB

[Lactobacillus ~ i i ( ~ ~ i ( curvatus) ,________,____,_______ ,______ _,________________y________________________________________ _ ______ __,________;_________, _________4 ___ _____ 83 ~ ~ ~ ~gi~1276746 ~
7 2957 3214 ~ACyl carrier ~

protein 37 (Porphyra ~

purpurea) 2S8 .________,____._______ ,______ _,________________,____________________________________________________________ _____________ ___.________,_________,_________, b 86 ~ ( ~ ~gi~1147744 ~
8 8140 6B09 ~PSR 67 (Enterococcus ( hirael 45 ~

,________,____,_______ ,______ _,________________,______________.._______..___________________________________ _______________ ___f________,_____.~___,_________, 97 ~ ~ ~ ~gnl~PID~d102235 ~ ~
3 986 1366 ~(AB000631( unnamed ~

protein d3 product ~

(Streptococcus 381 mutans]

________,____,_______ ,______ _,________________,____________________________________________________________ _____________ ___,________;_________,_________I C/J

102 ~ ~ ~ ~gi~682765 ~O
1 601 1413 ~mccH

gene product [ESCherichia colt) ~ J

~

~

~

________y____,_______ ,______ _,________________,____________________________________________________________ _____________ ___,________,_________,_________, w ~gi~148921 ~

~LicD 67 protein ( [Haemophilus '43 influenzae) ~

,________f____,_______ ,______ _,________________,____________________________________________________________ ________________,________?_________,_________, 115 ~ ~ ~ ~gi~895750 ~
4 5982 5656 putative cellobiose ~

phosphotransferase 14 enzyme ( (Bacillus subtilis) ,________,____v_______ v______ _,________________,____________________________________________________________ ________________,________,_________,_________?

115 | 7 | 8421 | 8077 | gi|466473 | cellobiose phosphotransferase enzyme II' [Bacillus stearothermophilus] | 67 | |

~

r.
________y____,_______,_______,_______________ _,____________________________________________________________________________, ________,_________,_________, 127 (13~

i ~ ~g transport protein [Escherichia coli] ~ 67 ~14 ~

~

_ ____________________________________________ W
________,____,_-_____,_______,________________ ,_______________ __,________,_________,_________, 136 ~ ~ ( ~gnl~PID~d100581 unknown (Bacillus subtilis]

~ 67 ~

~

,________,____,_______y______ _,________________,____________________________________________________________ ________________ 140 Q21 _ Q23317 __ 20906 _________ ~

nl~PID~d101912 ~

h l l R

l A

h l~

g p 67 eny 43 a ~
any 412 N I
-t I
synt etase (Synechocystis sp.) ,________,____,_______,_______y________________,_______________________________ ____________________________________________ _ _ ________ _________I____?

146 | 6 | 2894 | 1893 | gi|2182994 | histidine kinase [Lactococcus lactis cremoris] | 67 | |

~

,________,____,_______ y_______y________________,_____________________________________________________ ______________________ _,________,_________,_________, 1S1 ~ A ~

B 1117 nl~PID~d100085 11476 ~ORF129 ill g [Bac ( 67 us cereus) ~

~

| 160 | 10 | 7453 | 8646 | gi|2281317 | OrfB; similar to a Streptococcus pneumoniae putative membrane protein encoded by GenBank Accession Number X99400; inactivation of the OrfB gene leads to UV-sensitivity and to decrease of homologous recombination (plasmidic test) [Lactococcus l | 67 | 46 | 1194 |
| 163 | 3 | 3099 | 4505 | gnl|PID|d101317 | YqfR [Bacillus subtilis] | 67 | 47 | |
| 167 | | | | gi|1161933 | DltB [Lactobacillus casei] | | | |
| 169 | 4 | 2322 | 2879 | gnl|PID|d101331 | YqkG [Bacillus subtilis] | 67 | 41 | 558 |
| 171 | 11 | 7656 | 8384 | gi|153841 | pneumococcal surface protein A [Streptococcus pneumoniae] | 67 | 50 | |
| 188 | 3 | 1930 | 3723 | gi|1542975 | AbcB [Thermoanaerobacterium thermosulfurigenes] | 67 | | |
| 189 | 6 | 3599 | 3141 | gnl|PID|e325178 | hypothetical protein [Bacillus subtilis] | 67 | 52 | 459 |
| 205 | | | | gi|606073 | ORF_o169 [Escherichia coli] | 67 | | |
| 207 | 4 | 2896 | 3456 | gi|2276374 | DtxR/iron regulated lipoprotein precursor [Corynebacterium diphtheriae] | 67 | | 561 |
| 217 | 3 | 4086 | 3703 | gi|895750 | putative cellobiose phosphotransferase enzyme III [Bacillus subtilis] | 67 | | |
| 246 | | | | gi|1842438 | unknown [Bacillus subtilis] | 67 | 43 | |
| | | 745 | | | PspA [Streptococcus pneumoniae] | 67 | | |
| 265 | 3 | 1134 | 1811 | gi|2313847 | (AE000585) L-asparaginase II (ansB) [Helicobacter pylori] | 67 | | |
| 295 | 1 | 1 | 375 | gi|2276374 | DtxR/iron regulated lipoprotein precursor [Corynebacterium diphtheriae] | 67 | | |
| 1 | | | | gnl|PID|e255179 | unknown [Mycobacterium tuberculosis] | 66 | | |
| 3 | 1 | 389 | 3 | gnl|PID|e269548 | Unknown [Bacillus subtilis] | 66 | | |
| 3 | 20 | 19267 | 20805 | gi|39956 | IIGlc [Bacillus subtilis] | 66 | 50 | |
| 4 | 3 | 2545 | 2718 | gi|1787564 | (AE000228) phage shock protein C [Escherichia coli] | 66 | | |
| 4 | 9 | 13197 | 12592 | gi|1574291 | fimbrial transcription regulation repressor (pilB) [Haemophilus influenzae] | 66 | 46 | 606 |

TABLE 2. S. pneumoniae - Putative coding regions of novel proteins similar to known proteins
| Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt) |
| 9 | 4 | 2872 | 1451 | gnl|PID|e266928 | unknown [Mycobacterium tuberculosis] | | | |
| | | | | gi|520407 | orf2; start codon [Bacillus thuringiensis] | | 42 | 270 |
| 15 | 12 | 10979 | 9897 | gi|2314738 | (AE000653) translation elongation factor EF-Ts (tsf) [Helicobacter pylori] | | 49 | 1083 |
| 16 | 2 | 1312 | 734 | gnl|PID|d102245 | (AB005554) yxbF [Bacillus subtilis] | | 35 | |
| 22 | 3 | 1372 | 1851 | gi|1480916 | signal peptidase type I [Lactococcus lactis] | | | 480 |
| 22 | 7 | 5828 | 7096 | gnl|PID|e206261 | gamma-glutamyl phosphate reductase [Streptococcus thermophilus] | | 51 | |
| 22 | 20 | 16194 | 17138 | gnl|PID|e281914 | YitL [Bacillus subtilis] | | 50 | |
| | | | | gi|2314379 | (AE000627) transporter, ATP-binding protein (yhcG) [Helicobacter pylori] | | | |
| 32 | 1 | 199 | 984 | gi|312444 | ORF2 [Bacillus caldolyticus] | | | |
| 33 | 13 | 8352 | 7234 | gi|1387979 | 44% identity over … residues with hypothetical protein from Synechocystis sp, accession … CD; expression induced by environmental stress; some similarity to glycosyl transferases; two potential membrane-spanning helices [Bacillus subtil | 66 | 44 | 1119 |
| 34 | 6 | 5658 | 4708 | gnl|PID|e250724 | orf2 [Lactobacillus sake] | | | |
| 34 | 14 | 9792 | 9574 | gi|1590997 | M. jannaschii predicted coding region [Methanococcus jannaschii] | | 48 | |
| 35 | | | | gi|1773352 | Cap… [Staphylococcus aureus] | | 46 | |
| 36 | 9 | 6173 | 6976 | gi|1518680 | minicell-associated protein DivIVA [Bacillus subtilis] | | 35 | |
| 36 | 11 | 10396 | 10824 | bbs|155344 | insulin activator factor, INSAF [human, Pancreatic insulinoma, Peptide Partial, … aa] [Homo sapiens] | | | |
| 48 | 1 | 28 | 1419 | gnl|PID|e325204 | hypothetical protein [Bacillus subtilis] | | 50 | |
| 48 | 7 | 3810 | 4112 | gi|2182574 | (AE000090) Y4pE [Rhizobium sp. NGR234] | 66 | 40 | |
| 52 | 4 | 3595 | 2789 | gi|388565 | major cell-binding factor [Campylobacter jejuni] | | 52 | |
| 54 | 3 | 2662 | 1076 | gnl|PID|d101831 | glutamine-binding periplasmic protein [Synechocystis sp.] | | 43 | |
| 61 | 10 | 9740 | 9183 | gnl|PID|e154144 | mdr gene product [Staphylococcus aureus] | | 44 | |
| 72 | 13 | 10893 | 11993 | gi|2313129 | (AE000526) H. pylori predicted coding region [Helicobacter pylori] | 66 | 44 | 1101 |
| 74 | 9 | 13267 | 12476 | gi|1573991 | hypothetical [Haemophilus influenzae] | | | |
| 75 | 1 | 2 | 868 | gi|1574631 | nicotinamide mononucleotide transporter (pnuC) [Haemophilus influenzae] | | 48 | 867 |
| | | | | gi|41312 | put. repressor protein [Escherichia coli] | | 40 | 1029 |
| 82 | 7 | 6813 | 8123 | gnl|PID|e255128 | trigger factor [Bacillus subtilis] | 66 | 53 | 1311 |
| 83 | 3 | 905 | 1219 | pir|C33496|C339 | hisC homolog - Bacillus subtilis | 66 | 44 | 315 |
| 86 | 10 | 9407 | 8925 | gi|683584 | shikimate kinase [Lactococcus lactis] | 66 | 41 | 483 |
| 88 | 10 | 7001 | 6060 | gi|2098719 | putative fimbrial-associated protein [Actinomyces naeslundii] | 66 | 52 | 942 |
| 89 | 1 | 951 | 4 | gi|410118 | ORFX19 [Bacillus subtilis] | 66 | 41 | 948 |
| 93 | 7 | 3661 | 2711 | gi|1787936 | (AE000260) f298; This … aa orf is … pct identical (5 gaps) to 297 residues of an approx. … aa protein YCSN_BACSU SW: … [Escherichia coli] | 66 | 49 | 951 |
| 104 | 3 | 1805 | 3049 | gi|1469784 | putative cell division protein ftsW [Enterococcus hirae] | 66 | 48 | 1245 |
| 106 | 14 | 13576 | 14253 | gi|40027 | homologous to E.coli gidB [Bacillus subtilis] | 66 | 52 | 678 |
| 107 | 3 | 965 | 1864 | gi|144858 | ORF A [Clostridium perfringens] | 66 | 49 | 900 |
| 112 | 7 | 5718 | 6593 | gi|609332 | DprA [Haemophilus influenzae] | 66 | 43 | 876 |
| 115 | 1 | 3 | 302 | gi|727367 | Hyr1p [Saccharomyces cerevisiae] | 66 | | 300 |
| 122 | 1 | 3 | 566 | gnl|PID|d101328 | YqiY [Bacillus subtilis] | 66 | 36 | 564 |
| 126 | 8 | 11759 | 11046 | gnl|PID|d101163 | ORF3 [Bacillus subtilis] | 66 | 48 | 714 |
| 128 | 11 | 8201 | 8431 | gi|726288 | growth associated protein [Xenopus laevis] | 66 | 41 | 231 |
| 131 | 8 | 4894 | 4508 | gi|486661 | TNnm related protein [Saccharomyces cerevisiae] | 66 | 39 | 387 |
| 140 | 3 | 3236 | 2574 | gi|40056 | phoP gene product [Bacillus subtilis] | 66 | 36 | 663 |
| 140 | 15 | 16318 | 15434 | gi|1658189 | 5,10-methylenetetrahydrofolate reductase [Erwinia carotovora] | 66 | 48 | 885 |
| 146 | 12 | 7926 | 7636 | gnl|PID|d101140 | transposase [Synechocystis sp.] | 66 | | 291 |
| 147 | 6 | 7137 | 6154 | gi|472326 | TPP-dependent acetoin dehydrogenase alpha-subunit [Clostridium magnum] | 66 | 48 | 984 |
| 149 | 6 | 4435 | 5430 | gnl|PID|d101887 | pentose-5-phosphate-3-epimerase [Synechocystis sp.] | 66 | 46 | 996 |
| 149 | 13 | | | | pyruvate formate-lyase activating enzyme (AA 1-246) [Escherichia coli] | | | |
| 186 | 9 | 2578 | 2270 | gnl|PID|d101199 | ORF11 [Enterococcus faecalis] | 66 | 41 | 309 |
| 207 | 2 | 2340 | 2597 | gnl|PID|e321893 | envelope glycoprotein [Human immunodeficiency virus type 1] | 66 | 46 | 258 |
| 210 | 7 | 3358 | 3678 | gi|49318 | ORF4 gene product [Bacillus subtilis] | 66 | 46 | 321 |
| 217 | 8 | 5143 | 5l55 | gi|49538 | thrombin receptor [Cricetulus longicaudatus] | 66 | | |
| 220 | 4 | 3875 | 3642 | gi|966648 | alternate name ORFD of … [Escherichia coli] | 66 | 33 | 234 |
| 223 | 1 | 1070 | 138 | gnl|PID|e247187 | zinc finger protein [Bacteriophage phig1e] | | 45 | 933 |
| 224 | 2 | 1864 | 2640 | gi|1176399 | putative ABC transporter subunit [Staphylococcus epidermidis] | | 41 | 777 |
| 243 | 1 | 3 | 872 | dbj||AB000617_2 | (AB000617) YcdH [Bacillus subtilis] | 66 | 45 | 870 |
| 268 | 2 | 891 | 568 | gi|517210 | putative transposase [Streptococcus pyogenes] | 66 | 60 | 324 |
| 322 | 1 | 2 | 643 | gi|1499836 | Zn protease [Methanococcus jannaschii] | 66 | 40 | 642 |
| | 10 | 13909 | 13178 | gi|1574292 | hypothetical [Haemophilus influenzae] | 65 | 34 | 732 |
| 6 | 11 | 10465 | 11190 | gi|142854 | homologous to E. coli radC gene product and to unidentified protein from Staphylococcus aureus [Bacillus subtilis] | 65 | | |
| 7 | 2 | 647 | 405 | pir|C64146|C641 | hypothetical protein HI0259 - Haemophilus influenzae (strain Rd KW20) | 65 | 42 | 243 |
| 7 | 7 | 6216 | 6A21 | gnl|PID|d101323 | YqhU [Bacillus subtilis] | 65 | 50 | 576 |
| | | | | gi|1163111 | ORF-1 [Streptococcus pneumoniae] | | 54 | 477 |
| 16 | 3 | 1428 | 2222 | gnl|PID|e325010 | hypothetical protein [Bacillus subtilis] | 65 | 45 | 795 |
| 21 | 4 | 3815 | 3657 | gnl|PID|e314910 | hypothetical protein [Staphylococcus sciuri] | 65 | 40 | 159 |
| 22 | 34 | 25776 | 26384 | gi|1123030 | CpxA [Actinobacillus pleuropneumoniae] | 65 | 42 | 609 |
| 43 | 2 | 1648 | 290 | gi|1049826 | F14E5.1 [Caenorhabditis elegans] | 65 | 38 | 1359 |
| 48 | 13 | 10062 | 10856 | gi|1573390 | hypothetical [Haemophilus influenzae] | 65 | 45 | 795 |
| 48 | 22 | 17521 | 16883 | gi|1573391 | hypothetical [Haemophilus influenzae] | 65 | 37 | 639 |
| 48 | 25 | 19027 | 18533 | gnl|PID|e264484 | YCR020c, len:215 [Saccharomyces cerevisiae] | 65 | 38 | 495 |
| 49 | 3 | 3856 | 5334 | gi|1480429 | putative transcriptional regulator [Bacillus stearothermophilus] | 65 | 32 | 1479 |
| 50 | 6 | 5337 | 4519 | gi|171963 | tRNA isopentenyl transferase [Saccharomyces cerevisiae] | 65 | 42 | 819 |
| 52 | 15 | 14728 | 15588 | gi|1499745 | M. jannaschii predicted coding region MJ0912 [Methanococcus jannaschii] | 65 | 46 | 861 |
| 59 | 7 | 3963 | 4745 | gi|496514 | orf zeta [Streptococcus pyogenes] | 65 | 42 | 783 |
| 68 | 3 | 2500 | 3483 | gi|887824 | ORF_o310 [Escherichia coli] | 65 | 46 | 984 |
| 69 | 3 | 2171 | 1077 | gnl|PID|e311453 | unknown [Bacillus subtilis] | 65 | 42 | 1095 |
| 69 | 7 | 6029 | 5325 | gi|809660 | deoxyribose-phosphate aldolase [Bacillus subtilis] | 65 | 55 | 705 |
| 71 | 5 | 8536 | 9783 | gi|1573224 | glycosyl transferase lgtC (GP:U1554_4) [Haemophilus influenzae] | 65 | 42 | 1248 |
| 72 | 8 | 7664 | 8527 | gnl|PID|e267589 | Unknown, highly similar to several spermidine synthases [Bacillus subtilis] | 65 | 39 | 864 |
| 76 | 5 | 5773 | 4097 | gnl|PID|d101723 | DNA REPAIR PROTEIN RECN (RECOMBINATION PROTEIN N). [Escherichia coli] | | | |
| 76 | 9 | 8099 | 7875 | gi|1574276 | exodeoxyribonuclease, small subunit (xseB) [Haemophilus influenzae] | 65 | | |
| 84 | 2 | 2870 | 2352 | gi|2313188 | (AE000532) conserved hypothetical protein [Helicobacter pylori] | 65 | | |
| 86 | 15 | 14495 | 13407 | gnl|PID|d101880 | 3-dehydroquinate synthase [Synechocystis sp.] | | | |
| 87 | 3 | 2423 | 3706 | gi|151259 | HMG-CoA reductase (EC 1.1.1.88) [Pseudomonas mevalonii] | 65 | | |
| 88 | 3 | 2425 | 2736 | gi|1098510 | unknown [Lactococcus lactis] | 65 | | |
| 89 | 2 | 1627 | 1007 | gnl|PID|d102008 | (AB001488) SIMILAR TO ORF14 OF ENTEROCOCCUS FAECALIS TRANSPOSON TN916. [Bacillus subtilis] | 65 | 41 | 621 |
| 111 | 6 | 6635 | 6l146 | gnl|PID|e246063 | NM23/nucleoside diphosphate kinase [Xenopus laevis] | | | |
| 116 | 1 | 3 | 1016 | gnl|PID|d101125 | queuosine biosynthesis protein QueA [Synechocystis sp.] | 65 | 44 | |
| 123 | 1 | 69 | 389 | gi|498839 | ORF2 [Clostridium perfringens] | 65 | | |
| 123 | 7 | 6522 | 7190 | gi|1575577 | DNA-binding response regulator [Thermotoga maritima] | 65 | | |
| 125 | 3 | 3821 | 2859 | gnl|PID|e257609 | sugar-binding transport protein [Anaerocellum thermophilum] | | | |
| 137 | 12 | Q015 | 7818 | gi|2182574 | (AE000090) Y4pE [Rhizobium sp. NGR234] | 65 | 41 | |
| 147 | 4 | 5021 | 3885 | gi|472329 | dihydrolipoamide acetyltransferase [Clostridium magnum] | 65 | 47 | |
| 148 | 2 | 105l | 1931 | gnl|PID|d101319 | YqgH [Bacillus subtilis] | 65 | 42 | |
| 151 | 2 | 3212 | 4687 | gi|304897 | EcoE type I restriction modification enzyme M subunit [Escherichia coli] | 65 | | |
| 156 | | | | gi|310893 | membrane protein [Theileria parva] | | | |
| 164 | 7 | 4256 | 4837 | gi|410132 | ORFX8 [Bacillus subtilis] | 65 | | |
| 169 | 6 | 3192 | 3914 | gi|1552737 | similar to purine nucleoside phosphorylase (deoD) [Escherichia coli] | 65 | | |
| 176 | 4 | 2951 | 2220 | gnl|PID|e339500 | oligopeptide binding lipoprotein [Streptococcus pneumoniae] | 65 | 43 | |
| 195 | 4 | 4556 | 3900 | gi|1592142 | ABC transporter, probable ATP-binding subunit [Methanococcus jannaschii] | 65 | | |
| 196 | 1 | 160 | 1572 | gnl|PID|d102004 | (AB001488) PROBABLE UDP-N-ACETYLMURAMOYLALANYL-D-GLUTAMYL-2,6-DIAMINOLIGASE (EC 6.3.2.15). [Bacillus subtilis] | 65 | 51 | 1413 |
| 204 | 2 | 2246 | 1215 | gi|143156 | membrane bound protein [Bacillus subtilis] | 65 | | |
| 210 | | | | gi|49315 | ORF1 gene product [Bacillus subtilis] | 65 | | |
| 242 | 2 | 1625 | 723 | gi|1787540 | (AE000226) f249; This 249 aa orf is 32 pct identical (8 gaps) to 244 residues of an approx. 272 aa protein AGAR_ECOLI SW: P42902 [Escherichia coli] | 65 | | |
| | | | | gi|559861 | clyM [Plasmid pAD1] | | | |
| | | | | gnl|PID|e290934 | unknown [Mycobacterium tuberculosis] | | | |
| | | | | gi|790694 | mannuronan C-5-epimerase [Azotobacter vinelandii] | | | |
( ( ( ( (gnl~PID(d102048(K. aerogenes, histidine utilization repressor;( 46 567 320 1 3 569 P12380 (199) DNA Lording 65 ~ ~ i ( ( ( ( ~ [Bacillus subtilis) ( ,________,____ ,_______,_______,________________,_____________________________________________ _______________________________,________,_________ ,_________, ( ( ( ( (gnl(PID(e323508(YloS protein [Bacillus subtilis] ( ( ( ( ,________,____,_______,_______,________________,____________________.__________ ______________________________________________,________,_________,_________, 2 ( ( ( ~gi~1498753(nicotinate-nucleotide pyrophosphorylase ( 7 75716696 [Rhodospirillum~ rubrum] 64 ( ( ( ________,____f_______,_______,________________,________________________________ ___________________i________________________,________,_________,_________, ( ( ( ( (gnl(PID(d101111(methionine aminopeptidase (Synechocystis ( 6 6 59246802 sp.] 64 ( ( ( ,________,____,~-_____,_______,________________,________________________________________________ ____________________________,________,_________,______ ( ( ( 4 (9i(1045935(DNA helicase II (Mycoplasma genitalium]

8 4 34173686 ( y ( ( ( ________,____,_______,_______,________________,________________________________ ____________________________________________,________~_________,_________, o 11 d 32492689 ( ( ( p ( N
( ( ( ( (gnl PID OrfB [Streptococcus neumoniae] 64 e265529 ( ( ( ,________,____ ,_______,_______,________________,_____________________________________________ _______________________________,________;_________,_________, N

( ( ( ( (9i(1762328(Ycr59c/Yig2 homolog (Bacillus subtilis) ( ( ( ( ,________y____,_______,_______,________________,_______________________________ ___________________._____________..____________f________,_________,_________, ( (11~ ( (gnl~PID(d100581(unknown [Bacillus subtilis] ( N

( ( ( ________,____ ,_______,_______,________________,_____________________________________________ _______________________________,________,_________,_________, ( (30(22503(23174~gi(289260(comE ORFl [Bacillus subtilisl ( ( ( ( ________,____,_______,_______,________________,________________________________ ____________________________________________,________,_________,_________, ( ( (14375A (gi(40928b(bmrU (Bacillus subtilis] ( ( ( ( ________,____,_______,_______,________________,________________________________ ____________________________________________,________~_________,_________, ( ( ( ( (9i(40795(Ddel methylase (Desulfovibrio wigarisl ( o ( ~

( ________,____,_______,_______,________________s________________________________ ____________________________________________,________I_________,_________, ( ( ( ( (g12326168(type VII collagen (MUS musculus] ( N

( ( ( (________,____,_______,_______,________________I_______________________________ _____________________________________________,________,_________~_________y ( ~ ( ( (pir(JC1151(JC11hypothetical 20.3K protein (insertion sequence( 64 50 354 35 2 368 721 IS1131) - Agrobacterium ( ( ( ( ( ( ( tumefaciens (strain P0221 plasmid Ti ( ( ( ________,____,_______,_______,______________.._,_______________________________ _____________________________________________,________,_________,_________, ( ( ( ( (9i(96970(epiD gene product (Staphylococcus epidermidis]( ( ( ( __ ,_______,_______,________________,__________~__________________________________ _______________________________,________v_________~_________, ( ~ ( ( ~gnl(PID(e325792(IAJ0000051 glucose kinase /Bacillus megatecium]( ( ( ( ________,____,_______,_______,________________,________________________________ ____________________________________________,________;_________a_________, ( ( ( ( (gnl(PID(d102036(subunit of ADP-glucose pyrophosphorylase ( 45 7 80686920 (Bacillus stearothermophilus] 64 ( ( ( ________,____,_______,_______,________________~________________________________ ____________________________________________,________,_________,______ ( ( ( ( (9i(43985(nifS-like gene [Lac_obacillus delbrueckii)( ( ( ( ,________,____,_______,__..____,________________~______________________________ ______________________________________________y________,_________,_________+

( (13(15251(18397(9i(2293260((AF008220) DNA-polymerise III alpha-chain ( 51 (Bacillus subtilis] 64 ( ( ( ,________,____y_______,_______,________________,_______________________________ _________________________________________ __ ____ _____ ____ __,__ __,___ ( ( ( ( (9i(1574292(hypothetical [!~?aemophilus influenzaej ( ( ( ( ________a____~_______,_______,________________,________________________________ __________________________________________ ____ ___ _ -( ( ( ( (9i(1573826(alanyl-tRNA synthetase (alaSl (Haemophilus( 58 2 42361606 influenzae) 64 ( ( ( ________,___-~_______,_______,________________,_____________________________________________ _ __ _ ____ ____,________ _____ _ ______________ _____ ,____ __,__ __, ( ( ( ( (9i(895749(putative cellobiose phosphotransferase ( vp 66 1 3 1259 enzyme II" (Bacillus subtilis] 64 ( ( ( ,________,____,_______,_______,________________,_______________________________ ___________.._____________________________ ____,________,_________,_________, ( ( ( ( (9i(436965((malA] gene products (Bacillus stearothermophilus]( 0~0 ( ( ( ________,____,_______,_______,________________,________________________________ ____________________________________________,________~_________,_________, ( ( ( ( (gnl(PID(d101316(cdd [Bacillus subtilis] ( 69 6 535649d9 64 ( ( ( ________,____,_______,_______,________________,________________________________ _______________________________________..
TABLE 2
S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)
----------------------------------------------------------------------------------------------------------------

74 69485038 i t 4 726480 sim (9 ~

~ 1 ident ~
length Int) _,__ __i_______ _______ ________, L-lutamine-D-fructose-6-hos hate amldotransferase [Bacillus subtilis) ~

~

~

~
~

P
P

_ w ,__________f_______,_______,________________ pr 75 v ~ ~ bbs~133379_ 3 12831465 _ ~ _ _ _~______________ __,________y_________,_________i TLS-CHOP=fusion proteinICHOP=C/EBP

transcription factor, TLS=nuclear RNA-~

~

~

binding protein) (human, myxoid liposarcomas cells, Peptide Mutant, aa]
[Homo Sapiens]

________,____ ,_______,_______,_______________ _,____________________________________________________________________________, ________,_________,_________f 81 A A4231~gi~143175 (methanol ~13 4016 dehydrogenase alpha-10 subunit (Bacillus sp.]
( ~

~

________,____,_______,_______,_______________ _,________ ___,________~_________, __ _ _________________________________________________ _________, 83 ~22(2185122090~gnl~PID~d101315 ~YqfA
~
(Bacillus 64 subtilis] ~

~

________y____i_______,_______,_______________ _,____________________________________________________________________________, ________,_________,_________, 87 ~11A ~ ~gnl~PID~e323505 putative 00469300 Ptcl protein [Bacillus subtilis]
~

~

( ________,____,_______,_______,_______________ _,______________________________________________________~____________________,_ _______,_________,_________, 98 ~ ~ ~ (gnI~PID~e2338B0 hypothetical 7 50325706 protein [Bacillus subtilis]
~

~

~

________~____i_______,_______,_______________ _,____________________________________________________________________________, ________,_________,_________, 105 ~ ( ~ ~gi~1657503 similar 1 2 1276 to S.
aureus mercury[II) reductase [Escherichia coli]
~

~

~

________,____,_______,_______,_______________ _,_________________ 113 ~ ~ ~ ~gnI~PID~d101119 _ ____________________________________________________,________,_________,_______ __, ~NifS
[Synechocystis sp ]

~

~

________,____,_______,_______,_______________ _,____________________________________________________________________________, ________,_________,_________, 119 ~ ~ ~ ~gnl~PID~e320520 hypothetical 1 2 1297 protein [Natronobacterium pharaonis]
~

~

~

________,____,_______,_______,_______________ _,______________________________________________ 123 ~ ~ ~ ~gnI~PID~e253284 __,________,_________,_________, 3 112S2156 ~ORF

YDL244w (Saccharomyces cerevisiae) ~

~

~

,________,____,_______,_______,_______________ _,_________________________________________________________ 12d ~ ~ ~ ~gnI~PID~d101884 _ 2331I780 ____ __ _____ __ _________ hypothetical protein (Synechocystis sp.) i ( ~
( ________,____,_______,_______,________________,________________________________ ___________ ___________________ l29 ~ ~ ~ __,________,_________,_________, ,________4 34672709 ~gnI~PID~d101314 ,____,_______,_______~YqeU

[Bacillus subtilis) ~

~

~

,________________,_________________________________________________________ __,________,_________,_________, 131 ~ ~ ~ ~gi~1377841 (unknown 1 152 3 (Bacillus subtilis]
~

~

~

(________,____,_______,_______,________________, ____________________________________________________________________________,__ ______,_________~_________, 137 Q11~ ~ ~pir~JC1151~JCi1 hypothetical 20.3K protein (insertion sequence~
71967549 IS1131) - Agrobacterium 64 tumefaciens (strain P022) plasmid Ti 50 ~

________,____,_______,_______,_______________ _~____________________________________________________________________ ___ ( 139 ~ ~ ~ ~gi~2293301 _ _________ 3 32262651 ~(AF008220) _________ YtqB
(Bacillus 64 subtilis] ~

~

,________,____,_______,_______,_______________ _~_________________________________________________________________________ ___,________,_________,______..__, I46 Q10~ ~ ~gi~1322245 ~mevalonate ~
6730564B pyrophosphate 64 decarboxylase ~

[Rattus 45 norvegicusl ~

(________,____,_______,_______,_______________ _,_________________________________________________________________________ ___,________,_________,_________, 147 ~ ~ ~ ~gnI~PID~e137033 unknown ~
1 2 1018 gene 64 product ~
(Lactobacillus 46 leichmannii] ~

,________i____,_______,_______,_______________ _,___________________________________________________ _ 1d8 Q11~ ~ ~gi~2130630 ______ __,________,_________,______ _ ________________ ~
, ~(AF000430) d ~
amin-like 28 yn ( protein 354 (Homo sapiensl ______,____,_______,_______,________________,__________________________________ _______________________________________ ___,________,_________,_________i _ ~ ~ ~ ~gnl~PID~d102050 ~
156 7 43I33612 ~transmembrane ________,____,_______,_______[Bacillus ~
subtilis] 31 ,________________,_____________________________________ ~

__________________________ 1S7 ( ~ ~ ~gnl~PID~d100892 __________ ___,________,_________,_________, 4 12992114 homologous ~

____ to 64 , , Gln ~

transport 43 system ~
permease 816 proteins [Bacillus subtilis]

____ ___________,_______,________________,________ 162 ~ ~ ( _ ___,________,_________i______ 6 58806362 _ ~

,________ ~gi~517204 164 ,____,_______,_______~ORF1.
~
~13~ ~ Putative 58 ,________ 97078769 42 ~

,____ kDa 483 ,_______,__________________________________________________________ ___,________,_________~_________, protein f (Streptococcus ~

pyogenes] 64 ,________________,_____________________________________________________________ ____________ ~

~gnI~PID~d100964 40 homologue ~

of 939 ferric anguibactin transporC

system permerase protein FatD

o V.

anguillarum [Bacillus subtilis]

,______ __________ ,________________________________________________________ _ 175 ~ ~ ~ _________ __,________y_________,_________, ~p 5 39064598 ~gi~534045 _____ ~
~antiterminator 64 [Bacillus ~
subtilis) 39 ~

(________,____,_______ ,_______,________________,______________________________________ _ i.r ___ __,________,_________ 189 Q10 ~ _ ~

________~ 6S07 ________________ 191 6154 ,_________________ ~
,________,____,_______ ~

~ 2863 ~gi~581307 ~

4 ,_______response ~ regulator ~

3S19 [Lactobacillus ___,________i_________,_________, ,____,_______ plantarum) ~
~ 64 ,________________,_____________________________________________________________ ____________ ~

~gi~199520 46 ~phosphoribosyl ~

anthranilate 657 isomerase ___,________,_________i_________t [Lactococcus lactis]

S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)
----------------------------------------------------------------------------------------------------------------
 |  |  |  | gnl|PID|e293806 | O-acetylhomoserine sulfhydrylase [Leptospira meyeri] |  |  | 
224 |  |  |  | gi|1573393 | collagenase (prtC) [Haemophilus influenzae] | 64 | 42 | 1318
234 |  |  |  | gi|40174 | ORF X [Bacillus subtilis] | 64 | 43 | 357
 | 3 | 709 | 1089 | pir|JC1151|JC11 | hypothetical 20.3K protein (insertion sequence IS1131) - Agrobacterium tumefaciens (strain P022) plasmid Ti | 64 | 50 | 381
265 |  |  |  | gi|1377832 | unknown [Bacillus subtilis] | 64 | 31 | 819
297 |  |  |  | gi|1590871 | collagenase [Methanococcus jannaschii] | 64 | 48 | 660
328 | 1 | 263 | 21 | gi|992651 | Gin4p [Saccharomyces cerevisiae] | 64 | 41 | 243
 | 4 | 8730 | 8098 | gi|556885 | Unknown [Bacillus subtilis] | 63 | 48 | 633
10 | 6 | 5178 | 4483 | gi|1573101 | hypothetical [Haemophilus influenzae] | 63 | 40 | 696

~________~____ ~____-__ ~______ _~________________~_________________________________________________________ ___________________~________~_________t_________~
12 ~ ~ ~gi~806536 ~ 63 ~ 42 Q11 9l24 9902 membrane ~ 579 __ i______ protein ___________________f________~_________a_______ ~ ~ [Bacillus ~ 63 ~ 40 J
~)0 8897 91A7 acidopullulyticus) ~

________~____ ,_______ y______ _~________________i__ ___________________,________~_________~_________, 17 ~ ~ _ ____________________________ ~ 63 ~ 32 ~ 1031 309 ~gi~722339 ~ 723 2 ~_______ ~______ unknown ___________________f________~_____~____~_________~
~________t____ ~ ~ /Acetobacter ~ 63 ~ 45 ( 777B 6975 xylinum] ~ 804 ~

_,________________,_________________________________________________________ ( ~gnl~PID~e217602 8 ~PInU

[Lactobacillus plantarum) _~________________~_________________________________________________________ ~gi~1377843 unknown [Bacillus subtllis]

~________t____~_______ ~_-____ _f________________~__________________ ___________________~________f_________~_______ 26 ~ ~ ~ _______________ ~ 63 ~ 46 4 97A0 7078 ~gi~142440 ~ 2703 ATP-dependent nuclease [Bacillus subtilisl (________,____~_______ i______ _,________________~_____________________________,_____________________ __~________~_________~_______ 29 ~ ~ ~ ~gi~1377829 ~ 63 ~ 35 5 3488 4192 (unknown ~ 705 (Bacillus subtilis) ________,____,_______ ,______ _,________________~______________ __,________i_________~_______ 34 Q11~ ~ (gnI~PID~d101198 ~ 63 ~ 45 ________,____8830 7988 ~ORFB ( ,_______ ,______ (Enterococcus faecalis) ___________________~____ _,________________,_~______________________________ __~_____ --_______________________ 35 ~ ~ ~ ~gi~722339unknown (ACetobacter xylinum] 63 3 1187 B76 r 39__~_______ ,________ _ _____ ~ ( ( 312 ,____,_______ ,______ _r ___________ 48 HISA2509 A1691 __________ __,________,_________,_______ ~_____-__~____t_______ ~__ ___-_-________________ ~ 63 ~ 41 ____________________ ~ 819 ~gi~1573389 hypothetical (Haemophilus influenzae) f ____ _ ___________________ 51 ~11A2719 ________________~_-________--_______________________________-___________-_ _ 121B9 ~gi~142450 _______ ________~____~_______ t______ ~ahrC
~ 63 ~ 35 55 ~ ~ ~ protein ~ 531 4 3979 5022 [Bacillus ___________________t________t_________~___-_____~
subtilis) _y________________~__________________________________-____________________-_ 63 ~ 41 ~
~gi~1708640 1044 ~YeaB

[Bacillus subtilis) ~________t____~_______ ~_______~________________t_____________________________________________________ _______________________~________~_________~_________t ( Q15A (14670 55 ~____3669 ~gnl~PID~e311502 y________(10~_______ ~thioredoxine 68 ~ reductase ~ 9242 [bacillus ~1 86 7 subtilis) ________4____( ~

88 ~ ~_______ ~

________ ~ ~
CJI
96 ,____6085 1002 ~
,_______~________________~_____________________________________________________ _______________________~________~_________t_________i O~p ________8 ~_______ ~

100 ~ 8919 ,____58S8 ~sp~P37686)YIAY_ ~ HYPOTHETICAL

1 ~_______ 40.2 ~ KD

IN

AVTA-SELB

INTERGENIC

REGION

(F382).

~

~

~

____________~__________________________________________________________________ __________~________~_________~________ ~

~gi~1574382 ~lic-1 operon protein (licD) [Haemo hilus influenzae) ~

~

~_______s________________y__________________________________p__________________ _______________________~________~_________i_________i i utative fimbrial-associated ~9 ~

~P

protein (ACtinomyces naeslundii) ~

~

~

,_______,________________,_____________________ __ ___ ____ _________ _________~

~

~gi~105280J

~orflgyrb gene -_______________________________________-____________ product (Streptococcus pneumoniae) ~_______~________________~_____________._______________________________________ _______________________ _ __t_________~_________, ~

~gi~7171 ~fucosidase (Dictyostelium discoideum]

~

~

~

~

S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)
----------------------------------------------------------------------------------------------------------------

_ ____ _____ __y________y_________y_______ y________y____y_______,_______y________________y___________________________ W
_______ I 46 1~4 ___________y________y_________y_________y 19i1194985 _________ Iphosphoenolpyruvate __y________y_________y_________y carboxylase 1 63 [Corynebacterium 1 glutamicum) 1 183 ,______-_y____y_______,_______y________________y______________________________________-__________________________ B

19i1533099 lendonuclease III

[Bacillus subtilis]

_______________________ ,________,____y_______y_______y________________y_______________________________ ___________ I

IgnIIPIDId101139 Itransposase (Synechocystis sp.) y_____y____y_______y_______y______________-_ y_________________________________________________________________ ___________,________,_________y_________y __ I lorf2 1 63 1 7 (Methanobacterium 128 1 thermoautotrophicum) IgnIIPIDId101434 y________,____y_______y______ _y________________y____________________________________________________________ _____ ___________y________y_________y_________y 1 1 19i1472920 1 137 4 Iv-type 1 27 1 Na-ATPase 1 585 963 [Enterococcus 1 1 hirae]

y________,____,_______y_______y________________y________________________-________________________________________ ___________y_______~.y_________y_________y ( 1 4B6 IgnIIPID1e313025 (hypothetical protein (Bacillus subtilisj ________y____ ,_______,_______y ________________y ____________________________________________________________ 1 1 1 gi 1787043 _ ________ _____y________y_________y_________y 159 5 l741 I I 265 1 IAE000184) f271; This 271 as orf is 1 63 2571 24 pct identical (16 gaps) to 1 39 ( B31 I ( 1 I I I
~
residues of an approx.

as protein YIDA_ECOLI
SW:

(Escherichia I i I I I I i I
I I
, ~~li]

________,____,_______ ,_______y________________y_______________________________________-_________________________ ___________y________t_________,_________y 171 1 IgnIIPIDle324918 B803 IIgAl 1 5604 protease 1 (Streptococcus sanguis]

,________,____ y_______,_______ y________________ y_________________________________________________________________ ___________y________y_________y_________y 1 1 1 19i11773150 (hypothetical 177 1 3 14.8kd 1 34 1 protein 1 345 347 [Escherichia 1 coli) ________,____ ,-______,_______y________________ ,_____________________________ __,________y_________,_________y 1 1 1 )unknown _________ 178 2 423 (Acetobacter 1 xylinumj____________________________________ 19i1722739 1 ________,____ y_______ ,_______,________________ y_____________..___________________________________________________ ________ _ ___ _________ _________ I 1 1 1 19i11591582 cobalamin 178 3 794 101Z bios 1 ynthesis 1 219 protein ( N
[Methanococcus jannaschii) ________,____ y_______ ,_______ y________________y_____________________________________________________________ ____ ___________y________y_________y_________y 1 1 1 1 IgnIIPIDle324217 19S 1 1377 17S IftsQ 1 [Enterococcus 1 1203 hirae) 1 y________y____ y_______y_______ ,________________ y_________________________________________________________________ ___________,________y_________y_________y 1 1 1 19i11591582 Icobalamin 234 5 1739 biosynthesis 1 protein 1 213 (Methanococcus jannaschii) ,________y____ y_______ ,_______ y________________ y_________________________________________________________________ ___________,________y_________y_________y ( 1 1 1 19i11000453 (TreR

249 1 B1 2S7 (Bacillus subtilis) 1 177 I

y________y____ y_______ ,_______ ,________________ y_________________________________________________________________ ___________y________y_________y_________y 1 1 1 1 19i1396486 IORFB

283 1 127 1347 l8acillus subtilis) 1 1221 y________y____ y_______ y_______ y________________ y____________________________________________________________________________,_ _______,_________y_________;

1 1 I 1 19i1722339 )unknown 293 3 2804 3466 (Acetobacter xylinum) I
I

,________y____ y_______ ______ _______________ ________________________________ y y __ ___________y________y_________y_________y 1 1 1 1 19i11877424 _ 1 311 1 905 486 ______________________________ y 1 420 IUDP-galactose 1 4-epimerase (Streptococcus mutansl ,________,____ y_______ y_______ y________________ y__________i_________________________________________________________________y_ _______y_________,_________y 1 1 1 1 19i11477741 Ihistidine 324 1 2 556 periplasmic binding protein (Campylobacter jejuni) ( y________,____ y_______ y_______y________________ y____________________________________________________________________________y_ _______y_________,_________y 1 1 1 1 I(AF013293) 365 1 219 13 No 19i12252843 definition line found [Arabidopsis thaliana) y________,____y_______,_______ t________________ y_____________________________________..______________________________________y ________y_________y_________y I I 19i1722339 )unknown 382 1 (Acetobacter 1 xylinumj ( y________y____ y_______ y_______ ,________________ y____________________________________________________________________________y_ _______y_________y_______-_y 1 1 1 1 19i12252843 1(AF013293) 385 3 364 158 No definition line found [Arabidopsis thalinnaj ( I

,________y____ y_______ y_______ y________________ y_____________________________________________________________ 1 1 1 1 Ignl ________ H
2 1 2495 288 PID _________ e325007 _________ ,________,____ y_______ ,_______ I enicillin-bindin 3 ,____ y_______ y_______ ,________________ g y________I16 I14320 I13193 Ign11P1D1e254993 protein 1 ,____ y_______ y_______ y________________ [Bacillus 6 1 1 1 IgnIIPIDle349614 subtilis) y________B 6819 7232 y________________ 1 1 IgnllPiDId101324 62 y____________________________________________________________________________,_ _______,_________y_______ (hypothetical protein (Bacillus subtilisj ( y__________________________________________________ __y________y_________y_________y InifS-like protein [Mycobacterium leprae) ' y____________________________________________________________________________y_ _______y_________y_________y IYqhY
[Bacillus subtilis]

I

I

y________y____ y_______ y_______ y________________ y____________________________ __,________y_________y_________y 1 I19 I15466 114207 (gnlIPIDId101804 _ 7 y__ _ 1 43 y___.. ___________________ (beta 1 ketoacyl-acyl carrier protein synthaser(Synechocystis sp.) ____ __ y_______ y_______ y________________ y__________________________________________________~._.._____________ __________y________y_________y_________y TABLE 2 S, pneumoniae - Putative coding regions of novel protein4'8lmllar to known proteins ________+____,_______+_______+________________+________________________________ ____________________________________________+________+_________+_________+

( (0RF( ( ( match ( match gene name ( ( lengthi Contig StartStop 8 sim ( ident ( (ID( ( ( acession( ID (nt) (nt) ( ( (nt) ( ___________________+_______,________________+..________________________________ ___________________________________________+_________________+_________ +

( (21 (1T155(16229 (putative FabD protein (Bacillus aubtilis]

7 (gnl(PTD(e323514 ( 62 ( 46 ( 927 ( +________+____,_______+_,._____+________________,______________________________ ______________________________________________+_________________+_________+

( (24 (19526(18519 (beta-ketoacyl-ACP synthase IIT [CUphea wrightii) ,7 (9i(1276434 ( 62 ( 37 ( 1008 ( +________,____+_______+_______+________________+_______________________________ ______..______________________________________,_________________a_________+

( ( ( ( (A/G-specific adenine glycosylase (mutt( IHaemophilus 12 7 5904 4702 influenzae] ( 62 ( 43 ( l203 ( (9i(1573768 +________+____+_-_____+_______+________________+________________________________________________ ____________________________,________+________-+__._____.+

( ~ ( ( (pantothenate metabolism flavoprotein (Methanococcus 12 9 8032 8793 jannaschii) ( 62 ( 33 ~ 762 ( (9i(1591587 +________+____+_______,_______,________________+_______________________________ _____________________________________________+__.__.__+_________+_________+

i i lli ~ ipir(JC1151(JC11ihypothetical 20.3X protein (insertion sequence62 13 351 15 9678 9328 iS1131) - Agrobacterium tumefaciens (strain P022) plasmid Ti ~ ~ ~

________+____+_______,_______+________________+________________________________ __________________________________________~_ +________+__________________, ( ( ) ( (M. jannaschii predicted coding region !W0374 17 4 2609 24I2 (Methanocqccus jannaschii] ( 62 ( 43 ( 168 (9i(1591081 ( ________+____ ,_______,_______+________________+_________..__________________________________ ________________________________+________+_________+_________+

i i 5 i i gi(1495T0 Irole in the expression of lactacin F) part 62 17 3053 2B35 of the laf operon [Lactobacillus i i i i p i CZ
~

,________+____ +_______+_______+________________,_ y _ ____.._______________________________________-____________.._____________+________t__________________, ( (10 ( ( (similar to H. subtilis DnaH [Bacillus subtilis]

22 8627 9538 ( 62 ( 43 ( 912 ( (gnl(PID(d100580 ,_-______+____ ,_______+_______+________________ +____________________________________________________________________________,_ _______+_________+_________+ N

( ( 3 ( ( 9i(2314379(1AE000627) ABC transporter, ATP-binding protein62 70 865 2043 iYhcG) [Nelicobacter ( ( ( ( ( i i i ( ( PYloril i w.

,________,____ +_______+_______+________________,_____________________________________________ _______________________________4________+_________,_________ ( ( ( ( (ipa-52r gene product (Bacillus subtilis) N
33 5 223S 1636 ( 62 ( 44 ( 600 ( (9i(413976 ________+____ ,_______+_______+________________ +____________________________________________________________________________+_ _______+_________,_________ ( (li ( j (0251 [ESCherichia cola) 38 5689 6123 ( 62 ( 31 ( 435 ( ~-' w..
(9i(148231 ,________r____ +_______+_______+______________.._+____________________________________________ ________________________________+________+_________+_________ ~
( (17 (14272(13328 (hypothetical protein [Synechocystis sp.) 40 (gnl(PID(d101904 ( 62 ( 43 ( 945 +________,____ ,_______,_______+________________+_____________________________________________ _______________________________+________,_________+_________+

( ( ( ( (putative [Bacillus subtilis[ ( 62 ( 41 ( o 42 1 3 311 309 ( (9i(1146182 ,________+_-__ +_______+______..______.._________ +________________________________________________..___________________________+
_________________+_________, 44 2 ( ( 9i(1786952((AE000176) o877; 100 pct identical to the N
i i267 4005 first 86 residues of the 100 as ( 62 43 ~

i 2739 ( ( ( ( ( hypothetical protein fragment YBGB_ECOLI

SW: P54746 [Escherichia coli] ( ' ( ( ________+____ _______+_______,________________ +________________________________..____________________________________________ _______+_________,_________+

48 | 12 | 9732 | 9304 | gi|662920 | repressor protein [Enterococcus hirae] | 62 | 32 | 429
51 | 8 | 5664 | 7181 | gnl|PID|e301153 | StySKI methylase [Salmonella enterica] | 62 | 44 | 1518
52 | 3 | 2791 | 2099 | gi|1183886 | integral membrane protein [Bacillus subtilis] | 62 | 41 | 693
55 | 16 | 15702 | 14704 | gnl|PID|e313028 | hypothetical protein [Bacillus subtilis] | 62 | 40 | 999

59 | 6 | 3418 | 3984 | gi|2065483 | unknown [Lactococcus lactis lactis] | 62 | 32 | 567
63 | 5 | 4997 | 4809 | gi|149771 | pilin gene inverting protein (PivML) [Moraxella lacunata] | 62 | 28 | 189
70 | 14 | 10002 | 10739 | gi|992977 | bplG gene product [Bordetella pertussis] | 62 | 45 | 738

71 | 13 | 18790 | 20382 | gi|1280135 | coded for by C. elegans cDNA cm21e6; coded for by C. elegans cDNA cm01e2; similar to melibiose carrier protein (thiomethylgalactoside permease II) [Caenorhabditis elegans] | 62 | |
71 | 28 | 32217 | 32768 | gnl|PID|d101312 | YqeG [Bacillus subtilis] | 62 | 35 | 552
74 | 7 | 11666 | 10383 | gi|1552753 | hypothetical [Escherichia coli] | 62 | 38 | 1284

S. pneumoniae - Putative coding regions of novel proteins similar to known proteins
Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)

~

~

gnl|PID|d102002 | (AB001488) FUNCTION UNKNOWN. [Bacillus subtilis]

~

~

y________~____y_______y_______y________________y_______________________________ ______________________ __y________y_________y_______ ~

~

gi|882463 | protein-N(pi)-phosphohistidine-sugar phosphotransferase [Escherichia coli]

~

~

~

98 | 4 | 2306 | 3268 | gnl|PID|d101496 | Bra (integral membrane protein) [Pseudomonas aeruginosa] | | |

~

~

~

102 | 3 | 2823 | 3539 | gnl|PID|e313010 | hypothetical protein [Bacillus subtilis] | 62 | 24 | 717

103 | 3 | 2795 | 1242 | gnl|PID|d102049 | H. influenzae hypothetical ABC transporter; P44808 (974) [Bacillus subtilis] | 62 | 41 | 1554

gi|581297 | NisP [Lactococcus lactis] | 62 | 44 | 1428

112 | 4 | 3154 | 4080 | gi|1574379 | lic-1 operon protein (licA) [Haemophilus influenzae] | 62 | 39 | 927
112 | 6 | 4939 | | gi|1574381 | lic-1 operon protein (licC) [Haemophilus influenzae] | 62 | 39 | 711
124 | 3 | 1137 | 721 | gi|1573024 | anaerobic ribonucleoside-triphosphate reductase (nrdD) [Haemophilus influenzae] | 62 | 45 | 417

________,____ ,_______y_______,________________y___________________________________________-_________________________ _______y________y_________y_________y 124 ~ ( ~ ~ 62 ~ 40 ~ 834 ~gi~609076 ~leucyl aminopeptidase (Lactobacillus delbrueckii) y________y____ y_______y_______,________________y_____________________________________________ ________________________ _______~________y---______y_________y 126 ~ i1073~ ~ 62 ~ 38 ~ 3S58 ~gnl~PID~d101163 ~ORF4 /Bacillus subtilis]

________y____ y_______y_______y_____________, __y________y_________,_________, 129 ( ~ __y___________________________________________________ ~ 62 t( 48 ( 444 y______6 4983 ~
__y______ 'y_________y_________y 131 __y____ y_______4540 ~ 62 ~ 42 ~ 40A
y______( ~ ~pir~S41509~5415 7 4510 ~zine __y_ a finger protein -Chilo iridescent virus y_______y________________y_______________________________________ ~

gi|1857245 | unknown [Lactococcus lactis]

___ _______y_______y________________y______________________________________________ ______________________________y________y_________y_________y 149 ~ ~

( 1923 2579 2 y_______~gi~1592142 y________y____ ~ ABC

149 5360 transporter, ~ y_______probable 7 ~ ATP-binding y________y____ 4S0 subunit (7 156 y_______(Hethanococcus ~ ~ jannaschii]

1 3606 ~

y________y____ y_______62 156 ~ ~

~ 1779 41 6 y_______~

y________y____ 6S7 171 ~ y_______y________________y__ ~ 385 ______ y___________________________________________________y________y_________y_______ __y y________y____ ~ ~

172 y_______(gnl~PID~e323508 ~ ~ ~YloS

2 492 protein y________y____ y_______(Bacillus ( ~ subtilis) 173 2856 ( ~ y_______62 3 ~ ~

y________y____ 2074 40 179 y_______~

~ 696 2 ~
y_______y________________y_____________________________________________________ _______________________y________y_________y_________y y________y____ 1061 ~

181 y_______238 ~ ~gnl~PID~e254644 6 membrane y________y____ protein 185 (Streptococcus ~ pneumoniae) 2 ~

________,____ 62 ~

( 40 200 ~

~ 213 2 y_______,________________~________ y________y____ ___________________________________________y________y_________y_________y ~

gnl|PID|d102050 | transmembrane [Bacillus subtilis]

~

~

~

y_______y________________y_-________~_________________________________________________________________~____ bZ__~_____35__~____-513 ~

gi|43941 | III-B Sor PTS [Klebsiella pneumoniae]

y_______y________________y_____________________________________________________ ___ ____________________y________y_________y_________y ~

gi|895750 | putative cellobiose phosphotransferase enzyme III [Bacillus subtilis]

~

~

~

y_______y________________y________________________ _____ _______________________y________y_________y_________y ~

gi|1591732 | cobalt transport ATP-binding protein O [Methanococcus jannaschii]

( ~

~

y_______y________________y_______..____________________________________________ ________________________y________?_________y______~___+

~

gi|1574071 | H. influenzae predicted coding region [Haemophilus influenzae]

~

~

~

~

y_______y________________y_____________________________________________________ _______________________ _ ___ _________ ~

gi|1777435 | LacT [Lactobacillus casei]

gi|2182397 | (AE000073) Y4fN [Rhizobium sp. NGR234]

~

~

~

y_______y________________y_____________________________________________________ _ ~

~

~

______________________y________i_________y_________y ~

gi|450566 | transmembrane protein [Bacillus subtilis]

~

~

~

y_______y_______________ y _ ____________________________________________________________________________,__ ______y_________y_________y 202 ~ ~ ~gi~42219 P35 0~p ~ 2583 347J _y________________ gene 3 y_______y______ ~gi~49315 product y________y____ ~ ~ _y________________ (AA

~ y_______y______ -3 314) ________y____ [scherichia cola]
~

~

~

~
y____________________________________________________________________________y_ _______y_________y_________y ~ORF1 gene product [Bacillus subtilis]
~

~

~

y____________________________________________________________________________y_ _______y_________y_________y S. pneumoniae - Putative coding regions of novel proteins'si~Ailar to known proteins ________,____,_______,_______,________________~________________________________ __________________________________________ ____ _____ _____ ~

( Contig(ORF( Stop ( match gene name ( 3 Start( sim ( match ( 1 ident ~
length ( ( ID (ID( Int1 ( ( ( ( (ntl Int1 ( ( acession ________,____,_______,_______,________________,____________________ ______ __ _ _ __,________~_________,________ ( 211 (mannose permease subunit III-Han (Escherichia( 62 ~ cola) ( ( ( ( ( (gi~147402 ,________,____,,______,_______,________________,..____________________________ ____________________________________________,________,~________,_________, ( 22J (ORF2 (Streptococcus mutansl ( 62 ( ( ( ( ( ( (gnl(PID(d101190 ________,____,_______,_______,________________y________________________________ _________________________________.
.__________,________,_________~_________~
Hr ( 228 (glycerol uptake facilitator [Streptococcus( 62 ( pneumoniae) ( ( ( ~ ( (gi~530063 ,________,____,_______,_______,________________~_______________________________ _____________________________________________,________t_________,_________, ( 234 ((AF008220) YtqI [Bacillus subtilis) ( 62 ( ( ~ ( ( i (gi~2293259 ________,____a_______,_______,________________,________________________________ ____________________________________________,________,_________,_________ ( 2B2 (galactokinase [Arabidopsis thaliana) ( 62 ( ( ~ ( ( ( ~gnl(PID(e276475 ________,____,_______,_______f________________,________________________________ ____________________________________________,________,_________,_________t J75 1 I 9i(1671231(AE0000521 Mycoplasma pneumoniae, hypotheticalmilar 62 40 159 ( protein homolog; si to ( ( Swiss-Prot Accession Number P35155, ~ ~ ~

~ , from B. subtilis [1[ycoplasma ' ( ( pneumoniae) ________,____,_______,_______~________________ ,____________________________________________________________________________, ________,_________+_________, ( 385 outer membrane integrity protein (tolA)~ 62 ~ (Haemophilus influenzae) 17 ~

5~4 ( (9i(1573353 (________,____,_______,_______,________________ ,____________________________________ __,________I_________I________ __ _ __ 3 ~19 (ORF_f229 [Escherichia coli) ( 61 A ( A ( ~gi~606162 ( ________,____,_______,_______,________________ ,_______________________________________~____________________________________,_ _______~_________t_________~

7 | 4 | 2725 | 3225 | gi|2114425 | similar to Synechocystis sp. hypothetical protein, encoded by GenBank Accession Number D64006 [Bacillus subtilis] | 61 | 42 | 501

,________,____,_______,_______,________________,______________________.._______ ______________________________________________,________,_________, _________, w-17 (lactacin F [Lactobacillus ap.) ( 61 to ~ ( ~ ( ( ~

(9i(149569 ,________,____~_______~_______,________________ ,____________________________________________________________________________,_ _______~_________~_________t C

14 ~xylose repressor (Synechocystis sp.) ( 61 ( ( ( ( 406l 897 ( ( (gnl~PID(d101068 ________,____,_______,_______,________________ i____________________________________________________________________________,_ _______~_________,________ ( 54 (YqjH (Bacillus subtilis) ( 61 (I1 ( ( 12 838B ( ~ 1155 723d ( (gnl~PID(d101329 v___ ________,____,_______t_______,________________ ,_______________..____________________________________________________ ~ __ __ __ _________ _________ _ f ( 57 (YqfK [Bacillus subtil ( 61 ( is] 42 o 6 ( ( ( 2064 397d ( ( ~gnl~PID(d1013I6 ________~____,_______,_______,________________ ,_________________________________________________________________..__________, ________,_________i_________, ( 58 ~SPERMIDINE/PUTAESCINE TRANSPORT SYSTEM( 61 ( PERMEASE PROTEIN POTC. ( ( ( ( (sp(P45169(POTC_ ________,____,_______,_______,________________ ,____________________________________________________________________________~_ _______~_________,_________, yp ( 67 (ORF_f254 (Escherichia Bali) ( ( 61 1 ( ( 46 3 ( ( 690 692 ( ~gi(537108 ,________ ____________________________________________________________________________ ____ ________~_________,_________, _______ _______ ________________ ( 68 (pPLZl2 gene product fAA 1-184i [LUpinus( 61 ( polyphyllus) ( ( ( 8B16 92?

( ( ~gi(19501 ,________,____,_______,_______~________________ +____.._______________________________________________________________________~
________,_________f_________, ( 70 (bpiF gene product [Bordetella (15 pertussis) ( 61 10737 ( (gi~992976 ( ( ,________,____,_______,_______4________________ ,____________________________________________________________________________,_ _______,_________f_________~

( 72 (carboxynorspermidine decarboxylase ( 61 (11 (Synechocystis sp.l ( ( 36 9759 ( (10202 444 (gnl(PID(d101833 ( ,________,____t_______~_______,________________ +____________________________________________________________________________t_ _______i_________,_________r ( 76 (farnesyl diphosphate synthase [Bacillus( 61 ~ stearothermophilus) ( 8 d5 ~ ( ( ( (gnl~PID(d100305 ,____--__~____+_______,_______,____.____________ ,__.__________________________________________._____..______________________ ___ _ _ s , -( 87 unknown (Bacillus subtilis) 61 ( 42 4 ( ( ~ ( ( ( ~gi(528991 ,________,____,_______,_______,________________ ,__________________________________________________________________________ "d ___ _____ __~___ __,__ __,_________, 87 ((AE000407) methionyl-tRNA formyltransferase( 61 (13 (ESCherichia coli) ( (12311 11 A ( 1361 95l (gi~1789683 ( (________,____,_______~_______,________________ ,______________________ ___________________________________________________ ____ 91 ___ ( __4__ 2 __,__ ( ____,________ i j ~ribonucleoside triphosphate reductase ( 61 9 [Eacherichia coli) ( 9 ~g 45 ~5J ( ( ,________,____~_______,_______,________________ ~____________________________________________.________________________.__ _ __ __+___ .__;_____-__..,_________~

( l05 (hypothetical protein [Synechocyatis ( 61 ( sp.) ( ( ( 27l1 789 ~ ( (gnI~PID~d101851 ________,____,_______,_______,________________ ,__ _ __,________f_________,_________, "

( 11S (putative cel opecon regulator (Bacillus( 61 ( subtilis) ( ,________________________________________________________________36 ( ( 7968 1d91 ~ ( 6478 ___ (gi~895747 _ __,____t_______,_______i________________ __ ____ 123 hi __,________~_________~_________ 8 i 7181 tidi 8518 ki i(1209527 ( ~

( ~

( g prote ( 61 s ( n 10 ne ( nase (Enterococcus faecalis) 1338 ________,____,_______,_______,________________ i____________________________________________________________________________,_ _______~_________f_________, S. pneumoniae - Putative coding regions of novel proteins'si'milar to known proteins ________,____,_______, _______ ,________________,_____________________________________________________________ _______________ ,________y_________ y_________y Contig~ORF ~ ~ ~ ~
~ length StartStop match match t gene aim name t ident ~

ID SID ~ ~ ~ ~

(ntl(ntl aceasion (nt) y________,____ y_______y______ _,_______________ _y___________________________________________________________..______________ __y________y_________,_________ , 126 b ~ ~ ~gi~1787043 (AEOOOlBd) 752S6725 f271; ~

This 38 271 y as B01 orf is pct identical f16 gaps) to residues of an approx.

as protein YIDA_ECOLI

SW:

[ESCherichia coli) y________y____ ,_______y______ _a_______________ _y____________________________________________________________________________y ________,_________y____..____, f., 128 ~ ~ ~ ~gnl~PID~d101328 ~YqiY
~
1 1 639 [Bacillus subtilis) ~

~

,________,____ ,_______,______ _,_______________ _y___________________________________________.-________________________________,________,_________y_________y 139 ~ ~ ~ ~gI~1022726 (unknown ~
7 47945054 [Staphylococcus haemolyticusl ~

~

________,____ ,_______,______ _,_______________ _,__________________________________________________________________________ __y________,_________y_________y 1l9 ~ A ~ ~gnl~PID~e270014 beta-galactosidase ~
9 2632S913 (Thermoanaerobacter ethanolicus) ~

~

,________,____ ,_______,______ _,_______________ _,____________________________________________________________________________, ________,_________y_________y 113 ~ ~ ~ 'gi~520541 penicillin-binding ~
1 255212 proteins 61 lA ( and 42 IH ~
(Bacillus 2511 subtilis) ________~____ y_______,______ _,_______________ _,_______________________________________________________t____________________, ________,_________y_________~

14A Q16 (12125A1424 ~gi~1552743 ~tetrahydrodipicolinate ( N-succinyltransferase 61 [Escheriehia ( cold 42 ~

,________y____ ,_______,______ _~_______________ _,__________________________________________________________________________ __y________y_________,_________, 162 ~ ( ~ ~gnI~PID~d101829 ~phosphoglycolate ~
3 4112l456 phosphatase (Synechocystis ~

sp.) 30 ~

___ ,____ y_______y______ _,_______________ _,____________________________________________________________________________y ________y_________y_________, l72 ~ ~ ~ ~9nl~P1D~d102048 H.
~ 351 3 727 i077 subtilis, cellobiose 44 phosphotransfecase system, celA;

( ~ ~ ~ ~ ~ ~

IHacillus i subtilisl N
y________~____ ,_______y______ _y_______________ _y___________,_________________________________________________________________ y________,_________y _________y N

177 ~ ~ ~ ~gnl~PID~d100574 unknown ~ J
3 1l011772 (Bacillus subtilis) ~

~

~

________,____ ,_______y______ _y_______________ _,____________________________________________________________________________y ________y_________y_________y w.

202 ~ ~ ~ ~gi~1045831 ~hypothetlcal ~ N
2 i27825A5 protein (GB:L18965_6) ~

(Hycoplasma 36 genitalium) ~

l308 ~

________,____ ,_______,______ _,_______________ _,__________________________________________________________________________ __,________,_________,_________, o 221 ~ ~ ~ ~gi~1591144 ~H.
~
3 27823144 jannaschii predicted ~
coding 30 region ~

IHethanococcus jannaschii) ________,____ ,_______,______ _,_______________ _,__________________________________________________________________________ __,________,_________y_________y 225 ( ~ ~ ~gI~1552771 hypothetical ~
4 33953766 [Escherichia coli) ~

~

y________,____ y_______,______ _,_______________ _,___________________________________________________________________________~, ________y_________,_________, 249 ~ ~ ~ ~gi~1000453 ~TreR

2 212 802 (Bacillus 61 o subtilis) ~
d2 ~

~

________~____ y_______,______ _,_______________ _,____________________________________________________________________________, ________,_________y_________, 254 ~ ~ ~ nl~PID~d100417 ~ORF120 ~
2 843 4A4 ~ [ESCherichia colil ( 9 ~
N

,________,____ ,_______,______ _,_______________ _;__________________________________________________________________________ __,________y______-__,_______-_, ( ~ ~ ~ ~gnI~PID~e255315 unknown ~
257 1 3 350 (Mycobacterium tuberculosis] ~

~

,________,..___ ,_______,______ _y_______________ _y____________________________________________________________________________y ________,_________y_________, 293 ~ ~ ( ~pir~JC1151~JC11 hypothetical 61 ~ 45 4 39713657 20.3K
~
protein 315 (insertion sequence -Agrobacterium tumefaciens ~

latrain P022) plasmid Ti ________,____ ,_______,______ _,_______________ _,__________~______________________________________________________________ __y________y_________,_________y 301 ' ~ ~ ~gi~2291209 ~(AF016424) 1 949 17 contains similarity to acyltransferases (Caenorhabditis elegans) ~

~

~

,________y____ ,_______,______ _y_______________ _y_________________________________________,-_______________--______________ __,________f_________4_________y 373 ~ ~ ~ ~gi~393396 ~Tb-292 ~
1 1066287 membrane associated ~

protein 38 [Trypanosome ( brucei 780 subgroup) ~________,____ ,_______y______ _+_______________ _y__________________________________________________________________________ __y________,_________;_________, 3 Q24 Q2447324955 ~gi~537093 ~ORF_o153b ~
(Escherichia 60 coli) ~

~

y________i____ i_______y______ _i_______________ _,__________________________________________________________________________ __,________y_________,_________+

6 | | 4636 | 5739 | gi|2293258 | (AF008220) YtoI [Bacillus subtilis] | 60 | 35 |

y________y____ f_______,______ _y_______________ _y_____________________________________~______________________._____________ __,_ _____ 6 Q12 A A ~gi~293017 ~ORF3 _____ 19361187 Iput.); __,__ putative __,__ [Lactococcus __y__ lactis) __y ~

~

~

y___-____,____ y_______,______ _y_______________ _y__________________________________________________________________________ __y________y_________f_________y 17 Q13 ' ' 'gi~149569 ~lactacin ~

lLactobacillus ~

sp 32 ] ~

.

y________,____ y_______y______ _y_______________ _y__________________________________________________________________________ __,________y_________4_________t 18 ~ ~ ~ ~gi~1788140 ~(AE0002781 7 69775670 o481; ~60 This ~ ~

as orf is pct identical (19 gaps) to residues of an approx.

as protein NOL1_HUMAN

SW:

[Escherichia cold ~ ( ~ p p +
p ,________,____ y______ _y________________ __,________y_________ p ______ __________________________________________________________________________ _________+

20 Q15 A587817167 ~gnl~PID~d100584 unknown ~
[Bacillus 60 subtilis) ~

~
l290 ________,____ ,_______;______ _,_______________ _y______..___________________________________________________________________ __,________+_________,_________, S. pneumoniae - Putative coding regions of novel proteins-similar to known proteins ,________,____, _______f_______,________________~______________________________________________ ______________________________+________t_________f______~__~

C ORF St St ti ( t h ~

on at ~ ( matc ~ match gene name ~

g op 1 sim t ident ~
length ID SID ~ ~ ~ acession ~ ~
~ ~ (nt) (nt) (ntl ________,___ _,_______,_______,________________,____________________________________________ ________________________________, ________,_________,________ I2 ~ ~ ~gnI~PID~d102050~transmembrane (Bacillus subtilis]
~
1 i 60 ~ ~

~

,________,___ _,_______,_______,________________,____________________________________________ ________________________________~________~_________~_________, 32 Q10 ~ ~gi~2293275~(AF008220) YtaG (Bacillus subtilis] ~

~ ~

~

,________,___ _,_______,_______,________________,____________________________________________ ________________________________~________~_________~_________i W
rr 38 ~15 ~ ~gi~40023 ~B.subtilis genes rpmH, rnpA, SOkd) gidA ~

8837 and gidB (Bacillus subtilis] 60 ~ ~

~
86i ________,___ _,_______,_______,________________f____________________________________________ _______________________________,________f_________~_________, 43 ~ ( ~gi~171787 protein kinase 1 (Saccharomyces cerevisiae]

~ ~

~
' 2667 ________,___ _,_______a _~________~_________f_________~
_______,________________t______________________________________________________ _____________________ ( 44 ~ ~ ~gnl~PID~e235823unknown [Schizosaccharomyces pombe) ~
i 1 60 ~ ~

~

,________,___ _,_______,_______,________________~____________________________________________ _______________________________ _~________y_________~_________~

( 45 Q10 A ~gi~397488 ~1,4-alpha-gluean branching enzyme [Bacillus( 1138 subtilis) 60 10368 ~

( ________,___ _,_______,_______,________________~____________________________________________ __________..,~___________________ _,________,_________,________ 48 (19 A ~gnl~PID~e205173~orE1 (Lactobacillus helveticus]
( A4378 ~

~

,________,___ _,_______,_______,________________ ,____________________________________________________________________________,_ _______,_________~________ 48 Q21 A ~gnl~PID~d102011~(AB0026681 unnamed protein product [Haemophilus~
6,727 actinomycetemcomitans] 60 A ( ( ________,___ _,_______,_______,________________ ,____________________________________________________________________________,_ _______,_________,_________+

50 ~ ~ ~gnl~PiD~e246537~ORP286 protein (PSeudomonas stutzeri]
~

~ ~

~

_,_______,______ _,________________,_.._____.___________________________________________________ __________________,________,_________,_________y 62 ~ ~ ~gnl~PID~d100587unknown (Bacillus subtilis) ~

~ ~

~

________,___ _,_______,_______,________________ ,____________________________________________________________________________,_ _______~_________,_________, 68 ~ ~ ~gi~1573583~H. influenzae predicted coding region W ~

4 J590 0594 (Haemophilus inEluenzae] 60 ( ~

~

70 | 11 | 5781 | 6182 | gnl|PID|d102014 | (AB001488) SIMILAR TO YDFR GENE PRODUCT OF THIS ENTRY (YDFR_BACSU). [Bacillus subtilis] | 60 | 33 | 402

________,___ _,_______, ______ _,________________,____________________________________________________________ ________________,________,_________,_________, F",, 70 (12 ~ ~gnl~PID~e324970hypothetical protein (Bacillus subtilis]
~ W

~ ~

~

~

________,___ _,_______,______ _,________________~____________________________________________________________ ________________,________,_________,_________, 71 ~ A ~gi~580866 ~ipa-12d gene product [Bacillus subtilis] ~

14157 ~

~

___ ,___ _,_______,______ _,________________~____________________________________________________________ ________________,________,_________,_________, 74 ~ (12509 ~gnl~PID~d101832~phosphatidate cytidylyltcansferase (Synechocystis~
8 A sp.) 60 1664 ~

~

76 | 4 | 4116 | 3367 | gi|2352096 | orf similar to serine/threonine protein phosphatase [Fervidobacterium islandicum] | 60 | 39 | 750
80 | 4 | 7665 | 7372 | gi|1786920 | (AE000131) f86; 100 pct identical to GB: ECODINJ_6 ACCESSION: D38582 [Escherichia coli] | 60 | | 294
81 ~ ~ ~gi~147402 ~mannose permease subunit III-Man (ESCherichia~

6 4D73 cola) 60 ~ ~

~

,________y___ _,_______,______ _,_-______________,________________________________________________________________ ____________,________+_________,_________, 86 ( ~ ~gi~143177 putative (Bacillus subtilis) ~

~ ~

~

________+___ _~_______i______ _,________________,____________________________________________________________ ________________,________,_________t_________~

92 ~ ~ ~gi~396398 ~homoserine transsuccanylase (Escherichia ~

1 1 coli] 60 ~ ~

~
i92 ,________,___ _i_______, _______,________________,______________________________________________________ ______________________,________+_________~_________~ 1~

93 | 14 | 10619 | 9384 | gi|1788389 | (AE000297) o464; This 464 aa orf is 33 pct identical (19 gaps) to 331 residues of an approx. 416 aa protein MTRC_NEIGO SW: P43505 [Escherichia coli] | 60 | 27 | 1236

~

________,____,_______~______ _,________________,____________________________________________________________ ________________,________, _________t_________~ C/~

94 | | 5548 | 8121 | gnl|PID|e329895 | (AJ000496) cyclic nucleotide-gated channel beta subunit [Rattus norvegicus] | 60 | 50 | 2574

________,____,_______,______ _,________________,____________________________________________________________ ________________,____..___,_________~________ 97 ~ ~ ~gi~1591396~transketoiase' [Methanococcus jannaschii] ~

~ ~

~

102 | | | | gnl|PID|e320929 | hypothetical protein [Mycobacterium tuberculosis]

~ ~

~

TABLE 2 S. pneumoniae - Putative coding regions of novel proteins similar to known proteins
Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)

/

/

gnl|PID|e334782 | YlbN protein [Bacillus subtilis]

/

/

/

;________~____,_______,_______,________________;_______________________________ _________________________________ ____________,________;_________~_________;
/ ~ 60 ~ 43 W
1l3 ~ 477 ( ____________,________;_________;_________;
8 ~ 60 ~ 32 ~ ~ 2232 /

gi|466875 | nifU; B1496_C1_157 [Mycobacterium leprae]

~

/

~gnl~PID~e328143 /IAJ000332) Glucosidase II

(Homo Sapiens) ________;____;_______ ;_______,________________~______________________________________ __________ _ __,________,_________,_________, ( ________ ( 60 ~

122 ( 306 ~ ( / /gnl~PID~d101876 %4763 /transposase (Synechocystis sp.) ;________;___ _;_______;_______;________________;____________________________________________ ____________________ __________ 127 ~ ( ~ 4510 5283 __;________,_________;_________, 8 /gi/1777938 60 i 38 ~
~Pgm 774 (Treponema pallidum) ~________;____;_______~_______~________________;_______________________________ _________________________________ ____________;________ 138 ~ ~ ~ _________ 4 3082 2672 _________;

/gnl~PID/e325196 ~ 60 ~ 36 /hypothetical 4i1 protein ' (Bacillus subtilis) ________,___ _,_______,_______,________________;____________________________________________ __________ ______ _____ 139 ~ _______ __ ~ 177 ~ _____ __ 1 4 ~ 60 / 39 ~gnl~PID~d1006B0 ~ 174 /ORF

(Thermus thermophilusl ________,____,_______;_______;________________;______________ ___ ___ _____ / (11/14520_ __ ____ _ 139 ______________________________ _ _ + 60 30 ~

~gi~537145 /

/ORF_f477 [Escherichia cold ________;___ _;_______~_______;________________+____________________________________________ ____________________ ____________;________~_________~_________;
/ ~ ~ ~ 60 ~ 37 1d0 2592 1219 ( 1344 ~ ~gi~1209527 2 /protein histidine kinase [Enterococcus faecalis) ________,___ _,_______~_______,________________,____________________________________ __________ 141 ~ _______________________ _ _________ / 210 ~

1 l049 331 ~gi/463181 / 60 ~ 34 ORF

from by to 4081;

putative [Human type paplllomavirus ________,___ _,_______-141 ~ -~ 5368 ,_______;________________;____________________________________________ ______ ---__,________,_________;________ ~

gi|145362 | tyrosine-sensitive DAHP synthase (aroF) [Escherichia coli]

~

~

________,___ _,_______;_______,________________,____________________________________________ ____________________ _ 142 ( / ~gi~600711 ___________;________,_________,_________, ~ 3558 4049/putative ~ 60 ~ 37 6 [Bacillus ~

subtilis]
l48 ~ ~ ( 60 ~ 27 Q10 7742 8713 ~ g72 ~gnl~PID~e313022 hypothetical protein (Bacillus subtilisl ________,____,_______;_______~________________,________________________________ ____________________________________________;________;_________~_________;

153 | 5 | 3667 | 4278 | gi|2293322 | (AF008220) branch-chain amino acid transporter [Bacillus subtilis]

;________;___ _;_______~_______,________________;____________________________________________ ________________________________~________+_________;_________;

155 | 1 | 1413 | 748 | gi|2104504 | putative UDP-glucose dehydrogenase [Escherichia coli]

~________~___ _;_______~_______;________________;____________..______________________________ _________________________________;________;_________;_________;

158 | 3 | 3116 | 2472 | gnl|PID|d100872 | a negative regulator of pho regulon [Pseudomonas aeruginosa]

________,____,_______,_______ ,________________,_____________________________________________________________ _______________f________,_________;_________;

/ ( / 1J86 gnl~PID~e308090 159 3 77B ( product ~ highly similar to Bacillus anthracis CapA protein (Bacillus ( 60 ( / ~ ( ~ / subtilis) ( /
~

~________;___ _;_______~_______;______--_--_,~___;________________________________________________________________ ____________;________;_________;_________;
163 ~ ~ ~ 60 ~ 38 ~ 8049 8468 / 420 7 ~gnl~PID~d101313 ~YqeN

[Bacillus subtilis]

170 | 3 | 4130 | 2688 | gi|1574179 | H. influenzae predicted coding region [Haemophilus influenzae]

~________;___ _y_______,_.._____;________________;___________________________________________ _________________________________;________;_________;_________;

171 | 7 | 4717 | 5901 | gi|606076 | ORF_o384 [Escherichia coli]

~________;___ _;_______;_______;________________;____________________________________________ _______.__________________ ______,________;_________,_________;

183 | 3 | 2440 | 2135 | gi|1877427 | repressor [Streptococcus pyogenes phage T12]

~________;___ _;_______;_______;________________ ;_________________________ _ / / ~ ~gi catabolite control - -________________________________________________;________;_________;_________i 191 9444 8928415664 protein (Bacillus megaterium) ~ 60 H
/10 ~ 42 ~ 1017 _,_______;_______,________________ ;___ _____________________ __ ____ ________,___ _______________________________,________,_________,_________;

~ ( ~gi~438462 ~transmembrane protein [Bacillus subtilis) ~1 200 139 1083_,________________ ~ 60 ~ 37 ~ 945 /

~ _;_______,______(gi~475112 ;____________________ _________________ 1 ~ ~ _;________________ __,________;_________;_________;

~________,___ 3895 1928 enzyme IIabc [Pediococcus pentosaceusl 201 _;_______;______~gi~1573407 ~ 60 ~ 39 / 1968 / _;________________ ;____________________________________________________________________________ 3 A A __ /

;________;___ 0930 0439~gi~608520 ______+_________,_________;

_;_______;______ hypothetical (Haemophilus influenzae) 214 / 60 ~ 39 ~ 4g2 /15 / ~
;__________________________________________________ ;________;___ 2l95 2363 __________________________;________;_________;_________i (myosin heavy chain kinase A [Dictyostelium 21B discoideum) ~ 60 ~ 31 / 219 ~

;________~____;_______;_______;________________ ;____________________________________________________________________________;_ _______;_________;_________;

TABLE 2. S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)

gi|437705 | hyaluronidase [Streptococcus pneumoniae]

________,__________________,___________________________________________________ _________________________________________,_________________,_________, pp j a42 j j j jgij43938 jSor regulator (Klebsiella pneumoniaej j ( j _______________________________________________________________________________ __..__________..____________________..______________________________, Y.1 j j j jgij304897 jEcoE

type I

restriction modification enzyme H

subunit [Escherichia coli]
gi|671632 | unknown [Staphylococcus aureus]

j ( j j ____________,__________________________________________________________________ ____________________________..___________________,______-___________, ~

j ~

gi|153791 | rgg [Streptococcus gordonii]

,__________________________________________,___________________________________ _________________________________________________,__________________ j j Z

pir|S31840|S318 | probable transposase - Bacillus stearothermophilus

i71 ( _______________________________________________________________________________ _________________~____________________,__________________________ j j j j ~gi(1592173 jN-ethylammeline chlorohydrolase [Nethanococcue jannaschii]

~

j ~

j ,___________________,_______,________________,______________.._________________ ____________________________________________________,__________________, j j j ~

gi|1787397 | (AE000214) o157 [Escherichia coli]

j ________,____,_______,_______,________________,________________________________ ____________________________________________,________,_________,_________ y j n 3l8 j j ~

gnl|PID|e137594 | xerC recombinase [Lactobacillus leichmannii]

______________________-_______________________________________________________________________________ ________-________________________,_________, . N ..

j j ~

~

gi|509672 | repressor protein [Bacteriophage Tuc2009]

j j _______________________________________________________________________________ _________________________________________________________________, J
( ~

gi|2293147 | (AF008220) YtxH [Bacillus subtilis]

j j j j ,____________,_______,_________________________________________________________ ___________________________________________________________,_________ - N

j o .

22 | 18140 | 17142 | gnl|PID|e280724 | unknown [Mycobacterium tuberculosis]

gi|1353880 | sialidase L [Macrobdella decora]

j j j j _______________________________________________________________________________ _______________________________________________,__________________ ~

( j ( ~

gi|580841 | Fl [Bacillus subtilis]

( I

gi|142469 | als operon regulatory protein [Bacillus subtilis]

j ( ~

_____ N
____ ____,__ _______ ________________ ____________________________________________________________________________ ________ _________ _________ __ t j --j ~

gnl|PID|e280623 | PCPA [Streptococcus pneumoniae]

j ~

j l917 j _______________________________________________________________________________ _________________________________________________________________ j j ~

gnl|PID|e233868 | hypothetical protein [Bacillus subtilis]

j ________,____________________________..________________________________________ _________________________________________________,__________________ j ~

~

gnl|PID|e202290 | unknown [Lactobacillus sake]

~

13 | 12201 | 11071 | gnl|PID|e238664 | hypothetical protein [Bacillus subtilis]

14 | 13288 | 121A2 | gi|1657647 | CapBH [Staphylococcus aureus]

~

( __________________________________________,____________________________________ __________________________________________________________________ ( j18 A

17897 | gi|1500535 | M. jannaschii predicted coding region MJ1635 [Methanococcus jannaschii]

j j _______________________________________________________________________________ _________________________________________________________________ ~12 ~

~

gi|2293239 | (AF008220) YtxK [Bacillus subtilis]

j _______________________________________________________________________________ _____________________________________ ____ ____ .._,__ __y__ _..__________ ~

~

~

gi|1684845 | pinin [Canis familiaris]

~

~

gnl|PID|d101329 | YqjK [Bacillus subtilis]

j j _______________________________________________________________________________ _________________________________________________________________, J

gnl|PID|e137594 | xerC recombinase [Lactobacillus leichmannii]

j ( j ( __________________________,__ ___ ___ _ __ ____ ___ _______________________________________________________________________________ ___,_______________..__ j j ~

gnl|PID|e311516 | aminotransferase [Bacillus subtilis]

gi|1146190 | 2-keto-3-deoxy-6-phosphogluconate aldolase [Bacillus subtilis]

gi|1573628 | pantothenate kinase (coaA) [Haemophilus influenzae] | 59 | 38

________,____ ,_______,_______,________________,_____________________________________________ _______________________________y________,_________i_________y r.

87 | 12 | 11383 | 10055 | gnl|PID|e323504 | putative Fmu protein [Bacillus subtilis] | 59 | 44 | 1329
113 | 14 | 13927 | 15894 | gi|1673731 | (AE000010) Mycoplasma pneumoniae, fructose-permease IIBC component; similar to Swiss-Prot Accession Number P20966, from E. coli [Mycoplasma pneumoniae] | 59 | 43 | 1968
115 | 8 | 8766 | 8521 | gi|1590886 | M. jannaschii predicted coding region [Methanococcus jannaschii] | | 38 |
119 | 2 | 1966 | 1526 | gnl|PID|e209005 | homologous to in nrdEF operons of E. coli and S. typhimurium [Lactococcus lactis]
128 | 17 | 13178 | 3438 | gnl|PID|e279632 | unknown [Mycobacterium tuberculosis]

________,____ ,_______ y_______ y________________,_____________________________________________________________ _______________ ,________,_________y_________+

140 | 22 | 23903 | 23388 | gi|482922 | protein with homology to pail repressor of B. subtilis [Lactobacillus delbrueckii] | 59 | 40 | 516
148 | 13 | 9b97 | 9014 | gnl|PID|d102005 | (AB001488) FUNCTION UNKNOWN, SIMILAR PRODUCT IN H. INFLUENZAE AND SYNECHOCYSTIS. [Bacillus subtilis]
149 | 10 | | | gi|710422 | cmp-binding factor [Staphylococcus aureus] | | 40 |

________,____ ,_______ ,_______ ,________________y ____________________________________________________________________________,__ ______,_________t_________, ( ~ 6993 ~ ~gnI~PID~d100965 ferric anguibactin-binding protein precusor 164 9 6013 : FatB of V. anguillarum ~ 59 ~ 41 [Bacillus ~
subtilis) ,________,____ ,_______ ,_______ ,________________ ,__-_________________________ ________________________________________________ ,________y_________f________ 164 12 8836 7823 l d ~ ~ ~gn homologue ~PID~ of 1009 ferric d anguibactin transport system permerase protein FatC
of ( ~ i V. i anguillarum [Bacillus subtilis]

y________y____ ,_______ y_______ y________________y_____________________________________________________________ _______________ y________,_________,_________y 177 ~ ~ ~ ~gi~289759 coded 2 401 1072 for by C.
elegans cDNA

(GenBank:Z14728);

putative ( ~ ~ ~ ~ ~ i i (Caenorhabditis i elegansl ________,____ ,_______ ,_______ ,________________,_____________________________________________________________ _______________y________y _________~_________, 177 ~ ~ ~ ~gi~2313445 ~(AE000551) ~
7 3841 4200 H. 59 pylori ~
predicted 38 coding ~
region 360 [Helicobacter pylori]

________,____ ,_______ ,_______ ,________________,_____________________________________________________________ _____________ __ __,____ __y_________,_________y 183 ~ ~ ~ ~9i re ~
4 276B 2508 509672 ressor p ~
protein 50 [Bacteriophage ~

Tuc2009/ Z61 ,________,____ ,_______ ,_______ ,________________,__________~___.______________________________________________ ________________ y________y_________,_________, 1B6 ~ ~ ~ ~gi~606080 ~ORF_o290;

6 3398 2820 Geneplot suggests frameshift linking to o267) not found (ESCherichia i ~

coli[ i ,________,____ ,______ ,_______ y________________,________________-___________________________________________________________,________, _________,_________, 190 ~ ~ ~ ~gi~1613768 ~histidine ~
3 3120 1711 protein kinase ( [Streptococcus 32 pneumoniae) 1410 y________,____ y_______ ,_______ ,________________ t________________________________________________,______________________, ____ ,________,_________y_________y 191 ~ ~ ~ ~gnI~PID~d100579 ~unknowrt 2 1621 I019 [Bacillus s btili ) u ~
s 59 ~

~

_.,______,____ ,_______ ,_______ ,________________ ,_____ __ ___ _____ ,.d ___________________________ ________ __________ -y--------_, _ ______________ 198 ~ ( ( ~gnI~PID~e313073 hypothetical ~
7 5205 4306 protein (Bacillus ~
subtilis) 38 ~

________,____ y_______ ,_______ ,________________ ,__________________________________________________________________ y________,_________,_________y 220 ~ ~ ~ ~gnl~PID~d101322 _________._ ~
4362 3958 ~YqhL 59 [Bacillus ~
subtilis) 46 ~

y________,____ y_______ ,_______ y________________ y____________________________________________________________ y________,_________,_________y __ 242 ~ 1S73 ~ ~gi~17B7045 ~(AE000184) ~ 42 795 3 2367 f308; 59 ~
This as orf is pct identical gaps) to ~ i res dues of an approx.

as protein PFLC_ECOLI
SW:

[Escherichia colil ________,____ ,_______ y_______ ,________________ ,______________________________________ _ ,____ - __,_________y_________ ____________________________________ 247 ~ ~ ~ ~gi~40073 ~ORF107 2 115d 19B0 (Bacillus subtilisl ( I
' ________,____ ,_______ ,_______ ,________________ y +
____________________________________________________________________________ ________ _________,_________ TAI3LC 2 S, prreumoniae - Putative coding regions of novel proteina'sld~ilar to known proteins a________a____a_______ ,_______, ________________,_______________________________________________________.._____ _______________a_________________a_________a C F

i ont ~OR~ ~ j match gene name ~ t sim ~ t ident g StartStopmatch ~ length ~

ID CIO~ (nt)~ ~ ~
~ (nt) (nt) acession ~ ~

________a____,_______,_______a________________a________________________________ __________________________ __________________a________a_________a_________a ~p 2S6 ~ ~ ~gnl~PID~d101924 j 59 ~ 39 j 867 1 868 ~hemolysin j ' [Synechocystis 2 sp.]

________a____,_______,_______,________________a________________________________ __________________________ __________________a________a_________a_________a ( j 65 ~gi~2246532 ORF 73) contains large complex 258 1 ~ ~ repeat CR 73 (Kaposi's sarcoma-associated j B20 ~ 59 756 herpesvirus) ( _ ( ,.W..

,________,__.._a_______a______ _,________________a______________________________________________-___________ __________________a_.-______a_________a_________a 270 ~ ( ~gnl~PID~d102092 ~
59 ~ 40 ~ 741 1 386 ~YfnB

~ (Bacillus 1126 subtilis]

a________a____a_______,______ _,________________a__________________________________________________________ ________ _________a________,_________a_________, 2B1 ~ ~ ~gi~666062 ~ 59 ~
31 ( 387 1 552 putative ~ (Lactoeoecus 166 lactic) ________,____,_______,______ _a________________a______________________________________________________-___ __________________,________~_________a_________a 309 ~ ~ ~gi~405879 1 3 ~yeiH 59 ~ 38 ~ 477 ~ (ESCherichia d79 cola) ,________,____a_______,______ _a________________,__________________________________________________________ __________________a________a_________,_________a 363 ~ ~ (gi~915208 ~ 59 ~
31 ~ 1893 1 2 gastric ( mucin 189d (Sua scrotal ,________a__.._,_______,______ _,________________,__________________________________________________________ __________________,________,_________a_________a 387 ~ ~ ~gi~160671 ~ 59 ~
44 ~ 34Z
2 d25 jS

( antigen 84 precursor (Plasmodium falciparum]

________a____,_______,______ _a________________s__________________________________________________________ __________________a________a_________a_________, ~ A1223 jgnl~PID~d101812 j 58 ( 29 ~ 759 y 6 (10465 ~LumQ j [Synechocystis sp.]

________a___________,______ _,________________a_____________________________ __a________a_________a_________, ______ -29 ~ ~ ~gnl~PID~d100479 ( 58 ~ 39 ~ 1116 1 2098 ~Naa ~ -ATPase 3513 subunit .1 (Enterococcus hirae) ________,____,_______,______ _,________________,__________________________________________________________ __________________a________a_________~_________a 30 ~ ~ ~gi~39478 ( 58 ~
34 ~ 40B J
5 4058 ATP ~

~ binding 3651 protein of transport ATPases [Bacillus firmus) ________,____,_______,______ _,________________,_________________________________________,________ __i________a_________f_________a __ j ~ ~ ~gnl~PI0~d101164 ~
58 ( 45 ' 774 33 6 2983 unknown j ~ (Bacillus 2210 subtilis) (________a____,_______,______ _a________________a__________________________________________________________ __________________,________a_________,_________a o 36 ~ ~ jgi~1518679 j 58 ~
32 j 864 8 5316 jorf j j [Bacillus 6179 subtilis]

________,____;_______,______ _,________________a__________________________________________________________ __________________a________a_________,_________, 43 ~ ~ ~gi~1788150 ~ 58 ~
37 ~ 1956 5 5926 ~(AE000278) ~ protease [ESCherichia coli]

________a____,_______a______ _,________________,__________________________________________________________ __________________a________y_________,_________a 46 ~ ~ ~

5 3704 nl~PID~e267329 ~ U

S221 k (Ba ill ili b g ~ 58 ~ 42 ~ 151B
o n ~

nown c us su t s) ________,____a_______a______ _a__t_____________,__________________________________________________________ __________________,________a_________,_________a ~

48 Q14A ~gnl~PID~d101771 .] ~ 58 ~ 34 1722 thiamin j 657 A biosynthetic 1066 bifunctional enzyme tSynechoeystis sp ________a____a_______,______ _________________,__________________________________________________________ __________________a________,_________,_________a 52 ~ ~ jgnl~PID~d101291 ~
58 ~ 35 ~ 1227 1 1229 ~reductase ..
( [Pseudomonas 3 aeruginosa) ________,____,_______a______ _,________________,__________________________________________________________ __________________,________a_________a_________a 53 ~ ~ ~gi~2313357 2 702 ~(AE000545) ~ cytochrome 412 c biogenesis protein (ccdA) [Nelicobacter pylori]

~

( ' ________,____,_______,______ _,________________a__________________________________________________________ a __________________,________ _________,_________, 58 ~ ~ ~gi~147329 4 6586 transport ~ 58 ~ protein 41 5498 [ESCherichia 1089 coli) ~

' _ ( ,_______ _______________ .
_ _________ _______________________________________ a __,__ ___ _________________ ____ __ _____ __ ____ __a 69 ~ ~ ~gnl~PID~e311492 ( 58 ~ 41 ~ 1128 5 4934 unknown ~ [Bacillus 3807 subtilis]

(________,____,____-__a______ _,________________a_________________________________________________---______ __________________,________,_________,_________, 71 ~27Q31357 (gi~2408014 ~ 58 ~ 33 ~ 921 32277 hypothetical protein [Schizosaccharomyces pombel ,________,____/_______,______ _,________________a__________________________________________________________ __________________a________a_________,_________;

72 ~ ~ ~gi~1B694 ~ 58 ~
34 ~ 705 4 3586 ~nodulin-21 ~ IAA

2882 1-201t [Glycine max) ________,____,_______,______ _,________________,__________________________________________________________ __________________,________a_________,_________+ b 74 ( ~ ~gi~2293252 ~ 58 ~
33 ~ 708 ,_______3 d937 ~(AF008220) , ~ YtmO

4230 (Bacillus subtilisl _ ____,_______,______ _,________________a__________________________________________________________ __________________,________a_________,_________a 79 ~ ~ ~gi~1217989 ~ 58 ~
44 ~ 1173 4 4594 ~ORF3 ~ [Streptococcus 3422 pneumoniae) ________,____,_______,______ _,________________,__________________________________________________________ __________________,________a_________,_________, 82 ~ A ~gi~882711 ~ 58 ~
38 ~ 2415 8 0585 ~exonuclease ~ V

8171 alpha-subunit (Escherichia cola]

________,____,_______,______ _,________________,__________________________________________________________ __________________,________r_________a_________a f..

86 Q17A6017 ~gi~17642 typhi) ~
58 ~
15337 ~5-dehydroquinate 32 ~ 681 hydrolyase (3-dehydroquinase) [Salmonella ,________,____ ,_______,______ _,________________,____________________________________________________________ ________________,________,____.-____a_________a 97 ~ ~ ~gi~153794 ~ 58 ( 32 j 372 2 931 ~rgg ~ (Streptococcus 560 gordonii]

________,____,_______,______ _,________________,____________________________________________________________ ________________,________a_________a_________, S. pneumoniae - Putative coding regions of novel proteins similar to known proteins ________,____~_______~_______,________________,________________________________ __________________________________-~________,________+_________~_________, Contig~ORF ~ ~ match ~ match gene name ~ t 8 ~ length ~ Stop sim ident Start ~ i ID SID ~ ~ acession Int) ~ (nt) (nt) ,________~____y_______ ~_______,________________~_____________________________________________________ _______________________,________f_________~____.-____ , QA

108 ~ ~ (gi~537020 ~vac8 gene product [Escherichia coli) ~ 58 2 2724 ~ 37 ( ~ 2367 ________~____+_______ ,_______E________________+_____________________________________________________ _______________________,________~_________i________ 111 ~ ~ ~gi~1592142ABC transporter, probable ATP-binding 5240 subunit [Hethanococcus jannasehii) ~ ~ 58 ~ 36 ~ 648 ___ ,____,_______ ,_______,________________a_____________________________________________________ _______________________,________,_________,_________, 120 ~ ~ ~gnl~PID~d101320~YqgX (Bacillus subtilis] ~ 58 3 5110 ~ 47 ~ ~ 690 d421 ________+____,_______ ~_______,________________,___________________________________________-____________________ ____________t________~_________,_________i 128 Q16 A2673~gi~662919 ~ORF U (Enterococcua hirae) ~ 58 A3131 ~ 42 ~ 459 ~________~____E_______ ,_______t________________~____.______________________-_____________________________________ __-_________y________~_________~____-____y 132 ~ ~ ~gi~1800301~macrolide-efflux determinant [Streptococcus~ 58 3 4939 pneumoniae] ~ 35 ~ ~ 1236 ~________~____~_______ ,_______,________________t_____________________________________________________ ___________ ____________i________,_________t_________, 133 ( ~ ~gnl~PID~e269488Unknown [Bacillus subtilis) ~ 58 1 B90 ~ 36 ~ ~ 780 ________,____,_______ ,_______,________________,_____________________________________________________ _ ____________,________,_________,_________, 7________ 160 ~I1 ~ ~gi~473901 ~ORFi (Lactococcus lactisl ~ 58 ( 9865 ~ 39 861S ~ l251 ________,____,_______ ,_______,________________~_________..__________________________________________ ____________ __-_________,________,_________+_________, 161 ~ ~ (gnl~PID~d101024~DJ-1 protein [Homo Sapiens) ~ 58 6 6849 ~ 32 ~ ~ 582 ,________a____,_______ i_______,________________i_____________________________________________________ ___________ ____________i________~_________~_________~

I69 ~ ~ ~gnI~PID~d100447translation elongation factor-3 (Chlorella~

1 2 virus) ~ 31 ~ ( 213 ________,____,_______ ,-______,________________,__-______-__-___________________________________________________ ____________,________~_________~________ 1B7 ~ ~ (gi~475114 ~regulatocy protein [Pediococcus pentosaceus)~ 58 1 2 ~ 38 ~ ~ 486 ,________,____~_______ ,-______,________________,_______________________________________________________ _________ ____________~________~_________,_________, 187 ~ ~ ~gi~167475 ~dessication-related protein (Crateroatigma~ 58 6 4620 plantagineuml ~ 55 ~ ~ 237 ,________,____,_______ ,-______,________________~________________________________-___________________________________________y________~_________i________ 190 ~ ~ ~gnl~PID~e246727competence pheromone [Streptococcus ~ 58 2 I640 gordonii) ~ 38 ~ ~ 177 ________,____,_______ ,_______,________________,_________-______________________________________________________ ____________,________,_________~________ 192 ~ ~ ~gnl~PID~d100556drat GCP360 (Rattus rattus) ~ 58 2 1344 ~ 44 ~ ~ 669 ,________?____,-.~----_ ,_______,______________-_~____________________________________________________________________________~
________,_____-___y________ 206 ~ ~ ~gnl~PID~e202579(product similar to WrbA [Lactobacillus~ 58 1 696 sake] ( 35 ~ ~ 597 ________,____,_______ ,_______,________________~_____________________________________________________ _______________________~________t___-_____,_________+

216 ~ ~ ~gnl~PID~e325036(hypothetical protein (Bacillus subtilis]( 58 2 555 ~ 33 ~ ~ 1779 ________,____,_______ ,_______,________________,_____________________________________________________ _______________________,________,_________,_________t 217 ~ ( ~9i~466474 ~cellobiose phosphotransferase enzyme phflus]

5 4321 II" (bacillus stearothermo ~, ~ 58 5250 ~ 38 ~ 930 ________,____,_______ ,_______,________________~_____________________________________________________ _______________________,________,_________~_________, i 2I7 i 5636i ignl~PID~d102048i8. subtilis cellobiose phosphotransferase98) 7 5106 system celB; P46317 (9 ~

i transmembrane (Bacillus subtilis] ~ ~

____,____,____,_______ ,_______,___-____________,________________________________________________________________ ____________f________,_________,_________~

232 ~ ~ ~gi~1573777cell division ATP-binding protein (ftsE)] ~

1 B11 (Haemophilus influenzae 58 ~ ~ 39 2 ~ 8I0 ,________~____~_______ ,_______,________________~______________________________-_______________________-_________ __________~_,________~_________~________ 264 ~ ~ ~gi~973330 ~NatA [Bacillus subtilis) ( 58 1 715 ~ 32 ~ ~ 714 ,________~____y_______ ~-______y________________~________________________________-___________________________________________t________~_________~_________r 280 ~ ~ ~ ~gi~1786187~IAE000111) hypothetical 29.6 kD proteinregion 58 1 33 767 in thrC-talB intergenic fEscherichia colil i ~ i ,________,____~_______ ~..______,________________~_____________________________-__-_________________________________________ ____ _____ _____ __,__ __~__ __~__ _ 306 ~ ~ ~gnl~PID~e334780~YlbL protein [Bacillus subtilis] ~ 58 ,________1 3 -,________________,_____________________________________________________________ ___~ 47 ~ ~______ ~ 843 ____________,________,_________,_________, ,____,_______ 360 ~ ~ ~sp~P46351~YZGD_HYPOTHETICAL 45.4 KD PROTEIN IN THIAMINASE~

3 1092 I 5'REGION. ~ 32 ~ i 465 ,____..___,____4_______ ,_______,________________~_______________-____________________________________________________________~________~_________ ~________ J63 ~ ~ ~gi~160671 ~S antigen precursor (Plasmodium falciparum]( 58 5 1867 ~ 51 ~ ~ 294 ,________,_-__,__-____ ,_______,___________-,____f_________________________----_______________________________________________~_..______y_________,__---___ 372 ~ ~ (gi~393394 ~Tb-291 membrane associated protein ~ 58 yp 1 3 [Trypanosome brucei subgroup] ~ 37 ~ ~ 804 (________,____t_______ ,_______,________________,_____________________________________________________ _______________________~________,_________~_________~
~I1 ~

382 ~ 749 ~ ~pir~JC1151~JC11hypothetical 20.3K protein (insertion terium 2 519 sequence IS1131) - Agrobac ~ 58 ~ p ~ ~ 41 ~

tumefaciens (strain P022) plasmid Ti ,________,__.._,_______ ,_______i________________~_____________________.._________________----_________________________________~________f_________t -________t S. pneumoniae - Putative coding regions of novel proteins similar to known proteins ,________y____,_______,_______,________________a_______________________________ _____________________________________________,________~_________,_________y Contig~ORF~ ~ ~ ~ match gene name ~ i sim ~ ~ length StartStopmatch 1 ident ID SID~ ~ ~ ~ ~
~ ~ (nt) (nt)Int)acession ________,____y_______,_______,________________y________________________________ ____________________________________________,_______ ________y_________y a 3 ~ ~ ( ~gi~1499745 9 84097471~M.

jannaschii predicted coding re9lon (Hethanococcus jannaschii) ~

~

~

________,____,_______f_______,________________y__________..____________________ _____________________________________________a________,_________i_________y pp Q10~ ( ~gi~1737169 76747507homologue to (Arabidopsis thaliana) ~

( ~

________,____,_______,_______y________________y________________________________ ____________________________________________y________y_____-___,_________y Hr 11 ~ , ( (gnl~PID~d100139 1 2 412 ~ORF

(Aeetobacter pasteurianusl ~

~

~

| 31 | 4 | 2032 | 1388 | gi|2293213 | (AF008220) YtpR [Bacillus subtilis] | 57 | | |
| 33 | 11 | 6931 | 6449 | gnl|PID|e324949 | hypothetical protein [Bacillus subtilis] | 57 | | |
| 45 | 5 | 5446 | 5060 | gi|1592204 | phosphoserine phosphatase [Methanococcus jannaschii] | 57 | | |
| 49 | 7 | 6523 | 7632 | gi|155369 | PTS enzyme-II fructose [Xanthomonas campestris] | 57 | | |
| 52 | 6 | 4520 | 6850 | gi|1574144 | single-stranded-DNA-specific exonuclease (recJ) [Haemophilus influenzae] | 57 | | |
| 53 | 5 | 2079 | 1795 | gi|1843580 | replicase-associated polyprotein [oat blue dwarf virus] | 57 | | |
________,____,_______,_______,________________,________________________________ ____________________________________________y________,_________y_________, o 63 6 ~ ~gi~2182608 ( 4995~/AE000094) 5312 Y4rJ

(Rhizobium sp.

NGR234) __ y , , y ,_ ______________.__, _____ _ _________._______ _________________________________ ________.._, _ ___ _ y 72 Q15A 13059y 3883 y _ __ ___ _______ ______________ _____.__ ______..__ _________y (gnI~PID~d100892 homologous to SwissProt:YIDA_ECOLI

hypothetical protein [Bacillus subtilisl ~

~

~

________,____,_______,___..___,________________,_______________________________ _____________________________________________y________,_________y_________y 79 ~ ( ~ (gnI~PID~d100965 57 44 747 o 2 25611A15(homologue of NADPH-flavin oxidoreductase Frp of V.

harveyi (Bacillus ~ subtilis) i ~ i ~
~

,________,____y_______,_______y ________________y______________________________________________________________ ______________,________y_________y_________y ( ~ ~ ~ ;gi~1206045 short region of similarity to glycerophosphoryl57 35 168 B2 9 9596976l diester phosphodiesterases ( ~ y ~ ~ vp (Caenorhabditis elegans) ________,____y_______y_______y________________y________________________________ ____________________________________________ v________y______..__,_________y ' 86 ~16A 14493~gi~1787983 AE000264) o(88; 92 pct identical (1 gaps) 57 34 879 5371 to 222 residues of fragment YDIB_ECOLI SW: P(8244 (223 aa) (ESeherichia coli) y________p___y_______y_______,________________y________________________________ ____________________________________________y________y_________y_________y 93 ~ ~ ~ ~gi~1500003 3 16951177~mutator mutT

protein (Methanococcus jannaschii) ~

~

~

| 96 | 6 | 3026 | 4519 | gi|559882 | threonine synthase [Arabidopsis thaliana] | 57 | | |
| 99 | 14 | 17211 | 18212 | gi|773349 | BirA protein [Bacillus subtilis] | 57 | | |
| 112 | 8 | 7448 | 7903 | gi|1591393 | M. jannaschii predicted coding region [Methanococcus jannaschii] | 57 | | |
113 Q16A 18328~pir~A45605~A156 mature-parasite-infected erythrocyte surface 57 8627 antigen MESA - Plasmodium 22 ~

~ ~
falciparum ,________y____y_______y_______y_____________..__y______________________________ .._____________________________________________;
________y_________~_________y 123 ~ ~ ( ~pir~F64149~F641 2 )43 1110hypothetical protein -Haemophilus influenzae (strain Rd KW20) ~

~

~

________,____,_______,_______y________________,________________________________ ___________________ ___ _ 123 ~ ( ~ _____________________,________y_________y_________y 4 210B2884(gnl~PID~d102148 ~(AB001684) sulfate transport system permease protein (Chlorella vulgaris) ~

~

| 127 | 10 | 6477 | 5587 | gi|1573082 | nitrogenase C (nifC) [Haemophilus influenzae] | 57 | | |
| 128 | 13 | 9251 | 9790 | gi|153692 | pneumolysin [Streptococcus pneumoniae] | 57 | | |
________,____,_______,_______y________________a________________________________ __________________________________________ ___, 131 ~ ~ ~ __ 4 21391363__,__ __y__ _____y_________+

~gi~42081 ~nagD

gene product (AA

1-250) (Escherichia coli) ~

~

~

,________,____,_______,_______,________________y_______________________________ _____________________________________________y________y_________y_________i S. pneumoniae - Putative coding regions of novel proteins similar to known proteins .._______y____ _____________ _________________,_____________________________________________________________ _______________________,_________,_________ I IORF1 I I match I match gene name 1 I i lengthI
Contig StartStop E
ident 8im I IIDI I I acessionI I
I ~ (nt)I
ID (ntl (nt) ________,____,______________________________,__________________________________ __________________________________________y__________________________ ( I,11 I Ibbs1148453(SpaA=endocarditis immunodominant antigenMUCOB 1 136 214 1221 [Streptococcus sobrinus) 57 ~ i I I~ I I I I 263, Peptide, 1566 aal [Streptococcus 1 I
sobrinus]

,__________________________,________________r__________________________________ ____________________________________________________________________ W

1 125128701I26851(9i1505576(beta-glucoside permease (Bacillus subtilis[I 57 I

I

,________r____,______________;_________________________________________________ ___________________________________________y_________________,_________;

I [ [ I 19i1995560lunknown lSchizosaccharomyces pombel I 57 I

I

____________,_______,_______,________________,_________________________________ _____________________________________________________________________ I I 1 1 IgnIIPIDId100139IORF (Acetobacter pasteurianusl I 57 ____________,__________________________________________________________________ ________________________________________,_________________y_________, 1 I I I (9i1600431Iglycosyl transerase (Erwinia amylovora[I 57 155 9 5454 4564 ( I

I

,________.___________,_________________________________________________________ __________________________________________f_________________,_________ I I I I (9i1290509(o307 [Escherichie coli) I

( ___________________y_______,___________________________________________________ ___________--____________________________________+__________________ I I11I I IgnIIPIDId100139IORF [ACetobacter pasteurianus[ 1 57 I

I

y________,____y______________,________________,________________________________ _____________________________________________________________,_________ I ( I I (9i1147902Imannose permease subunit III-Han [EscherichiaI 57 171 6 4023 d436 coli) I

I
4t4 I

___________________,_______,___________________________________________________ _________________________________________,__________________________, I I I I IgnIIPIDId102004IIABDOlIBB) ATP-DEPENDENT RNA HELICASE ilis) 178 4 2170 107b DEAD HOHOLOG. [Bacillus subt I

I

I

I

N
____ ,___________,_______,________________,_________________________________________ _________________________________..______________-____________;
N

I I I 1 (9i1149920lexport/processing grotein (Lactococcus I 57 'J
190 1 145 I455 lactic[ I

I

I

________,____~_______,____..__________________,________________________________ ____________________________________________,____.._____________________ ,J

I I ( I 1g11522268lunidentified ORF22 [Bacteriophage bIL67)I 57 N

I

I

(________,___________,_______,________________,________________________________ ____________________________________________,_________________,_________, o 203 I 1 ( IgnIIPID1e283915lorf c01003 (Sulfolobus solfataricusl ~ 57 _ N ,r 4l I

I

____________,_______,__________________________________________________________ _________________________________________________,_________,____.____ O vo I 1 1 1 (9i11439527IEIIA-man [Lactobacillus curvatus[ I 57 I

I

,__________________________,___________________________________________________ _________________________________________________,_________;_________ I I I 1 IgnIIPIDId102099IH. influenzae, ribosomal protein alanine[1891 57 48 aa7 o 214 7 4243 3797 acetyltransferase; P94305 1 I

I I I I I I (Bacillus subtilise I I
I I

,___________________,____..__________________,_________________________________ ___________________________________________y________ __________________;
N

( I I I 19i143979 IL.curvatus small cryptic plasmid gene us 57 268 3 1767 1276 for rep protein (Lactobacill 1 ( I I 1 I I I curvatus) I 1 I I

(________,____,_______._______,________________,_______________________________ _____________________________________________,_________________,_________, I ( I I (gnllPIDle275871IT03F6.b (Caenorhabditis elegance I 57 I
29l I

;____________y_______,__________________--___________________________________________---_________________________________,________,______--__________, I I 1 1 (9i1160671IS antigen precursor [Plasmodium falciparum[1 57 I

I

f_________________________________________..___________________________________ _________________________________________4________;__________________, 1 I I104861 (9i1405857IyehU (Escherichia cola]

I

.____________f_______,_______f_________________________________________________ ________________-__________________________,________y__________________ I I I I 19i1467199IpksC; L518_F1_2 [Mycobacterium leprael I 56 8 5 367d 3910 1 __________________________________________,____________________________________ ________________________________________________;_________;_________ I 1 I 1 IgnlIPiDId101907(sodium-coupled permease [Synechocystis I 56 3 3442 1874 sp.) I

I

,____.____4____,_______________________________________________________________ _____________-_____________________________________y_________,-________ I I I I (9i12313949I(AE0005931 osmoprotection protein (proWX)I 56 21 1 1880 333 [Helicobacter pylori) I

I

I

___________________y_______a________________,__________________________________ __________________________________________y_________________+_________ I 129I21968122456IgnIIPIDId1020011(AB001188) PR08ABLE ACETYLTRANSFERASE. 1 22 [Bacillus subtilise I

I

I

,________,___________y_______,________________y________________________________ ____________________________________________________y_________,_________ I I I I (9i12151321ea59 (525) [Bacteriophage lambdal I 56 v I

~

________;____,______________________________,__________________________________ __________________________________________,__________________________ Hr ~0 I I ( I (9i11592090(DNA repair protein IiAD2 IHethanococcusI
UI
28 9 4667 427B jannaschiil I 56 I

I

________,____,_________________________________________________________________ _________________________________________________;__________________ I I I I IgnIIPIDId100139IORF (Acetobacter pasteurianus) I 56 I

I

TABLE 2
S. pneumoniae - Putative coding regions of novel proteins similar to known proteins
| Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt) |
| 36 | 7 | 5122 | 5397 | pir|PQ0053|PQ00 | hypothetical protein (proC 3' region) - Pseudomonas aeruginosa (strain PAO) (fragment) | 56 | 28 | |
| 40 | 4 | 3137 | 4318 | gi|1800301 | macrolide-efflux determinant [Streptococcus pneumoniae] | 56 | 27 | 1182 |
| 40 | 16 | 12511 | 13191 | gnl|PID|e217602 | PlnU [Lactobacillus plantarum] | 56 | 38 | 681 |
| 48 | 17 | 13775 | 13023 | gi|143729 | transcription activator [Bacillus subtilis] | 56 | 35 | 753 |
| 75 | 4 | 1674 | 2594 | gnl|PID|d102036 | membrane protein [Bacillus stearothermophilus] | 56 | 25 | 921 |
| 85 | 3 | 1842 | 1459 | gnl|PID|d100139 | ORF [Acetobacter pasteurianus] | 56 | 41 | 384 |
| 89 | 7 | 5815 | 4940 | gi|853777 | product similar to E.coli PRFA2 protein [Bacillus subtilis] | 56 | 42 | 876 |
| 105 | 2 | 1360 | 2718 | gnl|PID|d101913 | hypothetical protein [Synechocystis sp.] | 56 | 37 | 1359 |
| 112 | 3 | 2151 | 3194 | gi|537201 | ORF_o345 [Escherichia coli] | 56 | 31 | 1044 |
| 113 | 4 | 2754 | 2963 | gnl|PID|d100340 | ORF [Plum pox virus] | 56 | 28 | 210 |
| 122 | 3 | 1203 | 2054 | gi|1649035 | high-affinity periplasmic glutamine binding protein [Salmonella typhimurium] | 56 | 30 | |
| 124 | 8 | 3939 | 3694 | gnl|PID|e248893 | unknown [Mycobacterium tuberculosis] | 56 | 27 | 246 |
| 125 | 4 | 4403 | 4107 | gnl|PID|d100247 | human non-muscle myosin heavy chain [Homo sapiens] | 56 | 32 | 297 |
| 127 | 11 | 6608 | 6405 | gi|2182397 | (AE000073) Y4fN [Rhizobium sp. NGR234] | 56 | 35 | |
| 134 | 5 | 4769 | 3849 | gnl|PID|d101870 | hypothetical protein [Synechocystis sp.] | 56 | 39 | 921 |
| 137 | 10 | 6814 | 7245 | gi|1592011 | sulfate permease (cysA) [Methanococcus jannaschii] | 56 | 34 | 432 |
| 142 | 8 | 5019 | 45A2 | pir|A47071|A470 | orf1 immediately 5' of nifS - Bacillus subtilis | 56 | 29 | 138 |
| 146 | 8 | 4676 | 3660 | gnl|PID|d101911 | hypothetical protein [Synechocystis sp.] | 56 | 32 | 1017 |
| 148 | 3 | 1906 | 2739 | gnl|PID|d101099 | phosphate transport system permease protein PstA [Synechocystis sp.] | 56 | 36 | 834 |
| 150 | 4 | 4449 | 2743 | gnl|PID|e304628 | probably site-specific recombinase of the resolvase family of enzymes [Bacteriophage TP21] | 56 | | 1707 |
| 172 | 1 | 2 | 208 | gi|1787791 | (AE000249) f317; This 317 aa orf is 27 pct identical (16 gaps) to 301 residues of an approx. 320 aa protein YXXC_BACSU SW: P39140 [Escherichia coli] | 56 | | |
| 172 | 7 | 4979 | 5668 | gi|396293 | similar to Bacillus subtilis hypoth. 20 kDa protein, in tsr 3' region [Escherichia coli] | 56 | 40 | 690 |
| 186 | 7 | 3732 | 3367 | gi|1732200 | PTS permease for mannose subunit IIPMan [Vibrio furnissii] | 56 | 36 | 366 |
| 187 | 2 | 2402 | 819 | pir|S57904|S579 | virR49 protein - Streptococcus pyogenes (strain CS101, serotype M49) | 56 | 35 | 1584 |
| 204 | 3 | 2772 | 2239 | gi|606376 | ORF_o162 [Escherichia coli] | 56 | 35 | 534 |
| 206 | 2 | 3342 | 1633 | gi|559861 | clyM [Plasmid pAD1] | 56 | 38 | |
| 219 | 3 | 1689 | 1096 | gi|1146197 | putative [Bacillus subtilis] | 56 | 27 | 594 |
| 230 | 2 | 409 | 1485 | pir|C60328|C603 | hypothetical protein 2 (sr 5' region) - Streptococcus mutans (strain OMZ175, serotype f) | 56 | 40 | 1077 |
| 233 | 4 | 2930 | 3268 | gi|1041785 | rhoptry protein [Plasmodium yoelii] | 56 | 24 | 339 |
| 273 | 2 | 1543 | 2724 | gi|143089 | iep protein [Bacillus subtilis] | 56 | 32 | |
| 353 | 1 | 1 | 516 | gnl|PID|e325000 | hypothetical protein [Bacillus subtilis] | 56 | 41 | 516 |
| 359 | 1 | 87 | 641 | gi|1786952 | (AE000176) o877; 100 pct identical to the first 86 residues of the 100 aa hypothetical protein fragment YBGB_ECOLI SW: P54746 [Escherichia coli] | 56 | 46 | 555 |
| 363 | 7 | 4482 | 4198 | gi|1573353 | outer membrane integrity protein (tolA) [Haemophilus influenzae] | 56 | 38 | 285 |
| 376 | 1 | 2 | 508 | gnl|PID|e325031 | hypothetical protein [Bacillus subtilis] | 56 | 33 | |
| 18 | 1 | 836 | 177 | gnl|PID|d100872 | a negative regulator of pho regulon [Pseudomonas aeruginosa] | 55 | 31 | 660 |
| 28 | 4 | 1824 | 1618 | gnl|PID|e316518 | STAT protein [Dictyostelium discoideum] | 55 | 40 | 207 |
| 29 | 6 | 4496 | 5041 | gi|1088261 | unknown protein [Anabaena sp.] | 55 | 31 | 546 |
| 38 | 16 | 10702 | 9695 | gi|580905 | B.subtilis genes rpmH, rnpA, 50kd, gidA and gidB [Bacillus subtilis] | 55 | 31 | 1008 |
| 49 | | 5727 | 6182 | gi|1786951 | (AE000176) heat-responsive regulatory protein [Escherichia coli] | 55 | 29 | 456 |
| 51 | 4 | 2381 | 3241 | gnl|PID|d101293 | YbbA [Bacillus subtilis] | 55 | 42 | 861 |
| 52 | 9 | 9640 | 10866 | gi|153016 | ORF 419 protein [Staphylococcus aureus] | 55 | 23 | |
| 53 | 4 | 1813 | 1349 | gi|896042 | OspF [Borrelia burgdorferi] | 55 | 30 | 465 |
| 60 | 5 | 4794 | 5756 | gi|1499876 | magnesium and cobalt transport protein [Methanococcus jannaschii] | 55 | 38 | 963 |
| 71 | 9 | 14176 | 15408 | gi|1857120 | glycosyl transferase [Neisseria meningitidis] | 55 | 41 | |
| 75 | 6 | 3189 | 4229 | gnl|PID|e209890 | NAD alcohol dehydrogenase [Bacillus subtilis] | 55 | 44 | |
| 108 | 10 | 10488 | 9820 | gnl|PID|e324997 | hypothetical protein [Bacillus subtilis] | 55 | 36 | 669 |
| 113 | 12 | 12273 | 13037 | gnl|PID|eW1496 | unknown [Bacillus subtilis] | 55 | 34 | 765 |
| 113 | 13 | 13945 | 13007 | gi|1573423 | phosphofructokinase (fruK) [Haemophilus influenzae] | 55 | | 939 |
| 126 | 5 | 6764 | 5907 | gi|1790131 | (AE000446) hypothetical 29.7 kD protein in ibpA-gyrB intergenic region [Escherichia coli] | 55 | 37 | 858 |
| 129 | 3 | 2719 | 902 | gnl|PID|d101425 | Pz-peptidase [Bacillus licheniformis] | 55 | 35 | 1818 |
| 138 | 3 | 2593 | 1610 | gi|142833 | ORF2 [Bacillus subtilis] | 55 | 37 | 984 |
| 140 | 6 | 6916 | 5633 | gnl|PID|d100964 | homologue of hypothetical protein in a rapamycin synthesis gene cluster of Streptomyces hygroscopicus [Bacillus subtilis] | 55 | 26 | 1284 |
| 147 | 3 | 3854 | 2136 | gi|472330 | dihydrolipoamide dehydrogenase [Clostridium magnum] | 55 | 39 | 1719 |
| 147 | 10 | 10204 | 8921 | gnl|PID|e73078 | dihydroorotase [Lactobacillus leichmannii] | 55 | 38 | 1284 |
| 148 | 5 | 3430 | 4119 | gi|290572 | peripheral membrane protein U [Escherichia coli] | 55 | 29 | 690 |
| 148 | 6 | 4171 | 4650 | gi|695769 | transposase [Xanthobacter autotrophicus] | 55 | 37 | 480 |
| 149 | 14 | 12564 | 11650 | gnl|PID|d101329 | YqjG [Bacillus subtilis] | 55 | 32 | |
| 156 | 3 | 1113 | 550 | gi|2314496 | (AE000634) conserved hypothetical integral membrane protein [Helicobacter pylori] | 55 | 34 | 564 |
| 159 | 10 | 6625 | 5897 | gi|290533 | similar to E. coli ORF adjacent to suc operon; similar to gntR class of regulatory proteins [Escherichia coli] | 55 | | 729 |
| 164 | 3 | 1784 | 2332 | gnl|PID|e255118 | hypothetical protein [Bacillus subtilis] | 55 | 37 | 549 |
| 164 | | 2772 | 3521 | gi|40348 | put. resolvase Tnp I (AA 1 - 284) [Bacillus thuringiensis] | 55 | 35 | 750 |
| 164 | 11 | 7428 | 7216 | gnl|PID|e249407 | unknown [Mycobacterium tuberculosis] | 55 | 38 | 213 |
| 167 | 5 | 3860 | 3345 | gi|535052 | involved in protein secretion [Bacillus subtilis] | 55 | 28 | |
| 186 | 5 | 2880 | 2563 | gi|606080 | ORF_o290; Geneplot suggests frameshift linking to o267, not found [Escherichia coli] | 55 | | |
| 189 | 8 | 4311 | 5396 | gnl|PID|e183450 | hypothetical EcsB protein [Bacillus subtilis] | 55 | 32 | 1086 |
| 192 | 5 | 3270 | 3079 | gi|1196504 | vitellogenin convertase [Aedes aegypti] | 55 | 38 | 192 |
| 195 | 2 | 2454 | 1384 | gi|1574693 | transferase, peptidoglycan synthesis (murG) [Haemophilus influenzae] | 55 | 33 | 1071 |
| 198 | 4 | 3013 | 2471 | gnl|PID|e313074 | hypothetical protein [Bacillus subtilis] | 55 | 29 | 543 |
| 214 | 1 | 373 | 744 | gnl|PID|d101741 | transposase [Synechocystis sp.] | 55 | 33 | 372 |
| 219 | 2 | 1115 | 456 | gi|288301 | ORF2 gene product [Bacillus megaterium] | 55 | 30 | 660 |
| 263 | 7 | 3742 | 3443 | gi|18137 | cgcr-4 product [Chlamydomonas reinhardtii] | 55 | 48 | 300 |
| 285 | 1 | 2 | 829 | gnl|PID|d100974 | unknown [Bacillus subtilis] | 55 | 40 | 828 |
| 286 | 1 | 650 | 249 | gi|396844 | ORF (38 kDa) [Vibrio cholerae] | 55 | 31 | 402 |
| 297 | 2 | 1229 | 1696 | gi|150848 | prtC [Porphyromonas gingivalis] | 55 | 39 | 468 |
TABLE 2 S. pneumoniae - Putative coding regions of novel proteins similar to known proteins
| Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt) |
| 309 | 2 | 218 | 982 | gi|1574991 | hypothetical [Haemophilus influenzae] | 55 | | 765 |
| 328 | 2 | 646 | 224 | gi|571500 | prohibitin [Saccharomyces cerevisiae] | 55 | | 423 |
| 330 | 1 | 1340 | 474 | gi|396397 | soxS [Escherichia coli] | 55 | | 867 |
| 364 | 3 | 2538 | 1546 | gi|793394 | Tb-291 membrane associated protein [Trypanosoma brucei subgroup] | 55 | | 993 |
| 368 | 3 | 941 | 105 | gi|160671 | S antigen precursor [Plasmodium falciparum] | 55 | | 837 |
| 3 | | 4604 | 3624 | gi|2293176 | (AF008220) signal transduction protein kinase [Bacillus subtilis] | 54 | | 981 |
| 9 | 11 | 7746 | 7246 | gi|1146245 | putative [Bacillus subtilis] | 54 | | |

| 38 | 24 | 16213 | 17937 | gi|1980429 | putative transcriptional regulator [Bacillus stearothermophilus] | 54 | | 1725 |
| 40 | 8 | 5076 | 4882 | gi|39989 | methionyl-tRNA synthetase [Bacillus stearothermophilus] | 54 | | 195 |
| 43 | 4 | 3980 | 2367 | gnl|PID|e148611 | ABC transporter [Lactobacillus helveticus] | 54 | | 1614 |
| | | | | gi|1762962 | FemA [Staphylococcus simulans] | 54 | | 1260 |
| 57 | 1 | 3 | 512 | gi|558177 | endo-1,4-beta-xylanase [Cellulomonas fimi] | 54 | | 510 |
| 58 | 3 | 4749 | 4246 | gnl|PID|d101237 | hypothetical [Bacillus subtilis] | 54 | | 504 |
| 71 | 7 | 10681 | 11703 | gi|510255 | orf3 [Escherichia coli] | 54 | | 1020 |
| 71 | 20 | 27546 | 27737 | gi|202543 | serotonin receptor [Rattus norvegicus] | 54 | | 192 |
| 72 | 2 | 844 | 1098 | gi|148613 | srnB gene product [Plasmid F] | 54 | | 255 |
| 72 | 7 | 7438 | 6695 | gi|1196496 | recombinase [Moraxella bovis] | 54 | | 744 |
| 74 | 10 | 14043 | 13465 | gi|1200342 | ORF 3 gene product [Bradyrhizobium japonicum] | 54 | | 579 |
| 74 | 12 | 16483 | 15995 | gi|2317798 | maturase-related protein [Pseudomonas alcaligenes] | 54 | | 489 |
| 86 | 3 | 2877 | 2155 | gi|46988 | orf9.6 possibly encodes the O unit polymerase [Salmonella enterica] | 54 | | 723 |
| 89 | 5 | 4433 | 3921 | gi|147211 | phnO protein [Escherichia coli] | 54 | | 513 |
| 90 | 1 | 3 | 464 | gi|2317798 | maturase-related protein [Pseudomonas alcaligenes] | 54 | 30 | |

| 96 | 10 | 8058 | 8510 | gnl|PID|d102015 | (AB001488) SIMILAR TO SALMONELLA TYPHIMURIUM SLYY GENE REQUIRED FOR SURVIVAL IN MACROPHAGE. [Bacillus subtilis] | 54 | | 453 |
| 97 | 6 | 4662 | 3604 | gi|1591394 | transketolase " [Methanococcus jannaschii] | 54 | | 1059 |
| 106 | 11 | 10406 | 12010 | gi|606286 | ORF_o637 [Escherichia coli] | 54 | | 1605 |
| 147 | 8 | 8663 | 7404 | gnl|PID|d101615 | ORF ID:o319A7; similar to [SwissProt Accession Number P37340] [Escherichia coli] | | | |
| 171 | 4 | 2477 | 3223 | gi|1439528 | EIIC-man [Lactobacillus curvatus] | 54 | | |

| 174 | 2 | 2068 | 1787 | gnl|PID|d100518 | motor protein [Homo sapiens] | 54 | | |
| 188 | 1 | 526 | 1188 | gnl|PID|e250352 | unknown [Mycobacterium tuberculosis] | 54 | | |

198 ~ ~ ~gnl~PID~e313074~hypathetical protein (Bacillus subtilis]
~
~ 35A22884 54 ~

~

;________,____ ;_______,_______,__________-_____;_________________________________________________________________________ ___ ,________;_________;_________;

| 207 | 1 | 1 | 1641 | gnl|PID|d101813 | hypothetical protein [Synechocystis sp.] | 54 | | |
| 210 | 1 | 2 | 655 | gi|2293206 | (AF008220) YtmP [Bacillus subtilis] | 54 | | |
| 225 | 2 | 966 | 2357 | gnl|PID|e330194 | R11H6.1 [Caenorhabditis elegans] | 54 | | |

24t ~ ~ ~gnl~PID~d101813hypothetical protein (Synechocystis sp,) ~
~ l681347 54 1 ~

~

| 263 | 2 | 907 | 1395 | gnl|PID|d101886 | transposase [Synechocystis sp.] | 54 | | |
| 263 | 6 | 1450 | 2977 | gi|160671 | S antigen precursor [Plasmodium falciparum] | 54 | | |
| 277 | 3 | 2517 | 1363 | gi|1196926 | unknown protein [Streptococcus mutans] | 54 | | |
N

307 ~ ~ ~ ~(AF008220) Yt ~ 828 4 i~2293198 P !Bacillus subtilis] 54 o 1 ~

g g ~
~.
~

| 325 | 1 | 19 | 768 | gi|2182507 | (AE000083) Y41H [Rhizobium sp. NGR234] | 54 | | |

;________;____ ,_______,_______;________________,_____________________________________________ _____________________________ __ ;__ __;__ __;__ __;
____ _____ _____ 332 ~ ) ~gi)1591815ADP-ribosylglycohydrolase (drat) lNethanococcus~

~ 89B 59D jannaschii) 54 2 ~

~

| 385 | 4 | 240 | 479 | gi|530878 | amino acid feature: N-glycosylation sites, aa 41 .. 43, 46 .. 48, 51 .. 53, 72 .. 74, 107 .. 109, 128 .. 130, 132 .. 134, 158 .. 160, 163 .. 165; amino acid feature: Rod protein domain, aa 169 .. 340; amino acid feature: globular protein domai | 54 | | |

7 Q25 (19702A ~gnl~PID~e255111hypothetical protein (Bacillus subtilis]
~

( ~

| 23 | 3 | 2497 | 2033 | gnl|PID|d102015 | (AB001488) SIMILAR TO SALMONELLA TYPHIMURIUM SLYY GENE REQUIRED FOR SURVIVAL IN MACROPHAGE. [Bacillus subtilis] | 53 | 25 | 465 |
| 29 | 11 | 9042 | 10121 | gi|143331 | alkaline phosphatase regulatory protein [Bacillus subtilis] | 53 | | |
| 33 | 3 | 1479 | 1009 | pir|S10655|S106 | hypothetical protein X - Pyrococcus woesei (fragment) | 53 | | 471 |
| 36 | 6 | 4583 | 5134 | gnl|PID|e316029 | unknown [Mycobacterium tuberculosis] | 53 | | |

| 38 | | | | gi|580904 | homologous to E.coli rnpA [Bacillus subtilis] | | | |
| 52 | 7 | 7007 | 8686 | gi|1377831 | unknown [Bacillus subtilis] | 53 | | |

' ________;____ ;_______;_______,________________;_____________________________________________ _______________________________,________;_________;_________, 5d A ,19564~gL~666069 ~orf2 gene product (Lactobacillus leichmannii]~

~17 7555 51 ~

~

| 56 | 1 | 1 | 681 | gi|1592266 | restriction modification system S subunit [Methanococcus jannaschii] | 53 | | |


| 57 | 10 | 9431 | 8487 | gi|1788543 | (AE000310) f351: Residues 1-121 are 100 pct identical to YOJL_ECOLI SW: P33944 (122 aa) and aa 152-351 are 100 pct identical to YOJK_ECOLI SW: P33943 [Escherichia coli] | 53 | 31 | 945 |

| 61 | | 429 | 4 | gnl|PID|e236467 | B0024.12 [Caenorhabditis elegans] | 53 | 33 | 426 |
| 71 | | 5772 | 4 | gi|393394 | Tb-291 membrane associated protein [Trypanosoma brucei subgroup] | 53 | 33 | 5769 |
| 72 | | 894 | 2840 | gi|2293178 | (AF008220) YtsD [Bacillus subtilis] | 53 | 27 | 1947 |
| 73 | 14 | 9793 | 9212 | gi|1778556 | putative cobalamin synthesis protein [Escherichia coli] | 53 | 32 | 582 |
| 88 | | 5217 | 4342 | gi|2098719 | putative fimbrial-associated protein [Actinomyces naeslundii] | | 38 | 876 |

| 93 | | 2395 | 1688 | gi|563366 | gluconate oxidoreductase [Gluconobacter oxydans] | 53 | | 708 |
| 96 | | 6632 | 7762 | gi|517204 | ORF1; putative 42 kDa protein [Streptococcus pyogenes] | | 42 | 1131 |
| 108 | | | | gi|149581 | maturation protein [Lactobacillus paracasei] | | 32 | 972 |

| 128 | | 6412 | 6972 | gnl|PID|e317237 | unknown [Mycobacterium tuberculosis] | 53 | 36 | 561 |
| 128 | | | | gi|311070 | pentraxin fusion protein [Xenopus laevis] | | 31 | 825 |
| 148 | | 3 | 950 | pir|A61607|A616 | probable hemolysin precursor - Streptococcus agalactiae (strain 74-360) | 53 | 36 | 948 |
| 163 | | 2162 | 3022 | gi|1755150 | nocturnin [Xenopus laevis] | 53 | 30 | 861 |

| 171 | | 2304 | 2624 | gi|1732200 | PTS permease for mannose subunit IIPMan [Vibrio furnissii] | 53 | 32 | 321 |
| 182 | | 3785 | 3051 | gnl|PID|d100572 | unknown [Bacillus subtilis] | 53 | 35 | 735 |
| 209 | | 2948 | 1935 | gi|1778505 | ferric enterobactin transport protein [Escherichia coli] | 53 | 28 | 1014 |
| 218 | | 3884 | 2406 | gi|40162 | murE gene product [Bacillus subtilis] | 53 | 34 | 1479 |

| 250 | | 473 | 790 | gnl|PID|e339776 | YlbH protein [Bacillus subtilis] | 53 | 30 | 318 |
| 275 | | 1 | 1611 | gnl|PID|d101314 | YqeW [Bacillus subtilis] | 53 | 35 | 1611 |
| 332 | | 544 | 2 | gi|409286 | bmrU [Bacillus subtilis] | 53 | 31 | 543 |
| 2 | 2 | 2543 | 3445 | gnl|PID|e233879 | hypothetical protein [Bacillus subtilis] | 52 | 39 | 903 |

________+____ y_______y_______,____________..___+__ ____________________________ __ _ _ __+________ +_______,_+_________+

3 '2~ 2240223376(gi~3B959~lacF gene product [Agrobacterium radiobacteri' ~ 36 975 52 ~

| 5 | | | | gnl|PID|e324915 | IgA1 protease [Streptococcus sanguis] | | 32 | 5739 |

________+____ ,_______,_______,________________y_____________________________________________ _______________________________,________ +_________y_________y v 22 A (20212~9i~152901~ORF 3 (Spirochaeta aurantia]

b 5 ( 35 252 ~ ~

____________ _______,_______,________________+___________ __+________ +_________y_________y UI
22 Q2314029666~gi~289262~comE ORF3 (Bacillus'subtilis]

~ 32 ~

| 27 | | 5397 | 4801 | gi|39573 | P20 (AA 1-178) [Bacillus licheniformis] | 52 | 35 | 597 |

| 35 | 10 | 8604 | 7357 | gi|508241 | putative O-antigen transporter [Escherichia coli] | 52 | 27 | 1248 |
| 45 | 4 | 4801 | 3662 | gnl|PID|d102243 | (AB005554) homologs are found in E. coli and H. influenzae; see SWISS_PROT ACC#: P42100 [Bacillus subtilis] | 52 | 36 | 1140 |
| 48 | 18 | 11385 | 13726 | gnl|PID|e205174 | orf2 [Lactobacillus helveticus] | 52 | | |
| 49 | 4 | 5321 | 5755 | gi|2317710 | (AF013987) nitrogen regulatory IIA protein [Vibrio cholerae] | 52 | 19 | 435 |
| 54 | 4 | 2773 | 4668 | gi|1500472 | M. jannaschii predicted coding region [Methanococcus jannaschii] | 52 | 36 | |
| 51 | 6 | 5250 | 4969 | gi|2182453 | (AE000079) Y4[0 [Rhizobium sp. NGR234] | 52 | 40 | |

| 66 | 6 | 8400 | 6955 | gi|43140 | TrkG protein [Escherichia coli] | 52 | 30 | |
| 71 | 26 | 34659 | 31312 | gnl|PID|e311993 | unknown [Mycobacterium tuberculosis] | 52 | 23 | |
| 75 | 2 | 1673 | 1035 | gnl|PID|d102271 | (AB001683) FarA [Streptomyces sp.] | 52 | 27 | 639 |
| 81 | 3 | 1439 | 2893 | gnl|PID|e311458 | rhamnulose kinase [Bacillus subtilis] | 52 | 32 | |
| 81 | 8 | 4987 | 5781 | gi|147403 | mannose permease subunit II-P-Man [Escherichia coli] | 52 | 37 | 795 |

| 83 | 21 | 20687 | 21853 | gi|143365 | phosphoribosyl aminoimidazole carboxylase II (PUR-K: ttg start codon) [Bacillus subtilis] | 52 | 37 | 1167 |
| 86 | 6 | 5785 | 4592 | gi|1276879 | EpsF [Streptococcus thermophilus] | 52 | | |

| 86 | 20 | 19790 | 17861 | gi|454844 | ORF 3 [Schistosoma mansoni] | 52 | 26 | |
| 96 | 13 | 10540 | 9659 | gi|288299 | ORF1 gene product [Bacillus megaterium] | 52 | 33 | |
| 111 | 1 | 2 | 2026 | gi|148309 | cytolysin B transport protein [Enterococcus faecalis] | 52 | 27 | 2025 |
| 112 | 2 | 1457 | 2167 | gi|471234 | orf1 [Haemophilus influenzae] | 52 | | |

| 118 | | 2931 | 2365 | bbs|151233 | Mip=24 kda macrophage infectivity potentiator protein [Legionella pneumophila, Philadelphia-1, Peptide, 184 aa] [Legionella pneumophila] | 52 | | |
| 122 | 9 | 5646 | 5951 | gi|8214 | myosin heavy chain [Drosophila melanogaster] | 52 | 36 | |
| 122 | 11 | 6159 | 6314 | gi|434025 | dihydrolipoamide acetyltransferase [Pelobacter carbinolicus] | 52 | 52 | |

| 134 | 6 | 4880 | 6313 | gi|153733 | M protein trans-acting positive regulator [Streptococcus pyogenes] | 52 | 43 | 1434 |
| 135 | 3 | 1238 | 2716 | gnl|PID|e245024 | unknown [Mycobacterium tuberculosis] | 52 | 35 | |
| 141 | 3 | 1681 | 2319 | gnl|PID|d100573 | unknown [Bacillus subtilis] | 52 | 32 | |

| 161 | 4 | 2562 | 5024 | gi|1146243 | 22.4% identity with Escherichia coli DNA-damage inducible protein ...; putative [Bacillus subtilis] | 52 | 36 | 2463 |
| 173 | 2 | 968 | 183 | gi|1215693 | putative orf; GT9_orf434 [Mycoplasma pneumoniae] | 52 | 30 | |

| 198 | 6 | 4400 | 3567 | gnl|PID|e313010 | hypothetical protein [Bacillus subtilis] | 52 | 26 | 834 |
| 210 | 12 | 8844 | 9107 | gi|497647 | DNA gyrase subunit B [Mycoplasma genitalium] | 52 | 38 | 264 |
| 214 | 10 | 5264 | 5431 | gi|550697 | envelope protein [Human immunodeficiency virus type 1] | 52 | 36 | 168 |

| 225 | 1 | 15 | 884 | gi|1552773 | hypothetical [Escherichia coli] | 52 | 34 | 870 |
| 230 | 1 | 362 | 39 | gnl|PID|d100582 | unknown [Bacillus subtilis] | 52 | 28 | 324 |
| 287 | 1 | 871 | 2 | gnl|PID|e335028 | protease/peptidase [Mycobacterium leprae] | 52 | 29 | 870 |
| 363 | 2 | 1305 | 4 | gi|393394 | Tb-291 membrane associated protein [Trypanosoma brucei subgroup] | 52 | 32 | 1302 |

| 23 | 2 | 2048 | 1173 | gnl|PID|e254943 | unknown [Mycobacterium tuberculosis] | 51 | 30 | 876 |
| 29 | 3 | 742 | 1521 | gi|929900 | 5'-methylthioadenosine phosphorylase [Sulfolobus solfataricus] | 51 | 31 | 780 |
| 45 | 1 | 410 | 1597 | gi|1877429 | integrase [Streptococcus pyogenes phage T12] | 51 | 32 | |
| 48 | 26 | 19227 | 18946 | gi|2314455 | (AE000633) transcriptional regulator (tenA) [Helicobacter pylori] | 51 | 33 | 282 |

____________ ,_______ ,_______ ,_________________________________________________ ________ ____ _____________ ______,____ _______ _____, 73 ~ ~ ~ ~gi~479177 alpha-D-1 4276 4016 4-glucosidase (Sta h lococcu l ) , ~ 51 31 p ~ ~
y 261 s xy osus ____________ _______ ,_______ _______________________________________________________________________________ ___ __________ 81 Q11 ~ 12057 ~gi _ ________________,_________ ~pentraxin ~ 5L

fusion protein (Xenopus laevis) ~ 31 ____________ ,_______ ,_______ ,________________,_______________________________ ~

| 83 | 5 | 1195 | 1986 | gnl|PID|d101316 | YqfI [Bacillus subtilis] | 51 | 33 | |

_ ____ ,_______ ,_______ ________________ ,__________________________________________________________________ __________,________~___ ______,_________, _ Q10 ~ ~ ~gi~41500 ~ORF

(AA

kD
( t ft X) E
h i hi ; ~ 51 28 pu ~ ~
. 1008 s ( SC
er c a toll) ____________ _______ _______ _______________________________________________________________________________ ___ ____________________________________ 113 ~ ~ ~ ~gi~466882 6 3908 5173 ~ppsl;

(Mycobacterium leprae) _ ~ 51 27 ________,____ ,_______ ,_______ ,________________,_____________________________________________________________ _____ ~ ~
__________ 1266 _ , _______,__ _ 124 ~ ~ ~ ~gi _ _ 1 326 57 219116B ~(AF007270) thaliana] ____,_________, ________ ( contains ~

____ ,_______ _______ _____ similarity ! ~
to 270 myosin heavy chain (Arabidopsis ___________ __________________________________________________________________ ____________________________________ 129 ~t0 ~ ~ ~gi~1046241 ~orfl4 7286 6816 (Bacteriophage ' HP1]

~ 51 30 ________f__-~ ______ ~ ~

_ _______ ________________ __________.________________________________________---____________ __________________~___ ______y_________, 143 ( ( ~ ~gi~1354935 probable 3 4963 3983 copper-transporting atpase (Escherichia toll]

~ 51 ________i____ _______ _______ ,________________ __ ~ 26 14B Q15 A1359 10226 ~gi~2293256 ____________________________________________ ~ 98l ~(AF008220) _______ utative __,__________________________, hi t h d l ill p ~ 51 36 ppura ~ ~
e 1134 y ro ase (Bac us subtilis) ____________ ,_______ ,_______ ,________________ __ __,________~____ _____,_________, 1d9 ~ ~ ~ ~gi~1633572 ~Herpesvirus B 6003 7313 saimiri h l (K
i~

omo pes-like 21.
og ~ 51 1311 apos s sarcoma-associated her virus]

________,____ ,_______ ,_______ ,________________ __________________________________________________________________ __________,________,_________, _________,ICJ

I51 y A2092 A ~gnl~PID~e281580 hypothetical ~ 51 34 9 1550 40.7 ~

kd protein (Bacillus subtilisj - ~
~____________ _______ _______ ________________ -_ ____________________________________________________ _ _____ _ ________ 159 ~ ~ ~ ~gi~146944 ______________ ______________ 6 2555 J208 ~CMP-N-acetylneuraminic ~ 51 6 acid synthetase (ESCherichia toll) ~ 3 ____________ _______ t_______ ,________________ __________________________________________________________________ __________ ~
_ 654 174 ~ ~ ~ ~gi~1773166 ( _______,____ ______________ 1 1797 4 robable co er-trans ti t h i p ( 51 28 pp ( ~
por 1794 ng a pase (Esc er chia toll]

____________ ,._______ _______ y________________ __________________________________________________________________ ______________________ ______________ 265 ~ ~ ~ ~gnl~PID~e256400 anti-P.faleiparum ~ S1 ,r 4 2231 1773 antigenic polypeptide (Saimizi sciureusj ~ 18 ____________ ,_______ ,_______ _ ~

| 277 | 2 | 643 | 1311 | pir|S32915|S329 | pilD protein - Neisseria gonorrhoeae | 51 | 33 | 669 |

TABLE 2 S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

| Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt) |
| 350 | 1 | 890 | 3 | gi|290509 | o307 [Escherichia coli] | | 30 | |
| 363 | 4 | 1228 | 4485 | gi|1707247 | partial CDS [Caenorhabditis elegans] | | 23 | |
| 367 | 1 | 1701 | 4 | gi|393394 | Tb-291 membrane associated protein [Trypanosoma brucei subgroup] | | 32 | |
| 15 | 5 | 5174 | 4497 | gnl|PID|e58151 | F3 [Bacillus subtilis] | 50 | 38 | 678 |
| 16 | 4 | 2220 | 2582 | gnl|PID|e325010 | hypothetical protein [Bacillus subtilis] | 50 | 29 | 363 |

| 19 | 5 | 2591 | 4159 | gi|1552733 | similar to voltage-gated chloride channel protein [Escherichia coli] | 50 | 30 | 1569 |
| 25 | 4 | 2701 | 1997 | gi|887849 | ORF_f219 [Escherichia coli] | 50 | 27 | |
| 35 | 1 | 211 | 417 | gnl|PID|e236697 | unknown [Saccharomyces cerevisiae] | 50 | 33 | 207 |
| 39 | 4 | 3416 | 5152 | gnl|PID|d100974 | unknown [Bacillus subtilis] | 50 | 27 | 1737 |

| 51 | 7 | 4000 | 5181 | gi|1592027 | carbamoyl-phosphate synthase, pyrimidine-specific, large subunit [Methanococcus jannaschii] | 50 | 27 | 1182 |
| 51 | 9 | 7179 | 8303 | gi|1591847 | type I restriction-modification enzyme, S subunit [Methanococcus jannaschii] | 50 | 28 | 1125 |
| 52 | 8 | 8740 | 9534 | gi|144297 | acetyl esterase (XynC) [Caldocellum saccharolyticum] | | 34 | |
| 52 | 16 | 16591 | 15770 | gi|2108229 | basic surface protein [Lactobacillus fermentum] | 50 | 34 | 822 |
| 57 | 7 | 6031 | 6336 | gi|2275264 | 60S ribosomal protein L7B [Schizosaccharomyces pombe] | 50 | 40 | 306 |

| 71 | 23 | 29348 | 28383 | gnl|PID|d101328 | YqjA [Bacillus subtilis] | | 39 | 966 |
| 86 | 12 | 11155 | 10769 | gnl|PID|e324964 | hypothetical protein [Bacillus subtilis] | 50 | 24 | 387 |
| 93 | 2 | 1205 | 330 | gi|1066016 | similar to Escherichia coli pyruvate, water dikinase, Swiss-Prot Accession Number P23538 [Pyrococcus furiosus] | 50 | 24 | 876 |

| 96 | 5 | 1673 | 2959 | gnl|PID|e322433 | gamma-glutamylcysteine synthetase [Brassica juncea] | 50 | 29 | 1287 |
| 98 | 2 | 218 | 1171 | gi|151110 | leucine-, isoleucine-, and valine-binding protein [Pseudomonas aeruginosa] | 50 | 30 | 954 |
| 103 | 4 | 3303 | 2785 | gi|154330 | O-antigen ligase [Salmonella typhimurium] | 50 | 31 | 519 |
| 115 | 5 | 6480 | 5980 | gi|895747 | putative cel operon regulator [Bacillus subtilis] | | 26 | |
| 129 | 11 | 7559 | 7305 | gi|1216475 | skeletal muscle ryanodine receptor [Homo sapiens] | | 32 | |
| 129 | 13 | 8192 | 7965 | gi|152271 | 319-kDA protein [Rhizobium meliloti] | | 30 | |
| 151 | 5 | 7634 | 6819 | gi|40348 | put. resolvase Tnp I (AA 1 - 284) [Bacillus thuringiensis] | 50 | 35 | 816 |
| 153 | 1 | 1 | 597 | gnl|PID|d102015 | (AB001488) SIMILAR TO NITROREDUCTASE. [Bacillus subtilis] | 50 | 29 | 597 |

| 155 | 5 | 5986 | 5432 | gi|1276880 | EpsG [Streptococcus thermophilus] | 50 | 28 | 555 |
| 160 | 9 | 6323 | 7390 | gi|1786983 | (AE000179) o331; 92 pct identical to the 333 aa hypothetical protein YBHE_ECOLI SW: P52697; 26 pct identical (7 gaps) to 167 residues of the 373 aa protein MLE_TRICU SW: P46057; SW: P52697 [Escherichia coli] | 50 | 30 | 1068 |
| 163 | 6 | 7396 | 8091 | gnl|PID|d101313 | YqeN [Bacillus subtilis] | 50 | | |
| 167 | 6 | 5232 | 3940 | gi|413926 | ipa-2r gene product [Bacillus subtilis] | 50 | | |
| 169 | 2 | 807 | 130 | gnl|PID|e304540 | endolysin [Bacteriophage Bastille] | 50 | 35 | 678 |
| 171 | 5 | 3168 | 4025 | gi|606080 | ORF_o290; Geneplot suggests frameshift linking to o267, not found [Escherichia coli] | 50 | | |
| 210 | 11 | 8151 | 8414 | gi|330038 | HRV 2 polyprotein [human rhinovirus] | 50 | | |
| 364 | 1 | 1538 | 135 | gi|393396 | Tb-292 membrane associated protein [Trypanosoma brucei subgroup] | 50 | 31 | 1404 |

( ~ j j (9i)144859(ORF B [Clostridium perfringens] ( j 7 5911 5090 49 ( ( w.

| 26 | 5 | 9768 | 10754 | gi|142440 | ATP-dependent nuclease [Bacillus subtilis] | 49 | 31 | 987 |
| 66 | 7 | 9777 | 8198 | gi|414170 | trkA gene product [Methanosarcina mazeii] | 49 | | |

| 77 | 6 | 5364 | 4648 | gnl|PID|e285322 | RecX protein [Mycobacterium smegmatis] | 49 | 28 | 717 |
| 82 | 13 | 12689 | 13249 | gnl|PID|e255091 | hypothetical protein [Bacillus subtilis] | 49 | 20 | 561 |
| 93 | 9 | 4866 | 4531 | gi|40067 | X gene product [Bacillus sphaericus] | 49 | | |
| 112 | 5 | 4019 | 4948 | gi|1574380 | lic-1 operon protein (licB) [Haemophilus influenzae] | 49 | 27 | 930 |
| 129 | 7 | 6058 | 4949 | gnl|PID|e267587 | Unknown [Bacillus subtilis] | 49 | 35 | 1110 |
| 135 | 5 | 3875 | 4438 | gi|39573 | P20 (AA 1-178) [Bacillus licheniformis] | 49 | | |
| 154 | 2 | 1423 | 1953 | gnl|PID|d101102 | regulatory components of sensory transduction system [Synechocystis sp.] | 49 | 29 | 531 |
| 156 | 5 | 2878 | 1637 | gnl|PID|d101732 | hypothetical protein [Synechocystis sp.] | 49 | 25 | 1242 |
| 173 | 5 | 3500 | 2940 | gi|490324 | LORF X gene product [unidentified] | 49 | | |

| 182 | 1 | 1057 | 2 | gi|331002 | first methionine codon in the ECLF1 ORF [Saimiriine herpesvirus 2] | 49 | 25 | 1056 |
| 192 | 6 | 5352 | 3667 | gi|2394472 | (AF024499) contains similarity to homeobox domains [Caenorhabditis elegans] | 49 | 23 | 1686 |
| 253 | 4 | 1129 | 1350 | gi|531116 | SIR4 protein [Saccharomyces cerevisiae] | 49 | | |
| 277 | 1 | 600 | 136 | gi|396844 | ORF (18 kDa) [Vibrio cholerae] | 49 | | |
| 327 | 3 | 1435 | 887 | gi|733524 | phosphatidylinositol-4,5-diphosphate 3-kinase [Dictyostelium discoideum] | 49 | 24 | 549 |
| 365 | 1 | 1436 | 132 | gi|393394 | Tb-291 membrane associated protein [Trypanosoma brucei subgroup] | 49 | | 1305 |
| 33 | 7 | 4461 | 3277 | gi|145644 | codes for a protein of unknown function [Escherichia coli] | 48 | | |
| 40 | 2 | 652 | 1776 | gnl|PID|e290649 | ornithine decarboxylase [Nicotiana tabacum] | | | |
| 67 | 9 | 1377 | 2384 | gi|1772652 | 2-keto-3-deoxygluconate kinase [Haloferax alicantei] | 48 | | |
| 74 | 2 | 4269 | 3871 | gi|2182678 | (AE000101) Y4vJ [Rhizobium sp. NGR234] | 48 | 27 | |
| 81 | 2 | 1326 | 541 | gi|153672 | lactose repressor [Streptococcus mutans] | 48 | | |
| 81 | 4 | 2981 | 3646 | gi|146042 | fuculose-1-phosphate aldolase (fucA) [Escherichia coli] | 48 | | |
| 97 | 1 | 602 | 51 | gi|153794 | rgg [Streptococcus gordonii] | 48 | | |
| 110 | 1 | 1 | 3132 | gi|1381114 | prtB gene product [Lactobacillus delbrueckii] | 48 | | |
| 131 | 5 | 2914 | 2147 | gnl|PID|e183811 | Acyl-ACP thioesterase [Brassica napus] | 48 | | |
J

( ( ( ( (gnl~PID(e261988putative ORF (Bacillus subtilis) ( 48 867 '-' l33 4 7494 2628 ( ( ( J
,_______-,____,_______ ,_-_____,________________,_____________..__________________.._____________________ ______________________i________,_________,_ ________, N

( ( ( ( (9i(1049388(ZK470.1 gene product JCaenorhabditis ( 48 369 o 139 6 4231 4599 elegans) ( ( ( ________,____,_______,_______,________________,______-_________________________________________-__________________ _________,________,_________,_ ________, ( ( ( ( (9i(1022725(unknown (Staph ( 48 139 8 5036 5665 lococcus haemol ( ticusJ 29 y ( ( y ,________,____,_______,______ _,________________,________________________________________________-__________________ _________,_-______,_________,_ ______ ( (12(11936(11007(gnl(PID(d102049(H. inEluenzae, ribosomal protein alanine(1A9) 27 ( 930 140 acetyltransEerase; P44305 ( ( ( ( ( ( ( ( (Bacillus subtilis) ~ ' ( ( o ________,____,_______,______ ..4__________--____f__________________________________________________________________________ __,________,_________,_ ________, ( ( ( ( (9i(15917)1(melvalonate kinase (Methanococcus jannaschii]( 4B
101?
l46 9 5670 4654 ~
~

( N
________,____,____-__,_______,________________,___________________________-________________________________________________,________,_________,_ ___-__ ( ( ( ( (9n1(PID(d101578(Collagenase precursor (EC 3.4.-.-1. ( 48 161 3 1280 23'l4 [Escherichia eolil ( ( (________,____,_______,_______,___-____________,__________________________________________________________________ __________,________~_________,_ ______ ( (11(10581(11048(gnl(PID(d101132(hypothetical protein [Synechocystis ( 172 sp.l ~
( ( ,________,____,_______,_______,________________,_______________________________ _____________________________________________,________f_________,_ ________, ( ( ( ( (9i(40067 (X gene product [Bacillus sphaericus) ( 48 182 4 2930 2S86 ( ( ( ________,____,_______,_______,________________,________________________________ _________________________________________-__,________,_________,_ ________, ( (L5(10786(1t196(sp(P13940(LE29_(LATE EMBRYOGENESIS ABUNDANT PROTEIN

21Q D-29 ILEA D-291. ( 18 ( ( ' ,________,___-,_______ _ ___________________________________________________________________-_______________' ________y ______ _________ ________,_________ _ ( (12~ ( ~gi(40389 non-toxic components [Clostridium botulinum]( 18 214 6231 6482 ~
( ~

________,____,_______,_______,________________,________________________________ ____________________________________________,________,_________y_ ________~

( ( ( ( (9i(1573364(H. influenzae predicted coding region el 221 1 704 3 NI0392 [Haemophilus influenza ( ( ~

( ,________,____s_______,_______,________________,_______________________________ _____________________~______-________________,________~______-__,_ ________, 227 ( ( ( (9i(1673697(AE0000051 Mycoplasma pneumaniae, C09 i 48 30 3282 y.j 2 647 3928 ocf718 Protein (Mycoplasma j ~

( ( ( ( pneumoniael i i ,________,____,_______,_______t___-____.._______,_________________________________________________________________ _________ ____ ___ __,__ __~____ __,_________, ( ( ( ( (gnl(PID(e236697(unknown (Saccharomyces cerevisiae) ( 48 25l 2 480 758 ( ( ( ________~____,_______,_______,________________,________________________________ ____________________________________________,________~_________,_________, v ( ( ( ( (9i(18137 (cgcr-4 product [Chlamydomonas ceinhardtii)4 ( 753 8 ( ( 0 ( ________,____,_______,____..__,________.._______,__-________ ____ ____ _ _ __,________,_________,_______-_~ t1~
-~

| 389 | 1 | 505 | 2 | gi|18137 | cgcr-4 product [Chlamydomonas reinhardtii] | 48 | | |
| 3 | 21 | 20879 | 22258 | gnl|PID|e264778 | putative maltose-binding protein [Streptomyces coelicolor] | 47 | | 1380 |

| 6 | 4 | 4089 | 4658 | gi|39573 | P20 (AA 1-178) [Bacillus licheniformis] | 47 | | |
| 15 | 3 | 3736 | 1760 | gnl|PID|d100572 | unknown [Bacillus subtilis] | | | |

4________4____4_______ 4_______4________________ 4__________________________________________________________________________ __,________f_________4_________4 pr I I15 I14516 I132631g111773351 ICapSL

35 [Staphylococcus q7 I
aureus) I

,_____.__,____ ,_______ 4_______,________________4_____________________________________________________ _____________________ __4________4_________4_________4 ' I I I I IpirIA370241A370 132K

51 6 3S47 4002 antigen precursor I
-Mycobacterium tuberculosis (________4____4_______ 4_______4________________,_____________________________________________________ _________________ _______ _________ I I 110154 I 19i139848 Itl3 q7 26 I 882 55 8 9273 (Bacillus I I
subtilis] I

__ , _____________________ __,________,___ ______4_________4 , ___4_______ 4 __________________________________________.._______________________________ _____ 4 I ( I I IgnlIPIDIe280611 IPCPC

92 4 1753 3276 (Streptococcus pneumoniael I

________4____, _______4_______,________________4______________________________________________ ________ ~__________________ __4________4___ ______4_________, ( I I I 19i11786458 (AE000134) I 47 32 I 204 ( 127 9 5589 53B6 f120;
I
This as orE
is pct identical gaps) to I ( I I I residues of an approx.

as protein Y127_HAEIN
Sw:

(ESCherichfa ~

I I I I I coli) ' ________4____ ,_______ ,_______,________________ 4________________.._.._______________________________________________________ __f________,_________,_________ I I I I IgnIIPIDIe266555 lunknown 110 2 1232 1759 (Mycobacterium tuberculosis) I

________4____ ,_______ ,_______, ________________,______________________________________________________________ ____________ __4___________ ______4_________i I ( d951 I llPtp h 140 4 3542 d100964 l I f ~ gn ( I 24 I

l omo 47 I

ogue ( o hypothetical protein in a rapamycin synthesis gene cluster of I I I I I II
I I
Streptomyces hygroscopicus (Bacillus subtilis) ________,____ 4_______ 4_______4________________ ,__________________________________________________,_________ _ __,________,_________,_________ , _ ____________ I I I I Igi~1522674 IM.

15t 4 6814 620U jannaschii predicted I
coding region (Methanococcus jannaschiil ________4____ ,_______ ,_______,________________ ,_________________________________ __,________4___ ______,_________, _ 157 I ~ I IynIIPIDId101320 IYqgZ
I 25 I 372 w-..
3 803 1174 (Bacillus subtilis) I

________,____ ,_______ ,_______4________________ 4__________________________________________________________________________ __4________4___ ______,_________t W

I I I I 19i12367190 '(AE000390) 178 S 3267 2155 o334:

sequence I
change joins ORFS
ygjR

ygjS
from earlier i version (YGJR_ECOLI
SW:

and YGJS_ECOLI
SW:
P42600) (ESCherichia I I I I coli) I
I I
I

________,____ ,_____.... ,_______,________________ _ __,________, _________4_________4 _________ ~_______________________________ _ I I I I IgnIIPIDIe254973 lautolysin 273 1 2 1549 sensor kinase I
[Bacillus subtilis]

,________,____ ,_______ ,_______4________________ ~__________________________________________________________________________ ____ _ _________ I I I I 19I11835755 (zinc 300 2 880 644 finger protein q7 I
Pn I

g (Mus musculus]

________,____ ,_______ ,_______4________________ ~_______________________________________________________________ ________ _______________ ___ I I14 114182 I12638IpirIS43609~5436 IcofA

54 protein I

Streptococcus I

pyogenes ________4____ ,_______ ,_______,________________ ,__________________________________________________________________________ __,________,___ ______,_________4 I I I I IgnIIPIDle223891 (xylose 88 1 2 1018 repressor (Anaerocellum I

thermophilum) .._______,____ 4_______ 4_______4________________ ~__________________________________________________________________________ __,________4___ ______i_________4 i i i i ignllPIDId101652 iORF

96 7 4553 5B60 ID:
I
o34715;
similar to (SwissProt Accession Number P45272]
(ESCherichia coil] II I

,________4____ ,_______ ,_______4________________ 4__________________________________________________________________________ __y________4_________4_________ I I I I 19i12209215 I(AF004325) 112 1 1127 3 putative I I
oligosaccharide repeat unit transporter (Streptococcus I I I I I I II
I I
pneumoniael ________,____ ,_______ 4_______,________________ ,______________________________________________ ____________________________ ______________________ 122 13 7308 7982i hr44 I I ( I 1054776 I
I 3q 7 19 gene q6 I product I

(Homo sapiensl ,________,____ ,_______ ,_______4________________ ___ _ _ __,________4___ ______4_________4 _________________________________________________________________ ____ I I14 I I (9i11469286 IafuA

127 9198 8125 gene product I
(ACtinobacillus pleuropneumoniael I
,________,____ 4_______ ,_______,________________ 4________________________________________________________ __________________ __,________,___ ______4________ I I I I 19i1153794 Ir 132 4 7093 6197 (St t d ii) gg I 26 I 897 rep 46 ( ococcus I
gor on _______; , 4 , , __,________4___ ______,_________4 tr __ _____ _ _ _________________________________ ____________________ I I I I Igi~1235795 Ipuliulanase I 21 ( 498 140 8 8220 7723 (Thermoanaerobacterium thermosulfucigenesl I

,________4____ 4_______ ,_______~________________ 4__________,___...________ _ ______________ __ _________________________________________________4________4___ I I I I 19i leucine I 27 i 140 9 9205 83Z5407B78 rich I I ( protein /Streptococcus equisimilis/


| 162 | 1 | 1 | 1125 | gi|1143109 | ORF7; Method: conceptual translation supplied by author [Shigella sonnei] | 46 | 25 | 1125 |
| 199 | 1 | 1 | 585 | gi|1947171 | (AF000299) No definition line found [Caenorhabditis elegans] | 46 | 28 | 585 |
| 223 | 3 | 1971 | 1477 | sp|P02562|MYSS_ | MYOSIN HEAVY CHAIN, SKELETAL MUSCLE (FRAGMENTS). | 46 | 27 | 495 |
| 232 | 2 | 760 | 1608 | gi|1016112 | ycf38 gene product [Cyanophora paradoxa] | 46 | 28 | 849 |
| 292 | 1 | 687 | 220 | gi|1673744 | (AE000011) Mycoplasma pneumoniae, cytidine deaminase; similar to GenBank Accession Number C53312, from M. pirum [Mycoplasma pneumoniae] | | | |
| 30 | 8 | 5843 | 6472 | gi|1788049 | (AE000270) o235; This 235 aa orf is 29 pct identical (70 gaps) to 198 residues of an approx. 216 aa protein YTXB_BACSU SW: P06568 [Escherichia coli] | 45 | | |
| 48 | 6 | 3461 | 3868 | gi|722339 | unknown [Acetobacter xylinum] | 45 | | |
O

j j j j jgij1699079jcoded for by C, elegans cDNA yk41h4.3; coded 45j 36 306 N
60 1 307 2 for by C. elegans cDNA j j j j j j j j yk148g10.5: coded for by C. elegans cDNA j j J
yk152g5.5; coded for by C. j j j j j j j elegans cDNA yk59a10.5: coded for by C, elegans cDNA yk41h4.5: coded for ~ i i j j j j j j j by C. elegans cDNA cm20g10; coded 'J

N
________,____~_______,_______,________________,________________________________ ____________________________________________,__.._____,_________i__ __ ~

_ O
j j16j14371j19874jgij1321900jNADH dehydrogenase lubiquinone) (Artemia franciscanaj 45( 25 ____ 72 ( j j ,________,____,_______~_______,________________,____________________________-_______________________________________________,___ _____~_________,______ 99 7 9158 7991i 1S2192 mutation causes a succinoglucan-minus henot j j j ~ j9 j j p ype: ExoQ is atransmembrane j 45j ( j j j j j j j protein; third gene of the exoYFQ operon;: j j j ~o putative (Rhizobium meliloti) j ,________,____,_______,_______,________________~______________________.._______ ___________-__________________________________,___ _____~_________f_________, n O

127 12 7096 6606bhs 153689HitB=iron utilization rotein (Haemo hilus influenzae, 45j 24 441 j j j j j j C j j p p ype b, DL42, NTHI j j j j j j j TN106, Peptide, 506 aa/ [Haemophiius lnfluenzael j ( j j ,________,____,_______,_______,________________~___..__________________________ ______________________________________________,________,______..__,_________, N

, j j ( j jgi(472921jv-type Na-ATPase (Enterococcus hirae) j 45j 137 5 1561 2619 j j ,________,____~_______,_______,________________~_______________________________ _____________________________________________,___ _____,______.._ j j j j jgij301141jrestriction endonuclease beta subunit (Bacillus 45j 28 411 209 1 774 364 coagulans) ~ j ( ,________,____~_______t_______,________________,___..__________________________ ________________________.._____________________,___ _____,_________,_________~

j ( ( j jgij1480457jlatex allergen lHevea brasiliensisi j 45j 31 60J
314 1 60A 2 j j ,________s____,_______,_______~________________,__________~____________________ _____________________________________________,___ _____,_________, ______ ( j18j19782j20288jgij433942(ORF ILactococcus lactis) ( 49j 26 507 20 j j ________i____,_______,_______,____________..___4_______________________________ _____________________________________________+___ _____~________ j j j j jgij537207jORF_f277 (Escherichia cold j 44j 87 8 7030 6452 j j ,________,____~_______~_______,_________._______;______________________________ ________________________-_____________________,___ _____+______~__,_______-_, j j j j jgnljPIDje308082jmembrane transport ?rotein [Bacillus subtilis) 44j 25 873 166 5 4909 4037 j j j ________~____,_______,_______~________________,________________________________ ___..______________________________________ ________ ___ _ __~__ __,__ __ j j j ) jgnljPIDjdID0718j0RF1 iBacillus sp.) j 44j 20 744 ___,_______~_______~________________,__________________________________________ __________________________________,___ _____~_________,_________;

S. pneumoniae - Putative coding regions of novel proteins similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt) | match accession | match gene name | % sim | % ident | length (nt)
----------|--------|------------|-----------|-----------------|-----------------|-------|---------|------------
32 | 3 | 1885 | 3876 | gi|2351768 | PspA [Streptococcus pneumoniae] | 43 | 24 | 1992
36 | 17 | 15467 | 18256 | gi|1015739 | M. genitalium predicted coding region MG064 [Mycoplasma genitalium] | 43 | 26 | 2790
54 | 15 | 14656 | 17343 | gi|520541 | penicillin-binding proteins 1A/1B [Bacillus subtilis] | 43 | 27 | 2688
67 | 2 | 696 | 1352 | gi|536934 | yjcA gene product [Escherichia coli] | 43 | |
139 | 2 | 2416 | 338 | gi|396400 | similar to eukaryotic Na+/H+ exchangers [Escherichia coli] | 43 | 24 | 2079
298 | | | | gi|413972 | ipa-48r gene product [Bacillus subtilis] | | | 807
387 | 1 | 47 | 427 | gi|2315652 | (AF016669) No definition line found [Caenorhabditis elegans] | 43 | 30 | 381
185 | | | | gi|2182399 | (AE000073) Y4fP [Rhizobium sp. NGR234] | | | 1095
340 | 1 | 582 | 70 | gnl|PID|e218681 | CDP-diacylglycerol synthetase [Arabidopsis thaliana] | 41 | 20 | 513
363 | 6 | 4205 | 1914 | gi|1256742 | R27-2 protein [Trypanosoma cruzi] | 41 | | 2292
368 | 2 | 2 | 943 | gi|21783 | LMW glutenin (AA 1-356) [Triticum aestivum] | 41 | 34 | 942
155 | 3 | 4489 | 2861 | gi|42023 | member of ATP-dependent transport family, very similar to mdr proteins and hemolysin B, export protein [Escherichia coli] | 40 | 18 | 1629
365 | 2 | 95 | 1438 | gi|1633572 | Herpesvirus saimiri ORF73 homolog [Kaposi's sarcoma-associated herpes-like virus] | 40 | 21 | 1344
1 | 3 | 2979 | 3860 | gnl|PID|d101908 | hypothetical protein [Synechocystis sp.] | 39 | | 882
1 | | 3814 | 4647 | gnl|PID|d101961 | hypothetical protein [Synechocystis sp.] | 39 | | 834
26 | 6 | 4035 | 724 | gi|142439 | ATP-dependent nuclease [Bacillus subtilis] | 38 | 20 | 3312
47 | 1 | 3 | 4916 | gi|632549 | NF-180 [Petromyzon marinus] | | 23 | 4914
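The Start, Stop and length columns of the table above are mutually consistent, which can be checked mechanically. The following is a minimal sketch, not part of the original disclosure. It assumes that coordinates are 1-based and inclusive, and that a Start greater than Stop indicates an ORF on the reverse strand; both points are interpretations of the table, and the names `Orf`, `orf_length` and `orf_strand` are hypothetical.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    """One table row reduced to its coordinate columns (hypothetical helper type)."""
    contig: int
    orf: int
    start: int  # Start (nt), assumed 1-based inclusive
    stop: int   # Stop (nt), assumed 1-based inclusive


def orf_length(o: Orf) -> int:
    """Length in nucleotides, counting both end positions."""
    return abs(o.stop - o.start) + 1


def orf_strand(o: Orf) -> str:
    """'+' when Start <= Stop, '-' otherwise (an assumed reading of the table)."""
    return "+" if o.start <= o.stop else "-"


# Three rows taken from the table, paired with their listed length (nt).
ROWS = [
    (Orf(32, 3, 1885, 3876), 1992),
    (Orf(139, 2, 2416, 338), 2079),
    (Orf(387, 1, 47, 427), 381),
]

if __name__ == "__main__":
    for orf, listed in ROWS:
        computed = orf_length(orf)
        print(orf.contig, orf.orf, orf_strand(orf), computed, computed == listed)
```

Under these assumptions the computed length equals the listed length for each of the three rows, which is also a convenient check when a digit in one of the coordinate columns is illegible.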

ro S. pneumoniae - Putative coding regions of novel proteins n6t slm~lar to known proteins ,________,____,_______,_______, -( ContigORFStartStop ( ( ( ( ID ID (nt)(nt) S ( ( ( ,________,____,_______,_______, ( 1 ~

( ( ( ___,____~_______,_______~ pp ( 1 ( W

~

( ,________,____,_______,_______, pa ( 3 ( ( ( 99d ( ,________,____,_______,_______, ( 3 ( ( ( ( ___,____,_______,_______, ( 3 ( ( ~

( ,_______,_______, ( 3 (25 (25046 (25396 ( ,________,____,_______,_______, 3 (26 (25625 ,_______,_____ ( 6 ( CZ

( ( ( ,________~____,_______~_______, ( 6 (14 ~1~875 (12618 ( ~________,____,_______,_______, o ( 6 to (15 (13215 ~

___,____,_______,_______, ~1 ( 6 (18 (15977 (15390 ( J

;________,____,_______,_______, N

7 (12 o ~

( ~

,________a____,_______,_______, -( 7 (13 W ~o A

~

( ,________,____,_______,_______~ N ~o ( 8 ~

~

~

( ,________,____,_______,_______, o ( 9 ~ a, ( ( ( ,________;____,_.._____~_______, N

( 10 ( B
( ( ( ,________,____,_______,_______, ~

( ~

,________,____,_______,_______, (11 ~

( ( ________,____,_______,_______, ~

~
11d0 ( ( ,________,____,_______,_______, ( 12 ~

~

~

( ___,____,_______,_______, ~

~

~

( ___,____,_______,_____ ( 16 ( ~

( ,________,____,_______,_______, ( 16 ( ( ~

( ,________,____,_______,_______, ( ( ( ( ,________,____,_______,_______, J

( 17 ~ Hr ~

( ( ,________,____,_______,_______, ~

( d890 ~

TABLE 3 S. pneumoniae - Putative coding regions of novel proteins not similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt)
----------|--------|------------|----------
20 | 14 | |
21 | 3 | 3359 | 2589
21 | 5 | 4802 | 4482
22 | 21 | |
22 | 25 | |
22 | 33 | |
 | | 26218 |
22 | 36 | |
23 | 7 | 6655 | 6032
23 | 8 | 7132 | 6653
24 | 1 | 36 | 518
25 | 5 | 3009 | 2641
27 | 4 | 4819 | 4223
27 | 5 | 4789 | 4956
28 | 5 | 3017 | 1797
28 | 8 | 4272 | 3850
28 | 10 | 5028 | 9597
28 | 11 | 5746 | 5072
29 | 7 | 5596 | 4919
29 | 8 | 5019 | 5518
29 | 9 | 5595 | 8207
30 | 9 | 6511 | 6263
31 | 6 | 2664 | 2344
32 | 5 | 5203 | 5538
33 | 8 | 5327 | 4668
34 | 10 | 8024 | 7740
 | | 9360 | 8641
 | | 9667 | 9377
34 | 18 | 13104 |

( 35 (11 ( ( ( ~________~____~_______t_______4 ( 35 (12 (11073 ( ( ~________~____~_______~_______t ( 36 ( ( ( ( _,_______~_______, ( 36 (12 (10893 ( ___,____~_______,_______, ( 36 (13 (10993 A

,________~____,_______~_______, ( 36 ~15 (12172 (14595 ( ~

( ( d577 C'1 ,________,____,_______,_______~

( ~
d480 ( ( ,________,____,_______~_______v o v ( 38 (10 ( ( ( t________,____~_______,_______f ( 38 (17 (10732 ( 40 ( 0 ~

( ( ( 4J W
~

( ( S
( ________,____,_______,_______ w ( 43 ( ( 8&84 ( ( ( 43 ~

( ( ,________+____~_______,_______, ( 41 ~

( ~

( 45 ~

~

~

( ,_______,_____ ( 46 ( ~

~

( ( 46 ( ( ( ( ~________f____~_______~_______~

( S
( ( ( ,________,____,_______,_______t ( 48 ( ( ( ___,____ ( 48 ( ( ( ( 48 (16 (12494 ( ~________~____t_____-_y_.~_____t ( 48 (20 (16342 ( t________~____,_______~_______~

( 48 (24 (18351 ( ~.________~____t_______~_.._____~

( 18 (30 (21979 (21776 t________~____~_______i_____ ( ( ~

~________t____~_______~_______, TABLE 3 g, pneumoniae - Putative coding regions of novel proteins not similar to known proteins y________y____y_______y_______y ( ContigORFStartStop ( ( ( ( ( ID ID (nt) (nt) ( ( ( ( y________y____y_______y_______y 0 D

( 50 ' ( ( ( ( y________y____y_______y_______y ( 51 ( ( ( y________y____y_______y_______y ( 52 (11 (12146 (128B3 y________y____y_______y_______y ( 54 ( ( ( ( y________y____y_______y_______y ( 54 ( ( ( ( +________y____y_______y_______y ( 54 ( ( ( ( y________y____y_______y_______y ( 54 (16 (17685 (17506 ( y________y____y_______y_______y ( 55 ( (10515 (10123 ( y________y____y_______y_______y ( 55 (12 (11947 (12141 ( y________y____y_______y_______y o ( 56 N
( ( ( ( y________y____y_______y_______y ~.1 ( 56 ( ( ( ( J

y________y____y_______y_______y N

( 57 ( O

( ( ( y________y____y_______y_______y ,_..

( 57 ( W

( ( ( y________t____,_______~_______~

( 58 ( ( ( ( y________y____y_______y_______y p ( 59 ( ( ( ( t________y____,_______y_______+
N

( 59 ( ( ( ( y________y____y_______y_______y ( 59 ( ( ( ( y________y____y_______y_______y ( 59 ( B
( ( ( +________a,____y_______,_______t ( 59 ( ( ( ( y________y____y_______y_______y ( 60 ( ( ( ( y________~____y_______f_______~

( 61 ( ( ( ( y________y____y_______y_______y ( 61 ( ( ( ( y________y____y_______y_______y ( 64 ( ( ( ( y________y____y_______,_______y ( 66 ( ( ( ( y________y____y_______y_______y ( 66 ( ( ( ( y________y____y_______y_______y ( 67 ( 00 ( ( ( y________y____y_______y_______y ( 69 ( ( ( ( y________y____y_______y_______y TABLE 3 S. pneumoniae - Putative coding regions of novel proteins not ~Ydilar to known proteins _______..s____,_______,_______, ( ORFStartStop Contig ( ( ( ( ~ (nt) (nt) ID ID ( ( ( ;___70___i_5_-i'4 0~0 ( r ( ,________,____,_______,_______, ( ( ( ( ( ,________+____,_______,_______, ( ( ~

( ( ,________,____,_______,_______, ( (15 (20351 (21901 ( ,________,____,_______,_______, ( (16 (21859 (22338 ( ,________,____,_______,_______, ( (19 (26204 (27556 ( ,________,____,_______+_______, ( ( ( ( ( ,________,____,_______,_______, ( ( ( 38l5 ( ( ,________,____,_______,_______, ( ( ( ( ( ,________,____,_______~_______) o ( ( ( ( ( N

,________,____,_______,_______, ( ( ( ( .
( ,________,____,_______,_______, N

( i3 0 (15 ( ( ( ,________,____,_______,_______, ( ( ( ( ( f________,____,_______y_______, ( ( ( ( ( ,________,____,_______,_______, o ( 7s (11 ( esoz ( 9z10 ( ,________,____,_______,_______, ( ao ( s ( ( Alo9 ( ,________,____,_______,_______, ( el ( ( z04 ( z i ,________,____,_______,_______, ( (10 ( ( ( ,________,____,_______,_______, ( ( ( ( ( ,________,____,_______,_______, ( (17 (16A10 (16460 ( ,________,____,_______,_______, ( ( ~

( ( ,________,____,_______,_______, ( ( ( ( ( ,________,____,_______,_______, ( ( ~

( ( ,________,____,_______,~______, ( ~19 (16767 (17114 ( ~O

,________,____,_______,_______, J

( 87 w ~

( ~

( ,________,____,_______,_______, ( ( ( ( ( ,________,____,_______,_______, ( ( ( ( ,________,____,_______,_______, S. pneumoniae - Putative coding regions of novel proteins not similar to known proteins y________y____y_______y_______y ContigORFStartStop ~ ( ~

ID ID (nt) (nt) ~ ~ ~

y________y____y_______y____ __y ~

y________y____y_______y_______~
pp ~19 W
A

y________y____y______________y hr ~

~

~
l810 ,________,____y_______y_______, ~
d ~

~

,________y____,_______y_______, ~

~

~

y________y____y_______y_______, ( ~

~

y____________,_______y_______, ( ~

~

__y____,_______y_______, ~ CZ

~

~

y________y____,_______y_______y ( ~

~

o y________y____,_______y_______, N

( N

~

~

y________y____y_______y_______y ~

91 , ~ a ~

~

~

y________y____y_______y_______y N

( ~

~

y________y____y_______y_______y ~, ,r 93 O 'o ~

~

~

( ' y________y____y___ __y_______y ~

~

( y________y___..y_______y..______y O

~

~

~

y________y____y_______y_______ N

~

~

~

y________y____y_______y_______y ~

~

~

y________y____y_______y_______y ~

~

~

y________y~__________y_______y ~

~

~

y________y____a_______y_______y ( ~

~

~

y___..____y____y_______y___..___y 99, ~15 y___..____y____y_______y_______y ~17 A

y________y____y_______y_______y ~

~

~

___..____y____y_______y_______y ~

~

~

y________,____y_______y..______y y ~

~

~

y________y____y_______,_______y i ~

~

y________y____y_______y_______y ~

) ~

y________y____y_______y_______y TABLE 3 S, pneumoniae - Putative coding regions of novel proteins not ~151fllar to known proteins y________y____y_______y_______y Contig StartStop lORF ~
~

IO ~ID (nt) (nt) ~ ~

y________y____y_______y_______y 106 ~ 0 1 ~ 0 1 ~

~

,________y____y_______y_______, ~

'", ~

.

y________y____y_______y_______y 108 ~
1 ~
2 ~

y________y____y_______y_______y 1l1 ~
3 ~

~ 3788 y__-_____y____y_______y_______y 111 ~ , 4 ~

~ 1606 ~

y________y____y_______y_______y 115 ~

y________y____y_______y_______y 116 ~
3 ~

( 2121 y________y____y_______y_______y 118 ~
2 ~

~ 1357 y________y____y_______y_______y 122 ~ y 4 ~

~ 2333 ~

y________y____y_______y_______y 122 ~10 ~ 585A 0 ~ 6199 N

y________y____y_______y_______y N

122 ~12 J
~ 6301 ~ 7416 ~

y________y____y_______y_______y J

124 ~
2 ~ N

~ 690 y________y____y_______y_______y O

128 ~
9 ~

( 336A

y________y____y_______y_______y l29 ~ ' 1 ~

( 102 ~

y________y_.-__y_______y_______y l29 ~ o 2 ~

~ 724 ~

y________y____y_______y_______y 129 ~
8 ~

( 6056 y________y____y_______y_______y 129 , ~ 9 ~ 6540 ~ 6277 y________y____y_______y_______y 129 ~12 ~ 7809 ~ 7621 y__-_____y____y_______y_______y 131 ~
3 ~

~ 756 y________y____y_______y_______y ~ 5972 ~ 5673 y________y____y_______y_______y y________y____y_______y_______y 135 ~
2 ~

~ 1110 y________y____y_______y_______y 136 ~
4 ~

~ 3B30 y________y____y_______y____..__y 137 ~
2 ~

~ 134 y________y____y_______y_______y l39 ~12 J
(14027 (14521 ~

y________y____y_______y_______y 139 ~13 N
(14840 ~

y________y____y_______y_______y y________y____y_______y_______y S. pneumoniae - Putative coding regions of novel proteins not similar to known proteins y___..____y____; _______y_______;

Contig ~ORF ~ Start ~ Stop ID CIO ~ (nt) ~ (nt) ~O

y___.____y____y_______,_______;

( 140 ~ ~2019822 ;________;____y_______;_______~
W

142 ~ 1 ~ 1 ~ 285 r ,________,____,_______,_______y 116 ~ 3 ( 760 ~ 479 y________;____y_______,_______y 146 ~ 1 ~ 1149 ~

y________,____;_______~_______, 116 ~ 7 ~ 3604 ~

y_--_____+____y_______y_______y 1d6 ~13 ~ 8223 ~

y________y____y_______y_______y 146 ~14 ~ 9399 A

y________,____i_______;_______;

146 ~15 (10052 ~

,________,____y_______,_______, l17 ~ 7 ~ 7d88 ~

y________;____y_______~_______;
N

147 ~ 9 ( 8913 ~

J

y________~____y_______;_______p r 148 ~ 7 ~ 5298 ~

N

;________;____,_______y_______;
o 149 ~ 1 ~ 2 ~ 1936 ;________y____y_______y_______, ""'' 149 ~ 3 ~ 2557 ~
2880 y N ~o ;________,____,_______,_______, 119 ~ 9 ~ 6258 ~

__,____,_______,_______y 150 ~ 2 ~ 1355 ~ ' 579 ~

y________y____y_______y_______y N

150 ( 3 ~ 2556 ~

,________~____~_______,_______, 153 ~ 3 ~ 2061 ~

;________,____,_______,_______, 154 ~ 3 ~ 19S3 ~
17d1 y________p ___p______y_______y 155 ( 2 ( 2181 ~

y________~____,_______,_______y 156 ~ 8 ~ 4550 ~

y________?____;_______f_______y 157 ~ 1 ~ 37 ~ 294 ;________f____;_______y_______y ( 159 ~ 2 ( 631 , y________,____y_______y_______y 159 ~ 4 ~ 1384 ( y________,____+_______~_______s C/~

159 ~ 7 ~ 3271 ~

y________y____;_______~_______, w,, 161 ~ 2 ( 1332 ~

y________y____y_______;______ 165 ~ 3 ~ S535 ~

y________y____y_______y_______y 166 ~ 6 ~ S406 ~

y________y____,_______y_______y S. pneumoniae - Putative coding regions of novel proteins not sis,ilar to knowrn proteins ,________,____,_______,_______, ContigORFStartStop = ~

( ID (nt) (nt) ID ( ( ,________,____,_______,_______E p'0 ( ( ( ( ,________,____,_______,_______, ( ( ~

( ( y________,____,_______,_______, ( ( ( ( ( __,____,_______,_______, ( ( ( ( ( y________,____y_______,_______, ( ( ( ( ( ,________,____,_______,_______, ( (11 ( ( ( ,________,____,_______,_______, ( ( ( ( ( y________,____t_______y_______, ( (]

( S
( ( ( y________,____,_______y_______, y ( o ( ( a913 ( 2s77 ( y________,____,_______y_______, N

( ( ( ( ( __,____,_______,_______, ( J

t ( ( ( ,________,____y_______,_______, N

( ( ( ( ( ,________,____,_______y_______y ~ w..

( W

( ( 2a00 ( ( y________y____y_______y_______y ( ( ( ( 19a5 ( ,________,____,_______,_______, o ( (10 ( ( ( ,________~____~_______,_______, N

( (11 ( ( ,________y____,_______,_______, (13 ( ( ( y________y____,_______,_______, ( ( ( ( ( ,________,~___y_______,_______, ( ( ( ~

( ,________y____y_______,_______, ( ( ( ( ( ,________y____,_______,_______, ( ( ( ( ( ,________,____,_______,__..____, ( ~

~

( 23a0 ( ,________,____y_______y_______, ( ( ( ~
4a19 ,________y____y_______y_______y ( ( ( ( ( ,________y____y_______,_______y ( (p ( ( ~

TABLE 3 S. pneumoniae - Putative coding regions of novel proteins not similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt)
----------|--------|------------|----------
188 | 5 | |
188 | 6 | 5882 | 6493
189 | | 3143 | 2844
189 | 9 | 5956 | 5564
191 | 1 | 618 | 1
191 | 11 | 10357 |
192 | 3 | 2861 | 2268
192 | 1 | 3081 | 2878
192 | 7 | 6800 | 5331
193 | 3 | 997 | 839
194 | 4 | 2315 | 2127
195 | 5 | 6249 | 4543
195 | 6 | 6620 | 6231
196 | 2 | 1553 | 1849
197 | 1 | 1 | 861
198 | 9 | 6844 | 6644
200 | 5 | 5329 | 5769
200 | 6 | 5993 | 6595
204 | 5 | | 3276
205 | 2 | 447 | 1709
209 | 1 | 2038 | 2160
209 | 5 | 2158 | 2682
210 | 10 | 7370 | 8230
210 | 13 | 9029 |
210 | 14 | 104)9 |
214 | 5 | 2581 | 2330
214 | 9 | 5065 | 5277
214 | 11 | | 5754

TABLE 3 S. pneumoniae - Putative coding regions of novel proteins not similar to known proteins

Contig ID | ORF ID | Start (nt) | Stop (nt)
----------|--------|------------|----------
217 | 2 | 541 | 191
218 | 2 | 914 | 1432
218 | 3 | 1430 | 1972
218 | 6 | 3639 | 3821
219 | 1 | 458 | 39
220 | 1 | 869 | 600
223 | 4 | 2617 | 1961
227 | 1 | 1 | 510
234 | 4 | 1539 | 1312
234 | 6 | 2116 | 1838
235 | 1 | 52 | 312
235 | 2 | 310 | 68?
238 | 1 | 660 | 64
246 | 1 | 1 | 270
248 | 1 | 3 | 362
248 | 2 | 443 | 1222
254 | 3 | 2789 | 792
258 | 2 | 1179 | 1616
260 | 3 | 1770 | 2123
263 | 1 | 653 | 177
263 | 4 | 2244 | 1900
263 | 5 | 3569 | 2973
266 | 1 | 1 | 342
266 | 2 | 177 | 1044
270 | 2 | 1124 | 1681
272 | 1 | 857 | 146
275 | 2 | 1681 | 2295
~D

__________________________y ~

~

~

__________________________ ~

~

~

~

__________________________, ~

~

~

________y__________________ ~

~

~

,________,___________,_______, ( ~

~

________y__________________ ~

~

( ____________,_______,_______ ~

~

~

__________________________ ~

~

~

__________________________ 294 o ~

~

~

~

y________,____y_______;_______, N

~ N

( ~

'J

__________________________ ~ J

~

( 8d3 ________,__________________ N

( ~

~

~

___________________a_______ ~

~

~

___________________,_______ ~

~

~

,________,____,_______,_______, o ~

~

( __________________________ N

~

~

~

________,___________,_______, ~

~

~

___________________,_______ ~

( ~

___________________y_______ ~

~

~

__________________________ ~

( ,__________________________ ~

~

~

__________________________ ' ~

' ~

__________________________ ~

~

( ________+__________________ ~

~

' __________________________ J

3d5 ~ r ~

~

__________________________ ~

~

~

____________,_______,_______, ~

~

~

,__________________________ TABLE 3 S_ neumoniae - Putative codin re ions of novel proteins not ~i lar to known p g g ~n~i proteins ,________,____,_______,_______, Contig StartStop ~ORF ~
~

ID ~ID (nt) (nt) ( ( ~0 y________,____,_______,___..___, 350 ~
2 ( 81 ~

,________4____,_______a___..___, 355 ~
1 ~ W
44 ~

,________,____,_______,_______, 358 ' 2 ~

( 448 a________,____,_______,_______, 360 ~
2 ~

( 628 ,________,____,_______~_______, 361 ~
2 ( ( 1265 ,________,____,_______,_______, 378 ~
1 ~

~ 1004 ,________,__.._,_______,_______, 379 ~
a ~
s83 ~ slo ,________,____,_______,_______, 381 ~
1 ~ CZ

~ 693 ,________,____,_______,_______, ( 385 ~ 1 ( 150 ~ 4 ,________~____,_______,_______, N

3es ~
a ~ N

f 30 'J

,________,____,_______,_______, H

(1) GENERAL INFORMATION:
(i) APPLICANT: Charles Kunsch, Gil H. Choi, Patrick J. Dillon, Craig A. Rosen, Steven C. Barash, Michael R. Fannon, Brian A. Dougherty
(ii) TITLE OF INVENTION: Streptococcus pneumoniae Polynucleotides and Sequences
(iii) NUMBER OF SEQUENCES: 391
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Human Genome Sciences, Inc.
(B) STREET: 9410 Key West Avenue
(C) CITY: Rockville
(D) STATE: Maryland
(E) COUNTRY: USA
(F) ZIP: 20850
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage
(B) COMPUTER: HP Vectra 486/33
(C) OPERATING SYSTEM: MSDOS version 6.2
(D) SOFTWARE: ASCII Text
(vi) CURRENT APPLICATION DATA:

WO 98/18931 PCT/US97/19588
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Brookes, A. Anders
(B) REGISTRATION NUMBER: 36,373
(C) REFERENCE/DOCKET NUMBER: pg340P1
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (301) 309-8504
(B) TELEFAX: (301) 309-8512

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5625 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

AACTTGACTA

TCAGTAGTTA

CGACGGGCAT

AGAAGTCGGT

CGTGTTTGGA

GAACGGACGT

TAGTGGAATG

AACCACGTCA

GCCGTGCGTA TGGTTACTGA CTTCGTCAGT TCTATCCACA AGTGTTTTGA 540
ACCTCAAAAC

TTTGAGCAAC

TTGTTAGAAG

TATCCCATAG

GTTTCTTTAG

GAAATAAATA

TATCTTTATT

TTGCTACCTA

TAATTTATTA

TGCATTATAC

TGGATGATTT

GTTATAATCA

ATCTAAAATA

TCGATTTGTT

ATGATAAAAT

CTATTGAAAA

CAAGTCGTTC

AATATTCAAG

ATAGTTTCCT

AAGAGGTGGT CGAGTTGGTT TAGGTAGTCG ATGCGTGAGT CAGGGTATGG 1680
TGATAATTCT

TAGAGACAAT

AGTTAATACC

ATTTTTGATT

CGCTTGCATT

AAAATGTAGA

GCTATTCCTT

AATTTTGTGA

CCTGAAAAGC

ATTTTAGGAT

GGAATTTTGA

CGGCAAGATT

GATTTGGCTC

CTCTTTCATA

GATCCCGTGC

CTCCACAATC

AAGGATAATA

TTGACCACTC

AAGGGGCAAG

AAGACTCTCT

CTGTCTGATA

CGCTACCAGT

AAGATGGTGG

GATGATCAAA

TACTTACCGA

CTTTTATCTC

TATTCTCAGA GTGCTATACT GTAAGTGTAA TCGCCGATTT AGCTTAGTTG GTAGAGCAAG 5220

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7571 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

AAGAAGTCTA AAAATATCTC

TGTAGTGTAC TTGCCACAAT

GTTTCCGCTG ATCTTGATTG

TTTTTTTAGT GCCATAACGC

AAGCATCTTA GCGAAATAGA

TCTTCGCATC TGAGATAGCC

TATCAGACAA ATTATCATCT

TGGCTGGTGC AATTCCATTT

TATAAGGTAA AATGGTATTG

AGCCAGCTAG ATAGTGGTTT

CATCCGGAGT ATAACCAATT

CAGTTGTTCC AGTTTTCCCT

ATAACAGCAT TCTGCAAGTT TTTACTGATG TCAGTCAGCT CAACATAGGT TCCCTTTTGA 2160
ATTTAGCTAT TTCCTATCCA AATAGGGCTT TTTTTGTTAC AATATCTGTA TGCAATTCAC 2580
CCACAAAGCC TTGCTTTCTA TCAACTCAAG AATTATTTAG CAATTTTTGC GAAGTATTCA 3540
TCAGCTGCTT CTZ"fCTTAGCAAAGAAACCAGTAGCTTCAAGAACGATTTC TACACCGTCA 4260

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26385 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CAGATGGCGC

TCAGCAATTC

CTACTCGCGC

CCCTTCCAAT

TCCCAGACAT

ACAACACAGC

AAATCGTCAA

AAATCTTCAA

GCGGTTTGAC

CATCAACAGG

GAAACAGCGA

ACTTGAAAAA

GTAAAGAGCT

TGCCAAATGG .

TTTCTGGCTA

CTGGTCACTG

AAGAATTTGT

TTCACTCTTA

AAAAATCTGG

ACCTATTCGT

TCGATGTTTA

CTCAAGATAC

TTCGTGCAGC

TCTACGGACC

CTACTATCCA

ATGGCGAAGA

TCACAGCTAT

AAGTAACCCT

AGAAACTCCG

ACCAGCAAGA

AACGTTCGTC

GCTATCCTAG

GCCTCCAATC

AAACTGAATC

CAGTTCTTCA

ACGGCCATTG

CCCGATTCCC

GTATCTTTGA TACAAAGCTC TTGGTCATCC ATATAAATCT CCAGACCACC 3900
TTCCTTGGTG

TTTATCCGTC

AAAGAAAAGA

TAAGACCAAA

GGAGTCGATT

TGCAACTAAG

GTAATCCAGC

CTCACGCGCT

TCCATAGAGA

TAATATGGTT

AAATAAAAAC

ACTACGGGAG

ACCCTATTCC

GCAAACGAGC

GTTCCCGCAT

AAATCTGGAA

ATTTGGTATT

AGGAACGACG

TCACAAAGTC

AAGAAAGAAA

CATTAAACAA

GACTCAAAAC

ATAGAAGCAA GATATTCATT CCACAAGGTTCCATTATCGGTATGATGCTG GAAATTCCTG6l20 AAGTAGACAC AACCTT'fAAA GAATGGGTGGCCACAGTCCCTGACGAAGAA CTTCAGCTCT6300 AAATAACTTG AAATGAGGGA TAATAAAAATAATACTGGATTCCACAAACT TCTATTATCC6$40 WO 98/18931 PCT/US97/1958$

CTTGCTTTCCTTCCCCTCGAGGGCAATGATTATCAGCATA.TGAGTCGCAATGGTAAATCT7380 TTCATCCACTCCTATCTGCCGAGCATCTGCCAAAACAGCC TCCAAGGCGGTGGTATTTC'C8280 CGTGGAAAGA

s GGAGAACTAC TCGTTGAAAT CAATAACCTC CCACTAGCTGATATCAAGGAAGCTGGCGCC8880 GCTCAAAGTG AAAGTCATTG AGCTTGCGAA TGACAGTTGA AGTTGAAATGGCCAGCTGAT1d080 GGGCAATATC AGTCATAGAA ATTTTTTCAA TTAACTTTTG AGCAATyTTTTGGTTGATGA10140 CGCTTTTCAT

CTATAGTGGA

GTTTTCAAGC

CTGAAACTAC

ACAACTCTAT

GTTATAGATT

TGTTTTCTTA

ACTTTTAATA

AGTTAAAAAA GATTTGAAACTAAATTCCAA ATTAGAAAAA ACTAP~AAAAA10980 GACTTGAAAT

TAGAGAATGA

AGGAGATTTA

AACGAAGTGA

AAAAAATAAA

TTGAACTAAC

AGGAGATACT

CACCGCTTTG

TTGAAGAGAT

AAATGCTAGA

GAAGAGATGA

TTTGCAACAT

ATTGCTATCG

GCAGGGGCTT

GAAGAAGCCA

GAGAAAGATA

GAATGCCTCA

CCTGTGAAAA

AAAAAAATGA

GCTTGTTCAG

GAGATTACAA

GCAGATGCTT

' ACAAGATAAA TTTAACGAAA CTGCCTTAAA TGAAGCAAAA ATCGGAGATGATTACTACTC12480 CTTGAACTTC TACGTACGTA rTGGTGGAGG AAGCCTCTTA ACAAAAGATCTTAAAGCAGA12720 _ CAAACAGTAC

GTGTTCAAGT

AGAACTGTAT

GCCAATGCTA

TTGGGTCGTG

ATCTACTTTG

CCAGTATTTA

TTTTCCCATT

ATCCAACACG

ATTTGCACTT

TTGCGATTAT

GAGCAATCAT

TTCCCTATTC

TGGTTTATCT

CAGTTCCAAT

TTTATAAAGT

TTATCAATGC

TGACAGTAGC

TGATGGCAGC

ATAAGATTGC

GAATAAAAGA

GGGAGTCCCC

TATTTTTCAA

ATTAGATAAA

AAACTCTCTT

CTTTTCAATT

GGGAATAAAT

AGTTGAAAAT

TAATGGAATA

TGGTATAGAT

TATCAAGTAT TGRTGCACGT TATGGTGGGA CTCATGATTC TAAAAGTAAG 16020 ATTAATATTG

ATAACGATTT CCGTTATACA GTTAGAGAAA ATGGTGTCGT TTATAATGAA ACAACTAATA16380 AACAGGTTCC TATGAATGTT TTCTACAAAG ATTCGTTATT TAAAGTGACT CCTA(.'TAATT16S60 GAAAAACCAA

AACAATGTAA

AAAGAACTCA

GGTTGGGAAG

CACTTGTATC

GAAACTGTAA

GATGATATGA

GGTTCAGCTT

CGCTTAGACT

TTGATTCACG

ATGGATGGAG

GTCATTGATA

GAATTTGCAA

TGTACCCAGT

TTGGTATTCT

TTGACTTTGC

AATTTTAAGG

TTAGCAAAGA

TTCCTCATGA

CTGAGCAAGG

AGGAAGTCAC

CCTTCATCAC

TTGCTCTGGA

GTCAGGTTAA

AAGGGCTAGC

CATCCTACAG

CTGGTGTAGA

TTGAATATGG

CAGAACGCTT

TACAAAAAGT

CTTTCAAACC CAACCACGAT AGCAACTTATCCAATACTAG ACAATAGTATTTTTCAATCA19440 TGTGTGGGAT TATGTATTGG CTTAGCGAAACGAGATAAAG GAACCGCTGCGTTAGCAGGA19560 GACGGTGTTG

ATAAAAGTGA

ATCGAAGCAG

GGTAAAAACG

GTCAACAATG

ATTAGTTCTG

GCAAACCTTG

AAAGACAAGG

ACACGTGCCT

TATACAACTG

GACAATTTGA

AACGGTCAAA

CTTTTAGAAG

CCAGCTCTTG

GTCGCTGCAT

GACGTAGTTC

AAACGCATGG

GATGGATTTG

GACGAAAAAC

AAAGCTATGA

GAGTGTATAC

AATGTAAGAA

AATTAAAATG AAGTTCTTAC ATAAGCGAAT CATAAAAAATTTTTAAAAC:A22440 TTCATTTTGA

TGCCGACTGT

TAGCACCAGT

TTACAAGTTT

TCCGTATGTT

TTGGATCTGT

AAAATGTCAT

GTGTTGCCG't AATATTGACA ATTCACTGGTTGAAGCGGCGCGTGTTGATG GTGCAACTGAGTTTCAAGTT23100 ACAGAAAAAA AACCATTAACAGCCTTTACTGTTATTTCAA CAATCATTTTGCTCTTGTTG23460

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2716 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

ATTCCATTTT TCTTAAAGGG AATTGTGAAA CGCTATGTATTTTCTTTTTACAACCGGATG980 .

GCCTACCGAG

CATACTCAGA

ATTCCAGTCA

GGGATGTTGA

GATGGGGTTA

GTTGAAAAAC

AAGCAGGAAA

TTGCTTAGTC

GCTGATGTTC

CTGAATGACC

GGGATGATTG

GCATCGACAA

GTCATTGCTC

TTGTACTATG

GACATGAACG

AAACGAGTGC

GCTAAAGATG

GTGGTTGCTG

ATCAGTATGA

AAACAATTAA

CATTATCTGG

TACGGAGCTG

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13926 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GGGAAGTAGT TTAAAAATCA GCAATTGAAGATAAAATAGGATATTCCCTG CTAATTTAAG300 AT'1'T'TGCTCA AAAAAGAGCT CTGGGCTCATGGAAATCACTGTCATCTCGT CATGTTCCAC1800 TAAT'TTATTA CCTTGATATA CAATATAATCTTTATTGTAGAATGGTATTAATTTTTCAAG4800 TTTAAACTAA

AACCTATTAG

t TTGTAACTGC TCTTCTTTCTCTTCCTTGTC TAGTTTTTGT CCAACATTTC10860 TTGATTTTCC

TCCACACCTA

ACCAAATGAA

TGAGGCATGT

GCCTGCCTTC

AGTTTAGTCA

ATGCTCTGCT

AGGAAAGACC

TTCTCAAAGT

TTTGAGGTTG

ACGTGGTTTG

CGAAGAGTAT TAATCAACATAATCTAGTAA ATAAGCGTAc CATTTGGTCT11520 CTTTTTCTTC

GTAAGCCGCC

CTCGGCTCCG

CATCTGGACT

TTGATGAAAA

CCCAGTAACG

CAGGTGACAG

TGACAGGATA

GATAGTCTTG

CTAGAGGTTG

CCTTGTCATC

TATTTTTGCT

TTTGCTTGGC

CGTACTTGGT

GAACACGTGA

GCAAGTCTGC

ATTTCTGTTT TTTTCACTGC TTTTCCTCCTCCGCCTTTTC AATTTGCGAG12480 TGGCTAACTG

AGAACATAGA

TTATCATTCA

TTCCATGAAT

TGGGTAAGAA

GAGATTTTTA

CTTATGTCCT

CTCATCCGTA

ATAGACTTTC

TTCAAAATCA

CTTCATTTCA

ACAAAGGAGC

TTCCATTTTA

CCATCACAAT

TTCGGAAATG

AGCCCAGCGT

CCGCCAAGGC

AGGTCAAGCC

GTTGTAGCCT

CAAGAAGGAT

CTAAAAAACC

CCAGAGTTCG

CATCCTTATC

AAAAGAAGGA

AGTTCCTCCT

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20199 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AAAAAACTAT

TATTGGGAAT

AAATTTAGTG

TTTTAGATAC

TTCGTCGTGA

GCAATGTGGT

GTCATTTCAA

CTGAGGACCT

AGGTTTTAAC

GTGGACGTGG

ATGGAGAACC

GTTTAGTAGG

AGCCTAAAAT

CCCAATCAGG

ACATCATTGA

ATAAAGAGCT

ATAAGATGGA

AAAATTATGA

AAGGTCTGGC

TGCTCTACGA

AAGCCTTTGA

TGAAACTCTT

AGCTTCGTGG

TGGTCCGCAT

AACCGATATC

AAAAGAAATT

AGATTGGCAC

TCTCAGTAAAGAAGCTAAAA AATCCCGTGC ACGGGATTTTGTGGTACGAC1740 CTCATCAGAC

GCACAGCATA

CAGTAAAATC

AACTATATTG

TTTGTTATAG

GACTATCGTG

GGAAATAAGA

ATAATAAGGC

GTAACCTATA

CTGTAACCTT

GTGAGGTCTA

CTGTCTGATA

CTTGCTAAAA

CTCCTACTCA

TTCGAAAATC

CGTCAGTTCC

GTTTTGAGCAACcTGCGGCT AGTTTCCTAG GATTTTCATTGAGTATTAGT2700 TTTGATCTTT

TATCCAATAA

AGACCTTATT

ATGGTGGATT

CCTTAATTCC

TTTCGGATGT

ATGACGATGT

AAATTAACAG

AAGCGACAGT

TTAAGGCGTT

CTGCTAAAGA

CAACGATTAA

CAGCCCGTGA

' TTTACGAACA CCTGGAAGGG TTTATTGCTAAGTTGGAAGA AATGGGAGTGAGAATGACTG3660 .

GAAAATGAAA ATGCAGTTCT GGTTACAGGGGGATTTCAGG TTCATGATGATGTGCTTAAA5100 TATTATACTT CCTGCGAAAC AAAATATGGTATAGTAGTTCTATGAATGATGAAGCAAGTA7200 AACAACTAAC TGATGCACGA TTTAAGCGTCTTG'M'GGTGTTCAGCGTACCACTTTTGAAG7260 ~:CTCCCCGTA AAGTTTCTAT TTTCCCTGATTTCTGATATAATAGAAATATTGACTTCAAG8160 ACTCTTGAAA TGGAAAGTAT TCTCAGCAGT CAAAAGTTGGCCAAGAAGATGCAGCAGGAA10860 AATAGAGTTG

AAAACCACGA

AATACCGATA

TCTCTTAAAA

CTGTTTATCT

AAAAATCAAC

CATGAGATAG

AAGTCAGAGA

TGAGAGCCTC

TCAGTTCAAA

CTATCCATGA

TCAACTAGAC

TTAATGATGA

GTCAGTTCAT

TTAACAGAAG

AATGCCATA'r TTCAACGGTA

TGCGCCATAC

GCCAGTAGAG

ACCAGGAAAT

CTGAATGCCC

TTCTTCTTGT

CTTCTTTTAGGGCTGCAACCATGCCTACAA TGGCAGGCAG CCTGCACG'!'T13620 ATTTTCAGTT

GTCCATGCTA

AGCAGTGAGA

AACTGCATCA

GGGCAGTAGG

ACGTAAAGCC

TGCTTCAAAC

GATGGCAGTT

AGCCAATGAT

' TCCCACCAGA AGTGAAAAAGATATGTTGAG TAGTAACTGGGCTAGTTCCT19160 GTTTTGTCCT

GACGACCATG

TAGCTGAAAT

AATCACCTTA

TACGGACGAT

TTTTTTCTTT

GAAGCTCAGC

CTCCTTCACG

CATCAATCAA

TTGCATCGTC

TACGCGCACG

ATCCTTTATC

TATCAAAGAA

CAGCCTTAAC

TGCGGTCTTG

CACGCACACA

TTGTTGATTG

TTTCTCCGTC

CAATTTTTTG

CTGACATTAT

TATGAACTGG

CAGCTATTTT

ATCTGATTTT

TCAAGCTCCA

CCAATAGCTG

TTACTTTGGA

TTACTCAGAT

ACACTCTGCG

_ TTTCTAAGGG
ATTGCTGGCC

TTGGAGGCTA TTGAGCTTCCATTAGATATT CTCTATGAGGATGACCACTT TCTAGTCTTG17820 GTGGCTTTGC

CAAGCCAGTT AATGATCTTT TAATGCAGATGATGTTTACA AGTTGACCCT19490 _ CTCGTGGATG

TTCATCAATA

GTACGAAAAG

AAACTTCCAA

TCCGTAAAAC

CTCTTTGTCA

CTTTCTTTTT

AAACCAAAGG

TTAGAATAGT

AAGACAGTCT

TTCTCTGGTT

TTCAAGTCTT

CGAAAAATAG

TTCCAGTAGC

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19702 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

TTTTCAAACG

GAAAAAAGACCCTAAGGTCTCCTTTGCTTT CGCGTTCAAC TTTACCTGf.'r120 TATTATTAAA

TTAGGTTTAC

GTTTGGTTCA

CATACTTTAG

ACATACCGTA

ATTTGTGTCT

AACAGAACAC

ATAGCCGTCA

GTCTTTTTTC CATCAATTGGAACCATTCTC GCGGAAGGTCATCATTAAAAACATAAAACT780 CTGGAATGCC

TGGCAGAGTC

CTGAGAAGTC

ATTGCATAAC

ATTTAGCAAA

CAAAGTAGAG

CAAAGGTACG

TCTATTTTTC

CAACACTCCA

CCTCCCTTAT

AAAGGATGCA

TTAAGATTCT

ATGGTTGAAT

GACCGCAAGA

TTTAAAAACG

GGAGGAGATG

TCTCTAAGGA

GTGAGGAAGC

AAGCCATTGA

TTAAGGATAA

ACTATGAGCC

TCGTTGGAAA

ACGGAGCAAC

GCAAGGTTCCTGGTGGGTCA TCAAGTGGTT TGTAGCCTCAGGACAAGT'PC3720 CTGCCGCAGC

CCATCCGCCA

TTTCACGTTT

CTACTGTTAA

CTACTTCTGC

GTATGAAAAT

AAACAATCTT

GCCAAGGTTT TGGTGAAGAG GTAAAACGTC GTATCATGCTGGGTACTTTCAGTCTTTCAT4320 TGCCAAGAAA

AAAAGCAGGT

CTTTGCCGAT

GGCtTTACAG

AAACTACTTG

TAAGGTTGGT

TATTTTATGG

CAAGTACCTT

TGGAAGCTAG

ATGTCCGTAC

CTATTATCGA

TGAATACAGA

TTTACATCCT

TCACCGTTCC

CTACTGTTAC

CAGACTTCAT

CTCGTGCCTA

TCGTTATCGC

GTGTTCACTC

GTGGCATTTA

AGTACGGAGT

TCCGTAATAT

CAGATAAAAC

AGTCCACTAT

CTTGAGTTCG

GATACGGATG

AAGGAAGAAC

TTAACGCTCG

ACTCCTCAGC

AAATCTGCCC

TGAGATTATT GGTCGCTTCCAGTTTGGCATTAGAATAGTGTAGTTGAAGGGCGTTGACAA7860 TAGGCTCCAT AATATCTATAGGGGATTTACCCACTACAAATATTATAGAGCCAACAATAA9120 AATCCCAAAG

GGTATTCATA

CAGGACAAAA

AAAATACTCA

ACCGAGAATC

GCTCTTTTTC

GAAACTTCCC

AATGATCGTC

CAAAAATATT

TACCCTGCTC

ATATAGGGAG

AGACCACCTG

GTCATTGCAA

TTTTTTGTCT

TTTCCATAAA

GTTGATAGCG

TTTGGAGTTC

TCACCTTATC

CTGCTTCCAT

GAATGGCATA

CCCCGCCTGA

TTTCCATGAG

GATAAGCACC

TCATCAACCG

GGTTGTCTTG

GCCAACCAAT

GGATAAATTC

GCGCTTCTCT

GCTAGCAATC

TAATAGGAAT


TCAGGTAAAA GAGACCAGCA TTTGAATGGCGTTTAACCGCCGCAGAGATCTTAGCCATCT11340 CTGGCAATTT TTCGACAGTC GCATACTCAAACAAACGAGTGATTTTTTCACCTACAACCG11460 TTTCAGCAAC AATCAAAGCT TCTTCAGAGTTATGCACTTCTCCATCTGAACCTGGTATAA12900 CGATTGGAGA

GATTTGGATA TCAAAAGCAT CTTCGATTTC TGAGATTACT TGGAACAAGTCCAATGAATC18360 TTCAACGATA ATTTCTTGTA CTTTTTCAAA TACTGCCATG ATAGGACTCCTTTAAAATAA18480

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6211 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TGGTAAGAGT AATCATGAGA TGTTGGGGGT GCAGTTCCATGCGTAGAATGTTAACAGCAA1800 TCGGTAATAG

ATTAACATTT

TTCTTCAAAT

AGAGCTGGTG

CATTAACAAA

CTTTATCCTA

ATCAAATAGA

CGAAAACTTT

CATATGAATA

GTGATGAAAC TGATAGGCAA TATTTTTTCA GTTGGATTTACAATTGGGAG4020 CTGCAAAAAC

AATTGACAGA TCAACTAAGA CTAAT'I"I"I'GT CAATTTGTAG4080 AAATTAGATT TCCTCGTAGT

GTAGAATATA

TTGGTAGAAT

TCGATTCGTA

CTAAAAAACA

AAAAGATATA

TTACGATCAT

CCAGCGTGTT

GGAAGAACGT

CTATACTTGT

ACAAAAGATT

AGATACCTTT

CTTTATTGAT

GATTTTAGAA

TGGTGGCGGT

GGTTATCGGA

TCCAGTAAAA

TCAGTTGACC

TTGGTAGGTG TCGATGAGGG GAAACCTTGA TTGACCTTTACTCTAAGCAA5160 _ ATTGATTTCT

TGGAGCGGCT

' GATGATATCA CACGTTTTGA GTATATCAAA CGAGCTAGCAAGGGAACAGG CCCAGTATTA5960 .

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7939 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ATTCTTTTTC TACCATACCAAATACTTCTAAGGCAGCAAAAATGCCATCTTCTTCTAATG2100 ~'-GTAATCAAC ATATTCAATT TTGTTTGCTG CGATGTAATCAACTTTTTTA CGGCGTTTGA3360 GAATGTCAGC

CTTCAACGTT

GACGTTT'TTC

CAAAACGTGC

GAATTTCGTA

CAAGGGGTAA

AAGTAAAAAC

GATACGGTGC

GAAGGTTACC

ACGAAGGTCA

GACAAGGGCA

TGGAGCAAGC

GAAGGCCTTG

AATCCAAGTT

ATCTTCATCA

GAATGGCTCT

TTGAGGCGCG

AAGCGACTCA

GTGACGGCGT

AAGAGCCATA

TCCGAAGTAG

TGGGCTGTCA

AGCGTTACGG

CTTACGAGAGCGTAgCCACA AGTGACGGTT AAGTCTGTTCCGTGTTCTTT5340 ATCCATCAAA

GATCACTTCG

GACAATACCT

CTCAAGTCCC

AAAGAAGGCT

GGCAACCCAA

TACACGTTTT

TTTTTATTCT TTATGGCAAA CCACCTCTATATTGTTCCCATCCAC;GTCAATCATAAAAGC5760 CTAGGTCTGT CGCATAGCTG AGGCGGACATTTTCTGGTGCTCCAAATCCAGCTCCTGTTA6180 AGCAAGCGCT

GTATAAATCA

TTCACTTTTA

ACTGATAACC

TGCTTTGTTT

TGAATATAAG

CAAAGCTGTG

GTGAAACAGT

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9897 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

CTCACCATTG

AGATATTCCA

AGTAAAGTCA

TGTTAAAGTA

CTTAACAGTA

AGGTTGTACT

TACTTCAGAA

CACTACAGTA

TTAAAACTTT

GAAGATCTTC

CTTTGCTATC

GAACTGGGGA

ATTTCTTATG

TTGCAAAAAC

TATTTTCTAA

AATCAGATTG

AACTTGAATC

CTTTATTTCC

TCAATGAATA

TGACGATGGT

CAAATCCAAG

ACCATTCGTA

ATAAACCTAC

TTTTAGAAAT

TTGAAATAGT

CTGGTAACTG

CACTATCTGC

CTGTTGCAGT

TATTTTGTGT

TTTTAATTGT

GAATTTTTCt TCGGCAATTACCGGAATATT AAAATCAGCC TTAGTTCAAA4560 AATTTTTTCA

GATAATGTTG

TTACTTACAT

TCACTGACAA

GTTCCGCATT

GGTGGATAAT

TTAATATCAC

GCCATAAAAG

CAAGAAACAA

CTCATTTTAT

ACTTTCCAGA

TATGCGATTG

ACAATCTTCT

AGCTTTTTGT

AAGGCAATTT

TTCACCAATC

ACTTCCTTTA

AAGTTGAACT

AACTTTTTCT

GTATTTGAAA

TGTTGCTTTG

TTGCTGTTTT

AGCTTCTATC

AACGTTTACA

TAACACTTTT TTTTTTTTTCAATATTTTTC ATAAATTAGA CAATTTCTTT6000 AACTAGTTTC

TTTATTCTTT

ATATTAAAAT

TATAATATTT

GTAATATATT

AAATCAGGTC

TGTAACCCCA

AGTGAAACGT

ACACTTGTCA

TACAGAAATT

ACTTGATGTT

CTAAAACATA

GTTTTCATAT

AGAGACGCAC

AAGTATAGAG

TTCATAGTTA

CTACCCATGG

AACGATTCAA

TCCAGCTGAT

TATTrCTTCT

GATAAATAAT GTGTTTGyGCCATGTAAATC AATTGTTTCG CAATAGAGCT7200 TATCTCTTGG

TTGAAACCAC

AATCCGTTAA

CAAAACAATC

CTGAATGACA

TTTGGCTATT

TAGAGATCAA TCATGGGAGACCTCCAACAA ATTTGCTTCC CTGAGACGhT7560 ATTTGATATT

TCTTCATCAT

CGATAGTCTT

ACTCCGTTCT

TAAAATAGTG

TTTTT'TATGG AAAATGTTAC CTCAAACTCA TGGGCATCAA7860 TAATCATCTA CATGGATAAT

AATTCTGTTT

TTCCTCTATT

TGTCTAATCT ATCTGGTGTC ATACAAGGAA TCGCAACTTTAAATCCTTCTCCTTTACCAC8640 ATGCTTGACA AGGGAATCCC CCACAGATGA CATCGACTTTCCCTCTAAGTTTTTTAAATT9120 CTCATCATCA

ATCCCTCCTT

ACTGGTCTTG

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8148 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

TTATCGTGGA

CGATAAAACG

CGCCTACAAT

AATCAAACGC

CAAGCAGCTC

TTCTAGGATT

AGCGTCGTTA

GGAGGAGTCT

GTAAAACTGA

ACATCTAGTT

CAAACTCACC

TGAGAGTTTT

TGCTCTGGCT

GGATATTGGT

TTTAGTATAT

TTGAATAGTG

CTGACCATAA

CTAAACGGGA

CATGAAACAA

TTAATCGGTG

ATTGAGGATA

ATGCTTCTCT TGGGAGTAGA CGGCTTTATT ATTCAGCCGACCTCTAATTT CCGAAAATAT1380 TGATGGTTAC GACTAGGAAT ATTGAAAATT TCCATTGGAC AGGGTTGGTT AAAAG'i'TGTG2220 ACACGATTTG

' ATGAAAATAT TAGAATAGCG GAGTAAGATA TGAAGTGGAC ATCCGTTATG4920 AAAAAGAGTA

AGTTGGAAAA

TGCGGTCATT

GGTTCATCAA

AGTCCCGATT

TAAGTCAGCA

GGTCTTTCAT

TAATCTCAAG

GTCTATCAAA

CAATCGTAGT

TGAAAATGAA

CAGTATTCTT

TGTACCCGGA

CATTTCCAGC

AAAAGGAAAG

AACGCAACGA

GCGTGATATC

GGCTGCTGCT

AAAAATTGCA

AACGATGACT

GCAAAAGATG

GACCCGTTTC

CATTGCGCAA

CTTTACAGCT

TGATCCAAAT

TAAAGATGCA

TGGAACCATG

AGCCCTCAAG

CTCAGATTGCTGACGAGAAA AATGGTGGTT GTTAACCGAC TATATTTTCT6600 ~
ATCTAGTCGG

CTGATTTTAT

GAGAATTTGC

AAACAACAGC

TCAAGGATAT

TCGTTGTGGG

CCCCGAACCA

GTTGGACAAA

AAGGTTCGGG

TTAAAAAAAT

GTGACATTTT

CAAATTACAG

CCTACAATGG

CTATTTAGTA

GGCTTGGTCG

AATCCTAAGC

GAGCGCGGTG

ATTTTCTTCG

GTTGACGGTT

TATGTGCAAG

GGTTTCTTCA

GCCCTAGGTG

ATTATCGGAA

AAACTGACAG

GGGGGATTGT

GGAGTGGGAC

CACCCTGATG

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9909 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

CTTTTAAGTT

TTTTTAGGAG

TCTGCGATTT

CCTGTTTCCT

CTAGATAGGC

GATTCTTCTA

TTTTGTTTAG

GAGATAATAG

AGATAATAGT

TGATACTCAT

TCATCTGAAA

GATAATGGAA

CCACAAGTTA

TTCGATTCTA

CTTATTTTTG

TTGGCAAAGT

CCTAACAATT

CGTAnGCCAA ACAATTGATT TCTTTGTCGT TCGATCTTTT TTAATAAGTC10S0 AAAAGAATTT

GTCATCAGCA

TTTACTTGAA

CGACTAATTC

ATTGACCAAC

ACCAAAGCCT TTCCGTGTCG TCTTGGGTCT TCCAAAACAT ATAGTTZ'GTA1380 TGGTTTGTAA

TGATAAAACG

ACTAA'I"M'AT ACAAATTATT CATCCTTCAA GCCTAAATCA CTTCCCAAGT1500 TGCATCATTT

_ AATGGGTTCA ACTCCTTTTT CCAAGTCTTC TAAATACTCT AATCTGCCAC1560 TGATAGGCTA

CAGTTGTCTTCATTCTGACAGTGATTTTACCCCCATC'I'GGCGAATACTTAATAGCATTAT2940 CTCGAAGGTT CAAATATTCG CCATTGATATCTTGGGAATCTAGCAACAAT TCTGGACTTT3480 TTCTATAAAG GGGAAATGCC AAAAACCTGCCAAGAGCTTTTCGCTTTCAT TTTTTTCAAG5100 ACTCAAATTT

GTCCAGACTA

TAGAAAGGAA

AGTCGGCAGA

CTCAGGCTGC

TCCACTTGGA

AAAAAGCAGA

ACGGATTTGC

AACTAATAGC

TGAAAACATT

TGGAGACCAC

AACTATCGAT

TCGTGATACA

CCATTGTTCA

CACTTGGATT

CAAGCTACCT

TCGTCCCACG

AGACTGGTCT

GTTCcTTGAC AAATACTATC TTTGTCCTTGCTTCCTATTT GGAAATGTTT9780 GAGGAATCTT

CCGTTATCTC

TTGTTCCACG

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1126 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

TCCAGCAGAC TTCCCAGCTT TAGAAAATGG TTTTtCTCAA GCAACT 1126

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2520 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CCGGCAACAA AAAAGAAAAA ATCAACAGTT F~AAAAAAATC TAGTCATCGT GGAGTCGCCT 60 AAATCACCAAGGATGCAGTC AAAAATGCTT TCGTAAGATCGATATGGACT420 ' TTAAAGAACC

TGGATCGCTT ~

TGTCAGCAGG

TCAATGCCTT

AACAATTTCA

ACGAAGTCAA

ATAAGAAAGA

ATGCTGCCAA

ATGAAGGAAT

CGACTCGTAT

GTAGCAAGTA

ATGAGGCTAT

ACAAGGATCA

CAGCGGCCGT

CCAATGGTAG

ATAAGATGTT

AGCAACATTT

AGGAAAATGG

GTTATTATGT

ATAAGCTCAT

AAGGTAAACT

TTTACAAACC

TTAAGGATGA

TTGGTCGTTT

CAATCGTGAA

GAAAAACCAA

CCTCTTGGGA

AAAAAGTCCG

AGATGGCTCT

TTGTCCTTTC

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20993 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TTGGGATACC

CTACGATATC

TCTTGTCCTG

TCAATTCCTT

TTTCTCGACC

CAGTTTCTTG

AACGAATCTC TGTAATCCGAATACCTGGCG ATAGATACTGGCGTTAGcTA420 TATCATCGAT

ACTCCTCATC

CTAACATACG

TGTCCAACTT

CTGGACGAGC

AATCACGATA

GATTTCCAAA

TTTCACTGAC

ACGCTTCGTA

CTTTTTCTGC

CAATCTCAAC

ATCACCTTGA

GGATAAATCG

CTCAATGTAT

GGATTGCTCA

ACTTCCCTTC

GATTGATTAC

CTCGAATCGG

GCAATTCTTC

CTTTTTCAAC

CTTCTGCAAT

CTACAGCTTG

ACCCTGTTGG

AGATTACTTT

TTTTTCACCT

GCCTCCACCG

TGAGATAGAG

CATGGCTAAC

CTTAGCCTCT

GACCTCACGA

TCCGCGCGTT

ATTTTTAGTA

ATTTTTCTTA

ACCACTTTCG

GTCAATAACA

CTTTGAATGG

AACAGACAAC

TGGAGACATT

CAACTGCATA

AAAAACCTGA

AGTCCGTGTA

ATCCCATACT TAAGGTCAAG GGCAACTGTC TCTGTTTCGACTCTTCTCTG AAAGCATCAA3000 CGATTTATAGTTGCTCCTGTAGTATAGATA TCATCTATAAGTAGGATTTT TTTAGGAATA4620 GTTCTTCTTC

TCATACTTCT

TGGAATTTAG

TTATCTGCCA

TCAAAAAAGA

GTGAAATTAA

TTGGGGTACT

GTGGTATTAA

CTGTCAAAGA

CTTATGCTCA

ATACAAACTT

CTATTAAAGC

TACGAGAGGT

GACATTCTCA

AAATGAAGCA

AGCTGAACGC

CTTTGATCTA TGATATATAG AAATGGTATG TACTAAAGAT ATCTTATACA7380 GATAGCGTTA

CATTACTGAA

TGCTGCAGAC

TATTGCCCTT

TATTGTTGAA

AGGGTATAAA

CCAAGCTTAT

TGCTAAGGCT

TCCAGCTAAT

TAAAGATGGA

TTCTCATGCA

ATGTGCTATT

ATTTATTCCT

CTATGATGGT ATCGTTCGTGTAACATCAGA GCACTCGGACGTGAAATTGG8160 TGACGCTCTT

CTCAGCTGCA

AGTCCTTGCC

GTAACCGTCC

TCCTAATCTG

GGCCTCTAAA

GGATAGATAT

ACATGATGGT

TAAGTCAGCT

GGCTGAGTCA

GTCTTGATAT

TCTTCTTCTT

ATAATAGAAG

AGCTTATCCA

TACCTTAAAG

GAACCCCATT

TCGAACTTTG

TCGGAGACGG

CAATAGAAAT

AAATCGATTT

TATTTGGGGA

TTTGACATCC

TATCCAAAAT

AAAGAGAAAG

TAATGAGTCT

ATCACTTAGT

TATACTGGAA

GAAGTATAGT

TAATCGAGTG

GGAAGCAATT

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8411 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
TGATGTCGTAGGTTCGAGTCCTACTGGCGG AGTAATtGATAAAAGGGaACACAGCTGTGT720 CCT'TTATTATTGTCATGATCGGGATTTCTC TTATTCCAGATCTGTACAATATCATATTTT1500 ACGCATATAC

CTCAGGGCGC AAGTCAACTA AGTGAAAAAA ATGCCACCTTGACAGGTAGT TTGGATAAAC2l60 ACGTTAGCAAGTTGATTTAA AAATGAGGCCTGATTATCCA AGGTATGTTCATTGAACTTG3840 ' AAGGGAAATAGTCCAACAAA AATCATTGGGATGGCCCCAT ACTTTGTTGTGTCAAAGGAA48d0 ' TAATGCAGTC GTTTGAATATCATTTTGTCC AATAAGT~CTGTCTTATCAT CTGGACGCAA6300 CTGCCTTGAC TTAGAAGCATTGGCAGAGAA ACGAGCAACAAAT't'CTTGCA 6540 ATTGTTTAAT

AGAAAATACA

ATCCTTTACA

CATTAAAAAA

ATTATTTTAA

CTCAAAAATC

GACCAACAAG

AATGTGGCTT

CAAAGCCAGA

GCACGTTTGG

AGCATTCCGA

AAGGCTAATT

ATTGTTCGTT

TATCCAGAAA

AAATCACTAA

ATGAGTATGT

AATATGGTTT

GATATGAAAG

CTTGAAATAT

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9064 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

AAAGTGAGCA

GGATAAATGC

ACGAAGATAT

_ GAAAAATGAG GGAGAGCAAA TAAAATAGAA AATCAAGACC AAATCGCGAA540 GGAAGAAGCA

CTATAAATCA

TCTGCAAAGT

ACATGTTTAG

ATCGAATACC

ACAAGAACCA

AAATTGCCTA

AGACTAAATA

TTTTTCATAG

TAAGAAGACA

GCATTAAAGA

TAAAGTTTGA

ATAAATTTAA

GCAAGAAGGA

ACGATAGAGA

ACAi4AAAATA

AATAGAGCCT

CTTCAAGCCC

ATTAGTTGTT

AAATCTGCCA

ACTTATATTT

GCCAATAATA

CACTACAAGA

TTACTCCTTT

AGCACCCGTT

AATCTTGAAT

., GAAATCGTAA CACCACTTTG AACAAGAGTTACTTCAACCC ATTGGCTCCGACGGAGTAAG4020 AAATCTCCCC

GAAAATTGTG

AAATCAGAAT

CACCAACCAT

CAAATGCACC

TTATAATTCC

CCCCTTCAAC

TATTCATGAT

AACAAATAGA

TCTTCTTAAA

AAAATATCTT

CTTGCAGAAA

CAATCCATCA

TTTTGCTAAC

AATAAGACAG

GAGGACATAA

GTCGCAGCAG

CTTCCTCCAC

TTTGTATTCA

ATAAGTTTAG

AACGTCTTGA

CTAATCACTT

AACTTAGTTT

TTATAAACGT

CTCGATTGCT

GACCAATAAA

TGAAACAAGA

ATTTATCGTT

AAAGAAAGAC

AAAAGAACGC

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7780 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

CCTCTCTGTT

AATTTTTATC

TCCCTTTTAT

ACAACCCATT

TGGAAATAAT

AGGTCACGGA

GCGTCGTCAA

TCACGTCCCA

ACAGCTGGAA TGCGCTCTTC CCCCTTCATACCTGGGCAATGGCTACAGCGs40 CGGATGGTTG

GCCTTAAGGA

ATTTCAAATG

ATGACCTCTT

TGTGAGAGCA

TCAAATTGAG

CCGTTTCCTT

CTCCATCTAG

TTACCAGAAG

CTGAAATCGG

CCAAGACACG

CATATAGACG

CTTCATCCAG

CATAGATAGT

ATTCAACCGT

TTGTGGTCAC

AGTTATATGT TGACATGGCT
TCTCCT't"1'AGGCAGCGGTTAATTTCTTGTGTAGATAGCTT2460 ATAATAACGAGGAACCGCAC GCAAGCTATCCGTTGTCATAAAGGTTACGGTCGGCAAAAT3720 TTTCCAACATGGATCGTGCC AAATTCATCTGCCGCTACTTCAACCAAGGGTTGCAAGGCA4980 CGATCAATCC

AGAGCAAGCT

GAATTTTTTA

TTGATTATTT

GATAAAATCT

TTTTTCCCCT

AAGGTATCTC

ATTTTATGAG

CTTTGAAAAT

CTTTGATTTT

CCTACTAAAA

CATTTGAATC

AAGCCCAAGC

AGCTGAACGG

TGTTCGACCT

TTCTTAAAAG

AAAAGTTCTG

ACCTTCAAAT

CTCTTACTTG

AACTGACCCT

CCAGCTACCA

AACCAATTGA

TGACCACCTT

CATTGACTCG

CCTTCACCAG

ACCAAGCGGT

TTTTCAACCA

GAAATTTCAT TGGAAnCAAG TAGCCCCTCC AGGCTGCCAGTTGAGTTGAT6660 CCTGCTAGAT

ACCTTCATAC

AGGTTCTTGC ATGCTCAGGC

ATAGTGGCCC ATCAGAAAAG

GGGATTGACC CTCCTCAAAA

CTCATCTCCT AACAAATCCT

TCTTTTCCTT TTCCTCCTGC

GACCGGCTGT TTCACCAAAA

CAGACTTTCC ATAATCCCAT

GGTCAACAAA AAAACTAATG

AACACCAAGA CAGCCCCCAT

ACTATTCCCT TGGTTTAGTT

ATCATTTAGA ACCGTGGTAA

ATAAATCAGA TTCAAAATAA

TTCCTGCTGG ATCAATGGGA

CTCTTGAACC TTCAGCACAA

GGACAAAATC AACTTCTTTG

TGTTGGAGAG GTCTTCCTGC

ACCTGCTTTC TTCAAATTTA

CTTCCCCTTT

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4820 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

AGGAAATAAA GTTTAAAAAG GTGATGAAGA ACAAACCAAG ATTCAAGCAGGAATTCCTAC600

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21338 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

ACTACCAACT

CCGTGAAAAG

ACCATTATAT

CTTTTAATTT

TTTCTAAGTT

GAGGAGAATA

AAAACTCTTT

TGCGCAATTT ATGAGATACC CCTGTTGTTTAATATCATGG TAAGGAATTT480 ' CAATATACAA

AATAATCTAC

CAATTCTCTT

CCTGGTATTT

CGTAAGGATT

GAATATCGAT

ATTCACGGAC

CATCCAATAT

TCATTCAAAT

TTCTAAAATA

GCGATTTCTT

CATTTTACAT

TTCCATCTGC

CATGATACGA

CAGTGTAAAC

TAAAGCTGAG

ATGATTTGCT

GTCAATTGCC

TTCATTATAC

CTTCTGCTTC

GGTCAAAAGG

TTTAATCGTA

AAATACTTTT

GAAATAGTAA

TGGGAAATGA

AAAGTTATGA

CCTCTCATAC

TTCAATCCCT

TAGTAATGTA

TTCTT~"fCTA TTTATTCGAC CTTTACAAATCTGTGACTAA TAATTAAAAA2220 TAAACGGTAA

TCTTTTGATT TTCTGATTCA CTGGCCTTAT CTGGTGTTTT TTCATCTGATAACTCAATCA3060 .

GGCGCCAgCA AAGTAGATAT TCGATTGTCA GATGGGACTA AAGTACCTGGAGAAATTGTC3780 CAAGCTATTT

GGCCCACTGA

AATGGAGGAA

ATTATTGAAC

GTTAATTTAT

ACATCTGGTG

AAATACGATG

AGTGCTCTTT

AAAGAAGAAA

ACATCTATGT

AAATTTGAAA

TTTGATAGAG

CCGATTATTG

GGTTATGAAATCcTTGCAGG AGAGAGACGC CACTTTTAGCTGGTCTACGG4800 TATCGGGCTT

GACCAAGAGA

ATAGAAGAAG

GCAGATAAGA

CCAGAACAGA

CTAGTTGGGT

ATTTCTGTAA

ACTAATCATT

GAAATTAAAC

GAATATAGTA

AAGGTTATCC

CCCCAACCTG

TTCTTGCTAT

AAGGATTGAA

CTCGATCCAT

TTGCCACTAT

ATATTATTGT

ACTAGCTCAC

GTATCTATTC

CAAGGGGATG

CTGACCTATA

AACGCTATTG

GCCGAAAGCT

AAAAAGACCT

AAAAAi4GTCG

AAACAGATTG

CTTGTCACGC

CGTATTGCCA

CTAGAATACC

GACATCACTT

TAATTGCCAG AGTAAAAAAA ATCAAGGATA TCACTATTGA TATTGCTGCA66d0 GAAGCCATTA

ATCCAAACTG

CGCCTTCAAA

GATAATAGTC

CATGCCCATG

GAATCAATCA

IaAAAGAAAAT CAAATAATTT GTGGATAACT TTTAGTTTTT TATCTTTTTT6960 ATCCACATTT

CAGATTTCAC

AATCCATGAT

AGAGAGCTAT

CCAATGAAGG

CTCAAAAAAA

CTTCTTTCTT

TTGAACAAAA

ATAGCGAACA

CAAAATTACT

ATGATGAAGA

AGTTGACGCG

' TTACTTCACA GCTGGTGAAA CGCTTGGACTTTCAAACGTG GTATGAAGGC9480 AAGAAGTTCG

TCCACTCAGA

TGAAATACGG

AATATATCGT

ATGGTGTCAA

ATGACCAAAT

AATGTTGGTT

GATAAGATAT

GTTAAACCAA

TATGGTTTGG

AAAATTCGTT

CAACATATAG

GGTATGTCAG

TTACAGTCTG

GAGAAAACAA

AATGATCAGA

GG2'1'TATCAA

ATTGTGTTAT

ATCTTGGGTG

TTGATGTCTT

TCATCTAAGA

CCCAATGCAT

GCGTTTATCC

GGCGAATTTA

TGTCGAATTG

CAATTATCGA

AGAGAAAAGG

CCTAGAAGAA

TTTATCTTTG

AATATTCTTT

AGCGCAGTAC

TTCTGATATT

AAAGGGTTTA

GGAATTTTTC

TTACACCATT

AGAGGAATAT

AAACTTAATA

GATAACTGAA

TTCAAATGCA

TCATATCCAT

TCGCGATTAT

GATTCATCTA

AAATGACGGT

TGATGATTTA

TGATGATGAT

ACTTCGTAGT

TTTAGTTGGG

AGTCAATGAT

CTATACGAAT

CTTTAGAAGT

TATTTTGATT

GATGATTATT

GAAGAAACAA

GTCTATGCTG

TGTTCAGACC

AATGGAGCGT

GGTTTCAGAA

TGCTGCTTGT AAAAGTACTA GAGATGAAAG ATAGTACAAA AAAAGAAGATGCAGCAGGAAl9100 AAACGGAATC GTTTTATGGG AGGGGTATTG ATTTTGATTA TGCTATTATTTATCTTGCCA14280 ATCTAACAAT ATTAATGTGG AAGATTTACA GCAGTTATTT TCTTACTCTG 16620 AGTCTACACA

TGGCTATGAA TTGATAAAAG AGTACCAACA GTTTCAGATT TGTAAAATCA GTCCGCAGgC16800 ATTTCAAGAC CAAGTTTCGA AATCGAATTT TAGACTATATCCGTAAACAG GAAAGTCAGA20040

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6273 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

TCGTTAAACTCGAGTGTAAA ACCTGCCTTCAACTGCCCTTCCACTTTTTTCAAGTCTGAA?80

T'CTTGAATGG CATGGATGTA TAGGTTGTGA GCATTTTTCA CTTGTTGTGACATATTCTAA3360 ATCAAAGTTG TTTGGATTT'T TCATGAAATT TACAGAAAATAGTTGACTTC CCTTTCTTCT6120

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28171 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

AGTTATTCGTCTCAAGGATAACCTAGATAACCGCATGGTC~.'ATGTTAAACGTGAAGAAGT1320 TAT'TTACATAAACACATGGAGGACTCATTCTGGATGGTCTTGGGTTTGACTCTAATAATC1680 GCGGGTGGTCTTGGAAACTTTAT'TGACAGGGTCAGTCAGGGCTTTGTTGTGGATATGTTC1Z40 CAAACCTCAGGGAATGGTTGTGCACCCGAGTGCTGGTCATACCAGT'GGAACCCTAGTAAA2160 ' TGTTCACCGT ATTGATAAGG ATACGTCAGG TCTTCTCATG ATTGCTAAAA 2280 ACGATGATGC

' TGTTCATGGA AATCTACCTA ATGATCGTGG TGTAATTGAA GCGCCGATTG 2400 GCCGGAGTGA

GATGCAACAG TTZ"rl'GATAT TCGAAAAGAC GGTTTTGTCA ATATTTCAAC 3720 ATCCTACAAG

. GATTTACCGT GACGACTGGA TTTCCATTACTCCTGAAATCCAACTACTTTTTACAGAATT5820 TATGATGCCT AATACCCCTG GCAAGGAGTGATTAGTTATGCCTTGTCTCC7500 ' CTTCTATCGG

GTGAGCTCTT

TAATCGATGC

AGGCCTTGGC

CAGCACAAAC

TATTGAAAGA

AAGCGCATGC

AAGAACTAGG

GACACAAAAA

AATGAGTTAG

AAGACCAGTG

ACGACCCGTG

CCAAGTCATA

CAGCATTTGG

CGTTTTATCG

ATTGACTGGC

GACATCGAGG

CGTTTGGATT

CTGGATAAAG

GTGGAAACTA

AACTAAAGGC

AGCTCAATCA

TAGCTAAGAG

GTTGCAAGCT

AGGTCATTAA

AAAGCCAGCA

ATTCTCTGCT

GCGATGAGGA

AGCAAGAAGA

CTCTTTTAGC

GTTTTTGGAC

AAGAAAGTTA

ATCAGGTTTT

GAGTGATTCT

AAAATGCCAT

AAGGAAAGGG

AACAATTATT

TAGAGGAAAA

TGGAAGCAGA

GTGATGGATT

GTATGTTTTG

GGGGCAGTCT

TATGACTTTT

GCGCAATACA

TGAGCACAAT

TATTGCTCAG

TAAGGCAGCT

TTCTGCCTTG

GAGAAAATCA

GATTTTTTAT

CGGTGACCGC

AGGTACTATC

TCTCATTGTT

AGAAATTCAA

TAAGATTTAC

ACAATAAAGG

GATAATATAA

CATACGGAAT

AAATAAGGTC

TAAAGAAGAA

TAATATTCCA

AAACCTTGCT

AAAGGGGCTT

ATATCAGTGG

AAATGAAAAT

GATTGATTCG TTGAGTGAGG AAGAATTATTATGAGAAAGT GGGCTGATGA11400 TGAACCGCAT

TAAGTTTATT

ATGGAAGAAG

TACCAAAGGC

GTGATCTTGC

AAAAGTACCG

TATCCTTTAT

CAAGGAAGGC

CTCAAAGCCA

ATCGAAAGAG

AGTATATTTC

AAAGATTTTA

AACTATATAG

GGACTTGGAA

ATAATGAACT

ATGATTGATA

ATAGATTGTT

TCTCTCCTTG

GCTGGATTTA

GTCAGACAAG

ATCACGACAA

CGCTTGAGCC

TAGGGCGATT

GAGTTCTTGA

a GGATTAAGTT

CTCGCACATA

CACCTTTTGC

TCGTGCGCTG

CATCATTCAT

AAATATTGAA

AATAAAATAT

TTTAGATAAT

AAAAGATTTT

CATCAAGAAA

GAAGAAAAAA

ACTTAGGTCC

GTGCTGGTTT

AACTATATGA

AGTTGGTTTG

AGGAAGAAAT

CTGCAGGTGG

TTGCCAACCA

TTATTACGGT

CTCTTAATGA

ACACTATCGA

ACCTCAATGC

TTATGGATTT CCATGAAGCTTTGGTCAATG ACCGCTTAGTTCTTTTGAAA14160 CAGAAGAAGC

TCGAAGTCAT

GTCTTGAGTA

CGGTTGTGCA

AGACCCACCT

ATGCGGAGTT

TTCTTGAGCA

CGGGTGTGGA

CTCGTCTCTT

TAGCTCATTA

GGATCATCAA

TTGCAGAGCG

ATTGCTCATG

GAGAGTGAAA

CCTTGCCGGT

GATTCAAGAA

CTGGCGTGGA

AATGCTTGGG

TGATACGCGT

ACGTGCCCTT

ACCTTACTTC

CATCCTCATG

GACAGCTGTT

CATGGACTCA

CATGAACCAA

CGTAT9'TGGTGAAAATATCG GAACAACAGT ATCGAAGAAA AGGAATAAGA15660 TTCAAATAAT

GCTAAAGAGA

GCTGGTCGTG

ACTCCTCTTA

CCATTTGACA

ATCACACCGG

ACTCGTCGTG

CGCAATATCC

ACTGAAGACG

AAACACATCG

ACAGAAAAAC

AGTTTTATTCGAAAGAAGGA AATATGAATA AAGTTTTATC GTTGGACTGA16320 ' CAAATCTTGC

TCAAACCACG CTCCTTTGAA RTGTTGGAAAACGATGCTCAGATGATTTTGACTTATTTGG16980 ACCTTCCAAATCATGTCAGTCCCTTATCGT AACCGCAGAAAACGTTTCGAGTTACGGGCG19260 TCTCAGCATT GCAGGGCTTT
AAATCAATTG

~ AGAAATGGCA TCAAGTAAGA ACTATTTGGAATTTGTTTTGGAACAATTATCAGGATTAGA20040 .

CTTGGAACTA ATATCTAAAA TAGTCACTTGAGCACCAAGACCAAGGGCGATGCGGGCA~C21180 ACTCCTCTTA

ATTCCAGAAGCGCGTGTTTT GTTGGTAACA AGTCTTCAT

TGTTTCGATA AGGACACGAT GACCACGACTTGAACACCTGCAGGTGTGAG21660 AACTAAGCTA

GGCGACACGG TTTTCGTTAT TTTTAATTTCCCGATTAACATTGAGATAAC21720 TTTTGGGATT

TGTCACATTC

AAAAATCAAG

CTAATGGAGG

AAACGGAGCG

CTTTGGATGA TGCGGAAcAA TGTTTGACTAAAGGGTAATACACGTTACAC22020 TGCCTCGGAC

CAAGAATAAC

AAAAAGCAAT

GAAGGCAGCT

AGCCAATCGT

CCTTCACGAT

TTCCCATGCA

AGTTCATTAT

ATTTTTCGAC

CAATTATCGA

TTGTAGGAGG

TGCAGGCTGA

AAGAAAAGGA

AATCGCCAGG

C'PGGTGGCTT

ATGAGGCTCT

CGGGGACAGC

AAGAACTCAA

GTGAGGCAAA

GCAAAACAAT

CTAATTTACC

CTTGCTTTGT

TCTGCTGGTA

TGGCAACAGA

ACTATCCAGA CTTGAAAGTA AATGTTTTGA AAGCTAGCCA 25200 ACATGGCAAT AAAAAATCAT

TCTTATCTCA GTTGGAAAGA

ACTGGAAGGT ATCAATAGCA

GGGGTTGGAT AGTTGGAAAA

TGAAATAAAC TAAAAATTTG

CAATATTGAG GATATAAAAT

AATCTATTGG TCTTCTTCAG

ATGATTGGAA TACGGTTTGG

TTCCGTCATG GTCTCGTTAT

ACGCTCGTTA TGAGGTCATA

AAAAGGAGGG CTAGATATGT

CTTGCAAGAT GTGGATTTCA

TGGCTCTGGA AAGACGACCC

AAATATCGCA GCCCCTCCTT

CTTAAGTGGG ATGGACTACC

GAGGGATGAA ATCGCCTATT

TTCCTTAGGC ATGAAGCAAC

CTGGCTCATG GATGAGATTA

TAGGCTAGCA CAAATCGATA

AGAGTTGGTT GATGTCTGCG

TTAGTTTATG AAAGATGTTA

CTGGATTGTC TTAGCTTTAT

GACTGCAAAC TCACACAGCT

GGCTATCAAT GAAAATGAAG

CCAGTTTGCT AAAAATAATT

TCTGACTTTA TTAAAAGAAG

AGAGAAGAAT TATGAATTTG

GGTTGACCGC GAACGGAAGA

TTTGGAGTTT CCGACCCACG

AAGTTTGTTT GTGGTTGCTA

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7147 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

ATTGACTATT GTACAAGTCA

CTTGCTCGAT GATATTTCCA

ACAAGCGGTG GGCAATGACA

GGATCAATTC CTCTGTCCGT

GTGCATCCTT AAGAAGGGCA

TCACGGTGTC ATCCAAGATG

TTCCCACAGC CTTACTAGCT

TGAGATTGTC TCGAATAGTT

CATCATGCAC TTCTGAACGC

CCTTATCAAT CTCATAGAAT

TCGGCCCAAC AATGGCAACC

GAACATTGAC ACCGTCCACC

GATTGACCAG AGTTGATTTA

TTTCTGCTTT AAAGCTAACA

TGTTCAATAA CTGCCTCCGA ATTTGCCGCA TAGCGgAAGG 1020 TCACATCCTT AAACTCGACC

GGTTTTGGAT AGAAGAATGC

TTCGGGGAAG AACGATGAAG

CATAAGACAT GAAAACAATC

CGTTAATCAC ATAGGCCCCA

TCATGATAGG ATTCAAAATA

CATCATTTAC TGCTGCAAAT

CACGAATACC TGTTAAACTC

TCAAGGACTG TTTTGGAAAG

TCACTGCCAC AAGTACGGCC

CCCAGATAGC CATAATTGAA

GAACTTGAGT AATGTCATTG

TCTCTGTCTG CGAGTAATCC

AAGAAGCCGC CACTCGGGAT

AGGACATTCC CATCATCATG

TACCTAGCAA ATCCGTAATT

AAAGAGAATG

CTTTATTCTC

AAAATCAGTA

_ ATTCATCTTC ATCAATGTCTTCCATCAACTGCTTGTCTAT GCGTTCAAAA 2160 AAAGCCTTAA

CGCTTATCAA

AGATTACTAG

AAGTCTTGGT

ATAAATTCAA

AAACGCTTAA

CTCTGGAGAA

ATTTTCTCAC

ATCTTTTGGA

TCTTCGCCTG

TCTTCAATAC

CAGACTCACG GAGGGCAACGATAGCCTTGTGAAGGTCAGT TGGCGCTGTG 2820 TAAACTGTGA

AATTGCTCAA

AAGAGGTAAG

AAACAGAACC TGAAGCGCCCATGTTTCCGCCGTTTT2'ACC AAAGGCTGCA 3000 CGGACATTGG

CCATTTGGCC

GCTTTATCAA

ATAACGAATT

TAGATTTCTA

TTGGCTACGA

CTTTCTTATT

ATAACACAAG TTTTTTTGATTT1'CACTAGAGGAAATGGAT TTTATTAGCA 3420 AATCAAGCTA

AGGCACTCAT

CCACATTCAA AAAACAAACTAGACCATTATCTGCAAATAG AAAG M'TCAG 3590 CCAAGTTTGA

ACAATCATAC

ACTTTGAAAT

ATAAACTTTCAGATATCCGCAGAGAGATCATCGCCTCTTT TTGTCGCAAGCATTCTCCTC42d0 CCAGGATAGAGGCGACTGTCGTTGGTAGCTGTTACAGAAA TATCACTTGTATTTGTCGAC5460 ACCCTCTTTG

ATCGTAATTC

" ATAGCTAGTA TAAAGTCATT TACTGCTTTA TTTGCCATCT TCTACCTCCT5640 AATAAGTTCC

GGACTAAGTA

CTAAATAAGA

AGGTTGGTCT

CCAAACGTCT

TAGTCATGCT

TAGTCGGAAA

CCCTTTTCTC

TCCAAAGTTT

TGCTTGTCGC

ATCAGATGTT

TGATAAAAAT

AGAAAAAGGG

AAGTTTTTTA

AAACCTTTCG

AACATTTGAT

GCTTCTTTTT

TCCTACTCTT

TCCCACTGGT

GCCTTGGT~T

AGGAGTTGCG

ATCCCCTCAC

CCTGGCATAG

TTTAATACCT

ACGTCTAATC

TACTTTGATA

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 755 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

CTCTTTGACC

ACGTCTATAT

TTGTCATTGC

TTATCTGTGG

AGAACTTCCT

GGATTGCGCC

TCGACCGTAT

TCATCGGAGC

AAATCTGATC

TCAAGAAATC

AGATGGTCAC

CACCAGCTCC

TTtcCCAACA AGGGaAtCAA GGTcACAGTC 755 GTCAC

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3010 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

CTACATCACA

AAGTTAAGGT

TGTGCTTCAT

TAAAACAAAT

GTGGTGTTAT

CGGCAGCTGT

CCCGCATGAG

TGGCTAAGGT

ATTATATCGA

AAGAATTCCA

TCGCTGAAGG

AAGCTGTTCG

AGGACGAGCT

TTCATGAACA

CAGATGCTGC

TCAAGTCAGG

GTAATCCTCA

ATGAAAATGA

TGGCCTTGCA

GTGTAGAACT

TTTTGCCTGG

TTCCCATCCG

TTTTGCTGGC

TGGTCGAGCG

AGGGAGTTGG

AGGGTGTAGA

TGTTGGTAAG

TCAATATGTG

CAATAGCGAT

TTTCTTGGCA TAAAATCCAG

CCAGAAAACG GGTGTCGTTT

CATAATCAGG TAAAGAGCAA

GAGAATACCA AAGATGGTCG

ATTCATCAAG GTCAAGACAA

ATGGTCAATG ATTCGCAAAA

GGTGGATGAT AGGAACATGA

TAAAAATCCG TGTGCTTCAT

GACTGCACCC ACAGCATGGG

GCTGAGTTTA AGACTAGTGT

AGAGTTTGAT GATAGAGTTT

TCAATTCCTT GGTTCATGTA

GTTTGTAGAG TATTAAGTGT

GTGATAGCAA TCAAACGGGC

AGGTAACCAT TTTTCACATA

GGAGAGATAG GGGCGCAGAC

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15213 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GGTGCTTGTA

CCTCCAAGAT

CCATCCTTAT

CCGTGTTGAT

TTTGGATTAT

TTATAAGCAT

ATACGATACC

TCAACAAAAT

GCTCCAAATC

AGGAGAGGCA

CCGATAGAAT

TTTTCTTCGA

GGTAAAAAGA

CTGGCAAGAC

GATTTACCCA

AAATGGTACA

TCCAGATAGG

TAACCTTCAG

AAACTGTCTT

TGCTTGACGT

TTTTGTAAGT

GGCGTATAGA

TGGTATTTGA

GCAGTCTGTT

GACAAGAGTT

ATGGTTGCCT

TTAGCTTGAT

GGCTCCACTG

GCATCTAGCA

TATAAAGGATTTTATCATTTTTTCTTTCCTCTGATATTGA TGCTACTGGTAGGTATACAT3180 TCCAGCTTCC TTGTGCTGAT

GGACTTGTTC TGGTGCGATT

GATGATCTGA CAGGTATTCA

AGAGTCTAGA AAATCGAGCT

AAGTCTGCTT CTGTTTTCTT

GCTCTTTTTG GTATTGTTTG

TTTTTTCCCA CTTGCGTTCT

AAGCACGCTC TGCGGGTCCC

TATCCCTCTT CTTGCGTTCT

TCTCCTTGCC TAGCTTGACA

TGATACACTT TTCAAGGACT

AACTTCCTCC CTGAAAGACT

TGCCACGATT GGGTTTGAAA

AGTTACTTTT ATTGACCTTG

CCTTTCTGAG CAGTTTTTCT

GACGAACACA GTCGCTACCA

CATAAGCGTA TTTGATGGCA

TAAAGGAAAC TTCATTCCAT

GTAAAACTGC ATCGTGCAGG

ACTCAATAAA AATCAAAGAG

TTGAGGTTGT AGATAGAACT

GACGAAGTCA GCtCAAAACA CTGTTTTGAG GTTGTGGATA 7020 GAACTGACGA AGTCAgTAAC

TTTTCAAAGA GTATAAGTTA

AAGTATTTTT CAATATTTTC

AACTGACCAC GATAGCGGTC

TCGAACAGAA CAATTTTGTT

AAGTCTTCCT GACTCTTTTG

TGGTCGGTAT TAGCAAGAAT

TCAAGATTGA TCTTGTCTCT

AGGCTAGCAA GGGTTAGTTG

VOLUMINOUS APPLICATIONS OR PATENTS

THIS APPLICATION OR PATENT COMPRISES MORE THAN ONE VOLUME.
THIS IS VOLUME ~ OF
NOTE: For additional volumes, please contact the Canadian Patent Office.

Claims (20)

What Is Claimed Is:
1. Computer readable medium having recorded thereon the nucleotide sequence depicted in SEQ ID NOS:1-391, a representative fragment thereof or a nucleotide sequence at least 95% identical to a nucleotide sequence depicted in SEQ ID NOS:1-391.
2. Computer readable medium having recorded thereon any one of the fragments of SEQ ID NOS:1-391 depicted in Tables 2 and 3 or a degenerate variant thereof.
3. The computer readable medium of claim 1, wherein said medium is selected from the group consisting of a floppy disc, a hard disc, random access memory (RAM), read only memory (ROM), and CD-ROM.
4. The computer readable medium of claim 3, wherein said medium is selected from the group consisting of a floppy disc, a hard disc, random access memory (RAM), read only memory (ROM), and CD-ROM.
5. A computer-based system for identifying fragments of the Streptococcus pneumoniae genome of commercial importance comprising the following elements:
a) a data storage means comprising the nucleotide sequence of SEQ ID NOS:1-391, a representative fragment thereof, or a nucleotide sequence at least 95% identical to a nucleotide sequence of SEQ ID NOS:1-391;
b) search means for comparing a target sequence to the nucleotide sequence of the data storage means of step (a) to identify homologous sequence(s), and
c) retrieval means for obtaining said homologous sequence(s) of step (b).
6. A method for identifying commercially important nucleic acid fragments of the Streptococcus pneumoniae genome comprising the step of comparing a database comprising the nucleotide sequences depicted in SEQ ID NOS:1-391, a representative fragment thereof, or a nucleotide sequence at least 95% identical to a nucleotide sequence of SEQ ID NOS:1-391 with a target sequence to obtain a nucleic acid molecule comprised of a complementary nucleotide sequence to said target sequence, wherein said target sequence is not randomly selected.
7. A method for identifying an expression modulating fragment of Streptococcus pneumoniae genome comprising the step of comparing a database comprising the nucleotide sequences depicted in SEQ ID NOS:1-391, a representative fragment thereof, or a nucleotide sequence at least 95% identical to the nucleotide sequence of SEQ ID NOS:1-391 with a target sequence to obtain a nucleic acid molecule comprised of a complementary nucleotide sequence to said target sequence, wherein said target sequence comprises sequences known to regulate gene expression.
8. An isolated protein-encoding nucleic acid fragment of the Streptococcus pneumoniae genome, wherein said fragment consists of the nucleotide sequence of any one of the fragments of SEQ ID NOS:1-391 depicted in Tables 2 and 3, or a degenerate variant thereof.
9. A vector comprising any one of the fragments of the Streptococcus pneumoniae genome SEQ ID NOS:1-391 depicted in Tables 2 and 3 or a degenerate variant thereof.
10. An isolated fragment of the Streptococcus pneumoniae genome, wherein said fragment modulates the expression of an operably linked open reading frame, wherein said fragment consists of the nucleotide sequence from about 10 to 200 bases in length which is 5' to any one of the open reading frames depicted in Tables 2 and 3 or a degenerate variant thereof.
11. A vector comprising any one of the fragments of the Streptococcus pneumoniae genome of claim 8.
12. An organism which has been altered to contain any one of the fragments of the Streptococcus pneumoniae genome of claim 8.
13. An organism which has been altered to contain any one of the fragments of the Streptococcus pneumoniae genome of claim 10.
14. A method for regulating the expression of a nucleic acid molecule comprising the step of covalently attaching to said nucleic acid molecule a nucleic acid molecule consisting of the nucleotide sequence from about 10 to 100 bases 5' to any one of the fragments of the Streptococcus pneumoniae genome depicted in SEQ ID NOS:1-391 and Tables 2 and 3 or a degenerate variant thereof.
15. An isolated nucleic acid molecule encoding a homolog of any of the fragments of the Streptococcus pneumoniae genome of SEQ ID NOS:1-391 and Tables 2 and 3, wherein said nucleic acid molecule is produced by a process comprising steps of:
a) screening a genomic DNA library using as a probe a target sequence defined by any of SEQ ID NOS:1-391 and Tables 2 and 3, including fragments thereof;
b) identifying members of said library which contain sequences that hybridize to said target sequence; and
c) isolating the nucleic acid molecules from said members identified in step (b).
16. An isolated DNA molecule encoding a homolog of any one of the fragments of the Streptococcus pneumoniae genome of SEQ ID NOS:1-391 and Tables 2 and 3, wherein said nucleic acid molecule is produced by a process comprising steps of:
a) isolating mRNA, DNA, or cDNA produced from an organism;
b) amplifying nucleic acid molecules whose nucleotide sequence is homologous to amplification primers derived from said fragment of said Streptococcus pneumoniae genome to prime said amplification;
c) isolating said amplified sequences produced in step (b).
17. An isolated polypeptide encoded by any of the fragments of the Streptococcus pneumoniae genome of SEQ ID NOS:1-391 and depicted in Tables 2 and 3 or by a degenerate variant of said fragments.
18. An isolated polynucleotide molecule encoding any one of the polypeptides of claim 17.
19. An antibody which selectively binds to any one of the polypeptides of claim 17.
20. A method for producing a polypeptide in a host cell comprising the steps of:
a) incubating a host containing a heterologous nucleic acid molecule whose nucleotide sequence consists of any one of the fragments of the Streptococcus pneumoniae genome of SEQ ID NOS:1-391 and depicted in Tables 2 and 3, under conditions where said heterologous nucleic acid molecule is expressed to produce said protein, and
b) isolating said protein.
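Illustrative note (not part of the claims): the three elements recited in claim 5 (data storage means, search means, retrieval means) and the "at least 95% identical" threshold of claim 1 can be pictured with the minimal Python sketch below. Everything in it is an assumption made for illustration: the names `TOY_DATABASE`, `percent_identity`, `search` and `retrieve` are hypothetical, the stored entries are short invented sequences rather than SEQ ID NOS:1-391, and the comparison is a simple ungapped, single-strand identity count, not the search algorithm the applicants may have used.

```python
from typing import Dict, List, Tuple

# Hypothetical "data storage means": invented toy sequences, NOT SEQ ID NOS:1-391.
TOY_DATABASE: Dict[str, str] = {
    "toy_1": "ATTGACTATTGTACAAGTCACTTGCTCGAT",
    "toy_2": "GGGGCCCCAAAATTTTGGGGCCCCAAAATT",
}


def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def search(database: Dict[str, str], target: str,
           min_identity: float = 95.0) -> List[Tuple[str, int, float]]:
    """Hypothetical "search means": slide the target along every stored
    sequence (ungapped, forward strand only) and report each window whose
    identity meets the threshold as (entry name, 0-based offset, identity)."""
    hits = []
    n = len(target)
    for name, seq in database.items():
        for start in range(len(seq) - n + 1):
            identity = percent_identity(seq[start:start + n], target)
            if identity >= min_identity:
                hits.append((name, start, identity))
    return hits


def retrieve(database: Dict[str, str],
             hits: List[Tuple[str, int, float]],
             target_length: int) -> List[str]:
    """Hypothetical "retrieval means": return the stored subsequence for each hit."""
    return [database[name][start:start + target_length]
            for name, start, _ in hits]


if __name__ == "__main__":
    # A 20-base target that differs from toy_1 at one position.
    target = "TTGTACAAGTCGCTTGCTCG"
    found = search(TOY_DATABASE, target, min_identity=95.0)
    print(found)
    print(retrieve(TOY_DATABASE, found, len(target)))
```

A practical system would more likely rely on a gapped alignment tool and would also examine the reverse complement; the sliding-window count is used here only because it keeps the identity threshold easy to follow.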
CA002271720A 1996-10-31 1997-10-30 Streptococcus pneumoniae polynucleotides and sequences Abandoned CA2271720A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US2996096P 1996-10-31 1996-10-31
US60/029,960 1996-10-31
PCT/US1997/019588 WO1998018931A2 (en) 1996-10-31 1997-10-30 Streptococcus pneumoniae polynucleotides and sequences

Publications (1)

Publication Number Publication Date
CA2271720A1 (en) 1998-05-07

Family

ID=21851789

Family Applications (2)

Application Number Title Priority Date Filing Date
CA002271720A Abandoned CA2271720A1 (en) 1996-10-31 1997-10-30 Streptococcus pneumoniae polynucleotides and sequences
CA002269663A Abandoned CA2269663A1 (en) 1996-10-31 1997-10-30 Streptococcus pneumoniae antigens and vaccines

Family Applications After (1)

Application Number Title Priority Date Filing Date
CA002269663A Abandoned CA2269663A1 (en) 1996-10-31 1997-10-30 Streptococcus pneumoniae antigens and vaccines

Country Status (11)

Country Link
US (9) US6420135B1 (en)
EP (4) EP1770164B1 (en)
JP (6) JP4469026B2 (en)
AT (2) ATE479756T1 (en)
AU (6) AU6909098A (en)
CA (2) CA2271720A1 (en)
DE (2) DE69737125T3 (en)
DK (1) DK0942983T4 (en)
ES (2) ES2350491T3 (en)
PT (1) PT942983E (en)
WO (2) WO1998018931A2 (en)

Families Citing this family (335)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7078042B2 (en) * 1995-09-15 2006-07-18 Uab Research Foundation Pneumococcal surface protein C (PspC), epitopic regions and strain selection thereof, and uses therefor
EP0885297A4 (en) * 1996-02-20 1999-11-24 Smithkline Beecham Corp Novel era
GB9608000D0 (en) * 1996-04-18 1996-06-19 Smithkline Beecham Plc Novel compounds
US5968777A (en) * 1996-04-18 1999-10-19 Smithkline Beecham Corporation DNA encoding seryl t-RNA synthetase
GB9607991D0 (en) * 1996-04-18 1996-06-19 Smithkline Beecham Plc Novel compounds
US5795758A (en) * 1997-04-18 1998-08-18 Smithkline Beecham Corporation DNA encoding histidyl tRNA synthetase variant from Streptococcus pneumoniae
GB9607993D0 (en) * 1996-04-18 1996-06-19 Smithkline Beecham Plc Novel compounds
US5834276A (en) * 1996-04-18 1998-11-10 Smithkline Beecham Corporation Asparaginyl TRNA synthetase polynucleotides of streptococcus
US6270999B1 (en) 1997-04-10 2001-08-07 Smithkline Beecham Corporation Compounds
JP2000509984A (en) * 1996-05-01 2000-08-08 スミスクライン・ビーチャム・コーポレイション New compound
EP0914330A4 (en) * 1996-05-14 2002-01-09 Smithkline Beecham Corp Novel compounds
US6165989A (en) 1996-05-14 2000-12-26 Smithkline Beecham Corporation Era of Streptococcus pneumoniae
US5712108A (en) * 1996-05-29 1998-01-27 Eli Lilly And Company Peptidoglycan biosynthetic mure protein from streptocuccus pneumoniae
US5681694A (en) * 1996-06-18 1997-10-28 Eli Lilly And Company Murd protein method and kit for identification of inhibitors
GB9613907D0 (en) 1996-07-03 1996-09-04 Smithkline Beecham Plc Novel compounds
EP0816502A3 (en) * 1996-07-03 1999-10-20 Smithkline Beecham Corporation Mur A-1, an UDP-N-acetylglucosamine enolpyruvyl-transferase from Streptococcus pneumoniae
US5776733A (en) * 1996-07-25 1998-07-07 Eli Lilly And Company Biosynthetic gene DD1 from Streptococcus pneumoniae
US5691161A (en) * 1996-08-01 1997-11-25 Eli Lilly And Company Peptidoglycan biosynthetic mura protein from Streptococcus pneumoniae
EP0956289A4 (en) * 1996-08-16 2004-10-13 Smithkline Beecham Corp Novel prokaryotic polynucleotides, polypeptides and their uses
US5858718A (en) * 1997-07-08 1999-01-12 Smithkline Beecham Corporation Pcr a
US6284878B1 (en) 1996-08-16 2001-09-04 Smithkline Beecham Corporation def1
US6165762A (en) * 1996-08-16 2000-12-26 Smithkline Beecham Corporation DNA encoding adenine phosphoribosyltransferase from Streptococcus pneumoniae
GB9619071D0 (en) * 1996-09-12 1996-10-23 Smithkline Beecham Plc Novel compounds
US5882896A (en) 1996-09-24 1999-03-16 Smithkline Beecham Corporation M protein
US5928895A (en) * 1996-09-24 1999-07-27 Smithkline Beecham Corporation IgA Fc binding protein
US5882871A (en) * 1996-09-24 1999-03-16 Smithkline Beecham Corporation Saliva binding protein
US6225083B1 (en) * 1996-10-01 2001-05-01 Smithkline Beecham Corporation FtsL from Streptococcus pneumoniae
US5910414A (en) 1996-10-15 1999-06-08 Smithkline Beecham Corporation Topoisomerase I of streptococcus pneumoniae
US5789202A (en) * 1996-10-17 1998-08-04 Eli Lilly And Company DNA encoding a novel penicillin binding protein from streptococcus pneumoniae
US6096518A (en) 1996-10-24 2000-08-01 Smithkline Beecham Corporation DNA encoding SPO/REL polypeptides of streptococcus
US6022710A (en) * 1996-10-25 2000-02-08 Smithkline Beecham Corporation Nucleic acid encoding greA from Streptococcus pneumoniae
EP1770164B1 (en) 1996-10-31 2010-09-01 Human Genome Sciences, Inc. Streptococcus pneumoniae antigens and vaccines
US6887663B1 (en) * 1996-10-31 2005-05-03 Human Genome Sciences, Inc. Streptococcus pneumoniae SP036 polynucleotides
US5821335A (en) * 1996-11-19 1998-10-13 Eli Lilly And Company Biosynthetic gene murg from streptococcus pneumoniae
US5786197A (en) * 1996-11-25 1998-07-28 Smithkline Beecham Corporation lep
US6287803B1 (en) * 1996-11-27 2001-09-11 Smithkline Beecham Corporation Polynucleotides encoding a novel era polypeptide
US5948645A (en) * 1996-12-04 1999-09-07 Eli Lilly And Company Biosynthetic gene muri from Streptococcus pneumoniae
EP0854188A3 (en) * 1997-01-21 2000-02-09 Smithkline Beecham Streptococcus pneumoniae aroE polypeptides and polynucleotides
EP0863205A1 (en) * 1997-02-10 1998-09-09 Smithkline Beecham Corporation Def2 protein from Streptococcus pneumoniae
EP0863152A3 (en) * 1997-02-10 2000-04-19 Smithkline Beecham Corporation Streptococcus pneumoniae Def1 protein
US6228838B1 (en) * 1997-02-28 2001-05-08 Smithkline Beecham Corporation LicD1 polypeptides
CA2230497A1 (en) * 1997-02-28 1998-08-28 Smithkline Beecham Corporation Novel lica
US6110899A (en) * 1997-02-28 2000-08-29 Smithkline Beecham Corporation LICC of Streptococcus pneumoniae
US5962295A (en) * 1997-02-28 1999-10-05 Smithkline Beecham Corporation LicB polypeptides from Streptococcus pneumoniae
DE19708537A1 (en) * 1997-03-03 1998-09-10 Biotechnolog Forschung Gmbh New surface protein (SpsA protein) from Streptococcus pneumoniae etc.
US6210940B1 (en) 1997-04-18 2001-04-03 Smithkline Beecham Corporation Compounds
US6074858A (en) * 1997-04-18 2000-06-13 Smithkline Beecham Corporation Cysteinyl TRNA synthetase from Streptococcus pneumoniae
US6676943B1 (en) 1997-04-24 2004-01-13 Regents Of The University Of Minnesota Human complement C3-degrading protein from Streptococcus pneumoniae
CA2283755A1 (en) * 1997-04-24 1998-10-29 Margaret K. Hostetter Human complement c3-degrading proteinase from streptococcus pneumoniae
US5919664A (en) * 1997-05-29 1999-07-06 Smithkline Beecham Corporation Peptide release factor, prfC (RF-3), a GTP-Binding protein
US6165992A (en) * 1997-05-30 2000-12-26 Smithkline Beecham Corporation Histidine kinase
US6287836B1 (en) * 1997-05-30 2001-09-11 Smithkline Beecham Corporation Histidine kinase from Streptococcus pneumoniae and compositions therecontaining
US6165991A (en) * 1997-05-30 2000-12-26 Smithkline Beecham Corporation Sensor histidine kinase of Streptococcus pneumoniae
EP0881297A3 (en) * 1997-05-30 2002-05-08 Smithkline Beecham Corporation Histidine kinase
US5866365A (en) * 1997-06-05 1999-02-02 Smithkline Beecham Corporation RNC polynucleotides
US5882889A (en) * 1997-06-13 1999-03-16 Smithkline Beecham Corporation Response regulator in a two component signal transduction system
EP0891984A3 (en) 1997-06-20 2000-01-19 Smithkline Beecham Corporation Nucleic acid encoding streptococcus pneumoniae response regulator
EP0885965A3 (en) * 1997-06-20 2000-01-12 Smithkline Beecham Corporation Histidine kinase polypeptides
EP0885903A3 (en) * 1997-06-20 2000-01-19 Smithkline Beecham Corporation Nucleic acid encoding streptococcus pheumoniae response regulator
EP0885966A3 (en) * 1997-06-20 1999-12-29 Smithkline Beecham Corporation Novel compouds
US6140061A (en) * 1997-06-20 2000-10-31 Smithkline Beecham Corporation Response regulator
US6270991B1 (en) 1997-06-20 2001-08-07 Smithkline Beecham Corporation Histidine kinase
US6268172B1 (en) * 1997-06-20 2001-07-31 Smithkline Beecham Corporation Compounds
EP0885963A3 (en) * 1997-06-20 1999-12-29 Smithkline Beecham Corporation Compounds comprising polypeptides and polynucleotides of Streptococcus pneumoniae,said polypeptides being related by amino acid sequence homology to SapR from Streptococcus mutans polypeptide
CA2236441A1 (en) * 1997-07-01 1999-01-01 Smithkline Beecham Corporation Gida2
US6238882B1 (en) * 1997-07-01 2001-05-29 Smithkline Beecham Corporation GidA1
US5866366A (en) * 1997-07-01 1999-02-02 Smithkline Beecham Corporation gidB
US6800744B1 (en) * 1997-07-02 2004-10-05 Genome Therapeutics Corporation Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics
US7151171B1 (en) * 1997-07-02 2006-12-19 sanofi pasteur limited/sanofi pasteur limitée Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics
EP0890641A1 (en) * 1997-07-10 1999-01-13 Smithkline Beecham Corporation Novel IspA
EP0915159A1 (en) * 1997-07-17 1999-05-12 Smithkline Beecham Corporation Novel xanthine phosphoribosyl transferase
CA2237043A1 (en) * 1997-07-18 1999-01-18 Smithkline Beecham Corporation Response regulator
US5994101A (en) * 1997-07-18 1999-11-30 Smithkline Beecham Corporation DNA encoding gidA1 polypeptides
EP0894857A3 (en) * 1997-08-01 2001-09-26 Smithkline Beecham Corporation SecA gene from Streptococcus pneumoniae
US5840538A (en) * 1997-08-06 1998-11-24 Smithkline Beecham Corporation Lgt
EP0896061A3 (en) * 1997-08-08 2000-07-26 Smithkline Beecham Corporation RpoA gene from Staphylococcus aureus
US5929045A (en) 1997-08-12 1999-07-27 Smithkline Beecham Corporation Recombinant expression of polynucleotides encoding the UDP-N-acetylmuramoylalanine:D-glutamate ligase (MurD) of Streptococcus pneumoniae
US6197300B1 (en) * 1997-08-12 2001-03-06 Smithkline Beecham Corporation ftsZ
US5888770A (en) * 1997-08-26 1999-03-30 Smithkline Beecham Corporation Spoiiie
US6072032A (en) * 1997-08-29 2000-06-06 Smithkline Beecham Corporation FtsY polypeptides from Streptococcus pneumoniae
US5882898A (en) * 1997-08-29 1999-03-16 Smithkline Beecham Corporation Streptococcus pneumoniae polynucleotides which encode folyl-polyglutamate synthetase (FPGS) polypeptides
US6225087B1 (en) * 1997-09-09 2001-05-01 Smithkline Beecham Corporation Response regulator
EP0902087A3 (en) * 1997-09-10 2000-03-15 SmithKline Beecham Corporation Histidine kinase
EP0913478A3 (en) * 1997-09-17 1999-12-29 Smithkline Beecham Corporation Histidine kinase from Streptococcus pneumoniae 0100993
US5866369A (en) * 1997-09-18 1999-02-02 Smithkline Beecham Corporation Amps
US5885804A (en) * 1997-09-18 1999-03-23 Smithkline Beecham Corporation PhoH
BR9812525A (en) * 1997-09-24 2000-07-25 Univ Minnesota Human complement degrading proteinase c3 from streptococcus pneumoniae
EP0905249A3 (en) * 1997-09-25 1999-08-04 Smithkline Beecham Corporation Alcohol dehydrogenase
US6194170B1 (en) 1997-09-25 2001-02-27 Smithkline Beecham Corporation MurF of Streptococcus pneumoniae
CA2244954A1 (en) * 1997-09-25 1999-03-25 Smithkline Beecham Corporation Murf
US6121028A (en) * 1997-10-10 2000-09-19 Smithkline Beecham Corporation MurE
US6331411B1 (en) * 1997-10-14 2001-12-18 Smithkline Beecham Corporation TopA
EP0913479A3 (en) * 1997-10-27 2000-10-25 Smithkline Beecham Corporation Adenine glycosylase
US5861281A (en) * 1997-10-30 1999-01-19 Smithkline Beecham Corporation Lacc
US6165763A (en) 1997-10-30 2000-12-26 Smithkline Beecham Corporation Ornithine carbamoyltransferase
EP0913480A3 (en) * 1997-11-03 2001-07-11 Smithkline Beecham Corporation Chorismate synthase
WO1999023212A1 (en) * 1997-11-05 1999-05-14 Smithkline Beecham Corporation Novel streptococcal ers
US20020058799A1 (en) * 1998-05-18 2002-05-16 Sanjoy Biswas Novel pth
JP2002508156A (en) * 1997-12-31 2002-03-19 ストレスジェン バイオテクノロジーズ コーポレイション Streptococcus heat shock proteins of the Hsp60 family
EP1042361A1 (en) * 1997-12-31 2000-10-11 Millennium Pharmaceuticals, Inc. Essential bacterial genes and their use
CN1268745C (en) 1998-02-20 2006-08-09 生化制药有限公司 Group B streptococcus antigens
US6858706B2 (en) 1998-04-07 2005-02-22 St. Jude Children's Research Hospital Polypeptide comprising the amino acid of an N-terminal choline binding protein a truncate, vaccine derived therefrom and uses thereof
WO1999051266A2 (en) * 1998-04-07 1999-10-14 Medimmune, Inc. Derivatives of pneumococcal choline binding proteins for vaccines
HUP0102306A3 (en) * 1998-04-07 2008-04-28 St Jude Childrens Res Hospital A polypeptide comprising the amino acid of an n-terminal choline binding protein a truncate, vaccine derived therefrom and uses thereof
GB9808350D0 (en) * 1998-04-22 1998-06-17 Glaxo Group Ltd Bacterial polypeptide family
GB9808363D0 (en) * 1998-04-22 1998-06-17 Glaxo Group Ltd Bacterial polypeptide family
US20010016200A1 (en) * 1998-04-23 2001-08-23 Briles David E. Pneumococcal surface protein C (PspC), epitopic regions and strain selection thereof, and uses therefor
EP1080181A4 (en) * 1998-05-18 2001-08-08 Smithkline Beecham Corp clpX OF STREPTOCOCCUS PNEUMONIAE
WO1999061452A2 (en) * 1998-05-28 1999-12-02 Smithkline Beecham Corporation acpS
US6660520B2 (en) 1998-06-05 2003-12-09 Smithkline Beecham Corporation Nrde
US6190881B1 (en) * 1998-06-05 2001-02-20 Smithkline Beecham Corporation Ribonucleotide diphosphate reductase, nrdF, of streptococcus pneumoniae
WO1999064610A1 (en) * 1998-06-11 1999-12-16 St. Jude Children's Research Hospital ZmpB, A NEW DETERMINANT OF VIRULENCE FOR STREPTOCOCCUS PNEUMONIAE, VACCINE DERIVED THEREFROM AND USES THEREOF
JP2002519055A (en) * 1998-07-02 2002-07-02 スミスクライン・ビーチャム・コーポレイション FTSZ multimeric proteins and uses thereof
EP1100879A1 (en) * 1998-07-24 2001-05-23 SmithKline Beecham Corporation Pneumococcal nrdg protein
WO2000006736A2 (en) * 1998-07-27 2000-02-10 Microbial Technics Limited Nucleic acids and proteins from group b streptococcus
EP1624064A3 (en) * 1998-07-27 2006-05-10 Microbial Technics Limited Nucleic acids and proteins from streptococcus pneumoniae
WO2000006737A2 (en) * 1998-07-27 2000-02-10 Microbial Technics Limited Streptococcus pneumoniae proteins and nucleic acid molecules
CN1318103A (en) * 1998-07-27 2001-10-17 微生物技术有限公司 Nucleic acids and proteins from streptococcus pneumoniae
EP1801218A3 (en) * 1998-07-27 2007-10-10 Sanofi Pasteur Limited Nucleic acids and proteins from streptococcus pneumoniae
US6936252B2 (en) 1998-07-27 2005-08-30 Microbial Technics Limited Streptococcus pneumoniae proteins and nucleic acid molecules
US20030134407A1 (en) 1998-07-27 2003-07-17 Le Page Richard William Falla Nucleic acids and proteins from Streptococcus pneumoniae
EP1785486A3 (en) * 1998-07-27 2007-08-08 Sanofi Pasteur Limited Streptococcus pneumoniae proteins and nucleic acid molecules
US7098182B2 (en) 1998-07-27 2006-08-29 Microbial Technics Limited Nucleic acids and proteins from group B streptococcus
AU5822399A (en) * 1998-09-09 2000-03-27 Millennium Pharmaceuticals, Inc. Essential bacterial genes and their use
US6268177B1 (en) * 1998-09-22 2001-07-31 Smithkline Beecham Corporation Isolated nucleic acid encoding nucleotide pyrophosphorylase
US6225457B1 (en) * 1998-09-23 2001-05-01 Smithkline Beecham Corporation murF2
KR20010089280A (en) * 1998-09-24 2001-09-29 리전츠 오브 더 유니버스티 오브 미네소타 Human complement c3-degrading polypeptide from streptococcus pneumoniae
US20030082614A1 (en) * 2001-05-29 2003-05-01 Sanjoy Biswas Map
WO2000018797A1 (en) * 1998-09-28 2000-04-06 Smithkline Beecham Corporation Map
US6515119B1 (en) * 1998-09-30 2003-02-04 Millennium Pharmaceuticals, Inc. Use of S-ydcB and B-ydcB, essential bacterial genes
US6537774B1 (en) 1998-10-14 2003-03-25 Smithkline Beecham Corporation UPS (undecaprenyl diphosphate synthase
US6255075B1 (en) 1998-10-20 2001-07-03 Smithkline Beecham Corporation Bira
US6110685A (en) * 1998-10-28 2000-08-29 Smithkline Beecham Corporation infB
US6228625B1 (en) 1998-11-03 2001-05-08 Smithkline Beecham Corporation metK from Streptococcus pneumoniae
WO2000026359A1 (en) * 1998-11-04 2000-05-11 Smithkline Beecham Corporation ftsX
US6165764A (en) 1998-11-09 2000-12-26 Smithkline Beecham Corporation Polynucleotides encoding tRNA methyl transferases from Streptococcus pneumoniae
WO2000029434A2 (en) * 1998-11-19 2000-05-25 St. Jude Children's Research Hospital PNEUMOCOCCAL CHOLINE BINDING PROTEINS, CbpG AND CbpD, DIAGNOSTIC AND THERAPEUTIC USES THEREOF
US6495139B2 (en) * 1998-11-19 2002-12-17 St. Jude Children's Research Hospital Identification and characterization of novel pneumococcal choline binding protein, CBPG, and diagnostic and therapeutic uses thereof
US6277595B1 (en) * 1998-11-19 2001-08-21 Smithkline Beecham Corporation FabZ
ATE422899T1 (en) * 1998-12-21 2009-03-15 Medimmune Inc STREPTOCOCCUS PNEUMONIAE PROTEINS AND IMMUNOGENIC FRAGMENTS FOR VACCINES
AU2004242430C1 (en) * 1998-12-21 2008-06-05 Med Immune, Inc. Streptococcus pneumoniae proteins and immunogenic fragments for vaccines
JP2002533065A (en) * 1998-12-22 2002-10-08 マイクロサイエンス リミテッド Outer surface proteins, their genes, and their uses
AP1502A (en) 1998-12-22 2005-12-20 Microscience Ltd Genes and proteins, and their uses.
US7128918B1 (en) 1998-12-23 2006-10-31 Id Biomedical Corporation Streptococcus antigens
EP1950302B1 (en) 1998-12-23 2012-12-05 ID Biomedical Corporation of Quebec Streptococcus antigens
DE69938670D1 (en) * 1998-12-23 2008-06-19 Id Biomedical Corp STREPTOCOCCUS ANTIGENE
AU2005209689B2 (en) * 1998-12-23 2008-07-17 Id Biomedical Corporation Of Quebec Novel streptococcus antigens
US6340564B1 (en) * 1999-01-26 2002-01-22 Smithkline Beecham Corporation yhxB
WO2000044764A1 (en) * 1999-01-28 2000-08-03 Smithkline Beecham Corporation Mvd
US6346396B1 (en) * 1999-01-29 2002-02-12 Jianzhong Huang MurA
US6306633B1 (en) 1999-02-01 2001-10-23 Smithkline Beecham Corporation Polynucleotides encoding mevalonate kinase from Streptococcus pneumoniae
WO2000049033A1 (en) * 1999-02-17 2000-08-24 Smithkline Beecham Corporation yybQ
US6270762B1 (en) * 1999-02-26 2001-08-07 Smithkline Beecham Corporation tdk
GB9906437D0 (en) * 1999-03-19 1999-05-12 Smithkline Beecham Biolog Vaccine
WO2000056884A1 (en) * 1999-03-24 2000-09-28 Smithkline Beecham Corporation pksG
US6130069A (en) * 1999-03-24 2000-10-10 Smithkline Beecham Corporation IspA from Streptococcus pneumoniae
WO2000056873A1 (en) * 1999-03-25 2000-09-28 Smithkline Beecham Corporation mvaA
CN100379758C (en) * 1999-03-26 2008-04-09 科特克斯(Om)有限公司 Streptococcus pneumoniae antigens
WO2000059514A1 (en) * 1999-04-02 2000-10-12 Smithkline Beecham Corporation Yeaz
WO2000062804A2 (en) 1999-04-15 2000-10-26 The Regents Of The University Of California Identification of sortase gene
US7101692B2 (en) 1999-04-15 2006-09-05 The Regents Of The University Of California Identification of sortase gene
US6326167B1 (en) * 1999-04-23 2001-12-04 Smithkline Beecham Corp. TktA from Streptococcus pneumoniae
WO2000065026A2 (en) * 1999-04-28 2000-11-02 Smithkline Beecham Corporation YycG
US6245542B1 (en) 1999-05-06 2001-06-12 Smithkline Beecham Corporation tRNA methyltransferase from Streptococcus pneumoniae
WO2000070075A1 (en) * 1999-05-17 2000-11-23 Smithkline Beecham Corporation STREPTOCOCCUS PNEUMONIAE yerS
EP2270172B1 (en) 1999-05-19 2016-01-13 GlaxoSmithKline Biologicals SA Combination neisserial compositions
US6177269B1 (en) * 1999-06-04 2001-01-23 Smithkline Beecham Corporation, aroA
ATE500843T1 (en) * 1999-06-10 2011-03-15 Medimmune Llc STREPTOCOCCUS PNEUMONIAE PROTEINS AND VACCINES
US6887480B1 (en) * 1999-06-10 2005-05-03 Medimmune, Inc. Streptococcus pneumoniae proteins and vaccines
US6869767B1 (en) * 1999-06-11 2005-03-22 The United States Of America As Represented By The Secretary Of The Department Of Health And Human Services Detection of Streptococcus pneumoniae and immunization against Streptococcus pneumoniae infection
US6511839B1 (en) * 1999-06-22 2003-01-28 Pan Fong Chan Fucose kinase
WO2000078785A1 (en) * 1999-06-22 2000-12-28 Smithkline Beecham Corporation lacA
WO2000078784A1 (en) * 1999-06-23 2000-12-28 Smithkline Beecham Corporation treP
US6291230B1 (en) 1999-06-23 2001-09-18 Smithkline Beecham Corporation Galk promoter
JP2003512024A (en) * 1999-07-01 2003-04-02 ビーエーエスエフ アクチェンゲゼルシャフト Corynebacterium glutamicum gene encoding phosphoenolpyruvate: sugar phosphotransferase protein
US6884614B1 (en) 1999-07-01 2005-04-26 Basf Aktiengesellschaft Corynebacterium glutamicum genes encoding phosphoenolpyruvate: sugar phosphotransferase system proteins
WO2001007460A1 (en) * 1999-07-22 2001-02-01 Smithkline Beecham Corporation FucR
WO2001007463A1 (en) * 1999-07-22 2001-02-01 Smithkline Beecham Corporation Trer
GB9918319D0 (en) 1999-08-03 1999-10-06 Smithkline Beecham Biolog Vaccine composition
US6277597B1 (en) * 1999-08-03 2001-08-21 Smithkline Beecham Corporation kdtB
EP1075841A1 (en) * 1999-08-13 2001-02-14 Erasmus Universiteit Rotterdam Pneumococcal vaccines
US6168797B1 (en) * 1999-08-18 2001-01-02 Smithkline Beecham Corporation FabF
WO2001012649A1 (en) * 1999-08-18 2001-02-22 Smithkline Beecham Corporation Aga
US6833356B1 (en) 1999-08-25 2004-12-21 Medimmune, Inc. Pneumococcal protein homologs and fragments for vaccines
JP4749641B2 (en) * 1999-08-25 2011-08-17 メディミューン,エルエルシー Homologues and fragments of pneumococcal proteins for vaccine use
WO2001023403A1 (en) * 1999-09-27 2001-04-05 Smithkline Beecham Corporation ypga CLADE GENES
CA2905326C (en) * 1999-09-28 2016-09-27 Geneohm Sciences Canada Inc. Nucleic acids and methods for the detection of klebsiella
WO2001023599A1 (en) * 1999-09-28 2001-04-05 Smithkline Beecham Corporation Streptococcus pneumoniae ykqc
US6861516B1 (en) 1999-10-04 2005-03-01 Merck & Co., Inc. MraY gene and enzyme of pseudomonas aeruginosa
US6951729B1 (en) 1999-10-27 2005-10-04 Affinium Pharmaceuticals, Inc. High throughput screening method for biological agents affecting fatty acid biosynthesis
KR20020073150A (en) * 1999-12-23 2002-09-19 브이맥스 리미티드 Virulence genes, proteins, and their use
IL149472A0 (en) * 1999-12-30 2002-11-10 Bristol Myers Squibb Co Nucleotide sequences and polypeptides encoded by the sequences that are essential for bacterial viability and methods for detecting and utilizing the same
US6936432B2 (en) 2000-03-01 2005-08-30 Message Pharmaceuticals Bacterial RNase P proteins and their use in identifying antibacterial compounds
FR2807764B1 (en) * 2000-04-18 2004-09-10 Agronomique Inst Nat Rech MUTANTS OF LACTIC BACTERIA OVERPRODUCING EXOPOLYSACCHARIDES
CA2407455A1 (en) * 2000-04-27 2001-11-01 Medimmune, Inc. Immunogenic pneumococcal protein and vaccine compositions thereof
DK1332155T3 (en) * 2000-06-12 2007-02-26 Univ Saskatchewan Vaccination of dairy cattle with GapC protein against streptococcal infection
US6833134B2 (en) 2000-06-12 2004-12-21 University Of Saskatchewan Immunization of dairy cattle with GapC protein against Streptococcus infection
DE60136356D1 (en) 2000-06-12 2008-12-11 Univ Saskatchewan Chimeric GapC protein from Streptococcus and its use for vaccination and diagnosis
EP1734050A3 (en) * 2000-06-12 2012-12-05 University Of Saskatchewan Immunization of dairy cattle with GapC protein against streptococcus infection
US6866855B2 (en) 2000-06-12 2005-03-15 University Of Saskatchewan Immunization of dairy cattle with GapC protein against Streptococcus infection
US7074415B2 (en) 2000-06-20 2006-07-11 Id Biomedical Corporation Streptococcus antigens
AU7038101A (en) 2000-06-20 2002-01-02 Shire Biochem Inc Streptococcus antigens
AU2001269527A1 (en) * 2000-07-18 2002-01-30 Center For Advanced Science And Technology Incubation, Ltd. Isopentenyl pyrophosphate isomerase
GB0022742D0 (en) 2000-09-15 2000-11-01 Smithkline Beecham Biolog Vaccine
US7048926B2 (en) 2000-10-06 2006-05-23 Affinium Pharmaceuticals, Inc. Methods of agonizing and antagonizing FabK
US6821746B2 (en) 2000-10-06 2004-11-23 Affinium Pharmaceuticals, Inc. Methods of screening for FabK antagonists and agonists
US7056697B2 (en) 2000-10-06 2006-06-06 Affinium Pharmaceuticals, Inc. FabK variant
US7033795B2 (en) 2000-10-06 2006-04-25 Affinium Pharmaceuticals, Inc. FabK variant
AU2001295795A1 (en) * 2000-10-26 2002-05-06 Imperial College Innovations Ltd. Streptococcal genes
AU2002214127B2 (en) 2000-10-27 2007-06-07 J. Craig Venter Institute, Inc. Nucleic acids and proteins from streptococcus groups A and B
DK1355918T5 (en) * 2000-12-28 2012-02-20 Wyeth Llc Recombinant protective protein of streptococcus pneumoniae
EP1227152A1 (en) * 2001-01-30 2002-07-31 Société des Produits Nestlé S.A. Bacterial strain and genome of bifidobacterium
GB0107661D0 (en) 2001-03-27 2001-05-16 Chiron Spa Staphylococcus aureus
GB0107658D0 (en) * 2001-03-27 2001-05-16 Chiron Spa Streptococcus pneumoniae
KR100886095B1 (en) * 2001-04-16 2009-02-27 Wyeth Holdings Corporation Novel Streptococcus pneumoniae open reading frames encoding polypeptide antigens and a composition comprising the same
EP1456231A2 (en) * 2001-12-20 2004-09-15 Shire Biochem Inc. Streptococcus antigens
US20030211470A1 (en) * 2002-03-15 2003-11-13 Olson William C. CD4-IgG2-based salvage therapy of HIV-1 infection
US20030199603A1 (en) * 2002-04-04 2003-10-23 3M Innovative Properties Company Cured compositions transparent to ultraviolet radiation
ES2537737T3 (en) 2002-08-02 2015-06-11 Glaxosmithkline Biologicals S.A. Vaccine compositions comprising lipooligosaccharides of immunotype L2 and / or L3 of Neisseria meningitidis of IgtB
WO2004020609A2 (en) * 2002-08-30 2004-03-11 Tufts University Streptococcus pneumoniae antigens for diagnosis, treatment and prevention of active infection
EP1540559B1 (en) * 2002-09-13 2013-02-27 The Texas A & M University System Bioinformatic method for identifying surface-anchored proteins from gram-positive bacteria and proteins obtained thereby
ATE352316T1 (en) 2002-11-01 2007-02-15 Glaxosmithkline Biolog Sa IMMUNOGENIC COMPOSITION
FR2846668B1 (en) * 2002-11-05 2007-12-21 Univ Aix Marseille Ii MOLECULAR IDENTIFICATION OF BACTERIA OF THE GENUS STREPTOCOCCUS AND RELATED GENERA
GB0227346D0 (en) 2002-11-22 2002-12-31 Chiron Spa 741
WO2004048575A2 (en) * 2002-11-26 2004-06-10 Id Biomedical Corporation Streptococcus pneumoniae surface polypeptides
GB0302699D0 (en) * 2003-02-06 2003-03-12 Univ Bradford Immunoglobulin
EP2336357A1 (en) * 2003-04-15 2011-06-22 Intercell AG S. pneumoniae antigens
JP4875490B2 (en) 2003-07-31 2012-02-15 Novartis Vaccines and Diagnostics, Inc. Immunogenic composition for Streptococcus pyogenes
US8945589B2 (en) 2003-09-15 2015-02-03 Novartis Vaccines And Diagnostics, Srl Immunogenic compositions for Streptococcus agalactiae
US8574596B2 (en) 2003-10-02 2013-11-05 Glaxosmithkline Biologicals, S.A. Pertussis antigens and use thereof in vaccination
CA2545325C (en) 2003-11-10 2015-01-13 Uab Research Foundation Compositions for reducing bacterial carriage and cns invasion and methods of using same
EP1692277A4 (en) * 2003-11-26 2008-03-12 Binax Inc Methods and kits for predicting an infectious disease state
EP1607485A1 (en) * 2004-06-14 2005-12-21 Institut National De La Sante Et De La Recherche Medicale (Inserm) Method for quantifying VEGF121 isoform in a biological sample
EP1768662A2 (en) 2004-06-24 2007-04-04 Novartis Vaccines and Diagnostics, Inc. Small molecule immunopotentiators and assays for their detection
EP1765313A2 (en) 2004-06-24 2007-03-28 Novartis Vaccines and Diagnostics, Inc. Compounds for immunopotentiation
US20060165716A1 (en) 2004-07-29 2006-07-27 Telford John L Immunogenic compositions for gram positive bacteria such as streptococcus agalactiae
US7838010B2 (en) 2004-10-08 2010-11-23 Novartis Vaccines And Diagnostics S.R.L. Immunogenic and therapeutic compositions for Streptococcus pyogenes
WO2006048753A2 (en) * 2004-11-05 2006-05-11 Pharmacia & Upjohn Company Llc Anti-bacterial vaccine compositions
JP2006223194A (en) * 2005-02-17 2006-08-31 Tosoh Corp METHOD FOR MEASURING STANNIOCALCIN 1(STC1)mRNA
AU2006227165B2 (en) 2005-03-18 2011-11-10 Microbia, Inc. Production of carotenoids in oleaginous yeast and fungi
GB0505996D0 (en) 2005-03-23 2005-04-27 Glaxosmithkline Biolog Sa Fermentation process
US7476733B2 (en) * 2005-03-25 2009-01-13 The United States Of America As Represented By The Department Of Health And Human Services Development of a real-time PCR assay for detection of pneumococcal DNA and diagnosis of pneumococcal disease
SI1896065T2 (en) 2005-06-27 2014-12-31 Glaxosmithkline Biologicals S.A. Process for manufacturing vaccines
US20070059716A1 (en) * 2005-09-15 2007-03-15 Ulysses Balis Methods for detecting fetal abnormality
EP2357000A1 (en) 2005-10-18 2011-08-17 Novartis Vaccines and Diagnostics, Inc. Mucosal and systemic immunizations with alphavirus replicon particles
CA2630220C (en) 2005-11-22 2020-10-13 Doris Coit Norovirus and sapovirus antigens
TWI457133B (en) 2005-12-13 2014-10-21 Glaxosmithkline Biolog Sa Novel composition
ES2707499T3 (en) 2005-12-22 2019-04-03 Glaxosmithkline Biologicals Sa Pneumococcal polysaccharide conjugate vaccine
GB0607088D0 (en) 2006-04-07 2006-05-17 Glaxosmithkline Biolog Sa Vaccine
KR101737464B1 (en) 2006-01-17 2017-05-18 Arne Forsgren AB A NOVEL SURFACE EXPOSED HAEMOPHILUS INFLUENZAE PROTEIN (PROTEIN E - pE)
WO2008019162A2 (en) 2006-01-18 2008-02-14 University Of Chicago Compositions and methods related to staphylococcal bacterium proteins
ATE539079T1 (en) 2006-03-23 2012-01-15 Novartis Ag IMIDAZOCHINOXALINE COMPOUNDS AS IMMUNE MODULATORS
WO2008051285A2 (en) 2006-04-01 2008-05-02 Medical Service Consultation International, Llc Methods and compositions for detecting fungi and mycotoxins
EP2589668A1 (en) 2006-06-14 2013-05-08 Verinata Health, Inc Rare cell analysis using sample splitting and DNA tags
US20080050739A1 (en) 2006-06-14 2008-02-28 Roland Stoughton Diagnosis of fetal abnormalities using polymorphisms including short tandem repeats
MX2009001778A (en) * 2006-08-17 2009-05-22 Uab Research Foundation Immunogenic pcpa polypeptides and uses thereof.
EP2078092A2 (en) 2006-09-28 2009-07-15 Microbia, Inc. Production of carotenoids in oleaginous yeast and fungi
EP1923069A1 (en) * 2006-11-20 2008-05-21 Intercell AG Peptides protective against S. pneumoniae and compositions, methods and uses relating thereto
WO2008127094A2 (en) * 2007-04-12 2008-10-23 Stichting Katholieke Universiteit, More Particularly The Radboud University Nijmegen Medical Center New virulence factors of streptococcus pneumoniae
CA2688009C (en) 2007-05-23 2019-04-02 David E. Briles Detoxified pneumococcal neuraminidase and uses thereof
RS54349B1 (en) 2007-06-26 2016-02-29 Glaxosmithkline Biologicals S.A. Vaccine comprising streptococcus pneumoniae capsular polysaccharide conjugates
GB0713880D0 (en) 2007-07-17 2007-08-29 Novartis Ag Conjugate purification
GB0714963D0 (en) 2007-08-01 2007-09-12 Novartis Ag Compositions comprising antigens
ES2561483T3 (en) 2007-09-12 2016-02-26 Glaxosmithkline Biologicals Sa GAS57 mutant antigens and GAS57 antibodies
BRPI0821240B8 (en) 2007-12-21 2022-10-04 Novartis Ag mutant forms of streptolysin o
EP2245048B1 (en) 2008-02-21 2014-12-31 Novartis AG Meningococcal fhbp polypeptides
DK2268618T3 (en) 2008-03-03 2015-08-17 Novartis Ag Compounds and compositions as TLR activity modulators
WO2009127676A1 (en) 2008-04-16 2009-10-22 Glaxosmithkline Biologicals S.A. Vaccine
US20100068718A1 (en) 2008-08-22 2010-03-18 Hooper Dennis G Methods and Compositions for Identifying Yeast
EP2174664A1 (en) 2008-10-07 2010-04-14 Stichting Katholieke Universiteit, more particularly the Radboud University Nijmegen Medical Centre New virulence factors of Streptococcus pneumoniae
NZ594029A (en) 2009-01-12 2014-01-31 Novartis Ag Cna_b domain antigens in vaccines against gram positive bacteria
MX2011007624A (en) * 2009-01-19 2011-10-12 Hospices Civils Lyon Methods for determining the likelihood of a patient contracting a nosocomial infection and for determining the prognosis of the course of a septic syndrome.
WO2010109324A1 (en) 2009-03-24 2010-09-30 Novartis Ag Combinations of meningococcal factor h binding protein and pneumococcal saccharide conjugates
BRPI1009828A2 (en) 2009-03-24 2019-03-12 Novartis Ag meningococcal h factor binding protein adjuvant
ITMI20090946A1 (en) 2009-05-28 2010-11-29 Novartis Ag EXPRESSION OF RECOMBINANT PROTEINS
EP2440245B1 (en) 2009-06-10 2017-12-06 GlaxoSmithKline Biologicals SA Benzonaphthyridine-containing vaccines
EP2443250B8 (en) 2009-06-16 2016-09-21 GlaxoSmithKline Biologicals SA High-throughput complement-mediated antibody-dependent and opsonic bactericidal assays
SG10201403702SA (en) 2009-06-29 2014-09-26 Genocea Biosciences Inc Vaccines and compositions against streptococcus pneumoniae
US8877356B2 (en) 2009-07-22 2014-11-04 Global Oled Technology Llc OLED device with stabilized yellow light-emitting layer
TWI445708B (en) 2009-09-02 2014-07-21 Irm Llc Compounds and compositions as tlr activity modulators
DK2459216T3 (en) 2009-09-02 2013-12-09 Novartis Ag IMMUNOGENIC COMPOSITIONS INCLUDING TLR ACTIVITY MODULATORS
US20120237536A1 (en) 2009-09-10 2012-09-20 Novartis Combination vaccines against respiratory tract diseases
US8962251B2 (en) 2009-10-08 2015-02-24 Medical Service Consultation International, Llc Methods and compositions for identifying sulfur and iron modifying bacteria
WO2011044576A2 (en) * 2009-10-09 2011-04-14 Children's Medical Center Corporation Selectively disrupted whole-cell vaccine
WO2011057148A1 (en) 2009-11-05 2011-05-12 Irm Llc Compounds and compositions as tlr-7 activity modulators
US9241987B2 (en) 2009-11-20 2016-01-26 The University Of Chicago Methods and compositions related to immunogenic fibrils
JP5814933B2 (en) 2009-12-15 2015-11-17 Novartis AG Homogeneous suspension of immune enhancing compounds and uses thereof
US9173954B2 (en) 2009-12-30 2015-11-03 Glaxosmithkline Biologicals Sa Polysaccharide immunogens conjugated to E. coli carrier proteins
WO2011100443A1 (en) * 2010-02-11 2011-08-18 Intelligent Medical Devices, Inc. Oligonucleotides relating to clostridium difficile genes encoding toxin b, toxin a, or binary toxin
ES2557431T3 (en) 2010-02-23 2016-01-26 Stichting Katholieke Universiteit Combination vaccine for Streptococcus
US10113206B2 (en) * 2010-02-24 2018-10-30 Grifols Therapeutics Inc. Methods, compositions, and kits for determining human immunodeficiency virus (HIV)
GB201003922D0 (en) 2010-03-09 2010-04-21 Glaxosmithkline Biolog Sa Conjugation process
GB201003920D0 (en) 2010-03-09 2010-04-21 Glaxosmithkline Biolog Sa Method of treatment
GB201003924D0 (en) 2010-03-09 2010-04-21 Glaxosmithkline Biolog Sa Immunogenic composition
US20130011429A1 (en) 2010-03-10 2013-01-10 Jan Poolman Immunogenic composition
EA023725B1 (en) 2010-03-23 2016-07-29 Novartis AG Compounds (cystein based lipopeptides) and compositions as tlr2 agonists used for treating infections, inflammations, respiratory diseases etc.
WO2011137368A2 (en) 2010-04-30 2011-11-03 Life Technologies Corporation Systems and methods for analyzing nucleic acid sequences
EP2575988A1 (en) 2010-05-28 2013-04-10 Tetris Online, Inc. Interactive hybrid asynchronous computer game infrastructure
US9268903B2 (en) 2010-07-06 2016-02-23 Life Technologies Corporation Systems and methods for sequence data alignment quality assessment
GB201015132D0 (en) 2010-09-10 2010-10-27 Univ Bristol Vaccine composition
US9127321B2 (en) * 2010-10-06 2015-09-08 The Translational Genomics Research Institute Method of detecting Coccidioides species
WO2012072769A1 (en) 2010-12-01 2012-06-07 Novartis Ag Pneumococcal rrgb epitopes and clade combinations
EP2665490B1 (en) 2011-01-20 2020-03-04 Genocea Biosciences, Inc. Vaccines and compositions against streptococcus pneumoniae
GB201103836D0 (en) 2011-03-07 2011-04-20 Glaxosmithkline Biolog Sa Conjugation process
WO2012138470A2 (en) 2011-04-04 2012-10-11 Intelligent Medical Devices, Inc. Optimized oligonucleotides and methods of using same for the detection, isolation, amplification, quantitation, monitoring, screening, and sequencing of group b streptococcus
KR20140146993A (en) 2011-05-11 2014-12-29 칠드런'즈 메디컬 센터 코포레이션 Multiple antigen presenting immunogenic composition, and methods and uses thereof
UY34073A (en) 2011-05-17 2013-01-03 Glaxosmithkline Biolog Sa IMPROVED VACCINE OF STREPTOCOCCUS PNEUMONIAE AND PREPARATION METHODS.
US20150132339A1 (en) 2012-03-07 2015-05-14 Novartis Ag Adjuvanted formulations of streptococcus pneumoniae antigens
BR112014031386A2 (en) 2012-06-14 2017-08-01 Pasteur Institut serogroup x meningococcal vaccines
CN105307684A (en) 2012-10-02 2016-02-03 GlaxoSmithKline Biologicals S.A. Nonlinear saccharide conjugates
WO2014118305A1 (en) 2013-02-01 2014-08-07 Novartis Ag Intradermal delivery of immunological compositions comprising toll-like receptor agonists
CA2900008A1 (en) 2013-02-07 2014-08-14 Children's Medical Center Corporation Protein antigens that provide protection against pneumococcal colonization and/or disease
WO2015188899A1 (en) * 2014-06-12 2015-12-17 Deutsches Krebsforschungszentrum Novel ttv mirna sequences as an early marker for the future development of cancer and as a target for cancer treatment and prevention
US9943583B2 (en) 2013-12-03 2018-04-17 Universität Zürich Proline-rich peptides protective against S. pneumoniae
US9616114B1 (en) 2014-09-18 2017-04-11 David Gordon Bermudes Modified bacteria having improved pharmacokinetics and tumor colonization enhancing antitumor activity
AU2015332634B2 (en) * 2014-10-15 2020-05-14 Xenothera Composition with reduced immunogenicity
US9107906B1 (en) 2014-10-28 2015-08-18 Adma Biologics, Inc. Compositions and methods for the treatment of immunodeficiency
JP6399589B2 (en) * 2014-10-28 2018-10-03 Yamaguchi University Expression promoter in Streptococcus pneumoniae
WO2017023855A1 (en) * 2015-07-31 2017-02-09 The General Hospital Corporation Protein prostheses for mitochondrial diseases or conditions
GB201518684D0 (en) 2015-10-21 2015-12-02 Glaxosmithkline Biolog Sa Vaccine
CU20210061A7 (en) 2015-12-04 2022-02-04 Dana Farber Cancer Inst Inc VACCINE COMPOSITION COMPRISING THE ALPHA 3 DOMAIN OF MICA/B FOR THE TREATMENT OF CANCER
US11129906B1 (en) 2016-12-07 2021-09-28 David Gordon Bermudes Chimeric protein toxins for expression by therapeutic bacteria
US11180535B1 (en) 2016-12-07 2021-11-23 David Gordon Bermudes Saccharide binding, tumor penetration, and cytotoxic antitumor chimeric peptides from therapeutic bacteria
US10259865B2 (en) 2017-03-15 2019-04-16 Adma Biologics, Inc. Anti-pneumococcal hyperimmune globulin for the treatment and prevention of pneumococcal infection
WO2018183475A1 (en) 2017-03-28 2018-10-04 Children's Medical Center Corporation A multiple antigen presenting system (maps)-based staphylococcus aureus vaccine, immunogenic composition, and uses thereof
US10525119B2 (en) * 2017-03-31 2020-01-07 Boston Medical Center Corporation Methods and compositions using highly conserved pneumococcal surface proteins
EP3641828B1 (en) 2017-06-23 2023-11-22 Affinivax, Inc. Immunogenic compositions
SG11202005255PA (en) 2017-12-06 2020-07-29 Merck Sharp & Dohme Compositions comprising streptococcus pneumoniae polysaccharide-protein conjugates and methods of use thereof
IL276608B2 (en) 2018-02-12 2024-04-01 Inimmune Corp Toll-like receptor ligands
CA3106291A1 (en) 2018-07-19 2020-01-23 Glaxosmithkline Biologicals Sa Processes for preparing dried polysaccharides
BR112021004193A2 (en) 2018-09-12 2021-05-25 Affinivax, Inc. multivalent pneumococcal vaccines
JP7275277B2 (en) 2018-12-19 2023-05-17 Merck Sharp & Dohme LLC Compositions comprising Streptococcus pneumoniae polysaccharide-protein conjugates and methods of use thereof
BR112022008761A2 (en) 2019-11-22 2022-07-26 Glaxosmithkline Biologicals Sa DOSAGE AND ADMINISTRATION OF A SACCHARIDE GLYCOCONJUGATE VACCINE
KR20230117105A (en) 2020-11-04 2023-08-07 Eligo Bioscience Cutibacterium acnes recombinant phage, manufacturing method and use thereof
WO2023006825A1 (en) * 2021-07-29 2023-02-02 Université de Lausanne Novel pneumococcal polypeptide antigens

Family Cites Families (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4433092A (en) 1981-03-09 1984-02-21 Champion Spark Plug Company Green ceramic of lead-free glass, conductive carbon, silicone resin and AlPO4, useful, after firing, as an electrical resistor
JPH07119760B2 (en) 1984-07-24 1995-12-20 コモンウエルス・セ−ラム・ラボラトリ−ズ・コミッション How to detect or determine a mimotope
US4631211A (en) 1985-03-25 1986-12-23 Scripps Clinic & Research Foundation Means for sequential solid phase organic synthesis and methods using the same
LU86128A1 (en) 1985-10-18 1987-06-02 Vander Poorten Henri PROCESS FOR PRINTING OR COATING CERAMIC SIMULTANEOUSLY TO ITS ELECTROFORMING AND CONDUCTING SINGLE-COOKING TO DECORATIVE OR TECHNICAL PRODUCTS
GB8904762D0 (en) * 1989-03-02 1989-04-12 Glaxo Group Ltd Biological process
EP0394827A1 (en) 1989-04-26 1990-10-31 F. Hoffmann-La Roche Ag Chimaeric CD4-immunoglobulin polypeptides
US6251581B1 (en) * 1991-05-22 2001-06-26 Dade Behring Marburg Gmbh Assay method utilizing induced luminescence
WO1993010238A1 (en) * 1991-11-14 1993-05-27 The Government Of The United States Of America As Represented By The Department Of Health And Human Services Pneumococcal fimbrial protein a vaccines
AU682340B2 (en) * 1993-01-28 1997-10-02 Regents Of The University Of California, The TATA-binding protein associated factors, nucleic acids encoding TAFs, and methods of use
CA2116261A1 (en) 1993-04-20 1994-10-21 David E. Briles Epitopic regions of pneumococcal surface protein a
US5480971A (en) 1993-06-17 1996-01-02 Houghten Pharmaceuticals, Inc. Peralkylated oligopeptide mixtures
US5928900A (en) * 1993-09-01 1999-07-27 The Rockefeller University Bacterial exported proteins and acellular vaccines based thereon
US5474905A (en) * 1993-11-24 1995-12-12 Research Corporation Technologies Antibodies specific for streptococcus pneumoniae hemin/hemoglobin-binding antigens
WO1995016711A1 (en) 1993-12-17 1995-06-22 Universidad De Oviedo Antibodies against pneumolysine and their applications
AU2638595A (en) 1994-05-16 1995-12-05 Uab Research Foundation, The (streptococcus pneumoniae) capsular polysaccharide genes and flanking regions
SE9404072D0 (en) * 1994-11-24 1994-11-24 Astra Ab Novel polypeptides
US5620190A (en) 1994-08-18 1997-04-15 Fisher-Price, Inc. In-line skate
US5565204A (en) * 1994-08-24 1996-10-15 American Cyanamid Company Pneumococcal polysaccharide-recombinant pneumolysin conjugate vaccines for immunization against pneumococcal infections
US6001564A (en) * 1994-09-12 1999-12-14 Infectio Diagnostic, Inc. Species specific and universal DNA probes and amplification primers to rapidly detect and identify common bacterial pathogens and associated antibiotic resistance genes from clinical specimens for routine diagnosis in microbiology laboratories
JPH08116973A (en) 1994-10-25 1996-05-14 Yotsuba Nyugyo Kk Novel aminopeptidase and aminopeptidase gene coding the same
AU5552396A (en) 1995-04-21 1996-11-07 Human Genome Sciences, Inc. Nucleotide sequence of the haemophilus influenzae rd genome, fragments thereof, and uses thereof
EP0914330A4 (en) 1996-05-14 2002-01-09 Smithkline Beecham Corp Novel compounds
EP1770164B1 (en) * 1996-10-31 2010-09-01 Human Genome Sciences, Inc. Streptococcus pneumoniae antigens and vaccines
US6136557A (en) 1996-12-13 2000-10-24 Eli Lilly And Company Streptococcus pneumoniae gene sequence FtsH
US6800744B1 (en) * 1997-07-02 2004-10-05 Genome Therapeutics Corporation Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics
US7151171B1 (en) * 1997-07-02 2006-12-19 sanofi pasteur limited/sanofi pasteur limitée Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics
BR9812525A (en) * 1997-09-24 2000-07-25 Univ Minnesota Human complement degrading proteinase c3 from streptococcus pneumoniae
US6251588B1 (en) * 1998-02-10 2001-06-26 Agilent Technologies, Inc. Method for evaluating oligonucleotide probe sequences
KR20010089280A (en) * 1998-09-24 2001-09-29 Regents of the University of Minnesota Human complement c3-degrading polypeptide from streptococcus pneumoniae
ATE422899T1 (en) * 1998-12-21 2009-03-15 Medimmune Inc STREPTOCOCCUS PNEUMONIAE PROTEINS AND IMMUNOGENIC FRAGMENTS FOR VACCINES

Also Published As

Publication number Publication date
US20040029118A1 (en) 2004-02-12
AU773895B2 (en) 2004-06-10
JP4467606B2 (en) 2010-05-26
EP1770164B1 (en) 2010-09-01
ES2350491T3 (en) 2011-01-24
JP2009039110A (en) 2009-02-26
US7056510B1 (en) 2006-06-06
US20070154986A1 (en) 2007-07-05
DE69737125D1 (en) 2007-02-01
US20020032323A1 (en) 2002-03-14
JP2008035865A (en) 2008-02-21
EP1770164A3 (en) 2007-10-10
JP2008178407A (en) 2008-08-07
EP0942983B1 (en) 2006-12-20
AU2007200944A1 (en) 2007-03-22
AU5194598A (en) 1998-05-22
WO1998018930A2 (en) 1998-05-07
AU3140701A (en) 2001-06-28
AU6909098A (en) 1998-05-22
EP0941335A2 (en) 1999-09-15
ATE479756T1 (en) 2010-09-15
AU2011200622A1 (en) 2011-03-03
DE69737125T3 (en) 2015-02-26
EP0942983A2 (en) 1999-09-22
US7141418B2 (en) 2006-11-28
US20100196410A1 (en) 2010-08-05
DK0942983T4 (en) 2014-12-01
ES2277362T5 (en) 2014-12-18
US6420135B1 (en) 2002-07-16
EP1400592A1 (en) 2004-03-24
DE69739981D1 (en) 2010-10-14
US6929930B2 (en) 2005-08-16
PT942983E (en) 2007-02-28
DK0942983T3 (en) 2007-04-30
JP2001501833A (en) 2001-02-13
WO1998018930A3 (en) 1998-10-08
US20020061545A1 (en) 2002-05-23
AU2004210523A1 (en) 2004-10-07
ES2277362T3 (en) 2007-07-01
ATE348887T1 (en) 2007-01-15
JP2008022854A (en) 2008-02-07
WO1998018931A3 (en) 1998-08-20
WO1998018931A2 (en) 1998-05-07
US6573082B1 (en) 2003-06-03
JP2001505415A (en) 2001-04-24
DE69737125T2 (en) 2007-10-25
CA2269663A1 (en) 1998-05-07
JP4469026B2 (en) 2010-05-26
US20100221287A1 (en) 2010-09-02
US8168205B2 (en) 2012-05-01
EP0942983B2 (en) 2014-09-10
EP1770164A2 (en) 2007-04-04
US20050181439A1 (en) 2005-08-18
AU2004210523B2 (en) 2007-01-04

Similar Documents

Publication Publication Date Title
US8168205B2 (en) Streptococcus pneumoniae polypeptides
US7378514B2 (en) Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics
US6617156B1 (en) Nucleic acid and amino acid sequences relating to Enterococcus faecalis for diagnostics and therapeutics
US6583275B1 (en) Nucleic acid sequences and expression system relating to Enterococcus faecium for diagnostics and therapeutics
US9840538B2 (en) Nucleic acids and proteins from Streptococcus groups A and B
US6737248B2 (en) Staphylococcus aureus polynucleotides and sequences
US6593114B1 (en) Staphylococcus aureus polynucleotides and sequences
US7090973B1 (en) Nucleic acid sequences relating to Bacteroides fragilis for diagnostics and therapeutics
JP2008525033A (en) Group B Streptococcus
WO2004018646A2 (en) Conserved and specific streptococcal genomes
US6537773B1 (en) Nucleotide sequence of the mycoplasma genitalium genome, fragments thereof, and uses thereof
US20070009900A1 (en) Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics
EP2146743A2 (en) New virulence factors of streptococcus pneumoniae
US20020120116A1 (en) Enterococcus faecalis polynucleotides and polypeptides
US6348328B1 (en) Compounds
AU777190B2 (en) Streptococcus pneumoniae polynucleotides and sequences

Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued