US 20090029383 A1
The subject invention pertains to a method for determining the sequence of a polynucleotide comprising the steps of (i) contacting a polynucleotide processive enzyme immobilised in a fixed position, with a target polynucleotide under conditions sufficient to induce enzyme activity; (ii) detecting an effect consequent on the interaction of the enzyme and polynucleotide, wherein the effect is detected by measurement of a non-linear optical signal or a linear signal coupled to a non-linear signal.
1. A method for determining the sequence of a polynucleotide, comprising the steps of:
(i) contacting a polynucleotide processive enzyme immobilised in a fixed position, with a target polynucleotide under conditions sufficient to induce enzyme activity;
(ii) detecting an effect consequent on the interaction of the enzyme and polynucleotide,
wherein the effect is detected by measurement of a non-linear optical signal or a linear signal coupled to a non-linear signal.
2. The method according to
3. The method according to
4. The method according to
5. The method according to
6. The method according to
7. The method according to
8. The method according to
9. The method according to
10. The method according to
11. The method according to
12. The method according to
13. The method according to
14. The method according to
15. The method according to
16. The method according to
17. The method according to
18. The method according to
19. The method according to
20. The method according to
21. The method according to
22. The method according to
23. A solid support material, comprising:
i) at least one immobilised polymerase and at least one dipolar molecule positioned on or proximal to the polymerase; or
ii) a cell immobilised thereon, wherein the cell comprises a polymerase enzyme maintained in a fixed position within the cell.
24. An imaging system set up to detect a non-linear optical signal, comprising a solid support having immobilised thereon an enzyme that interacts with a polynucleotide, and a dipolar molecule positioned on or proximal to the enzyme.
This invention relates to a method for determining the sequence of a polynucleotide.
The ability to determine the sequence of a polynucleotide is of great scientific importance, as shown by the Human Genome Project in mapping the three billion bases of DNA encoded in the human genome.
The principal method in general use for large-scale DNA sequencing is the chain termination method. This method was first developed by Sanger and Coulson (Sanger et al., Proc. Natl. Acad. Sci. USA, 1977; 74: 5463-5467), and relies on the use of dideoxy derivatives of the four nucleoside triphosphates which are incorporated into the nascent polynucleotide chain in a polymerase reaction. Upon incorporation, the dideoxy derivatives terminate the polymerase reaction and the products are then separated by gel electrophoresis and analysed to reveal the position at which the particular dideoxy derivative was incorporated into the chain.
Although this method is widely used and produces reliable results, it is recognised that it is slow, labour-intensive and expensive.
Fluorescent labels have been used to identify nucleotide incorporation onto a growing nascent DNA molecule, using the polymerase reaction (see WO91/06678). However, these techniques have the disadvantage of increasing background interference from the fluorophores. As the DNA molecule grows, the background “noise” increases and the time required to detect each nucleotide incorporation needs to be increased. This severely restricts the use of the method for sequencing large polynucleotides. The most serious limitation of polynucleotide sequencing systems built around fluorescent dyes, however, is the problem of photobleaching.
Photobleaching is a well documented phenomenon in fluorescent dye systems and results from exposure of the dye to excitation wavelengths. All dye systems have an ability to absorb a limited number of photons before photobleaching occurs. Once photobleaching has occurred the fluorescent dye is no longer visible to the observer and hence, if conjugated to a molecule, this will not be detectable.
There is therefore a need for an improved method for determining the sequence of a polynucleotide, which significantly increases the rate and fragment size of the polynucleotide being sequenced and which preferably does not depend on fluorescently labelled nucleotides for detection. Further, the method should be capable of being carried out by an automated process, reducing the complexity and cost associated with existing methods.
The present invention is based on the realisation that a conformational and/or mass and/or energy distribution change in a polynucleotide processive enzyme, which occurs when an enzyme associates with and moves along a target polynucleotide, can be detected using non-linear optical imaging, including that based on second or third harmonic generation.
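According to the present invention, a method for sequencing a polynucleotide comprises the steps of:
(i) contacting a polynucleotide processive enzyme immobilised in a fixed position, with a target polynucleotide under conditions sufficient to induce enzyme activity; and
(ii) detecting an effect consequent on the interaction of the enzyme and polynucleotide,
wherein the effect is detected by measurement of a non-linear optical signal or a linear signal coupled to a non-linear signal.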
Numerous advantages are achieved with the present invention. Sequencing can be carried out with small amounts of polynucleotide, with the capability of sequencing single polynucleotide molecules, thereby eliminating the need for amplification prior to initiation of sequencing. Long sequence read lengths can be obtained and secondary structure considerations minimised. Obtaining long read lengths eliminates the need for extensive fragment reassembly using computation. Further, as the invention is not dependent upon the need for fluorescently-labelled nucleotides or any measurement of fluorescence, the limitation of read length at the single molecule level as a function of photobleaching or other unpredictable fluorescence effects, is circumvented. The present invention also permits long polynucleotide fragments to be read sequentially by the same enzyme system. This has the benefit of allowing a single enzyme system to be used which can be regenerated and re-used allowing many different polynucleotide templates to be sequenced. Finally, the utilisation of Second or Third Harmonic Generation offers advantages due to the lack of photodamage and photobleaching. This is due to the fact that no photochemistry occurs, even in the focal plane because the signal, stimulated by non-resonant radiation, does not involve an excited state with a finite lifetime.
According to a second aspect of the invention, a solid support material comprises at least one polymerase and at least one dipolar molecule positioned on or proximal to the polymerase.
According to a third aspect of the invention, an imaging system set up to detect a non-linear optical signal, comprises a solid support having immobilised thereon an enzyme that interacts with a polynucleotide, and a dipolar molecule positioned on or proximal to the enzyme.
The invention is described with reference to the accompanying figure, wherein:
The present invention makes use of conventional non-linear optical measurements to identify a conformational and/or mass and/or energy distribution change occurring as a polynucleotide processive enzyme interacts with the individual bases on a target polynucleotide or incorporates nucleotides onto a nascent polynucleotide molecule.
The use of non-linear optical methods for imaging molecules is known. What has not been appreciated is that these methods can be applied to the sequencing of a polynucleotide, making use of an immobilised or fixed enzyme.
In a separate embodiment, a linear signal is generated in addition to a non-linear signal and the linear signal is detected. The two signals are said to be coupled, resulting in enhanced detection.
The term “polynucleotide” as used herein is to be interpreted broadly, and includes DNA and RNA, including modified DNA and RNA, DNA/RNA hybrids, as well as other hybridising nucleic acid-like molecules, e.g. peptide nucleic acid (PNA).
The term “polynucleotide processive enzyme” as used herein is to be interpreted broadly and relates to any enzyme that interacts with a polynucleotide and moves continuously along the polynucleotide. The enzyme is preferably a polymerase enzyme, and may be of any known type. For example, the polymerase may be any DNA-dependent DNA polymerase. If the target polynucleotide is an RNA molecule, then the polymerase may be an RNA-dependent DNA polymerase, i.e. reverse transcriptase, or an RNA-dependent RNA polymerase, i.e. RNA replicase. In a preferred embodiment of the invention, the polymerase is T4 polymerase. In further preferred embodiments of the invention, the polymerase is E. coli polymerase III holoenzyme (McHenry, Ann. Rev. Biochem., 1988; 57:519), T7 polymerase (Schwager et al., Methods in Molecular and Cellular Biology, 1989/90; 1(4): 155-159) or bacteriophage T7 gene 5 polymerase complexed with E. coli Thioredoxin (Tabor et al., J. Biol. Chem., 1987; 262: 1612-1623). Each of these polymerase enzymes binds to a target polynucleotide with high processivity (and fidelity) and therefore maintains a polymerase-polynucleotide complex, even when polymerisation is not actively taking place.
Alternative enzymes that interact with a polynucleotide include helicase, primase, holoenzyme, topoisomerase or gyrase enzymes. Such enzymes offer further advantages. For example, using a helicase reduces the problem of secondary structures that exist within polynucleotide molecules, as helicases encounter and overcome these structures within their natural environment. Secondly, helicases allow the necessary reactions to be carried out on double-stranded DNA at room temperature.
As the enzyme interacts with successive bases on the polynucleotide, its conformation will change depending on which base (or nucleotide) on the target it is brought into contact with. Thus, the temporal order of base pair additions during the reaction is measured on a single molecule of nucleic acid, i.e. the activity of the enzyme system on the template polynucleotide to be sequenced can be followed in real time. The sequence is deduced by identifying which base (nucleotide) is being incorporated into the growing complementary strand of the target polynucleotide via the catalytic activity of the enzyme.
An important aspect of the present invention is the immobilization of the enzyme in a fixed position relative to the imaging system. This is preferably carried out by immobilising the enzyme to a solid support, with the enzyme retaining its biological activity. Methods for the immobilisation of suitable enzymes to a solid support are known. For example, WO-A-99/05315 describes the immobilisation of a polymerase enzyme to a solid support. General methods for immobilising proteins to supports are suitable.
The optical detection methods used in the present invention are intended to image at the single molecule level, i.e. to generate a distinct image/signal for one enzyme. A plurality of enzymes may be immobilised on a solid support at a density that permits single enzyme resolution. Therefore, in one embodiment, there are multiple enzymes immobilised on a solid support, and the method of the invention can be carried out on these simultaneously. This allows different polynucleotide molecules to be sequenced together.
It will be apparent to the skilled person to carry out the imaging method under conditions suitable to promote enzymic activity. For example, with regard to a polymerase enzyme, it will be apparent that the other components necessary for the polymerase reaction to proceed, are required. In this embodiment, a polynucleotide primer molecule and each of the nucleoside triphosphates dATP, dTTP, dCTP and dGTP, will be required. The nucleoside triphosphates may be added sequentially, with removal of non-bound nucleotides prior to the introduction of the next nucleoside triphosphate. Alternatively, all the triphosphates can be present at the same time. It may be preferable to utilise triphosphates that have one or more blocking groups which can be removed selectively by pulsed monochromatic light, thereby preventing non-controlled incorporation. Suitable blocked triphosphates are disclosed in WO-A-99/05315.
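The sequential-addition scheme described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the procedure of the invention: it assumes an idealised polymerase that incorporates a flowed nucleotide whenever it is complementary to the next template base (so a homopolymer run is extended within a single flow), and the function name `simulate_sequential_flows`, the flow order and the example template are hypothetical.

```python
# Toy model of sequential nucleoside triphosphate addition (illustrative only).
# Assumption: incorporation is perfect and error-free; real enzymes are not.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def simulate_sequential_flows(template, flow_order=("A", "T", "G", "C"), max_cycles=50):
    """Return (flows, read).

    template: bases in the order the polymerase encounters them.
    flows:    list of (flowed_base, number_incorporated) per flow.
    read:     the complementary strand synthesised so far.
    """
    position = 0
    flows = []
    read = []
    for _ in range(max_cycles):
        for base in flow_order:
            count = 0
            # Extend while the flowed base pairs with the next template base.
            while position < len(template) and COMPLEMENT[template[position]] == base:
                count += 1
                position += 1
                read.append(base)
            flows.append((base, count))
            if position == len(template):
                return flows, "".join(read)
    return flows, "".join(read)


flows, read = simulate_sequential_flows("TACGG")
print(flows)
print(read)
```

In this idealised picture, only flows with a non-zero count would be expected to give a detectable change at the enzyme; flows of a non-complementary triphosphate are washed away without incorporation.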
High-resolution non-linear optical imaging systems are known in the art. In general, the non-linear polarisation for a material can be expressed as:
$$P = \chi^{(1)}E + \chi^{(2)}E^{2} + \chi^{(3)}E^{3} + \cdots$$
where $P$ is the induced polarisation, $\chi^{(n)}$ is the nth-order non-linear susceptibility and $E$ is the electric field vector. The first term describes normal absorption and reflection of light; the second describes second harmonic generation (SHG), sum and difference frequency generation; and the third describes light scattering, stimulated Raman processes, third harmonic generation (THG), and both two- and three-photon absorption.
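To see why the second- and third-order terms of the polarisation expansion give rise to light at twice and three times the driving frequency, consider a monochromatic field. This is a standard textbook illustration rather than part of the original disclosure; the scalar treatment and the symbols $E_0$ (field amplitude) and $\omega$ (angular frequency) are assumptions made for simplicity.

```latex
\begin{align}
E(t) &= E_0 \cos(\omega t) \\
P^{(2)}(t) &= \chi^{(2)} E(t)^{2}
            = \tfrac{1}{2}\,\chi^{(2)} E_0^{2}\,\bigl[\,1 + \cos(2\omega t)\,\bigr] \\
P^{(3)}(t) &= \chi^{(3)} E(t)^{3}
            = \tfrac{1}{4}\,\chi^{(3)} E_0^{3}\,\bigl[\,3\cos(\omega t) + \cos(3\omega t)\,\bigr]
\end{align}
```

The $\cos(2\omega t)$ component of the second-order polarisation radiates at the second harmonic, and the $\cos(3\omega t)$ component of the third-order polarisation radiates at the third harmonic.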
A preferred imaging system of the present invention relies on the detection of the signal arising from second or third harmonic generation.
Single-molecule resolution using second or third harmonic generation (hereinafter referred to as SHG) is known in the art (Peleg et al., Proc. Natl. Acad. Sci. USA, 1999; 96:6700-6704 and Peleg et al., Bioimaging, 1996; 4:215-224).
The general set-up of the imaging system can be as described in Peleg et al., 1996, supra, and as shown in
In order to generate the second or third harmonic, it is necessary to position an appropriate label on or in close proximity to the immobilised enzyme. Highly dipolar molecules are suitable for this purpose (Lewis et al., Chem. Phys., 1999; 245: 133-144). Examples of suitable molecules are dyes, particularly styryl dyes (such as the membrane dye JPW 1259, supplied by Molecular Probes). Green Fluorescent Protein (GFP) is another example of a “dye” or “label” which can be used to image via SHG. As used herein, GFP refers to both the wild-type protein, and spectrally shifted mutants thereof (Tsien, Ann. Rev. Biochem., 1998; 67:509 and U.S. Pat. No. 5,777,079 and U.S. Pat. No. 5,625,048). Other suitable dyes include di-4-ANEPPS, di-8-ANEPPS and JPW2080 (Molecular Probes).
The dipolar molecules may be located on the individual bases of the polynucleotide (or its complement if the dipolar molecules are attached to the nucleoside triphosphates and used in a polymerase reaction).
In a preferred embodiment of the invention, the enzyme, e.g. a polymerase, is prepared as a recombinant fusion with GFP. The GFP can be located at the N- or C-terminus of the enzyme (the C-terminus may be desirable if a polymerase is to be used in conjunction with a ‘sliding clamp’). Alternatively, the GFP molecule can be located anywhere within the enzyme, provided that enzymic activity is retained.
In a separate embodiment of the present invention, the non-linear optical imaging system is Raman spectroscopy or surface enhanced Raman spectroscopy (SERS). An overview of Raman spectroscopy is contained in McGilp, Progress in Surface Science, 1995; 49(1): 1-106.
The optical radiation used to excite the Raman system is, preferably, Near Infrared Radiation (NIR). NIR excitation has the advantage of decreasing the fluorescence and Raman signal of the surrounding medium or solvent.
In a separate embodiment of the invention, the non-linear signal can be enhanced by the use of a metal nanoparticle and/or a roughened metal surface (Boyd et al., Phys. Rev. B, 1984; 30: 519-526, Chen et al., Phys. Rev. Lett., 1981; 46: 1010-1012 and Peleg et al., 1996, supra). A signal enhancing metal nanoparticle can be conjugated to the enzyme (e.g. with a nanoparticle conjugated antibody, Lewis et al., Proc. Natl. Acad. Sci. USA, 1999; 96: 6700-6704), immobilised near the immobilised/localised enzyme or brought into close proximity to the SHG dye/enzyme.
A metal nanoparticle enhances the spectroscopic imaging associated with, in particular, SHG from nanometric regions, thereby permitting improved imaging at the single molecule level. Spectroscopic imaging based on Raman scattering can also be improved using a metal nanoparticle. Suitable metal nanoparticles are known, and include gold and silver nanoparticles. The nanoparticles are generally of a diameter of from 5 nm to 100 nm, preferably from 10 nm to 60 nm. The nanoparticles can be attached to the polynucleotide (or its complement if the nanoparticles are attached to nucleoside triphosphates and used in a polymerase reaction).
A roughened metal surface has also been shown to improve the sensitivity of the SHG process (Chen et al., 1981, supra and Peleg et al., 1996, supra) and is also a requirement for SERS. The metal surface is usually silver or another noble metal. An initial selective modification of the metal surface at sub-wavelength spatial resolution can be carried out using various techniques, including the use of atomic force microscopy (AFM). A platinum-coated AFM tip can be used to catalyse hydrogenation of terminal azides to amino groups that are amenable to further derivatisation (Muller et al., Science, 1995; 268: 272-273). The enzymes can then be placed into “hot spots” where high local fields exist in regions where optical modes are localised (Shalaev et al., Phys. Rep., 1996; 272:61).
In a separate embodiment of the invention, a nanoparticle can be brought into close proximity with the enzyme using an AFM cantilever tip/probe, to thereby enhance the non-linear signal.
AFM has been shown recently to be capable of having a time resolution and sensitivity applicable to the dynamic imaging of protein conformational changes (Rousso et al., J. Struc. Biol., 1997; 119: 158-164). This is utilised in a preferred embodiment of the invention, where an AFM probe/tip is positioned over the enzyme and, in combination with non-linear optical information (e.g. SHG), used to detect conformational changes of a protein due to the interaction between the enzyme and the nucleotide sequence as the enzyme moves along the target polynucleotide. The information may be collected in the far-field using conventional confocal optics or in reflection mode if used in conjunction with total internal reflection.
In a further embodiment, the non-linear signal (e.g. SHG) is monitored in the near-field using Near-Field Scanning Optical Microscopy (NSOM). NSOM is a form of scanning probe microscopy, which makes use of the optical interaction between a nanoscopic tip (as used in AFM) and a sample to obtain spatially resolved optical information. Near-field microscopy in combination with SHG has been studied extensively and shown to be surface sensitive on an atomic scale (McGilp, 1995, supra). The main advantage of using NSOM as part of the imaging system is that it allows a large increase in resolution to sub-wavelength dimensions. As the present invention relates to the conformational monitoring of a single enzyme, e.g. a polymerase enzyme, as it interacts with a polynucleotide, sub-wavelength spatial resolution is highly desirable. In the context of this aspect of the invention, it is preferable if an AFM cantilever tip is used as an apertureless near-field scanning microscope (Sandoghdar et al., J. Opt. A: Pure Appl. Opt., 1999; 523-530). This is analogous to the use of metallic nanoparticles as a source of local field enhancement. It is preferred that the tip is made out of, or coated with, a noble metal or any material which acts to increase the local electromagnetic field. Alternatively, a metallic nanoparticle may be connected directly to the cantilever tip. This has already been shown to be applicable to the monitoring of conformational changes at the single molecule level (Rousso et al., supra).
In a further separate embodiment of the present invention, an independently generated surface plasmon (or polariton)/evanescent field can be used to enhance the signal-to-noise ratio of the non-linear signal. This evanescent wave enhanced imaging technique has greater signal-to-noise ratio than, for example, SHG imaging alone. In this embodiment, the evanescently enhanced SHG field signal from the labelled enzyme can be collected in the near field by an NSOM fibre whilst simultaneously obtaining AFM conformational data, and at the same time the amount of absorbed evanescent radiation can be monitored to obtain information on the amount of coupling between the evanescent field and the labelled polymerase/SHG field.
In this configuration (NSOM collection mode) the system acts as a photon scanning tunnelling microscope (PSTM) and the evanescent or surface plasmon field is coupled into the NSOM fibre probe tip. Any attenuation in the field strength of the signal reaching the tip by the polymerase will be monitored via a detector positioned at the end of the tip.
Surface plasmon resonance is known in the art, and relies on the generation of an evanescent wave by applying an incident light beam to a prism. A typical set-up for use in this embodiment consists of a prism which is coupled optically to a metal coated glass coverslip on which an enzyme is immobilised. The coverslip is part of a microfluidic flow cell system with an inlet for introducing ligands (nucleotides) over the immobilised enzyme. The enzyme is also labelled to allow non-linear effects to be generated. An incident light beam is applied to the prism to generate the surface plasmon field. At the same time, a non-linear signal (e.g. second harmonic field) is generated by directing a pulsed near infrared laser through a polarizer and half wave plate, into an optical scanner for beam control via a filter to eliminate optical second harmonic noise, and then into the sample. The non-linear optical signal is collected with lenses and a filter and directed into a monochromator, passed to a photomultiplier tube for detection and then amplified and recorded via a computer system.
When the non-linear optical signal is coupled to that generating the evanescent field, the signal that is detected can also be the linear (evanescent) signal. In this embodiment, the NSOM can be used in the collection mode to detect the linear signal.
In a separate aspect of the present invention, the polynucleotide sequencing can be carried out within a cell.
It has been demonstrated that, in its native cellular environment, a DNA polymerase and its associated replisome complex is anchored in place (or localised in space) within the cell (Newport et al., Curr. Opin. Cell Biol., 1996; 8: 365; and Lemon et al., Science, 1998; 282: 1516-1519). This native anchored replication complex is analogous to the immobilisation of the enzyme to a solid support.
This allows the in vivo monitoring of conformational and template sequence-related changes of replisome-related molecules at the single molecule level to be carried out in real-time during DNA replication and/or cell division.
In order to carry out this aspect, it is necessary to modify the enzyme so that it can be imaged using nonlinear optical detection techniques. This can be achieved by genetic fusion of the enzyme with, for example, green fluorescent protein (GFP). The cell should also be immobilised to permit detection to occur.
The expressed fusion protein can be monitored/detected at its anchored cellular location via the application of non-linear optical detection (second harmonic generation).
The following Example illustrates the invention.
In this experiment, a fusion protein of Green Fluorescent Protein (GFP) and a polymerase was created via recombinant techniques well known in the art.
Quartz chips (14 mm in diameter, 0.3 mm thick) were spin-coated with a 50 nm thick layer of gold and then coated with a layer of planar dextran. These gold coated quartz chips were then placed into the fluid cell of a custom built Nearfield Scanning Optical Microscope (NSOM). The gold-coated quartz chips were coupled optically to a quartz prism via index matching oil. The fluid cell was then sealed and polymerase buffer was then allowed to flow over the chip.
Immobilisation of the polymerase to the chip surface was carried out according to Jonsson et al., Biotechniques, 1991; 11:620-627. The chip environment was equilibrated with running buffer (10 mM HEPES, 1 mM MgCl2, 150 mM NaCl, 0.05% surfactant P20, pH 7.4). Equal volumes of N-hydroxysuccinimide (0.1 M in water) and N-ethyl-N′-(dimethylaminopropyl) carbodiimide (EDC) (0.1 M in water) were mixed together and injected across the chip surface, to activate the carboxymethylated dextran. The polymerase-GFP fusion protein (150 μl) was mixed with 10 mM sodium acetate (100 μl, pH 5) and injected across the activated surface. Finally, residual N-hydroxysuccinimide esters on the chip surface were reacted with ethanolamine (35 μl, 1 M in water, pH 8.5), and non-bound polymerase was washed from the surface. The immobilization procedure was performed with a continuous flow of running buffer (5 μl/min) at a temperature of 25° C.
50 μl of antibody binding buffer (10 mM MES pH6.0, 50 mM NaCl, 3 mM EDTA) was flowed over the immobilized polymerase/GFP on the chip surface at a flow rate of 5 μl/min at 25° C. A primary antibody (GFP (B-2)B biotin conjugated 200 μl ml-1, Santa Cruz Biotechnology) was diluted 1:3000 in antibody binding buffer and allowed to flow over the chip surface at a flow rate of 5 μl/min for 30 minutes. Excess antibody was then washed off the surface by flowing antibody binding buffer over the chip at a flow rate of 5 μl/min for 30 minutes.
A secondary antibody (Immunogold conjugate EM Goat antimouse IgG (H+L) 40 nm, British Biocell International) was diluted 1:1000 in antibody binding buffer and allowed to flow over the chip surface at a flow rate of 5 μl/min for 30 minutes. Excess antibody was then washed off the surface by flowing antibody binding buffer over the chip at a flow rate of 5 μl/min for 30 minutes. The buffer was then returned to running buffer which was then allowed to flow over the chip at a rate of 5 μl/min for 30 minutes before initiation of the next stage.
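The reagent volumes implied by the antibody flow steps above follow from flow rate multiplied by duration. The arithmetic can be sketched as below; the helper names are hypothetical, a constant flow rate is assumed, and a dilution of "1:3000" is assumed to mean one part stock in 3000 parts final volume, which the text does not specify.

```python
def flowed_volume_ul(flow_rate_ul_per_min, duration_min):
    """Volume (in microlitres) passed over the chip at a constant flow rate."""
    return flow_rate_ul_per_min * duration_min


def stock_needed_ul(final_volume_ul, dilution_factor):
    """Stock volume needed, assuming 1:N means 1 part stock in N parts total."""
    return final_volume_ul / dilution_factor


# Each antibody or wash step runs at 5 ul/min for 30 minutes.
step_volume = flowed_volume_ul(5, 30)
primary_stock = stock_needed_ul(step_volume, 3000)    # primary antibody, 1:3000
secondary_stock = stock_needed_ul(step_volume, 1000)  # secondary antibody, 1:1000

print(step_volume, primary_stock, secondary_stock)
```

Under these assumptions each 30 minute step consumes 150 μl of diluted reagent, so only sub-microlitre quantities of antibody stock pass over the chip; in practice a larger volume would be prepared to allow for dead volume in the tubing.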
Two oligonucleotides were synthesized using standard phosphoramidite chemistry. The oligonucleotide defined as SEQ ID NO. 1 was used as the target polynucleotide, and the oligonucleotide defined as SEQ ID NO. 2 was used as the primer.
In order to detect the conformational changes in the polymerase, a modified NSOM was used in tapping mode, with pulled quartz multimode 100 μm long fibre cantilevers. The cantilever was driven close to its resonant frequency and an initial area scan was carried out over the surface of the chip containing immobilized antibodies. The second harmonic signal was generated from the immobilized polymerase in the flow cell via initial illumination from a pulsed near-infrared laser source. The NSOM tip was then scanned over the chip surface in the flow cell in order to obtain an image of the 40 nm gold particles in the flow cell that are associated with the polymerase. The tip was then held in stationary mode over the polymerase.
The pre-initiated pre-primed complex was then injected into the flow cell at a flow rate of 5 μl/min so that the “clamp” around the primer-template molecule forms a complex with the immobilized polymerase. The flow cell was maintained at 25° C. by a cooling device built into the flow cell.
The running buffer was then flushed continuously through the flowcell at 500 μl/min. After 10 minutes the sequencing reaction was initiated by injection of 0.4 mM dATP (8 μl) into the buffer at a flow rate of 500 μl/min. After 4 minutes 0.4 mM dTTP (8 μl) was injected into the flowcell. Then after another 4 minutes 0.4 mM dGTP (8 μl) was injected and after another 4 minutes 0.4 mM dCTP (8 μl) was injected. This cycle was then repeated 10 times. Over the entire time period the second harmonic signal transmitted via the multimode fibre was passed into a monochromator and then into a photomultiplier. The signal from the photomultiplier was then amplified and fed into a computer for processing and storage.
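The injection timetable described above can be written out programmatically. The following is a minimal sketch: the function names are hypothetical, and it assumes that the 4 minute spacing also applies between the last injection of one cycle and the first of the next, and that "repeated 10 times" means ten cycles in total; neither point is stated explicitly.

```python
NUCLEOTIDES = ("dATP", "dTTP", "dGTP", "dCTP")  # injection order given in the Example


def injection_schedule(first_injection_min=10, interval_min=4, n_cycles=10,
                       order=NUCLEOTIDES):
    """Return a list of (time_min, cycle_number, nucleotide) tuples."""
    schedule = []
    t = first_injection_min
    for cycle_number in range(1, n_cycles + 1):
        for nucleotide in order:
            schedule.append((t, cycle_number, nucleotide))
            t += interval_min  # assumed constant spacing, including between cycles
    return schedule


def injected_amount_nmol(concentration_mM, volume_ul):
    """Amount injected: 1 mM equals 1 nmol per microlitre."""
    return concentration_mM * volume_ul


schedule = injection_schedule()
for entry in schedule[:5]:
    print(entry)
print(injected_amount_nmol(0.4, 8))  # amount of each triphosphate per injection
```

Each 8 μl injection at 0.4 mM delivers about 3.2 nmol of triphosphate, and under the stated assumptions the full run comprises 40 injections.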
The intensity change of second harmonic signal arising from the polymerase complex for a period of 10 seconds from the start of each injection was then calculated and plotted against nucleotide injected into the flow cell. The results of the sequencing reaction are shown in
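The analysis step described above, an intensity change over the 10 seconds following each injection, can be sketched as follows. This is an illustrative sketch only: the source does not define how the change was computed, so the code assumes it is the mean signal in the 10 s window after an injection minus the mean signal in an equal window before it. The function names are hypothetical and the trace is synthetic, not experimental data.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def intensity_changes(times_s, signal, injections, window_s=10.0, baseline_s=10.0):
    """Return [(nucleotide, change)] for each (injection_time_s, nucleotide).

    change = mean(signal after injection) - mean(signal before injection);
    this definition is an assumption, not taken from the source.
    """
    results = []
    for t_inj, nucleotide in injections:
        before = [s for t, s in zip(times_s, signal)
                  if t_inj - baseline_s <= t < t_inj]
        after = [s for t, s in zip(times_s, signal)
                 if t_inj <= t < t_inj + window_s]
        if not before or not after:
            continue  # skip injections without enough surrounding data
        results.append((nucleotide, mean(after) - mean(before)))
    return results


# Synthetic trace sampled once per second: a step in signal after the first
# injection only, mimicking an incorporation event (made-up numbers).
times = list(range(60))
trace = [1.5 if 20 <= t < 30 else 1.0 for t in times]
changes = intensity_changes(times, trace, [(20, "dATP"), (40, "dTTP")])
print(changes)
```

Plotting the resulting changes against the injected nucleotide gives the kind of per-injection profile the Example refers to; a threshold on the change would then be needed to decide which injections correspond to incorporation, and choosing that threshold is outside the scope of this sketch.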