US 20070020653 A1
The present invention relates to DNA polymerases. In particular the invention relates to a method for the generation of DNA polymerases exhibiting a relaxed substrate specificity. Uses of mutant polymerases produced using the methods of the invention are also described.
1. A method for the generation of an engineered DNA polymerase with an expanded substrate range which method comprises the step of preparing and expressing nucleic acid encoding an engineered DNA polymerase, wherein said preparing comprises the use of template DNA polymerase nucleic acid and primers which bear one or more distorting 3′ termini.
2. A method for the generation of an engineered DNA polymerase with an expanded substrate range which comprises the steps of:
(a) preparing nucleic acid encoding an engineered DNA polymerase, wherein the polymerase is generated using a repertoire of nucleic acid molecules encoding one or more DNA polymerases and primers which bear distorting 3′ termini;
(b) compartmentalising the nucleic acid of step (a) into microcapsules;
(c) expressing the nucleic acid to produce their respective DNA polymerase within the microcapsules;
(d) sorting the nucleic acid encoding the engineered DNA polymerase which exhibits an expanded substrate range; and
(e) expressing the engineered DNA polymerase which exhibits an expanded substrate range.
3. The method of
4. The method of
5. The method of
6. The method of
7. The method of
8. The method of
9. The method of
10. The method of
11. The method of
12. The method of
13. The method of
14. The method of
15. The method of
16. The method of
17. The method of
18. The method of
19. An engineered DNA polymerase which exhibits an expanded substrate range relative to the wild-type form of said DNA polymerase.
20. An engineered DNA polymerase produced by the method of
21. An engineered DNA polymerase which exhibits an expanded substrate range produced by the method of
22. The engineered DNA polymerase of
23. The engineered DNA polymerase of
24. An isolated DNA polymerase as shown in
25. An isolated nucleic acid encoding a DNA polymerase with expanded substrate range, said nucleic acid comprising the sequence shown in
26. A pol A DNA polymerase with an expanded substrate range, wherein the polymerase exhibits at least 95% identity to one or more of the amino acid sequences designated M1 and M4 as shown in
27. A pol A DNA polymerase which is capable of mismatch extension, wherein the DNA polymerase comprises the amino acid sequence of any one or more of the clones designated herein as 3B5, 3B8, 3C12 and 3D1.
28. The DNA polymerase of
29. A pol A DNA polymerase which is capable of abasic site bypass, wherein the DNA polymerase comprises the amino acid sequence of any one of the clones designated herein as 3A10, 3B6 or 3B11.
30. The DNA polymerase of
31. A pol A DNA polymerase which is capable of DNA replication involving the incorporation of un-natural base analogues into the newly replicated DNA, wherein the pol A DNA polymerase comprises the amino acid sequence of a clone designated herein as 4D11 or 5D4.
32. The pol A DNA polymerase of
33. A pol A DNA polymerase with an expanded substrate range, wherein the polymerase exhibits at least 95% identity to one or more of the amino acid sequences herein designated 3B5, 3B8, 3C12, 3D1, 3A10, 3B6, 3B11, 4D11 and 5D4 which comprises a mutation, relative to one of parent genes Taq, Tth, or Tfl, found in clones 3B5, 3B8, 3C12, 3D1, 3A10, 3B6, 3B11, 4D11 and 5D4 disclosed herein.
34. A nucleic acid construct encoding an engineered polymerase of
35. A nucleic acid construct encoding an engineered pol A DNA polymerase which exhibits an expanded substrate range, wherein said pol A DNA polymerase is the polymerase shown in
36. A vector comprising a nucleic acid construct of
37. A vector comprising a nucleic acid construct of
38. A method of producing a polynucleotide, the method comprising contacting a template nucleic acid with a DNA polymerase of
39. The method of
40. A method of producing a polynucleotide, the method comprising contacting a template nucleic acid with a blend of DNA polymerases of
41. The method of
42. The method of
43. The method of
44. The method of
45. The method of
This application is a continuation of Application No. PCT/GB04/004643, which was filed on 3 Nov. 2004, which designated the United States and was published in English, and which claims the benefit of United Kingdom Applications GB041087.8, filed 14 May 2004, and GB0325650.0, filed 3 Nov. 2003. The entire teachings of the above applications are incorporated herein by reference.
The present invention relates to DNA polymerases. In particular the invention relates to a method for the generation of DNA polymerases which exhibit a relaxed substrate specificity. Uses of engineered polymerases produced using the methods of the invention are also described.
Accurate DNA replication is of fundamental importance to all life, ensuring the maintenance and transmission of the genome and limiting tumorigenesis in higher organisms. High-fidelity DNA polymerases perform an astonishing feat of molecular recognition, incorporating the correct nucleotide triphosphate (dNTP) substrate molecules as specified by the template base with minimal error rates. For example, even without exonucleolytic proofreading, the replicative DNA polymerase III from E. coli on average only makes one error in ~10^5 base pairs (Schaaper JBC 1993).
As energetic differences between correctly paired and mispaired nucleotides per se are much too small to give rise to a 10^5-fold discrimination, the structure of the polymerase active site in high-fidelity polymerases has evolved to enhance those differences. Recent structural studies of the A-family (Pol I-like) DNA polymerases from Thermus aquaticus (Taq) (Li 98), phage T7 (Ellenberger) and B. stearothermophilus (Bst) (Beese) in particular have revealed how conformational changes during the catalytic cycle may exclude non-cognate base-pairing geometries because of steric clashes within the closed active site. As a result of these tight steric constraints, not only are mismatched nucleotides excluded but catalysis becomes exquisitely sensitive to even slight distortions in the primer-template duplex. This precludes or greatly diminishes the replication of modified or damaged DNA templates, the incorporation of modified or unnatural deoxynucleotide triphosphates (dNTPs) and the extension of mismatched or unnatural 3′ termini.
While desirable in nature, such stringent substrate discrimination is limiting for many applications in biotechnology. Specifically, it restricts the use of unnatural or modified nucleotide bases and the applications they enable. It also precludes the efficient PCR amplification of damaged DNA templates.
Some other naturally occurring polymerases are less stringent with regard to their substrate specificity. For example, viral reverse transcriptases like HIV-1 reverse transcriptase or AMV reverse transcriptase, and polymerases capable of translesion synthesis such as the polY-family polymerases pol ι (Vaisman et al, 2001, JBC) or pol κ (Washington (2002), PNAS) or the unusual polB-family polymerase pol ζ (Johnson, Nature), all extend 3′ mismatches with elevated efficiency compared to high fidelity polymerases. The disadvantage of the use of translesion synthesis polymerases for biotechnological uses is that they depend on cellular processivity factors, such as PCNA, for their activity. Moreover, such polymerases are not stable at the temperatures at which certain biotechnological techniques, such as PCR, are performed. Furthermore, most translesion synthesis polymerases have a much reduced fidelity, which would severely compromise their utility for cloning.
Using another approach, the availability of high-resolution structures has guided efforts to rationally alter the substrate specificity of high fidelity DNA polymerases by site-directed mutagenesis, e.g. to increase acceptance of dideoxy- (ddNTPs) (Li 99) or ribonucleotides (rNTPs) (Astatke 98). In vivo complementation followed by screening has also yielded polymerase variants with increased rNTP incorporation and limited bypass of template lesions (Patel 01). Recently, two different in vitro strategies for selection of polymerase activity have been described (Jestin 00, Ghadessy 01, Xia 02). One is based on the proximal attachment of polymerase and template-primer duplex on the same phage particle and has allowed the isolation of mutants of Taq polymerase which incorporate rNTPs and dNTPs with comparable efficiency (Xia 02). However, such methods are complex, error-prone and laborious.
Recently, the technique of compartmentalized self-replication (CSR) (Ghadessy 01), which is based on the self-replication of polymerase genes by the encoded polymerases within discrete, non-communicating compartments has allowed the selection of mutants of Taq polymerase with increased thermostability and/or resistance to the potent inhibitor heparin (Ghadessy et al 01).
However, there still remains a need in the art for an efficient and simple method for relaxing the substrate specificity of high fidelity DNA polymerases whilst maintaining high catalytic turnover and processivity of DNA fragments up to several tens of kb. Such polymerases will be of particular use in applications such as PCR amplification and sequencing of damaged DNA templates, for the incorporation of unnatural base analogues into DNA (such as is required for sequencing or array labelling) and as a starting point for the creation of novel polymerase activities using compartmentalised self replication or other methods.
The present inventors modified the principles of directed evolution (in particular compartmentalised self replication) described in GB 97143002, GB 98063936 and GB 01275643 in the name of the present inventors, to relax the steric control of high fidelity DNA polymerases and consequently to expand the substrate range of such polymerases. All of the documents listed above are herein incorporated by reference.
They surprisingly found that by performing the technique of compartmentalised self replication referenced above, using repertoires of randomly mutated Taq genes and flanking primers bearing the mismatches A*G and C*C at their 3′ terminus/end, mutants were generated which not only exhibited the ability to extend the A*G and C*C transversion mismatches used in the CSR selection, but also surprisingly exhibited a generic ability to extend mispaired 3′ termini. This finding is especially significant since Taq polymerase is not able to extend 3′ mismatches (Kwok et al. (1990); Huang (1992)).
The mutant polymerases generated also exhibit high catalytic turnover, comparable to that of other high fidelity polymerases, and are capable of efficient amplification of DNA fragments up to 26 kb.
Thus in a first aspect the present invention provides a method for the generation of an engineered DNA polymerase with an expanded substrate range which comprises the step of preparing and expressing nucleic acid encoding an engineered DNA polymerase utilising template nucleic acid and flanking primers which bear one or more distorting 3′ termini/ends.
As herein defined ‘flanking primers which bear a 3′ distorting terminus/end’ refer to those primers which possess at their 3′ ends one or more group/s, preferably nucleotide group/s which deviate from cognate base-pairing geometry. Such deviations from cognate base-pairing geometry includes but is not limited to: nucleotide mismatches, base lesions (i.e. modified or damaged bases) or entirely unnatural, synthetic base substitutes. According to the above aspects of the invention, advantageously, the flanking primer/s bear one or more nucleotide mismatches at their 3′ end/terminus.
Advantageously, according to the above aspects of the invention the flanking primers may have one, two, three, four, or five or more nucleotide mismatches at the 3′ primer end. More advantageously, the one or more nucleotide mismatches are consecutive mismatches. More advantageously, according to the above aspects of the invention, the flanking primers have one or two nucleotide mismatches at the 3′ primer end. Most preferably according to the above aspects of the invention, the flanking primers have one nucleotide mismatch at their 3′ primer end.
More specifically the term ‘distorting 3′ termini/ends’ includes within its scope the phenomenon whereby, for example, either the 3′ terminal base (1-mismatch) or the 3′ terminal and upstream base (2-mismatch, 3-mismatch, 4-mismatch and so on) are not complementary to the template base. Preferably mismatches are transversion mismatches i.e. apposing purines with purines and pyrimidines with pyrimidines. Preferably transversion mismatches are G.A and C.C. This type of primer terminus distortion is referred to herein as ‘primer mismatch distortion’.
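The distinction drawn above between transversion mismatches (purine opposite purine, or pyrimidine opposite pyrimidine) and other mispairs can be illustrated with a minimal sketch. The function name `classify_pair` and the returned labels are hypothetical names chosen for illustration only; they do not appear in the specification, and the sketch covers only the four natural DNA bases.

```python
# Minimal sketch: classify the base pair formed between a primer 3' base
# and the template base it apposes. Names are illustrative assumptions.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def classify_pair(primer_base: str, template_base: str) -> str:
    """Return 'cognate', 'transversion mismatch' or 'transition mismatch'."""
    p, t = primer_base.upper(), template_base.upper()
    valid = PURINES | PYRIMIDINES
    if p not in valid or t not in valid:
        raise ValueError("only the natural bases A, C, G and T are handled")
    if (p, t) in WATSON_CRICK:
        return "cognate"
    # Purine opposite purine, or pyrimidine opposite pyrimidine.
    same_class = {p, t} <= PURINES or {p, t} <= PYRIMIDINES
    return "transversion mismatch" if same_class else "transition mismatch"


if __name__ == "__main__":
    for pair in [("G", "A"), ("C", "C"), ("G", "T"), ("A", "T")]:
        print(f"{pair[0]}*{pair[1]}: {classify_pair(*pair)}")
```

Under this classification the preferred G*A and C*C pairs both fall into the transversion category, whereas a purine opposite a non-complementary pyrimidine (for example G*T) does not.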
In addition, and as alluded to above, the term ‘flanking primers bearing distorting 3′ termini/ends’ includes within its scope flanking primers bearing one or more unnatural base analogues at the 3′ termini/end of the one or more flanking primers so that distortion of the cognate DNA duplex geometry is created.
The method of the invention may be used to expand the substrate range of any DNA polymerase which lacks an intrinsic 3′-5′ exonuclease proofreading activity or where a 3′-5′ exonuclease proofreading activity has been disabled, e.g. through mutation. Suitable DNA polymerases include polA, polB (see e.g. Patel & Loeb, Nature Struc Biol 2001), polC, polD, polY, polX and reverse transcriptases (RT) but preferably are processive, high-fidelity polymerases.
Advantageously, an engineered DNA polymerase with an expanded substrate range according to the invention is generated from a pol A-family DNA polymerase.
Advantageously, the DNA polymerase is generated from a repertoire of pol A DNA polymerase nucleic acid as template nucleic acid. Preferably the pol A polymerase is Taq polymerase and the flanking primers used in the generation of the polymerase are one or more of those primers selected from the group consisting of the following:
5′-CAG GAA ACA GCT ATG ACA AAA ATC TAG ATA ACG AGG GA-3′ (A*G mismatch);
5′-GTA AAA CGA CGG CCA GTA CCA CCG AAC TGC GGG TGA CGC CAA GCC-3′ (C*C mismatch).
More advantageously, according to the above aspect of the invention, the nucleic acid encoding the engineered polymerase according to the invention is generated using PCR using one or more flanking primers listed herein.
Advantageously, the method of the present invention involves the use of compartmentalised self replication, and consists of the steps listed below:
Most advantageously, the method of the invention comprises the use of one or more DNA polymerases and flanking primers which bear one or more nucleotide mismatches at their 3′ primer ends.
According to the above aspects of the invention, the term ‘engineered DNA polymerase’ refers to a DNA polymerase which has a nucleic acid sequence which is not 100% identical at the nucleic acid level to the one or more DNA polymerase/s or fragments thereof, from which it is derived, and which is synthetic. According to the invention, an engineered DNA polymerase may belong to any family of DNA polymerase.
Advantageously, an engineered DNA polymerase according to the invention is a pol A DNA polymerase. As referred to above the term ‘engineered DNA polymerase’ also includes within its scope fragments, derivatives and homologues of an ‘engineered DNA polymerase’ as herein defined so long as it exhibits the requisite property of possessing an expanded substrate range as defined herein. In addition, it is an essential feature of the present invention that an engineered DNA polymerase according to the invention does not include a polymerase with a 3′-5′ exonuclease activity under the conditions used for the polymerisation reaction. (This definition includes polymerases in which the 3′-5′ exonuclease is not part of the polymerase polypeptide chain but is associated non-covalently with the active polymerase.) Such a proofreading activity would remove any 3′ mismatches incorporated according to the method of the invention, and thus would prevent a polymerase according to the invention possessing an expanded substrate range as defined herein.
As defined herein the term ‘expanded substrate range’ (of an engineered DNA polymerase) means that the substrate range of an engineered DNA polymerase according to the present invention is broader than that of the one or more DNA polymerases, or fragments thereof, from which it is derived. The term ‘a broader substrate range’ refers to the ability of an engineered polymerase according to the present invention to extend one or more 3′ distorting ends, advantageously transversion mismatches (purine*purine, pyrimidine*pyrimidine), for example A*A, C*C, G*G, T*T and G*A, which the one or more polymerase/s from which it is derived cannot extend. That is, essentially, a DNA polymerase which exhibits a relaxed substrate range as herein defined has the ability not only to extend the 3′ distorting ends used in its generation (i.e. those of the flanking primers) but also exhibits a generic ability to extend 3′ distorting ends (for example A*G, A*A, G*G mismatches). Preferably, ‘expanded substrate range’ (of an engineered DNA polymerase) includes a wider spectrum of unnatural nucleotide substrates including αS dNTPs, dye-labelled nucleotides, damaged DNA templates and so on. More details are given in the Examples.
According to the above aspect of the invention advantageously the DNA polymerase generated using CSR technology is a pol A polymerase and it is generated using flanking primers selected from the group consisting of the following:
5′-CAG GAA ACA GCT ATG ACA AAA ATC TAG ATA ACG AGG GA-3′ (A*G mismatch);
5′-GTA AAA CGA CGG CCA GTA CCA CCG AAC TGC GGG TGA CGC CAA GCC-3′ (C*C mismatch).
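As an illustrative aid, the two flanking primers listed above can be normalised and inspected with a short sketch. The dictionary keys and helper names (`normalise`, `three_prime_base`, `summarise`) are hypothetical and chosen only for this example. The sketch reports each primer's length and 3′ terminal base. It does not verify the mismatch itself, since the template sequence is not reproduced in this passage.

```python
# Minimal sketch: report length and 3' terminal base of the two flanking
# primers given in the text. Helper names are illustrative assumptions.
PRIMERS = {
    "A*G mismatch primer": "CAG GAA ACA GCT ATG ACA AAA ATC TAG ATA ACG AGG GA",
    "C*C mismatch primer": (
        "GTA AAA CGA CGG CCA GTA CCA CCG AAC TGC GGG TGA CGC CAA GCC"
    ),
}


def normalise(sequence: str) -> str:
    """Remove the triplet spacing used in the text and upper-case the bases."""
    return "".join(sequence.split()).upper()


def three_prime_base(sequence: str) -> str:
    """Return the last base, i.e. the 3' terminus of a 5'->3' written primer."""
    return normalise(sequence)[-1]


def summarise(primers: dict) -> dict:
    """Map each primer name to (length in nucleotides, 3' terminal base)."""
    return {
        name: (len(normalise(seq)), three_prime_base(seq))
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, (length, base) in summarise(PRIMERS).items():
        print(f"{name}: {length} nt, 3' terminal base {base}")
```

The 3′ terminal bases reported (A and C respectively) are the primer-side bases of the A*G and C*C mismatches named in the text.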
One skilled in the art will appreciate that in essence, any DNA polymerase flanking primer which incorporates a 3′ mismatch will work with any suitable repertoire. The process of mismatch extension will vary in characteristics from polymerase to polymerase, and will also vary according to the experimental conditions. For example, G*A and C*C are the most disfavoured mismatches for extension by Taq polymerase (Huang et al, 92). Other mismatches are favoured for extension by other polymerases and this can be routinely determined by the skilled person.
One skilled in the art will also appreciate that it is an essential feature of the present invention that the methods described herein will only work for polymerases which are devoid of 3′-5′ exonuclease proofreading activity under the conditions used for the polymerisation reaction, as such activity would result in the removal of the incorporated mismatches.
Using the method of the invention, the present inventors generated a number of pol A polymerase mutants. Two of the mutants named M1 and M4 not only exhibit the ability to extend the G*A and C*C transversion mismatches used in the CSR selection, but also surprisingly exhibit a generically enhanced ability to extend 3′ mismatched termini.
Thus in a further aspect the present invention provides an engineered DNA polymerase which exhibits an expanded substrate range. Preferably such an engineered polymerase is obtainable using one or more method/s of the present invention.
According to the above aspect of the invention, preferably the DNA polymerase is a pol A polymerase.
According to the above aspect of the invention, preferably the engineered DNA polymerase is obtained using the method of the invention.
In a further aspect still, the present invention provides a pol A DNA polymerase with an expanded substrate range, or the nucleic acid encoding it, wherein the DNA polymerase is designated M1 or M4 as shown in
According to the above aspect of the invention, preferably the engineered DNA polymerase as herein defined is that polymerase designated M1 in
In yet a further aspect the invention provides a pol A DNA polymerase with an expanded substrate range, wherein the polymerase exhibits at least 95% identity to one or more of the amino acid sequences designated M1 and M4 as shown in
Advantageously, the invention provides a pol A DNA polymerase with an expanded substrate range, or the nucleic acid encoding it, wherein the polymerase exhibits at least 95% identity to one or more of the amino acid sequences designated M1 and M4 as shown in
Most advantageously, the invention provides a pol A DNA polymerase with an expanded substrate range, or the nucleic acid encoding it, wherein the polymerase exhibits at least 95% identity to one or more of the amino acid sequences designated M1 and M4 as shown in
According to the above aspect of the invention the mutation ‘E520G’ describes a DNA polymerase according to the invention in which glycine is present at position 520 of the amino acid sequence. The present inventors were surprised to find that E520, which is located at the tip of the thumb domain at a distance of 20 Å from the 3′OH of the mismatched primer terminus, would be involved in mismatch recognition or extension. However, as the present inventors have demonstrated, the mutation of E520 to G520 is clearly important in such roles. This aspect of the invention is described further in the detailed description of the invention.
The present inventors consider that the method of the invention is applicable to the generation of ‘blends’ of engineered DNA polymerases with an expanded substrate range. According to the present invention the term a ‘blend’ of more than one polymerase refers to a mixture of 2 or more, 3 or more, 4 or more, or 5 or more engineered polymerases. Preferably the term ‘blends’ refers to a mixture of 6, 7, 8, 9 or 10 or more ‘engineered polymerases’.
It is important to note that the extension of mismatched 3′ primer termini is a feature of naturally occurring polymerases. Viral reverse transcriptases (RT) like HIV-1 RT or AMV RT and polymerases capable of translesion synthesis (TLS) such as the polY-family polymerases pol ι (Vaisman 2001 JBC) or pol κ (Washington 2002 PNAS) or the unusual polB-family polymerase pol ζ (Johnson Nature), all extend 3′ mismatches with elevated efficiency compared to high-fidelity polymerases. Thus, the mutant polA polymerases according to the present invention share significant functional similarities with other polymerases found in nature but so far represent the only known members of the polA-family polymerases that are proficient in mismatch extension (ME) and translesion synthesis (TLS).
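The substitution notation used above (wild-type residue, position, replacement residue, as in ‘E520G’) can be made concrete with a small sketch. The names `Substitution`, `parse_substitution` and `apply_substitution` are hypothetical, and the three-residue sequence in the usage example is a toy string, not any polymerase sequence from the specification.

```python
# Minimal sketch: parse a label such as 'E520G' and apply it to a sequence
# using 1-based residue numbering. All names are illustrative assumptions.
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str  # residue expected in the parent sequence
    position: int   # 1-based position in the amino acid sequence
    mutant: str     # residue present in the engineered sequence


_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label like 'E520G' into its three components."""
    match = _PATTERN.match(label.strip().upper())
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(sequence: str, sub: Substitution) -> str:
    """Return a copy of `sequence` carrying the substitution."""
    index = sub.position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("position lies outside the sequence")
    if sequence[index] != sub.wild_type:
        raise ValueError("parent residue does not match the label")
    return sequence[:index] + sub.mutant + sequence[index + 1:]


if __name__ == "__main__":
    print(parse_substitution("E520G"))
    # Toy example only: 'E3G' applied to a three-residue string.
    print(apply_substitution("MKE", parse_substitution("E3G")))
```

Checking the parent residue before substituting guards against applying a label to a sequence numbered differently from the one the label was defined on.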
In contrast to TLS polymerases, which are distributive and depend on cellular processivity factors such as PCNA, M1 and M4 combine mismatch extension (ME) and translesion synthesis (TLS) with high processivity and in the case of M1 are capable of efficient amplification of DNA fragments of up to 26 kb.
In a further aspect still the present invention provides a nucleic acid construct which is capable of encoding a pol A DNA polymerase which exhibits an expanded substrate range, wherein said pol A DNA polymerase is depicted in
According to the above aspect of the invention, preferably the nucleic acid construct encodes the M1 pol A polymerase as described herein.
In a further aspect the invention provides a pol A DNA polymerase with an expanded substrate range, in particular which is capable of mismatch extension, wherein the DNA polymerase comprises, preferably consists of the amino acid sequence of any one or more of the clones designated herein as 3B5, 3B8, 3C12 and 3D1.
In yet a further aspect the invention provides a pol A DNA polymerase with an expanded substrate range, in particular which is capable of abasic site bypass, wherein the DNA polymerase comprises, preferably consists of the amino acid sequence of any one or more of the clones designated herein as 3A10, 3B6 and 3B11.
In a further aspect still the invention provides a pol A DNA polymerase with an expanded substrate range, in particular which is capable of DNA replication involving the incorporation of unnatural base analogues into the newly replicated DNA, wherein the pol A DNA polymerase comprises, preferably consists of the amino acid sequence of any one or more of the clones designated herein as 4D11 and 5D4.
In a further aspect the present invention provides a pol A DNA polymerase with an expanded substrate range, wherein the polymerase exhibits at least 95% identity to one or more of the amino acid sequences designated 3B5, 3B8, 3C12, 3D1, 3A10, 3B6, 3B11, 4D11 and 5D4, and which comprises any one or more of the mutations (with respect to any of the three parent genes Taq, Tth, Tfl) or gene segments found in clones 3B5, 3B8, 3C12, 3D1, 3A10, 3B6, 3B11, 4D11 and 5D4.
In a further aspect still, the present invention provides a vector comprising a nucleic acid construct according to the present invention.
In a further aspect still the present invention provides the use of a DNA polymerase according to the present invention in any one or more of the following applications selected from the group consisting of the following: PCR amplification, sequencing of damaged DNA templates, the incorporation of unnatural base analogues into DNA and the creation of novel polymerase activities.
According to the above aspect of the invention, preferably the use is of a ‘blend’ of DNA polymerases according to the invention or selected according to the method of the invention.
The use of blends of polymerases will be familiar to those skilled in the art and is described in Barnes, W. M. (1994) Proc. Natl. Acad. Sci. USA 91, 2216-2220 which is herein incorporated by reference.
According to the above aspect of the invention, preferably the DNA polymerase is a pol A DNA polymerase. Advantageously, it is generated using CSR technology using flanking primers bearing one or more 3′ mismatch pairs of interest as described herein. Other suitable methods include screening after activity preselection (see Patel & Loeb 01) and phage display with proximity coupled template-primer duplex substrate (Jestin 01, Xia 02). CSR is also ideally suited, as the present inventors have demonstrated.
According to the above aspect of the invention, preferably the use of a polymerase according to the invention is in PCR amplification and the polymerase is M1 as herein described.
According to the above aspect of the invention, advantageously, the creation of novel polymerase activities is produced using the technique of compartmentalised self replication as described herein.
The term ‘engineered DNA polymerase’ refers to a DNA polymerase which has a nucleic acid sequence which is not 100% identical at the nucleic acid level to the one or more DNA polymerase/s or fragments thereof, from which it is derived, and which has been generated using one or more biotechnological methods. Advantageously, an engineered DNA polymerase according to the invention is a pol-A family DNA polymerase or a pol-B family DNA polymerase. More advantageously, an engineered DNA polymerase according to the invention is a pol-A family DNA polymerase. As referred to above the term ‘engineered DNA polymerase’ also includes within its scope fragments, derivatives and homologues of an ‘engineered DNA polymerase’ as herein defined so long as it exhibits the requisite property of possessing an expanded substrate range as defined herein. In addition, it is an essential feature of the present invention that an engineered DNA polymerase according to the invention does not include a polymerase with a 3′-5′ exonuclease activity under the conditions used for the polymerisation reaction. Such a proofreading activity would remove any 3′ mismatches incorporated according to the method of the invention, and thus would prevent a polymerase according to the invention possessing an expanded substrate range as defined herein.
As herein defined ‘flanking primers which bear a 3′ distorting terminus’ refer to those DNA polymerase primers which possess at their 3′ ends one or more group/s, preferably nucleotide group/s which deviate from cognate base-pairing geometry. Such deviations from cognate base-pairing geometry include but are not limited to: nucleotide mismatches, base lesions (i.e. modified or damaged bases) or entirely unnatural, synthetic base substitutes at the 3′ end of a flanking primer used according to the methods of the invention. According to the above aspects of the invention, advantageously, the flanking primer/s bear one or more nucleotide mismatches at their 3′ end. Advantageously, according to the above aspects of the invention the flanking primers may have one, two, three, four, or five or more nucleotide mismatches at the 3′ primer end. Preferably according to the above aspects of the invention, the flanking primers have one or two nucleotide mismatches at the 3′ primer end. Most preferably according to the above aspects of the invention, the flanking primers have one nucleotide mismatch at their 3′ primer end.
As defined herein the term ‘expanded substrate range’ (of an engineered DNA polymerase) means that the substrate range of an engineered DNA polymerase according to the present invention is broader than that of the one or more DNA polymerases, or fragments thereof, from which it is derived. The term ‘a broader substrate range’ refers to the ability of an engineered polymerase according to the present invention to extend one or more 3′ distorting ends, advantageously transversion mismatches (purine*purine, pyrimidine*pyrimidine), for example A*A, C*C, G*G, T*T and G*A, which the one or more polymerase/s from which it is derived cannot extend. That is, essentially, a DNA polymerase which exhibits a relaxed substrate range as herein defined has the ability not only to extend the 3′ distorting ends used in its generation (i.e. those of the flanking primers) but also exhibits a generic ability to extend 3′ distorting ends (for example A*G, A*A, G*G mismatches).
(A) On the template containing an abasic site, wtTaq efficiently inserted a base opposite the lesion, but further extension was negligible. In contrast, M1 is capable of both insertion opposite the abasic site and lesion bypass. Of the four mismatch extension polymerases, polymerases A10 and D1 clearly display better abasic site bypass than either wtTaq or M1, with a number of other polymerases displaying improved abasic site activity (notably C12).
(B) Polymerase A10 was chosen for further investigation and displays superior elongation and bypass when compared to wild type for both the abasic site and the CPD.
(A) Schematic representation of two step nested PCR. In the first round a pair of outer primers (represented in green) are used; in the second step a pair of nested inner primers (red) are used.
(B) Target sequences in the cave bear mitochondrial D loop. Outer primer sequences are underlined; inner primer sequences are in red.
Sample GS 3-7 is from the Gamsulzen cave (Austria) and is between 25 000 and 45 000 years old.
In eight out of a total of nine uncontaminated experiments, the blend of mismatch polymerases produced more successful (positive) amplifications than SuperTaq. The odds of this occurring by chance are (9!/(8!1!)) × (0.5)^8 × (0.5)^1 = 1.76%, as determined by binomial distribution analysis. Given the heterogeneity of aDNA samples, it is not surprising that in one case SuperTaq performed better than the blend. Experiment 5 is depicted in
The experiments are listed in chronological order and it is noteworthy that the difference in performance between SuperTaq and the blend became less pronounced as time passed. This may be due to freeze/thawing further damaging the aDNA as well as to loss of activity in the blend, which is less pure than SuperTaq.
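The binomial figure quoted above can be reproduced with a short sketch. It assumes, as the text does, that under the null hypothesis each experiment is equally likely (p = 0.5) to favour either enzyme preparation. The function names are hypothetical, and the one-sided tail probability is an additional quantity shown for comparison that does not appear in the text.

```python
# Minimal sketch: binomial probability of the blend outperforming SuperTaq
# in 8 of 9 experiments by chance. Function names are illustrative.
from math import comb


def binomial_pmf(successes: int, trials: int, p: float = 0.5) -> float:
    """Probability of exactly `successes` out of `trials`."""
    return comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)


def binomial_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """Probability of at least `successes` out of `trials`."""
    return sum(binomial_pmf(k, trials, p) for k in range(successes, trials + 1))


exactly_8_of_9 = binomial_pmf(8, 9)    # 9 / 512, the value quoted in the text
at_least_8_of_9 = binomial_tail(8, 9)  # 10 / 512, shown for comparison only

if __name__ == "__main__":
    print(f"exactly 8 of 9:  {exactly_8_of_9:.2%}")   # 1.76%
    print(f"at least 8 of 9: {at_least_8_of_9:.2%}")  # 1.95%
```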
(A) Principles Underlying CSR Technology According to the Invention.
In a preferred embodiment the present invention provides a method for the generation of an engineered DNA polymerase with an expanded substrate range which comprises the steps of:
The techniques of directed evolution and compartmentalised self replication are detailed in GB 97143002 and GB 98063936 and GB 01275643, in the name of the present inventors. These documents are herein incorporated by reference.
The inventors modified the methods of compartmentalised self replication and surprisingly generated DNA polymerases which exhibited an expanded substrate range as herein defined.
In particular, the inventors realised that for self-replication of Taq polymerase, compartments must remain stable at the high temperatures of PCR thermocycling. Encapsulation of PCRs has been described previously for lipid vesicles (Oberholzer, T., Albrizio, M. & Luisi, P. L. (1995) Chem. Biol. 2, 677-82) and fixed cells and tissues (Haase, A. T., Retzel, E. F. & Staskus, K. A. (1990) Proc. Natl. Acad. Sci. USA 87, 4971-5; Embleton, M. J., Gorochov, G., Jones, P. T. & Winter, G. (1992) Nucleic Acids) but with low efficiencies.
The present inventors used recently developed water-in-oil emulsions but modified the composition of the surfactant as well as the oil to water ratio. Details are given in Example 1. These modifications greatly increased the heat stability of the compartments and allowed PCR yields in the emulsion to approach those of PCR in solution. Further details of the method of compartmentalised self replication are given below.
The microcapsules used according to the method of the invention require appropriate physical properties to allow the working of the invention.
First, to ensure that the nucleic acids and gene products may not diffuse between microcapsules, the contents of each microcapsule must be isolated from the contents of the surrounding microcapsules, so that there is no or little exchange of the nucleic acids and gene products between the microcapsules over the timescale of the experiment.
Second, the method of the present invention requires that there are only a limited number of nucleic acids per microcapsule. This ensures that the gene product of an individual nucleic acid will be isolated from other nucleic acids. Thus, coupling between nucleic acid and gene product will be highly specific. The enrichment factor is greatest with on average one or fewer nucleic acids per microcapsule, the linkage between nucleic acid and the activity of the encoded gene product being as tight as is possible, since the gene product of an individual nucleic acid will be isolated from the products of all other nucleic acids. However, even if the theoretically optimal situation of, on average, a single nucleic acid or less per microcapsule is not used, a ratio of 5, 10, 50, 100 or 1000 or more nucleic acids per microcapsule may prove beneficial in sorting a large library. Subsequent rounds of sorting, including renewed encapsulation with differing nucleic acid distribution, will permit more stringent sorting of the nucleic acids. Preferably, there is a single nucleic acid, or fewer, per microcapsule.
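The effect of the average loading on how many microcapsules hold exactly one nucleic acid can be sketched numerically. The example below assumes that nucleic acids are distributed among microcapsules according to a Poisson distribution; that assumption, and the function names, are introduced here for illustration and are not stated in the specification.

```python
from math import exp, factorial


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of finding exactly k nucleic acids in one microcapsule."""
    return mean**k * exp(-mean) / factorial(k)


def occupancy(mean: float) -> dict:
    """Fractions of empty, singly and multiply occupied microcapsules."""
    empty = poisson_pmf(0, mean)
    single = poisson_pmf(1, mean)
    occupied = 1.0 - empty
    return {
        "empty": empty,
        "single": single,
        "multiple": occupied - single,
        # Among microcapsules that contain anything, the share holding one gene.
        "single_among_occupied": single / occupied,
    }


for mean in (0.1, 1.0, 5.0):
    stats = occupancy(mean)
    print(f"mean {mean}: {stats['single_among_occupied']:.1%} of occupied capsules hold one gene")
```

Under this assumption, lowering the average loading increases the share of occupied microcapsules that contain a single nucleic acid, at the cost of more empty microcapsules.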
Third, the formation and the composition of the microcapsules must not abolish the function of the machinery for the expression of the nucleic acids and the activity of the gene products.
Consequently, any microencapsulation system used must fulfil these three requirements. The appropriate system(s) may vary depending on the precise nature of the requirements in each application of the invention, as will be apparent to the skilled person.
A wide variety of microencapsulation procedures are available (see Benita, 1996) and may be used to create the microcapsules used in accordance with the present invention. Indeed, more than 200 microencapsulation methods have been identified in the literature (Finch, 1993).
These include membrane enveloped aqueous vesicles such as lipid vesicles (liposomes) (New, 1990) and non-ionic surfactant vesicles (van Hal et al., 1996). These are closed-membranous capsules of single or multiple bilayers of non-covalently assembled molecules, with each bilayer separated from its neighbour by an aqueous compartment. In the case of liposomes the membrane is composed of lipid molecules; these are usually phospholipids but sterols such as cholesterol may also be incorporated into the membranes (New, 1990). A variety of enzyme-catalysed biochemical reactions, including RNA and DNA polymerisation, can be performed within liposomes (Chakrabarti et al., 1994; Oberholzer et al., 1995a; Oberholzer et al., 1995b; Walde et al., 1994; Wick & Luisi, 1996).
With a membrane-enveloped vesicle system much of the aqueous phase is outside the vesicles and is therefore non-compartmentalised. This continuous, aqueous phase should be removed or the biological systems in it inhibited or destroyed (for example, by digestion of nucleic acids with DNase or RNase) in order that the reactions are limited to the microcapsules (Luisi et al., 1987).
Enzyme-catalysed biochemical reactions have also been demonstrated in microcapsules generated by a variety of other methods. Many enzymes are active in reverse micellar solutions (Bru & Walde, 1991; Bru & Walde, 1993; Creagh et al., 1993; Haber et al., 1993; Kumar et al., 1989; Luisi & B., 1987; Mao & Walde, 1991; Mao et al., 1992; Perez et al., 1992; Walde et al., 1994; Walde et al., 1993; Walde et al., 1988) such as the AOT-isooctane-water system (Menger & Yamada, 1979).
Microcapsules can also be generated by interfacial polymerisation and interfacial complexation (Whateley, 1996). Microcapsules of this sort can have rigid, nonpermeable membranes, or semipermeable membranes. Semipermeable microcapsules bordered by cellulose nitrate membranes, polyamide membranes and lipid-polyamide membranes can all support biochemical reactions, including multienzyme systems (Chang, 1987; Chang, 1992; Lim, 1984). Alginate/polylysine microcapsules (Lim & Sun, 1980), which can be formed under very mild conditions, have also proven to be very biocompatible, providing, for example, an effective method of encapsulating living cells and tissues (Chang, 1992; Sun et al., 1992).
Non-membranous microencapsulation systems based on phase partitioning of an aqueous environment in a colloidal system, such as an emulsion, may also be used.
Preferably, the microcapsules of the present invention are formed from emulsions; heterogeneous systems of two immiscible liquid phases with one of the phases dispersed in the other as droplets of microscopic or colloidal size (Becher, 1957; Sherman, 1968; Lissant, 1974; Lissant, 1984).
Emulsions may be produced from any suitable combination of immiscible liquids. Preferably the emulsion of the present invention has water (containing the biochemical components) as the phase present in the form of finely divided droplets (the disperse, internal or discontinuous phase) and a hydrophobic, immiscible liquid (an ‘oil’) as the matrix in which these droplets are suspended (the nondisperse, continuous or external phase). Such emulsions are termed ‘water-in-oil’ (W/O). This has the advantage that the entire aqueous phase containing the biochemical components is compartmentalised in discrete droplets (the internal phase). The external phase, being a hydrophobic oil, generally contains none of the biochemical components and hence is inert.
The emulsion may be stabilised by addition of one or more surface-active agents (surfactants). These surfactants are termed emulsifying agents and act at the water/oil interface to prevent (or at least delay) separation of the phases. Many oils and many emulsifiers can be used for the generation of water-in-oil emulsions; a recent compilation listed over 16,000 surfactants, many of which are used as emulsifying agents (Ash and Ash, 1993). Suitable oils include light white mineral oil; suitable non-ionic surfactants (Schick, 1966) include sorbitan monooleate (Span™80; ICI), polyoxyethylenesorbitan monooleate (Tween™ 80; ICI) and Triton-X-100.
The use of anionic surfactants may also be beneficial. Suitable surfactants include sodium cholate and sodium taurocholate. Particularly preferred is sodium deoxycholate, preferably at a concentration of 0.5% w/v, or below. Inclusion of such surfactants can in some cases increase the expression of the nucleic acids and/or the activity of the gene products. Addition of some anionic surfactants to a non-emulsified reaction mixture completely abolishes translation. During emulsification, however, the surfactant is transferred from the aqueous phase into the interface and activity is restored. Addition of an anionic surfactant to the mixtures to be emulsified ensures that reactions proceed only after compartmentalisation.
Creation of an emulsion generally requires the application of mechanical energy to force the phases together. There are a variety of ways of doing this which utilise a variety of mechanical devices, including stirrers (such as magnetic stir-bars, propeller and turbine stirrers, paddle devices and whisks), homogenisers (including rotor-stator homogenisers, high-pressure valve homogenisers and jet homogenisers), colloid mills, ultrasound and ‘membrane emulsification’ devices (Becher, 1957; Dickinson, 1994).
Aqueous microcapsules formed in water-in-oil emulsions are generally stable with little if any exchange of nucleic acids or gene products between microcapsules. Additionally, we have demonstrated that several biochemical reactions proceed in emulsion microcapsules. Moreover, complicated biochemical processes, notably gene transcription and translation are also active in emulsion microcapsules. The technology exists to create emulsions with volumes all the way up to industrial scales of thousands of litres (Becher, 1957; Sherman, 1968; Lissant, 1974; Lissant, 1984).
The preferred microcapsule size will vary depending upon the precise requirements of any individual selection process that is to be performed according to the present invention. In all cases, there will be an optimal balance between gene library size, the required enrichment and the required concentration of components in the individual microcapsules to achieve efficient expression and reactivity of the gene products.
Details of one example of an emulsion used when performing the method of the present invention are given in Example 1.
Expression within Microcapsules
The processes of expression must occur within each individual microcapsule provided by the present invention. Both in vitro transcription and coupled transcription-translation become less efficient at sub-nanomolar DNA concentrations. Because only a limited number of DNA molecules may be present in each microcapsule, this sets a practical upper limit on the possible microcapsule size. Preferably, the mean volume of the microcapsules is less than 5.2×10⁻¹⁶ m³ (corresponding to a spherical microcapsule of diameter less than 10 μm), more preferably less than 6.5×10⁻¹⁷ m³ (5 μm), more preferably about 4.2×10⁻¹⁸ m³ (2 μm) and ideally about 9×10⁻¹⁸ m³ (2.6 μm).
The effective DNA or RNA concentration in the microcapsules may be artificially increased by various methods that will be well-known to those versed in the art. These include, for example, the addition of volume excluding chemicals such as polyethylene glycols (PEG) and a variety of gene amplification techniques, including transcription using RNA polymerases including those from bacteria such as E. coli (Roberts, 1969; Blattner and Dahlberg, 1972; Roberts et al., 1975; Rosenberg et al., 1975), eukaryotes e.g. (Weil et al., 1979; Manley et al., 1983) and bacteriophage such as T7, T3 and SP6 (Melton et al., 1984); the polymerase chain reaction (PCR) (Saiki et al., 1988); Qβ replicase amplification (Miele et al., 1983; Cahill et al., 1991; Chetverin and Spirin, 1995; Katanaev et al., 1995); the ligase chain reaction (LCR) (Landegren et al., 1988; Barany, 1991); and self-sustained sequence replication system (Fahy et al., 1991) and strand displacement amplification (Walker et al., 1992). Even gene amplification techniques requiring thermal cycling such as PCR and LCR could be used if the emulsions and the in vitro transcription or coupled transcription-translation systems are thermostable (for example, the coupled transcription-translation systems could be made from a thermostable organism such as Thermus aquaticus).
Increasing the effective local nucleic acid concentration enables larger microcapsules to be used effectively. This allows a preferred practical upper limit to the microcapsule volume of about 5.2×10⁻¹⁶ m³ (corresponding to a sphere of diameter 10 μm).
The microcapsule size must be sufficiently large to accommodate all of the required components of the biochemical reactions that are needed to occur within the microcapsule.
For example, in vitro, both transcription reactions and coupled transcription-translation reactions require a total nucleoside triphosphate concentration of about 2 mM.
For example, in order to transcribe a gene to a single short RNA molecule of 500 bases in length, this would require a minimum of 500 molecules of nucleoside triphosphate per microcapsule (8.33×10⁻²² moles). In order to constitute a 2 mM solution, this number of molecules must be contained within a microcapsule of volume 4.17×10⁻¹⁹ litres (4.17×10⁻²² m³), which if spherical would have a diameter of 93 nm.
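The arithmetic behind this lower size limit can be followed step by step in a short sketch. It assumes a spherical microcapsule and uses Avogadro's constant as 6.022×10²³ per mole, so the intermediate figures differ very slightly from the rounded values in the text; the function name is hypothetical.

```python
from math import pi

AVOGADRO = 6.022e23  # molecules per mole


def min_capsule_for_transcript(n_bases: int, ntp_molar: float):
    """Smallest spherical capsule holding n_bases NTP molecules at ntp_molar."""
    moles = n_bases / AVOGADRO            # NTP needed for one transcript
    volume_l = moles / ntp_molar          # volume giving the target concentration
    volume_m3 = volume_l * 1e-3           # 1 litre = 1e-3 cubic metres
    diameter_nm = (6 * volume_m3 / pi) ** (1 / 3) * 1e9
    return moles, volume_l, diameter_nm


moles, volume_l, diameter_nm = min_capsule_for_transcript(500, 2e-3)
print(f"NTP required : {moles:.2e} mol")
print(f"Volume       : {volume_l:.2e} L")
print(f"Diameter     : {diameter_nm:.0f} nm")
```

The diameter comes out at roughly 93 nm, in line with the preferred lower limit of approximately 100 nm.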
Furthermore, particularly in the case of reactions involving translation, it is to be noted that the ribosomes necessary for the translation to occur are themselves approximately 20 nm in diameter. Hence, the preferred lower limit for microcapsules is a diameter of approximately 100 nm.
Therefore, the microcapsule volume is preferably of the order of between 5.2×10⁻²² m³ and 5.2×10⁻¹⁶ m³, corresponding to a sphere of diameter between 0.1 μm and 10 μm, more preferably of between about 5.2×10⁻¹⁹ m³ and 6.5×10⁻¹⁷ m³ (1 μm and 5 μm). Sphere diameters of about 2.6 μm are most advantageous.
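The correspondence between the quoted diameters and volumes follows from the volume of a sphere, V = πd³/6. A minimal sketch, with a helper name chosen here for illustration, tabulates the values used in this section:

```python
from math import pi


def sphere_volume_m3(diameter_um: float) -> float:
    """Volume in cubic metres of a sphere with the given diameter in micrometres."""
    diameter_m = diameter_um * 1e-6
    return pi * diameter_m**3 / 6


# Diameters quoted in the text for the preferred microcapsule size ranges.
for d in (0.1, 1.0, 2.0, 2.6, 5.0, 10.0):
    print(f"{d:>5} um -> {sphere_volume_m3(d):.2e} m^3")
```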
It is no coincidence that the preferred dimensions of the compartments (droplets of 2.6 μm mean diameter) closely resemble those of bacteria, for example, Escherichia are 1.1-1.5×2.0-6.0 μm rods and Azotobacter are 1.5-2.0 μm diameter ovoid cells. In its simplest form, Darwinian evolution is based on a ‘one genotype one phenotype’ mechanism. The concentration of a single compartmentalised gene, or genome, drops from 0.4 nM in a compartment of 2 μm diameter, to 25 pM in a compartment of 5 μm diameter. The prokaryotic transcription/translation machinery has evolved to operate in compartments of ˜1-2 μm diameter, where single genes are at approximately nanomolar concentrations. A single gene, in a compartment of 2.6 μm diameter is at a concentration of 0.2 nM. This gene concentration is high enough for efficient translation. Compartmentalisation in such a volume also ensures that even if only a single molecule of the gene product is formed it is present at about 0.2 nM, which is important if the gene product is to have a modifying activity of the nucleic acid itself. The volume of the microcapsule should thus be selected bearing in mind not only the requirements for transcription and translation of the nucleic acid, but also the modifying activity required of the gene product in the method of the invention.
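The single-gene concentrations quoted above can be checked with a short calculation: one molecule corresponds to 1/N_A moles, divided by the compartment volume in litres. The sketch assumes spherical compartments; the function name is introduced here for illustration.

```python
from math import pi

AVOGADRO = 6.022e23  # molecules per mole


def single_molecule_conc_molar(diameter_um: float) -> float:
    """Molar concentration of one molecule in a sphere of the given diameter."""
    volume_m3 = pi * (diameter_um * 1e-6) ** 3 / 6
    volume_l = volume_m3 * 1e3            # 1 cubic metre = 1000 litres
    return 1.0 / (AVOGADRO * volume_l)


for d in (2.0, 2.6, 5.0):
    conc = single_molecule_conc_molar(d)
    print(f"{d} um compartment: {conc * 1e9:.3f} nM")
```

The results are close to 0.4 nM, 0.2 nM and 25 pM for diameters of 2 μm, 2.6 μm and 5 μm respectively.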
The size of emulsion microcapsules may be varied simply by tailoring the emulsion conditions used to form the emulsion according to requirements of the selection system. The larger the microcapsule size, the larger is the volume that will be required to encapsulate a given nucleic acid library, since the ultimately limiting factor will be the size of the microcapsule and thus the number of microcapsules possible per unit volume.
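The scaling of required volume with microcapsule size can be illustrated as follows. The library size of 10⁹ and the loading of one nucleic acid per microcapsule are hypothetical values chosen for the example, and the calculation counts only the aqueous phase of uniformly sized spherical droplets.

```python
from math import pi


def aqueous_volume_ul(library_size: float, diameter_um: float,
                      genes_per_capsule: float = 1.0) -> float:
    """Aqueous volume (microlitres) needed to compartmentalise a library."""
    capsule_m3 = pi * (diameter_um * 1e-6) ** 3 / 6
    n_capsules = library_size / genes_per_capsule
    total_m3 = n_capsules * capsule_m3
    return total_m3 * 1e9                 # 1 cubic metre = 1e9 microlitres


# Hypothetical library of 1e9 variants at one gene per microcapsule.
for d in (2.6, 5.0, 10.0):
    print(f"{d} um capsules: {aqueous_volume_ul(1e9, d):.1f} uL aqueous phase")
```

Because volume grows with the cube of the diameter, moving from 2.6 μm to 10 μm microcapsules raises the required aqueous volume by more than fifty-fold in this sketch.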
The size of the microcapsules is selected not only having regard to the requirements of the transcription/translation system, but also those of the selection system employed for the nucleic acid construct. Thus, the components of the selection system, such as a chemical modification system, may require reaction volumes and/or reagent concentrations which are not optimal for transcription/translation. As set forth herein, such requirements may be accommodated by a secondary re-encapsulation step; moreover, they may be accommodated by selecting the microcapsule size in order to maximise transcription/translation and selection as a whole. Empirical determination of optimal microcapsule volume and reagent concentration, for example as set forth herein, is preferred.
A “nucleic acid” in accordance with the present invention is as described above. Preferably, a nucleic acid is a molecule or construct selected from the group consisting of a DNA molecule, an RNA molecule, a partially or wholly artificial nucleic acid molecule consisting of exclusively synthetic or a mixture of naturally-occurring and synthetic bases, any one of the foregoing linked to a polypeptide, and any one of the foregoing linked to any other molecular group or construct. Advantageously, the other molecular group or construct may be selected from the group consisting of nucleic acids, polymeric substances, particularly beads, for example polystyrene beads, magnetic substances such as magnetic beads, labels, such as fluorophores or isotopic labels, chemical reagents, binding agents such as macrocycles and the like.
The nucleic acid portion of the nucleic acid may comprise suitable regulatory sequences, such as those required for efficient expression of the gene product, for example promoters, enhancers, translational initiation sequences, polyadenylation sequences, splice sites and the like.
Details of a preferred method of performing the method of the invention are given in Example 1. However, those skilled in the art will appreciate that the examples given are non-limiting and methods for product selection are discussed in more general terms below.
A ligand or substrate can be connected to the nucleic acid by a variety of means that will be apparent to those skilled in the art (see, for example, Hermanson, 1996). Any tag will suffice that allows for the subsequent selection of the nucleic acid. Sorting can be by any method which allows the preferential separation, amplification or survival of the tagged nucleic acid. Examples include selection by binding (including techniques based on magnetic separation, for example using Dynabeads™), and by resistance to degradation (for example by nucleases, including restriction endonucleases).
One way in which the nucleic acid molecule may be linked to a ligand or substrate is through biotinylation. This can be done by PCR amplification with a 5′-biotinylation primer such that the biotin and nucleic acid are covalently linked.
The ligand or substrate to be selected can be attached to the modified nucleic acid by a variety of means that will be apparent to those of skill in the art. A biotinylated nucleic acid may be coupled to a polystyrene microbead (0.035 to 0.2 μm in diameter) that is coated with avidin or streptavidin, that will therefore bind the nucleic acid with very high affinity. This bead can be derivatised with substrate or ligand by any suitable method such as by adding biotinylated substrate or by covalent coupling.
Alternatively, a biotinylated nucleic acid may be coupled to avidin or streptavidin complexed to a large protein molecule such as thyroglobulin (669 Kd) or ferritin (440 Kd). This complex can be derivatised with substrate or ligand, for example by covalent coupling to the alpha-amino group of lysines or through a non-covalent interaction such as biotin-avidin. The substrate may be present in a form unlinked to the nucleic acid but containing an inactive “tag” that requires a further step to activate it such as photoactivation (e.g. of a “caged” biotin analogue, (Sundberg et al., 1995; Pirrung and Huang, 1996)). The catalyst to be selected then converts the substrate to product. The “tag” could then be activated and the “tagged” substrate and/or product bound by a tag-binding molecule (e.g. avidin or streptavidin) complexed with the nucleic acid. The ratio of substrate to product attached to the nucleic acid via the “tag” will therefore reflect the ratio of the substrate and product in solution.
When all reactions are stopped and the microcapsules are combined, the nucleic acids encoding active enzymes can be enriched using an antibody or other molecule which binds, or reacts specifically with the “tag”. Although both substrates and product have the molecular tag, only the nucleic acids encoding active gene product will co-purify.
The terms “isolating”, “sorting” and “selecting”, as well as variations thereof, are used herein. Isolation, according to the present invention, refers to the process of separating an entity from a heterogeneous population, for example a mixture, such that it is free of at least one substance with which it was associated before the isolation process. In a preferred embodiment, isolation refers to purification of an entity essentially to homogeneity. Sorting of an entity refers to the process of preferentially isolating desired entities over undesired entities. In as far as this relates to isolation of the desired entities, the terms “isolating” and “sorting” are equivalent. The method of the present invention permits the sorting of desired nucleic acids from pools (libraries or repertoires) of nucleic acids which contain the desired nucleic acid. Selecting is used to refer to the process (including the sorting process) of isolating an entity according to a particular property thereof.
Initial selection of a nucleic acid from a nucleic acid library (for example a mutant Taq library) using the present invention will in most cases require the screening of a large number of variant nucleic acids. Libraries of nucleic acids can be created in a variety of different ways, including the following.
Pools of naturally occurring nucleic acids can be cloned from genomic DNA or cDNA (Sambrook et al., 1989); for example, mutant Taq libraries or other DNA polymerase libraries, made by PCR amplification of repertoires of Taq or other DNA polymerase genes, have proved very effective sources of DNA polymerase fragments. Further details are given in the examples.
Libraries of genes can also be made by encoding all (see for example Smith, 1985; Parmley and Smith, 1988) or part of genes (see for example Lowman et al., 1991) or pools of genes (see for example Nissim et al., 1994) by a randomised or doped synthetic oligonucleotide.
Libraries can also be made by introducing mutations into a nucleic acid or pool of nucleic acids ‘randomly’ by a variety of techniques in vivo, including the use of ‘mutator strains’ of bacteria such as E. coli mutD5 (Liao et al., 1986; Yamagishi et al., 1990; Low et al., 1996). Random mutations can also be introduced both in vivo and in vitro by chemical mutagens, and ionising or UV irradiation (see Friedberg et al., 1995), or incorporation of mutagenic base analogues (Freese, 1959; Zaccolo et al., 1996). ‘Random’ mutations can also be introduced into genes in vitro during polymerisation, for example by using error-prone polymerases (Leung et al., 1989). In a preferred embodiment of the method of the invention, the repertoire of nucleic acid fragments used is a mutant Taq repertoire which has been mutated using error-prone PCR. Details are given in Example 1. According to the method of the invention, the term ‘random’ may be in terms of random positions with a random repertoire of amino acids at those positions, or it may be selected (predetermined) positions with a random repertoire of amino acids at those selected positions.
Further diversification can be introduced by using homologous recombination either in vivo (see Kowalczykowski et al., 1994) or in vitro (Stemmer, 1994a; Stemmer, 1994b).
In addition to the nucleic acids described above, the microcapsules according to the invention will comprise further components required for the sorting process to take place. Other components of the system will for example comprise those necessary for transcription and/or translation of the nucleic acid. These are selected for the requirements of a specific system from the following: a suitable buffer, an in vitro transcription/replication system and/or an in vitro translation system containing all the necessary ingredients, enzymes and cofactors, RNA polymerase, nucleotides, nucleic acids (natural or synthetic), transfer RNAs, ribosomes and amino acids, and the substrates of the reaction of interest in order to allow selection of the modified gene product.
A suitable buffer will be one in which all of the desired components of the biological system are active and will therefore depend upon the requirements of each specific reaction system. Buffers suitable for biological and/or chemical reactions are known in the art and recipes provided in various laboratory texts, such as Sambrook et al., 1989.
The in vitro translation system will usually comprise a cell extract, typically from bacteria (Zubay, 1973; Zubay, 1980; Lesley et al., 1991; Lesley, 1995), rabbit reticulocytes (Pelham and Jackson, 1976), or wheat germ (Anderson et al., 1983). Many suitable systems are commercially available (for example from Promega) including some which will allow coupled transcription/translation (all the bacterial systems and the reticulocyte and wheat germ TNT™ extract systems from Promega). The mixture of amino acids used may include synthetic amino acids if desired, to increase the possible number or variety of proteins produced in the library. This can be accomplished by charging tRNAs with artificial amino acids and using these tRNAs for the in vitro translation of the proteins to be selected (Ellman et al., 1991; Benner, 1994; Mendel et al., 1995).
After each round of selection the enrichment of the pool of nucleic acids for those encoding the molecules of interest can be assayed by non-compartmentalised in vitro transcription/replication or coupled transcription-translation reactions. The selected pool is cloned into a suitable plasmid vector and RNA or recombinant protein is produced from the individual clones for further purification and assay.
Microcapsules may be identified by virtue of a change induced by the desired gene product which either occurs or manifests itself at the surface of the microcapsule or is detectable from the outside as described in section iii (Microcapsule Sorting). This change, when identified, is used to trigger the modification of the gene within the compartment. In a preferred aspect of the invention, microcapsule identification relies on a change in the optical properties of the microcapsule resulting from a reaction leading to luminescence, phosphorescence or fluorescence within the microcapsule. Modification of the gene within the microcapsules would be triggered by identification of luminescence, phosphorescence or fluorescence. For example, identification of luminescence, phosphorescence or fluorescence can trigger bombardment of the compartment with photons (or other particles or waves) which leads to modification of the nucleic acid. A similar procedure has been described previously for the rapid sorting of cells (Keij et al., 1994). Modification of the nucleic acid may result, for example, from coupling a molecular “tag”, caged by a photolabile protecting group to the nucleic acids: bombardment with photons of an appropriate wavelength leads to the removal of the cage. Afterwards, all microcapsules are combined and the nucleic acids pooled together in one environment. Nucleic acids encoding gene products exhibiting the desired activity can be selected by affinity purification using a molecule that specifically binds to, or reacts specifically with, the “tag”.
Multi Step Procedure
It will also be appreciated that according to the present invention, it is not necessary for all the processes of transcription/replication and/or translation, and selection to proceed in one single step, with all reactions taking place in one microcapsule. The selection procedure may comprise two or more steps. First, transcription/replication and/or translation of each nucleic acid of a nucleic acid library may take place in a first microcapsule. Each gene product is then linked to the nucleic acid which encoded it (which resides in the same microcapsule). The microcapsules are then broken, and the nucleic acids attached to their respective gene products optionally purified. Alternatively, nucleic acids can be attached to their respective gene products using methods which do not rely on encapsulation, for example phage display (Smith, G. P., 1985), polysome display (Mattheakis et al., 1994), RNA-peptide fusion (Roberts and Szostak, 1997) or lac repressor peptide fusion (Cull et al., 1992).
In the second step of the procedure, each purified nucleic acid attached to its gene product is put into a second microcapsule containing components of the reaction to be selected. This reaction is then initiated. After completion of the reactions, the microcapsules are again broken and the modified nucleic acids are selected. In the case of complicated multistep reactions in which many individual components and reaction steps are involved, one or more intervening steps may be performed between the initial step of creation and linking of gene product to nucleic acid, and the final step of generating the selectable change in the nucleic acid.
In all the above configurations, genetic material comprised in the nucleic acids may be amplified and the process repeated in iterative steps. Amplification may be by the polymerase chain reaction (Saiki et al., 1988) or by using one of a variety of other gene amplification techniques including: Qβ replicase amplification (Cahill, Foster and Mahan, 1991; Chetverin and Spirin, 1995; Katanaev, Kurnasov and Spirin, 1995); the ligase chain reaction (LCR) (Landegren et al., 1988; Barany, 1991); the self-sustained sequence replication system (Fahy, Kwoh and Gingeras, 1991) and strand displacement amplification (Walker et al., 1992).
(B) DNA Polymerases According to the Invention.
High fidelity DNA polymerases such as Pol A (like Taq polymerase) and Pol-B family polymerases which lack a 3′-5′ exonuclease proofreading capability show a strict blockage to the extension of distorted or mismatched 3′ primer termini to avoid propagation of misincorporations. While the degree of blockage varies considerably depending on the nature of the mismatch, some transversion (purine•purine/pyrimidine•pyrimidine) mismatches are extended up to 10⁶-fold less efficiently than matched termini (Huang 92). Likewise, many unnatural base analogues, while incorporated efficiently, act as strong terminators (Kool, Loakes).
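The distinction drawn above between matched termini and transversion (purine•purine or pyrimidine•pyrimidine) mismatches can be expressed as a small classifier. This is an illustrative sketch only: the function name and the ordering of its arguments (primer 3′ base, then opposing template base) are conventions chosen here, and purine•pyrimidine non-Watson-Crick pairs are labelled with the conventional term "transition mismatch", which the text itself does not use.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def classify_terminus(primer_base: str, template_base: str) -> str:
    """Classify the pairing at a primer 3' terminus against its template base."""
    p, t = primer_base.upper(), template_base.upper()
    valid = PURINES | PYRIMIDINES
    if p not in valid or t not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if (p, t) in WATSON_CRICK:
        return "matched"
    # Both purines or both pyrimidines: the strongly disfavoured class.
    same_class = (p in PURINES) == (t in PURINES)
    return "transversion mismatch" if same_class else "transition mismatch"


for pair in (("A", "T"), ("G", "A"), ("C", "C"), ("G", "T")):
    print(f"{pair[0]}•{pair[1]}: {classify_terminus(*pair)}")
```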
The present inventors have modified the principles described in Ghadessy, F. G et al (2001) Proc. Nat. Acad. Sci, USA, 93, 4552-4557 (compartmentalised self replication) and Ghadessy 2003, and outlined above. Both these documents are herein incorporated by reference. The present inventors have used these modified techniques to develop a method by which the substrate specificity of high fidelity DNA polymerases may be expanded in a generic way.
The inventors have exemplified the technique by expanding the substrate specificity of the high-fidelity pol-A family polymerases. In particular, the present inventors created two repertoires of randomly mutated Taq genes, as described in Ghadessy, F. G et al (2001) referred to above. Three cycles of mismatch extension CSR were performed using flanking primers bearing the mismatches A*G and C*C at their 3′ ends. Selected clones were ranked using a PCR extension assay described herein.
Selected mutants not only exhibited the ability to extend the G*A and C*C transversion mismatches used in the CSR selection, but also exhibited a generic ability to extend mispaired 3′ termini. These results are surprising, especially since Taq polymerase is unable to extend such mismatches (Kwok et al., 1990; Huang, 1992).
Thus, using this approach, the inventors have generated DNA polymerases which exhibit a relaxed substrate specificity/expanded substrate range.
According to the present invention, the term ‘expanded substrate range’ (of an engineered DNA polymerase) means that the substrate range of an engineered DNA polymerase according to the present invention is broader than that of the one or more DNA polymerases, or fragments thereof, from which it is derived. The term ‘a broader substrate range’ refers to the ability of an engineered polymerase according to the present invention to extend one or more 3′ mismatches, for example A*A, G*A, G*G, T*T, C*C, which the one or more polymerases from which it is derived cannot extend. That is, essentially, a DNA polymerase which exhibits a relaxed substrate range as herein defined has the ability not only to extend the 3′ mismatches used in its generation (i.e. those of the flanking primers), but also exhibits a generic ability to extend 3′ mismatches (for example A*G, A*A, G*G).
The two best mutants M1 (G84A, D144G, K314R, E520G, F598L, A608V, E742G) and M4 (D58G, R74P, A109T, L245R, R343G, G370D, E520G, N583S, E694K, A743P) were chosen for further investigation.
M1 and M4 not only had greatly increased ability to extend the G•A and C•C transversion mismatches used in the CSR selection, but appeared to have acquired a more generic ability to extend 3′ mispaired termini, including other strongly disfavoured transversion mismatches (such as A•G, A•A, G•G) (
(ii) M1 and M4 Mutants According to the Invention.
Nucleic acid sequences encoding M1 and M4 pol A DNA polymerase mutants are depicted as SEQ No 1 and SEQ No 2 respectively and are shown in
Despite very similar properties, M1 and M4 (and indeed other selected clones) have few mutations in common, suggesting there are multiple molecular solutions to the mismatch extension phenotype. One exception was E520G, a mutation that is shared by all but one of the four best clones of the final selection. Curiously, E520 is located at the very tip of the thumb domain at a distance of 20 Å from the 3′ OH of the mismatched primer terminus and its involvement in mismatch recognition or extension is unclear. However, E520G is clearly important for mismatch extension as backmutation reduces mismatch extension in both M1 and M4 to near wt levels (
The only other feature clearly shared by both M1 and M4 is the presence of mutations targeting residues which may be involved in flipping out the +1 template base. Residue E742 mutated in M1 (E742G) forms a direct contact with the flipped out +1 base on the template strand (Li et al), while in M4 the adjacent residue A743 is mutated to proline (A743P), which may disrupt interactions by distorting local backbone conformation. Back mutation of E742G in M1 reduced mismatch extension, but only by ca. 20%, indicating that it does not contribute decisively to mismatch extension.
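The comparison of the M1 and M4 mutation sets described above can be reproduced mechanically. The sketch below is illustrative only: it takes the two mutation lists exactly as given for M1 and M4 in the text, and the helper name `parse_mutations` is an assumption introduced here.

```python
import re

# Mutation lists as given in the text for M1 and M4
M1 = "G84A, D144G, K314R, E520G, F598L, A608V, E742G"
M4 = "D58G, R74P, A109T, L245R, R343G, G370D, E520G, N583S, E694K, A743P"


def parse_mutations(text):
    """Return a dict mapping residue position to mutation string, e.g. {520: 'E520G'}."""
    result = {}
    for token in text.split(","):
        token = token.strip()
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if match is None:
            raise ValueError(f"unrecognised mutation: {token!r}")
        result[int(match.group(2))] = token
    return result


m1, m4 = parse_mutations(M1), parse_mutations(M4)
# Identical substitutions present in both clones
shared = sorted(set(m1.values()) & set(m4.values()))
# Pairs of positions (M1, M4) that are immediate sequence neighbours
neighbours = sorted((a, b) for a in m1 for b in m4 if abs(a - b) == 1)
```

With these lists, `shared` contains only E520G and `neighbours` contains only the 742/743 pair, in line with the observations above.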
Surprisingly, mutations in the N-terminal 5′-3′ exonuclease domain (53exoD) also appear to be contributing to mismatch extension as suggested by the 2-4 fold increased mismatch extension ability of chimeras of the 53exoD of M1, M4 and polD of wtTaq (
The Relationship of M1 and M4 with Other Naturally Occurring DNA Polymerases.
Extension of mismatched 3′ primer termini is a feature of naturally occurring polymerases.
Viral reverse transcriptases (RT) like HIV-1 RT or AMV RT and polymerases capable of translesion synthesis (TLS) such as the polY-family polymerases pol ι (Vaisman 2001JBC) or pol κ (Washington 2002 PNAS) or the unusual polB-family polymerase polζ (Johnson Nature), all extend 3′ mismatches with elevated efficiency compared to high-fidelity polymerases. Thus, the selected polymerases share significant functional similarities with preexisting polymerases but represent, to our knowledge, the only known polA-family polymerases that are proficient in mismatch extension (ME) and translesion synthesis (TLS). In contrast to TLS polymerases, which are distributive and depend on cellular processivity factors such as PCNA (Prakash refs for eta/kappa and iota), M1 and M4 combine ME and TLS with high processivity and in the case of M1 are capable of efficient amplification of DNA fragments of up to 26 kb.
In the case of viral RTs, ME may play a crucial role in allowing error-prone yet processive replication of a multi-kb viral genome. For TLS polymerases, proficient mismatch extension is also a necessary prerequisite for their biological function as unpaired and distorted primer termini necessarily occur opposite lesions in the DNA template strand. The ability of TLS polymerases to traverse replication blocking lesions in DNA is thought to arise from a relaxed geometric selection in the active site (Goodman 02). The ability of M1 and M4 to process both bulky mispairs and a distorting CPD (cis-syn thymidine-thymidine dimer) makes it plausible that, in analogy to TLS polymerases, they also have acquired a more open active site. Indeed, modelling showed that a CPD dimer cannot be accommodated in the wtTaq polymerase active site without major steric clashes (Trincao01).
M1 (and to a lesser degree M4) also display a much increased ability to incorporate, extend and replicate different types of unnatural nucleotide substrates that deviate to varying degrees from the canonical nucleobase structure. Of these the αS substitution is the most conservative. However, the sulfur anion is significantly larger than the oxygen anion and coordinates cations poorly, which may be among the reasons why the wt enzyme will not tolerate full αS substitution. Fluorescently-labelled nucleotides, like αS nucleotides, retain base-pairing potential but include a bulky and hydrophobic substituent that must be accommodated by the polymerase active site. Steric clashes in the active site are alleviated by the presence of a long, flexible linker. Indeed, we find biotin-16-dUTP a much better substrate for M1 than biotin-11-dUTP, while wtTaq cannot utilize either. The hydrophobic analogue 5NI represents the most drastic departure from standard nucleotide chemistry we investigated. Comparable in size to a purine base, 5NI completely lacks any hydrogen bonding potential but, like the natural bases, favours the anti-position with respect to the ribose sugar as judged by NMR (J. Gallego, D. L. and P. H., unpublished results). Therefore, a 5NI•A or 5NI•G basepair would closely resemble a purine-purine transversion mismatch and may cause similar distortions to the canonical DNA duplex geometry. Elegant experiments using isosteric non-hydrogen bonding base analogues have shown that Watson-Crick hydrogen bonding per se is not required for efficient insertion or replication (reviewed by Kool 02). However, while many non-hydrogen-bonding hydrophobic base analogues are efficiently incorporated, they subsequently lead to termination, both at the 3′ end and as a template base (Kool, Romesberg).
Structural and biochemical studies have previously identified regions of the polymerase structure that are important for mismatch discrimination such as motif A (involved in binding the incoming dNTP), the O-helix (motif B) and residues involved in minor groove hydrogen bonding (24, 25). Inspection of the sequence of M1 and M4 reveals a conspicuous absence of mutations in these regions. Rather, mutations in M1 and M4 implicate regions of the polymerase not previously associated with substrate recognition such as the tip of the thumb subdomain (E520), the +1 template base-flipping function (E742, A743) in the finger subdomain and the 5′-3′ exonuclease domain (53exoD).
The 53exoD is too distant from the active site to have direct effects on mismatch extension. It is, however, thought to be crucial for polymerase processivity and may thus influence mismatch extension (24). Indeed, the Stoffel fragment of Taq polymerase (26), which lacks the 53exoD, displays both reduced processivity and more stringent mismatch discrimination (27). Mutations in the 53exoD of M1 and M4 may therefore contribute to mismatch extension by enhancing polymerase processivity. Together with the ability to bypass abasic sites (generated in large DNA fragments during thermocycling) this may also contribute to the proficiency of M1 at long PCR (
Extension and Incorporation Kinetics of Polymerases According to the Invention.
Examination of the extension and incorporation kinetics of the mutant polymerases suggests that they have a significantly increased propensity to not only extend but also incorporate transversion mispairs and consequently should have a significantly increased mutation rate compared to the wt enzyme. More relaxed geometric selection in the active site might also be expected to come at the price of significantly reduced fidelity as indeed is the case for TLS polymerases (23). However, measurement of the overall mutation rate using the MutS assay (not shown) and sequencing of PCR products generated by M1 indicated only a modest (<2-fold) increase in the mutation rate (Table 1) mostly due to an increased propensity for transversions. As discussed previously (10), CSR should select for optimal self-mutation rates within the error threshold (31). A change in the mutation spectrum towards a more even distribution of transition and transversion mutations may be an effective solution to accelerate adaptation, while maintaining a healthy distance from the error threshold. This may also make M1 a useful tool for protein engineering as the bias of Taq (and other DNA polymerases) for transition mutations limits the regions of sequence space that can be accessed effectively using PCR mutagenesis
In summary DNA polymerases according to the present invention, in particular M1 and M4 respectively as depicted in SEQ No 1 and SEQ No 2 possess the following properties:
(1) DNA Translesion synthesis
(2) A generic ability to incorporate unnatural base analogues into DNA.
(3) M1 has the ability to efficiently amplify DNA targets up to 26 kb.
Uses of DNA Polymerases According to the Invention.
Directed evolution towards extension of distorting transversion mismatches like G·A or C·C by CSR yields novel, “unfussy” polymerases with an ability to perform not only efficient mismatch extension and TLS but also accept a range of unnatural nucleotide substrates. The present inventors have shown that the evolution of TLS from a high-fidelity, polA-family, pol B family or other polymerases requires but few mutations, suggesting that TLS and relaxed substrate recognition are functionally connected and may represent a default state of polymerase function rather than a specialization.
The unusual properties of the DNA polymerases according to the present invention, in particular M1 and M4, may have immediate uses, for example for the improved incorporation of dye-modified nucleotides in sequencing and array labelling and/or the amplification of ultra-long DNA targets. They may prove useful in the amplification of damaged DNA templates in forensics or palaeobiology, may permit an expansion of the chemical repertoire of aptamers or deoxyribozymes (Benner, Barbas, ribozyme review) and may aid efforts to expand the genetic alphabet (Benner, Schultz). The altered mutation spectrum of M1 may make it a useful tool in random mutagenesis experiments as the strong bias of Taq and other polymerases towards (A->G, T->C) transitions limits the combinatorial diversity accessible through PCR mutagenesis. Furthermore, the ability of M1 and M4 to extend 3′ ends in which the last base is mismatched with the template strand and the ability of H10 (see example 6) to extend 3′ ends in which the last two bases are mismatched with the template strand may extend the scope of DNA shuffling methods (Stemmer) by allowing more distantly related sequences to be recombined.
In addition, DNA polymerases according to the invention, in particular pol A polymerases, for example M1 and M4 pol A polymerases as herein described may serve as a useful framework for mutagenesis and evolution towards polymerases capable of utilizing an ever wider array of modified nucleotide substrates. The inventors anticipate that directed evolution may ultimately permit modification of polymerase chemistry itself, allowing the creation of amplifiable DNA-like polymers of defined sequence thus extending molecular evolution to material science.
The invention will now be described by the following examples which are in no way limiting of the invention claimed herein.
DNA manipulation and protein expression. Expression of Taq clones for screening and CSR selection was as described (10). For kinetic measurements and gel extension assays, polymerases were purified as described (32) using a Biorex70 ion exchange resin (BioRad). All PCR and primer extensions were performed in 1×Taq buffer (50 mM KCl/10 mM Tris•HCl (pH 9.0)/0.1% Triton X-100/1.5 mM MgCl2), with dNTPs (0.25 mM (Amersham Pharmacia Biotech, NJ)) and appropriate primers unless specified otherwise. Primer sequences are provided in Supplementary information. Primer extension reactions were terminated by addition of 95% formamide/10 mM EDTA and analysed on 20% polyacrylamide/7 M Urea gels.
CSR selection. Activity preselected libraries L1* and L2* (10) were combined and 3 rounds of CSR selection carried out as described (10) except using primers 1: (A•G mismatch) and 2: (C•C mismatch) and 15 cycles of (94° C. 1 min, 55° C. 1 min, 72° C. 8 min). Round 2 clones were recombined by staggered extension process (StEP) PCR shuffling (33) as described. For round 3, CSR cycles were reduced to 10 and annealing times to 30 sec.
PCR. A PCR assay was used to screen and rank clones. Briefly, clones were normalized for activity in PCR with matched primers 3, 4 and activity with mismatched primers 1 and 2 (1 μM each) determined at minimal cycle number (15-25 cycles). Extension capability for different mismatches was determined by the same assay using mismatch primers 2 (C•C mismatch), 5 (A•A mismatch), 6 (G•G mismatch), 7 (G•A mismatch) with matched primer 3 or primer 1 (A•G mismatch) with matched primer 4. Incorporation of unnatural substrates in 50 cycle PCR was carried out using standard conditions and 50 μM αS dNTPs (Promega) or 50 μM FITC-12-dATP (Perkin-Elmer), Rhodamine-5-dUTP (Perkin-Elmer) or Biotin-16-dUTP (Roche) with equivalent amounts of the other 3 dNTPs (all 50 μM). Long PCR was carried out using a two-step cycling protocol as described (22) 94° C. for 2 minutes, followed by 20 cycles of (94° C. 15 sec, 68° C. 30 min) using 5 ng of phage λ DNA (New England Biolabs) template and either primers 9, 10, 11 with primer 12 or primer 13 with primers 10, 14.
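As a rough planning aid, the hold time of a cycling program such as the long PCR protocol above can be added up as follows. This is a minimal sketch that ignores ramp times between temperatures; the function name `program_minutes` and its parameters are assumptions introduced here for illustration.

```python
def program_minutes(initial_s, cycles, steps_s, final_s=0):
    """Total hold time in minutes: initial hold, `cycles` repeats of the steps, optional final hold.

    Ramp times between temperatures are not included.
    """
    total_s = initial_s + cycles * sum(steps_s) + final_s
    return total_s / 60


# Long PCR as described: 94 C for 2 min, then 20 cycles of (94 C 15 sec, 68 C 30 min)
long_pcr = program_minutes(initial_s=120, cycles=20, steps_s=[15, 30 * 60])
```

For the long PCR protocol this gives 607 minutes of hold time, i.e. a little over ten hours before ramping is taken into account.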
Single nucleotide incorporation/extension kinetics. Kinetic parameters were determined using a gel-based assay essentially as described (16). Primers 15, 16, 17 (3′ base=G, C, A respectively) were 32P-labeled and annealed to one of template strands 18, 19, 20 (template base=C, G, A respectively) or 21 (template base C different context). Duplex substrates were used at 50 nM final concentration in 1×Taq buffer with various concentrations of enzyme and dNTP. Reactions were carried out at 60° C. for times whereby <20% of primer-template was utilized at the highest concentration of dNTP.
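The source does not state how Vmax and Km were extracted from the gel data. The sketch below shows one conventional possibility, assuming Michaelis-Menten behaviour and a least-squares fit of the Hanes-Woolf linearisation ([S]/v = [S]/Vmax + Km/Vmax), and assuming that extension efficiency is expressed as Vmax/Km. The function names are hypothetical.

```python
def fit_michaelis_menten(conc, rate):
    """Estimate (Vmax, Km) by least squares on the Hanes-Woolf form S/v = S/Vmax + Km/Vmax."""
    if len(conc) != len(rate) or len(conc) < 2:
        raise ValueError("need at least two paired observations")
    y = [s / v for s, v in zip(conc, rate)]
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(conc, y))
    slope = sxy / sxx                 # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1 / slope
    km = intercept * vmax
    return vmax, km


def relative_efficiency(vmax_a, km_a, vmax_b, km_b):
    """Ratio of the Vmax/Km efficiency of enzyme A to that of enzyme B."""
    return (vmax_a / km_a) / (vmax_b / km_b)
```

A fold-improvement of a mutant over the wild-type enzyme on a given mispair would then be `relative_efficiency(vmax_mutant, km_mutant, vmax_wt, km_wt)`. In practice a non-linear fit is often preferred because linearisations weight measurement errors unevenly.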
Template affinity assays. An equilibrium binding assay (12) was used to determine relative affinity of polymerases for the mismatched primer-templates used in the kinetics assays. Polymerases were preincubated at 60° C. in 1×Taq buffer with 50 nM 32P-labeled matched primer-template and 50 nM unlabeled mismatched competitor primer-templates. Reactions were initiated by simultaneous addition of dCTP (200 μM) and trap DNA (XbaI/SalI-restricted sheared salmon sperm DNA, 4.5 mg/ml). Prior experiments demonstrated trap-effectiveness over the time period used (15 seconds).
Translesion Replication Assay. Template primers 22 (undamaged) or 23 (containing a synthetic abasic site) were synthesized by Lofstrand Laboratories (Gaithersburg, Md.). Template primer 24 (containing a single cis-syn thymine dimer) was synthesized as described (34). Primer 25 was 32P-labeled and annealed to one of the three templates 22, 23, 24 (at a primer:template molar ratio of 1:1.5) and extended in 40 mM Tris•HCl at pH 8.0, 5 mM MgCl2, 100 μM of each dNTP, 10 mM DTT, 250 μg/ml BSA, 2.5% glycerol, 10 nM primer-template DNA and 0.1 Unit of polymerase at 60° C. for various times.
5NI replication assay. Primer 26 was 32P-labeled and annealed to template primer 27 (containing a single 5-nitroindole) in 1×Taq buffer, 0.1 or 0.5 U of the polymerase was added and reactions incubated at 60° C. for 15 mins, after which 40 μM of each dNTP were added and incubation at 60° C. continued for various times.
Fidelity assays. Mutation rates were determined using the mutS ELISA assay (Genecheck, Ft. Collins, Colo.) or by performing 2×50 cycles of PCR on three different templates and sequencing the cloned products.
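Sequencing data of the kind described above are commonly summarised as a mutation rate per base pair per template duplication, together with a split into transitions and transversions. The sketch below assumes that convention (rate = substitutions / (bases sequenced × duplications)), which is not spelled out in the source; the function name and the example figures in the usage note are hypothetical.

```python
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def mutation_summary(substitutions, bases_sequenced, duplications):
    """Summarise observed substitutions given as (original_base, new_base) pairs."""
    for old, new in substitutions:
        if old == new or old not in "ACGT" or new not in "ACGT":
            raise ValueError(f"not a base substitution: {old}->{new}")
    ts = sum(1 for pair in substitutions if pair in TRANSITIONS)
    tv = len(substitutions) - ts
    rate = len(substitutions) / (bases_sequenced * duplications)
    return {
        "transitions": ts,
        "transversions": tv,
        "rate_per_bp_per_duplication": rate,
    }
```

For example, four substitutions (two transitions, two transversions) found in 20,000 sequenced bases after an assumed 20 duplications would give a rate of 1 × 10^-5 per bp per duplication.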
Kinetic analysis. Extension and incorporation kinetics of M1 and M4 for a selection of mismatches were measured using a gel-based steady-state kinetic assay (Goodman) (Tables 1 & 2). M1 and M4 respectively extend a C•C mispair 390 and 75-fold more efficiently than wtTaq. Examination of the other most disfavored mismatches (G•A, A•G, A•A, G•G) reveals generic, although less pronounced, increases of extension efficiencies, as suggested by the PCR assay (
Translesion synthesis. Transversion mispairs represent distorting deviations from the cognate duplex structure. We therefore investigated if M1 and M4 were capable of processing other deviations of the DNA structure such as lesions in the template strand. Using a gel-extension assay we investigated their ability to traverse an abasic site and a cis-syn thymine pyrimidine dimer (CPD) template strand lesion. In control assays using an undamaged template, wtTaq, M1 and M4 efficiently and rapidly extended primers to the end of the template (
Unnatural substrates. We reasoned that relaxed geometric selection might also aid the incorporation of unnatural base analogues, some of which inhibit or arrest polymerase activity due to poor geometric fit or lack of interaction with either polymerase or template strand. A first, conservative example is phosphorothioate nucleotide triphosphates (αS dNTPs), in which one of the oxygen atoms in the α phosphate group is replaced by sulfur. As part of a dNTP mixture, αS dNTPs are generally well accepted as substrates by DNA polymerases, but when we replaced all four dNTPs with their αS counterparts in PCR, wtTaq failed to generate any amplification products, while M1 (and to a lesser extent M4) were able to generate PCR products of up to 2 kbp, indicating that they could utilize αS dNTPs with much increased efficiency compared to the wt enzyme (
Long PCR. Amplification product size with wtTaq is generally limited to fragments a few kb long but can be extended to much longer targets by inclusion of a proofreading polymerase (Barnes 92). We found that the selected polymerases, in particular M1, were able to efficiently amplify targets of up to 26 kb (
Libraries of chimeric polymerase gene variants were constructed using a gene shuffling technique called the staggered extension process (StEP, (Zhao, Giver et al. 1998)).
This technique allows two or more genes of interest from different species to be randomly recombined to produce chimeras, the sequence of which contains parts of the original input parent genes.
Thermus aquaticus (Taq) wild type and T8 (a previously selected 11-fold more thermostable Taq variant (Ghadessy, Ong et al. 2001)), Thermus thermophilus (Tth) and Thermus flavus (Tfl) polymerases had previously been amplified from genomic DNA and cloned into pASK75 (Skerra 1994) and tested for activity. These genes were then shuffled using the staggered extension process (StEP) as described (Zhao, Giver et al. 1998) with primers (CAG GAA ACA GCT ATG ACA AAA ATC TAG ATA ACG AGG GCA A and GTA AAA CGA CGG CCA GTA CCA CCG AAC TGC GGG TGA CGC CAA GCG), recloned into pASK75 and transformed into E. coli TG1.
The library size was scored by dilution assays and by determining the ratio of clones containing insert using PCR screening, and was approximately 10^8. A diagnostic restriction digest of 20 clones produced 20 unique restriction patterns, indicating that the library was diverse. Subsequent sequencing of selected chimeras showed an average of 4 to 6 crossovers per gene.
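The arithmetic behind such a library size estimate can be sketched as follows. The source gives only the final figure, so the function, its parameters and the example numbers in the usage note are all hypothetical; the sketch assumes that colonies are counted on a plate from a known dilution and that the insert-containing fraction comes from PCR screening of a sample of clones.

```python
def library_size(colonies, dilution_factor, total_volume_ul, plated_volume_ul,
                 insert_positive, clones_screened):
    """Estimate the number of insert-containing transformants in a library.

    colonies          colonies counted on the dilution plate
    dilution_factor   fold dilution of the transformation before plating
    total_volume_ul   total volume of the transformation
    plated_volume_ul  volume of diluted culture that was plated
    insert_positive   screened clones found to contain insert
    clones_screened   number of clones screened by PCR
    """
    if clones_screened <= 0 or plated_volume_ul <= 0:
        raise ValueError("screened clones and plated volume must be positive")
    transformants = colonies * dilution_factor * total_volume_ul / plated_volume_ul
    return transformants * insert_positive / clones_screened
```

As a purely hypothetical example, 125 colonies from 10 μl of a 10,000-fold dilution of a 1,000 μl transformation, with 16 of 20 screened clones containing insert, would correspond to a library of about 10^8.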
CSR emulsification and selection was performed on the StEP Taq, Tth and Tfl library essentially as described (Ghadessy, Ong et al. 2001). Mismatch primers with two mismatches at their 3′ end (5′-GTA AAA CGA CGG CCA GTT TAT TAA CCA CCG AAC TGC-3′, 5′-CAG GAA ACA GCT ATG ACT CGA CAA AAA TCT AGA TAA CGA CC-3′) were in the emulsion as the source of selective pressure. The aqueous phase was ether extracted, PCR purified (Qiagen, Chatsworth, Calif.) with an additional 35% GnHCl, digested with DpnI to remove methylated plasmid DNA, treated with ExoSAP (USB) to remove residual primers, reamplified with outnested primers and recloned and transformed into E. coli as above.
The resultant clones were screened and ranked by PCR assay. Briefly, 2 μl of induced cells were added to 20 μl of PCR mix with the relevant mismatch primers. Clones that produced a band were then subjected to further analysis and the most active clones were sequenced.
In particular, clone H10 has significant activity on the primers with two mismatches.
H10 is a chimera of T. aquaticus wild type (residues 4 to 20 and 221 to 640), T8 (residues 1 to 3 and 641 to 834) and T. thermophilus (residues 21 to 220). H10 has five detectable crossover sites and 13 point mutations, of which 4 are silent (F74→I, F280→L, P300→S, T387→A, A441→V, A519→V, Q536→R, R679→G, F699→L).
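The segment composition of H10 given above can be held in a small lookup table, which makes it easy to see which parent contributes a given residue. This is an illustrative sketch only; the names `H10_SEGMENTS`, `parent_of` and `is_contiguous` are assumptions introduced here.

```python
# (first residue, last residue, parent) as listed in the text for H10
H10_SEGMENTS = [
    (1, 3, "T8"),
    (4, 20, "Taq wild type"),
    (21, 220, "Tth"),
    (221, 640, "Taq wild type"),
    (641, 834, "T8"),
]


def parent_of(residue, segments=H10_SEGMENTS):
    """Return the parent polymerase contributing the given residue number."""
    for start, end, parent in segments:
        if start <= residue <= end:
            return parent
    raise ValueError(f"residue {residue} is outside the listed segments")


def is_contiguous(segments):
    """True if consecutive segments tile the sequence without gaps or overlaps."""
    return all(prev[1] + 1 == nxt[0] for prev, nxt in zip(segments, segments[1:]))
```

For instance, `parent_of(74)` places the mutated residue 74 in the Tth-derived segment, while `parent_of(679)` falls in the C-terminal T8-derived segment.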
CSR emulsification and selection was performed on the StEP Taq, Tth and Tfl library essentially as described (Ghadessy, Ong et al. 2001). The library had previously been cloned into pASK75 (see example 6). The aqueous phase was ether extracted and replication products were purified using a PCR purification kit (Qiagen, Chatsworth, Calif.) including a wash with 35% GnHCl. 7 μl of purified replication products (from 48) were digested with 1 μl DpnI (20 Units) to remove plasmid DNA and treated with 2 μl ExoSAP (USB) to remove residual primers for 1 h at 37° C. and reamplified with outnested primers (GTAAAACGACGGCCAGT and CAGGAAACAGCTATGAC, 94° C. 2 minutes, and then 30 cycles of 94° C. 30 seconds, 50° C. for 30 seconds and 72° C. for 5 minutes with a final 65° C. for 10 minutes). Reamplification products were digested with XbaI and SalI, recloned into pASK75 and transformed into E. coli as above.
In parallel an alternative selection approach was used: the induced library was emulsified as above with the additional presence of biotinylated dUTP and incubated at 94° C. 5 minutes, 50° C. 1 minute and 72° C. 1 minute. The aqueous phase was ether extracted, and the DNA in the aqueous phase was precipitated by addition of 1/10 volume of 3M NaAc, 1 μl glycogen and 2.5 volumes of 100% ethanol. This was then incubated for 1 hour at −20° C., spun at 13000 rpm for 30 minutes in a benchtop microcentrifuge, washed with 70% ethanol and resuspended in 50 μl buffer EB (Qiagen). 20 μl of Dynabeads (Dynal Biotech) were washed twice and resuspended in 20 μl of bead buffer (10 mM Tris pH 7.5, 1 mM EDTA, 0.2M NaCl). The washed beads were then mixed with the selection in a total volume of 0.5 ml bead buffer and then incubated overnight under constant agitation at room temperature to capture biotinylated products. Beads were washed twice in bead buffer, twice in buffer EB and finally resuspended in 50 μl bead buffer. The resuspended beads were reamplified with outnested primers (sequences and programme as above) and recloned and transformed into E. coli as above.
Two sets of mismatch primers with four mismatches at their 3′ end (underlined) (5′-CAG GAA ACA GCT ATG ACA AAA GTG AAA TGA ATA GTT CGA CTTTT-3′ and 5′-GTA AAA CGA CGG CCA GTC TTC ACA GGT CAA GCT TAT TAA GGTG-3′ as the first set and 5′-CAG GAA ACA GCT ATG ACC ATT GAT AGA GTT ATT TTA CCA CAGGG-3′ and 5′-GTA AAA CGA CGG CCA GTC TTC ACA GGT CAA GCT TAT TAA GGTG-3′ as the second set) were used in the emulsion as two separate sources of selective pressure.
The resultant clones from both CSR and CST were screened and ranked by PCR assay. Briefly, 2 μl of induced cells were added to 20 μl of PCR mix with the relevant 4 mismatch primers. Clones that produced a band were then subjected to further analysis and their activity on single, double and quadruple mismatch primers (single mismatch primers: 5′-CAG GAA ACA GCT ATG ACA AAA ATC TAG ATA ACG AGG GA-3′ and 5′-GTA AAA CGA CGG CCA GTA CCA CCG AAC TGC GGG TGA CGC CAA GCC 3′ (SEQ ID NO: 49); double mismatch primers: CAG GAA ACA GCT ATG ACT CGA CAA AAA TCT AGA TAA CGA CC and GTA AAA CGA CGG CCA GTT TAT TAA CCA CCG AAC TGC; four mismatch primers above.) was investigated. Polymerases that could extend all of these mismatches were found, though many polymerases could do only one of the mismatches and none could do all.
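The number of mismatched bases at a primer 3′ end can be checked by comparing the primer with a reference primer for the same site. The sketch below is illustrative: it takes the shuffling primer used for library construction as the fully matched reference, which is an assumption, and the function names are hypothetical. The comparison aligns the two primers at their 3′ ends, so it is only meaningful when both anneal at the same position.

```python
def clean(seq):
    """Remove whitespace (including stray typesetting gaps) and upper-case the sequence."""
    return "".join(seq.split()).upper()


def three_prime_mismatch_run(reference, primer):
    """Count consecutive differing bases, starting at the 3' end, between a primer
    and a reference primer assumed to be fully matched to the template."""
    ref, test = clean(reference), clean(primer)
    run = 0
    for ref_base, test_base in zip(reversed(ref), reversed(test)):
        if ref_base == test_base:
            break
        run += 1
    return run


# Reference: shuffling primer (assumed matched); test: single mismatch primer from the text
MATCHED = "GTA AAA CGA CGG CCA GTA CCA CCG AAC TGC GGG TGA CGC CAA GCG"
SINGLE = "GTA AAA CGA CGG CCA GTA CCA CCG AAC TGC GGG TGA CGC CAA GCC"
```

Here `three_prime_mismatch_run(MATCHED, SINGLE)` reports a single differing base at the 3′ terminus (G in the reference, C in the mismatch primer).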
The plasmid DNA of the ten best clones was then purified and shuffled as described above (StEP, (Zhao, Giver et al. 1998)). This was then purified, cut and cloned and the resultant library was subjected to another round of CSR as described (Ghadessy, Ong et al. 2001). The same two sets of mismatch primers with four mismatches at their 3′ end were used in the emulsion as two separate sources of selective pressure. This was then dealt with as above and the resultant clones were screened and ranked by PCR assay (as above). Once again, polymerases that could extend all of these mismatches were found (see Table), though many polymerases could do only one of the mismatches and none could do all. There was a notable increase in clones displaying mismatch activity over the first round.
The best clones from the second round were combined with the best clones from the first round on a 96 well plate and were subjected to further screening.
The following table is a summary of the results.
A1 is Tth polymerase; A2 Tfl; A3 Taq; A4 M1; A5 M4; A6 H10 (see previous example). 1A7 to 1D12 are first round clones (where 1 indicates that these are first round clones), 2E1 to 2H12 are second round clones (where 2 indicates that these are second round clones).
The best first and second round clones were shuffled as described above and subjected to another round of CSR. The same two sets of mismatch primers with four mismatches at their 3′ end were used in the emulsion as two separate sources of selective pressure. This was then dealt with as above and the resultant clones were screened and ranked by PCR assay (as above). Once again, polymerases that could extend all of these mismatches were found. In particular, clones 3B5, 3B8, 3C12 and 3D1 (where 3 indicates that these are third round clones) were able to extend primers containing four mismatches. See
Some promising clones were sequenced. All of the polymerases displayed a similar composition: the first part of the protein, roughly corresponding to the 5′-3′ exonuclease domain of the polymerase, was derived from Tth, whilst the remaining part of the protein was derived from Taq. Four point mutations (L33→P, E78→K, D145→G and E822→K) re-occurred in the majority of sequenced mutants and one (B10) had acquired an extra 16 amino acids at its C terminus through a frame shift at position 2499. Tfl was highly underrepresented, although some of its sequence was present.
The below protocol is a sensitive method to measure polymerase activity both for the incorporation of unnatural nucleotide substrates (added to the reaction mixture) or the extension or replication of unnatural nucleotide substrates (incorporated as part of the hairpin oligo).
The assay comprises a hairpin oligonucleotide which constitutes both primer and template in one. It contains as part of the hairpin a biotinylated dU residue, which allows capture of the hairpin oligonucleotide on streptavidin-coated surfaces.
The oligonucleotide folds up into a hairpin with a 5′ overhang, which serves as the template strand for the polymerase (typical sequence: 5′-AGC TAC CAT GCC TGC ACG CAG TCG GCA TCC GTC GCG ACC ACG TT5 TTC GTG GTC GCG ACG GAT GCC G-3′, bases involved in hairpin formation are underlined, 3′ base is in bold, 5=dU-biotin).
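Because the underlining that marks the stem is not reproduced in plain text, the hairpin layout can be recovered from the sequence itself. The sketch below is illustrative only: it treats the dU-biotin position ("5") as T for pairing purposes (an assumption), searches for the longest 3′ terminal stretch whose reverse complement occurs upstream, and reports the 5′ overhang, stem length and loop. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def hairpin_layout(oligo, modified="5", stands_for="T"):
    """Split a hairpin oligo (written 5'->3') into 5' overhang, stem length and loop."""
    seq = "".join(oligo.split()).upper().replace(modified, stands_for)
    stem = 0
    for k in range(1, len(seq) // 2 + 1):
        # extend the 3' arm while its reverse complement is found upstream
        if reverse_complement(seq[-k:]) in seq[:-k]:
            stem = k
        else:
            break
    if stem == 0:
        raise ValueError("3' end does not pair with any upstream region")
    start = seq[:-stem].rfind(reverse_complement(seq[-stem:]))
    return {
        "overhang": seq[:start],
        "stem_bp": stem,
        "loop": seq[start + stem:len(seq) - stem],
    }


HAIRPIN = ("AGC TAC CAT GCC TGC ACG CAG TCG GCA TCC GTC GCG ACC ACG "
           "TT5 TTC GTG GTC GCG ACG GAT GCC G")
layout = hairpin_layout(HAIRPIN)
```

For the typical sequence given above, this yields a 20 base pair stem, a five-residue loop containing the dU-biotin, and a 22 nucleotide 5′ overhang whose last base (T) is the first template position presented to the polymerase.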
Extension reactions are carried out in the presence of small amounts of a labelled nucleotide typically DIG-16-dUTP. Product is captured (for example on a streptavidin coated ELISA plate) and incorporation of labelled nucleotide into the product strand is measured (using for example an anti-DIG antibody) and taken as a measure of polymerase activity.
Extension reactions are carried out in 1×Taq buffer including 1-100 nM of hairpin primer and 100 μM dNTP mixture (comprising 0.3-30% dUTP-DIG), typically incubated at 94° C. for 1-5 min, followed by incubation at 50° C. for 1-5 min, followed by incubation at 72° C. for 1-5 min. (1-10 μl) Reaction products are added to Streptavidin coated ELISA plates (Streptawell, Roche) in 200 μl PBS, 0.2% Tween20 (PBST) and incubated at room temperature for 10 min to 1 h. ELISA plates are washed 3× in PBST and 200 μl of anti-DIG-POD Fab2 fragment (Roche) diluted 1/2000 in PBST is added and the plate is incubated at room temperature for 10 min to 1 h. The plate is washed 3-4× in PBST and developed with an appropriate POD substrate.
Clones previously selected for their ability to extend from a 4 basepair mismatch were assayed for their ability to incorporate a variety of nucleotide analogues.
Clones were grown at 30° C. overnight in 200 μl 2×TY+ampicillin (100 μg/ml). A 150 μl (2×TY+ampicillin 100 μg/ml) overday culture was started from the overnight and grown for 3 hours at 37° C. After 3 hours protein expression was induced by the addition of 50 μl of 2×TY+anhydrotetracycline (8 ng/ml) to the culture which was then allowed to grow for a further 3 h at 37° C. The cells were pelleted at 2254×g for 5 minutes and the growth medium removed by aspiration after which the cell pellet was resuspended in 100 μl 1×Taq buffer (10 mM Tris-HCl, pH 9.0, 1.5 mM MgCl2, 50 mM KCl, 0.1% Triton X-100, 0.01% (w/v) stabiliser; HT Biotechnology Ltd). Resuspended cells were lysed by incubation at 85° C. for 10 minutes and the cell debris was pelleted at 2254×g for 5 minutes.
Reactions were performed in a final volume of 12.5 μl comprising:
The reaction conditions were:
5 μl of the extension reaction was added to 200 μl of PBS-Tween (1×PBS; 0.2% Tween 20) in StreptaWell high bind plates (Roche) and allowed to bind for 30 minutes at room temperature. The plate was washed 3× in PBS-Tween, after which 200 μl PBS-Tween+anti-digoxigenin-POD Fab fragments (antibody diluted 1/2000; Roche) was added. The antibody was allowed to bind for 30 minutes at room temperature.
The plate was washed 3× in PBS-Tween and 200 μl of the substrate added (per ml: 100 μl of 1M NaAc pH 6.0, 10 μl of DAB, 1 μl of H2O2). The reaction was allowed to develop, after which it was stopped by adding 100 μl of 1M H2SO4.
Experiment I. ELISA with Fluorescein 12-dATP:
The ability of clones selected for 4-mismatch extension to incorporate Fluorescein 12-dATP (Perkin Elmer) was assayed using the primer FITC4. The lysates used were concentrated 4-fold.
Experiment II. ELISA with Biotin 11-dATP:
The ability of clones selected for 4-mismatch extension to incorporate Biotin 11-dATP (Perkin Elmer) was assayed using the primer FITC 10. The lysates used were concentrated 4-fold.
Experiment III. ELISA with CyDye 5-dCTP:
The ability of clones selected for 4-mismatch extension to incorporate Cy5-dCTP (Amersham Biosciences) was assayed using the primer ELISAC4P. The lysates used were concentrated 4-fold.
Experiment IV. ELISA with CyDye 3-dUTP:
The ability of clones selected for 4-mismatch extension to incorporate CyDye 3-dUTP (Amersham Biosciences) was assayed using the primer ELISAT3P. The lysates used were concentrated 4-fold. The DIG labelled dUTP in the extension reaction was replaced with Fluorescein 12-dATP and the incorporation of Fluorescein 12-dATP was detected by anti-Fluorescein-POD Fab fragments (Roche).
Experiment V. Abasic Site ELISA
The ability of clones selected for 4-mismatch extension to bypass abasic sites was assayed using the primer Pscreen1Abas (AGC TAC CAT GCC TGC ACG CAG 1CG GCA TCC GTC GCG ACC ACG TT5 TTC GTG GTC GCG ACG GAT GCC G; 1=abasic site).
Clones selected for 4-mismatch extension were assayed for activity with different substrates using an ELISA assay.
A4=Taq mutant M1
A5=Taq mutant M4
A6=Taq mutant H10
Rows A-D Clones isolated after 1 round of 4-mismatch selection
Rows E-H Clones isolated after 2 rounds of 4-mismatch selection
The results are shown in
Experiment V. Abasic site and 5-hydroxyhydantoin bypass
Polymerases 3A10 and 3D1 were investigated further for their ability to bypass abasic sites and 5-hydroxy hydantoins, which are both known to exist in damaged DNA such as found in ancient samples, using the ELISA based activity screen as described above. Both polymerases were more proficient at lesion bypass than wild type Taq by up to two orders of magnitude.
The hydantoin phosphoramidite was synthesised by standard procedures starting from the hydantoin free base. Glycosylation of the silylated hydantoin base in the presence of tin (IV) chloride with the ditoluoyl (alpha) chlorosugar gave rise to two N-glycosylated products which were separated and characterised by 2D-NMR experiments. The toluoyl groups were removed with ammonia to yield the free nucleoside, which was dimethoxytritylated and phosphitylated in the usual manner. The hairpin primer to assay hydantoin bypass was: 5′-AGC TAC CAT GCC TGC ACG CAG XCG GCA TCC GTC GCG ACC ACG TTY TTC GTG GTC GCG ACG GAT GCC G-3′, X=hydantoin, Y=Biotin-dU.
The sequences of the clones referred to in Examples are shown below: For the avoidance of any doubt, the first sequence provided in each section is the nucleic acid sequence. The second sequence provided is the corresponding amino acid sequence of the clone.
A list of polymerases selected to extend four mismatches were assayed for their ability to extend abasic sites in PCR (
Seven polymerases were assayed for their ability to bypass abasic sites in a primer extension assay (
Primer extension assays were essentially as described in (Ghadessy et al., 2004). Briefly, undamaged oligonucleotides and a 51mer containing a synthetic abasic site were synthesized by Lofstrand Laboratories (Gaithersburg, Md.) using standard techniques and were gel purified prior to use. A 20mer primer (LES-20P) with the sequence 5′-CGTGGTCGCGACGGATGCCG-3′ was 5′-labeled with [32P]ATP (5000 Ci/mmole; 1 Ci=37 GBq) (Pharmacia) using T4 polynucleotide kinase (Invitrogen, Carlsbad Calif.). Radiolabeled primer-template DNAs were prepared by annealing the 5′[32P] labeled 20mer primer to one of the two following 51mer templates (at a primer:template molar ratio of 1:1.5): 1) undamaged DNA (UNDT51T); 5′-AGC TAC CAT GCC TGC ACG AAT TCG GCA TCC GTC GCG ACC ACG GTC GCA GCG-3′; 2) an oligo (LABA51T) containing a synthetic abasic site (indicated as X); 5′-AGC TAC CAT GCC TGC ACG ACA XCG GCA TCC GTC GCG ACC ACG GTC GCA GCG-3′. Standard replication reactions of 10 μl contained 40 mM Tris•HCl at pH 8.0, 5 mM MgCl2, 100 μM of each ultrapure dNTP (Amersham Pharmacia Biotech, NJ), 10 mM DTT, 250 μg/ml BSA, 2.5% glycerol, 10 nM 5′[32P] primer-template DNA and 0.1 Unit of polymerase. After incubation at 60° C. for various times, reactions were terminated by the addition of 10 μl of 95% formamide/10 mM EDTA and the samples were heated to 100° C. for 5 min. Reaction mixtures (5 μl) were subjected to 20% polyacrylamide/7 M Urea gel electrophoresis and replication products were visualized by PhosphorImager analysis.
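The geometry of the primer-template substrates described above can be checked computationally. The sketch below is illustrative only; the helper names are hypothetical, while the sequences are those given in the text. It locates the annealing site of the 20mer on each 51mer and reports the first template base encountered by the polymerase.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of an unmodified DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def describe_primer_template(primer, template):
    """Locate the primer binding site and the template overhang to be copied.

    The template is written 5'->3'; the primer anneals to its 3' portion and
    is extended towards the template 5' end.
    """
    site = template.find(reverse_complement(primer))
    if site < 0:
        raise ValueError("primer does not anneal to template")
    overhang = template[:site]  # template bases still to be copied
    return {
        "site_start": site,
        "first_template_base": overhang[-1] if overhang else None,
        "full_length_product": len(primer) + len(overhang),
    }


PRIMER = "CGTGGTCGCGACGGATGCCG"  # 20mer primer from the text
UNDT51T = "AGCTACCATGCCTGCACGAATTCGGCATCCGTCGCGACCACGGTCGCAGCG"
LABA51T = "AGCTACCATGCCTGCACGACAXCGGCATCCGTCGCGACCACGGTCGCAGCG"  # X = abasic site

undamaged = describe_primer_template(PRIMER, UNDT51T)
abasic = describe_primer_template(PRIMER, LABA51T)
print(undamaged)
print(abasic)
```

With these sequences the abasic site is the first template position after the primer 3′ terminus, so any extension product on LABA51T requires insertion opposite the lesion.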
Polymerase A10 was the most active and was chosen for further analysis (
Relaxed specificity might be expected to be achieved at the cost of lower fidelity. We used a MutS ELISA to investigate this possibility.
MutS is an E. coli derived mismatch binding protein that binds single base pair mismatches or small (1-4 base) additions or deletions. It can be used to monitor PCR fidelity in an ELISA based assay (Debbie et al., 1997).
Immobilised Mismatch Binding protein plates (Genecheck, Ft Collins, USA) were used for fidelity measurements as per manufacturer's instructions, essentially as described in (Debbie et al., 1997).
The mutation rate of D1 was compared with that of wtTaq and M1. M1 was already known to have a modestly increased mutation rate (approximately 2-fold) (Ghadessy et al., 2004). The data presented here suggest that D1 has a 2-fold increased error rate compared to M1 and a 4-fold increased error rate compared to wtTaq. This corresponds approximately to a 1 in 2500 error ratio and is sufficiently low not to be problematic for many applications.
DNA recovered from ancient samples is invariably damaged, limiting the information it can yield. Polymerases that can bypass damage (such as abasic site or hydantoins) might therefore be useful in increasing the information that can be recovered from ancient samples of DNA.
Experiment 1: A Mismatch Extending Polymerase can Amplify Previously Un-Amplifiable Cave Hyena DNA
Several samples of cave hyena (Crocuta spelaea) were extracted and analysed. Of those, seven samples (see
These samples were chosen to test the efficacy of the expanded substrate spectrum polymerases.
M1 has a slightly reduced kcat/Km, 14% of Taq wild type, and is hence slightly less efficient in PCR. Therefore, M1 was blended with a commercial preparation of Taq (SuperTaq (HT biotechnology Ltd)) in a ratio of 1 unit to 10 and compared to Taq in the absence of M1. It was hoped that if M1 could bypass the blocking lesions, then the wild type Taq would amplify the resulting translesion synthesis product. On two separate occasions, the M1/SuperTaq mix was able to produce an amplification product whereas SuperTaq alone did not (see
The DNA was cloned and sequenced and found to differ in two positions (A71→G, A77→G) from the expected sequence. This could either be a miscoding lesion resulting from a deamination of C or a population variant sequence not seen previously in aDNA. Indeed, both mutations exist in modern spotted hyena (Crocuta crocuta), arguing for the second interpretation. Of the 10 sequences obtained from the same successful PCR, two each had a further unique single mutation, an A to G in different places. These are most likely errors incurred during amplification. Such errors are frequently seen in aDNA PCR and are one reason why multiple sequences need to be obtained from the same PCR product.
Contamination problems prevented an exhaustive analysis of the benefits of M1 polymerase. However, this result strongly suggested that a suitable altered polymerase could be usefully applied to aDNA.
Experiment 2: A Blend of Mismatch Extending Polymerase Needs Less Ancient DNA for a Successful PCR.
Polymerases that displayed interesting properties were purified: B5, B8, C12 and D1, which can extend mismatches, as well as A10, B6 and B10, which are proficient at abasic site bypass. In order to keep the number of experiments manageable, they were blended in equal volumes with M1, SuperTaq and heparin purified wild-type Taq. This mix of polymerases was used in almost all subsequent experiments and is referred to as the blend.
To ensure that no polymerase would negatively affect the PCR through its mutant activity, each one was individually blended with SuperTaq and used to perform an aDNA PCR with an ancient sample known to contain amplifiable DNA. All PCRs were successful (data not shown), indicating that it was unlikely that any of the mutant enzymes would be a liability in the blend.
The activity of the blend was checked against the activity of SuperTaq by a PCR activity dilution series. By this measure, the blend was less active than SuperTaq, by a factor of two.
The conditions that are usually used in aDNA PCR did not transfer readily to the blend or to SuperTaq as they had been optimised for AmpliTaqGold (Applied Biosystems), a chemically modified version of Taq that allows a hot start and slow enzyme release through heat activation. Manual hot starts are not advisable in aDNA analysis because opening the PCR tube outside the clean room prior to thermocycling carries a high risk of contamination. Furthermore, alternative hot start techniques could not be utilised either: antibodies used to inactivate wtTaq at low temperatures might not bind to the chimerical proteins selected from the Molecular Breeding library and hot start buffers proved ineffective (data not shown). A new two step nested PCR strategy was used. In the first step, the aDNA is amplified over 28 cycles with either SuperTaq or the blend. In the second step, the first PCR is diluted 20 fold in a secondary clean room and amplified with SuperTaq using in-nested primers. This is the approach subsequently used to compare SuperTaq and the blend
Briefly, 2 μl of ancient sample were added to a 20 μl PCR in SuperTaq buffer (HT Biotech) with 1 μM of the appropriate primers (see
A two-fold dilution series of aDNA with equal volumes of SuperTaq and the blend (and therefore approximately equal activities, with the blend slightly less active) was performed, and this was repeated four times
This experiment showed that the blend was more likely to produce a band at a lower concentration of aDNA than SuperTaq. This therefore represented the second experiment that indicated that the mismatch extension polymerases were more proficient at amplifying aDNA than wild-type Taq.
Experiment 3: The Mismatch Extension Polymerases Perform Consistently Better in Ancient DNA PCR.
Sample heterogeneity and the inherent stochasticity of aDNA analysis make the interpretation of a single positive or negative PCR problematic. To address this, multiple PCRs of the same sample were performed at a limiting sample dilution and the number of successful PCR amplifications was counted. Comparison of SuperTaq with the blend thus allowed a statistical analysis. As the amount of aDNA required for this type of approach is large, samples previously shown to be of high quality were chosen and tested at limiting dilutions to increase the amount of material available for analysis. A short target sequence was chosen to allow maximal dilutions.
This has the additional advantage that at a sufficiently high dilution, the undamaged DNA will have been diluted out, leaving only damaged template. In such conditions, the difference between a polymerase that can bypass blocking lesions and one that cannot should become clearly apparent.
A total of nine experiments at limiting amounts of aDNA, where the PCR would only be stochastically successful (
We can therefore state that this effect is not due to chance and that the blend is repeatedly performing better than SuperTaq in the conditions of the experiment. This proves beyond reasonable doubt that the mismatch extension polymerases are a more sensitive tool for the recovery of ancient DNA sequences.
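The statistical test applied to the paired experiments is not specified here. As one possible illustration only, an exact one-sided sign test asks how likely it would be for the blend to outperform SuperTaq in at least a given number of paired experiments if both were equally good. The function name and the example counts below are hypothetical and are not the data of the experiments described.

```python
from math import comb


def sign_test_p(wins, n):
    """Exact one-sided sign test.

    Probability of observing at least `wins` successes out of `n` informative
    (non-tied) paired experiments when each outcome is equally likely.
    """
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n


# HYPOTHETICAL counts for illustration: blend better in 9 of 9, or 8 of 9.
p_all = sign_test_p(9, 9)
p_eight = sign_test_p(8, 9)
print(f"9 of 9: p = {p_all:.5f}")
print(f"8 of 9: p = {p_eight:.5f}")
```

A sign test ignores the size of the difference in each experiment; a test on the pooled success counts could be more powerful where those counts are available.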
We selected for extension and bypass of 5NI directly from the polymerase chimera library described in Example 8 using a strategy analogous to the mismatch selection, with flanking primers (5′-CAG GAA ACA GCT ATG ACA AAA ATC TAG ATA ACG AGG GCA 5NI-3′, 5′-GTA AAA CGA CGG CCA GTA CCA CCG AAC TGC GGG TGA CGC CAA GC5NI-3′) comprising 5NI (or a derivative) at their 3′ ends. After round 3, we used flanking primers (5′-CAG GAA ACA GCT ATG ACA AAA ATC TAG ATA 5NICG AGG GCA 5NI-3′, 5′-GTA AAA CGA CGG CCA GTA CCA C5NIG AAC TGC GGG TGA CGC CAA GC5NI-3′) comprising internal 5NI (or a derivative) as well as 3′ terminal 5NI (or a derivative) to increase selection pressure for 5NI replication.
Five rounds of selection yielded a number of clones with greatly increased ability to replicate 5NI. Among the best clones were round 4 clone 4D11 and round 5 clone 5D4:
Round 5 polymerases selected for replication of 5NI were tested for activity with a range of substrates using the hairpin ELISA assay described in Example 8. tUTP and ceATP were kind gifts from the laboratory of P. Herdewijn, Rega Institute, Katholieke Universiteit Leuven, Belgium. Results are shown in
1. ELISA with tUTP:
The ability of round 5 clones selected for 5NI replication extension to sequentially incorporate 2 or 3 of the TNA UTP derivative (3′, 2′)-beta-L-threonyl-UTP was assayed using the hairpin primers (ELISAT2p: 5′-TAG CTC GGT AA CGC CGG CTT CCG TCG CGA CCA CGT TX TTC GTG GTC GCG ACG GAA GCC G-3′, ELISAT3p: 5′-TAG CTC GGT AAA CGC CGG CTT CCG TCG CGA CCA CGT TX TTC GTG GTC GCG ACG GAA GCC G-3′; X=dU-biotin (Glen Research)). The lysates used were concentrated 4-fold. The ELISA protocol was as described except that the DIG-labelled dUTP in the extension reaction was replaced with Fluorescein 12-dATP (Perkin-Elmer) (at 3% of dATP) and the incorporation of Fluorescein 12-dATP was detected by anti-Fluorescein-POD Fab fragments (Roche).
2. ELISA with ceATP:
The ability of round 5 clones selected for 5NI replication extension to sequentially incorporate the cyclohexenyl ATP derivative ceATP was assayed using the hairpin primer (ELISA2p: 5′-TAG CTC GGA TTTT CGC CGG CTT CCG TCG CGA CCA CGT TX TTC GTG GTC GCG ACG GAA GCC G-3′; X=dU-biotin (Glen Research)). The lysates used were concentrated 4-fold.
3. ELISA with CyDye 5-dCTP and CyDye 3-dCTP:
The ability of round 5 clones selected for 5NI replication extension to sequentially incorporate the fluorescent dye-labelled nucleotides Cy5-dCTP and Cy3-dCTP (Amersham Biosciences) was assayed using the hairpin primer (ELISA2p: 5′-TAG CTA CCA GGG CTC CGG CTT CCG TCG CGA CCA CGT TXT TCG TGG TCG CGA CGG AAG CCG-3′; X=dU-biotin (Glen Research)). The lysates used were concentrated 4-fold.
4. Abasic Site Bypass ELISA
The ability of round 5 clones selected for 5NI replication extension to bypass an abasic site was assayed using the hairpin primer (PScreen1abas: 5′-AGC TAC CAT GCC TGC ACG CAG YCG GCA TCC GTC GCG ACC ACG TTX TTC GTG GTC GCG ACG GAT GCC G-3′; X=dU-biotin, Y=abasic site (Glen Research)). The lysates used were concentrated 4-fold.
1. Extension opposite 5-nitroindole.
Primer extension reactions were carried out as follows:
50 pmol of 32P-labelled primer and 100 pmol of template in a volume of 44 μl were annealed in 1×Taq buffer. 4D11 or 5D4 polymerase as cell lysate (6 μl) was added and reactions were incubated at 50° C. for 15 minutes followed by addition of one dNTP (1 μl in total volume of 50 μl, final dNTP concentration 40 μM). 8 μl samples were taken at various time points and added to 8 μl stop solution (7M urea, 100 mM EDTA containing xylene cyanol F). At the end of the time course the remaining 3 dNTPs were added (final concentration each dNTP 40 μM) and reactions incubated at 50° C. for a further 30 minutes. Reaction samples were electrophoretically separated using 20% polyacrylamide gels at 25W for 4 hours. The resultant gels were dried and scanned using a phosphorimager (Molecular Dynamics). Data was processed using the program ImageQuant (Molecular Dynamics). Results are shown in FIGS. 35, 36:
Similar reactions using Taq, Tth and Tfl wild-type polymerases under identical conditions lead to almost undetectable extension reactions (data not shown).
2. Incorporation and Extension of 5-nitroindole-5′-triphosphate (5NITP).
Primer extension reactions were carried out as follows:
50 pmol of 32P-labelled primer and 100 pmol of template in a volume of 44 μl were annealed in 1×Taq buffer. 4D11 or 5D4 polymerase as cell lysate (6 μl) was added and reactions were incubated at 50° C. for 15 minutes followed by addition of d5NITP (1 μl in total volume of 50 μl, final dNTP concentration 40 μM). 8 μl samples were taken at various time points and added to 8 μl stop solution (7M urea, 10 mM EDTA containing xylene cyanol F). At the end of the time course the 4 native dNTPs were added (final concentration each dNTP 40 μM) and reactions incubated at 50° C. for a further 30 minutes. Reaction samples were electrophoretically separated using 20% polyacrylamide gels at 25W for 4 hours. The resultant gels were dried and scanned using a phosphorimager (Molecular Dynamics). Data was processed using the program ImageQuant (Molecular Dynamics). Results are shown in FIGS. 17, 18:
The NI-NI self-pair is also formed exceptionally well, though further extension is reduced (data not shown). Similar reactions using Taq, Tth and Tfl wild-type polymerases under identical conditions lead to almost undetectable extension reactions (data not shown).
Targets were prepared by PCR amplification of the 2.5 kb Taq gene using primers 29, 28 or 2 kb of the HIV pol gene using primers 30, 31. Salmon sperm DNA (Invitrogen) was prepared at 100 ng/μl in 50% DMSO. FITC and Cy5 probes were prepared by PCR amplification of a 0.4 kb fragment of Taq using primers 8, 28 with either 100% (FITC100M1) or 10% of dATP (FITC10M1, FITC10Taq) replaced by FITC-12-dATP or 10% of dCTP replaced by Cy5-dCTP (Cy5Taq). Cy5 and Cy3 random 20mers (MWG) were used at 250 nM. Targets were purified using a PCR purification kit (Qiagen), prepared in 50% DMSO and spotted onto GAPSII aminosilane-coated glass slides (Corning) using a MicroGrid (BioRobotics). Array hybridizations were performed according to standard protocols:
Printed slides were baked for 2 hr at 80° C., incubated with agitation for 30 min at 42° C. in 5×SSC/0.1% BSA Fraction V (Roche)/0.1% SDS, boiled for 2 min in ultrapure water, washed 20× in ultrapure water at room temperature (RT), rinsed in propan-2-ol and dried in a clean airstream. 50 ng of FITC- and Cy5-labelled probes were prepared in 20 μl of hybridization buffer (1 mM Tris-HCl pH 7.4, 50 mM tetrasodium pyrophosphate, 1× Denhardts solution, 40% deionised formamide, 0.1% SDS, 100 μg/ml sheared salmon sperm DNA). Each sample was heated to 95° C. for 5 min, centrifuged for 2 min, applied to the surface of an array and covered with a 22×22 mm HybriSlip (Sigma). Hybridizations were performed at 48° C. for 16 hr in a hybridization chamber (Corning). Arrays were washed once with 2×SSC/0.1% SDS at 65° C. for 5 min, once with 0.2×SSC at RT for 5 min and twice with 0.05×SSC at RT for 5 min. Slides were dried in a clean airstream, scanned with an ArrayWoRx autoloader (Applied Precision Instruments) and the array images analysed using SoftWoRx tracker (Molecularware).
Complete substitution of natural nucleotides with their unnatural counterparts altered the properties of the resulting amplification products. For example, fully alphaS substituted DNA was completely resistant to nuclease digestion (not shown).
The 0.4 kb fragment, in which all adenines (dA) on both strands had been replaced with FITC-12-dAMP (FITC100M1), displayed extremely bright fluorescence. The frequency of fluorophore incorporation per 1000 nucleotides (FOI) is commonly used to specify the fluorescence intensity of a probe. FOIs of microarray probes commonly range from 10-50, while FITC100M1 has an FOI of 295. To investigate if such a high level of fluorophore substitution would affect hybridisation characteristics we performed a series of microarray experiments. We compared the fluorescent signal generated by FITC100M1 with equivalent probes generated using either wtTaq or M1 and replacing only 10% of dAMP with FITC-12-dAMP (FITC10Taq, FITC10M1 (FOI=30)). In competitive co-hybridisation with a standard Cy5-labelled probe (Cy5Taq), FITC100M1 hybridised specifically only with its cognate Taq polymerase target sequence and not with any non-cognate control DNA. Hybridisation of FITC100M1 generated an up to 20-fold higher specific signal than equimolar amounts of the FITC10 probes (
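The FOI values quoted above are consistent with a simple proportionality between the fraction of dATP replaced and the number of fluorophores per 1000 nucleotides. The following sketch assumes uniform incorporation of the labelled nucleotide at every adenine position on both strands; the function name and the toy sequence are hypothetical and do not represent the 0.4 kb Taq fragment.

```python
def foi_double_stranded(top_strand, fraction_substituted=1.0):
    """Fluorophores per 1000 nucleotides for a duplex labelled at dA.

    Adenines on the bottom strand pair with thymines on the top strand, so
    both A and T of the top strand are counted. ASSUMES that the labelled
    analogue is incorporated at the same frequency as it is supplied.
    """
    seq = top_strand.upper()
    labelled_sites = seq.count("A") + seq.count("T")
    return 1000.0 * fraction_substituted * labelled_sites / (2 * len(seq))


toy = "AATTCCGGAT"  # hypothetical 10 bp duplex, 60% A+T
full = foi_double_stranded(toy, 1.0)
tenth = foi_double_stranded(toy, 0.10)
print(full, tenth)

# Scaling the FOI reported for full substitution to 10% substitution:
scaled = 295 * 0.10
print(scaled)
```

Scaling the reported FOI of 295 to 10% substitution gives approximately 30, in line with the value quoted for the FITC10 probes.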
Mutation rates were determined using the mutS ELISA assay (26) (Genecheck, Ft. Collins, Colo.) according to the manufacturer's instructions. Alternatively, amplification products derived from 2×50 cycles of PCR of 2 targets with different GC content (HIV pol (38% GC), Taq (68% GC)) were cloned, 40 clones (800 bp each) were sequenced and mutations (wtTaq (51), M1 (75)) analyzed.
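The sequencing counts above can be converted into an observed mutation frequency and a relative error rate. This is a sketch under stated assumptions: it assumes that 40 clones of 800 bp were sequenced for each polymerase, and it does not normalise for the number of template doublings, so the values are frequencies per sequenced base rather than error rates per replication.

```python
def mutation_frequency(mutations, clones=40, bp_per_clone=800):
    """Observed mutations per sequenced base.

    The default of 40 clones x 800 bp PER POLYMERASE is an assumption; the
    text does not state how the clones were divided between polymerases.
    """
    return mutations / (clones * bp_per_clone)


wt_freq = mutation_frequency(51)  # wtTaq, 51 mutations
m1_freq = mutation_frequency(75)  # M1, 75 mutations
fold = m1_freq / wt_freq

print(f"wtTaq: {wt_freq:.2e} per bp")
print(f"M1:    {m1_freq:.2e} per bp")
print(f"M1 / wtTaq: {fold:.2f}-fold")
```

The fold difference depends only on the two mutation counts, provided the same number of bases was sequenced for each polymerase.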
Promiscuous mismatch extension might be expected to come at the price of reduced fidelity, as misincorporation no longer leads to termination. Measurement of the overall mutation rate using both the MutS assay (
Naturally occurring translesion polymerases are mostly poorly processive. We therefore investigated whether the processivity of M1 and M4 was similarly reduced, but found that, even at the lowest enzyme concentrations, primer extension and termination probabilities by M1 and M4 closely matched those of wtTaq (
Processivity was measured using a primer extension assay in the presence and absence of trap DNA. Termination probabilities were calculated according to the method of Kokoska et al.
Oligonucleotide primer 32 (5′-GCG GTG TAG AGA CGA GTG CGG AG-3′) was 32P-labelled and annealed to the template 33 (5′-CTC TCA CAA GCA GCC AGG CAA GCT CCG CAC TCG TCT CTA CAC CGC TCC GC-3′) (at a primer/template molar ratio of 1/1.5). wtTaq (0.0025 nM; 0.025 nM; 0.25 nM), M1 (0.05 nM; 0.5 nM; 5 nM), and M4 (0.05 nM; 0.5 nM; 5 nM) were preincubated with the primer-template DNA substrates (10 nM) in 10 mM Tris-HCl at pH 9.0, 5 mM MgCl2, 50 mM KCl, 0.1% Triton X 100 at 25° C. for 15 min. Reactions were initiated by addition of 100 μM dNTPs with or without trap DNA (1000-fold excess of unlabeled primer-templates). Reactions were performed at 60° C. for 2 min. Preincubation of polymerases with the trap DNA substrate and labelled primer-template before the addition of dNTPs completely abolished primer extension (not shown), demonstrating trap effectiveness. Thus, in the presence of trap DNA, all DNA synthesis resulted from a single DNA binding event. Gel band intensities were calculated using a PhosphorImager and ImageQuant software (both Molecular Dynamics). The percentage of polymerase molecules that extended primers to the end of the template was calculated using the formula: In×100%/(I1+I2+ . . . +In), where In is the intensity of the band at position 22 or 23 and I1, I2 . . . are the intensities of the bands at positions 1, 2 . . . Termination probabilities (τ) were calculated according to the method of Kokoska et al., whereby τ at a particular template position was calculated as the intensity of the band at this position divided by the sum of the intensity of this band and the band intensities of all longer products.
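The two calculations described above can be sketched as follows. The function names and the band intensities are hypothetical and serve only to illustrate the arithmetic; they are not measured values.

```python
def termination_probabilities(intensities):
    """Termination probability at each position.

    `intensities[i]` is the band intensity of the product extended to
    position i + 1. Each value is the band intensity divided by the sum of
    that band and all bands from longer products.
    """
    probabilities = []
    for i, band in enumerate(intensities):
        remaining = sum(intensities[i:])  # this band plus all longer products
        probabilities.append(band / remaining if remaining > 0 else float("nan"))
    return probabilities


def percent_full_length(intensities):
    """Percentage of extended primers that reached the final position."""
    return 100.0 * intensities[-1] / sum(intensities)


# HYPOTHETICAL band intensities for positions 1 to 4 of a short template.
bands = [10.0, 20.0, 30.0, 40.0]
taus = termination_probabilities(bands)
full = percent_full_length(bands)
print(taus)
print(full)
```

By construction the termination probability at the last position is 1, since no longer products exist.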
All publications mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the described methods and system of the present invention will be apparent to those skilled in the art without departing from the scope and spirit of the present invention. Although the present invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in biochemistry, molecular biology and biotechnology or related fields are intended to be within the scope of the following claims.