US 20060051789 A1
Methods of preparing gene-specific oligonucleotide libraries are disclosed. In one embodiment a double-stranded RNA corresponding to both sense and antisense strands of mRNA is digested by ribonuclease to produce short RNA fragments. In subsequent ligation steps, flanking oligoribonucleotides of defined sequences may be attached to the 3′- and 5′-ends of each fragment by RNA ligase (such as T4 RNA ligase). The products of ligation can be reverse transcribed and PCR amplified (RT-PCR) using the oligonucleotides attached to the gene-derived sequences as primer-binding sites. Various methods for incorporating libraries into expression vectors allowing expression of either siRNAs or shRNAs are also disclosed.
1. A method of producing a target-specific library that comprises substantially all sequences of a pre-determined length or range of lengths that are comprised within a target polynucleotide sequence, the method comprising:
digesting a double-stranded RNA copy of said target polynucleotide with a nuclease to generate fragments of from about 10 nucleotides to about 40 nucleotides in length;
dephosphorylating said RNA fragments;
ligating said RNA fragment to a first flanking oligonucleotide comprising a 3′ terminator nucleotide to generate a first ligation product;
phosphorylating said first ligation product;
ligating to said first ligation product a second flanking oligonucleotide lacking a 5′ phosphate group to generate a second ligation product;
reverse transcribing said second ligation product to generate a cDNA; and
amplifying said cDNA with primers complementary to said first and said second flanking oligonucleotide;
wherein said resulting library of polynucleotides comprises substantially all sequences of a pre-determined length within said target polynucleotide sequence.
2. A method of producing a target-specific library that comprises substantially all sequences of a pre-determined length or range of lengths that are comprised within a target polynucleotide sequence, the method comprising:
digesting a double-stranded RNA copy of said target polynucleotide with a nuclease to generate fragments of from about 10 nucleotides to about 40 nucleotides in length;
dephosphorylating said RNA fragments;
ligating 2′-deoxyadenosine 3′-monophosphate (pdAp) to each end of said product of dephosphorylation;
dephosphorylating the product of said ligation reaction;
ligating product of said dephosphorylation reaction into a linearized vector having 3′-deoxythymidine overhangs;
filling in gaps by using a DNA polymerase such as E. coli Pol I;
amplifying the resulting vector in bacteria to replace RNA with DNA;
wherein said resulting library of polynucleotides comprises substantially all sequences of a pre-determined length within said target polynucleotide sequence.
3. The method according to
4. The method of
5. The method of
6. The method according to
7. The method according to
8. The method of
9. The method of
10. The method of
11. The method of
12. The method of
13. The method of
14. The method according to
digesting said library of polynucleotides with a restriction endonuclease that cleaves in the ligated flanking sequences.
15. The method of
16. A method of producing a target-specific library that comprises substantially all sequences of a pre-determined range of lengths that are comprised within a target polynucleotide sequence, the method comprising:
partially digesting a double-stranded DNA copy of said target polynucleotide with DNase I, wherein digestion is performed in the presence of Mn2+ to generate blunt-ended fragments of from about 10 nucleotides to about 40 nucleotides in length or a wider range that comprises the range 10 to 40 nucleotides;
ligating said DNA fragment to a first adapter;
ligating the above product to a second DNA adapter;
amplifying the product of the above reaction using primers complementary to said first and said second adapters; and
inserting said fragments into a vector or between fixed sequence segments of DNA.
17. The method of
18. The method according to
19. The method according to claim 17, further comprising the steps of:
digesting the product of ligation or amplification with one or two restriction endonucleases targeted to a sequence in one or both adapters.
20. A method of producing a target-specific library that comprises substantially all sequences of a pre-determined range of lengths that are comprised within a target polynucleotide sequence, the method comprising:
hybridizing hemi-random probes to a ssDNA target, wherein said hemi-random probes comprise a fixed region comprising primer-binding sequences with encoded restriction enzyme recognition sites and a 10-nt randomized sequence located at the 5′ end in the case of one probe and at the 3′-end in the case of the other;
ligating hybridized probes that hybridize to adjacent target sequences;
amplifying the product of said ligating step;
inserting the product of said amplification into a vector or between DNA sequences allowing expression of the inserted sequences.
21. The method according to
22. The method according to
23. The method according to
24. The method according to
The invention provides methods and reagents for producing gene-specific (directed) oligonucleotide libraries comprising sequences of defined length corresponding to portions of a polynucleotide target of interest, and their uses in a wide range of nucleic acid applications, as gene inhibitors and analytical/diagnostic probes.
Important requirements for gene inhibitors and diagnostic methods based on nucleic acids are sequence specificity and high efficacy. Such applications include si/shRNA (small interfering/small hairpin RNA) (Rossi et al. (2002) Nucleic Acids Res. 30:1757-1766; Shi (2003) TRENDS Genetics 19: 9-12; Bohula et al. (2003) J. Biol. Chem. 278: 15991-15997), ribozyme (Scarabino & Tocchini-Valentini (1996) FEBS Lett. 383:185-190; Amarzguioui et al. (2000) Nucleic Acids Res. 28:4113-4124), and antisense (Bruice & Lima (1997) Biochemistry 36:5004-5019; Sohail & Southern (2000) Adv. Drug Deliv. Rev. 44:23-34) approaches to gene inhibition, as well as microarrays (Southern et al. (1999) Nat. Genet. 21:5-9), competitive RT-PCR (Ishibashi (1997) J. Biochem. Biophys. Methods 35:203-207), blots and in situ hybridization.
The specificity and efficacy of probe hybridization depends on parameters such as target accessibility, hybridization rate, and the stability of the formed duplex (Sczakiel and Far (2002) Curr. Opin. Mol. Ther. 4:149-153). Because of the complexity of these interactions, the rational design methods, both experimental and theoretical, that have been developed for predicting optimal probe sequences and target site accessibility have had only limited success (Sczakiel & Far (2002) Curr. Opin. Mol. Ther. 4:149-153; Sohail & Southern (2000) Adv. Drug Deliv. Rev. 44: 23-34). Also, the common notion that sequences that are less involved in internal hydrogen bonding interactions represent more favorable target sequences is an oversimplification (Sczakiel & Far (2002) Curr. Opin. Mol. Ther. 4:149-153; Fakler et al. (1994) J. Biol. Chem. 269:16187-16194; Laptev et al. (1994) Biochemistry 33:11033-11039). Target RNAs are often folded differently in the cell than in vitro (Lindell et al. (2002) RNA 8:534-541), and may be complexed with proteins that further reduce target site accessibility (Lieber & Strauss (1995) Mol. Cell Biol. 15:540-551). Conversely, some cellular factors may promote probe hybridization with target sites that are not accessible in vitro (Laptev et al. (1994) Biochemistry 33:11033-11039; Bertrand & Rossi (1994) EMBO J. 13:2904-2912).
As a consequence of this complexity, optimal sequences of nucleic acid hybridization probes as well as antisense and ribozyme gene-inhibitors (drugs) cannot reliably be selected based on sequence data analysis or using experimentally-determined in vitro target accessibility. To address this problem, several in vitro and in vivo methods for selecting optimal target sequences from sequence libraries have been developed, using 5-30 nucleotide long variable sequences (Lieber & Strauss (1995) Mol. Cell. Biol. 15:540-551; Allawi et al. (2001) RNA 7:314-327; Lloyd et al. (2001) Nucleic Acids Res. 29:3664-3673; Ho et al. (1998) Nat. Biotechnol. 16:59-63; Birikh et al. (1997) RNA 3:429-437; Lima et al. (1997) J. Biol. Chem. 272:626-638; Wrzesinski et al. (2000) Nucleic Acids Res. 28:1785-1793; Scherr et al. (2001) Mol. Ther. 4:454-460; Milner et al. (1997) Nat. Biotechnol. 15:537-541; Patzel & Sczakiel (2000) Nucleic Acids Res. 28: 2462-2466; Yu et al. (1998) J. Biol. Chem. 273:23524-23533; WO 00/43538; WO 02/24950). An additional advantage of such libraries is that they can be used in a “reverse” genomics approach, which can identify genes responsible for a specific phenotype without prior knowledge of any sequence information (Li et al. (2000) Nucleic Acids Res. 28:2605-2612; Kawasaki & Taira (2002) Nucleic Acids Res. 30:3609-3614; Akashi et al. (2005) Nature Rev. 6:413-22).
In the case of siRNAs and shRNAs, the situation is even more complicated. Not all siRNA and shRNA sequences are equally potent or specific. Although it has long been thought that siRNAs shorter than about 30 bp avoided induction of interferon and PKR, recent reports indicate that in fact siRNAs longer than about 19 bp (Fish & Kruithof (2004) BMC Mol. Biol. 5:9) or having a 5′-triphosphate group (Kim et al. (2004) Nat. Biotechnol. 22: 321-325) can trigger an interferon response. In addition, siRNAs can produce off-target effects, whereby unintended mRNAs are silenced due to having partial homology to the siRNA. Off-target effects may be less problematic with highly potent siRNAs because they can be used at lower concentrations, where discrimination between matched and mismatched targets is greater. Identifying highly potent siRNAs is also crucial to efforts to develop siRNA therapeutics. High potency has been associated with specific sequence features as well as the internal stability profile of the siRNA and the accessibility of the mRNA target site (Elbashir et al. (2001) Nature 411: 494-498; Lee et al. (2002) Nat. Biotechnol. 20: 500-505; Paul et al. (2002) Nat. Biotechnol. 20: 505-508; Hohjoh (2002) FEBS Lett. 521: 195-199; Holen et al. (2002) Nucleic Acids Res. 30: 1757-1766; Khvorova et al. (2003) Cell 115: 209-216; Kretschmer-Kazemi et al. (2003) Nucleic Acids Res. 31: 4417-4424; Reynolds et al. (2004) Nat. Biotechnol. 22: 326-330; Ui-Tei et al. (2004) Nucleic Acids Res. 32: 936-948). These correlations have been incorporated into algorithms that are commonly used to predict functional siRNAs. Despite the success of these algorithms at finding good siRNAs, many effective siRNA sequences are not predicted by them. Ideally, all possible target-specific siRNA sequences of appropriate lengths would be tested in cells to assure finding the best inhibitors for a given mRNA (Singer et al. (2004) Proc. Natl. Acad. Sci. USA. 101: 5313-5314). However, such a “brute force” approach is expensive and time-consuming. An attractive alternative is to screen cell-based libraries of sequences for the most potent siRNAs, without any bias for or against sequence features except for their presence within the target.
In principle, screening for gene inhibitors may be performed by using completely random (degenerate) libraries. However, this approach has several major problems. The high complexity of random libraries (e.g., 4^20 or ~10^12 molecules for 20-nt antisense sequences represented only about once in the human genome) (Saha et al.) may make this approach time-consuming and expensive for cell-based assays (Kruger et al., 2000; Kawasaki & Taira, 2002; Miyagishi & Taira, 2002; Tran et al. 2003). Also, experiments have shown that degenerate libraries are highly toxic to cells: antisense ribozymes with degenerate substrate recognition sites can efficiently block the functioning of both mRNAs of interest (host or foreign) and unintended cellular RNAs (Pierce & Ruffner, 1998; Kruger et al., 2000). Several groups have made gene-specific siRNA pools by digestion of long RNA duplexes with E. coli RNase III (Calegari et al. (2002) Proc. Natl. Acad. Sci. USA 99: 14236-14240; Yang et al. (2002) Proc. Natl. Acad. Sci. USA 99: 9942-9947; Yang et al. (2004) Methods Mol. Biol. 252: 471-482; Kittler et al. (2004) Nature 432: 1036-1040) or recombinant human Dicer (Kawasaki et al. (2003) Nucleic Acids Res. 31: 981-987). Such siRNA pools are able to efficiently silence target mRNAs, and can be directly used in cell-based loss-of-function studies. However, no selection of the most potent siRNA species is possible unless RNAs are converted into DNA sequences and incorporated into appropriate expression vectors (as described in the present invention). Such expression vectors may contain opposing (convergent) promoters, allowing transcription of both RNA strands, which can then anneal to form functional siRNA molecules. Similar vectors to express siRNA libraries comprising both defined and randomized sequences have been recently described (Tran et al. (2003) BMC Biotechnol. 3: 1-9; Zheng et al. (2004) Proc. Natl. Acad. Sci. USA. 101: 135-140; Seyhan et al. (2005) RNA 11: 837-846).
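The complexity quoted above for a fully random 20-nt library can be checked with a short calculation. The following is only an illustrative sketch; the function name `random_library_complexity` is an assumption introduced here and does not appear in the disclosure.

```python
def random_library_complexity(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences in a fully degenerate library.

    Each of the `length` positions can hold any of `alphabet_size`
    nucleotides, so the count is alphabet_size ** length.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


# A completely random 20-nt library: 4^20 sequences, on the order of 10^12.
complexity_20nt = random_library_complexity(20)
print(complexity_20nt)           # 1099511627776
print(f"{complexity_20nt:.2e}")  # 1.10e+12
```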
A number of previous studies have suggested that for a given target site, shRNAs expressed as single molecules from vectors with pol III promoters are generally more effective than siRNAs expressed as separate strands from opposing promoters. Any effective siRNA sequences identified by screening of gene-specific siRNA libraries can be subsequently converted to the shRNA format and tested for improvements in gene silencing. However, in certain cases pol III-expressed siRNA libraries may have an advantage over shRNA libraries. Since short siRNAs may bypass the Dicer processing pathway (Lee et al. (2002) Nat. Biotechnol. 20: 500-505; Paul et al. (2002) Nat. Biotechnol. 20: 505-508; Miyagishi & Taira (2002) Nat. Biotechnol. 20: 497-500), siRNAs could potentially be used in differentiated cells containing little or no Dicer (Brummelkamp et al. (2002) Science 296: 550-553; Sui et al. (2002) Proc. Natl. Acad. Sci. USA 99: 5515-5520; Parrish et al. (2000) Mol. Cell. 6: 1077-1087; Zheng et al. (2004) Proc. Natl. Acad. Sci. USA. 101: 135-140). Besides, shRNAs can be difficult to amplify and transcribe, and are unstable during cloning in E. coli, which can lead to a reduction in library coverage and potential loss of the best target sites.
To take full advantage of the expressed siRNA libraries, an appropriate screen for the most potent siRNA species should be devised. The screening can be done by cloning all species and testing them individually in cell culture, a very laborious process (Zheng et al. (2004) Proc. Natl. Acad. Sci. USA. 101: 135-140; Aza-Blanc et al. (2003) Mol. Cell. 12: 627-637) or by a screen for the phenotype conferred by inhibition of the target. For fluorescent-tagged targets such as GFP fusions, a fluorescence-activated cell sorter can be used. For targets whose silencing confers a growth or survival advantage, such as a virus or a pro-apoptotic gene, the desired species will outgrow the others. For other targets, fusion with a “suicide gene” such as the thymidine kinase of Herpes simplex virus (HSV-TK) can also allow selection for cells in which the target is silenced (Shirane et al. (2004) Nat. Genet. 36: 190-196).
Directed (gene-specific) libraries comprised of all 15-25-nt long sequences represented within the target gene(s) of interest offer a superior alternative to screening completely random libraries. The use of directed libraries prepared in vitro significantly simplifies the screening process since comparatively small libraries need to be assayed. For example, a 20-nt directed library targeting a 2000-nt long mRNA consists of only 1981 different molecules. Moreover, unintended knockdown of non-targeted genes is reduced, allowing more efficient cell-based assays with the directed libraries cloned into appropriate vectors. Currently, there are several reported methods of preparation of directed libraries that can be cloned, amplified and inserted into appropriate antisense, ribozyme, or siRNA expression cassettes (Pierce & Ruffner, 1998; Ruffner et al., 1999; Paquin et al., 2000; Sohail & Southern, 2000; Kazakov et al., Vlassov et al. 2004).
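The library size in the example above follows from counting the start positions of a fixed-length window along the target: a target of length L contains L - k + 1 windows of length k. The sketch below illustrates this count and the enumeration itself. The helper names `library_size` and `directed_library` are hypothetical, and the count refers to window positions, so a target with internal repeats would yield fewer distinct sequences.

```python
from typing import List


def library_size(target_length: int, k: int) -> int:
    """Number of k-nt windows (start positions) in a target of given length."""
    if k <= 0 or k > target_length:
        raise ValueError("k must be between 1 and the target length")
    return target_length - k + 1


def directed_library(target: str, k: int) -> List[str]:
    """Enumerate every k-nt window of the target, ordered by start position."""
    return [target[i:i + k] for i in range(library_size(len(target), k))]


# The example from the text: 20-nt windows along a 2000-nt mRNA.
print(library_size(2000, 20))  # 1981

# A toy 8-nt target (illustrative sequence only) tiled with 3-nt windows.
print(directed_library("ACGUACGU", 3))
```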
One method that has been used for preparation of a directed sequence library is a multi-stage process for making a directed antisense library against a target transcript specifically for hammerhead ribozyme constructs (Pierce and Ruffner (1998) Nucleic Acids Res. 26:5093-101; WO 99/50457). This method involves multiple enzymatic manipulations to produce a directed library of antisense sequences with a uniform length (10 or 14 nt, determined by the type IIS restriction endonuclease used in the procedure). In addition to the technical complexity of the procedure, this method has the additional disadvantage that the terminal ~500 nucleotides at each end of the target sequences are missing, and the size of the antisense sequences is restricted to 14 nt or less (which is less than that required for siRNAs).
Another method for producing a directed library, described in WO 00/43538 and Bruckner et al. (2002) Biotechniques 33: 874-882, includes hybridization of an immobilized DNA target with a randomized sequence of uniform length (20 nucleotides), flanked on each end by a defined primer sequence masked by complementary blocking oligonucleotides. This method suffers from several serious drawbacks: the complexity of the initial random library (4^20 or ~10^12) is higher than any target gene complexity (and even the entire human genome). The screening of such libraries is very time- and labor-intensive, and it requires immobilization of the target polynucleotides. The method is restricted to the use of long, immobilized DNA targets, which hybridize to oligonucleotide probes less efficiently than shorter, non-immobilized oligonucleotide fragments in solution (see, e.g., Armour et al. (2000) Nucleic Acids Res. 28: 605-09; Southern et al. (1999) Nature Genet. Suppl. 21:5-9). Hybridization with an immobilized target requires large volumes for hybridization solutions. Solid-phase hybridization methods produce high background due to nonspecific surface effects. Extra steps are required to separate bound from unbound probes and to elute bound probe from the target prior to amplification of the bound sequences. In addition, hybridization patterns obtained with a completely random 20-nucleotide library are expected to be far less intense than those obtained with shorter libraries, due to formation of complementary complexes among members of the library (see, e.g., Ho et al. (1996) Nucleic Acids Res. 24:1901-07). Even when a high initial concentration of the 20-nucleotide random library is used, the concentration of individual sequences in the random pool is not high enough to provide efficient hybridization with a DNA target (see, e.g., Wetmur (1991) Critical Rev. Biochem. Mol. Biol. 26:227-59). Finally, the method has low specificity; WO 00/43538 suggests that the majority of the 20-mer sequences captured on an immobilized DNA target from the random oligonucleotide pool at 52° C. will contain 4-8 mismatches.
Yet another method that has been used is described in Boiziau et al. (1999) J. Biol. Chem. 274: 12730-12737, using a “template-assisted combinatorial strategy”. Boiziau et al. selected DNA aptamers targeting an accessible binding site in an RNA hairpin, using both completely random libraries and libraries “enriched” in target-specific sequences. The “enriched sequences” were produced by ligation of “half-candidates” in the presence of an RNA hairpin using RNA ligase. The half-candidates were designed as hemi-random probes containing defined primer and comparatively long 15-nt terminal random sequences, and were used without masking oligonucleotides in the ligation reaction. Both ligation methods showed low efficiency and target-specificity, which is a consequence of the preference of RNA ligase to ligate sequence motifs that are not aligned in complementary complexes (Harada and Orgel (1993) Proc. Natl. Acad. Sci. USA 90: 1576-1579). Also, due to the lack of masking oligonucleotides, most ligation products were unrelated to the RNA target. Consequently, the authors found no benefit to using libraries prepared from hemi-random probes versus using probes with completely random 30-mer libraries without a ligation step.
Recently, Shirane et al. (Shirane et al. (2004) Nat. Genet. 36: 190-196) developed another method of preparation of a directed library of 19-21 bp DNA fragments that allows expression of shRNA from the library. This method includes quasi-random fragmentation of a double-stranded DNA corresponding to the gene of interest by DNase I (Matveeva et al. 1997). The ends of these fragments were blunted by DNA polymerase and ligated by DNA ligase to a hairpin-shaped adaptor containing the recognition sequence of Mme I restriction endonuclease. Subsequent cleavage by Mme I produced DNA fragments of uniform length of 19-21 bp. This preparation scheme is rather complex, and the obtained library is restricted to species ˜20 nt in length.
Alternatively, the same enzyme MmeI was used to adjust the length of double-stranded DNA fragments of a gene of interest produced by the action of a mixture of restriction endonucleases including HinpI, BsaHI, AciI, HpaII, HpyCHIV and TaqαI (Sen et al. (2004) Nat. Genet. 36: 183-189). These restriction enzymes are frequent cutters and leave identical CG-overhangs to facilitate cloning. In the next step of this scheme, the obtained DNA fragments were ligated to the loop sequence containing the MmeI restriction site, which was used to generate ~20 bp long fragments of the directed library. Using a multi-step procedure, the resulting fragments were cloned into expression vectors to produce the shRNA library. The main drawback of this scheme is that the cocktail of restriction enzymes does not produce sufficiently random cuts, and as a result the obtained library contained only 34 unique target-specific sequences out of a theoretically possible 981 for the 1000-nt long target. This too is a rather complex scheme and the obtained library is also restricted in length to ~20 nt.
In view of the foregoing, there is a need for an improved procedure for generating a directed sequence library that is highly specific for the target sequence from which the library is generated, and that does not suffer from the limitations of the methods described above. There is also a high demand for improved cassettes to express directed libraries and for subsequent selection schemes that allow the best candidates to be chosen, including antisense RNAs, ribozymes, and si/shRNAs.
Methods are provided for producing target-specific (directed) libraries that comprise substantially all sequences of a pre-determined length that are comprised within a target polynucleotide sequence, which polynucleotide may be a gene, plurality of genes, genome, etc. Such libraries are useful in the expression and selection of gene expression inhibitors and molecular tools, analytical assays and diagnostics specific for the target polynucleotide.
In one embodiment of the invention, a double-stranded RNA comprising complementary strands of a target polynucleotide is digested by ribonuclease to produce double stranded RNAs of a predetermined size. In some embodiments, the RNAse is a length-directed RNAse, e.g. Dicer, which may be utilized in combination with an enzyme providing 3′ phosphatase activity, e.g. ExoIII. The dsRNA fragments of pre-determined size are ligated to oligoribonucleotides of defined sequence at both the 3′- and 5′-ends. The products of ligation are reverse transcribed and amplified using the ligated oligonucleotides as primer-binding sites.
In another embodiment of the invention, a directed library is produced by ligation of hemi-random probes hybridized to adjacent sites on a polynucleotide target. After ligation of the probes with a DNA ligase (such as T4 DNA ligase), pairs of ligated probes are PCR amplified.
In yet another embodiment, a deoxyribonuclease (e.g. DNase I) is used to digest the target polynucleotide. Flanking oligonucleotides are ligated to the obtained fragments, allowing subsequent PCR amplification using the oligonucleotide sequences as primer-binding sites.
The amplified double-stranded DNA fragment encoding the directed libraries, obtained by any of the above described methods, can be inserted in an expression cassette, where such cassettes include PCR templates, vectors, etc. Various methods can be used for this purpose, including annealing to flanking oligonucleotides and extension with Klenow polymerase (in case of PCR cloning); enzymatic ligation using blunt ends or specific restriction sites; and the like. In the latter case, treatment of the amplified polynucleotides with restriction endonucleases (acting at sites encoded in primer-binding flanking constant regions) releases directed sequence inserts.
The directed libraries are useful in various screening methods. Antisense RNAs, ribozymes, siRNAs, shRNAs, miRNAs, etc. can be expressed from the libraries according to suggested protocols, and the expressed RNAs may be selected for functional characteristics, including efficacy. Selection schemes of interest include, without limitation, selection of RNA Lassos capable of fast and efficient hybridization with target RNA; selection of potent inhibitors from siRNA libraries in vivo; selection of optimal viral target sites in virus-infected mammalian cells; and the like.
These and other objects, advantages, and features of the invention will become apparent to those persons skilled in the art upon reading the details of the methods of producing libraries and uses thereof as more fully described below.
The invention is best understood from the following detailed description when read in conjunction with the accompanying drawings. It is emphasized that, according to common practice, the various features of the drawings are not to-scale. On the contrary, the dimensions of the various features are arbitrarily expanded or reduced for clarity. Included in the drawings are the following figures:
Before the present methods, libraries, and uses thereof are described, it is to be understood that this invention is not limited to particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.
Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limits of that range is also specifically disclosed. Each smaller range between any stated value or intervening value in a stated range and any other stated or intervening value in that stated range is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included or excluded in the range, and each range where either, neither or both limits are included in the smaller ranges is also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are now described. All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited.
It must be noted that as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a sequence” includes a plurality of such sequences and reference to “the ligation” includes reference to one or more ligations and equivalents thereof known to those skilled in the art, and so forth.
The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, and biochemistry, which are within the skill of the art. Such techniques are explained fully in the literature, such as: “Molecular Cloning: A Laboratory Manual,” vol. 1-3, third edition (Sambrook et al., 2001); “Oligonucleotide Synthesis” (M. J. Gait, ed., 1984); “Methods in Enzymology” (Academic Press, Inc.); “Current Protocols in Molecular Biology” (F. M. Ausubel et al., eds., 1987); “PCR Cloning Protocols,” (Yuan and Janes, eds., 2002, Humana Press).
The invention provides a method that produces essentially perfect directed libraries, comprising substantially all sequences of a pre-determined length that are comprised within a target polynucleotide sequence. By producing a substantially complete library of defined length fragments, the target polynucleotide is efficiently analyzed for fragments corresponding to optimal sequences for various purposes, such as RNA Lasso; siRNA; ribozymes; and the like. By “substantially all”, it is intended that the library comprises at least about 90% of the possible sequences, and may comprise at least about 95%, at least about 99%, or more.
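The "substantially all" criterion above can be read as a coverage fraction: the share of the possible fixed-length target sequences that are actually present in a library. The sketch below is one way to express that reading, under the assumption that library members have already been sequenced and are compared as exact strings. The function names, the toy sequences, and the use of distinct sequences rather than positions are illustrative choices, not part of the disclosure.

```python
from typing import Iterable, Set


def expected_windows(target: str, k: int) -> Set[str]:
    """All distinct k-nt sequences contained in the target."""
    if k <= 0 or k > len(target):
        raise ValueError("k must be between 1 and the target length")
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def library_coverage(target: str, k: int, observed: Iterable[str]) -> float:
    """Fraction of the possible k-nt target sequences found in the library."""
    expected = expected_windows(target, k)
    found = expected.intersection(observed)  # non-target members are ignored
    return len(found) / len(expected)


def is_substantially_complete(coverage: float, threshold: float = 0.90) -> bool:
    """Apply the 'at least about 90%' criterion stated in the text."""
    return coverage >= threshold


# Toy example with made-up sequences: 4 of 5 possible 4-mers are recovered.
toy_target = "ACGUACGUAA"
toy_library = ["ACGU", "CGUA", "GUAC", "UACG", "NNNN"]
coverage = library_coverage(toy_target, 4, toy_library)
print(coverage, is_substantially_complete(coverage))  # 0.8 False

# The restriction-enzyme library discussed earlier: 34 of 981 sequences.
print(f"{34 / 981:.1%}")  # 3.5%
```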
Target polynucleotides of interest include RNA species, e.g. mRNA, groups of mRNAs, etc., and DNA species, e.g. genes, introns, exons, regulatory sequences, genomes of mitochondria, viruses, bacteria, eukaryotes, etc.
In some embodiments of the invention, enzymatic reactions are performed on dsRNA species as schematically shown in
The resulting dsRNA is nuclease digested. In some embodiments, the nuclease is a length-directed RNAse, where for the purposes of the present invention, a length-directed ribonuclease cleaves an RNA, usually a dsRNA, into fragments of defined length greater than about 10 nucleotides in length, usually in a processive manner. The length is usually at least about 10 nucleotides, more usually at least about 12 nucleotides, and may be at least about 20 nucleotides; and not more than about 40 nucleotides, more usually not more than about 30 nucleotides, and may be not more than about 25 nucleotides. In other embodiments, the nuclease is not length-directed and the resulting digestion product is size fractionated prior to use, e.g. by gel electrophoresis, etc. Preferred nucleases cleave in a non-site specific manner.
Length-directed nucleases of particular interest for this purpose are Dicer and RNAse III. Both recombinant human Dicer and Escherichia coli RNase III can be used in vitro to cleave long dsRNA. Dicer is an endoribonuclease that contains RNase III domains and is the enzyme responsible for cleavage of long dsRNAs to siRNA in the endogenous RNAi pathway. The siRNAs produced by Dicer are about 19-21 bp in length and contain 3′ dinucleotide overhangs with 5′-phosphate and 3′-hydroxyl termini (Myers et al. 2003; Kawasaki et al. 2003, supra). E. coli RNase III is involved in the maturation and degradation of diverse cellular, phage, and plasmid RNAs. It is also applicable for digesting long dsRNA; its cleavage products range from ~11-25 bp in length with termini identical to those produced by Dicer (Yang et al. 2002; Yang et al. 2004). Both ribonucleases are commercially available from multiple sources.
When provided short targets (<65 bp), Dicer appears to measure from an end in determining its cut sites (Zhang et al. (2002) EMBO J. 21: 5875-5885; Zhang et al. (2004) Cell 118: 57-68; Siolas et al. (2004) Nat. Biotech. 23:227-231), raising the question of whether sequential cut sites in longer RNAs are in register and might skip over some target sequences. The fact that digestion from either end can occur in most cases provides a second register of cutting which reduces the likelihood of skipping some sequences. Moreover, since each cut site is actually a distribution of several adjacent cleavages (see Zhang et al. (2004), supra), each successive cleavage makes the distribution wider and wider, so that essentially all sites are cleaved except those within about 60-100 bp of the ends. By starting with a dsRNA target flanked by extra 100 bp of nontarget sequences at either end, this concern can be eliminated, and the resulting addition of a few nontarget siRNAs to the library will have no effect on the effectiveness of library screening. In some embodiments of the invention, the target nucleic acid is flanked by at least about 60 nucleotides, and may be flanked by 100 nt. or more of nontarget sequence.
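By way of illustration only, the widening-register argument can be checked with a toy calculation. The sketch below is not a model of Dicer kinetics. It assumes, hypothetically, that cutting starts at one duplex end and that each successive cut lands 21 nt after the previous one, give or take a fixed `spread`; it then reports the distance from the end beyond which every position is a possible cut site. The function names, the 21 nt step, and the spread values are all illustrative assumptions.

```python
def reachable_cut_sites(length, step=21, spread=2):
    """Mark every position that can be a cut site when cutting starts at
    position 0 and each cut lands (step +/- spread) nt after the previous one."""
    reachable = [False] * (length + 1)
    reachable[0] = True  # the duplex end that the enzyme measures from
    for pos in range(length + 1):
        if not reachable[pos]:
            continue
        for advance in range(step - spread, step + spread + 1):
            if pos + advance <= length:
                reachable[pos + advance] = True
    return reachable


def first_fully_covered_position(length=500, step=21, spread=2):
    """Smallest distance from the end beyond which every position
    (up to `length`) is a possible cut site in this toy model."""
    reachable = reachable_cut_sites(length, step, spread)
    last_gap = max(i for i in range(1, length + 1) if not reachable[i])
    return last_gap + 1


if __name__ == "__main__":
    for spread in (1, 2, 3):
        print(f"spread +/-{spread} nt: all sites reachable from position "
              f"{first_fully_covered_position(spread=spread)}")
```

Under these assumptions a spread of 2 to 3 nt per cut leaves skipped positions only within roughly 54 to 95 nt of the starting end, which is of the same order as the flank length discussed above. A narrower spread pushes that boundary further into the target.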
The fact that Dicer cleaves longer dsRNAs more efficiently than shorter ones (Bernstein et al. (2001) Nature 409: 363-366; Elbashir et al. 2001, supra; Ketting et al. (2001) Genes & Dev. 15: 2654-2659) suggests that this enzyme may have “endonuclease” activity, independent of ends and therefore not in any fixed register, that is not evident with short fragments where end effects may dominate. Alternatively, fragmentation of a DNA target by DNase I avoids end effects since that enzyme is a true endonuclease. Some sequence preferences can be seen with light digestion (Herrera and Chaires (1994) J. Mol. Biol. 236:405-411), so adjusting the level of digestion to provide fragments mostly shorter than 30 bp would further reduce the likelihood of missing any sequences in the final library.
The digestion product of the RNAse digestion comprises small dsRNA fragments, which may be of a defined size. The fragments are strand-separated, and may be purified by length, e.g. gel electrophoresis, capillary electrophoresis, HPLC, etc. The fragments are dephosphorylated, e.g. by alkaline phosphatase.
In ligation steps, flanking oligoribonucleotides of defined sequences are attached to the 3′- and 5′-ends of each fragment by T4 RNA ligase. Similar ligation-amplification methods have been previously used for cloning of small RNA fragments extracted from cells (Elbashir et al. 2001; Lau et al. 2001; Pfeffer et al. 2003). The flanking oligonucleotides provide primer-binding sites for the PCR amplification that takes place in the last stage of the protocol. These oligonucleotides also may provide restriction sites.
The reaction may be optimized to prevent circularization via intramolecular ligation of the oligonucleotides during the ligation reaction by the following steps. In a first ligation reaction, a first flanking oligoribonucleotide is used, in which the oligoribonucleotide comprises a 5′-phosphate and a 3′ “terminator nucleotide”. A terminator nucleotide refers to a nucleotide containing a chemical modification at the 3′ end that prevents normal polymerization or ligation of the nucleotide into a polymer. Such terminator nucleotides may retain the ability to form base pairs, and may be recognized by enzymes that act on polynucleotides.
Such terminator modifications are known in the art, and include, without limitation: 2′,3′ dideoxythymidine; 2′,3′ dideoxycytidine; 2′,3′ dideoxyuridine; 2′,3′ dideoxyguanosine; 2′,3′ dideoxyadenosine. Any of the bases may be modified by addition of an alkyl spacer at the 3′ end, which inactivates the 3′ OH towards enzymatic processing. One of skill in the art will recognize that such spacers may be variable in the length of the carbon chain, e.g. 1, 2, 3, 4, 5 carbons, etc. Inverted bases, such as inverted dT, when incorporated at the 3′-end of an oligo lead to a 3′-3′ linkage which inhibits degradation by 3′ exonucleases and extension by DNA polymerases and ligases. 3′-O-methyl-dNTPs are described by Metzker et al. (1994) Nucleic Acids Res. 22(20):4259-4267. A large number of other modified or capped nucleotides have been described in the art, and may be used in the methods of the invention.
Following ligation to the first flanking oligoribonucleotide, the ligation product may be purified by any convenient method, e.g. gel electrophoresis, dialysis, capillary electrophoresis, HPLC, etc. The purified ligation product is then phosphorylated and ligated to a second flanking oligoribonucleotide lacking a terminal phosphate. In this second ligation reaction, the circularization of the product is prevented due to the absence of 5′-phosphate.
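By way of illustration only, the end-group logic of the two ligation steps can be traced with a small state model. The sketch below is a simplification and not part of the disclosed protocol. It assumes that joining requires a free 3′-OH on the acceptor and a 5′-phosphate on the donor, and that a strand can circularize only if it carries both a 5′-phosphate and an unblocked 3′ end. The class and function names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Strand:
    name: str
    five_prime_phosphate: bool
    three_prime_blocked: bool  # True when a terminator (e.g. idT) caps the 3' end

    def can_circularize(self) -> bool:
        # Intramolecular ligation needs a 5'-phosphate and a free 3'-OH.
        return self.five_prime_phosphate and not self.three_prime_blocked


def ligate(acceptor: Strand, donor: Strand) -> Strand:
    """Join the acceptor 3'-OH to the donor 5'-phosphate.
    The product keeps the acceptor's 5' end and the donor's 3' end."""
    if acceptor.three_prime_blocked:
        raise ValueError(f"{acceptor.name} has no free 3'-OH")
    if not donor.five_prime_phosphate:
        raise ValueError(f"{donor.name} has no 5'-phosphate")
    return Strand(
        name=f"{acceptor.name}+{donor.name}",
        five_prime_phosphate=acceptor.five_prime_phosphate,
        three_prime_blocked=donor.three_prime_blocked,
    )


def two_step_ligation():
    """Return (label, strand) pairs for each stage of the workflow."""
    steps = []
    fragment = Strand("fragment", five_prime_phosphate=True, three_prime_blocked=False)
    steps.append(("nuclease fragment", fragment))
    fragment = replace(fragment, five_prime_phosphate=False)  # phosphatase
    steps.append(("dephosphorylated fragment", fragment))
    flank1 = Strand("flank1", five_prime_phosphate=True, three_prime_blocked=True)
    product1 = ligate(fragment, flank1)  # first ligation
    steps.append(("first ligation product", product1))
    product1 = replace(product1, five_prime_phosphate=True)  # kinase
    steps.append(("phosphorylated first product", product1))
    flank2 = Strand("flank2", five_prime_phosphate=False, three_prime_blocked=False)
    product2 = ligate(flank2, product1)  # second ligation
    steps.append(("second ligation product", product2))
    return steps


if __name__ == "__main__":
    for label, strand in two_step_ligation():
        print(f"{label:30s} can circularize: {strand.can_circularize()}")
```

In this model only the untreated nuclease fragment can circularize. Every later intermediate lacks either the 5′-phosphate or the free 3′-OH, which is the rationale for ordering the steps as described.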
The ligation product of the second reaction is reverse transcribed and PCR amplified (RT-PCR) using methods known in the art, using the first and second flanking oligonucleotides as primer-binding sites. The resulting PCR-amplified DNA fragments may be used for various purposes, e.g. inserting into vectors for library generation, expression, sequencing, etc.
The directed libraries produced by this method contain both sense and antisense gene-specific sequences. If it is desirable to obtain sequences that correspond only to the antisense strand, this double-stranded RNA library can be denatured, the sense sequences annealed with an excess of the gene-specific antisense cDNA, and the unhybridized single-stranded antisense RNA fragments separated by gel electrophoresis or affinity chromatography and purified.
Alternative Method #1 for Directed Library Preparation Based on Ligation of Hemi-Random Probes on a ssDNA Target
An alternative method to prepare a gene-specific (directed) library, based on the hybridization of hemi-random probes to a ssDNA target with subsequent enzymatic ligation of the probes that happen to hybridize to adjacent target sequences (see
Alternative Method #2 for Directed Library Preparation Based on DNase Fragmentation of a dsDNA Target
In this method, the directed libraries can be directly derived from gene-specific double-stranded DNA as shown in
The fragmentation of DNA targets by DNase I and isolation of fragments of about 20 bp for preparation of shRNA libraries has been recently described by others (Sen et al. (2004) Nat. Genet. 36: 183-189; Shirane et al. (2004) Nat. Genet. 36: 190-196) or suggested (Taira & Miyagishi (2004) U.S. patent application US2004/0002077 A1). In the present invention, we use a wider range of DNase I fragment sizes for the expression of siRNA. We also suggest an additional purification and amplification of the PCR-amplified product obtained from the original DNase digest. This additional step provides a higher yield and allows easy purification of DNA fragments of the desired length.
The Dicer and DNase I methods of target fragmentation can be considered complementary, with each having certain advantages and disadvantages. The Dicer/RNase III-generated fragments are of course the same length as in vivo products of Dicer processing and can be directly incorporated into the RISC complex. The DNase-generated gene fragments may be more useful for the preparation of shRNA libraries, since the stem length of potent shRNAs can vary from 21 to 29 bp, depending on the sequence (Paddison et al. (2004) Nature 428: 427-431). Formation of long RNA duplexes from the transcribed antisense and sense strands may sometimes be a challenge for the Dicer/RNase III approach when dealing with highly structured RNAs such as viral internal ribosome entry sites (IRES) elements. On the other hand, the DNase I approach requires at least two gel fractionation steps, and may use three or more (the third after ligation of adapters and PCR).
To provide additional sequence and size diversity, libraries made by each method may be mixed prior to insertion in an expression vector.
Directed sequence libraries and methods of the present invention may be used as starting materials for a multitude of applications, including development of diagnostic reagents, therapeutic reagents (e.g., polynucleotide therapeutics), genomics tools, affinity reagents, and the like.
In one aspect, libraries of the invention are used (as alternative to fully random libraries) for development and optimization of sequences for antisense- and ribozyme-based polynucleotide genomics tools (e.g., gene knockdown, gene-target discovery and validation, etc.) and therapeutics by methods known in the art reviewed in references cited in the introduction. For example, a directed sequence library may be prepared from a gene sequence that provides a particular cellular function. Antisense sequences that block that function may be determined by screening the library for sequences that inhibit gene function. The screening can be performed in cells as described, for example, in paragraph , Examples 13 and 14, and
“Rationally-designed” nucleic acid therapeutics utilize various in silico algorithms known in the art to select a target site, and often are directed to a single site on the target RNA. Such therapeutics include antisense, ribozymes, deoxyribozymes, siRNA, shRNA and miRNA. In cases where the target mutates rapidly (e.g. HIV or influenza virus) the rationally-selected target sequences mutate over time, and the therapeutic becomes ineffective. The same is true for nucleic acid therapeutics directed at cancer targets, where mutations in a target sequence can lead to resistance to the nucleic acid therapeutic.
Nucleic acid therapeutics selected de novo from a pool of directed sequence libraries have advantages over those selected by in silico selection methods. Therapeutics selected from a directed sequence library of the invention complement multiple sites on a target simultaneously, allowing effective down-regulation of a rapidly mutating virus or cancer cell. Knowledge of the genetic sequence or molecular and structural biology of the virus or cancer cell are unnecessary, in contrast to rational drug design methods.
In another aspect, libraries of the invention are used for selection and optimization of sequences useful for RNA interference, such as siRNA (small interfering RNA) molecules capable of inhibiting known or unknown genes. “siRNA” refers to a double-stranded RNA molecule that inhibits expression of a complementary known or unknown gene(s) (see, e.g., Tuschl (2002) Nature Biotechnology 20:446-48).
In another embodiment, libraries of the invention are immobilized on a solid support to generate an array, which may be used to detect or quantify complementary polynucleotide sequences. The complete library may be used, or selection may be performed to optimize the array probes. Such arrays are useful in microarray-based diagnostics and gene expression analysis, including detection of the presence of bacterial and viral infectious agents, genetic traits and diseases, SNPs, etc. (see, e.g., Rampal, ed. (2001) DNA Arrays, Methods and Protocols (Humana Press)).
As used herein, “microarray” refers to a surface with an array of putative binding (e.g., by hybridization) sites for a biochemical sample. Typically, a microarray refers to an assembly of distinct polynucleotides immobilized at defined positions on a substrate. Microarrays are formed on substrates fabricated with materials such as paper, glass, plastic (e.g., polypropylene, nylon), polyacrylamide, nitrocellulose, silicon, optical fiber, or any other suitable solid or semi-solid support, and configured in a planar (e.g., glass plates, silicon chips) or three-dimensional (e.g., pins, fibers, beads, particles, microtiter wells, capillaries) configuration. Polynucleotides may be attached to the substrate by a number of means, including (i) in situ synthesis (e.g., high-density polynucleotide arrays) using photolithographic techniques (see Fodor et al., Science (1991) 251:767-73; Pease et al., Proc. Natl. Acad. Sci. USA (1994) 91:5022-5026; Lockhart et al., Nature Biotechnology (1996) 14:1654; U.S. Pat. Nos. 5,578,832; 5,556,752; and 5,510,270); (ii) spotting/printing at medium to low density on glass, nylon, or nitrocellulose (see Schena et al., Science (1995) 270:467-70; DeRisi et al., Nature Genetics (1996) 14:457-60; Shalon et al., Genome Res. (1996) 6:639-45; and Schena et al., Proc. Natl. Acad. Sci. USA (1992) 20:1679-84); and (iv) by dot-blotting on a nylon or nitrocellulose hybridization membrane (see, e.g., Sambrook et al., Eds. (2001) Molecular Cloning: A Laboratory Manual, 3rd ed., Vol. 1-3, Cold Spring Harbor Laboratory (Cold Spring Harbor, N.Y.)). Polynucleotides may also be noncovalently immobilized on the substrate by hybridization to anchors, by means of beads, or in a fluid phase such as in microtiter wells or capillaries. Arrays may include polynucleotide sequences prepared by the methods of the invention.
For example, target-dependent ligation products may be prepared by the methods of the invention to include overlapping sequences of a viral genome, and such sequences immobilized on a solid support to generate an array. Such an array may be used to distinguish between viral strains by hybridization to specific subsets of sequences on the array.
In another aspect, libraries of the invention are used for development of diagnostic or forensic reagents for detection of the presence of bacterial and viral infectious agents, genetic traits and diseases, SNPs, etc. For example, libraries of the invention are used to select and optimize adjacent pairs of oligonucleotide probe sequences that are useful in ligase-mediated detection methods. In another example, libraries of the invention may be used to select and optimize polynucleotide sequences useful for hybridization-mediated DNA detection (i.e., affinity complementation). In a further example, libraries of the invention may be used to select and optimize polynucleotide primer sequences for PCR-based detection methods.
In another aspect, libraries of the invention may be used for development of affinity reagents. For example, a directed sequence library or a portion thereof, prepared by methods of the invention, may be coupled to a solid support and used for enrichment or purification of a polynucleotide sequence or nucleoprotein complex of interest from a mixture. Means for attachment of polynucleotides to a solid support are well known in the art. For example, amino-modified polynucleotides can be attached to an aldehyde-functionalized surface via reaction with free aldehyde groups using Schiff's base chemistry. In another example, amino-terminal polynucleotides can be coupled to isothiocyanate-activated glass, to aldehyde-activated glass, or to a glass surface modified with epoxide.
In other aspects, libraries of the invention may be used for preparative extraction of specific genes (including mRNA, genomic DNA, or fragments thereof), and as probes for specific sequences in Northern blots, in situ hybridization, and genomics mapping and annotation procedures.
In another aspect, libraries of the invention may be prepared from more than one target simultaneously (i.e., in a single reaction vessel). After cloning of directed sequence inserts obtained from multiple targets into vectors, the individual inserts may be sequenced and aligned to the appropriate target by, e.g., computer-assisted sequence alignment, to select desirable probe sequences for each target used in the mixture. These methods may be used to significantly enhance and accelerate genomics-related studies. Further, they can be used to generate cocktails of inhibitors of the expression of one or more genes, according to the targets used to generate the directed libraries. These cocktails can be generated by expressing the libraries in cells of interest, selecting for a desired phenotype, and recovering the sequences of the library that conferred the phenotype by PCR and sequencing (see Li et al. (2000) supra; Kawasaki & Taira (2002), supra).
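By way of illustration only, the assignment of sequenced inserts to their targets can be sketched as an exact-match search on both strands. This is a minimal stand-in for computer-assisted alignment, not the method used by the inventors; a practical workflow would more likely use an established alignment tool that tolerates sequencing errors. The function names and the toy target sequences are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def assign_inserts(inserts, targets):
    """Map each insert to a list of (target name, strand, 0-based start).

    An insert is called 'sense' if it occurs in the target as written and
    'antisense' if its reverse complement occurs in the target.
    """
    assignments = {}
    for insert in inserts:
        query = insert.upper()
        hits = []
        for name, target in targets.items():
            target = target.upper()
            probes = (("sense", query), ("antisense", reverse_complement(query)))
            for strand, probe in probes:
                start = target.find(probe)
                while start != -1:
                    hits.append((name, strand, start))
                    start = target.find(probe, start + 1)
        assignments[insert] = hits
    return assignments


if __name__ == "__main__":
    # Toy sequences for demonstration; they are not real gene sequences.
    targets = {
        "targetA": "ATGGCTAGCTTGACCGATCGTACGGATCCTAGGCTA",
        "targetB": "TTGACGGCATTCGAGGCTAACCGTTAGGCATCGA",
    }
    inserts = ["GACCGATCGTACGG", "TAACGGTTAGCCTC", "AAAAAAAAAAAAAA"]
    for insert, hits in assign_inserts(inserts, targets).items():
        print(insert, hits or "no match")
```

An insert with no hit on any target, like the third toy insert, would be flagged for inspection rather than assigned.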
The scheme shown in
Another use for the library of
Yet another potential application is selection of successful miRNA candidates from the obtained pool of mismatched sequences.
The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention, and are not intended to limit the scope of what the inventors regard as their invention nor are they intended to represent that the experiments below are all or the only experiments performed. Efforts have been made to ensure accuracy with respect to numbers used (e.g. amounts, temperature, etc.) but some experimental errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, molecular weight is weight average molecular weight, temperature is in degrees Centigrade, and pressure is at or near atmospheric.
Transcription of the target. Sense and antisense strands of the RNA target were transcribed from a PCR-amplified DNA template either in a one-tube reaction using opposing T7 promoters or in separate-tube reactions, one using an SP6 promoter and the other a T7 promoter (with Ambion's MEGAshortscript or MEGAscript kits).
Annealing and Dicer digest. RNA strands were annealed to form perfect duplex and digested by recombinant Dicer enzyme:
Dicer 6 μl (0.5 U/μl, Stratagene #240100-51)
5× buffer 6 μl
dsRNA+water 18 μl (˜3 μg)
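By way of illustration only, the composition of the digestion mix listed above can be checked with a few lines of arithmetic. The sketch assumes that the three listed components make up the whole reaction and that the ~3 μg figure refers to the total dsRNA input. The function name and dictionary keys are hypothetical.

```python
def summarize_dicer_digest(volumes_ul, dicer_units_per_ul=0.5,
                           buffer_stock_fold=5.0, dsrna_ug=3.0):
    """Return total volume, final buffer strength, and enzyme-to-substrate
    ratio for a reaction described by component volumes in microliters."""
    total_ul = sum(volumes_ul.values())
    dicer_units = volumes_ul["Dicer"] * dicer_units_per_ul
    return {
        "total_ul": total_ul,
        "buffer_final_fold": buffer_stock_fold * volumes_ul["buffer"] / total_ul,
        "dicer_units": dicer_units,
        "units_per_ug_dsrna": dicer_units / dsrna_ug,
    }


if __name__ == "__main__":
    mix = {"Dicer": 6.0, "buffer": 6.0, "dsRNA + water": 18.0}
    for key, value in summarize_dicer_digest(mix).items():
        print(f"{key}: {value}")
```

Under these assumptions the mix totals 30 μl, the 5× buffer is diluted to 1×, and the enzyme is present at about 1 unit per microgram of dsRNA.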
Resulting 20-22 bp siRNAs were purified and strand-separated by 15% PAG-7M urea, eluted by the crush/soak method, ethanol precipitated, and then dissolved in 5 mM Tris-HCl pH 7.5.
The directed libraries produced by this method contain both sense and antisense gene-specific sequences. If it is desirable to obtain sequences that only correspond to the antisense strand, this library is mixed and annealed with an excess of antisense cDNA and the unhybridized antisense RNA fraction is separated by a gel-shift assay or affinity chromatography. However, this extra step is unnecessary for many purposes.
One potential problem of this approach is possible circularization via intramolecular ligation of the oligonucleotides during the ligation reaction. Therefore, the Dicer-produced RNA fragments are dephosphorylated, and in the first ligation reaction (see below) the flanking oligoribonucleotide 1 with a 5′-phosphate (required for ligation) has 3′-idT (inverted deoxythymidine) that prevents circularization.
fragmented RNA+water 85 μl
10× buffer 10 μl
CIAP 5 μl (Calf Intestine Alkaline Phosphatase, 1 U/μl, MBI Fermentas #EF0341)
The reaction proceeded for 1 h at 37° C., followed by phenol extraction, and the RNA was precipitated with ethanol.
Next, in two subsequent ligation steps, flanking oligoribonucleotides of defined sequences were attached to the 3′- and 5′-ends of each fragment by T4 RNA ligase:
T4 RNA ligase 1 μl (20 U/μl, NE BioLabs #M0204S)
RNase OUT 1 μl (40 U/μl, Invitrogen #10777-019)
10× buffer 4 μl
Flanking 1 oligo (5′-p; 3′-idT) 2 μl (150 pmol)
(SEQ. ID. NO. 1) (Sequence: 5′-GAGAAUAACAACAACAACAA-3′: Dharmacon, Lafayette, Colo.)
Fragmented RNA 1-10 μl (˜1 μg)
Water 31-22 μl
The reaction proceeded for 1 h at 37° C., the products were purified by 15% PAG-7M urea, and ethanol precipitated.
The gel-purified product of the 1st ligation was phosphorylated to be further ligated to another flanking oligoribonucleotide 2:
RNA+water 41 μl
10× buffer 5 μl
T4 PNK 2 μl (Polynucleotide kinase, 10 U/μl, NE BioLabs #M0201S)
RNase OUT 1 μl
ATP 0.7 μl (75 mM)
The reaction proceeded for 1 h at 37° C., followed by phenol extraction and ethanol precipitation.
The phosphorylated product was ligated to flanking oligoribonucleotide 2, which does not have a terminal phosphate. In this second ligation reaction, the circularization of the product of the first ligation was also prevented due to the presence of 5′-blocking group.
T4 RNA ligase 1 μl
RNase OUT 1 μl
10× buffer 4 μl
Flanking 2 oligo 4 μl (300 pmol)
(SEQ. ID. NO. 2) (Sequence: 5′-UGGUACAUUACCUGGUAAC-3′)
RNA+water 30 μl
The reaction proceeded for 1 h at 37° C., followed by phenol extraction and ethanol precipitation.
The products of 2nd ligation were reverse transcribed and further PCR amplified (RT-PCR) using the oligonucleotides attached to the gene-derived sequences as primer-binding sites.
5× buffer 10 μl
dNTPs 10 μl
RNA+water 26.5 μl
RT primer 0.5 μl (50 pmol)
AMV-RT 2 μl (10 U/μl, Promega #M510F)
RNase OUT 1 μl
The primers were annealed to RNA (65° C. for 5 min, then ice), then the other components were added and the reaction was incubated for 1 h at 42° C.
10× buffer 10 μl
RT-DNA 10 μl (out of 50)
MgCl2 6 μl (25 mM)
dNTPs 8 μl (10 μl each/100 mM/+360 μl water)
RT primer 0.5-1 μl (50-100 pmol)
F primer 0.5-1 μl (50-100 pmol)
(Sequences: (SEQ. ID. NO. 3) 5′-TTGTTGTTGTTGTTATTCTC-3′ and (SEQ. ID. NO. 4) 5′-TGGTACATTACCTGGTAAC-3′: synthesized by IDT (Integrated DNA Technologies, Coralville, Iowa))
Taq 0.5 μl (Promega)
Typical cycles (94° C. 30 sec—50° C. 30 sec—72° C. 30 sec) 10-20 cycles
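By way of illustration only, the relationship between the PCR primers and the flanking oligoribonucleotides can be checked computationally. The sketch below takes the primer sequences (SEQ. ID. NOS. 3 and 4) and the second flanking oligoribonucleotide (SEQ. ID. NO. 2) as listed above. It assumes that the forward primer has the same sense as the 5′ flank and that the RT primer is complementary to the 3′ flank. The helper names and the insert lengths used for the amplicon estimate are assumptions.

```python
RT_PRIMER = "TTGTTGTTGTTGTTATTCTC"    # SEQ. ID. NO. 3 (DNA)
F_PRIMER = "TGGTACATTACCTGGTAAC"      # SEQ. ID. NO. 4 (DNA)
FLANK_2_RNA = "UGGUACAUUACCUGGUAAC"   # SEQ. ID. NO. 2 (RNA)


def reverse_complement(dna):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(dna.upper()))


def rna_to_dna(rna):
    """Write an RNA sequence in the DNA alphabet (U -> T)."""
    return rna.upper().replace("U", "T")


def amplicon_length(insert_len):
    """Expected product size: 5' flank + insert + 3' flank, in bp."""
    return len(F_PRIMER) + insert_len + len(RT_PRIMER)


if __name__ == "__main__":
    # The forward primer should read the same as flank 2 in the DNA alphabet.
    print("F primer matches flank 2:", F_PRIMER == rna_to_dna(FLANK_2_RNA))
    # The RT primer anneals to the strand whose DNA-alphabet sequence is:
    print("RT primer binding site (5'->3'):", reverse_complement(RT_PRIMER))
    for insert_len in (20, 21, 22):
        print(insert_len, "nt insert ->", amplicon_length(insert_len), "bp")
```

For inserts of 20 to 22 nt the estimate is 59 to 61 bp, in line with the products of about 60 bp that are carried forward to cloning.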
Gel analysis. After PCR, 10 μl of the reaction mixture was mixed with 3 μl of 6× loading buffer (0.25% bromphenol blue, 0.25% xylene cyanol, 30% glycerol in water) and loaded onto a 10% native polyacrylamide gel in 1×TBE. The gel was run at room temperature at 25V/cm field. After electrophoresis, the gel was stained with ethidium bromide and visualized under UV light.
Cloning and Sequencing
The ˜60 bp products were PCR amplified on a large scale, gel purified, and cloned into the pT7Blue-3 vector (Novagen). E. coli were transformed with the recombinant vector and colonies were used for mini-preps. DNA was isolated using the QIAprep Spin Miniprep Kit (Qiagen), and sent to Retrogen, Inc. for unidirectional sequencing with T7 promoter primer.
Sequencing Results for Directed Library Against TNF Target
The sequencing results are shown in
The DNA target was a single-stranded murine TNFα cDNA. The target was prepared by amplification from a pGEM-4/TNF plasmid which included sequences for the murine TNFα gene with the full-length 5′-UTR and part of the 3′-UTR, totaling 1 kb. Amplification was by asymmetric PCR, using only a single primer, allowing production of single-stranded DNA. The single-stranded DNA was purified away from primers using a GeneClean III kit, ethanol precipitated, and used in experiments as a target for preparation of a directed library.
Hemi-Random Probes, Masking Oligonucleotides, and PCR Primers
Hemi-random probes, masking oligonucleotides, and PCR primers were synthesized by IDT (Integrated DNA Technologies, Coralville, Iowa).
Hemi-random probes contained 10-mer random regions and 26-mer defined sequences that contained a primer binding site and a restriction site, as follows:
Masking oligonucleotides contained sequences complementary to and masking the 26-nt long defined sequences of the probes. Masking oligonucleotides were used to prevent hybridization of the defined sequences of the probes to target sequences and to prevent parasitic ligation of probe sequences to each other. The sequences of the masking oligonucleotides were as follows:
Primers used for PCR amplification of ligation products were as follows:
The hemi-random probes were pre-hybridized with their corresponding masking oligonucleotides in T4 DNA ligase reaction buffer for 5 min at room temperature. The target was added and the mixture was then incubated for 30 min at varying temperatures (25-42° C.) to allow the probes to hybridize to the target. T4 DNA ligase was then added and the mixture was incubated at room temperature for 1 hour. The ligation reaction mixture contained the following:
Hemi-Random Probes A and B 0.1-1 μM (2-20 pmol, 2-4 μl)
Masking Oligonucleotides for Hemi-Random Probes A and B 0.1-1 μM (2-20 pmol, 2-4 μl)
DNA target 0.01-1 μM (0.2-20 pmol, 2 μl)
T4 DNA ligase buffer (30 mM Tris-HCl, pH 7.8, 5-10 mM MgCl2, 10 mM DTT, 1 mM ATP)
(2 μl of 10×), 50-200 mM NaCl
T4 DNA ligase 0.1 U/μl (2 units, 1 μl)
H2O up to 20 μl
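By way of illustration only, the amounts and concentrations listed for the ligation mixture can be cross-checked, since picomoles per microliter are numerically equal to micromolar. The sketch assumes the 20 μl final volume stated in the list. The function name is hypothetical.

```python
def micromolar(pmol, volume_ul):
    """Concentration in micromolar; 1 pmol/ul equals 1 uM."""
    return pmol / volume_ul


if __name__ == "__main__":
    final_volume_ul = 20.0
    for label, pmol in (("probe, low", 2.0), ("probe, high", 20.0),
                        ("target, low", 0.2), ("target, high", 20.0)):
        print(f"{label}: {micromolar(pmol, final_volume_ul):.2f} uM")
    # Enzyme: 2 units in 20 ul
    print(f"ligase: {2.0 / final_volume_ul:.1f} U/ul")
```

The listed ranges are internally consistent: 2-20 pmol in 20 μl corresponds to 0.1-1 μM, 0.2-20 pmol to 0.01-1 μM, and 2 units of ligase to 0.1 U/μl.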
The effect of random oligodeoxyribonucleotides and oligoribonucleotides (4-5-6-7 nt long) and spermidine was also studied.
Amplification by PCR. After the ligation reaction was complete, 1 μl of the 20 μl ligation mixture was used for PCR amplification of the 72 bp ligation product. Typical cycles were: 94° C. 30 sec—54° C. 30 sec—72° C. 15 sec (20 cycles).
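By way of illustration only, the 72 bp product size follows from the probe design described above: two probes, each with a 10-nt random region and a 26-nt defined region, joined end to end. The sketch below also counts the distinct random 10-mers and, assuming equal representation of every sequence (an idealization), estimates how many molecules of each are present in 2 pmol of probe. The variable and function names are hypothetical.

```python
RANDOM_LEN = 10    # random region of each hemi-random probe, nt
DEFINED_LEN = 26   # defined region (primer site plus restriction site), nt
AVOGADRO = 6.022e23


def ligation_product_length():
    """Two probes ligated at their random ends."""
    return 2 * (RANDOM_LEN + DEFINED_LEN)


def distinct_random_regions():
    """Number of different sequences a fully random region can take."""
    return 4 ** RANDOM_LEN


def molecules_per_sequence(pmol):
    """Average copies of each random sequence, assuming equal representation."""
    return pmol * 1e-12 * AVOGADRO / distinct_random_regions()


if __name__ == "__main__":
    print("ligation product:", ligation_product_length(), "bp")
    print("distinct 10-mers:", distinct_random_regions())
    print(f"copies of each 10-mer in 2 pmol: {molecules_per_sequence(2.0):.2e}")
```

Even at the low end of the probe range, each 10-mer is present in roughly a million copies under this idealization, so any given target site has matching probes available.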
After PCR, 10 μl of the reaction mixture was mixed with 3 μl of 6× loading buffer (0.25% bromphenol blue, 0.25% xylene cyanol, 30% glycerol in water) and loaded onto a 10% native polyacrylamide gel in 1×TBE. The gel was run at room temperature at 25V/cm field. After electrophoresis, the gel was stained with ethidium bromide and visualized under UV light.
Cloning and Sequencing
The 72 bp ligation products were PCR amplified on a large scale, gel purified, and cloned into the pT7Blue-3 vector (Novagen). E. coli were transformed with the recombinant vector and colonies were used for mini-preps. DNA was isolated using the Wizard Plus Minipreps Purification System (Promega) or QIAprep Spin Miniprep Kit (Qiagen), and sent to Marshall University DNA Core Facility for dye-primer sequencing.
Sequencing Results for Directed Library Against TNF Target
The results of the target-dependent ligation experiments described above are shown in
Preparation of gene-specific libraries by DNase I fragmentation of a dsDNA target (
PCR-amplified cDNA encoding DsRed was subjected to partial digestion with DNase I in a buffer containing 1 mM MnCl2, 50 mM Tris-HCl (pH 7.5), 0.5 μg/μl BSA, and 0.1-0.3 U/μg DNase I (Ambion) at 20° C. for 1-10 min to generate small, blunt-ended DNA fragments (
The resulting DNA fragments (which contain 5′-phosphates) can be directly “blunt-end” cloned into the siRNA vector. However, attachment of adapters (fixed flanking double-stranded DNA sequences) is beneficial since it allows PCR amplification and higher ligation efficiency due to the presence of restriction sites in the adapters. The dsDNA adapters were essentially complementary to the 3′-termini of modified U6 and H1 promoters
Ligation reactions were performed with T4 DNA ligase, using one adapter at a time, each in ˜200-fold excess over the DNA fragments. The ligation products were PCR-amplified using primers complementary to the adapter sequences (94° C., 30 sec/52° C., 30 sec/72° C., 60 sec, for 20-30 cycles). The resulting ˜70 bp PCR products were purified by native 10% polyacrylamide gel, digested with Hind III and Bgl II, and after a second gel-purification, were cloned into the siRNA expression vector (see below). Plasmid DNAs isolated (QIAprep Spin Miniprep, Qiagen) from randomly selected bacterial clones were sequenced and used for transfection studies (
Sequencing Results for the Directed Library Against DsRed
Sequencing of several clones obtained from this approach showed that all the isolated clones contained inserts that had perfect homology to the DsRed gene. DsRed insert sequences varied from ˜17 to 34 bp (
In vitro Selection Protocol
A TNF-directed Lasso library generated as described in Example 1 was transcribed in vitro with T7 RNA polymerase (Ambion) to generate the initial pool of Lassos for in vitro selection (
Results of the in vitro Selection
After the third round of selection, the gel-purified RT-PCR fragment was cloned using a TA-cloning kit (Invitrogen). The resulting colonies were screened for inserts by blue/white color selection. 23 individual clones were isolated and sequenced to identify the selected antisense sequences (
Analysis of Individual Selected Lassos
To identify which of the selected Lassos are superior binders, one representative clone of each unique selected sequence was transcribed in vitro and tested in binding affinity and kinetics assays. Lassos were internally 32P-labeled during in vitro transcription and incubated with an excess of non-radioactive target TNF-1000 RNA at 37° C. in SB. Products of these reactions were analyzed by denaturing 5% PAGE (
Lassos were synthesized and internally radiolabeled by T7 polymerase transcription in the presence of [α32P]rCTP. Time course binding assays were performed to monitor the efficiency of Lasso binding to target RNA (
In conclusion, by starting with a pool of Lassos that contain a gene-specific library against mTNFα, we were able to select the most efficiently hybridizing and circularizing Lassos. We confirmed that the Lassos selected were capable of fast binding to target RNA by testing the selected sequences individually in binding assays.
Selection for optimal DsRed target sequences was performed essentially as described for TNFα. After three rounds of selection, the resulting Lassos were cloned and sequenced to determine which antisense sequences were selected.
Results are shown in
The directed or randomized oligonucleotide libraries within desirable length range, obtained as shown in
This shRNA PCR transcription cassette can be used either directly for transfections of mammalian cells or after cloning into appropriate expression vectors. A direct transfection system can be used for rapid screening of siRNA libraries and allows easy identification of optimal siRNA-target sequence combinations and multiplexing of siRNA library expression in mammalian cells. This strategy also avoids a bacterial amplification stage, which can introduce major mutations or deletions at inverted repeats. Note that 5′-phosphorylation of the primers results in enhanced expression of PCR cassettes, probably by stabilizing them in cells. Alternatively, this cassette can be capped with hairpin-forming oligodeoxynucleotides. This approach was shown to stabilize the cassette by protecting the termini of the DNA duplex from exonucleolytic degradation, resulting in improved expression in cells (Horie & Simada, 1994, Biochem. Mol. Biol. Int.).
Alternatively, dsDNA templates for the directed siRNA library can be generated by using DNase I, Dicer or ligation methods. The DNA duplex is then digested with restriction enzymes Hind III and Bgl II, generating overhangs immediately next to the randomized sequence. A hairpin-shaped oligonucleotide containing H1 or any other pol III promoter sequence and having a Bgl II restriction site at the end of the stem is ligated to the 3′-end of the duplex DNA, converting the duplex into a hairpin. A second set of synthetic dsDNA (PR1 and PR2) with a Hind III restriction site at its 3′-end is ligated to the above siRNA-H1 hairpin product. The resulting DNA hairpins with a 3′-end single-stranded overhang having homology to the U6 promoter are gel-purified under denaturing conditions, and then used as reverse primers in the PCR reaction on a hU6 promoter plasmid as template as described above and as shown in
Double-stranded RNA corresponding to the target of interest is prepared and cleaved with recombinant Dicer enzyme as described above. The diced dsRNA fragments (approximately 21 bp with 2 nt 3′ overhangs) are treated with calf intestinal phosphatase and the 5′ dephosphorylated dsRNA is purified by phenol/chloroform extraction and ethanol precipitation (
Two dsDNA directed libraries, generated by one of the methods shown in
The directed library (obtained by any method described above), is digested with Hind III and Bgl II and ligated to two linkers, one in the form of a hairpin (CAP) and the other a partial duplex DNA containing a 3′-tail that is complementary to the 3′-end of the h-U6 promoter (
The goal: to convert the fused product between pol III (U6 or H1) promoter and restriction fragment, encoding a directed siRNA library, into a dumbbell-shaped DNA followed by its RCA amplification. To generate multimeric pol III promoter-shRNA cassettes by RCA reaction using Ø29 (Blau, 04) or with Bst I DNA pol. (Shirane et al., 04) pol (
Improved method for expression of directed libraries of shRNAs: In this method (
The experimental design of the constructs and experimental scheme is shown in
The experimental design of the constructs is shown in
Here we describe a rapid, automatic, in vivo method for identifying the best target genes in a virus and the most accessible target sequences within those genes. The scheme for this approach is summarized in
A unique feature of this approach is that the selection takes place within the cell, and directed libraries containing only target-specific molecules are employed. The complexity of the viral or cDNA directed library is relatively small, on the order of 10⁴ for most viral RNA targets and 10-20×10⁶ for cDNA. This allows establishment of the antisense library in host cells with little or no loss of complexity.
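The order of magnitude quoted above can be checked with a simple window count: a target of length N contains at most N - k + 1 fragments of length k per strand. The following is a minimal sketch for illustration only and is not part of the disclosed method; the function names, the 21-nt fragment length and the 11,000-nt target length are assumptions chosen for the example.

```python
def fragment_windows(target_length, fragment_length, both_strands=True):
    """Upper bound on the number of fragments of one fixed length in a target.

    Assumption: every start position yields one fragment, and sense and
    antisense strands are counted separately when both_strands is True.
    """
    if fragment_length <= 0 or target_length < fragment_length:
        return 0
    windows = target_length - fragment_length + 1
    return 2 * windows if both_strands else windows


def enumerate_fragments(target, fragment_length):
    """Return the set of distinct sense-strand fragments of the given length."""
    last_start = len(target) - fragment_length
    return {target[i:i + fragment_length] for i in range(last_start + 1)}


if __name__ == "__main__":
    # Hypothetical 11,000-nt viral RNA and 21-nt fragments (illustrative values).
    print(fragment_windows(11_000, 21))
    # Short made-up sequence; repeated fragments collapse into one set entry.
    print(sorted(enumerate_fragments("AUGGCUAUGG", 4)))
```

Under these assumptions the bound for a viral RNA of this size is a few tens of thousands of fragments, consistent with a complexity on the order of 10⁴. The count is an upper bound because repeated sequences within the target reduce the number of distinct fragments.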
The initial experiments are carried out with a non-replicative form of SFV (SQL), which cannot propagate unless it has been treated with protease.
Once putative inhibitors are identified, they are tested individually for efficacy, specificity and potency with chymotrypsin-treated SQL SFV and finally with the fully virulent, replication-proficient A7 strain. An eventual goal is to develop a panel of cell-based libraries that will allow infection with a wide variety of viral pathogens to screen for inhibitors.
To deliver the RNA inhibitors, lentiviral vectors are used. These vectors deliver transgenes very efficiently to many primary cell types. The use of strong pol III promoters (U6, tRNA or H1) in these vectors assures high levels of intracellular expression of RNA inhibitors. If even higher expression levels are needed, a recently reported enhanced U6 promoter can be used.
In this example (
Targeting Host Cellular Factors:
The ability of siRNAs to inhibit viral replication has been shown for several pathogenic viruses; however, considering the high sequence specificity of siRNAs and the high mutation rates of RNA viruses including SFV, HCV, HIV and poliovirus, the antiviral efficacy of siRNAs directed to the viral genome may be limited by the potential emergence of escape mutants. In contrast, cellular factors involved in the viral life cycle have been successfully targeted, providing a more sustained siRNA effect, since these factors do not normally mutate and are present at much lower copy number than the viral RNA targets. For example, targeting of HIV's main receptor, CD4, its coreceptor, CCR5, or both CCR5 and CXCR4, can suppress the entry and replication of HIV-1. Since viral entry and replication require various host factors, an siRNA library generated using a host cDNA library alongside an HIV-directed siRNA library can be used to identify several host and viral targets essential for viral infection.
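The escape-mutant concern can be illustrated by comparing an siRNA target site with a mutated copy of the same site. The sketch below is illustrative only: the function names, the sequences and the zero-mismatch threshold are hypothetical and are not taken from the disclosure, and actual mismatch tolerance depends on the position and nature of the mutation.

```python
def count_mismatches(site, variant):
    """Count positions at which two equal-length sequences differ."""
    if len(site) != len(variant):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(site, variant))


def still_targeted(site, variant, max_mismatches=0):
    """Assumption: a site counts as targeted only within the mismatch limit."""
    return count_mismatches(site, variant) <= max_mismatches


if __name__ == "__main__":
    # Made-up 21-nt target site and a variant carrying one point mutation.
    site = "GCAUGCAUGCAUGCAUGCAUG"
    variant = "GCAUGCAUGCUUGCAUGCAUG"
    print(count_mismatches(site, variant))
    print(still_targeted(site, variant))
```

With a strict threshold, a single point mutation in the viral target is enough to remove the site from the targeted set, whereas a host-factor site that does not mutate remains matched.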
The preceding merely illustrates the principles of the invention. It will be appreciated that those skilled in the art will be able to devise various arrangements which, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional language recited herein are principally intended to aid the reader in understanding the principles of the invention and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents and equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure. The scope of the present invention, therefore, is not intended to be limited to the exemplary embodiments shown and described herein. Rather, the scope and spirit of the present invention are embodied by the appended claims.