US11118194B2 - Modified site-directed modifying polypeptides and methods of use thereof - Google Patents
- Publication number
- US11118194B2 (application US16/061,291)
- Authority
- US
- United States
- Prior art keywords
- amino acid
- cas9
- polypeptide
- acid sequence
- seq
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
- C12N15/902—Stable introduction of foreign DNA into chromosome using homologous recombination
- C12N15/907—Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/26—Preparation of nitrogen-containing carbohydrates
- C12P19/28—N-glycosides
- C12P19/30—Nucleotides
- C12P19/34—Polynucleotides, e.g. nucleic acids, oligoribonucleotides
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P21/00—Preparation of peptides or proteins
- C12P21/02—Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/01—Fusion polypeptide containing a localisation/targetting motif
- C07K2319/09—Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
Definitions
- RNA-mediated adaptive immune systems in bacteria and archaea rely on Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) genomic loci and CRISPR-associated (Cas) proteins that function together to provide protection from invading viruses and plasmids.
- CRISPR Clustered Regularly Interspaced Short Palindromic Repeat
- Cas CRISPR-associated proteins
- In Type II CRISPR-Cas systems, the Cas9 protein functions as an RNA-guided endonuclease that uses a dual-guide RNA consisting of crRNA and trans-activating crRNA (tracrRNA) for target recognition and cleavage by a mechanism involving two nuclease active sites that together generate double-stranded DNA breaks (DSBs).
- tracrRNA trans-activating crRNA
- RNA-programmed Cas9 has proven to be a versatile tool for genome engineering in multiple cell types and organisms. Guided by a dual-RNA complex or a chimeric single-guide RNA, Cas9 (or variants of Cas9 such as nickase variants) can generate site-specific DSBs or single-stranded breaks (SSBs) within target nucleic acids.
- Target nucleic acids can include double-stranded DNA (dsDNA) and single-stranded DNA (ssDNA) as well as RNA.
- NHEJ non-homologous end joining
- HDR homology directed repair
- the Cas9 system provides a facile means of modifying genomic information.
- catalytically inactive Cas9 alone or fused to transcriptional activator or repressor domains can be used to alter transcription levels at sites within target nucleic acids by binding to the target site without cleavage.
- the present disclosure provides modified site-directed modifying polypeptides, and ribonucleoproteins comprising the modified polypeptides.
- the modified site-directed modifying polypeptides are modified for passive entry into target cells.
- the modified site-directed modifying polypeptides are useful in a variety of methods for target nucleic acid modification, which methods are also provided.
- FIG. 1A-1G depict amino acid sequences of polypeptides that facilitate crossing a eukaryotic cell membrane. Top to bottom, SEQ ID NOs: 1090-1249.
- FIG. 2A-2D depict design ( FIG. 2A ) and characterization ( FIG. 2B-2D ) of a fusion site-directed modifying polypeptide according to embodiments of the present disclosure.
- FIG. 3 depicts genome editing following injection of Cas9 RNP into multiple brain regions.
- FIG. 4A-4D depict the effect of 4×NLS-Cas9 on in vivo Cas9/RNP-mediated genome editing.
- FIG. 5A-5D depict the effect of 4×NLS-Cas9 on in vivo hippocampal Cas9/RNP-mediated genome editing.
- FIG. 6A-6C depict the effect of subretinal Cas9/RNP injections on genome editing in the retina.
- FIG. 7A-7B provide the amino acid sequence of a 4×NLS-Cas-2×NLS-sfGFP polypeptide (SEQ ID NO: 1250) (FIG. 7A) and the amino acid sequence of a 4×NLS-Cas-2×NLS polypeptide (SEQ ID NO: 1251) (FIG. 7B).
- FIG. 8 provides example amino acid sequences of type V and type VI CRISPR/Cas polypeptides.
- FIG. 9 depicts antibody staining of brain sections from 4×NLS-Cas9 RNP genome-edited animals.
- FIG. 10A-10G depict Cas9 RNP-mediated editing of neural progenitor/stem cells (NPCs).
- FIG. 11A-11D provide various information on sgRNAs and NPCs. Sequences in FIG. 11C, from top to bottom: SEQ ID NOs: 1319-1331.
- FIG. 12 depicts Cas9 RNP-mediated editing of Ai9 mouse tdTomato STOP cassette.
- FIG. 13A-13D depict that direct delivery of cell penetrating Cas9 RNPs increases editing efficiency in vitro.
- FIG. 14A-14C depict that injection of Cas9 RNP into multiple brain regions in adult mice results in precise and programmable genome-editing.
- FIG. 15A-15C depict bilateral intrastriatal injection measurements of tdTomato+ cell volume and density, which indicate an RNP dose-dependent increase in edited tissue volume.
- FIG. 16A-16B depict the analysis of innate immune response in treated and untreated brains.
- FIG. 17A-17F depict that increasing dose of 4×NLS-Cas9 RNP complexes significantly increases the number of tdTomato+ genome-edited cells in the striatum.
- FIG. 18 depicts GUIDE-Seq analysis for off-target editing and 0×NLS-Cas9 compared to 4×NLS-Cas9 fidelity.
- AAGTAAAACCTCTACAAATGNGG is SEQ ID NO:1332.
- FIG. 19 provides primary sequences for N-terminal NLS-Cas9 fusions. Sequences from top to bottom: SEQ ID NOs: 1490-1493.
- By “site-directed modifying polypeptide” or “site-directed DNA modifying polypeptide” or “site-directed target nucleic acid modifying polypeptide” or “RNA-binding site-directed polypeptide” or “RNA-binding site-directed modifying polypeptide” or “site-directed polypeptide” it is meant a polypeptide that binds a guide RNA and is targeted to a specific DNA sequence by the guide RNA.
- a site-directed modifying polypeptide can be a class 2 CRISPR/Cas protein (e.g., a type II CRISPR/Cas protein, a type V CRISPR/Cas protein, a type VI CRISPR/Cas protein).
- An example of a type II CRISPR/Cas protein is a Cas9 protein (“Cas9 polypeptide”).
- Examples of type V CRISPR/Cas proteins are Cpf1, C2c1, and C2c3.
- An example of a type VI CRISPR/Cas protein is a C2c2 protein.
- A class 2 CRISPR/Cas protein (e.g., Cas9, Cpf1, C2c1, C2c2, or C2c3) as described herein is targeted to a specific DNA sequence by the RNA (a guide RNA) to which it is bound.
- the guide RNA comprises a sequence that is complementary to a target sequence within the target DNA, thus targeting the bound CRISPR/Cas protein to a specific location within the target DNA (the target sequence).
- a Cpf1 polypeptide as described herein is targeted to a specific DNA sequence by the RNA (a guide RNA) to which it is bound.
- the guide RNA comprises a sequence that is complementary to a target sequence within the target DNA, thus targeting the bound Cpf1 protein to a specific location within the target DNA (the target sequence).
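The complementarity-based targeting described above can be illustrated with a minimal sketch. The function names and the short example sequences are hypothetical and are not taken from the disclosure. The sketch only locates exact matches, and it ignores any additional sequence requirements a particular CRISPR/Cas protein may impose.

```python
# Illustrative sketch only; names and example sequences are hypothetical.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_target_sites(guide_segment_rna: str, top_strand_dna: str):
    """Locate sites where one DNA strand is complementary to the guide segment.

    top_strand_dna is written 5'->3'. Returns a list of
    (start_index_on_top_strand, paired_strand) tuples, where paired_strand is
    "+" if the top strand base-pairs with the guide and "-" if the bottom
    strand does.
    """
    as_dna = guide_segment_rna.upper().replace("U", "T")
    sites = []

    # Top strand reads like the guide, so the bottom strand pairs with it.
    start = top_strand_dna.find(as_dna)
    while start != -1:
        sites.append((start, "-"))
        start = top_strand_dna.find(as_dna, start + 1)

    # Top strand contains the reverse complement, so it pairs with the guide.
    rc = reverse_complement(as_dna)
    start = top_strand_dna.find(rc)
    while start != -1:
        sites.append((start, "+"))
        start = top_strand_dna.find(rc, start + 1)

    return sites
```

For example, `find_target_sites("AAGUCCGA", "CCAAGTCCGATT")` reports a single site starting at index 2 whose bottom strand pairs with the guide.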
- Heterologous means a nucleotide or polypeptide sequence that is not found in the native nucleic acid or protein, respectively.
- a fusion Cas9 polypeptide of the present disclosure comprises: a) a Cas9 polypeptide; and b) a heterologous polypeptide comprising an amino acid sequence from a protein other than Cas9 polypeptide.
- The terms “polynucleotide” and “nucleic acid,” used interchangeably herein, refer to a polymeric form of nucleotides of any length, either ribonucleotides or deoxynucleotides. Thus, this term includes, but is not limited to, single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or a polymer comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.
- The terms “polynucleotide” and “nucleic acid” should be understood to include, as applicable to the embodiment being described, single-stranded (such as sense or antisense) and double-stranded polynucleotides.
- The term “naturally-occurring” refers to a nucleic acid, cell, or organism that is found in nature.
- a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and which has not been intentionally modified by a human in the laboratory is naturally occurring.
- The term “isolated” is meant to describe a polynucleotide, a polypeptide, or a cell that is in an environment different from that in which the polynucleotide, the polypeptide, or the cell naturally occurs.
- An isolated genetically modified host cell may be present in a mixed population of genetically modified host cells.
- The term “exogenous nucleic acid” refers to a nucleic acid that is not normally or naturally found in and/or produced by a given bacterium, organism, or cell in nature.
- The term “endogenous nucleic acid” refers to a nucleic acid that is normally found in and/or produced by a given bacterium, organism, or cell in nature.
- An “endogenous nucleic acid” is also referred to as a “native nucleic acid” or a nucleic acid that is “native” to a given bacterium, organism, or cell.
- Recombinant means that a particular nucleic acid (DNA or RNA) is the product of various combinations of cloning, restriction, and/or ligation steps resulting in a construct having a structural coding or non-coding sequence distinguishable from endogenous nucleic acids found in natural systems.
- DNA sequences encoding the structural coding sequence can be assembled from cDNA fragments and short oligonucleotide linkers, or from a series of synthetic oligonucleotides, to provide a synthetic nucleic acid which is capable of being expressed from a recombinant transcriptional unit contained in a cell or in a cell-free transcription and translation system.
- sequences can be provided in the form of an open reading frame uninterrupted by internal non-translated sequences, or introns, which are typically present in eukaryotic genes.
- Genomic DNA comprising the relevant sequences can also be used in the formation of a recombinant gene or transcriptional unit. Sequences of non-translated DNA may be present 5′ or 3′ from the open reading frame, where such sequences do not interfere with manipulation or expression of the coding regions, and may indeed act to modulate production of a desired product by various mechanisms (see “DNA regulatory sequences”, below).
- the term “recombinant” polynucleotide or “recombinant” nucleic acid refers to one which is not naturally occurring, e.g., is made by the artificial combination of two otherwise separated segments of sequence through human intervention.
- This artificial combination is often accomplished by either chemical synthesis means, or by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques. Such can be done to replace a codon with a redundant codon encoding the same or a conservative amino acid, while typically introducing or removing a sequence recognition site. It can also be performed to join together nucleic acid segments of desired functions to generate a desired combination of functions.
- the term “recombinant” polypeptide refers to a polypeptide which is not naturally occurring, e.g., is made by the artificial combination of two otherwise separated segments of amino acid sequence through human intervention.
- a polypeptide that comprises a heterologous amino acid sequence is recombinant.
- By “construct” or “vector” is meant a recombinant nucleic acid, generally recombinant DNA, which has been generated for the purpose of the expression and/or propagation of a specific nucleotide sequence(s), or is to be used in the construction of other recombinant nucleotide sequences.
- DNA regulatory sequences refer to transcriptional and translational control sequences, such as promoters, enhancers, polyadenylation signals, terminators, protein degradation signals, and the like, that provide for and/or regulate expression of a coding sequence and/or production of an encoded polypeptide in a host cell.
- The term “transformation” is used interchangeably herein with “genetic modification” and refers to a permanent or transient genetic change induced in a cell following introduction of new nucleic acid (i.e., DNA exogenous to the cell).
- Genetic change (“modification”) can be accomplished either by incorporation of the new DNA into the genome of the host cell, or by transient or stable maintenance of the new DNA as an episomal element.
- a permanent genetic change is generally achieved by introduction of the DNA into the genome of the cell.
- permanent changes can be introduced into the chromosome or via extrachromosomal elements such as plasmids and expression vectors, which may contain one or more selectable markers to aid in their maintenance in the recombinant host cell.
- “Operably linked” refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner.
- a promoter is operably linked to a coding sequence if the promoter affects its transcription or expression.
- The terms “heterologous promoter” and “heterologous control regions” refer to promoters and other control regions that are not normally associated with a particular nucleic acid in nature.
- a “transcriptional control region heterologous to a coding region” is a transcriptional control region that is not normally associated with the coding region in nature.
- a “host cell,” as used herein, denotes an in vivo or in vitro eukaryotic cell, a prokaryotic cell, or a cell from a multicellular organism (e.g., a cell line) cultured as a unicellular entity, which eukaryotic or prokaryotic cells can be, or have been, used as recipients for a nucleic acid (e.g., an expression vector that comprises a nucleotide sequence encoding one or more biosynthetic pathway gene products such as mevalonate pathway gene products), and include the progeny of the original cell which has been genetically modified by the nucleic acid.
- a “recombinant host cell” (also referred to as a “genetically modified host cell”) is a host cell into which has been introduced a heterologous nucleic acid, e.g., an expression vector.
- a subject prokaryotic host cell is a genetically modified prokaryotic host cell (e.g., a bacterium), by virtue of introduction into a suitable prokaryotic host cell of a heterologous nucleic acid, e.g., an exogenous nucleic acid that is foreign to (not normally found in nature in) the prokaryotic host cell, or a recombinant nucleic acid that is not normally found in the prokaryotic host cell; and a subject eukaryotic host cell is a genetically modified eukaryotic host cell, by virtue of introduction into a suitable eukaryotic host cell of a heterologous nucleic acid, e.g., an exogenous nucleic acid that is foreign to the eukaryotic host cell, or a recombinant nucleic acid that is not normally found in the eukaryotic host cell.
- a group of amino acids having aliphatic side chains consists of glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains consists of serine and threonine; a group of amino acids having amide-containing side chains consists of asparagine and glutamine; a group of amino acids having aromatic side chains consists of phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains consists of lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains consists of cysteine and methionine.
- Exemplary conservative amino acid substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-
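The side-chain groups listed above can be written as a small lookup table. This is a minimal sketch using one-letter amino acid codes. The names `SIDE_CHAIN_GROUPS` and `shared_group` are hypothetical, and treating membership in the same group as "similar" is an illustrative simplification, not a definition from the disclosure.

```python
# Illustrative sketch only; names are hypothetical.
# Groups follow the side-chain classes listed in the text, in one-letter codes.
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),           # glycine, alanine, valine, leucine, isoleucine
    "aliphatic-hydroxyl": set("ST"),     # serine, threonine
    "amide-containing": set("NQ"),       # asparagine, glutamine
    "aromatic": set("FYW"),              # phenylalanine, tyrosine, tryptophan
    "basic": set("KRH"),                 # lysine, arginine, histidine
    "sulfur-containing": set("CM"),      # cysteine, methionine
}


def shared_group(residue_a: str, residue_b: str):
    """Return the name of a side-chain group containing both residues, or None."""
    a, b = residue_a.upper(), residue_b.upper()
    for name, members in SIDE_CHAIN_GROUPS.items():
        if a in members and b in members:
            return name
    return None
```

For instance, `shared_group("K", "R")` returns `"basic"`, while `shared_group("K", "D")` returns `None` because aspartate is not in any of the listed groups.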
- a polynucleotide or polypeptide has a certain percent “sequence identity” to another polynucleotide or polypeptide, meaning that, when aligned, that percentage of bases or amino acids are the same, and in the same relative position, when comparing the two sequences. Sequence similarity can be determined in a number of different manners. To determine sequence identity, sequences can be aligned using the methods and computer programs, including BLAST, available over the world wide web at ncbi.nlm.nih.gov/BLAST. See, e.g., Altschul et al. (1990), J. Mol. Biol. 215:403-10.
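As a minimal sketch of the percent identity definition above, the following function counts identical positions in two sequences that have already been aligned, with gaps written as `-`. The function name and the choice of denominator (all alignment columns) are assumptions. Tools such as BLAST report identity under their own conventions, so values may differ.

```python
# Illustrative sketch only; the denominator convention is an assumption.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must already be aligned (equal length, '-' for gaps).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b) if not (x == "-" and y == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)
```

Applied to `"PKKKRKVED"` and `"PKKKRKVDT"`, which differ at the last two of nine positions, the function returns roughly 77.8.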
- Another alignment algorithm is FASTA, available in the Genetics Computing Group (GCG) package, from Madison, Wis., USA, a wholly owned subsidiary of Oxford Molecular Group, Inc.
- GCG Genetics Computing Group
- Other techniques for alignment are described in Methods in Enzymology, vol. 266: Computer Methods for Macromolecular Sequence Analysis (1996), ed. Doolittle, Academic Press, Inc., a division of Harcourt Brace & Co., San Diego, Calif., USA.
- Of particular interest are alignment programs that permit gaps in the sequence.
- The Smith-Waterman algorithm is one type of algorithm that permits gaps in sequence alignments. See Meth. Mol. Biol. 70: 173-187 (1997).
- the GAP program using the Needleman and Wunsch alignment method can be utilized to align sequences. See J. Mol. Biol. 48: 443-453 (1970).
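To make the idea of a gapped global alignment concrete, here is a minimal sketch of the Needleman-Wunsch dynamic-programming method. It is not the GAP program. The function name and the simple scoring scheme (match +1, mismatch -1, linear gap -1) are assumptions chosen for brevity, whereas production tools typically use substitution matrices and affine gap penalties.

```python
# Illustrative sketch only; the scoring parameters are hypothetical defaults.

def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of diagonal, up, or left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[len(a)][len(b)]
```

The two aligned strings returned by this function have equal length, so they can be passed directly to a column-wise identity calculation.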
- the modified site-directed modifying polypeptides, and ribonucleoproteins comprising same, are useful in a variety of methods for target nucleic acid modification, which methods are also provided.
- the present disclosure provides modified site-directed modifying polypeptides, and ribonucleoproteins (RNP) comprising the modified polypeptides.
- RNP ribonucleoproteins
- the terms “RNP” and “RNP complex” are used herein interchangeably.
- the modification of the site-directed modifying polypeptides provides for passive entry into a target eukaryotic cell, i.e., an RNP that comprises a modified site-directed modifying polypeptide of the present disclosure crosses the plasma membrane of a eukaryotic cell without the need for any additional agent (e.g., small molecule agents, lipids, etc.) to facilitate crossing the plasma membrane.
- a site-directed modifying polypeptide of the present disclosure is a fusion polypeptide that comprises: a) a class 2 CRISPR/Cas protein; and b) a fusion partner, where the fusion partner is a heterologous polypeptide that facilitates uptake of the RNP into a eukaryotic cell, i.e., the heterologous polypeptide facilitates crossing the plasma membrane of a eukaryotic cell such that the RNP crosses the plasma membrane and enters the cytoplasm of the eukaryotic cell.
- a site-directed modifying polypeptide of the present disclosure is also referred to herein as a “fusion site-directed modifying polypeptide.”
- a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure comprises a fusion partner that is a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell.
- Where an RNP comprises a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure, the RNP crosses the plasma membrane without the need for any mechanical, electrical, or chemical means to facilitate crossing of the plasma membrane by the RNP and entry of the RNP into the cytoplasm.
- Where an RNP comprises a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure, the RNP crosses the plasma membrane without the need for any other agent that facilitates a macromolecule to cross the plasma membrane, e.g., without the use of a transfection reagent(s), viral infection, conjugation, protoplast fusion, an agent that modifies electrical conductivity of the plasma membrane, or an agent that modifies membrane stability.
- Where an RNP comprises a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure, the RNP crosses the plasma membrane without the need for any other agent that facilitates a macromolecule to cross the plasma membrane, e.g., without the use of a cationic liposome (e.g., lipofectamine); without the use of diethylaminoethyl (DEAE)-dextran; without the use of calcium phosphate; without the use of a dendrimer; etc.
- DEAE diethylaminoethyl
- Where an RNP comprises a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure, the RNP crosses the plasma membrane without the need for modulation of the electrical conductivity of the plasma membrane, e.g., without the need for electroporation.
- the RNP crosses the plasma membrane without the need for mechanical means of facilitating crossing the plasma membrane, e.g., without the use of microinjection, pressure, or particle bombardment.
- a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure comprises two or more heterologous polypeptides that facilitate uptake of an RNP into a eukaryotic cell.
- a fusion site-directed modifying polypeptide of the present disclosure comprises six heterologous polypeptides that facilitate uptake of an RNP into a eukaryotic cell.
- Where a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure comprises two or more heterologous polypeptides that facilitate uptake of an RNP into a eukaryotic cell, the two or more heterologous polypeptides are separated by a linker of from 2 amino acids to 25 amino acids (e.g., 2 amino acids (aa), 3 aa, 4 aa, 5 aa, 6 aa, 7 aa, 8 aa, 9 aa, 10 aa, 11 aa, 12 aa, 13 aa, 14 aa, 15 aa, 16 aa, 17 aa, 18 aa, 19 aa, 20 aa, 21 aa, 22 aa, 23 aa, 24 aa, or 25 aa).
- Suitable linkers are described below.
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure can have a length of from about 5 amino acids to about 70 amino acids, e.g., from 5 amino acids (aa) to 10 aa, from 10 aa to 15 aa, from 15 aa to 20 aa, from 20 aa to 25 aa, from 25 aa to 30 aa, from 30 aa to 35 aa, from 35 aa to 40 aa, from 40 aa to 45 aa, from 45 aa to 50 aa, from 50 aa to 55 aa, from 55 aa to 60 aa, from 60 aa to 65 aa, or from 65 aa to 70 aa.
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell has a length of from 5 amino acids to 10 amino acids (e.g., 5, 6, 7, 8, 9, or 10 amino acids). In some cases, a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell has a length of 7 amino acids. In some cases, a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell has a length of from 10 amino acids to 15 amino acids (e.g., 10, 11, 12, 13, 14, or 15 amino acids).
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell has a length of from 15 amino acids to 20 amino acids (e.g., 15, 16, 17, 18, 19, or 20 amino acids). In some cases, a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell has a length of from 20 amino acids to 25 amino acids (e.g., 20, 21, 22, 23, 24, or 25 amino acids).
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure can have a high percentage of arginine and/or lysine residues.
- a suitable heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell can have an amino acid sequence comprising from 20% to 80% arginine and/or lysine residues.
- a suitable heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell can have an amino acid sequence comprising from 20% to 80% lysine residues.
- a suitable heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell can have an amino acid sequence comprising from 20% to 80% arginine+lysine residues.
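The 20% to 80% composition range can be checked with a few lines of code. This is a minimal sketch. The function names are hypothetical, and the range is treated as inclusive of both endpoints, which is an assumption.

```python
# Illustrative sketch only; names are hypothetical and the range is treated as inclusive.

def residue_percent(seq: str, residues: str = "RK") -> float:
    """Percent of positions in seq occupied by any of the given one-letter residues."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(seq.count(r) for r in residues) / len(seq)


def in_composition_range(
    seq: str, residues: str = "RK", low: float = 20.0, high: float = 80.0
) -> bool:
    """True if the residue content of seq lies within [low, high] percent."""
    return low <= residue_percent(seq, residues) <= high
```

For the sequence PKKKRKV, five of seven residues are arginine or lysine (about 71%), and four of seven are lysine (about 57%), so both values fall within the stated range.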
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence K-K/R-X-K/R, where X is any amino acid.
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence K-K/R-X-K/R, where X is any amino acid; and has a length of from 7 to 17 amino acids.
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence K-K/R-X-K/R, where X is any amino acid; and has a length of from 5 to 15 amino acids.
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence K-K/R-X-K/R, where X is any amino acid; and has a length of from 15 to 20 amino acids.
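The K-K/R-X-K/R motif and the accompanying length windows can be expressed as a regular expression plus a length check. This is a minimal sketch. The function name is hypothetical, and restricting X to the 20 standard amino acids is an assumption.

```python
# Illustrative sketch only; X is restricted to the 20 standard amino acids by assumption.
import re

# K, then K or R, then any amino acid (X), then K or R.
MOTIF = re.compile(r"K[KR][ACDEFGHIKLMNPQRSTVWY][KR]")


def has_motif(seq: str, min_len=None, max_len=None) -> bool:
    """True if seq contains K-K/R-X-K/R and, if given, its length is within bounds."""
    seq = seq.upper()
    if min_len is not None and len(seq) < min_len:
        return False
    if max_len is not None and len(seq) > max_len:
        return False
    return MOTIF.search(seq) is not None
```

For example, `has_motif("PKKKRKV", 7, 17)` is true because the sequence contains KKKR and is 7 residues long, whereas `has_motif("PKKKRKV", 15, 20)` is false on length alone.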
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence PKKKRKV (SEQ ID NO: 1090).
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence PKKKRKV (SEQ ID NO: 1090), and has a length of 7 amino acids.
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence RPAATKKAGQAKKKKLD (SEQ ID NO: 1096).
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence RPAATKKAGQAKKKKLD (SEQ ID NO: 1096), and has a length of 17 amino acids.
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence AVKRPAATKKAGQAKKKKLD (SEQ ID NO: 1097).
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence AVKRPAATKKAGQAKKKKLD (SEQ ID NO: 1097), and has a length of 20 amino acids.
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence AVKRPAATKKAGQAKKK (SEQ ID NO: 1098).
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence AVKRPAATKKAGQAKKK (SEQ ID NO: 1098), and has a length of 17 amino acids.
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence KRPAATKKAGQAKKKKLD (SEQ ID NO: 1099), and has a length of 18 amino acids.
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence PKKKRKVED (SEQ ID NO: 1248); and has a length of 9 amino acids.
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure comprises the amino acid sequence PKKKRKVDT (SEQ ID NO: 1249); and has a length of 9 amino acids.
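The sequences and lengths recited above can be collected into a table and cross-checked. This is a minimal sketch. The dictionary and function names are hypothetical, and the sequences and stated lengths are copied from the text.

```python
# Illustrative sketch only; names are hypothetical. Data are copied from the text above.
RECITED_PEPTIDES = {
    # SEQ ID NO: (sequence, stated length in amino acids)
    1090: ("PKKKRKV", 7),
    1096: ("RPAATKKAGQAKKKKLD", 17),
    1097: ("AVKRPAATKKAGQAKKKKLD", 20),
    1098: ("AVKRPAATKKAGQAKKK", 17),
    1099: ("KRPAATKKAGQAKKKKLD", 18),
    1248: ("PKKKRKVED", 9),
    1249: ("PKKKRKVDT", 9),
}


def length_mismatches(table=RECITED_PEPTIDES):
    """Return SEQ ID NOs whose sequence length differs from the stated length."""
    return [seq_id for seq_id, (seq, stated) in table.items() if len(seq) != stated]
```

Calling `length_mismatches()` on this table returns an empty list, meaning each recited sequence matches its stated length.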
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure is at or near the N-terminus of a class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure is at or near the C-terminus of a class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- a heterologous polypeptide that facilitates uptake of an RNP into a eukaryotic cell and that is suitable for inclusion in a fusion polypeptide of the present disclosure is located internally within a class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure comprises two or more heterologous polypeptides that facilitate uptake of an RNP into a eukaryotic cell, where the two or more heterologous polypeptides are at or near the N-terminus of a class 2 CRISPR/Cas polypeptide.
- a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure comprises two or more heterologous polypeptides that facilitate uptake of an RNP into a eukaryotic cell, where the two or more heterologous polypeptides are at or near the C-terminus of a class 2 CRISPR/Cas polypeptide.
- a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure comprises two or more heterologous polypeptides that facilitate uptake of an RNP into a eukaryotic cell, where the two or more heterologous polypeptides are at or near the N-terminus and at or near the C-terminus of a class 2 CRISPR/Cas polypeptide.
- a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure comprises two or more heterologous polypeptides that facilitate uptake of an RNP into a eukaryotic cell, where the two or more heterologous polypeptides are at or near the N-terminus and located internally within a class 2 CRISPR/Cas polypeptide.
- a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure comprises two or more heterologous polypeptides that facilitate uptake of an RNP into a eukaryotic cell, where the two or more heterologous polypeptides are at or near the C-terminus and located internally within a class 2 CRISPR/Cas polypeptide.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a heterologous polypeptide that facilitates entry into a eukaryotic cell; and b) a class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) two or more heterologous polypeptides that facilitate entry into a eukaryotic cell; and b) a class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide); and b) a heterologous polypeptide that facilitates entry into a eukaryotic cell.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide); and b) two or more heterologous polypeptides that facilitate entry into a eukaryotic cell.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; and c) a class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a linker polypeptide of from about 3 amino acids to about 25 amino acids in length; c) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; and d) a class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a first linker polypeptide of from about 3 amino acids to about 25 amino acids in length; c) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a second linker polypeptide of from about 3 amino acids to about 25 amino acids in length; and e) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- the first and the second heterologous polypeptides are identical.
- the first and the second heterologous polypeptides are different, e.g., differ from one another in amino acid sequence by at least one amino acid.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; c) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide); and d) a third heterologous polypeptide that facilitates entry into a eukaryotic cell.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a linker polypeptide of from about 3 amino acids to about 25 amino acids in length; c) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide); and e) a third heterologous polypeptide that facilitates entry into a eukaryotic cell.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a first linker polypeptide of from about 3 amino acids to about 25 amino acids in length; c) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a second linker polypeptide of from about 3 amino acids to about 25 amino acids in length; e) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide); and f) a third heterologous polypeptide that facilitates entry into a eukaryotic cell.
- the first, the second, and the third heterologous polypeptides are identical. In other instances, the first, the second, and the third heterologous polypeptides are different, e.g., differ from one another in amino acid sequence by at least one amino acid. In some cases, the first and the second heterologous polypeptides are identical; and the third heterologous polypeptide differs from the first and the second heterologous polypeptides.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; c) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; and d) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a first linker polypeptide of from about 3 amino acids to about 25 amino acids in length; c) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a second linker polypeptide of from about 3 amino acids to about 25 amino acids in length; e) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; and f) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a first linker polypeptide of from about 3 amino acids to about 25 amino acids in length; c) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a second linker polypeptide of from about 3 amino acids to about 25 amino acids in length; e) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; f) a third linker polypeptide of from about 3 amino acids to about 25 amino acids in length; and g) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- the first, the second, and the third heterologous polypeptides are identical. In other instances, the first, the second, and the third heterologous polypeptides are different, e.g., differ from one another in amino acid sequence by at least one amino acid.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; c) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide); and e) a fourth heterologous polypeptide that facilitates entry into a eukaryotic cell.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a first linker polypeptide of from about 3 amino acids to about 25 amino acids in length; c) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a second linker polypeptide of from about 3 amino acids to about 25 amino acids in length; e) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; f) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide); and g) a fourth heterologous polypeptide that facilitates entry into a eukaryotic cell.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a first linker polypeptide of from about 3 amino acids to about 25 amino acids in length; c) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a second linker polypeptide of from about 3 amino acids to about 25 amino acids in length; e) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; f) a third linker polypeptide of from about 3 amino acids to about 25 amino acids in length; g) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide); and h) a fourth heterologous polypeptide that facilitates entry into a eukaryotic cell.
- the first, the second, the third, and the fourth heterologous polypeptides are identical. In other instances, the first, the second, the third, and the fourth heterologous polypeptides are different, e.g., differ from one another in amino acid sequence by at least one amino acid. In some cases, the first, the second, and the third heterologous polypeptides are identical; and the fourth heterologous polypeptide differs from the first, the second, and the third heterologous polypeptides.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; c) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a fourth heterologous polypeptide that facilitates entry into a eukaryotic cell; and e) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a first linker polypeptide of from about 3 amino acids to about 25 amino acids in length; c) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a second linker polypeptide of from about 3 amino acids to about 25 amino acids in length; e) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; f) a third linker polypeptide of from about 3 amino acids to about 25 amino acids in length; g) a fourth heterologous polypeptide that facilitates entry into a eukaryotic cell; and h) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a first linker polypeptide of from about 3 amino acids to about 25 amino acids in length; c) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a second linker polypeptide of from about 3 amino acids to about 25 amino acids in length; e) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; f) a third linker polypeptide of from about 3 amino acids to about 25 amino acids in length; g) a fourth heterologous polypeptide that facilitates entry into a eukaryotic cell; h) a fourth linker polypeptide of from about 3 amino acids to about 25 amino acids in length; and i) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- the first, second, third, and fourth heterologous polypeptides are identical. In other instances, the first, second, third, and fourth heterologous polypeptides are different, e.g., differ from one another in amino acid sequence by at least one amino acid.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; c) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a fourth heterologous polypeptide that facilitates entry into a eukaryotic cell; e) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide); and f) a fifth heterologous polypeptide that facilitates entry into a eukaryotic cell.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a first linker polypeptide of from about 3 amino acids to about 25 amino acids in length; c) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a second linker polypeptide of from about 3 amino acids to about 25 amino acids in length; e) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; f) a third linker polypeptide of from about 3 amino acids to about 25 amino acids in length; g) a fourth heterologous polypeptide that facilitates entry into a eukaryotic cell; h) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide); and i) a fifth heterologous polypeptide that facilitates entry into a eukaryotic cell.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a first linker polypeptide of from about 3 amino acids to about 25 amino acids in length; c) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a second linker polypeptide of from about 3 amino acids to about 25 amino acids in length; e) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; f) a third linker polypeptide of from about 3 amino acids to about 25 amino acids in length; g) a fourth heterologous polypeptide that facilitates entry into a eukaryotic cell; h) a fourth linker polypeptide of from about 3 amino acids to about 25 amino acids in length; i) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide); and j) a fifth heterologous polypeptide that facilitates entry into a eukaryotic cell.
- the first, second, third, and fourth heterologous polypeptides are identical. In other instances, the first, second, third, and fourth heterologous polypeptides are different, e.g., differ from one another in amino acid sequence by at least one amino acid. In some cases, the first, second, third, fourth, and fifth heterologous polypeptides are identical. In some cases, the first, second, third, fourth, and fifth heterologous polypeptides are different, e.g., differ from one another in amino acid sequence by at least one amino acid. In some cases, the first, second, third, and fourth heterologous polypeptides are identical; and the fifth heterologous polypeptide differs from the first, second, third, and fourth heterologous polypeptides.
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; c) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a fourth heterologous polypeptide that facilitates entry into a eukaryotic cell; e) a fifth heterologous polypeptide that facilitates entry into a eukaryotic cell; and f) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- a fusion polypeptide of the present disclosure comprises, in order from N-terminus to C-terminus: a) a first heterologous polypeptide that facilitates entry into a eukaryotic cell; b) a second heterologous polypeptide that facilitates entry into a eukaryotic cell; c) a third heterologous polypeptide that facilitates entry into a eukaryotic cell; d) a fourth heterologous polypeptide that facilitates entry into a eukaryotic cell; e) a fifth heterologous polypeptide that facilitates entry into a eukaryotic cell; f) a sixth heterologous polypeptide that facilitates entry into a eukaryotic cell; and g) a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
- a linker can be interposed between any two heterologous polypeptides and/or between a heterologous polypeptide and a Class 2 CRISPR/Cas polypeptide (e.g., a type II, type V, or type VI CRISPR/Cas site-directed modifying polypeptide).
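The N-terminus to C-terminus arrangements enumerated above all follow one pattern: an ordered series of uptake-facilitating polypeptides, optional linkers between adjacent components, and a single Class 2 CRISPR/Cas polypeptide. The following is a minimal sketch of that pattern, not an implementation from the disclosure. The `Part` class, the `assemble_fusion` function, the validation rules, and the `[CAS]` placeholder (standing in for a full Cas amino acid sequence) are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Part:
    kind: str      # "uptake", "linker", or "cas" (hypothetical labels)
    sequence: str  # amino acid sequence, or a placeholder for illustration


def assemble_fusion(parts):
    """Concatenate parts in N-terminus to C-terminus order.

    Assumed rules for this sketch: exactly one Cas component, and every
    linker sits between two non-linker components.
    """
    kinds = [p.kind for p in parts]
    if kinds.count("cas") != 1:
        raise ValueError("expected exactly one Class 2 CRISPR/Cas component")
    for i, kind in enumerate(kinds):
        if kind != "linker":
            continue
        if i == 0 or i == len(kinds) - 1:
            raise ValueError("a linker must be interposed between two components")
        if kinds[i - 1] == "linker" or kinds[i + 1] == "linker":
            raise ValueError("adjacent linkers are not modeled in this sketch")
    return "".join(p.sequence for p in parts)


# First uptake polypeptide, first linker, second uptake polypeptide,
# second linker, then the Cas polypeptide.
uptake = Part("uptake", "PKKKRKVED")  # SEQ ID NO: 1248
linker = Part("linker", "GGS")
cas = Part("cas", "[CAS]")            # placeholder, not a real sequence
fusion = assemble_fusion([uptake, linker, uptake, linker, cas])
print(fusion)
```

Reordering the list passed to `assemble_fusion` gives the other arrangements, for example placing an uptake polypeptide after the Cas component for a C-terminal fusion.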
- a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure includes one or more internally inserted linker polypeptides, where a linker polypeptide can be between two heterologous polypeptides that facilitate crossing a eukaryotic plasma membrane; between a heterologous polypeptide that facilitates crossing a eukaryotic plasma membrane and a site-directed modifying polypeptide; and the like.
- the linker polypeptide may have any of a variety of amino acid sequences. Proteins can be joined by a spacer peptide, generally of a flexible nature, although other chemical linkages are not excluded.
- Suitable linkers include polypeptides of from 3 amino acids to 40 amino acids in length, e.g., from 3 amino acids to 25 amino acids in length, from 3 amino acids to 10 amino acids in length, from 3 amino acids to 5 amino acids in length, etc. These linkers are generally produced by using synthetic, linker-encoding oligonucleotides to couple the proteins. Peptide linkers with a degree of flexibility can be used.
- the linking peptides may have virtually any amino acid sequence, bearing in mind that the preferred linkers will have a sequence that results in a generally flexible peptide.
- small amino acids, such as glycine and alanine, are of use in creating a flexible peptide. The creation of such sequences is routine to those of skill in the art.
- a variety of different linkers are commercially available and are considered suitable for use.
- Exemplary linker polypeptides include glycine polymers (G)n, glycine-serine polymers (including, for example, GGS, (GS)n, (GSGGS)n (SEQ ID NO: 517), (GGSGGS)n (SEQ ID NO: 518), and (GGGS)n (SEQ ID NO: 519), where n is an integer of at least one, and can range from 1 to about 10), glycine-alanine polymers, and alanine-serine polymers.
- Exemplary linkers can comprise amino acid sequences including, but not limited to, GS, GGS, GGSG (SEQ ID NO: 520), GGSGG (SEQ ID NO: 521), GSGSG (SEQ ID NO: 522), GSGGG (SEQ ID NO: 523), GGGSG (SEQ ID NO: 524), GSSSG (SEQ ID NO: 525), and the like.
- the ordinarily skilled artisan will recognize that design of a peptide conjugated to any elements described above can include linkers that are all or partially flexible, such that the linker can include a flexible linker as well as one or more portions that confer less flexible structure.
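The repeat-unit linkers described above, such as (GSGGS)n, can be expanded into explicit sequences and checked against the 3 to 40 amino acid length range given for suitable linkers. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and only the lower bound on n is enforced because the text gives the upper bound as "about 10".

```python
def repeat_linker(unit, n):
    """Expand a repeat-unit linker such as (GSGGS)n into a sequence.

    The text requires n to be an integer of at least one.
    """
    if not isinstance(n, int) or n < 1:
        raise ValueError("n must be an integer of at least one")
    return unit * n


def in_suitable_length_range(linker, minimum=3, maximum=40):
    """Check a linker against the 3 to 40 amino acid range stated above."""
    return minimum <= len(linker) <= maximum


linker = repeat_linker("GSGGS", 3)
print(linker, len(linker), in_suitable_length_range(linker))
```

Here `repeat_linker("GSGGS", 3)` yields a 15-residue linker, which also falls within the narrower 3 to 25 amino acid range used in the fusion arrangements.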
- a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure is a fusion class 2 CRISPR/Cas site-directed modifying polypeptide and therefore comprises a class 2 CRISPR/Cas polypeptide.
- in class 2 CRISPR systems, the functions of the effector complex (e.g., cleaving target DNA) are carried out by a single protein (e.g., see Zetsche et al, Cell. 2015 Oct. 22; 163(3):759-71; Makarova et al, Nat Rev Microbiol. 2015 November; 13(11):722-36; and Shmakov et al., Mol Cell.).
- class 2 CRISPR/Cas protein is used herein to encompass the effector protein (the target nucleic acid cleaving protein) from class 2 CRISPR systems.
- class 2 CRISPR/Cas protein encompasses type II CRISPR/Cas proteins (e.g., Cas9), type V CRISPR/Cas proteins (e.g., Cpf1, C2c1, C2c3), and type VI CRISPR/Cas proteins (e.g., C2c2).
- class 2 CRISPR/Cas proteins encompass type II, type V, and type VI CRISPR/Cas proteins, but the term is also meant to encompass any class 2 CRISPR/Cas protein suitable for forming a subject RNP complex.
- a fusion site-directed modifying polypeptide of the present disclosure comprises a type II CRISPR site-directed modifying polypeptide (e.g., a “Cas9 polypeptide”).
- a fusion site-directed modifying polypeptide of the present disclosure comprises: a) Cas9 protein; and b) a heterologous polypeptide that facilitates entry of an RNP (where the RNP comprises the fusion site-directed modifying polypeptide and a guide RNA) into a eukaryotic cell.
- a fusion site-directed modifying polypeptide of the present disclosure comprises: a) chimeric Cas9 protein; and b) a heterologous polypeptide that facilitates entry of an RNP (where the RNP comprises the fusion site-directed modifying polypeptide and a guide RNA) into a eukaryotic cell.
- a fusion site-directed modifying polypeptide of the present disclosure comprises: a) variant Cas9 protein that is a nickase; and b) a heterologous polypeptide that facilitates entry of an RNP (where the RNP comprises the fusion site-directed modifying polypeptide and a guide RNA) into a eukaryotic cell.
- a fusion site-directed modifying polypeptide of the present disclosure comprises: a) variant Cas9 protein that exhibits reduced enzymatic activity (e.g., a “dead Cas9” or “dCas9”); and b) a heterologous polypeptide that facilitates entry of an RNP (where the RNP comprises the fusion site-directed modifying polypeptide and a guide RNA) into a eukaryotic cell.
- a Cas9 protein forms a complex with a Cas9 guide RNA.
- the guide RNA provides target specificity to a Cas9-guide RNA complex by having a nucleotide sequence (a guide sequence) that is complementary to a sequence (the target site) of a target nucleic acid (as described elsewhere herein).
- the Cas9 protein of the complex provides the site-specific activity.
- the Cas9 protein is guided to a target site (e.g., stabilized at a target site) within a target nucleic acid sequence (e.g., a chromosomal sequence or an extrachromosomal sequence, e.g., an episomal sequence, a minicircle sequence, a mitochondrial sequence, a chloroplast sequence, etc.).
- a Cas9 protein can bind and/or modify (e.g., cleave, nick, methylate, demethylate, etc.) a target nucleic acid and/or a polypeptide associated with target nucleic acid (e.g., methylation or acetylation of a histone tail) (e.g., when the Cas9 protein includes a fusion partner with an activity).
- the Cas9 protein is a naturally-occurring protein (e.g., naturally occurs in bacterial and/or archaeal cells).
- the Cas9 protein is not a naturally-occurring polypeptide (e.g., the Cas9 protein is a variant Cas9 protein, a chimeric protein, and the like).
- Cas9 proteins include, but are not limited to, those set forth in SEQ ID NOs: 5-816.
- Naturally occurring Cas9 proteins bind a Cas9 guide RNA, are thereby directed to a specific sequence within a target nucleic acid (a target site), and cleave the target nucleic acid (e.g., cleave dsDNA to generate a double strand break, cleave ssDNA, cleave ssRNA, etc.).
- a chimeric Cas9 protein is a fusion protein comprising a Cas9 polypeptide that is fused to a heterologous protein, where the heterologous protein provides an activity not provided by the Cas9 protein, where the activity is other than facilitating entry of an RNP into a eukaryotic cell.
- the fusion partner can provide an activity, e.g., enzymatic activity (e.g., nuclease activity, activity for DNA and/or RNA methylation, activity for DNA and/or RNA cleavage, activity for histone acetylation, activity for histone methylation, activity for RNA modification, activity for RNA-binding, activity for RNA splicing etc.).
- a portion of the Cas9 protein exhibits reduced nuclease activity relative to the corresponding portion of a wild type Cas9 protein (e.g., in some cases the Cas9 protein is a nickase).
- the Cas9 protein is enzymatically inactive, or has reduced enzymatic activity relative to a wild-type Cas9 protein (e.g., relative to Streptococcus pyogenes Cas9).
- Assays to determine whether a given protein interacts with a Cas9 guide RNA can be any convenient binding assay that tests for binding between a protein and a nucleic acid. Suitable binding assays (e.g., gel shift assays) will be known to one of ordinary skill in the art (e.g., assays that include adding a Cas9 guide RNA and a protein to a target nucleic acid).
- Assays to determine whether a protein has an activity can be any convenient assay (e.g., any convenient nucleic acid cleavage assay that tests for nucleic acid cleavage).
- Suitable assays (e.g., cleavage assays) will be known to one of ordinary skill in the art and can include adding a Cas9 guide RNA and a protein to a target nucleic acid.
- a chimeric Cas9 protein includes a heterologous polypeptide that has enzymatic activity that modifies target nucleic acid (e.g., nuclease activity, methyltransferase activity, demethylase activity, DNA repair activity, DNA damage activity, deamination activity, dismutase activity, alkylation activity, depurination activity, oxidation activity, pyrimidine dimer forming activity, integrase activity, transposase activity, recombinase activity, polymerase activity, ligase activity, helicase activity, photolyase activity or glycosylase activity).
- a chimeric Cas9 protein includes a heterologous polypeptide that has enzymatic activity that modifies a polypeptide (e.g., a histone) associated with target nucleic acid (e.g., methyltransferase activity, demethylase activity, acetyltransferase activity, deacetylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity or demyristoylation activity).
- Cas9 orthologs from a wide variety of species have been identified and in some cases the proteins share only a few identical amino acids.
- Identified Cas9 orthologs have similar domain architecture with a central HNH endonuclease domain and a split RuvC/RNaseH domain (e.g., RuvCI, RuvCII, and RuvCIII) (e.g., see Table 1).
- a Cas9 protein can have 3 different regions (sometimes referred to as RuvC-I, RuvC-II, and RuvC-III) that are not contiguous with respect to the primary amino acid sequence of the Cas9 protein, but fold together to form a RuvC domain once the protein is produced and folds.
- Cas9 proteins can be said to share at least 4 key motifs with a conserved architecture.
- Motifs 1, 2, and 4 are RuvC-like motifs while motif 3 is an HNH-motif.
- the motifs set forth in Table 1 may not represent the entire RuvC-like and/or HNH domains as accepted in the art, but Table 1 presents motifs that can be used to help determine whether a given protein is a Cas9 protein.
- Table 1 lists 4 motifs that are present in Cas9 sequences from various species.
- the amino acids listed in Table 1 are from the Cas9 from S. pyogenes (SEQ ID NO: 5)

| Motif # | Motif | Amino acids (residue #s) | Highly conserved |
|---|---|---|---|
| 1 | RuvC-like I | IGLDIGTNSVGWAVI (7-21) (SEQ ID NO: 1) | D10, G12, G17 |
| 2 | RuvC-like II | IVIEMARE (759-766) (SEQ ID NO: 2) | E762 |
| 3 | HNH-motif | DVDHIVPQSFLKDDSIDNKVLTRSDKN (837-863) (SEQ ID NO: 3) | H840, N854, N863 |
| 4 | RuvC-like II | HHAHDAYL (982-989) (SEQ ID NO: 4) | H982, H983, A984, D986, A987 |
- a suitable Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more or 100% amino acid sequence identity to motifs 1-4 as set forth in SEQ ID NOs: 1-4, respectively (e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 5-816.
- a suitable Cas9 polypeptide comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more or 100% amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5 (e.g., the sequences set forth in SEQ ID NOs: 1-4, e.g., see Table 1), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 60% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 70% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 75% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 80% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 85% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 90% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 95% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 99% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 100% amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
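By way of non-limiting illustration, the motif-identity criterion described above can be sketched in Python. The sketch assumes that each candidate motif has already been extracted and is the same length as the corresponding Table 1 motif, so that identity can be computed position by position without gaps; in practice an alignment tool would be used to locate the corresponding portions. The function names `percent_identity` and `motifs_meet_threshold` are hypothetical and are not part of the present disclosure.

```python
# Table 1 motifs from the S. pyogenes Cas9 (SEQ ID NOs: 1-4 in the text).
TABLE1_MOTIFS = {
    1: "IGLDIGTNSVGWAVI",
    2: "IVIEMARE",
    3: "DVDHIVPQSFLKDDSIDNKVLTRSDKN",
    4: "HHAHDAYL",
}


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences.

    Assumption: the sequences are pre-aligned and of equal length.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def motifs_meet_threshold(candidate_motifs: dict, threshold: float = 60.0) -> bool:
    """Return True if each of motifs 1-4 meets the identity threshold."""
    return all(
        percent_identity(candidate_motifs[number], reference) >= threshold
        for number, reference in TABLE1_MOTIFS.items()
    )
```

The default threshold of 60% corresponds to the lowest identity tier recited above; any of the other recited tiers (70%, 75%, 80%, 85%, 90%, 95%, 99%, or 100%) can be passed instead.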
- Any Cas9 protein as defined above can be used as a Cas9 polypeptide, as part of a chimeric Cas9 polypeptide (e.g., a Cas9 fusion protein), any of which can be used in an RNP of the present disclosure.
- a suitable Cas9 protein comprises an amino acid sequence having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more or 100% amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 60% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 protein comprises an amino acid sequence having 70% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 75% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 protein comprises an amino acid sequence having 80% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 85% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 protein comprises an amino acid sequence having 90% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 95% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 protein comprises an amino acid sequence having 99% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 100% amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- Any Cas9 protein as defined above can be used as a Cas9 polypeptide, as part of a chimeric Cas9 polypeptide (e.g., a Cas9 fusion protein), any of which can be used in an RNP of the present disclosure.
- a suitable Cas9 protein comprises an amino acid sequence having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more or 100% amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 60% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 protein comprises an amino acid sequence having 70% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 protein comprises an amino acid sequence having 75% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 80% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 protein comprises an amino acid sequence having 85% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 protein comprises an amino acid sequence having 90% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable Cas9 protein comprises an amino acid sequence having 95% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 protein comprises an amino acid sequence having 99% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable Cas9 protein comprises an amino acid sequence having 100% amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- Any Cas9 protein as defined above can be used as a Cas9 polypeptide, as part of a chimeric Cas9 polypeptide (e.g., a Cas9 fusion protein), any of which can be used in an RNP of the present disclosure.
- a Cas9 protein comprises 4 motifs (as listed in Table 1), at least one with (or each with) amino acid sequences having 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more or 100% amino acid sequence identity to each of the 4 motifs listed in Table 1 (SEQ ID NOs:1-4), or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- the term “Cas9 protein” encompasses a “chimeric Cas9 protein.” As used herein, the term “Cas9 protein” encompasses a variant Cas9 that is a nickase. As used herein, the term “Cas9 protein” encompasses a variant Cas9 that exhibits reduced enzymatic activity (e.g., a “dead Cas9” or “dCas9”).
- a fusion site-directed modifying polypeptide of the present disclosure comprises: a) a variant Cas9 protein; and b) a heterologous polypeptide that facilitates entry of an RNP (where the RNP comprises the fusion site-directed modifying polypeptide and a guide RNA) into a eukaryotic cell.
- a variant Cas9 protein has an amino acid sequence that differs by at least one amino acid (e.g., has a deletion, insertion, substitution, or fusion) when compared to the amino acid sequence of a wild type Cas9 protein.
- the variant Cas9 protein has an amino acid change (e.g., deletion, insertion, or substitution) that reduces the nuclease activity of the Cas9 protein.
- the variant Cas9 protein has 50% or less, 40% or less, 30% or less, 20% or less, 10% or less, 5% or less, or 1% or less of the nuclease activity of the corresponding wild-type Cas9 protein. In some cases, the variant Cas9 protein has no substantial nuclease activity.
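By way of non-limiting illustration, the relative-activity tiers recited above can be expressed as a short Python sketch. It assumes that nuclease activity of the variant and of the corresponding wild-type protein has been measured in the same assay and the same units (for example, a cleavage rate); the function names and the choice of measurement are hypothetical.

```python
# Tiers recited in the text: "X% or less" of wild-type nuclease activity.
REDUCED_ACTIVITY_TIERS = (50, 40, 30, 20, 10, 5, 1)


def relative_activity_percent(variant_activity: float, wild_type_activity: float) -> float:
    """Variant activity as a percentage of wild-type activity (same units assumed)."""
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    return 100.0 * variant_activity / wild_type_activity


def tightest_tier(variant_activity: float, wild_type_activity: float):
    """Return the smallest recited tier the variant satisfies, or None if above 50%."""
    pct = relative_activity_percent(variant_activity, wild_type_activity)
    satisfied = [tier for tier in REDUCED_ACTIVITY_TIERS if pct <= tier]
    return min(satisfied) if satisfied else None
```

A variant measured at 4 units against a wild-type value of 100 units would fall in the "5% or less" tier under this sketch.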
- When a Cas9 protein is a variant Cas9 protein that has no substantial nuclease activity, it can be referred to as “dCas9.”
- a protein (e.g., a class 2 CRISPR/Cas protein, e.g., a Cas9 protein) that cleaves one strand of a double stranded target nucleic acid but not the other can be referred to as a “nickase” (e.g., a “nickase Cas9”).
- a variant Cas9 protein can cleave the complementary strand (sometimes referred to in the art as the target strand) of a target nucleic acid but has reduced ability to cleave the non-complementary strand (sometimes referred to in the art as the non-target strand) of a target nucleic acid.
- the variant Cas9 protein can have a mutation (amino acid substitution) that reduces the function of the RuvC domain.
- the Cas9 protein can be a nickase that cleaves the complementary strand, but does not cleave the non-complementary strand.
- a variant Cas9 protein has a mutation at an amino acid position corresponding to residue D10 (e.g., D10A, aspartate to alanine) of SEQ ID NO: 5 (or the corresponding position of any of the proteins set forth in SEQ ID NOs: 6-261 and 264-816) and can therefore cleave the complementary strand of a double stranded target nucleic acid but has reduced ability to cleave the non-complementary strand of a double stranded target nucleic acid (thus resulting in a single strand break (SSB) instead of a double strand break (DSB) when the variant Cas9 protein cleaves a double stranded target nucleic acid) (see, for example, Jinek et al., Science. 2012 Aug. 17; 337(6096):816-21). See, e.g., SEQ ID NO: 262.
- a variant Cas9 protein can cleave the non-complementary strand of a target nucleic acid but has reduced ability to cleave the complementary strand of the target nucleic acid.
- the variant Cas9 protein can have a mutation (amino acid substitution) that reduces the function of the HNH domain.
- the Cas9 protein can be a nickase that cleaves the non-complementary strand, but does not cleave the complementary strand.
- the variant Cas9 protein has a mutation at an amino acid position corresponding to residue H840 (e.g., an H840A mutation, histidine to alanine) of SEQ ID NO: 5 (or the corresponding position of any of the proteins set forth as SEQ ID NOs: 6-261 and 264-816) and can therefore cleave the non-complementary strand of the target nucleic acid but has reduced ability to cleave (e.g., does not cleave) the complementary strand of the target nucleic acid.
- Such a Cas9 protein has a reduced ability to cleave a target nucleic acid (e.g., a single stranded target nucleic acid) but retains the ability to bind a target nucleic acid (e.g., a single stranded target nucleic acid). See, e.g., SEQ ID NO: 263.
- a variant Cas9 protein has a reduced ability to cleave both the complementary and the non-complementary strands of a double stranded target nucleic acid.
- the variant Cas9 protein harbors mutations at amino acid positions corresponding to residues D10 and H840 (e.g., D10A and H840A) of SEQ ID NO: 5 (or the corresponding residues of any of the proteins set forth as SEQ ID NOs: 6-261 and 264-816) such that the polypeptide has a reduced ability to cleave (e.g., does not cleave) both the complementary and the non-complementary strands of a target nucleic acid.
- Such a Cas9 protein has a reduced ability to cleave a target nucleic acid (e.g., a single stranded or double stranded target nucleic acid) but retains the ability to bind a target nucleic acid.
- a Cas9 protein that cannot cleave target nucleic acid (e.g., due to one or more mutations, e.g., in the catalytic domains of the RuvC and HNH domains) can be referred to as a “dead Cas9” or simply “dCas9.” See, e.g., SEQ ID NO: 264.
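By way of non-limiting illustration, the relationship described above between mutations at positions corresponding to D10 and H840 of SEQ ID NO: 5 and the resulting strand-cleavage behavior can be summarized in a Python sketch. The sketch treats "reduced ability to cleave" as a simple boolean, which is a simplification, and the names `StrandActivity` and `classify_cas9_variant` are hypothetical.

```python
from typing import NamedTuple


class StrandActivity(NamedTuple):
    label: str
    cleaves_complementary: bool
    cleaves_non_complementary: bool


def classify_cas9_variant(mutated_residues: set) -> StrandActivity:
    """Classify a variant by which catalytic residues (SEQ ID NO: 5 numbering) are mutated.

    A D10 mutation reduces RuvC function (non-complementary strand cleavage);
    an H840 mutation reduces HNH function (complementary strand cleavage).
    """
    ruvc_mutated = "D10" in mutated_residues
    hnh_mutated = "H840" in mutated_residues
    if ruvc_mutated and hnh_mutated:
        return StrandActivity("dCas9", False, False)
    if ruvc_mutated:
        return StrandActivity("nickase", True, False)
    if hnh_mutated:
        return StrandActivity("nickase", False, True)
    return StrandActivity("no D10/H840 change", True, True)
```

In each case the variant is expected to retain the ability to bind a target nucleic acid, provided that it can still interact with the Cas9 guide RNA.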
- residues D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, and/or A987 of SEQ ID NO: 5 can be altered (i.e., substituted). Also, mutations other than alanine substitutions are suitable.
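By way of non-limiting illustration, a substitution at one of the residues listed above can be applied to a Table 1 motif using the residue numbering of SEQ ID NO: 5. The following Python sketch assumes the motif start positions given in Table 1 (7, 759, 837, and 982) and a mutation written in the form wild-type residue, position, new residue (e.g., D10A); the function name `substitute_in_motif` is hypothetical.

```python
import re


def substitute_in_motif(motif: str, motif_start: int, mutation: str) -> str:
    """Apply a substitution such as 'D10A' to a motif numbered from motif_start.

    Raises ValueError if the position lies outside the motif or if the
    wild-type residue named in the mutation does not match the motif.
    """
    parsed = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if parsed is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    wild_type, position, replacement = parsed.group(1), int(parsed.group(2)), parsed.group(3)
    index = position - motif_start  # convert residue number to 0-based index
    if not 0 <= index < len(motif):
        raise ValueError(f"position {position} is outside the motif")
    if motif[index] != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {motif[index]}")
    return motif[:index] + replacement + motif[index + 1:]
```

For example, applying D10A to motif 1 (residues 7-21) replaces the aspartate at the fourth position of the motif with alanine. Substitutions other than alanine can be expressed in the same way.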
- when a variant Cas9 protein has reduced catalytic activity (e.g., when a Cas9 protein has a D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, and/or A987 mutation of SEQ ID NO: 5 or the corresponding mutations of any of the proteins set forth as SEQ ID NOs: 6-816, e.g., D10A, G12A, G17A, E762A, H840A, N854A, N863A, H982A, H983A, A984A, and/or D986A), the variant Cas9 protein can still bind to target nucleic acid in a site-specific manner (because it is still guided to a target nucleic acid sequence by a Cas9 guide RNA) as long as it retains the ability to interact with the Cas9 guide RNA.
- a variant Cas9 protein can have the same parameters for sequence identity as described above for Cas9 proteins.
- a suitable variant Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more or 100% amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 60% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 70% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 75% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 80% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 85% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 90% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 95% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 99% or more amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 4 motifs, each of motifs 1-4 having 100% amino acid sequence identity to motifs 1-4 of the Cas9 amino acid sequence set forth as SEQ ID NO: 5 (the motifs are in Table 1, below, and are set forth as SEQ ID NOs: 1-4, respectively), or to the corresponding portions in any of the amino acid sequences set forth in SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more, or 100% amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 60% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 protein comprises an amino acid sequence having 70% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 75% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 protein comprises an amino acid sequence having 80% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 85% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 protein comprises an amino acid sequence having 90% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 95% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 protein comprises an amino acid sequence having 99% or more amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 100% amino acid sequence identity to amino acids 7-166 or 731-1003 of the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to the corresponding portions in any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 60% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 99% or more, or 100% amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 60% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 70% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 protein comprises an amino acid sequence having 75% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 80% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 protein comprises an amino acid sequence having 85% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 90% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 protein comprises an amino acid sequence having 95% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a suitable variant Cas9 protein comprises an amino acid sequence having 99% or more amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816. In some cases, a suitable variant Cas9 protein comprises an amino acid sequence having 100% amino acid sequence identity to the Cas9 amino acid sequence set forth in SEQ ID NO: 5, or to any of the amino acid sequences set forth as SEQ ID NOs: 6-816.
- a fusion site-directed modifying polypeptide of the present disclosure comprises a type V CRISPR/Cas protein (polypeptide) (e.g., Cpf1, C2c1, C2c3).
- a fusion site-directed modifying polypeptide of the present disclosure comprises a type VI CRISPR/Cas protein (polypeptide) (e.g., C2c2).
- a fusion site-directed modifying polypeptide of the present disclosure comprises a type V or type VI CRISPR/Cas protein (polypeptide) (e.g., Cpf1, C2c1, C2c2, C2c3).
- a Type V CRISPR/Cas polypeptide is a Cpf1 protein.
- guidance on type V and type VI CRISPR/Cas proteins and their guide RNAs (e.g., Cpf1, C2c1, C2c2, and C2c3 guide RNAs) can be found in the art; see, for example, Zetsche et al., Cell. 2015 Oct. 22; 163(3):759-71; Makarova et al., Nat Rev Microbiol. 2015 November; 13(11):722-36; and Shmakov et al., Mol Cell. 2015 Nov. 5; 60(3):385-97.
- the Type V or type VI CRISPR/Cas polypeptide (e.g., Cpf1, C2c1, C2c2, C2c3) is enzymatically active, e.g., the Type V or type VI CRISPR/Cas polypeptide, when bound to a guide RNA, cleaves a target nucleic acid.
- the Type V or type VI CRISPR/Cas polypeptide exhibits reduced enzymatic activity relative to a corresponding wild-type Type V or type VI CRISPR/Cas polypeptide (e.g., Cpf1, C2c1, C2c2, C2c3), and retains DNA binding activity.
- a type V CRISPR/Cas protein is Cpf1.
- a Cpf1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100%, amino acid sequence identity to the amino acid sequence depicted in FIG. 8 (SEQ ID NOs: 1252-1256).
- a Cpf1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100%, amino acid sequence identity to a contiguous stretch of from 100 amino acids to 200 amino acids (aa), from 200 aa to 400 aa, from 400 aa to 600 aa, from 600 aa to 800 aa, from 800 aa to 1000 aa, from 1000 aa to 1100 aa, from 1100 aa to 1200 aa, or from 1200 aa to 1300 aa, of the amino acid sequence depicted in FIG. 8 (SEQ ID NOs: 1252-1256).
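By way of non-limiting illustration, identity to a contiguous stretch of a reference sequence can be sketched in Python as a sliding-window comparison. The sketch assumes an ungapped comparison in which the query is compared against every window of the same length in the reference; a gapped alignment tool would ordinarily be used in practice. The function name `best_window_identity` is hypothetical, and short toy sequences stand in for the sequences of FIG. 8, which are not reproduced here.

```python
def best_window_identity(query: str, reference: str):
    """Return (percent identity, start index) of the best ungapped window.

    The query is compared against every contiguous stretch of the reference
    that has the same length as the query; the first best window is kept.
    """
    n = len(query)
    if n == 0 or n > len(reference):
        raise ValueError("query must be non-empty and no longer than the reference")
    best_pct, best_start = 0.0, 0
    for start in range(len(reference) - n + 1):
        window = reference[start:start + n]
        matches = sum(q == r for q, r in zip(query, window))
        pct = 100.0 * matches / n
        if pct > best_pct:
            best_pct, best_start = pct, start
    return best_pct, best_start


# Toy sequences for illustration only; not taken from the disclosure.
toy_reference = "MKTAYIAKQRQISFVK"
toy_query = "AYIAKQ"
identity, offset = best_window_identity(toy_query, toy_reference)
```

The returned percentage can then be compared against any of the identity tiers recited above.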
- a Cpf1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100%, amino acid sequence identity to the RuvCI domain of a Cpf1 polypeptide of the amino acid sequence depicted in FIG. 8 (SEQ ID NOs: 1252-1256).
- a Cpf1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100%, amino acid sequence identity to the RuvCII domain of a Cpf1 polypeptide of the amino acid sequence depicted in FIG. 8 (SEQ ID NOs: 1252-1256).
- a Cpf1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100%, amino acid sequence identity to the RuvCIII domain of a Cpf1 polypeptide of the amino acid sequence depicted in FIG. 8 (SEQ ID NOs: 1252-1256).
- the Cpf1 polypeptide exhibits reduced enzymatic activity relative to a wild-type Cpf1 polypeptide (e.g., relative to a Cpf1 polypeptide comprising the amino acid sequence depicted in FIG. 8 (SEQ ID NOS: 1252-1256)), and retains DNA binding activity.
- a Cpf1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the amino acid sequence depicted in FIG.
- a Cpf1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the amino acid sequence depicted in FIG.
- a Cpf1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the amino acid sequence depicted in FIG. 8 (SEQ ID NOS: 1252-1256); and comprises an amino acid substitution (e.g., a D→A substitution) at an amino acid residue corresponding to amino acid 1255 of the amino acid sequence depicted in FIG. 8 (SEQ ID NOS: 1252-1256).
- a suitable Cpf1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to an amino acid sequence depicted in FIG. 8 (SEQ ID NOs: 1252-1256).
- a type V CRISPR/Cas protein is C2c1 (examples include those depicted in FIG. 8 as SEQ ID NOs: 1280-1287).
- a C2c1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the amino acid sequence set forth in SEQ ID NOs: 1280-1287.
- a C2c1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to a contiguous stretch of from 100 amino acids to 200 amino acids (aa), from 200 aa to 400 aa, from 400 aa to 600 aa, from 600 aa to 800 aa, from 800 aa to 1000 aa, from 1000 aa to 1100 aa, from 1100 aa to 1200 aa, or from 1200 aa to 1300 aa, of the amino acid sequence set forth in SEQ ID NOs: 1280-1287.
- a C2c1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the RuvCI domain of a C2c1 polypeptide of the amino acid sequence set forth in SEQ ID NOs: 1280-1287.
- a C2c1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the RuvCII domain of a C2c1 polypeptide of the amino acid sequence set forth in SEQ ID NOs: 1280-1287.
- a C2c1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the RuvCIII domain of a C2c1 polypeptide of the amino acid sequence set forth in SEQ ID NOs: 1280-1287.
- the C2c1 polypeptide exhibits reduced enzymatic activity relative to a wild-type C2c1 polypeptide (e.g., relative to a C2c1 polypeptide comprising the amino acid sequence set forth in SEQ ID NOs: 1280-1287), and retains DNA binding activity.
- a suitable C2c1 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to an amino acid sequence depicted in FIG. 8 (SEQ ID NOs: 1280-1287).
- a type V CRISPR/Cas protein is C2c3 (examples include those depicted in FIG. 8 as SEQ ID NOs: 1290-1293).
- a C2c3 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the amino acid sequence set forth in SEQ ID NOs: 1290-1293.
- a C2c3 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to a contiguous stretch of from 100 amino acids to 200 amino acids (aa), from 200 aa to 400 aa, from 400 aa to 600 aa, from 600 aa to 800 aa, from 800 aa to 1000 aa, from 1000 aa to 1100 aa, from 1100 aa to 1200 aa, or from 1200 aa to 1300 aa, of the amino acid sequence set forth in SEQ ID NOs: 1290-1293.
- a C2c3 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the RuvCI domain of a C2c3 polypeptide of the amino acid sequence set forth in SEQ ID NOs: 1290-1293.
- a C2c3 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the RuvCII domain of a C2c3 polypeptide of the amino acid sequence set forth in SEQ ID NOs: 1290-1293.
- a C2c3 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the RuvCIII domain of a C2c3 polypeptide of the amino acid sequence set forth in SEQ ID NOs: 1290-1293.
- the C2c3 polypeptide exhibits reduced enzymatic activity relative to a wild-type C2c3 polypeptide (e.g., relative to a C2c3 polypeptide comprising the amino acid sequence set forth in SEQ ID NOs: 1290-1293), and retains DNA binding activity.
- a suitable C2c3 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to an amino acid sequence depicted in FIG. 8 (SEQ ID NOs: 1290-1293).
- a type VI CRISPR/Cas protein is C2c2 (examples include those depicted in FIG. 8 as SEQ ID NOs: 1300-1311).
- a C2c2 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the amino acid sequence set forth in SEQ ID NOs: 1300-1311.
- a C2c2 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to a contiguous stretch of from 100 amino acids to 200 amino acids (aa), from 200 aa to 400 aa, from 400 aa to 600 aa, from 600 aa to 800 aa, from 800 aa to 1000 aa, from 1000 aa to 1100 aa, from 1100 aa to 1200 aa, or from 1200 aa to 1300 aa, of the amino acid sequence set forth in SEQ ID NOs: 1300-1311.
- a C2c2 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the RuvCI domain of a C2c2 polypeptide of the amino acid sequence set forth in SEQ ID NOs: 1300-1311.
- a C2c2 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the RuvCII domain of a C2c2 polypeptide of the amino acid sequence set forth in SEQ ID NOs: 1300-1311.
- a C2c2 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to the RuvCIII domain of a C2c2 polypeptide of the amino acid sequence set forth in SEQ ID NOs: 1300-1311.
- the C2c2 polypeptide exhibits reduced enzymatic activity relative to a wild-type C2c2 polypeptide (e.g., relative to a C2c2 polypeptide comprising the amino acid sequence set forth in SEQ ID NOs: 1300-1311), and retains DNA binding activity.
- a suitable C2c2 polypeptide comprises an amino acid sequence having at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 90%, or 100%, amino acid sequence identity to an amino acid sequence depicted in FIG. 8 (SEQ ID NOs: 1300-1311).
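The identity thresholds recited above (at least 30% through 100%) can be checked against a candidate sequence once it has been aligned to a reference. The following is a minimal sketch only: the function names `percent_identity` and `meets_identity_threshold` are hypothetical, the inputs are assumed to be already aligned and gap-padded, and dividing by the number of aligned columns is an assumption (this section does not specify how identity is computed). The example sequences are toy strings, not sequences from FIG. 8.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumption: columns that are gaps in both sequences are ignored, and the
    denominator is the number of remaining aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold_pct: float) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct


# Toy example: 4 identical columns out of 6 aligned columns.
identity = percent_identity("MKT-AY", "MKSLAY")
```

In practice an alignment tool would produce the gap-padded inputs; the sketch covers only the final counting step.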
- a nucleic acid molecule that binds to a class 2 CRISPR/Cas protein (e.g., a Cas9 protein; a type V or type VI CRISPR/Cas protein; a Cpf1 protein; etc.) and targets the complex to a specific location within a target nucleic acid is referred to herein as a “guide RNA” or “CRISPR/Cas guide nucleic acid” or “CRISPR/Cas guide RNA.”
- a guide RNA provides target specificity to the complex (the RNP complex) by including a targeting segment, which includes a guide sequence (also referred to herein as a targeting sequence), which is a nucleotide sequence that is complementary to a sequence of a target nucleic acid.
- a guide RNA can be referred to by the protein to which it corresponds.
- when the class 2 CRISPR/Cas protein is a Cas9 protein (e.g., a fusion Cas9 polypeptide), the corresponding guide RNA can be referred to as a “Cas9 guide RNA.”
- when the class 2 CRISPR/Cas protein is a Cpf1 protein, the corresponding guide RNA can be referred to as a “Cpf1 guide RNA.”
- a guide RNA includes two separate nucleic acid molecules: an “activator” and a “targeter” and is referred to herein as a “dual guide RNA”, a “double-molecule guide RNA”, a “two-molecule guide RNA”, or a “dgRNA.”
- the guide RNA is one molecule (e.g., for some class 2 CRISPR/Cas proteins, the corresponding guide RNA is a single molecule; and in some cases, an activator and targeter are covalently linked to one another, e.g., via intervening nucleotides), and the guide RNA is referred to as a “single guide RNA”, a “single-molecule guide RNA,” a “one-molecule guide RNA”, or simply “sgRNA.”
- a nucleic acid molecule that binds to a CRISPR/Cas9 protein and targets the complex to a specific location within a target nucleic acid is referred to herein as a “Cas9 guide RNA.”
- a Cas9 guide RNA can be said to include two segments, a first segment (referred to herein as a “targeting segment”); and a second segment (referred to herein as a “protein-binding segment”).
- “segment” means a segment/section/region of a molecule, e.g., a contiguous stretch of nucleotides in a nucleic acid molecule.
- a segment can also mean a region/section of a complex such that a segment may comprise regions of more than one molecule.
- the first segment (targeting segment) of a Cas9 guide RNA includes a nucleotide sequence (a guide sequence) that is complementary to (and therefore hybridizes with) a specific sequence (a target site) within a target nucleic acid (e.g., a target ssRNA, a target ssDNA, the complementary strand of a double stranded target DNA, etc.).
- the protein-binding segment (or “protein-binding sequence”) interacts with (binds to) a Cas9 polypeptide.
- the protein-binding segment of a subject Cas9 guide RNA includes two complementary stretches of nucleotides that hybridize to one another to form a double stranded RNA duplex (dsRNA duplex).
- Site-specific binding and/or cleavage of a target nucleic acid can occur at locations (e.g., target sequence of a target locus) determined by base-pairing complementarity between the Cas9 guide RNA (the guide sequence of the Cas9 guide RNA) and the target nucleic acid.
- the Cas9 guide RNA provides target specificity to the complex by including a targeting segment, which includes a guide sequence (a nucleotide sequence that is complementary to a sequence of a target nucleic acid).
- the Cas9 protein of the complex provides the site-specific activity (e.g., cleavage activity or an activity provided by the Cas9 protein when the Cas9 protein is a Cas9 fusion polypeptide, i.e., has a fusion partner).
- the Cas9 protein is guided to a target nucleic acid sequence (e.g. a target sequence in a chromosomal nucleic acid, e.g., a chromosome; a target sequence in an extrachromosomal nucleic acid, e.g. an episomal nucleic acid, a minicircle, an ssRNA, an ssDNA, etc.; a target sequence in a mitochondrial nucleic acid; a target sequence in a chloroplast nucleic acid; a target sequence in a plasmid; a target sequence in a viral nucleic acid; etc.) by virtue of its association with the Cas9 guide RNA.
- the “guide sequence” also referred to as the “targeting sequence” of a Cas9 guide RNA can be modified so that the Cas9 guide RNA can target a Cas9 protein, e.g., a fusion Cas9 polypeptide, to any desired sequence of any desired target nucleic acid, with the exception (e.g., as described herein) that the PAM sequence can be taken into account.
- a Cas9 guide RNA can have a targeting segment with a sequence that has complementarity with (e.g., can hybridize to) a sequence in a nucleic acid in a eukaryotic cell, e.g., a viral nucleic acid, a eukaryotic nucleic acid (e.g., a eukaryotic chromosome, chromosomal sequence, a eukaryotic RNA, etc.), and the like.
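Because the guide sequence is complementary to (and therefore hybridizes with) the target site, a guide sequence for a chosen target site can be derived by reverse-complementing the target strand into RNA. The sketch below assumes a DNA target site written 5′ to 3′ and plain Watson-Crick pairing; the function name `guide_sequence_for` is hypothetical, and PAM requirements, which are mentioned above as something to take into account, are deliberately not modeled.

```python
# DNA base on the target strand -> RNA base that pairs with it.
_DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def guide_sequence_for(target_site_dna: str) -> str:
    """Return the RNA guide sequence (5'->3') complementary to a DNA target site (5'->3').

    The strands pair antiparallel, so the target is read in reverse while
    each base is replaced by its RNA complement.
    """
    target = target_site_dna.upper()
    try:
        return "".join(_DNA_TO_RNA_COMPLEMENT[base] for base in reversed(target))
    except KeyError as exc:
        raise ValueError(f"unexpected base in target site: {exc.args[0]}") from None


# Toy example (not a sequence from the disclosure).
guide = guide_sequence_for("ATGC")
```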
- a Cas9 guide RNA includes two separate nucleic acid molecules: an “activator” and a “targeter” and is referred to herein as a “dual Cas9 guide RNA”, a “double-molecule Cas9 guide RNA”, a “two-molecule Cas9 guide RNA”, a “dual guide RNA”, or a “dgRNA.”
- the activator and targeter are covalently linked to one another (e.g., via intervening nucleotides) and the guide RNA is referred to as a “single guide RNA”, a “Cas9 single guide RNA”, a “single-molecule Cas9 guide RNA,” or a “one-molecule Cas9 guide RNA”, or simply “sgRNA.”
- a Cas9 guide RNA comprises a crRNA-like (“CRISPR RNA”/“targeter”/“crRNA”/“crRNA repeat”) molecule and a corresponding tracrRNA-like (“trans-acting CRISPR RNA”/“activator”/“tracrRNA”) molecule.
- a crRNA-like molecule comprises both the targeting segment (single stranded) of the Cas9 guide RNA and a stretch (“duplex-forming segment”) of nucleotides that forms one half of the dsRNA duplex of the protein-binding segment of the Cas9 guide RNA.
- a corresponding tracrRNA-like molecule comprises a stretch of nucleotides (duplex-forming segment) that forms the other half of the dsRNA duplex of the protein-binding segment of the guide nucleic acid.
- a stretch of nucleotides of a crRNA-like molecule is complementary to and hybridizes with a stretch of nucleotides of a tracrRNA-like molecule to form the dsRNA duplex of the protein-binding domain of the Cas9 guide RNA.
- each targeter molecule can be said to have a corresponding activator molecule (which has a region that hybridizes with the targeter).
- the targeter molecule additionally provides the targeting segment.
- a targeter and an activator molecule hybridize to form a Cas9 guide RNA.
- the exact sequence of a given crRNA or tracrRNA molecule is characteristic of the species in which the RNA molecules are found.
- a subject dual Cas9 guide RNA can include any corresponding activator and targeter pair.
- the term “activator” or “activator RNA” is used herein to mean a tracrRNA-like molecule (tracrRNA: “trans-acting CRISPR RNA”) of a Cas9 dual guide RNA (and therefore of a Cas9 single guide RNA when the “activator” and the “targeter” are linked together by, e.g., intervening nucleotides).
- a Cas9 guide RNA (dgRNA or sgRNA) comprises an activator sequence (e.g., a tracrRNA sequence).
- a tracr molecule is a naturally existing molecule that hybridizes with a CRISPR RNA molecule (a crRNA) to form a Cas9 dual guide RNA.
- the term “activator” is used herein to encompass naturally existing tracrRNAs, but also to encompass tracrRNAs with modifications (e.g., truncations, sequence variations, base modifications, backbone modifications, linkage modifications, etc.) where the activator retains at least one function of a tracrRNA (e.g., contributes to the dsRNA duplex to which Cas9 protein binds). In some cases the activator provides one or more stem loops that can interact with Cas9 protein.
- An activator can be referred to as having a tracr sequence (tracrRNA sequence) and in some cases is a tracrRNA, but the term “activator” is not limited to naturally existing tracrRNAs.
- the term “targeter” or “targeter RNA” is used herein to refer to a crRNA-like molecule (crRNA: “CRISPR RNA”) of a Cas9 dual guide RNA (and therefore of a Cas9 single guide RNA when the “activator” and the “targeter” are linked together, e.g., by intervening nucleotides).
- a Cas9 guide RNA (dgRNA or sgRNA) comprises a targeting segment (which includes nucleotides that hybridize with (are complementary to) a target nucleic acid), and a duplex-forming segment (e.g., a duplex forming segment of a crRNA, which can also be referred to as a crRNA repeat).
- because the sequence of a targeting segment (the segment that hybridizes with a target sequence of a target nucleic acid) of a targeter is modified by a user to hybridize with a desired target nucleic acid, the sequence of a targeter will often be a non-naturally occurring sequence.
- the duplex-forming segment of a targeter (described in more detail below), which hybridizes with the duplex-forming segment of an activator, can include a naturally existing sequence (e.g., can include the sequence of a duplex-forming segment of a naturally existing crRNA, which can also be referred to as a crRNA repeat).
- the term “targeter” is used herein to distinguish from naturally occurring crRNAs, despite the fact that part of a targeter (e.g., the duplex-forming segment) often includes a naturally occurring sequence from a crRNA. However, the term “targeter” encompasses naturally occurring crRNAs.
- a Cas9 guide RNA can also be said to include 3 parts: (i) a targeting sequence (a nucleotide sequence that hybridizes with a sequence of the target nucleic acid); (ii) an activator sequence (as described above) (in some cases, referred to as a tracr sequence); and (iii) a sequence that hybridizes to at least a portion of the activator sequence to form a double stranded duplex.
- a targeter has (i) and (iii); while an activator has (ii).
- a Cas9 guide RNA (e.g. a dual guide RNA or a single guide RNA) can be comprised of any corresponding activator and targeter pair.
- the duplex forming segments can be swapped between the activator and the targeter.
- the targeter includes a sequence of nucleotides from a duplex forming segment of a tracrRNA (which sequence would normally be part of an activator) while the activator includes a sequence of nucleotides from a duplex forming segment of a crRNA (which sequence would normally be part of a targeter).
- a targeter comprises both the targeting segment (single stranded) of the Cas9 guide RNA and a stretch (“duplex-forming segment”) of nucleotides that forms one half of the dsRNA duplex of the protein-binding segment of the Cas9 guide RNA.
- a corresponding tracrRNA-like molecule comprises a stretch of nucleotides (a duplex-forming segment) that forms the other half of the dsRNA duplex of the protein-binding segment of the Cas9 guide RNA.
- a stretch of nucleotides of the targeter is complementary to and hybridizes with a stretch of nucleotides of the activator to form the dsRNA duplex of the protein-binding segment of a Cas9 guide RNA.
- each targeter can be said to have a corresponding activator (which has a region that hybridizes with the targeter).
- the targeter molecule additionally provides the targeting segment.
- a targeter and an activator hybridize to form a Cas9 guide RNA.
- the particular sequence of a given naturally existing crRNA or tracrRNA molecule is characteristic of the species in which the RNA molecules are found. Examples of suitable activators and targeters are well known in the art.
- Non-limiting examples of nucleotide sequences that can be included in a Cas9 guide RNA include sequences set forth in SEQ ID NOs: 827-1075, or complements thereof.
- sequences from SEQ ID NOs: 827-957 (which are from tracrRNAs) or complements thereof can pair with sequences from SEQ ID NOs: 964-1075 (which are from crRNAs), or complements thereof, to form a dsRNA duplex of a protein binding segment.
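The pairing of a targeter duplex-forming segment with an activator duplex-forming segment can be illustrated by checking how many positions base-pair when the two stretches are laid antiparallel. This is a sketch under stated assumptions: the function name `paired_fraction` is hypothetical, the two stretches are assumed to be of equal length with no bulges or loops, only Watson-Crick pairs are counted (wobble pairs are not), and the example strings are toy sequences rather than any of the SEQ ID NOs listed above.

```python
# Watson-Crick RNA pairs counted by this sketch (wobble pairing is NOT modeled).
_RNA_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def paired_fraction(targeter_segment: str, activator_segment: str) -> float:
    """Fraction of positions that pair when two RNA stretches (each written
    5'->3') are aligned antiparallel to form a duplex."""
    targeter = targeter_segment.upper()
    # Reverse the activator so that position i faces position i of the targeter.
    activator_3_to_5 = activator_segment.upper()[::-1]
    if not targeter or len(targeter) != len(activator_3_to_5):
        raise ValueError("segments must be non-empty and of equal length")
    paired = sum(
        1 for x, y in zip(targeter, activator_3_to_5) if (x, y) in _RNA_PAIRS
    )
    return paired / len(targeter)


# Toy example: fully complementary stretches form a perfect duplex.
fraction = paired_fraction("GCAUAC", "GUAUGC")
```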
- the first segment of a subject guide nucleic acid includes a guide sequence (i.e., a targeting sequence), which is a nucleotide sequence that is complementary to a sequence (a target site) in a target nucleic acid.
- the targeting segment of a subject guide nucleic acid can interact with a target nucleic acid (e.g., double stranded DNA (dsDNA)) in a sequence-specific manner via hybridization (i.e., base pairing).
- the nucleotide sequence of the targeting segment may vary (depending on the target) and can determine the location within the target nucleic acid at which the Cas9 guide RNA and the target nucleic acid will interact.
- the targeting segment of a Cas9 guide RNA can be modified (e.g., by genetic engineering)/designed to hybridize to any desired sequence (target site) within a target nucleic acid (e.g., a eukaryotic target nucleic acid such as genomic DNA).
- the targeting segment can have a length of 7 or more nucleotides (nt) (e.g., 8 or more, 9 or more, 10 or more, 12 or more, 15 or more, 20 or more, 25 or more, 30 or more, or 40 or more nucleotides).
- the targeting segment can have a length of from 7 to 100 nucleotides (nt) (e.g., from 7 to 80 nt, from 7 to 60 nt, from 7 to 40 nt, from 7 to 30 nt, from 7 to 25 nt, from 7 to 22 nt, from 7 to 20 nt, from 7 to 18 nt, from 8 to 80 nt, from 8 to 60 nt, from 8 to 40 nt, from 8 to 30 nt, from 8 to 25 nt, from 8 to 22 nt, from 8 to 20 nt, from 8 to 18 nt, from 10 to 100 nt, from 10 to 80 nt, from 10 to 60 nt, from 10 to 40 nt, from 10 to 30 nt, from 10 to 25 nt, from 10 to 22 nt, from 10 to 20 nt, from 10 to 18 nt, from 12 to 100 nt, from 12 to 80 nt, from 12 to 60 nt
- the nucleotide sequence (the targeting sequence) of the targeting segment that is complementary to a nucleotide sequence (target site) of the target nucleic acid can have a length of 10 nt or more.
- the targeting sequence of the targeting segment that is complementary to a target site of the target nucleic acid can have a length of 12 nt or more, 15 nt or more, 18 nt or more, 19 nt or more, or 20 nt or more.
- the nucleotide sequence (the targeting sequence) of the targeting segment that is complementary to a nucleotide sequence (target site) of the target nucleic acid has a length of 12 nt or more.
- the nucleotide sequence (the targeting sequence) of the targeting segment that is complementary to a nucleotide sequence (target site) of the target nucleic acid has a length of 18 nt or more.
- the targeting sequence of the targeting segment that is complementary to a target sequence of the target nucleic acid can have a length of from 10 to 100 nucleotides (nt) (e.g., from 10 to 90 nt, from 10 to 75 nt, from 10 to 60 nt, from 10 to 50 nt, from 10 to 35 nt, from 10 to 30 nt, from 10 to 25 nt, from 10 to 22 nt, from 10 to 20 nt, from 12 to 100 nt, from 12 to 90 nt, from 12 to 75 nt, from 12 to 60 nt, from 12 to 50 nt, from 12 to 35 nt, from 12 to 30 nt, from 12 to 25 nt, from 12 to 22 nt, from 12 to 20 nt, from 15 to 100 nt, from 15 to 90 nt, from 15 to 75 nt, from 15 to 60 nt, from 15 to 50 nt, from 15 to 35 nt, from 15 to 30 nt
- the targeting sequence of the targeting segment that is complementary to a target sequence of the target nucleic acid has a length of from 15 nt to 30 nt. In some cases, the targeting sequence of the targeting segment that is complementary to a target sequence of the target nucleic acid has a length of from 15 nt to 25 nt. In some cases, the targeting sequence of the targeting segment that is complementary to a target sequence of the target nucleic acid has a length of from 18 nt to 30 nt. In some cases, the targeting sequence of the targeting segment that is complementary to a target sequence of the target nucleic acid has a length of from 18 nt to 25 nt.
- the targeting sequence of the targeting segment that is complementary to a target sequence of the target nucleic acid has a length of from 18 nt to 22 nt. In some cases, the targeting sequence of the targeting segment that is complementary to a target site of the target nucleic acid is 20 nucleotides in length. In some cases, the targeting sequence of the targeting segment that is complementary to a target site of the target nucleic acid is 19 nucleotides in length.
- the percent complementarity between the targeting sequence (guide sequence) of the targeting segment and the target site of the target nucleic acid can be 60% or more (e.g., 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 97% or more, 98% or more, 99% or more, or 100%). In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the seven contiguous 5′-most nucleotides of the target site of the target nucleic acid. In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 60% or more over about 20 contiguous nucleotides.
- the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the fourteen contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% or more over the remainder. In such a case, the targeting sequence can be considered to be 14 nucleotides in length. In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the seven contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% or more over the remainder. In such a case, the targeting sequence can be considered to be 7 nucleotides in length.
- the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 7 contiguous 5′-most nucleotides of the target site of the target nucleic acid (which can be complementary to the 3′-most nucleotides of the targeting sequence of the Cas9 guide RNA). In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 8 contiguous 5′-most nucleotides of the target site of the target nucleic acid (which can be complementary to the 3′-most nucleotides of the targeting sequence of the Cas9 guide RNA).
- the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 9 contiguous 5′-most nucleotides of the target site of the target nucleic acid (which can be complementary to the 3′-most nucleotides of the targeting sequence of the Cas9 guide RNA). In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 10 contiguous 5′-most nucleotides of the target site of the target nucleic acid (which can be complementary to the 3′-most nucleotides of the targeting sequence of the Cas9 guide RNA).
- the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 17 contiguous 5′-most nucleotides of the target site of the target nucleic acid (which can be complementary to the 3′-most nucleotides of the targeting sequence of the Cas9 guide RNA). In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 18 contiguous 5′-most nucleotides of the target site of the target nucleic acid (which can be complementary to the 3′-most nucleotides of the targeting sequence of the Cas9 guide RNA).
- the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 60% or more (e.g., 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 97% or more, 98% or more, 99% or more, or 100%) over about 20 contiguous nucleotides.
- the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 7 contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% or more over the remainder. In such a case, the targeting sequence can be considered to be 7 nucleotides in length. In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 8 contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% or more over the remainder. In such a case, the targeting sequence can be considered to be 8 nucleotides in length.
- the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 9 contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% or more over the remainder. In such a case, the targeting sequence can be considered to be 9 nucleotides in length. In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 10 contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% or more over the remainder. In such a case, the targeting sequence can be considered to be 10 nucleotides in length.
- the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 11 contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% or more over the remainder. In such a case, the targeting sequence can be considered to be 11 nucleotides in length. In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 12 contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% or more over the remainder. In such a case, the targeting sequence can be considered to be 12 nucleotides in length.
- the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 13 contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% or more over the remainder. In such a case, the targeting sequence can be considered to be 13 nucleotides in length. In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 14 contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% or more over the remainder. In such a case, the targeting sequence can be considered to be 14 nucleotides in length.
- the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 17 contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% or more over the remainder. In such a case, the targeting sequence can be considered to be 17 nucleotides in length. In some cases, the percent complementarity between the targeting sequence of the targeting segment and the target site of the target nucleic acid is 100% over the 18 contiguous 5′-most nucleotides of the target site of the target nucleic acid and as low as 0% or more over the remainder. In such a case, the targeting sequence can be considered to be 18 nucleotides in length.
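The percent-complementarity statements above can be illustrated with a short calculation. The following is a minimal sketch, not a method taken from the disclosure: the function name, the example sequences, and the decision to count only Watson-Crick pairs (no G:U wobble) are assumptions made for illustration. It pairs the 5′-most nucleotides of a DNA target site with the 3′-most nucleotides of an RNA targeting sequence, as described above.

```python
# Watson-Crick pairs between a guide RNA base and a target DNA base.
# Assumption: G:U wobble pairs are NOT counted as complementary here.
RNA_DNA_PAIRS = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(targeting_seq, target_site, n=None):
    """Percent complementarity over the n 5'-most nucleotides of the target site.

    targeting_seq: RNA targeting sequence, written 5' to 3'.
    target_site:   DNA target site (the strand that hybridizes), written 5' to 3'.
    n:             number of contiguous 5'-most target nucleotides to score
                   (defaults to the full length).
    """
    if len(targeting_seq) != len(target_site):
        raise ValueError("sequences must have the same length")
    if n is None:
        n = len(target_site)
    if not 0 < n <= len(target_site):
        raise ValueError("n must be between 1 and the sequence length")

    # Hybridization is antiparallel: the 5'-most target nucleotide pairs
    # with the 3'-most nucleotide of the targeting sequence.
    guide_3_to_5 = targeting_seq.upper()[::-1]
    target = target_site.upper()
    matches = sum(
        (g, t) in RNA_DNA_PAIRS for g, t in zip(guide_3_to_5[:n], target[:n])
    )
    return 100.0 * matches / n


# Hypothetical 10-nt example: a perfect match, and a target with one
# mismatch at its 3' end (which pairs with the 5' end of the guide).
guide = "AAGUAAAACC"
perfect_target = "GGTTTTACTT"
mismatched_target = "GGTTTTACTA"

full_match = percent_complementarity(guide, perfect_target)
seed_match = percent_complementarity(guide, mismatched_target, n=7)
overall = percent_complementarity(guide, mismatched_target)
```

In this sketch the mismatched target is still 100% complementary over its 7 contiguous 5′-most nucleotides while being less than 100% complementary over its full length, which is the situation the passages above describe.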
- Examples of various Cas9 guide RNAs can be found in the art, for example, see Jinek et al., Science. 2012 Aug. 17; 337(6096):816-21; Chylinski et al., RNA Biol. 2013 May; 10(5):726-37; Ma et al., Biomed Res Int. 2013; 2013:270805; Hou et al., Proc Natl Acad Sci USA. 2013 Sep. 24; 110(39):15644-9; Jinek et al., Elife. 2013; 2:e00471; Pattanayak et al., Nat Biotechnol. 2013 September; 31(9):839-43; Qi et al, Cell. 2013 Feb.
- a nucleic acid molecule that binds to a type V or type VI CRISPR/Cas protein (e.g., Cpf1, C2c1, C2c2, C2c3), and targets the complex to a specific location within a target nucleic acid is referred to herein generally as a “type V or type VI CRISPR/Cas guide RNA”.
- An example of a more specific term for a type V or type VI CRISPR/Cas guide RNA is a “Cpf1 guide RNA.”
- a type V or type VI CRISPR/Cas guide RNA can have a total length of from 30 nucleotides (nt) to 200 nt, e.g., from 30 nt to 180 nt, from 30 nt to 160 nt, from 30 nt to 150 nt, from 30 nt to 125 nt, from 30 nt to 100 nt, from 30 nt to 90 nt, from 30 nt to 80 nt, from 30 nt to 70 nt, from 30 nt to 60 nt, from 30 nt to 50 nt, from 50 nt to 200 nt, from 50 nt to 180 nt, from 50 nt to 160 nt, from 50 nt to 150 nt, from 50 nt to 125 nt, from 50 nt to 100 nt, from 50 nt to 90 nt, from 50 nt
- a type V or type VI CRISPR/Cas guide RNA (e.g., cpf1 guide RNA) has a total length of at least 30 nt (e.g., at least 40 nt, at least 50 nt, at least 60 nt, at least 70 nt, at least 80 nt, at least 90 nt, at least 100 nt, or at least 120 nt).
- a Cpf1 guide RNA has a total length of 35 nt, 36 nt, 37 nt, 38 nt, 39 nt, 40 nt, 41 nt, 42 nt, 43 nt, 44 nt, 45 nt, 46 nt, 47 nt, 48 nt, 49 nt, or 50 nt.
- a type V or type VI CRISPR/Cas guide RNA can include a target nucleic acid-binding segment and a duplex-forming region (e.g., in some cases formed from two duplex-forming segments, i.e., two stretches of nucleotides that hybridize to one another to form a duplex).
- the target nucleic acid-binding segment of a type V or type VI CRISPR/Cas guide RNA can have a length of from 15 nt to 30 nt, e.g., 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, 26 nt, 27 nt, 28 nt, 29 nt, or 30 nt.
- the target nucleic acid-binding segment has a length of 23 nt.
- the target nucleic acid-binding segment has a length of 24 nt.
- the target nucleic acid-binding segment has a length of 25 nt.
- the guide sequence of a type V or type VI CRISPR/Cas guide RNA can have a length of from 15 nt to 30 nt (e.g., 15 to 25 nt, 15 to 24 nt, 15 to 23 nt, 15 to 22 nt, 15 to 21 nt, 15 to 20 nt, 15 to 19 nt, 15 to 18 nt, 17 to 30 nt, 17 to 25 nt, 17 to 24 nt, 17 to 23 nt, 17 to 22 nt, 17 to 21 nt, 17 to 20 nt, 17 to 19 nt, 17 to 18 nt, 18 to 30 nt, 18 to 25 nt, 18 to 24 nt, 18 to 23 nt, 18 to 22 nt, 18 to 21 nt, 18 to 20 nt, 18 to 19 nt, 19 to 30 nt, 19 to 25 nt, 19 to 24 nt, 19
- the guide sequence has a length of 17 nt. In some cases, the guide sequence has a length of 18 nt. In some cases, the guide sequence has a length of 19 nt. In some cases, the guide sequence has a length of 20 nt. In some cases, the guide sequence has a length of 21 nt. In some cases, the guide sequence has a length of 22 nt. In some cases, the guide sequence has a length of 23 nt. In some cases, the guide sequence has a length of 24 nt.
- the guide sequence of a type V or type VI CRISPR/Cas guide RNA can have 100% complementarity with a corresponding length of target nucleic acid sequence.
- the guide sequence can have less than 100% complementarity with a corresponding length of target nucleic acid sequence.
- the guide sequence of a type V or type VI CRISPR/Cas guide RNA e.g., cpf1 guide RNA
- the target nucleic acid-binding segment has 100% complementarity to the target nucleic acid sequence.
- the target nucleic acid-binding segment has 1 non-complementary nucleotide and 24 complementary nucleotides with the target nucleic acid sequence.
- the target nucleic acid-binding segment has 2 non-complementary nucleotides and 23 complementary nucleotides with the target nucleic acid sequence.
- the duplex-forming segment of a type V or type VI CRISPR/Cas guide RNA (e.g., cpf1 guide RNA) (e.g., of a targeter RNA or an activator RNA) can have a length of from 15 nt to 25 nt (e.g., 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, or 25 nt).
- a targeter RNA or an activator RNA can have a length of from 15 nt to 25 nt (e.g., 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 n
- the RNA duplex of a type V or type VI CRISPR/Cas guide RNA can have a length of from 5 base pairs (bp) to 40 bp (e.g., from 5 to 35 bp, 5 to 30 bp, 5 to 25 bp, 5 to 20 bp, 5 to 15 bp, 5-12 bp, 5-10 bp, 5-8 bp, 6 to 40 bp, 6 to 35 bp, 6 to 30 bp, 6 to 25 bp, 6 to 20 bp, 6 to 15 bp, 6 to 12 bp, 6 to 10 bp, 6 to 8 bp, 7 to 40 bp, 7 to 35 bp, 7 to 30 bp, 7 to 25 bp, 7 to 20 bp, 7 to 15 bp, 7 to 12 bp, 7 to 10 bp, 8 to 40 bp, 8 to 35 bp, 8 to 30 bp, 7 to 25 bp, 7 to 20 b
- a duplex-forming segment of a Cpf1 guide RNA can comprise a nucleotide sequence selected from (5′ to 3′): AAUUUCUACUGUUGUAGAU (SEQ ID NO: 1257), AAUUUCUGCUGUUGCAGAU (SEQ ID NO: 1258), AAUUUCCACUGUUGUGGAU (SEQ ID NO: 1259), AAUUCCUACUGUUGUAGGU (SEQ ID NO: 1260), AAUUUCUACUAUUGUAGAU (SEQ ID NO: 1261), AAUUUCUACUGCUGUAGAU (SEQ ID NO: 1262), AAUUUCUACUUUGUAGAU (SEQ ID NO: 1263), and AAUUUCUACUUGUAGAU (SEQ ID NO: 1264).
- the guide sequence can then follow (5′ to 3′) the duplex forming segment.
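As a minimal sketch of the arrangement just described, the snippet below joins a duplex-forming segment to a guide sequence in the 5′ to 3′ order stated above. The duplex-forming segment is SEQ ID NO: 1257 from the text and the 15 nt to 30 nt guide length range is the one recited above; the function name and the 23-nt guide sequence are hypothetical and chosen only for illustration.

```python
# Duplex-forming segment of a Cpf1 guide RNA (SEQ ID NO: 1257 in the text).
CPF1_DUPLEX_SEGMENT = "AAUUUCUACUGUUGUAGAU"

RNA_ALPHABET = set("ACGU")


def build_cpf1_guide(duplex_segment, guide_sequence):
    """Return duplex-forming segment followed (5' to 3') by the guide sequence.

    Hypothetical helper: checks that both parts are RNA and that the guide
    sequence falls in the 15 nt to 30 nt range recited in the text.
    """
    duplex = duplex_segment.upper()
    guide = guide_sequence.upper()
    for name, seq in (("duplex segment", duplex), ("guide sequence", guide)):
        if not seq or set(seq) - RNA_ALPHABET:
            raise ValueError(f"{name} must be a non-empty RNA sequence (A, C, G, U)")
    if not 15 <= len(guide) <= 30:
        raise ValueError("guide sequence must be 15 nt to 30 nt long")
    return duplex + guide


# Hypothetical 23-nt guide sequence, for illustration only.
example_guide = "GAGUCUCUCAGCUGGUACACGGC"
cpf1_guide_rna = build_cpf1_guide(CPF1_DUPLEX_SEGMENT, example_guide)
total_length = len(cpf1_guide_rna)
```

With a 19-nt duplex-forming segment and a 23-nt guide sequence, the total length falls within the 35 nt to 50 nt range listed above for a Cpf1 guide RNA.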
- an activator RNA e.g. tracrRNA
- a C2c1 guide RNA dual guide or single guide
- RNA that includes the nucleotide sequence GAAUUUUUCAACGGGUGUGCCAAUGGCCACUUUCCAGGUGGCAAAGCCCGUUGAGCUUCUCAAAAAG (SEQ ID NO: 1265).
- a C2c1 guide RNA is an RNA that includes the nucleotide sequence GUCUAGAGGACAGAAUUUUUCAACGGGUGUGCCAAUGGCCACUUUCCAGGUGGCAAAGCCCGUUGAGCUUCUCAAAAAG (SEQ ID NO: 1266).
- a C2c1 guide RNA is an RNA that includes the nucleotide sequence UCUAGAGGACAGAAUUUUUCAACGGGUGUGCCAAUGGCCACUUUCCAGGUGGCAAAGCCCGUUGAGCUUCUCAAAAAG (SEQ ID NO: 1267).
- a non-limiting example of an activator RNA (e.g. tracrRNA) of a C2c1 guide RNA is an RNA that includes the nucleotide sequence ACUUUCCAGGCAAAGCCCGUUGAGCUUCUCAAAAAG (SEQ ID NO: 1268).
- a duplex forming segment of a C2c1 guide RNA (dual guide or single guide) of an activator RNA includes the nucleotide sequence AGCUUCUCA (SEQ ID NO: 1269) or the nucleotide sequence GCUUCUCA (SEQ ID NO: 1270) (the duplex forming segment from a naturally existing tracrRNA).
- a non-limiting example of a targeter RNA (e.g. crRNA) of a C2c1 guide RNA (dual guide or single guide) is an RNA with the nucleotide sequence CUGAGAAGUGGCACNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN (SEQ ID NO: 1271), where the Ns represent the guide sequence, which will vary depending on the target sequence, and although 20 Ns are depicted a range of different lengths are acceptable.
- a duplex forming segment of a C2c1 guide RNA (dual guide or single guide) of a targeter RNA (e.g., crRNA) includes the nucleotide sequence CUGAGAAGUGGCAC (SEQ ID NO: 1272) or includes the nucleotide sequence CUGAGAAGU (SEQ ID NO: 1273) or includes the nucleotide sequence UGAGAAGUGGCAC (SEQ ID NO: 1274) or includes the nucleotide sequence UGAGAAGU (SEQ ID NO: 1275).
- type V or type VI CRISPR/Cas guide RNAs (e.g., cpf1, C2c1, C2c2, and C2c3 guide RNAs) can be found in the art, for example, see Zetsche et al, Cell. 2015 Oct. 22; 163(3):759-71; Makarova et al, Nat Rev Microbiol. 2015 November; 13(11):722-36; and Shmakov et al., Mol Cell. 2015 Nov. 5; 60(3):385-97.
- an RNP of the present disclosure comprises: a) a fusion site-directed modifying polypeptide (e.g., a class 2 CRISPR/Cas polypeptide) of the present disclosure; b) a guide RNA; and c) a donor DNA polynucleotide.
- a method of the present disclosure for modifying a target nucleic acid comprises contacting a eukaryotic cell comprising a target nucleic acid with an RNP of the present disclosure, where the RNP comprises: a) a fusion site-directed modifying polypeptide of the present disclosure (e.g., a class 2 CRISPR/Cas polypeptide); b) a guide RNA; and c) a donor DNA polynucleotide.
- the contacting occurs under conditions that are permissive for nonhomologous end joining (NHEJ) or homology-directed repair (HDR).
- the target DNA is contacted with the donor polynucleotide (donor DNA template), wherein the donor polynucleotide, a portion of the donor polynucleotide, a copy of the donor polynucleotide, or a portion of a copy of the donor polynucleotide integrates into the target DNA.
- the donor polynucleotide comprises a nucleotide sequence that includes at least a segment with homology to the target DNA sequence
- the subject methods may be used to add, i.e. insert or replace, nucleic acid material to a target DNA sequence (e.g.
- a tag e.g., 6×His, a fluorescent protein (e.g., a green fluorescent protein; a yellow fluorescent protein, etc.), hemagglutinin (HA), FLAG, etc.
- a regulatory sequence e.g., a promoter, a polyadenylation signal, an internal ribosome entry sequence (IRES), a 2A peptide, a start codon, a stop codon, a splice signal, a localization signal, etc.
- a nucleic acid sequence e.g., introduce a mutation
- a complex comprising a guide RNA and a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure is useful in any in vitro or in vivo application in which it is desirable to modify DNA in a site-specific, i.e. “targeted”, way, for example gene knock-out, gene knock-in, gene editing, gene tagging, etc., as used in, for example, gene therapy, e.g.
- a disease or as an antiviral, anti-pathogenic, or anticancer therapeutic, the production of genetically modified organisms in agriculture, the large scale production of proteins by cells for therapeutic, diagnostic, or research purposes, the induction of iPS cells, biological research, the targeting of genes of pathogens for deletion or replacement, etc.
- a polynucleotide comprising a donor sequence to be inserted is also provided to the cell, e.g., a donor polynucleotide is included in an RNP of the present disclosure.
- By a “donor sequence” or “donor polynucleotide” it is meant a nucleic acid sequence to be inserted at the cleavage site induced by a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure.
- the donor polynucleotide will contain sufficient homology to a genomic sequence at the cleavage site, e.g. 70%, 80%, 85%, 90%, 95%, or 100% homology with the nucleotide sequences flanking the cleavage site, e.g. within about 50 bases or less of the cleavage site, e.g. within about 30 bases, within about 15 bases, within about 10 bases, within about 5 bases, or immediately flanking the cleavage site, to support homology-directed repair between it and the genomic sequence to which it bears homology.
- Donor sequences can be of any length, e.g. 10 nucleotides or more, 50 nucleotides or more, 100 nucleotides or more, 250 nucleotides or more, 500 nucleotides or more, 1000 nucleotides or more, 5000 nucleotides or more, etc.
- the donor sequence is typically not identical to the genomic sequence that it replaces. Rather, the donor sequence may contain one or more single base changes, insertions, deletions, inversions or rearrangements with respect to the genomic sequence, so long as sufficient homology is present to support homology-directed repair.
- the donor sequence comprises a non-homologous sequence flanked by two regions of homology, such that homology-directed repair between the target DNA region and the two flanking sequences results in insertion of the non-homologous sequence at the target region.
- Donor sequences may also comprise a vector backbone containing sequences that are not homologous to the DNA region of interest and that are not intended for insertion into the DNA region of interest.
- the homologous region(s) of a donor sequence will have at least 50% sequence identity to a genomic sequence with which recombination is desired. In certain embodiments, 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9% sequence identity is present. Any value between 1% and 100% sequence identity can be present, depending upon the length of the donor polynucleotide.
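The donor design described above (a non-homologous sequence flanked by two regions of homology, each with at least 50% sequence identity to the genomic sequence) can be sketched as follows. This is an illustrative sketch only: the function names, the toy sequences, and the use of an ungapped, position-by-position identity instead of a proper sequence alignment are all assumptions, not part of the disclosure.

```python
def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences.

    Assumption: a position-by-position comparison stands in for the
    alignment-based identity a real design tool would compute.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * same / len(a)


def build_donor(left_arm, insert, right_arm):
    """Non-homologous insert flanked by two regions of homology."""
    return left_arm + insert + right_arm


def arm_identities(genome, cut, left_arm, right_arm):
    """Identity of each homology arm to the genomic sequence flanking the cut.

    `cut` is the index in `genome` of the first base after the cleavage site.
    """
    if cut - len(left_arm) < 0 or cut + len(right_arm) > len(genome):
        raise ValueError("homology arms extend past the genomic sequence")
    left_flank = genome[cut - len(left_arm):cut]
    right_flank = genome[cut:cut + len(right_arm)]
    return (
        percent_identity(left_arm, left_flank),
        percent_identity(right_arm, right_flank),
    )


# Toy 20-nt "genomic" sequence cleaved between positions 10 and 11.
genome = "ACGTACGTACGGATCCTTAA"
cut_index = 10
left = "ACGTACGTAC"    # identical to the left flank
right = "GGATCCTTAT"   # one single-base change relative to the right flank
payload = "CACCACCAC"  # hypothetical non-homologous insert

donor = build_donor(left, payload, right)
left_id, right_id = arm_identities(genome, cut_index, left, right)
meets_threshold = min(left_id, right_id) >= 50.0  # "at least 50%" from the text
```

The single-base change in the right arm shows how a donor can differ from the genomic sequence it replaces while each arm still clears the identity threshold.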
- the donor nucleic acid may comprise certain sequence differences as compared to the genomic sequence, e.g. restriction sites, nucleotide polymorphisms, selectable markers (e.g., drug resistance genes, fluorescent proteins, enzymes etc.), etc., which may be used to assess for successful insertion of the donor nucleic acid at the cleavage site or in some cases may be used for other purposes (e.g., to signify expression at the targeted genomic locus).
- sequence differences may include flanking recombination sequences such as FLPs, loxP sequences, or the like, that can be activated at a later time for removal of the marker sequence.
- the donor nucleic acid may be provided to the cell as single-stranded DNA, single-stranded RNA, double-stranded DNA, or double-stranded RNA. It may be introduced into a cell in linear or circular form. If introduced in linear form, the ends of the donor sequence may be protected (e.g., from exonucleolytic degradation) by methods known to those of skill in the art. For example, one or more dideoxynucleotide residues are added to the 3′ terminus of a linear molecule and/or self-complementary oligonucleotides are ligated to one or both ends. See, for example, Chang et al. (1987) Proc. Natl. Acad. Sci.
- Additional methods for protecting exogenous polynucleotides from degradation include, but are not limited to, addition of terminal amino group(s) and the use of modified internucleotide linkages such as, for example, phosphorothioates, phosphoramidates, and O-methyl ribose or deoxyribose residues.
- additional lengths of sequence may be included outside of the regions of homology that can be degraded without adversely affecting recombination.
- a donor nucleic acid can be introduced into a cell as part of a vector molecule having additional sequences such as, for example, replication origins, promoters and genes encoding antibiotic resistance.
- a donor nucleic acid can be introduced as naked nucleic acid, as nucleic acid complexed with an agent such as a liposome or poloxamer, or can be delivered by a virus (e.g., adenovirus, adeno-associated virus, etc.).
- a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide), or an RNP comprising same, is useful in a variety of methods for target nucleic acid modification, which methods are also provided.
- a fusion site-directed polypeptide of the present disclosure, or an RNP of the present disclosure, can be used in any method in which a Cas9 protein or a Cpf1 protein can be used.
- a fusion site-directed polypeptide of the present disclosure, or an RNP of the present disclosure, can be used to (i) modify (e.g., cleave, e.g., nick; methylate; etc.) a target nucleic acid (DNA or RNA; single stranded or double stranded); (ii) modulate transcription of a target nucleic acid; (iii) label a target nucleic acid; (iv) bind a target nucleic acid (e.g., for purposes of isolation, labeling, imaging, tracking, etc.); (v) modify a polypeptide (e.g., a histone) associated with a target nucleic acid; and the like.
- Because a method that uses a fusion site-directed polypeptide includes binding of the fusion site-directed polypeptide to a particular region in a target nucleic acid (by virtue of being targeted there by an associated guide RNA (e.g., a Cas9 guide RNA or a Cpf1 guide RNA)), the methods are generally referred to herein as methods of binding (e.g., a method of binding a target nucleic acid).
- a method of binding may result in nothing more than binding of the target nucleic acid
- the method can have different final results (e.g., the method can result in modification of the target nucleic acid, e.g., cleavage/methylation/etc., modulation of transcription from the target nucleic acid, modulation of translation of the target nucleic acid, genome editing, modulation of a protein associated with the target nucleic acid, isolation of the target nucleic acid, etc.).
- suitable methods Cas9 variants, guide RNAs, etc., see, for example, Jinek et al., Science. 2012 Aug.
- the present disclosure provides methods of cleaving a target nucleic acid; methods of editing a target nucleic acid; methods of modulating transcription from a target nucleic acid; methods of isolating a target nucleic acid, methods of binding a target nucleic acid, methods of imaging a target nucleic acid, methods of modifying a target nucleic acid, and the like.
- the methods generally involve contacting a eukaryotic cell, which eukaryotic cell comprises a target nucleic acid, with a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure in complex with a guide RNA.
- the methods comprise contacting a eukaryotic cell, which eukaryotic cell comprises a target nucleic acid, with an RNP comprising a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure and a guide RNA.
- the methods comprise contacting a eukaryotic cell, which eukaryotic cell comprises a target nucleic acid, with: a) an RNP comprising a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure and a guide RNA; and b) a donor nucleic acid.
- the terms/phrases “contact a target nucleic acid” and “contacting a target nucleic acid”, for example, with a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure, etc. encompass: 1) introducing a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure into a cell; and introducing a guide RNA (a Cas9 guide RNA or a Cpf1 guide RNA) into a cell by introducing the guide RNA itself into the cell; 2) introducing a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure into a cell; and introducing a guide RNA (Cas9 guide RNA or Cpf1 guide RNA) into a cell by introducing into the cell a nucleic acid.
- a method that includes contacting the target nucleic acid encompasses the introduction into the cell of the guide RNA in its active/final state (e.g., in the form of an RNA in some cases for the guide RNA), and also encompasses the introduction into the cell of one or more nucleic acids encoding one or more of the components (e.g., nucleic acid(s) having nucleotide sequence(s), nucleic acid(s) having nucleotide sequence(s) encoding guide RNA(s), and the like).
- Contacting a target nucleic acid encompasses contacting the target nucleic acid inside of a cell in vitro, inside of a cell in vivo, inside of a cell ex vivo, etc.
- a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure, when bound to a guide RNA, can bind to a target nucleic acid, and in some cases, can bind to and modify a target nucleic acid.
- a target nucleic acid can be any nucleic acid (e.g., DNA, RNA), can be double stranded or single stranded, can be any of a number of types of nucleic acid (e.g., a chromosome, derived from a chromosome, chromosomal, plasmid, viral, mitochondrial, chloroplast, linear, circular, etc.) and can be from any organism (e.g., as long as the guide RNA (e.g., Cas9 guide RNA, Cpf1 guide RNA) can hybridize to a target sequence in a target nucleic acid, such that target nucleic acid can be targeted).
- a target nucleic acid can be DNA or RNA.
- a target nucleic acid can be double stranded (e.g., dsDNA, dsRNA) or single stranded (e.g., ssRNA, ssDNA).
- a target nucleic acid is single stranded.
- a target nucleic acid is a single stranded RNA (ssRNA).
- a target ssRNA e.g., a target cell ssRNA, a viral ssRNA, etc.
- mRNA miRNA
- a target nucleic acid is a single stranded DNA (ssDNA) (e.g., a viral DNA). As noted above, in some cases, a target nucleic acid is single stranded. In some cases, a target nucleic acid is a double-stranded DNA.
- a target nucleic acid can be located within a eukaryotic cell, for example, inside of a eukaryotic cell in vitro, inside of a eukaryotic cell in vivo, inside of a eukaryotic cell ex vivo.
- Suitable target cells include, but are not limited to: a single-celled eukaryotic organism; a cell of a single-cell eukaryotic organism; a plant cell; an algal cell, e.g., Botryococcus braunii, Chlamydomonas reinhardtii, Nannochloropsis gaditana, Chlorella pyrenoidosa, Sargassum patens C.
- a fungal cell e.g., a yeast cell
- an animal cell e.g., a cell from an invertebrate animal (e.g. fruit fly, cnidarian, echinoderm, nematode, etc.); a cell from a vertebrate animal (e.g., fish, amphibian, reptile, bird, mammal); a cell from a mammal (e.g., a cell from a rodent, a cell from a human, etc.); and the like.
- Any type of cell may be of interest (e.g. a stem cell, e.g.
- an embryonic stem (ES) cell an induced pluripotent stem (iPS) cell
- a germ cell e.g., an oocyte, a sperm, an oogonia, a spermatogonia, etc.
- a somatic cell e.g. a fibroblast, an oligodendrocyte, a glial cell, a hematopoietic cell, a neuron, a muscle cell, a bone cell, a hepatocyte, a pancreatic cell
- an in vitro or in vivo embryonic cell of an embryo at any stage e.g., a 1-cell, 2-cell, 4-cell, 8-cell, etc. stage zebrafish embryo; etc.
- Cells may be from established cell lines or they may be primary cells, where “primary cells”, “primary cell lines”, and “primary cultures” are used interchangeably herein to refer to cells and cells cultures that have been derived from a subject and allowed to grow in vitro for a limited number of passages, i.e. splittings, of the culture.
- primary cultures are cultures that may have been passaged 0 times, 1 time, 2 times, 4 times, 5 times, 10 times, or 15 times, but not enough times to go through the crisis stage.
- the primary cell lines are maintained for fewer than 10 passages in vitro.
- Target cells can be unicellular organisms and/or can be grown in culture. If the cells are primary cells, they may be harvested from an individual by any convenient method.
- leukocytes may be conveniently harvested by apheresis, leukocytapheresis, density gradient separation, etc., while cells from tissues such as skin, muscle, bone marrow, spleen, liver, pancreas, lung, intestine, stomach, etc. can be conveniently harvested by biopsy.
- Target cells include in vivo target cells.
- Target cells include retinal cells (e.g., Müller cells, ganglion cells, amacrine cells, horizontal cells, bipolar cells, and photoreceptor cells including rods and cones, Müller glial cells, and retinal pigmented epithelium); neural cells (e.g., cells of the thalamus, sensory cortex, zona incerta (ZI), ventral tegmental area (VTA), prefrontal cortex (PFC), nucleus accumbens (NAc), amygdala (BLA), substantia nigra, ventral pallidum, globus pallidus, dorsal striatum, ventral striatum, subthalamic nucleus, hippocampus, dentate gyrus, cingulate gyrus, entorhinal cortex, olfactory cortex, primary motor cortex, or cerebellum); liver cells; kidney cells; immune cells; cardiac cells; skeletal muscle cells; smooth muscle
- an RNP of the present disclosure (e.g., an RNP comprising: i) a fusion site-directed modifying polypeptide (e.g., a fusion class 2 CRISPR/Cas polypeptide) of the present disclosure; and ii) a guide RNA) can be administered via intraocular injection, by intravitreal injection, by intravitreal implant, subretinal injection, suprachoroidal administration, intravenous administration, or by any other convenient mode or route of administration.
- the subject methods may be employed to induce target nucleic acid cleavage, target nucleic acid modification, and/or to bind target nucleic acids (e.g., for visualization, for collecting and/or analyzing, etc.) in mitotic or post-mitotic cells in vivo and/or ex vivo and/or in vitro (e.g., to disrupt production of a protein encoded by a targeted mRNA).
- a mitotic and/or post-mitotic cell of interest in the disclosed methods may include a cell from any organism (e.g.
- a single-celled eukaryotic organism, a cell of a single-cell eukaryotic organism, a plant cell, an algal cell, e.g., Botryococcus braunii, Chlamydomonas reinhardtii, Nannochloropsis gaditana, Chlorella pyrenoidosa, Sargassum patens C. Agardh, and the like, a fungal cell (e.g., a yeast cell), an animal cell, a cell from an invertebrate animal (e.g.
- fruit fly cnidarian, echinoderm, nematode, etc.
- a cell from a vertebrate animal e.g., fish, amphibian, reptile, bird, mammal
- a cell from a mammal e.g., a cell from a rodent (e.g., mouse; rat), a cell from a human, etc.).
- Standard abbreviations may be used, e.g., bp, base pair(s); kb, kilobase(s); pl, picoliter(s); s or sec, second(s); min, minute(s); h or hr, hour(s); aa, amino acid(s); nt, nucleotide(s); i.m., intramuscular(ly); i.p., intraperitoneal(ly); s.c., subcutaneous(ly); and the like.
- the NLS-modified Cas9 protein produced included, from N-terminus to C-terminus: a) 0, 1, 2, 3, or 4 NLSs; b) Cas9; c) 2 NLSs; and d) a superfolder green fluorescent protein (sfGFP).
- the amino acid sequences of the N-terminal and C-terminal NLS-containing regions are depicted in FIG. 2A.
- the NLS-modified Cas9 proteins were combined with guide RNA to generate RNPs. The RNPs were tested for passive uptake into neural stem cells in vitro. The data are depicted in FIG. 2A-2D.
- FIG. 2A-2D: 4× N-Terminal NLS on Cas9 Significantly Improves Passive Cas9 RNP Uptake and Genome-Editing in Neural Stem Cells in Culture.
- FIG. 2A shows design of Cas9 protein with various numbers of N-terminal NLS.
- FIG. 2B: tdTomato reporter system activates expression of tdTomato Red Fluorescent Protein (RFP) in genome-edited cells. The 4×NLS-Cas9 design is significantly more efficient at genome-editing cells than other designs.
- FIG. 2C: Representative pictures of cells, scale bar 1000 μm.
- FIG. 2D: Representative pictures of cells, scale bar 200 μm.
- a construct encoding a Streptococcus pyogenes Cas9 (SpyCas9) with 4 NLSs at the N-terminus and 2 NLSs at the C-terminus was generated.
- the 4×NLS-SpyCas9-2×NLS included a superfolder green fluorescent protein (sfGFP) at the C-terminus.
- the encoded protein is referred to as “4×NLS-SpyCas9-2×NLS-sfGFP.”
- the amino acid sequence of the 4×NLS-SpyCas9-2×NLS-sfGFP polypeptide is depicted in FIG. 7.
- the 4×NLS-SpyCas9-2×NLS-sfGFP polypeptide was incorporated into ribonucleoprotein (RNP) with guide RNA. The data are shown in FIG. 3-6.
- FIG. 3 is a diagrammatic representation of FIG. 3 .
- the Cas9 constructs and single-guide RNAs were as follows: a) 2 Cas9 constructs: i) Cas9-2×NLS-sfGFP; and ii) 4×NLS-Cas9-2×NLS-sfGFP; b) 2 sgRNAs: i) tdTomato(298), 5′-AAGTAAAACCTCTACAAATG-3′ (SEQ ID NO:1312); and ii) non-targeting Gal4 (339), 5′-AACGACTAGTTAGGCGTGTA-3′ (SEQ ID NO:1313).
- the RNP was delivered in a per-dose concentration of 4 pmol/0.5 μl.
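For readers who prefer molar units, the per-dose figure above (4 pmol in 0.5 μl) can be converted with a one-line calculation. The helper below is a hypothetical illustration, not part of the disclosure; it relies only on the unit identity 1 pmol/μl = 1 μM.

```python
def pmol_per_ul_to_micromolar(amount_pmol, volume_ul):
    """Convert an amount (pmol) in a volume (microliters) to micromolar.

    1 pmol/ul = 1e-12 mol / 1e-6 L = 1e-6 mol/L = 1 uM, so the ratio of
    pmol to microliters is already the concentration in micromolar.
    """
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_pmol / volume_ul


# Per-dose RNP amount and volume stated in the text: 4 pmol in 0.5 ul.
rnp_concentration_um = pmol_per_ul_to_micromolar(4, 0.5)
```

Under this conversion the stated dose corresponds to an RNP concentration of 8 μM in the injected volume.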
- HC: Hippocampus
- S: Striatum
- CTX S1: Primary Somatosensory Cortex (S1)
- CTX V1: Primary Visual Cortex (V1)
- FIG. 4A-4D: 4×NLS-Cas9 Significantly Improves In Vivo Cortical Cas9 RNP Genome-Editing.
- FIG. 4A: Schematic of Cas9 proteins used for in vivo injections.
- FIG. 4C: Representative pictures of brain sections with genome-edited cells expressing tdTomato RFP.
- FIG. 4D: Quantification of genome-edited tdTomato positive cells per pmol Cas9 RNP. Cas9 construct; sgRNA. Sham is injection buffer only. tdTom sgRNA targets tdTomato reporter locus. Gal4 is a non-targeting sgRNA control and neither targets nor edits the tdTomato locus. The data indicate that 4×NLS-Cas9 significantly improves in vivo cortical Cas9 RNP genome-editing.
- FIG. 5A-5D: 4×NLS-Cas9 Significantly Improves In Vivo Hippocampal Cas9 RNP Genome-Editing.
- FIG. 5A Schematic of Cas9 proteins used for in vivo injections.
- FIG. 5B 4 pmol Cas9 RNP in 0.5 ⁇ l volume was injected into the Hippocampus. Red dot is injection location.
- FIG. 5C Representative pictures of brain sections with genome-edited cells expressing tdTomato RFP.
- FIG. 5D Quantification of genome-edited tdTomato positive cells per pmol Cas9 RNP. Cas9 construct; sgRNA. Sham is injection buffer only. tdTom sgRNA targets tdTomato reporter locus. The data indicate that 4 ⁇ NLS-Cas9 significantly improves in vivo hippocampal Cas9 RNP genome-editing.
- FIG. 6A-6C Subretinal Cas9 RNP Injections: Genome-Editing in Retina Includes Cells Surrounding Injection Site and Distal Müller Glia.
- FIG. 6A presents a schematic of Subretinal injection site.
- FIG. 6B 15 pmol Cas9 RNP in 1 μl volume was injected subretinally.
- tdTomato+ cells report precise genome-editing by Cas9 RNP.
- Non-targeting Gal4 sgRNA does not activate tdTomato reporter gene.
- FIG. 6C Volume display of mse 4.2, 4×NLS-Cas9 RNP, showing Cas9-edited Müller glia cells expressing tdTomato.
- FIG. 9 Antibody Staining of Brain Sections from 4×NLS-Cas9 RNP Genome-Edited Animals.
- the images are confocal microscopy images. Colocalization with the marker proteins identifies the edited tdTomato+ neurons. Scale bar is 50 μm.
- CTIP2 aka BCL11b, a transcription factor present in CA1 hippocampus and striatum neurons.
- GFAP Glial fibrillary acidic protein is an intermediate filament (IF) protein that is expressed by numerous cell types of the central nervous system (CNS), including astrocytes and ependymal cells.
- DARPP-32 cAMP-regulated neuronal phosphoprotein, a well-documented marker of striatum medium spiny neurons.
- NeuN a neuronal specific nuclear protein in vertebrates.
- NPCs Neural Progenitor Cells
- NPCs were isolated from cortices from Embryonic Day 13.5 Ai9-tdTomato homozygous mouse embryos (Madisen et al. 2009. Nat. Neurosci. 13:133-140). Cells were cultured as neurospheres in NPC Medium: DMEM/F12 with glutamine, Na-Pyruvate, 10 mM HEPES, non-essential amino acids, Pen/Strep (100×), 2-mercaptoethanol (1000×), B-27 without vitamin A, N2 supplement, and bFGF and EGF, both at a final concentration of 20 ng/ml. NPCs were passaged using MACS Neural Dissociation Kit (Papain) cat #130-092-628 following manufacturer's protocol. bFGF and EGF were refreshed every other day, and cells were passaged every six days. The NPC line was authenticated by immunocytochemistry marker staining; the cell line was tested for mycoplasma using Hoechst stain with visual analysis and was negative.
- the d2EGFP reporter construct was created in a modified lentivirus backbone with EF1-a promoter driving the gene of interest and a second PGK promoter driving production of a gene which confers resistance to hygromycin.
- the EGFP is destabilized by fusion to residues 422-461 of mouse ornithine decarboxylase, giving an in vivo half-life of ~2 hours.
- Transduced 293T cells were selected with hygromycin (250 μg/ml).
- d2EGFP clones were isolated by sorting single cells into 96 well plates and characterized by intensity of d2EGFP.
- Lentivirus was produced by PEI (Polysciences Inc., 24765) transfection of 293T cells with gene delivery vector co-transfected with packaging vectors pspax2 and pMD2.G essentially as described by (Tiscornia et al. 2006 , Nature Protocols. 1:241-245).
- the parental HEK293T cell line was obtained from UC Berkeley Scientific Facilities and authenticated using STR analysis by DDCM; the cell line was tested for mycoplasma using Hoechst stain with visual analysis and was negative.
- GFP disruption assays were based on those previously described (Gilbert et al. 2013 , Cell. 154:442-451).
- HEK-293T-d2EGFP cells were used in this experiment because they are efficiently transfected with Cas9-RNP mixed with Lipofectamine 2000 and are therefore useful for this experiment, which analyzes the activity of the 0×NLS- and 4×NLS-Cas9 RNPs after cell penetration.
- HEK293T-d2EGFP cells were cultured in 10 cm dishes using Dulbecco's Modification of Eagle's Medium (DMEM) with 4.5 g/L glucose, L-glutamine and sodium pyruvate (Corning cellgro) plus 10% fetal bovine serum, 1× MEM Non-Essential Amino Acids Solution (Gibco) and Pen-Strep (Gibco).
- DMEM Dulbecco's Modification of Eagle's Medium
- Cas9 RNP was complexed with Lipofectamine 2000 (Life Technologies) at 0.005-50 pmol RNP+1 μl Lipofectamine in 20 μl OMEM media and added to the cells.
- Cells were analyzed for EGFP expression at 48 hours post transfection using a BD LSR Fortessa flow cytometer with a High Throughput Sampler.
- the recombinant S. pyogenes Cas9 used in this study carries two C-terminal SV40 nuclear localization sequences.
- the protein was expressed with an N-terminal hexahistidine tag and maltose binding protein in E. coli Rosetta 2 cells (EMD Millipore, Billerica, Mass.) from plasmids based on pMJ915 (Addgene plasmid #69090) (Lin et al. 2014, eLife. 3:e04766).
- N-terminal nuclear localization sequence peptide arrays and sfGFP modifications were cloned into the plasmid using Gibson DNA assembly technique ( FIG. 19 ).
- Cas9 was purified by the protocols described in (Jinek et al. 2012, Science. 337:816-821). Cas9 was stored at −80° C. in “Buffer 5”: 20 mM 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid (HEPES) at pH 7.5, 150 mM KCl, 10% glycerol, 1 mM tris(2-carboxyethyl)phosphine (TCEP).
- Cas9 was buffer exchanged into “Buffer #1”: 25 mM Na phosphate pH 7.25, 300 mM NaCl, 200 mM trehalose before size exclusion column and stored at −80° C.
- Cas9 protein endotoxin levels were measured using Pierce LAL Chromogenic Endotoxin Quantification Kit Cat. #88282.
- FIG. 19 provides the primary sequence data for N-terminal NLS-Cas9 fusions.
- sgRNA target sequences were selected using the website accessed by entering “http:” followed by “//crispr.mit” followed by “.edu” into an internet browser (Hsu et al. 2013. Nat. Biotechnol. 31:827-832).
- the DNA template encoding for a T7 promoter, a 20 nt target sequence and an optimized sgRNA scaffold was assembled from synthetic oligonucleotides (Integrated DNA Technologies, San Diego, Calif.) by overlapping polymerase chain reaction (PCR).
- Target sequences are: sgRNA298/tdTom (targets STOP cassette in tdTomato locus), 5′-AAGTAAAACCTCTACAAATG-3′ (SEQ ID NO:1312), sgRNA non-targeting (aka sgRNA339 targets Gal4 sequence that is not present in mouse genome), 5′-AACGACTAGTTAGGCGTGTA-3′ (SEQ ID NO:1313), sgRNA-NT3 (targets EGFP gene) 5′-GGTGGTGCAGATGAACTTCA-3′ (SEQ ID NO:1314).
- the PCR reaction contains 20 nM premix of BS298 (5′-TAA TAC GAC TCA CTA TAG AAG TAA AAC CTC TAC AAA TGG TTT AAG AGC TAT GCT GGA AAC AGC ATA GCA AGT TTA AAT AAG G-3′) (SEQ ID NO:1315) and BS6 (5′-AAA AAA AGC ACC GAC TCG GTG CCA CTI TT CAA GTT GAT AAC GGA CTA GCC TTA TTT AAA CTT GCT ATG CTG TTT CCA GC-3′) (SEQ ID NO:1316), 1 μM premix of T25_long (5′-GAA ATT AAT ACG ACT CAC TAT AG-3′) (SEQ ID NO:1317) and BS7 (5′-AAA AAA AGC ACC GAC TCG GTG C-3′) (SEQ ID NO:1318), 200 μM dNTP and
- thermocycler setting consisted of 40 cycles of 95° C. for 10 s, 59° C. for 10 s and 72° C. for 10 s.
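The template layout described above (T7 promoter, 20 nt target sequence, sgRNA scaffold) can be checked against the BS298 primer as printed. The following is a minimal Python sketch, not part of the original methods: the function names are illustrative, and the promoter string is an assumption (the commonly used 18 nt T7 promoter sequence, which matches the 5′ end of BS298 as listed).

```python
def clean(oligo: str) -> str:
    """Remove the triplet spacing used in the text and upper-case the sequence."""
    return oligo.replace(" ", "").upper()


# Sequences copied from the text above.
BS298 = clean(
    "TAA TAC GAC TCA CTA TAG AAG TAA AAC CTC TAC AAA TGG TTT AAG AGC "
    "TAT GCT GGA AAC AGC ATA GCA AGT TTA AAT AAG G"
)
TDTOM_TARGET = "AAGTAAAACCTCTACAAATG"  # sgRNA298/tdTom, SEQ ID NO:1312

# Assumption: common 18 nt T7 promoter sequence (not spelled out in the text).
T7_PROMOTER = "TAATACGACTCACTATAG"


def split_template(oligo: str, promoter: str, target: str):
    """Split an oligo into (promoter, target, remainder).

    Raises ValueError if the oligo does not start with the promoter
    immediately followed by the target sequence.
    """
    if not oligo.startswith(promoter):
        raise ValueError("oligo does not start with the promoter")
    start = len(promoter)
    end = start + len(target)
    if oligo[start:end] != target:
        raise ValueError("target does not directly follow the promoter")
    return promoter, target, oligo[end:]


promoter, target, scaffold_start = split_template(BS298, T7_PROMOTER, TDTOM_TARGET)
print(len(target), "nt target; scaffold portion begins with", scaffold_start[:10])
```

The same check could be run for any other target sequence by substituting it for `TDTOM_TARGET`, provided the corresponding forward primer follows the same layout.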
- the PCR product was extracted once with phenol:chloroform:isoamyl alcohol and then once with chloroform, before isopropanol precipitation overnight at −20° C.
- the DNA pellet was washed three times with 70% ethanol, air-dried and resuspended in Elution Buffer.
- RNA was purified by electrophoresis in 10% polyacrylamide gel containing 6 M urea.
- the RNA band was excised from the gel, ground up in a 15-ml tube, and eluted with 5 vol of 300 mM sodium acetate (pH 5) overnight at 4° C. The supernatant was filtered through a 0.2 μm filter to remove acrylamide fragments. 2.5 equivalents of ethanol were added to precipitate the RNA overnight at −20° C.
- the RNA pellet was collected by centrifugation, washed three times with 70% ethanol, and air-dried or vacuum-dried. To refold the sgRNA, the RNA pellet was re-dissolved in dPBS-Ca,-Mg.
- the sgRNA was heated to 70° C. for 5 min and cooled to room temperature for 5 min. MgCl2 was added to a final concentration of 1 mM. The sgRNA was again heated to 50° C. for 5 min, cooled to room temperature for 5 min and kept on ice. The sgRNA concentration was determined by OD260 nm using Nanodrop (Thermo Fisher Scientific, Waltham, Mass.). The sgRNA was stored at −80° C.
- Cas9 RNP was either prepared immediately before experiments or prepared, snap-frozen in liquid nitrogen and stored at −80° C. for later use. Loss in activity upon freeze-thawing Cas9 RNP complexes was not measured.
- Cas9 protein was incubated with sgRNA at 1:1.2 molar ratio. Briefly, sgRNA was added to Buffer #1 (25 mM NaPi, 150 mM NaCl, 200 mM trehalose, 1 mM MgCl2). Then the Cas9 was added to the sgRNA slowly, with swirling. The mixture was incubated at 37° C. for 10 min to form RNP complexes.
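The 1:1.2 Cas9:sgRNA molar ratio can be turned into pipetting amounts with simple arithmetic. The sketch below is illustrative only: the function names and the example stock concentration are hypothetical and do not come from the text. It relies on the identity 1 μM = 1 pmol/μl.

```python
def sgrna_pmol_for_rnp(cas9_pmol: float, sgrna_per_cas9: float = 1.2) -> float:
    """Amount of sgRNA (pmol) for a given amount of Cas9 at a 1:1.2 molar ratio."""
    return cas9_pmol * sgrna_per_cas9


def volume_ul(amount_pmol: float, stock_um: float) -> float:
    """Volume (μl) of a stock solution that contains amount_pmol.

    Uses 1 μM == 1 pmol/μl, so volume = amount / concentration.
    """
    if stock_um <= 0:
        raise ValueError("stock concentration must be positive")
    return amount_pmol / stock_um


# Hypothetical example: 100 pmol Cas9 and a 40 μM sgRNA stock.
needed_sgrna = sgrna_pmol_for_rnp(100)
print(f"{needed_sgrna:.1f} pmol sgRNA = {volume_ul(needed_sgrna, 40.0):.2f} μl of stock")
```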
- NPCs were dissociated by the MACS Neural Dissociation Kit (Papain) cat #130-092-628, spun down by centrifugation at 80×g for 3 min, and washed once with dPBS-Ca,-Mg.
- Nucleofection of NPCs with Cas9 RNP was performed using Lonza (Allendale, N.J.) P3 cell kits and program EH-100 in an Amaxa 96-well Shuttle system. Each nucleofection reaction consisted of approximately 2.5×10⁵ cells in 20 μl of nucleofection reagent and mixed with 10 μl of RNP. After nucleofection, 70 μl of growth media was added to the well to transfer the cells to tissue culture plates.
- For plasmid nucleofection, a modified pX330-U6-Chimeric_BB-CBh-hSpCas9 vector (a gift from Feng Zhang (Addgene plasmid #42230)) (Cong et al. 2013, Science. 339:819-823) was used that contained the puromycin N-acetyltransferase PuroR gene and an optimized sgRNA scaffold (Chen et al. 2013, Cell. 155:1479-1491). Nucleofection was done with 700 ng plasmid with 4×10⁵ NPCs and Lonza P3 cell kit and program DS-113 in an Amaxa 96-well Shuttle system. The cells were incubated at 37° C.
- For genomic DNA analysis, the media was removed by aspiration, and 100 μl of Quick Extraction solution (Epicentre, Madison, Wis.) was added to lyse the cells (65° C. for 20 min and then 95° C. for 20 min) and extract the genomic DNA. The cell lysate was stored at −20° C. The concentration of genomic DNA was determined by NanoDrop. tdTomato activation in NPCs was analyzed by FACS. UC Berkeley FACS Core facilities were used.
- mice were maintained on a 12 h light dark cycle with ad libitum access to food and water. All animals were group housed and experiments were conducted in strict adherence to the Swiss federal ordinance on animal protection and welfare as well as according to the rules of the Association for Assessment and Accreditation of Laboratory Animal Care International (AAALAC), and with the explicit approval of the local veterinary authorities. Animals at University of California, Berkeley were maintained on a 12 h light dark cycle with ad libitum access to food and water. All animals were group housed and experiments were conducted in strict adherence to the University of California, Berkeley's Animal Care and Use Committee (ACUC) ethical regulations. No randomization was used to allocate animals to experimental groups.
- ACUC Animal Care and Use Committee
- Cas9, 4×NLS-Cas9 and 4×NLS-Cas9-GFP RNPs were prepared and shipped by Brett Staahl, UC Berkeley to Roche Pharmaceuticals Basel, Switzerland. 15-20-week-old male Ai14 tdTomato mice (which lack the NeoR cassette present in Ai9 tdTomato mice but are otherwise identical) (Madisen et al. 2009. Nat. Neurosci. 13:133-140) were anesthetized using injectable anesthesia (Fentanyl 0.05 mg/kg+Medetomidine 0.5 mg/kg+Midazolam 5 mg/kg; s.c.).
- the anesthetized mouse was then aligned on an Angle Two stereotactic frame (Leica, Germany) and craniotomies were performed with minimal damage to brain tissue. All stereotaxic coordinates were relative to bregma. Stereotaxic surgery targeted the mouse striatum (+0.74 mm anterioposterior, ±1.74 mm mediolateral, −3.37 mm dorsoventral). Cas9 RNPs were infused (striatum: 0.5 μl/side) using a Neuros 75 5 μL syringe (Hamilton). After infusion, the injector was left at the injection site for 5 min and then slowly withdrawn.
- the operation field was cleaned with sterile 0.9% NaCl and closed with suture (Faden Monocryl Plus 5-0, Aichele Medico) and surgical glue (3M™ Vetbond™ Tissue Adhesive).
- the mouse was kept warm at 37° C. during the surgical procedure and also post-surgery. To avoid drying of the eyes during surgery, an ointment was applied outside of the eyes of the mouse. The mice were left undisturbed for 12 days before cellular analysis. Sample size was chosen based on expected effect size. No randomization was applied while allocating animals to groups.
- mice were perfused with 4% paraformaldehyde and post-fixed overnight. Brains were sectioned (coronal plane sections) on a vibratome and 50 μm thick sections were used for antibody labeling. Sections were first treated with blocking solution (0.3% Triton X-100, 10% goat serum in 1× PBS) and incubated with the primary antibody (in blocking solution) overnight at 4° C. Sections were washed with 1× PBS and incubated in the secondary antibody at room temperature for 3 hours.
- Confocal fluorescent images were acquired using a Leica TCS SP5 (Leica Microsystems) inverted microscope. Image analysis and maximum intensity projections of images acquired along the z-axis were done using LAS-AF software.
- Images for cell counting were acquired using a Leica TCS SP5 (Leica Microsystems) inverted microscope with 20× dry objective. Image analysis and maximum intensity projections of images acquired along the z-axis were done using LAS-AF software. Every sixth section from the dorsal striatum was stained with DAPI and used for cell counting. Quantification of tdTomato and DAPI double positive cells was done using ImageJ. The total number of edited cells per brain was quantified by multiplying the number of cells counted by the section periodicity (here, 6). The experimenter was blinded to treatment condition while performing cell counting.
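The extrapolation from counted sections to a whole-brain estimate described above amounts to a single multiplication. A minimal Python sketch follows; the function name and the per-section counts are made up for illustration and are not data from the study.

```python
def estimate_total_edited_cells(counts_per_section, section_periodicity=6):
    """Estimate total edited cells when only every Nth section is counted.

    counts_per_section: tdTomato/DAPI double-positive counts, one per counted section.
    section_periodicity: N, where every Nth section was counted (6 in the text).
    """
    if section_periodicity < 1:
        raise ValueError("section_periodicity must be at least 1")
    return sum(counts_per_section) * section_periodicity


# Hypothetical counts from four counted sections (illustration only).
example_counts = [10, 25, 40, 25]
print(estimate_total_edited_cells(example_counts))
```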
- GUIDE-Seq samples were prepared as described (Tsai et al. 2014 , Nat. Biotechnol. 33:187-197). Cortical neurons were isolated from post-natal day 0 (P0), Ai9 tdTomato mice.
- the GUIDE-seq analysis package was used with default options (retrieved from the website accessed by entering “https://” followed by “github.” followed by “com/aryeelab/guideseq” into an internet browser on Aug. 10, 2016).
- a synthetic genome was created by inserting the tdTomato transgene (Ai9 was a gift from Hongkui Zeng (Addgene plasmid #22799)) (Madisen et al. 2009 . Nat. Neurosci. 13:133-140) into mm10 at chromosome 6 between coordinates 113075330 and 113076736 and was used for the alignment step.
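Building a synthetic reference of this kind is, at its core, a string splice. The sketch below uses toy sequences and a hypothetical helper name. It assumes 0-based, half-open coordinates and assumes that the interval between the two coordinates is replaced by the insert; the text does not state either convention, so both should be checked against the actual alignment setup.

```python
def splice_insert(chrom_seq: str, start: int, end: int, insert_seq: str) -> str:
    """Return chrom_seq with the interval [start, end) replaced by insert_seq.

    Assumption: 0-based, half-open coordinates; the original text does not
    specify the coordinate convention.
    """
    if not 0 <= start <= end <= len(chrom_seq):
        raise ValueError("coordinates fall outside the chromosome sequence")
    return chrom_seq[:start] + insert_seq + chrom_seq[end:]


# Toy example (not real genomic sequence).
toy_chromosome = "AAAACCCCGGGG"
toy_transgene = "TTTTTT"
print(splice_insert(toy_chromosome, 4, 8, toy_transgene))
```

With `start == end`, the same function performs a pure insertion without removing any reference bases.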
- tdTomato target site FIG.
- the GUIDE-seq analysis package was modified to report all alignments regardless of MAPQ score.
- the GUIDE-seq reads were also mapped onto a synthetic genome containing only one repeat of the stop cassette and obtained similar results, although the version with three repeated stop cassettes was ultimately used as in some embodiments, it may be a more faithful representation of the genome of the Ai9 and Ai14 tdTomato mouse lines. UC Berkeley Genome Sequencing Core Facilities were used.
- the Ai9/Ai14 tdTomato mouse model (hereafter referred to as the tdTomato mouse) was first tested to determine whether it could be adapted to “report” Cas9 editing in neural cells (Madisen et al. 2009. Nat. Neurosci. 13:133-140) (FIG. 10). These mice harbor a modification at the Rosa26 locus with a ubiquitously expressed CAGGS promoter and a loxP-flanked STOP cassette (three repeats of the SV40 polyA sequence) that prevents expression of the tdTomato fluorescent protein.
- Cre-mediated recombination at the loxP sites leads to STOP cassette deletion and tdTomato expression.
- This mouse model provides a robust, high-throughput, quantitative readout of site-specific genome modification at the loxP-flanked locus with a gain-of-function fluorescent signal in modified cells.
- One of the two loxP sites lacks a Protospacer Adjacent Motif (PAM) site necessary for Cas9-mediated DNA cleavage; therefore, two unique single-guide RNAs (sgRNAs) would be needed for Cas9 to cut as Cre does at the loxP sites. A unique targetable sequence within the STOP cassette that is capable of activating the tdTomato reporter was therefore sought.
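The PAM requirement can be illustrated with a short scan for candidate target sites. The sketch below is illustrative: it assumes the canonical S. pyogenes Cas9 PAM (5′-NGG-3′ immediately 3′ of a 20 nt protospacer), which is not spelled out in the text, it scans the forward strand only, and it uses a toy sequence rather than the actual STOP cassette.

```python
def find_protospacers(seq: str, protospacer_len: int = 20):
    """List (position, protospacer, PAM) for forward-strand sites followed by NGG.

    Assumption: canonical SpyCas9 PAM 5'-NGG-3'; reverse-strand sites are ignored.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - protospacer_len - 2):
        pam = seq[i + protospacer_len : i + protospacer_len + 3]
        if pam[1:] == "GG":
            hits.append((i, seq[i : i + protospacer_len], pam))
    return hits


# Toy sequences: one 20 nt stretch followed by TGG, and the same stretch without a PAM.
with_pam = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
without_pam = "ACGTACGTACGTACGTACGT" + "TTT" + "AAAA"
print(find_protospacers(with_pam))
print(find_protospacers(without_pam))
```

A site like the second toy example, which lacks an adjacent PAM, would not be cleavable by the same guide, which is the situation described for one of the two loxP sites.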
- PAM Protospacer Adjacent Motif
- NPCs Neural Progenitor/Stem cells
- NGS Next generation sequence analysis
- FIG. 10A-10G Cas9 RNP-Mediated Editing of Neural Progenitor/Stem Cells (NPCs).
- FIG. 10A Experimental scheme of Cas9:single-guide RNA ribonucleoprotein (Cas9 RNP) complexes delivery to NPCs or intracranial injection into adult mouse brains for genome editing, followed by genetic and phenotypic characterization. Scale bar 1 mm.
- FIG. 10F Sanger sequencing of 100 clones from the top band from 10 pmol RNP nucleofected, FACS sorted RFP− NPCs. 45% of the bands are edited with small 1-2 bp indels at 1, 2 or 3 of the sgRNA298 target sites indicated by “X” in FIG. 10G.
- FIG. 10G Representative clones from Sanger sequencing of gel purified top, middle and bottom bands from 10 pmol RNP nucleofected NPCs.
- FIG. 11A-11D are views of FIG. 11A-11D .
- FIG. 11A Location of sgRNAs on tdTomato STOP cassette.
- FIG. 11B Cas9;sgRNA target sites differentially activate tdTomato reporter gene. sgRNA organized on histogram by 5′-3′ position on tdTomato STOP cassette.
- FIG. 11C Table of sgRNA sequences with Off-target hit scores (Hsu et al. 2013. Nat. Biotechnol. 31:827-832).
- FIG. 12 Cas9 RNP-Mediated Editing of Ai9 Mouse tdTomato STOP Cassette.
- WT wild-type
- edited alleles amplified from genomic DNA with primers 272Fd/273R.
- DNA bands were gel purified, cloned, Sanger sequenced and aligned to tdTomato STOP cassette.
- sgRNAtdTom has three target sites in the STOP cassette and therefore can generate six types of edits, including three small indels that do not generate a deletion, two types of single-repeat deletions and one double-repeat deletion ( FIG. 12 ).
- PCR analysis of the STOP cassette locus (272F and 273R primers) using genomic DNA from Cas9 RNP-treated cells revealed the expected three DNA band laddering pattern in edited cells ( FIG. 10E ).
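The six edit types follow from simple combinatorics over the three target sites: an indel at each site, plus a deletion for every pair of cut sites. The Python sketch below enumerates them; the site numbering 1 to 3 is an illustrative convention, and the assumption that cutting at sites i and j removes j − i repeats is inferred from the description of one target site per repeat.

```python
from itertools import combinations

TARGET_SITES = (1, 2, 3)  # one sgRNAtdTom target site per SV40 polyA repeat

# Small indels: one possible edit type per target site, no repeat removed.
indel_edits = [("indel", site) for site in TARGET_SITES]

# Deletions: cutting at two sites and rejoining removes the repeats between them.
deletion_edits = [
    ("deletion", first, second, second - first)  # last field: repeats removed
    for first, second in combinations(TARGET_SITES, 2)
]

single_repeat = [d for d in deletion_edits if d[3] == 1]
double_repeat = [d for d in deletion_edits if d[3] == 2]

print(len(indel_edits), "indel types")
print(len(single_repeat), "single-repeat deletions:", single_repeat)
print(len(double_repeat), "double-repeat deletion:", double_repeat)
```

The three resulting allele sizes (no repeat removed, one removed, two removed) correspond to the three-band laddering pattern seen by PCR.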
- FACS fluorescence-activated cell sorting
- Nucleofection of Cas9 RNP complexes is useful for treating cells in culture and isolated primary cells ex vivo. Recently, electroporation has also been used to deliver Cas9 into muscle and retina with low efficiency. In some embodiments, editing CNS neurons in adult animals with high efficiency will require an alternative delivery strategy.
- Cas9 RNP has no innate cell penetrating activity, and its direct protein-based delivery into cells has required either chemical conjugation of poly-arginine peptides, a strategy prone to inefficiency and heterogeneity, or mixing with lipid carrier molecules, which are immunogenic, inflammatory and toxic.
- An alternative direct delivery approach was developed by engineering cell penetrating capabilities into Cas9 RNP complexes.
- Arrays of Simian vacuolating virus 40 (SV40) nuclear localization sequences (NLSs) enhanced the innate cell penetrating capabilities of zinc finger nucleases (Liu J, Gaj T, Wallen M C, Barbas C F. 2015. Improved Cell-Penetrating Zinc-Finger Nuclease Proteins for Precision Genome Engineering. 1-9).
- SV40 Simian vacuolating virus 40 nuclear localization sequences
- the Cas9 protein that was used for cell-based experiments contains two SV40 NLSs on the C-terminus; RNPs generated using this protein were found not to be cell-penetrating. Therefore, Cas9 proteins with arrays of increasing numbers of SV40 NLSs on the N-terminus were designed, expressed and purified.
- N-terminal NLS-Cas9 fusions are referred to as 0×, 1×, 2×, 4× and 7×NLS-Cas9. These were made with and without superfolder (sf) GFP (Pédelacq et al. 2005, Nat. Biotechnol. 24:79-88) fused on the C-terminus of Cas9 (FIG. 13A).
- RNP complexes were prepared and added directly to the media of tdTomato NPC cultures. Site-specific genome editing was monitored by observing activation of tdTomato reporter expression after three days of cell outgrowth.
- tdTomato expression was not detected in control experiments using a non-targeting sgRNA (target sequence is Gal4), indicating that genome editing is sequence specific.
- PCR analysis of the tdTomato locus in these cells confirmed the expected deletion edits and showed higher total deletion efficiency than reported by tdTomato fluorescent protein expression; at 100 pmol RNP, 0×NLS-Cas9 was below detection levels, whereas 4×NLS-Cas9 yielded 12% total deletion efficiency (FIG. 13C).
- FIG. 13A-13D Direct Delivery of Cell Penetrating Cas9 RNPs Increases Editing Efficiency In Vitro.
- FIG. 13A 1-7× N-terminal NLS-Cas9 design (SV40 nuclear localization sequence a.a. PKKKRKV; SEQ ID NO:1090).
- FIG. 13B Direct incubation of 1-7×NLS-Cas9 in NPCs led to expression of tdTomato in genome-edited NPC cells.
- Cas9-RNP nucleofection efficiency compared with direct-delivery efficiency. 4×NLS-Cas9 RNP complexes yield deletion edits (lower 2 bands) while 0×NLS-Cas9 RNP complexes do not.
- FIG. 13D Cas9 RNP complexes targeting d2EGFP are mixed with Lipofectamine2000 and delivered to HEK293T-d2EGFP cells. Percent editing is measured by EGFP gene disruption by FACS analysis.
- the cell-penetration step was bypassed by mixing the RNPs with Lipofectamine2000 to trigger delivery into the cytoplasm by cell membrane fusion (Zuris J A, Thompson D B, Shu Y, Guilinger J P, Bessen J L, Hu J H, Maeder M L, Joung J K, Chen Z-Y, Liu D R. 2014. Cationic lipid-mediated delivery of proteins enables efficient protein-based genome editing in vitro and in vivo. Nat Biotechnol.).
- FIG. 14A-14C Injection of Cas9 RNP into Multiple Brain Regions in Adult Mice Results in Precise and Programmable Genome-Editing.
- FIG. 14A Dots indicate stereotaxic injection sites on coronal cartoons of mouse brain.
- Male mice are 14-15 weeks old. Brains are analyzed 12-14 days post injection. 50 μm thick floating sections, 1 section every 300 μm analyzed.
- Cas9 RNP components are: 0×NLS-Cas9-2×NLS-sfGFP or 4×NLS-Cas9-2×NLS-sfGFP.
- FIG. 14B Quantification of genome-edited tdTomato+ cells/pmol RNP delivered (Cas9 construct; sgRNA). 4×NLS-Cas9 RNP complexes are significantly more efficient compared to 0×NLS-Cas9 RNP complexes for in vivo genome-editing in all brain regions tested. Sham (injection buffer only) and 4×NLS-Cas9; non-targeting RNP complexes do not activate tdTomato, indicating specificity of genome editing with Cas9 RNP complexes.
- FIG. 14C Representative pictures of brain sections with genome-edited cells expressing tdTomato and antibody staining against various marker proteins.
- Antibody staining of brain sections from 4×NLS-Cas9 RNP treated animals shows editing in neurons and not astrocytes.
- Confocal microscopy images used for qualitative analysis of tdTomato+ co-localization with marker proteins to identify edited cells.
- NeuN a neuronal specific nuclear protein in vertebrates.
- CTIP2 aka BCL11b, a transcription factor present in CA1 hippocampus and striatum neurons.
- DARPP-32 cAMP-regulated neuronal phosphoprotein, a marker of striatum medium spiny neurons.
- GFAP glial fibrillary acidic protein is an intermediate filament (IF) protein that is expressed by astrocytes and ependymal cells of the CNS.
- S100β a gene highly expressed in striatal astrocytes. Scale bar is 50 μm.
- the 4×NLS-Cas9 RNP complexes were significantly more efficient at editing cells in the adult brain than the 0×NLS-Cas9 RNP complexes.
- FIG. 15A-15C Bilateral Intrastriatal Injection Measurements of tdTomato+ Cell Volume and Density Indicates RNP Dose Dependent Increase in Edited Tissue Volume.
- FIG. 15A Shaded oval indicates region of tdTomato+ cells on sagittal and coronal cartoons of mouse brain. Dashed lines on sagittal section represent approximate positions of 50 μm coronal sections along the rostral-caudal axis.
- Dual color immunofluorescence was performed to identify the specific cell-types that are edited upon Cas9 RNP injection.
- tdTomato+ cells were also positive for the post-mitotic neuronal marker, NeuN (FIG. 14C).
- CTIP2 also known as BCL11b
- IBA-1 ionized calcium-binding adapter molecule 1, also known as Allograft inflammatory factor 1 (AIF-1)
- IBA-1 is useful as an indicator of activated microglia because 1) its levels increase and 2) the cytoplasmic staining pattern can be used to assess microglia morphological changes that occur upon activation, namely close association with neuron cell bodies (Chen Z. et al. 1AD. Microglial displacement of inhibitory synapses provides neuroprotection in the adult brain. Nature Communications 5: 1-12). Significant IBA-1 intensity differences or morphological differences were not observed between control and RNP injected brains ( FIG.
- FIG. 16A-16B Analysis of Innate Immune Response in Treated and Untreated Brains.
- FIG. 16A Morphological appearance of microglia in the mouse brain in Cas9 RNP complex treated and untreated mice (visualized in green by immunostaining with anti-IBA-1 antibody).
- Cas9 RNP treated mice microglia have small cell bodies and long and slender processes indicating they are not activated. When activated, microglia enlarge their cell bodies and thicken their processes, which closely enwrap neuronal cell bodies.
- To investigate whether an increased dose of 4×NLS-Cas9 RNP complexes in the dorsal striatum would improve genome editing efficiency, an intrastriatal injection dose course of 4, 15 and 30 pmol/0.5 μl injections was performed (FIG. 17A).
- RNP was prepared at 8 μM, 30 μM and 60 μM concentrations and injected into the dorsal striatum. Brains were harvested 14 days later, sectioned, DAPI counterstained, and tdTomato+ cells counted. An RNP dose dependency on total number of edited cells/injection was observed: 4 pmol (588±90), 15 pmol (1339±331), 30 pmol (2675±613) (FIGS. 17B, 17C).
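The stated stock concentrations follow directly from the doses and the 0.5 μl injection volume, because 1 pmol/μl equals 1 μM. A minimal Python check is shown below; the function name is illustrative.

```python
def rnp_concentration_um(dose_pmol: float, volume_ul: float) -> float:
    """Concentration in μM for a dose in pmol delivered in volume_ul microliters.

    1 pmol/μl == 1 μM, so no unit conversion factor is needed.
    """
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return dose_pmol / volume_ul


INJECTION_VOLUME_UL = 0.5
doses_pmol = [4, 15, 30]
concentrations = [rnp_concentration_um(d, INJECTION_VOLUME_UL) for d in doses_pmol]
for dose, conc in zip(doses_pmol, concentrations):
    print(f"{dose} pmol in {INJECTION_VOLUME_UL} μl = {conc:g} μM")
```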
- FIG. 17A-17F Increasing Dose of 4×NLS-Cas9 RNP Complexes Significantly Increases the Number of tdTomato+ Genome-Edited Cells in the Striatum.
- FIG. 17B tdTomato+ cells in single injection dose response: 4, 15, 30 pmol/0.5 μl injection. Confocal images of region of edited striatum tissue. Scale bar 100 μm.
- FIG. 17C Quantification of total # tdTomato+ edited striatal cells per injection site.
- FIG. 17D Quantification of tdTomato+ cells per pmol RNP delivered (1 edited cell/10 fmol RNP).
- FIG. 17E PCR analysis of gDNA isolated by LASER microdissection of 30 pmol RNP treated dorsal striatum. Tissue from three 50 μm sections, marked with an asterisk in FIG. 15, taken at 300 μm intervals spanning ~1 mm on the rostral-caudal axis of tdTomato+ dorsal striatum were used. ~1 mm × 1.5 mm rectangles of tissue containing tdTomato+ cells were microdissected and pooled.
- FIG. 17D PCR analysis confirmed the expected genomic deletions in 7.5% of the alleles.
- n=6 for each group comprising 3 biological replicates with 2 technical replicates each.
- FIG. 17F Sanger sequencing of 88 clones isolated from top DNA band in FIG. 17E reveals 8.8% of alleles have small 1-2 bp indels at 1, 2, or 3 target sites.
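The per-dose efficiency summarized for FIG. 17D can be recomputed from the mean cell counts reported in the text (588, 1339 and 2675 edited cells at 4, 15 and 30 pmol). The sketch below is illustrative arithmetic only; it uses the reported means, ignores the reported errors, and the variable names are not from the source.

```python
# Mean edited cells per injection, as reported in the text, keyed by dose in pmol.
mean_edited_cells = {4: 588, 15: 1339, 30: 2675}

FMOL_PER_PMOL = 1000

efficiency = {}
for dose_pmol, cells in mean_edited_cells.items():
    cells_per_pmol = cells / dose_pmol
    fmol_per_cell = FMOL_PER_PMOL * dose_pmol / cells
    efficiency[dose_pmol] = (cells_per_pmol, fmol_per_cell)
    print(f"{dose_pmol} pmol: {cells_per_pmol:.1f} cells/pmol, "
          f"{fmol_per_cell:.1f} fmol RNP per edited cell")
```

At the two higher doses this works out to roughly 11 fmol of RNP per edited cell, which is the same order of magnitude as the figure of about 1 edited cell per 10 fmol RNP.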
- Cas9 RNP based systems have been shown to have significantly decreased off-target editing potential compared to genetically encoded systems.
- GUIDE-Seq technique was used to search for bona fide 0× and 4×NLS-Cas9 RNP on- and off-target sites. Because chromatin structure influences Cas9 targeting, GUIDE-Seq was performed in primary cortical neuron cultures isolated from post-natal day 0 (P0) tdTomato mice and nucleofected with 0× and 4×NLS-Cas9 RNPs. The target site was identified and no off-target editing was observed. Also no difference in fidelity of targeting was observed between the 0× and 4×NLS-Cas9 RNPs (FIG. 18).
- FIG. 18 GUIDE-Seq Analysis for Off-Target Editing and 0×NLS-Cas9 vs. 4×NLS-Cas9 Fidelity.
- GUIDE-Seq (Tsai et al. 2014, Nat. Biotechnol. 33:187-197) analysis in Ai9 tdTomato primary cortical neuron cultures isolated from postnatal day 0 brains and nucleofected with 0×NLS or 4×NLS-Cas9 RNPs complexed with sgRNAtdTom or sgRNA-nontargeting. Target site edits were identified but no off-target editing was observed. No difference in fidelity of targeting was observed between the 0× and 4×NLS-Cas9 RNPs.
- GUIDE-seq analysis was carried out for the eight different samples described in FIG. 18 . Sites that were present in both the sgRNA_tdTomato targeted cells and the sgRNA_Gal4 targeted cells were ignored, leaving only the tdTomato site.
- Gal4 target sequence SEQ ID NO:1333 and tdTomato target sequence—SEQ ID NO:1332.
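The filtering step described for the GUIDE-seq results, in which sites detected in both the tdTomato-targeted and the Gal4 non-targeting samples are ignored, is a set difference. The Python sketch below shows the idea with made-up site labels; it is not the GUIDE-seq package's own code, and the identifiers are hypothetical.

```python
def sample_specific_sites(targeted_sites, control_sites):
    """Return sites found in the targeted sample but not in the non-targeting control."""
    return set(targeted_sites) - set(control_sites)


# Hypothetical site labels for illustration only.
sites_sgrna_tdtomato = {"tdTomato_STOP_site", "background_site_A", "background_site_B"}
sites_sgrna_gal4 = {"background_site_A", "background_site_B"}

remaining = sample_specific_sites(sites_sgrna_tdtomato, sites_sgrna_gal4)
print(remaining)
```

Sites shared with the non-targeting control are treated as guide-independent background, so only sites unique to the targeted sample are kept for interpretation.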
TABLE 1 |
Table 1 |
from various species. The amino acids listed in Table 1 |
are from the Cas9 from S. pyogenes (SEQ ID NO: 5) |
Motif # | Motif | Amino acids (residue #s) | Highly conserved |
---|---|---|---|
1 | RuvC-like I | IGLDIGTNSVGWAVI (7-21) (SEQ ID NO: 1) | D10, G12, G17 |
2 | RuvC-like II | IVIEMARE (759-766) (SEQ ID NO: 2) | E762 |
3 | HNH-motif | DVDHIVPQSFLKDDSIDNKVLTRSDKN (837-863) (SEQ ID NO: 3) | H840, N854, |
4 | RuvC-like II | HHAHDAYL (982-989) (SEQ ID NO: 4) | H982, H983, A984, D986, A987 |
Claims (13)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/061,291 US11118194B2 (en) | 2015-12-18 | 2016-12-15 | Modified site-directed modifying polypeptides and methods of use thereof |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201562269755P | 2015-12-18 | 2015-12-18 | |
US16/061,291 US11118194B2 (en) | 2015-12-18 | 2016-12-15 | Modified site-directed modifying polypeptides and methods of use thereof |
PCT/US2016/067040 WO2017106569A1 (en) | 2015-12-18 | 2016-12-15 | Modified site-directed modifying polypeptides and methods of use thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
US20180363009A1 (en) | 2018-12-20 |
US11118194B2 (en) | 2021-09-14 |
Family
ID=59057888
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/061,291 Active 2038-02-02 US11118194B2 (en) | 2015-12-18 | 2016-12-15 | Modified site-directed modifying polypeptides and methods of use thereof |
Country Status (3)
Country | Link |
---|---|
US (1) | US11118194B2 (en) |
EP (1) | EP3390624A4 (en) |
WO (1) | WO2017106569A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022256448A2 (en) | 2021-06-01 | 2022-12-08 | Artisan Development Labs, Inc. | Compositions and methods for targeting, editing, or modifying genes |
WO2022266538A2 (en) | 2021-06-18 | 2022-12-22 | Artisan Development Labs, Inc. | Compositions and methods for targeting, editing or modifying human genes |
WO2023167882A1 (en) | 2022-03-01 | 2023-09-07 | Artisan Development Labs, Inc. | Composition and methods for transgene insertion |
WO2023225410A2 (en) | 2022-05-20 | 2023-11-23 | Artisan Development Labs, Inc. | Systems and methods for assessing risk of genome editing events |
Families Citing this family (42)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6261500B2 (en) | 2011-07-22 | 2018-01-17 | プレジデント アンド フェローズ オブ ハーバード カレッジ | Evaluation and improvement of nuclease cleavage specificity |
US20150044192A1 (en) | 2013-08-09 | 2015-02-12 | President And Fellows Of Harvard College | Methods for identifying a target site of a cas9 nuclease |
US9359599B2 (en) | 2013-08-22 | 2016-06-07 | President And Fellows Of Harvard College | Engineered transcription activator-like effector (TALE) domains and uses thereof |
US9340800B2 (en) | 2013-09-06 | 2016-05-17 | President And Fellows Of Harvard College | Extended DNA-sensing GRNAS |
US9526784B2 (en) | 2013-09-06 | 2016-12-27 | President And Fellows Of Harvard College | Delivery system for functional nucleases |
US9322037B2 (en) | 2013-09-06 | 2016-04-26 | President And Fellows Of Harvard College | Cas9-FokI fusion proteins and uses thereof |
US9840699B2 (en) | 2013-12-12 | 2017-12-12 | President And Fellows Of Harvard College | Methods for nucleic acid editing |
CA2956224A1 (en) | 2014-07-30 | 2016-02-11 | President And Fellows Of Harvard College | Cas9 proteins including ligand-dependent inteins |
EP3227447A4 (en) | 2014-12-03 | 2018-07-11 | Agilent Technologies, Inc. | Guide rna with chemical modifications |
KR20240038141A (en) | 2015-04-06 | 2024-03-22 | 더 보드 어브 트러스티스 어브 더 리랜드 스탠포드 주니어 유니버시티 | Chemically modified guide rnas for crispr/cas-mediated gene regulation |
CA2986310A1 (en) | 2015-05-11 | 2016-11-17 | Editas Medicine, Inc. | Optimized crispr/cas9 systems and methods for gene editing in stem cells |
US11911415B2 (en) | 2015-06-09 | 2024-02-27 | Editas Medicine, Inc. | CRISPR/Cas-related methods and compositions for improving transplantation |
WO2017062855A1 (en) | 2015-10-09 | 2017-04-13 | Monsanto Technology Llc | Novel rna-guided nucleases and uses thereof |
IL310721A (en) | 2015-10-23 | 2024-04-01 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
US10767175B2 (en) | 2016-06-08 | 2020-09-08 | Agilent Technologies, Inc. | High specificity genome editing using chemically modified guide RNAs |
CN109642231A (en) * | 2016-06-17 | 2019-04-16 | 博德研究所 | VI type CRISPR ortholog and system |
KR20230095129A (en) | 2016-08-03 | 2023-06-28 | 프레지던트 앤드 펠로우즈 오브 하바드 칼리지 | Adenosine nucleobase editors and uses thereof |
CN109804066A (en) | 2016-08-09 | 2019-05-24 | 哈佛大学的校长及成员们 | Programmable CAS9- recombination enzyme fusion proteins and application thereof |
US11542509B2 (en) | 2016-08-24 | 2023-01-03 | President And Fellows Of Harvard College | Incorporation of unnatural amino acids into proteins using base editing |
KR20240007715A (en) | 2016-10-14 | 2024-01-16 | 프레지던트 앤드 펠로우즈 오브 하바드 칼리지 | Aav delivery of nucleobase editors |
WO2018119359A1 (en) | 2016-12-23 | 2018-06-28 | President And Fellows Of Harvard College | Editing of ccr5 receptor gene to protect against hiv infection |
US11898179B2 (en) | 2017-03-09 | 2024-02-13 | President And Fellows Of Harvard College | Suppression of pain by gene editing |
WO2018165629A1 (en) | 2017-03-10 | 2018-09-13 | President And Fellows Of Harvard College | Cytosine to guanine base editor |
IL269458B2 (en) | 2017-03-23 | 2024-02-01 | Harvard College | Nucleobase editors comprising nucleic acid programmable dna binding proteins |
US11560566B2 (en) | 2017-05-12 | 2023-01-24 | President And Fellows Of Harvard College | Aptazyme-embedded guide RNAs for use with CRISPR-Cas9 in genome editing and transcriptional activation |
EP3652312A1 (en) | 2017-07-14 | 2020-05-20 | Editas Medicine, Inc. | Systems and methods for targeted integration and genome editing and detection thereof using integrated priming sites |
US11732274B2 (en) | 2017-07-28 | 2023-08-22 | President And Fellows Of Harvard College | Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE) |
EP3676376A2 (en) | 2017-08-30 | 2020-07-08 | President and Fellows of Harvard College | High efficiency base editors comprising gam |
US11795443B2 (en) | 2017-10-16 | 2023-10-24 | The Broad Institute, Inc. | Uses of adenosine base editors |
WO2019222545A1 (en) | 2018-05-16 | 2019-11-21 | Synthego Corporation | Methods and systems for guide rna design and use |
WO2020191248A1 (en) | 2019-03-19 | 2020-09-24 | The Broad Institute, Inc. | Method and compositions for editing nucleotide sequences |
CN114375334A (en) | 2019-06-07 | 2022-04-19 | 斯克里贝治疗公司 | Engineered CasX system |
WO2021067788A1 (en) | 2019-10-03 | 2021-04-08 | Artisan Development Labs, Inc. | Crispr systems with engineered dual guide nucleic acids |
JP2023525304A (en) | 2020-05-08 | 2023-06-15 | ザ ブロード インスティテュート,インコーポレーテッド | Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence |
EP4256054A1 (en) | 2020-12-03 | 2023-10-11 | Scribe Therapeutics Inc. | Engineered class 2 type v crispr systems |
WO2022204543A1 (en) * | 2021-03-25 | 2022-09-29 | The Regents Of The University Of California | Methods and materials for treating huntington's disease |
WO2022236147A1 (en) * | 2021-05-06 | 2022-11-10 | Artisan Development Labs, Inc. | Modified nucleases |
EP4347818A2 (en) | 2021-06-01 | 2024-04-10 | Arbor Biotechnologies, Inc. | Gene editing systems comprising a crispr nuclease and uses thereof |
CA3230927A1 (en) | 2021-09-10 | 2023-03-16 | Agilent Technologies, Inc. | Guide rnas with chemical modification for prime editing |
WO2023235818A2 (en) | 2022-06-02 | 2023-12-07 | Scribe Therapeutics Inc. | Engineered class 2 type v crispr systems |
WO2024047562A1 (en) | 2022-09-02 | 2024-03-07 | Janssen Biotech, Inc. | Materials and processes for bioengineering cellular hypoimmunogenicity |
WO2024062138A1 (en) | 2022-09-23 | 2024-03-28 | Mnemo Therapeutics | Immune cells comprising a modified suv39h1 gene |
2016
- 2016-12-15: US application US16/061,291 (US11118194B2), status: Active
- 2016-12-15: WO application PCT/US2016/067040 (WO2017106569A1), status: Application Filing
- 2016-12-15: EP application EP16876726.7A (EP3390624A4), status: Withdrawn
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100267928A1 (en) * | 2007-06-12 | 2010-10-21 | Stefan Heckl | Activatable diagnostic and therapeutic compound |
WO2013176772A1 (en) | 2012-05-25 | 2013-11-28 | The Regents Of The University Of California | Methods and compositions for rna-directed target dna modification and for rna-directed modulation of transcription |
WO2014093712A1 (en) | 2012-12-12 | 2014-06-19 | The Broad Institute, Inc. | Engineering of systems, methods and optimized guide compositions for sequence manipulation |
US20140227787A1 (en) | 2012-12-12 | 2014-08-14 | The Broad Institute, Inc. | Crispr-cas systems and methods for altering expression of gene products |
WO2014204725A1 (en) | 2013-06-17 | 2014-12-24 | The Broad Institute Inc. | Optimized crispr-cas double nickase systems, methods and compositions for sequence manipulation |
US9834791B2 (en) * | 2013-11-07 | 2017-12-05 | Editas Medicine, Inc. | CRISPR-related methods and compositions with governing gRNAS |
WO2015139139A1 (en) | 2014-03-20 | 2015-09-24 | UNIVERSITé LAVAL | Crispr-based methods and products for increasing frataxin levels and uses thereof |
US10323073B2 (en) * | 2014-03-20 | 2019-06-18 | UNIVERSITé LAVAL | CRISPR-based methods and products for increasing frataxin levels and uses thereof |
Non-Patent Citations (3)
Title |
---|
Hsu, et al.; "Development and Applications of CRISPR-Cas9 for Genome Engineering"; Cell; vol. 157, No. 6, pp. 1262-1278 (Jun. 5, 2014). |
Staahl, et al.; "Efficient genome editing in the mouse brain by local delivery of engineered Cas9 ribonucleoprotein complexes"; Nature Biotechnology; vol. 35, No. 5, 7 pages (May 2017). |
Zetsche, et al.; "Cpf1 is a single RNA-guided endonuclease of a Class 2 CRISPR-Cas system"; Cell; vol. 163, No. 3, pp. 759-771 (Oct. 22, 2015). |
Also Published As
Publication number | Publication date |
---|---|
EP3390624A1 (en) | 2018-10-24 |
EP3390624A4 (en) | 2019-07-10 |
US20180363009A1 (en) | 2018-12-20 |
WO2017106569A1 (en) | 2017-06-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11118194B2 (en) | Modified site-directed modifying polypeptides and methods of use thereof | |
US20220042047A1 (en) | Compositions and methods for modifying a target nucleic acid | |
US20230250423A1 (en) | Genome editing of human neural stem cells using nucleases | |
US20230124880A1 (en) | Guide scaffolds | |
Staahl et al. | Efficient genome editing in the mouse brain by local delivery of engineered Cas9 ribonucleoprotein complexes | |
US11530421B2 (en) | Self-inactivating endonuclease-encoding nucleic acids and methods of using the same | |
US11008555B2 (en) | Variant Cas9 polypeptides comprising internal insertions | |
EP3303634B1 (en) | Cas9 variants and methods of use thereof | |
EP3352795B1 (en) | Compositions and methods for target nucleic acid modification | |
US20200115688A1 (en) | Compositions and methods for enhancing genome editing | |
JP2021506251A (en) | New RNA programmable endonuclease system, as well as its use in genome editing and other applications | |
US20200291368A1 (en) | Improved CRISPR-Cpf1 Genome Editing Tool | |
US20190218261A1 (en) | Targeted enhanced dna demethylation | |
US20220315914A1 (en) | Variant type v crispr/cas effector polypeptides and methods of use thereof | |
US20200291369A1 (en) | Improved CRISPR-Cas9 Genome Editing Tool | |
KR20190089175A (en) | Compositions and methods for target nucleic acid modification | |
CN114040971A (en) | CRISPR-Cas effector polypeptides and methods of use thereof | |
WO2022197839A9 (en) | Crispr/cas effector-histone modifier fusion proteins and methods of use thereof | |
WO2023147240A2 (en) | Variant type v crispr/cas effector polypeptides and methods of use thereof | |
EP4041884A1 (en) | A nucleic acid delivery vector comprising a circular single stranded polynucleotide | |
CN113166753A (en) | Down-regulation of cytoplasmic DNA sensor pathway |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO SMALL (ORIGINAL EVENT CODE: SMAL); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | AS | Assignment | Owner name: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DOUDNA, JENNIFER A.;STAAHL, BRETT T.;SIGNING DATES FROM 20181205 TO 20190916;REEL/FRAME:054958/0172 Owner name: F. HOFFMANN-LA ROCHE AG, SWITZERLAND Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:GHOSH, ANIRVAN;REEL/FRAME:054958/0169 Effective date: 20190916 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: AWAITING TC RESP., ISSUE FEE NOT PAID |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: AWAITING TC RESP, ISSUE FEE PAYMENT VERIFIED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | CC | Certificate of correction | |