CRISPR
CRISPR/Cas9 Use Overview
The goal of using CRISPR/Cas9 in our project is to knock a Six/FRT-site landing pad into the eukaryotic host genome, which can then be targeted by FLP recombinase to integrate additional DNA in a site-specific manner. To this end, we first compared the efficiency of several sgRNA candidates to find the optimum guide RNA sequence. We also compared the effectiveness of wild-type CRISPR/Cas9 with that of one of its common mutant variants, Cas9 nickase. Lastly, CRISPR/Cas9 was used for its intended purpose: knocking a Six/FRT recognition site into the HEK293T genome to facilitate Flp recombination.
What is CRISPR/Cas9?
CRISPR/Cas9 (Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated protein 9) is a targeted nuclease system used to induce breaks at specified regions in DNA. It can be thought of as a pair of molecular scissors, cutting DNA at a pre-determined site in the genome (Redman et al, 2016). The system relies on both a protospacer adjacent motif (PAM) sequence in the target DNA and a single guide RNA (sgRNA) to bind DNA and cut it in a targeted fashion. Wild-type Cas9 cuts both strands of DNA, whereas Cas9 nickase cuts only a single strand, so two molecules of nickase and a pair of sgRNAs are required to induce a full double-strand cut (Chiang et al, 2016). These cuts are then repaired by mechanisms which occur naturally in cells. The use of wild-type Cas9 favours repair via non-homologous end joining (NHEJ), while nickase favours homology directed repair (HDR). NHEJ is error prone and is often used to introduce small insertions or deletions into the target site; however, wild-type Cas9 also harbours higher off-target activity. HDR relies on the presence of homologous DNA sequences, which are used as a template for DNA repair. To take advantage of this, a DNA template with homologous “arms” flanking a desired insert can be provided, and the insert is then copied into the DNA cut site through HDR. The paired-nickase approach reduces the risk of off-target integration; however, it also drastically reduces cut efficiency, because two adjacent sites must be cut instead of just one.
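As a rough illustration of how a PAM and a guide sequence together define a target, the sketch below scans one strand of a DNA string for 20-nt protospacers followed by an NGG PAM. The NGG motif and the cut position 3 bp upstream of the PAM are assumptions that hold for the commonly used Streptococcus pyogenes Cas9; the function name and the example sequence are hypothetical and are not taken from our project.

```python
import re

def find_target_sites(dna, guide_len=20):
    """Return a list of (start, protospacer, pam, cut_index) on the given strand.

    Assumes an SpCas9-style NGG PAM immediately 3' of the protospacer.
    """
    dna = dna.upper()
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    sites = []
    for match in pattern.finditer(dna):
        start = match.start()
        # Assumed blunt cut 3 bp upstream of the PAM,
        # i.e. between index cut_index - 1 and cut_index.
        cut_index = start + guide_len - 3
        sites.append((start, match.group(1), match.group(2), cut_index))
    return sites

# Made-up sequence for illustration only.
example = "ACGTACGTACGTACGTACGTAGGG"
for site in find_target_sites(example):
    print(site)
```

Only the strand that is passed in is searched; a fuller search would also scan the reverse complement.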
sgRNA Candidate Testing
CRISPR systems rely on two key components in order to execute their nuclease activity: the Cas9 protein, which is responsible for the cleavage of the targeted DNA, and the sgRNA, which serves as a guide to direct the Cas9 protein to the desired site in the genome. The sgRNA can itself be broken down into two components. The first component is known as trans-activating CRISPR RNA (tracrRNA), and the sequence for this component is consistent within a species regardless of the genomic target. The purpose of tracrRNA is to bind and activate the Cas9 protein complex, as well as to bind CRISPR RNA (crRNA), the second building block of sgRNA. The crRNA is the region which shares homology with the desired genomic target site, and thus serves as the customisable guide piece which can be changed to alter the site which Cas9 will cleave. Commercially, tracrRNA and crRNA are often available as a single, pre-complexed sgRNA rather than as individual components.
However, not all sgRNAs are equivalent with respect to their targeting effectiveness. Different sgRNA sequences can yield significantly different results (Doench et al, 2016; Gagnon et al, 2014; Xu et al, 2015). Additionally, features of the target DNA such as chromatin structure, position within a nucleosome, or hairpin formation can alter the efficiency of Cas9 nuclease activity or of sgRNA binding to the target site. We therefore decided to test a variety of sgRNA candidates to compare their relative targeting efficiency and resultant cut activity. These data would identify the optimum site to target for our later gene knock-ins.
Seven sites were selected from two main loci, the AAVS1 locus and the CCR5 locus. The sgRNA candidates associated with each site were cloned into the pU6-Cas9 vector available from Addgene. The vectors were then amplified in DH5α E. coli, purified by plasmid miniprep, and transfected into our HEK293T cells using Lonza nucleofection. After 48 hours of growth, the transfected cells were lysed and their genomic DNA was recovered. PCR was then performed to amplify the region of DNA targeted by CRISPR/Cas9, and the amplified region was sequenced using Sanger sequencing.
CRISPR/Cas9 cleavage without a provided repair template often results in insertions or deletions (indels) due to the error-prone non-homologous end joining DNA repair pathway. Thus, indel formation was used as a surrogate measure of relative sgRNA effectiveness. Sequenced DNA was analyzed using TIDE, a software tool which compares the chromatogram of a control sequence to that of CRISPR-targeted DNA and reports the rate of insertions and deletions.
The following charts show the indel rates for each of our sgRNA candidates; the Y axis shows the percentage of DNA in each category, and the X axis shows a spectrum ranging from a deletion of 10 base pairs to an insertion of 10 base pairs. Unmodified DNA appears at 0 on the X axis. A candidate with higher targeting effectiveness results in more cleavage, and therefore more indel formation: the more indels formed in our assay, the more effective the sgRNA candidate.
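To make the readout concrete, here is a minimal sketch of how a shift-size distribution of the kind described above collapses into a single indel rate. The input format (a mapping from shift size in base pairs to percentage of sequences) and the numbers are made up for illustration; this is not TIDE's actual output format or algorithm.

```python
def summarize_indels(shift_percent):
    """Collapse a {shift_in_bp: percent_of_sequences} mapping into summary rates."""
    # Negative shifts are deletions, positive shifts are insertions.
    deletions = sum(p for shift, p in shift_percent.items() if shift < 0)
    insertions = sum(p for shift, p in shift_percent.items() if shift > 0)
    return {
        "unmodified": shift_percent.get(0, 0.0),  # shift of 0 = unmodified DNA
        "deletions": deletions,
        "insertions": insertions,
        "total_indel": deletions + insertions,
    }

# Hypothetical distribution, not measured data.
example = {-3: 2.0, -1: 5.0, 0: 90.0, 1: 3.0}
print(summarize_indels(example))
```

The "total_indel" value corresponds to the single percentage quoted per sgRNA candidate in this section.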
From our tested candidates, AAVS1-27L appeared by far the most effective sgRNA, with an indel formation rate of 94.6%. However, based on the relative values of the other sgRNA candidates and the sheer improbability of a 94.6% indel rate, this is likely an artifact of our indel analysis rather than a true measurement. The software relies on chromatogram data that reflect a mixture of DNA reads from a variety of amplicons; if a single amplicon was exclusively amplified or used in the sequencing reaction, it would produce the extreme skew seen here. The fact that all measured “indels” consist of only a single-base-pair deletion further supports the hypothesis that these data come from a single amplicon and are therefore inaccurate. This type of error is an inherent limitation of the method, and could also contribute to some of the extremely low indel rates for the same reason. Simply put, whenever it appears that only a single amplicon was read, a read monopoly is possible.
Beyond the aberrant AAVS1-27L read, the rest of the data appeared consistent. With that outlier omitted, the CCR5-17L sgRNA showed the highest indel formation rate, making it the most successful sgRNA candidate. Below are the sequences of each tested guide RNA, along with their indel formation efficiency as per our measurements:
- AAVS1-12L: 1.0% Sequence: GGGAACGGGATGAACTCGGC
- AAVS1-12R: 7.9% Sequence: AGTTGTCATGGCGATAGGGG
- AAVS1-27L: 94.6% Sequence: CCTAGCCACTAAGGCAATTG
- AAVS1-27R: 1.0% Sequence: AGGGTACCAGCCTCACCAAG
- CCR5-15L: 5.1% Sequence: CAAGAAGTTGTGTCTAAGTC
- CCR5-15R: 1.1% Sequence: TCTTTTTCCTCCAGACAAGA
- CCR5-17L: 11.1% Sequence: TCTAGTGGACAGGGAAGCTA
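The ranking described above can be reproduced in a few lines of Python. The sequences and indel rates are copied from the list; the variable names, and the choice to exclude AAVS1-27L by hand as a suspected artifact rather than by a formal outlier test, are for illustration only.

```python
# name: (guide sequence, measured indel rate in percent), copied from the list above
candidates = {
    "AAVS1-12L": ("GGGAACGGGATGAACTCGGC", 1.0),
    "AAVS1-12R": ("AGTTGTCATGGCGATAGGGG", 7.9),
    "AAVS1-27L": ("CCTAGCCACTAAGGCAATTG", 94.6),
    "AAVS1-27R": ("AGGGTACCAGCCTCACCAAG", 1.0),
    "CCR5-15L": ("CAAGAAGTTGTGTCTAAGTC", 5.1),
    "CCR5-15R": ("TCTTTTTCCTCCAGACAAGA", 1.1),
    "CCR5-17L": ("TCTAGTGGACAGGGAAGCTA", 11.1),
}

# Reads flagged by hand as likely single-amplicon artifacts.
suspected_artifacts = {"AAVS1-27L"}

# Sort the remaining candidates from highest to lowest indel rate.
ranked = sorted(
    ((name, rate) for name, (_seq, rate) in candidates.items()
     if name not in suspected_artifacts),
    key=lambda item: item[1],
    reverse=True,
)

for name, rate in ranked:
    print(f"{name}: {rate}%")
```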
CRISPR Cas9: Wild Type Versus Nickase
After our sgRNA comparisons, we wanted to test the effectiveness of wild-type Cas9 compared to the nickase variant. As mentioned previously, nickase is a mutant form of the standard Cas9 protein which cuts only a single strand, so two adjacent nicks are required to induce a full cut. This restriction results in lower off-target activity, at the cost of reduced cut efficiency compared to wild type. While we would prefer to use nickase due to the inherent safety associated with lower off-target activity, we worried that the efficiency of the nickase system would be too low to reliably obtain results for our system. We therefore ran a test comparing the insertion efficiency of wild-type Cas9 and nickase using an identical insertion template. The inserted sequence included a single FRT site (48 bp) flanked by homologous arms of 35 bp each.
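A minimal sketch of how such an insertion template can be laid out, assuming the homology arms are taken from the sequence immediately on either side of the intended cut site. The genomic sequence, the cut position, and the run of N characters standing in for the 48 bp FRT site are placeholders, not our actual sequences.

```python
ARM_LEN = 35      # homology arm length used in our test, in bp
INSERT_LEN = 48   # length of a single FRT site, in bp

def build_donor(genome, cut_index, insert, arm_len=ARM_LEN):
    """Return left arm + insert + right arm for a cut between cut_index - 1 and cut_index."""
    if cut_index < arm_len or cut_index + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the homology arms")
    left_arm = genome[cut_index - arm_len:cut_index]
    right_arm = genome[cut_index:cut_index + arm_len]
    return left_arm + insert + right_arm

# Placeholder inputs for illustration only.
genome = "ACGT" * 30                 # 120 bp of made-up sequence
frt_placeholder = "N" * INSERT_LEN   # stands in for the real FRT sequence
donor = build_donor(genome, cut_index=60, insert=frt_placeholder)
print(len(donor))  # 35 + 48 + 35 = 118
```

With arms this short the whole template is only 118 nt long.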
Transfected cells were isolated, and their DNA was purified and sequenced. Sequence and chromatogram data were then analyzed, and indel formation was measured using TIDE to confirm that Cas9 was cutting as intended. Unfortunately, we did not detect DNA knock-ins in either the wild-type or the nickase assay. Indel activity, however, was found in both assays, with a frequency of 5.1% for nickase and 7.3% for wild type. This suggests that the issue with our CRISPR system did not lie with the sgRNAs or endonuclease activity, but with the integration of the desired insert through homology directed repair. Our sequencing did not cover all transfected cells, so we cannot conclusively state that no knock-ins occurred whatsoever; however, the knock-in rate was low enough to be missed entirely in the sample we tested. This result illustrates the very low frequency of DNA insertion when using CRISPR, an inherent drawback of the system. We therefore decided to use wild-type CRISPR/Cas9 for the remainder of our assays: if the frequency of insertions was as low as our assays indicated, we could not risk lowering it any further by reducing cut efficiency with nickase.
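The sampling argument can be made quantitative with a deliberately simplified model: if each sequenced template molecule independently carries the knock-in with probability f, the chance that none of n templates carries it is (1 - f)^n. The values of f and n below are hypothetical, and the independence assumption ignores PCR bias and the detection limits of Sanger sequencing.

```python
import math

def prob_missing_all(knockin_freq, n_templates):
    """Probability that none of n independently sampled templates carries the knock-in."""
    return (1.0 - knockin_freq) ** n_templates

def templates_needed(knockin_freq, detect_prob=0.95):
    """Smallest n giving at least detect_prob chance of sampling one knock-in."""
    return math.ceil(math.log(1.0 - detect_prob) / math.log(1.0 - knockin_freq))

# Hypothetical numbers for illustration only.
print(prob_missing_all(0.01, 100))
print(templates_needed(0.01))
```

Under these assumptions, a 1% knock-in frequency would go unsampled in 100 templates roughly 37% of the time, and about 299 templates would be needed for a 95% chance of sampling at least one knock-in.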
CRISPR/Cas9 Knock-In of our Six/FRT Recombination Site/Beta Resolvase Half-Site
With testing out of the way, we attempted to use CRISPR/Cas9 to knock in the Six/FRT site necessary for Flp recombination in the second step of our project. This insertion provides a single Six/FRT site which FlpO recombinase can use to insert a plasmid containing the same Six/FRT sequence. This recombination event is elaborated on in the recombination section of the wiki, as well as in the general overview.
HEK293T cells were transfected with four different components integral to the function of our system:
- Cas9 Ribonucleoprotein (RNP) - Cas9 protein which is complexed with the desired crRNA and tracrRNA prior to transfection, to avoid intracellular association difficulties
- DNA Template for CRISPR knock-in - A single-stranded DNA template which contained our desired insertion, the Six/FRT resolvase sites, as well as homologous arms to promote insertion into the genome during HDR
- pCAG-FlpO - A plasmid from Addgene which codes for the protein FlpO, unnecessary for CRISPR activity but critical for the recombination component down the road
- FlpO-Beta Resolvase Hybrid Plasmid - A plasmid coding for beta resolvase and containing a single Six/FRT site, which causes it to integrate into the genome due to FlpO activity. Once again, this component is irrelevant for CRISPR function, but important for recombination later.
Alongside the transfection of the above components, a control was performed by transfecting everything except FlpO, which would permit CRISPR integration but not the secondary recombination step. A no-transfection control was also tested. Transfected cells were isolated, and their DNA was purified and sequenced. The sequence and chromatogram data were analyzed for targeted insertion activity, and TIDE was used to analyze indels as a general measure of CRISPR/Cas9 activity at our target site. Once again, our analysis did not show any of our desired CRISPR insertions in either the full assay or the no-FlpO control. However, both analyses showed significant CRISPR activity in the form of indels, with 26.2% indel activity in both trials. This result reinforces the idea that our CRISPR system is cutting as intended, and that the issue lies in stimulating HDR to integrate our desired insertion upon repair.
As mentioned in the wild-type versus nickase comparison, our analysis methods did not survey the DNA of all transfected cells, only the portion which was sampled for amplification and sequencing. Furthermore, PCR can preferentially amplify the same fragment over and over again, so the sequenced product may under-represent the full range of DNA present in the sample. As a result, we cannot be certain that no CRISPR/Cas9 knock-ins occurred across all of our cells' DNA, only that none were present in the DNA which was measured. Moving forward, we hope to repeat our assay and sequence DNA which contains our desired insert, in order to conclusively validate CRISPR knock-in.
WORKS CITED
Chiang, T.-W. W., le Sage, C., Larrieu, D., Demir, M., & Jackson, S. P. (2016). CRISPR-Cas9(D10A) nickase-based genotypic and phenotypic screening to enhance genome editing. Scientific Reports, 6, 24356.
Doench, J. G., Fusi, N., Sullender, M., Hegde, M., Vaimberg, E. W., Donovan, K. F., … Root, D. E. (2016). Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nature Biotechnology, 34(2), 184–191.
Gagnon, J. A., Valen, E., Thyme, S. B., Huang, P., Akhmetova, L., Pauli, A., … Schier, A. F. (2014). Efficient Mutagenesis by Cas9 Protein-Mediated Oligonucleotide Insertion and Large-Scale Assessment of Single-Guide RNAs. PLoS ONE, 9(5), e98186.
Redman, M., King, A., Watson, C., & King, D. (2016). What is CRISPR/Cas9? Archives of Disease in Childhood. Education and Practice Edition, 101(4), 213–215.
Xu, H., Xiao, T., Chen, C.-H., Li, W., Meyer, C. A., Wu, Q., … Liu, X. S. (2015). Sequence determinants of improved CRISPR sgRNA design. Genome Research, 25(8), 1147–1157.