DESIGN CONSIDERATIONS
Our goal was to design a method of chronological event recording that can achieve temporal resolution of stimuli using CRISPR/Cas9 base editing technology. Rather than behave like a “molecular camera” that shows the average concentration of a stimulus over a period of time, our design acts as a “molecular movie camera”, recording the dynamic behaviors of stimuli and the times they occurred.
To accomplish our system design for recording, we had two major considerations:
1. A mechanism to keep track of elapsed time in the system.
2. A method to mark the presence of a specific stimulus in line with the time point.
Once our system finishes recording a stimulus, there needs to be a way to tell when the stimulus occurred. Previous systems using CRISPR/Cas9 base editors to record stimuli relied on costly high-throughput sequencing, or on Sanger sequencing with an algorithm termed Sequalizer (Sequence Equalizer), to determine position-specific mutation frequencies [1][2]. Sequalizer was limited to analyzing mutation frequencies at only one DNA site, and it lacked a user-friendly installation process and interface. To make analyzing mutation frequencies easier and to reduce the time and resources needed, we focused on two aspects of the readout:
1. A method to determine the timing of stimuli through a restriction enzyme digest and an agarose gel.
2. A user-friendly software tool with multiple functions, including estimating nucleotide mutation frequencies and predicting off-target effects.
In the process of developing a molecular movie camera, we considered two different system designs:
1. Individual gRNA System
2. Oscillatory gRNA System
CUTSCENE: A Molecular Movie Camera
In both of our designs, the system can be thought of as a molecular movie camera.
Individual gRNA System
In our initial system for recording molecular events, there are two parts:
1. A writing plasmid with a base editor and two gRNAs.
2. Numerous recording plasmids with zero to five repeated target sequences.
At the start, the system is induced with aTc and IPTG, producing the CRISPR/Cas9 base editor and a gRNA. To keep track of time, the IPTG-induced gRNA, gRNA 1, locates the target sequence on all the recording DNA plasmids at the first time point. With this, the system is ready to begin recording.
Keeping Track of Elapsed Time
In this system, there are multiple recording plasmids with different numbers of repeating DNA sequences. The number of DNA repeats corresponds to the number of time points the recording plasmid can record. To keep track of time, gRNA 1 leads the base editor to make a mutation at the first repeating unit in all of the recording plasmids. As mentioned previously in the Description, the repeating units are designed such that a mutation made by the base editor at the current frame in the recording DNA allows gRNA 1 to recognize the next repeating frame.
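The frame-shifting clock described above can be sketched as a toy model. This is a minimal illustration, assuming each clock edit is all-or-nothing and always hits the first unedited frame; the function names and the boolean representation of frames are hypothetical and not part of our constructs.

```python
# Toy model of frame-based time tracking (all names are hypothetical).

def advance_clock(frames):
    """Edit the first unedited frame, which exposes the next one to gRNA 1.

    `frames` is a list of booleans: True means the frame has been edited.
    Returns the index of the frame that was edited, or None if the plasmid
    has no unedited frames left.
    """
    for i, edited in enumerate(frames):
        if not edited:
            frames[i] = True
            return i
    return None


def elapsed_time_points(frames):
    """Number of time points recorded so far = number of edited frames."""
    return sum(frames)


plasmid = [False] * 5            # a recording plasmid with five repeats
for _ in range(3):               # three clock edits by gRNA 1
    advance_clock(plasmid)

print(plasmid)                       # [True, True, True, False, False]
print(elapsed_time_points(plasmid))  # 3
```

In this picture, a plasmid with n repeats simply stops advancing after n edits, which is why plasmids with different repeat numbers cover different time points.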
Marking the Presence of a Stimulus
Once the stimulus of interest is in the system, gRNA 2 is produced because it is driven by an inducible promoter that detects the stimulus. In our design, the stimulus was arabinose. gRNA 2 recognizes the current frame in the recording plasmids, which corresponds to the current time, and marks the recording DNA with a special mutation designating when the stimulus occurred. This mutation is unique because it turns the corresponding recording DNA sequence into a known restriction enzyme site, MfeI (CAATTG).
The target sequence for gRNA 2 is located at the end of the repeating sequences, and an edit there marks the appearance of gRNA 2. However, if this target sequence were entirely different from the repeating unit, it would always be open to marking. Because it instead overlaps part of the repeating unit, the mutation made by the base editor/gRNA 1 complex is able to cap it, so the target sequence is open only for a short period corresponding to one time point.
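The way a single base edit can create the MfeI site can be sketched as follows. This is a minimal example, assuming a C-to-T edit by the cytidine deaminase base editor; the 10 bp frame sequence and the edited position are hypothetical and chosen only so that one edit yields CAATTG.

```python
MFEI_SITE = "CAATTG"  # MfeI recognition sequence, as given in the text


def c_to_t_edit(seq, pos):
    """Return `seq` with the C at 0-based position `pos` converted to T."""
    if seq[pos] != "C":
        raise ValueError(f"expected C at position {pos}, found {seq[pos]}")
    return seq[:pos] + "T" + seq[pos + 1:]


def has_mfei_site(seq):
    """True if the sequence contains the MfeI recognition site."""
    return MFEI_SITE in seq


# Hypothetical 10 bp frame: the embedded CAACTG becomes CAATTG after one edit.
frame = "GTCAACTGAC"
marked = c_to_t_edit(frame, 5)

print(has_mfei_site(frame))   # False
print(marked)                 # GTCAATTGAC
print(has_mfei_site(marked))  # True
```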
Readout
When considering a readout that would be cheaper and more accessible for researchers, we thought of a restriction enzyme digest and an agarose gel as a simple way to visualize the recorded information. Because the recording DNA units are only 10 bp long, we needed an output that could be produced by changing 1-2 base pairs in such a small window. Thus, we came up with the idea of having the base editor create a restriction enzyme site when the stimulus is detected. In the recording plasmid corresponding to the time point when the stimulus occurred, an MfeI site is created by the base editor. In addition, there is a second MfeI site elsewhere on the recording plasmid. If a stimulus occurred, the recording plasmid is therefore cut twice, releasing a DNA fragment that can be visualized on a gel.
Because the repeating units are only 10 bp long, we placed the other MfeI site at staggered locations to achieve better band resolution on a gel and to make the time points easier to distinguish. For example, in the 1-frame recording plasmid there is an MfeI site 100 bp upstream of the recording DNA sequence, in the 2-frame recording plasmid there is an MfeI site 200 bp upstream of the recording DNA sequence, and so on. After the recording process has occurred, the prepped DNA samples can be digested with MfeI and run on an agarose gel.
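The expected band pattern from this staggered layout can be estimated with a short calculation. This is a rough sketch under stated assumptions: the stimulus-created site is taken to sit at the end of the repeats, the exact cut position within each site is ignored, and the 3000 bp plasmid size is a hypothetical placeholder rather than the size of our plasmids.

```python
def expected_fragments(n_frames, plasmid_bp, marked=True,
                       spacing_bp=100, unit_bp=10):
    """Approximate MfeI digest fragment sizes (bp) for a circular plasmid.

    An unmarked plasmid has only the fixed MfeI site, so a single cut
    linearizes it. A marked plasmid is cut twice and releases a short
    fragment spanning the upstream spacer plus the repeats.
    """
    if not marked:
        return (plasmid_bp,)
    short = n_frames * (spacing_bp + unit_bp)
    if short >= plasmid_bp:
        raise ValueError("fragment cannot be longer than the plasmid")
    return (short, plasmid_bp - short)


PLASMID_BP = 3000  # hypothetical plasmid size

for n in range(1, 6):
    print(n, expected_fragments(n, PLASMID_BP))
# 1 (110, 2890)
# 2 (220, 2780)
# 3 (330, 2670)
# 4 (440, 2560)
# 5 (550, 2450)
```

Under these assumptions the short fragments differ by about 110 bp per frame, which is far easier to resolve on an agarose gel than the 10 bp difference between the repeats themselves.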
Modeling the Individual gRNA System
After completing the design for the individual gRNA system, we turned to modeling to see whether we could obtain a clear readout of base editing on the recording plasmids at each time point [3]. Since tracking time through the shifting of frames by base editing is an important part of our design, we wanted to see whether the frames could shift in a predictable manner. Ideally, the presence of a stimulus at a given time point would correspond to a distinct edit at only that time frame, and after that time point there would be no more editing at that frame. However, the modeling results show that a distinct readout is difficult to achieve. The model represents how much editing happens at each time frame over time. Instead of a digital signal for each time frame, the model shows no temporal resolution.
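The loss of temporal resolution can be illustrated with a much simpler calculation than the rule-based model cited above. The sketch below is an illustrative simplification, not the BioNetGen model: it assumes every frame is edited at the same constant first-order rate `k` and that frames are unlimited, so the number of frames edited by time `t` follows a Poisson distribution. The rate and time values are hypothetical.

```python
import math


def frame_occupancy(i, k, t):
    """Fraction of plasmids with exactly i frames edited at time t.

    Assumes sequential edits at a constant rate k, which gives a
    Poisson distribution with mean k * t.
    """
    return (k * t) ** i * math.exp(-k * t) / math.factorial(i)


k = 1.0  # hypothetical editing rate (edits per hour)
t = 3.0  # hypothetical elapsed time (hours)

for i in range(7):
    print(i, round(frame_occupancy(i, k, t), 3))
```

With these values no single frame holds more than about a quarter of the population, and the plasmids are spread over several neighboring frames. A population that started in sync therefore drifts apart, which is consistent with the smeared, non-digital readout described above.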
Oscillatory gRNA System
After modeling the individual gRNA system, we came up with another design to improve the temporal resolution of the readout. By using an oscillatory gRNA mechanism to keep track of elapsed time, we are able to obtain a more digital readout at each time frame. Compared to the individual gRNA system model, there are distinct peaks of editing that would allow us to match the editing at each time frame to the time at which it occurred.
In our oscillatory gRNA system, there are three parts:
1. Base editor plasmid
2. 4-gRNA plasmid
3. Recording plasmids with 4 repeated target sequences
Keeping Track of Elapsed Time
In comparison to the Individual gRNA system, where there is only one gRNA keeping track of the elapsed time before a stimulus occurs, the Oscillatory gRNA system has two gRNAs that keep track of time.
For our oscillatory gRNA system, we decided to use the blue light inducible promoter, PBLind-v1, and the blue light repressible promoter, PBLrep-v1. The inducible promoter has an EL222 binding region fused to the luxI promoter. In the off state, EL222 cannot bind to the DNA; blue light is necessary for the EL222 transcription factor to bind to an upstream binding region and recruit RNA polymerase, acting similarly to LuxR-based transcriptional activators. In the blue light repressible promoter, on the other hand, the EL222 binding region is located in the consensus regions of the RNA polymerase binding site. Since the promoter is constitutive in the absence of light, the presence of blue light causes EL222 to bind to the promoter area and repress transcription through steric hindrance [4].
In our system, gRNA A and gRNA B act as the oscillatory gRNAs. gRNA A is under the blue-light inducible promoter and gRNA B is under the blue-light repressible promoter. Under pulses of blue light, gRNA A and gRNA B are both able to switch on and off, exhibiting an oscillatory behavior.
Initially, when blue light is on, gRNA A leads the base editor to make a mutation at the first time frame. Again, due to the positioning and specific sequences of the repeating frames, the base edit made at the gRNA A target sequence now allows the edited sequence to be recognized by another gRNA-base editor complex. This is where the oscillatory nature of the mechanism comes in: gRNA B now matches the next repeated unit, and the base editor can make an edit at that frame. This process of tracking time continues, with gRNA A and gRNA B alternating.
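The alternation between gRNA A and gRNA B can be sketched as a simple discrete simulation. This is an idealized example, assuming that exactly one gRNA is present at a time, that editing completes within one step, and that frames alternate between gRNA A and gRNA B targets; the function names and the light schedule are hypothetical.

```python
def active_grna(light_on):
    """gRNA A is made under blue light, gRNA B in the dark."""
    return "A" if light_on else "B"


def run_clock(light_schedule, n_frames=4):
    """Return the number of edited frames after each step of the schedule.

    Frames alternate A, B, A, B. The clock advances only when the gRNA
    that is currently expressed matches the next unedited frame.
    """
    frame_targets = ["A" if i % 2 == 0 else "B" for i in range(n_frames)]
    current = 0
    history = []
    for light_on in light_schedule:
        if current < n_frames and frame_targets[current] == active_grna(light_on):
            current += 1
        history.append(current)
    return history


pulsed = [True, True, False, False, True, True, False, False]
print(run_clock(pulsed))      # [1, 1, 2, 2, 3, 3, 4, 4]
print(run_clock([True] * 8))  # [1, 1, 1, 1, 1, 1, 1, 1]
```

In this idealized picture the clock advances once per light switch and stalls under constant light, so the number of edited frames counts light transitions rather than drifting with the editing rate.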
Marking the Presence of a Stimulus
Similar to the design of the Individual gRNA system, in the presence of a desired stimulus a different gRNA is produced that matches the current frame in time and marks it with a special mutation, converting that location in the recording DNA into an MfeI restriction enzyme site. However, as there are now two gRNAs alternating to keep track of time, there are also two gRNAs, gRNA A Stimulus and gRNA B Stimulus, induced by the same stimulus, that can recognize the current frame.
Characterizing Base Editing
While designing our systems, we wanted to test a previously characterized base editor system, CAMERA, developed by Dr. Liu’s lab at Harvard University. This system consists of a high copy number recording plasmid and a low copy number writing plasmid carrying a base editor and a gRNA. These two plasmids, ordered from Addgene, were transformed into Escherichia coli S1030, also obtained from Addgene. The base editor and the gRNA were induced by aTc and IPTG, respectively. By replicating their experiments, we wanted to confirm the linearity of base editing over time and to test how high a mutation frequency we could achieve. Unlike the literature, we did not plan to use high-throughput sequencing to analyze the single-nucleotide mutations; instead, we would use Sanger sequencing and our software tool CrisPy to analyze the readout.
CrisPy
When working with CRISPR/Cas9 systems, it is important to consider off-target effects in order to improve the efficiency and accuracy of gene editing. Thus, we developed a software tool, CrisPy, to provide a fast and accurate way to characterize CRISPR on-target and off-target edits through Sanger sequencing. Software tools have attempted to numerically characterize DNA traces in the past. Mark Crowe’s SeqDoc (Perl, 2005) provided functionality for normalizing and comparing two traces, while Timothy Liu’s Sequalizer (MATLAB, 2017) used these comparisons to estimate nucleotide mutation frequencies. Our system builds upon this previous work by using modified versions of these algorithms in a free and concise Python module. Moreover, this module adds new functionality by predicting and ranking likely off-target sequences (using Gibbs free energy) and finding their mutation frequencies. We tested this tool using a plasmid containing deliberately designed off-target sequences and consequently gained insight into our recording circuit’s behavior. It is our hope that synthetic biologists can use this tool to quickly test for off-target mutation frequencies in their systems.
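The basic idea of estimating a mutation frequency from a Sanger trace can be sketched in a few lines. This is a deliberately simplified illustration and not CrisPy's actual API or algorithm: it assumes the four channel peak heights at the target position have already been normalized, and it skips the trace comparison steps taken from SeqDoc and Sequalizer. The function name and the peak values are hypothetical.

```python
def mutation_frequency(peaks, mutant_base="T"):
    """Estimate the edited fraction at one position from peak heights.

    `peaks` maps each base to its (normalized) peak height at the target
    position. The estimate is the mutant signal over the total signal.
    """
    total = sum(peaks.values())
    if total <= 0:
        raise ValueError("peak heights must sum to a positive value")
    return peaks[mutant_base] / total


# Hypothetical peak heights at a C-to-T target position.
peaks_at_target = {"A": 0.0, "C": 750.0, "G": 0.0, "T": 250.0}
print(mutation_frequency(peaks_at_target))  # 0.25
```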
Optimizing System
During a discussion with Dr. Seth Childers, the question of how we could increase the base editing rate came up. From his feedback and the parameters we found in the literature during modeling, we realized that the key limiting factor of the editing rate was not the single-nucleotide mutation catalyzed by cytidine deaminase, but the efficiency with which the CRISPR/Cas9 complex is released from the DNA. Moreover, we learned that replication of a plasmid would help release dCas9, since all bound proteins are released when the polymerase replicates the plasmid. Since shortening the time between dilutions increases the replication rate of plasmids, we wanted to see whether doing so could distinctly improve the editing rate.
Future Work
We will continue working on co-transforming the three plasmids of the Oscillatory gRNA System into Escherichia coli S1030. Once we succeed, we will switch blue light on and off to oscillate the two gRNAs and see whether we can keep track of elapsed time and mark the appearance of stimuli. Both Sanger sequencing and the designed restriction enzyme site method will be used so that the readouts can be compared. We may also compare base editor performance after changing dCas9 to nCas9. In addition, since we view our system as a universal diagnostic tool, we may swap in different promoters and test its ability to respond to various stimuli, such as pH and light.
References
[1] Tang, W. X., & Liu, D. R. (2018). Rewritable multi-event analog recording in bacterial and mammalian cells. Science, 360(6385), 169-+. doi:10.1126/science.aap8992
[2] Farzadfard, F., Gharaei, N., Higashikuni, Y., Jung, G., Cao, J., & Lu, T. K. (2018). Single-Nucleotide-Resolution Computing and Memory in Living Cells. bioRxiv.
[3] Harris, L. A. et al. (2016). BioNetGen 2.2: advances in rule-based modeling. Bioinformatics, 32, 3366–3368.
[4] Jayaraman, P., Devarajan, K., Chua, T. K., Zhang, H., Gunawan, E., & Poh, C. L. (2016). Blue light-mediated transcriptional activation and repression of gene expression in bacteria. Nucleic Acids Research, 44(14), 6994-7005.