Design of the dumbbell probes and gRNAs
As a first miRNA target, we chose let-7a-5p. This miRNA is not among those found to be relevant as melanoma biomarkers (unlike other miRNAs of the let-7 family) [1, 2]; nonetheless, we considered it the best starting point for a proof of concept, because it had already been well characterized for Rolling Circle Amplification (RCA) by Deng et al. [3] and Qiu et al. [4].
Qiu et al. [4], as well as our colleagues from the related 2016 iGEM team of NUDT China, had designed their probes so that the amplicons would be recognized by a CRISPR-Cas9 system. Since our project uses CRISPR-Cas12a instead, we had to modify the sequences of our probes accordingly, even though the miRNA sequence is the same. More specifically, we had to adapt the PAM sequence (placed on the amplicon of the probe) to match our Cas protein (we worked with LbCpf1): while the requirement for Cas9 was NGG on the 3' of the amplicon, in our case we needed TTTN on the 5'. More details on the design are given in the section "Detailed design".
We wanted to test different probe designs: some were conceived to have the PAM at the beginning of the larger loop of the amplicon (as in the probes from NUDT China), but we also investigated the case where the PAM was placed on the double-stranded part (the stem) instead; the sequence of the unconstrained large loop was also varied among the probes.
We ordered 10 different probes; their sequences and related notes are given in the table below.
Note: The sequences of the probes include a phosphate group at the 5' end (required to ligate the probes). We nonetheless always ordered the oligonucleotides without the phosphate (because the cost was significantly lower) and then performed phosphorylation with T4 Polynucleotide Kinase prior to ligation.
For each probe we analyzed the secondary structure using online servers (NUPACK [5], MFold [6]): in all cases the structures of the probe, of its amplicon and of a series of 4-5 copies of the amplicon were tested to check for the absence of unwanted secondary structures. We also used RNAstructure DuplexFold [7] to test the secondary structure of the probe/miRNA dimer. We were not able to find a more suitable tool for the analysis of the duplex; nonetheless, we believe that this server, despite its limitations with respect to our analysis (it can model neither a circular probe nor a DNA/RNA dimer), was sufficient to show qualitatively the interaction between our probe and let-7a.
Detailed design:
Analysis of given probes
We started our design from the analysis of one probe from Qiu et al., namely "let-7a probe 1" (Probe 2 for us). The sequence was the following one:
5'-pACCTCATTGTATAGCCCCCCCCTGAGGTAGTAGGTTGCCCAACTATACAACCTACT-3'
where:
- the regions in italic are those belonging to the loops of the hairpin
- the regions in orange and green are those belonging to the stem of the hairpin (and which are complementary with each other)
- the underlined region is the one complementary to the miRNA (let-7a-5p: UGAGGUAGUAGGUUGUAUAGUU)
This probe consists of a double-stranded stem, a 10-base loop (which from now on we will refer to as the "small loop" - on the right in the figure above) and a 16-base loop (the "large loop" - on the left). As we can observe, the toehold region of the probe (i.e. the part of the small loop where the miRNA binds) is 7 bases long, in accordance with Deng et al., who showed this to be the optimal length for achieving both sensitivity and specificity.
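The layout described above can be checked with a short script. The following is only a minimal sketch: the slice boundaries are inferred from the loop lengths quoted in the text (16 and 10 bases) rather than read from the original colour-coded sequence, and the helper `reverse_complement` is a hypothetical name introduced for illustration.

```python
# Probe 2 ("let-7a probe 1" from Qiu et al.), written 5'->3' without the 5' phosphate.
PROBE = "ACCTCATTGTATAGCCCCCCCCTGAGGTAGTAGGTTGCCCAACTATACAACCTACT"
MIRNA = "UGAGGUAGUAGGUUGUAUAGUU"  # let-7a-5p


def reverse_complement(seq):
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}
    return "".join(pairs[base] for base in reversed(seq))


# Region boundaries (0-based slices). ASSUMPTION: inferred from the loop
# lengths given in the text, not taken from the original figure.
large_loop = PROBE[6:22]
stem_arm_1 = PROBE[22:37]
small_loop = PROBE[37:47]
# The second stem arm spans the ligation junction (3' end joined to 5' end).
stem_arm_2 = PROBE[47:] + PROBE[:6]

# miRNA-binding site on the circularized probe, read 5'->3' across the junction.
binding_site = PROBE[40:] + PROBE[:6]
# Part of the binding site that lies in the small loop (the toehold).
toehold = PROBE[40:47]

print(len(large_loop), len(small_loop), len(toehold))
print(stem_arm_2 == reverse_complement(stem_arm_1))
print(binding_site == reverse_complement(MIRNA))
```

Under these assumed boundaries, the script should report loop lengths of 16 and 10, a 7-base toehold, two complementary stem arms, and a binding site that is fully complementary to let-7a-5p.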
The amplicon of this probe is therefore*:
5'-AGTAGGTTGTATAGTTGGGCAACCTACTACCTCAGGGGGGGGCTATACAATGAGGT-3'
where:
the sequence in bold is the one complementary to the gRNA (except for two mismatches, which are highlighted) and the region in red is the PAM sequence (in this case single-stranded).
We emphasize here that the PAM sequence is on a single-stranded part of the amplicon (the one complementary to the large loop of the probe): therefore, such a single-stranded PAM can only be present on the amplicon, and not on the probe itself (as would be the case if the PAM were on a double-stranded part).
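The amplicon unit quoted above can be reproduced as the reverse complement of the probe. The sketch below assumes the large loop occupies positions 7-22 of the probe (an inference from the 16-base loop length stated earlier) and uses a hypothetical helper, `reverse_complement`, to show which stretch of the amplicon is complementary to that loop.

```python
PROBE = "ACCTCATTGTATAGCCCCCCCCTGAGGTAGTAGGTTGCCCAACTATACAACCTACT"
# Amplicon unit as written in the text (one copy only).
AMPLICON_UNIT = "AGTAGGTTGTATAGTTGGGCAACCTACTACCTCAGGGGGGGGCTATACAATGAGGT"


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(dna))


# One unit of the RCA product is the reverse complement of the probe.
unit = reverse_complement(PROBE)
print(unit == AMPLICON_UNIT)

# ASSUMPTION: the probe's large loop is PROBE[6:22] (16 bases).
# Its complement is the region of the amplicon that stays single-stranded.
large_loop_in_amplicon = reverse_complement(PROBE[6:22])
print(large_loop_in_amplicon)
print(unit.find(large_loop_in_amplicon))
```

The last line prints the 0-based position of the loop complement inside the amplicon unit, which is where a loop-located PAM would have to sit.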
The gRNA sequence (as indicated by Qiu et al.) is:
5'-ACUGUACAAACUACU|ACCUCA(GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCG)-3'
with the scaffold region indicated in parentheses. The region outside the parentheses is the spacer, which binds to the amplicon, and the sequence in italic corresponds to the part of the spacer binding to the loop of the amplicon (with the rest of the spacer binding to the stem). The sign | marks where the gRNA meets the point on the amplicon at which each new "copy" of the amplicon is considered to start (i.e. the point where the 3' end of one "subunit" of the amplicon is linked to the 5' end of the next subunit).
More specifically, we can notice that in this design the spacer coincides with the reverse complement of let-7a, with the exception of the two mismatches and of a missing A at the beginning. The template of the gRNA for Cas9 would therefore be:
5'-[reverse complement of miRNA]-[scaffold]-3'
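The relationship between the spacer and the miRNA can be verified directly. The following is a small illustrative sketch (the helper `reverse_complement_rna` is a hypothetical name); it compares the spacer reported by Qiu et al. with the reverse complement of let-7a-5p after dropping the leading A, and lists the positions where the two differ.

```python
MIRNA = "UGAGGUAGUAGGUUGUAUAGUU"   # let-7a-5p
SPACER = "ACUGUACAAACUACUACCUCA"   # spacer from Qiu et al., without the scaffold


def reverse_complement_rna(rna):
    """Return the RNA reverse complement of an RNA sequence."""
    pairs = {"A": "U", "C": "G", "G": "C", "U": "A"}
    return "".join(pairs[base] for base in reversed(rna))


rc = reverse_complement_rna(MIRNA)
trimmed = rc[1:]  # the spacer lacks the first A of the reverse complement

# Collect (0-based position, expected base, base found in the spacer).
mismatches = [
    (i, expected, found)
    for i, (expected, found) in enumerate(zip(trimmed, SPACER))
    if expected != found
]

print(rc)
print(len(trimmed) == len(SPACER))
print(mismatches)
```

The comparison yields exactly two differences, at 0-based spacer positions 3 (A replaced by G) and 9 (C replaced by A), consistent with the two mismatches mentioned in the text.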
The expected interaction between amplicon and gRNA is outlined in the figure below:
We can observe how the PAM sequence (in red in the figure) is located at the very beginning of the large loop in the amplicon, whereas the gRNA binds to the whole stem part and partially to the small loop.
* Here and hereafter, when referring to the "amplicon sequence", we show only one single copy of the reverse complement of the probe. The actual amplicon, by definition of Rolling Circle Amplification, is of course made up of sequential copies of this "unitary" sequence.
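Because the spacer binds across the point where two consecutive copies meet, its full binding site exists only in the tandem product and not in a single unit. The sketch below illustrates this under one simplifying assumption: it searches for the site that a mismatch-free spacer would pair with, i.e. the DNA version of let-7a-5p without its last base.

```python
AMPLICON_UNIT = "AGTAGGTTGTATAGTTGGGCAACCTACTACCTCAGGGGGGGGCTATACAATGAGGT"
MIRNA = "UGAGGUAGUAGGUUGUAUAGUU"  # let-7a-5p

# ASSUMPTION: ideal (mismatch-free) 21-base binding site on the amplicon,
# i.e. the miRNA sequence written as DNA, minus its last base.
site = MIRNA[:-1].replace("U", "T")

single_copy = AMPLICON_UNIT
tandem = AMPLICON_UNIT * 3  # three sequential copies, as produced by RCA

pos_single = single_copy.find(site)  # -1 means "not found"
pos_tandem = tandem.find(site)

print(pos_single)
print(pos_tandem)
# Number of bases of the site that lie before the junction between copies.
print(len(AMPLICON_UNIT) - pos_tandem)
```

In the single copy the site is absent, whereas in the tandem product it starts 6 bases before the first junction, matching the 6 bases written after the | sign in the gRNA spacer.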