Cells become cancerous because of changes in their genetic makeup. These same changes can result in proteins that are expressed on the cancerous cells but not on healthy cells. These proteins are called neoantigens: new cancer antigens that can signal the immune system to attack and eliminate the cancer.
A patient is diagnosed with a cancer tumor. A biopsy of the tumor and a biopsy of healthy tissue are acquired, and whole exome sequencing is performed on both. A bioinformatic tool (such as CAPOEIRA’s Ginga) then processes the whole exome sequences of the healthy and tumor biopsies to identify neoantigens.
A specific neoantigen that is expressed on tumor cells and not on healthy cells is supplied to the patient through a vaccine formulation. Dendritic cells of the patient take up the neoantigen from the vaccine formulation. Alongside the neoantigen, the vaccine formulation supplies an adjuvant that activates the dendritic cells to take up foreign material and perceive it as a danger signal.
The dendritic cell then processes the neoantigen and cross-presents it on MHC class I complexes on its surface, where naïve CD8+ T cells can recognize it. Once the naïve CD8+ T cells recognize the neoantigen, they mature into cytotoxic CD8+ T cells that specifically attack cells expressing this neoantigen; in this case, the tumor cells.
Rising Importance of Cancer Vaccination
The immunogenicity of neoantigens leading to T-cell activation has long been demonstrated in patients (Wolfel et al., 1995). In fact, preclinical and clinical data have already shown that neoantigen-specific cytotoxic T lymphocytes (CTLs) constitute the most potent T-cell populations for tumor rejection (Wolfel et al., 1995; Matsushita et al., 2012).
Still, the natural production of neoantigen-specific CTLs by a patient’s immune system is scarce because of low clonal frequency and ineffective presentation of neoantigens (Alexandrov et al., 2013; Zhu et al., 2017). Therefore, cancer vaccines or adjuvant cancer therapies (ACT) are crucial to potentiate immunity against neoantigens for cancer treatment. Hence, a large number of strategies have been developed for the creation, formulation and delivery of various cancer vaccines, for example whole tumor cell lysate, nucleotide (mRNA/DNA), protein or peptide-based vaccines, dendritic cell (DC) based vaccines, viral vectors and biomaterial-assisted vaccines.
However, it remains challenging to develop a universal and effective delivery strategy to target neoantigen-based vaccines to professional antigen-presenting cells (APCs) for eliciting robust and potent T-cell responses against cancer.
In general, parenterally injected soluble antigens or adjuvants rapidly spread into the systemic circulation making them ineffective due to their small molecular sizes, poor targeting, and rapid draining in lymph nodes (LNs). This ultimately results in a limited immune response (Liu et al., 2014; Fifis et al., 2004).
In addition, even if such soluble tumor neoantigens are acquired by DCs, they would be trapped in endolysosomal compartments and digested into peptides, which are subsequently loaded almost entirely onto MHC class II molecules for presentation solely to CD4+ helper T-cells. However, to achieve an effective immune response, a therapeutic cancer vaccine is expected to elicit robust cytotoxic CD8+ T-cell responses, which are essential for tumor cell destruction (Janssen et al., 2005).
Thus, it is also key for cancer vaccines to enable cytosolic delivery of neoantigens for a successful activation of cytotoxic T-cell mediated immunity. Effectively, having a platform for neoantigen delivery is favourable for vaccine delivery as it protects antigen and adjuvant molecules from degradation and clearing, enhances lymphoid organ targeting, and modulates APCs’ functions for better presentation (Amigorena et al., 2010).
Encapsulin Antigen Delivery
In 2016, an article was published by Sebyung Kang and colleagues describing the employment of the protein cage nanoparticles, Encapsulin (Encap), as neoantigenic peptide nanocarriers by genetically incorporating the OT-1 peptide of ovalbumin (OVA) protein (used as vaccine for B16-OVA melanoma tumor model) to three different positions of the Encap subunit (Choi et al., 2016). This article motivated us to look further into Encapsulin as a strong candidate for the vaccine platform.
In the mentioned study (Choi et al., 2016), DCs that were pulsed with constructs of OT1-Encap-C (C-terminal fusion with OT-1 peptide) induced OT-1-specific CD8+ T cell proliferation both in vivo and in vitro. This indicates Encapsulin’s ability to enhance the uptake of the OT-1 peptides by dendritic cells and the subsequent presentation of these peptides to CD8+ T cells.
OT1-Encap-C presentation to DCs was also able to induce the differentiation of functional effector CD8+ T cells in murine spleen. Finally, OT-1-Encap subcutaneous vaccinations in B16-OVA melanoma tumor bearing mice effectively activated OT-1 peptide specific cytotoxic CD8+ T cells before or even after tumor generation, resulting in significant suppression of tumor growth in prophylactic as well as therapeutic treatments.
Encapsulin was thus chosen as the platform for CAPOEIRA’s vaccine system, for multiple reasons:
Encapsulin was shown to have an effective activation of dendritic and T cells in vitro and in vivo
Encapsulin allows for the easy conjugation of libraries of neoantigen, as this can be realized through genetic ligation of the neoantigen oligonucleotide sequences to the C-terminus of Encapsulin
Encapsulin, along with the neoantigens, can be expressed in a rapid and straightforward manner using the cell free expression system
Such expression systems might help in reducing the cost of generating libraries of peptides by other technologies such as solid-phase peptide synthesis
Encapsulin (Figure 2) is a protein cage nanoparticle found in the thermophilic bacterium Thermotoga maritima.
Its crystal structure was solved and published in 2008 (Sutter et al., 2008). The Encapsulin multimer is assembled from 60 identical 31 kDa monomers and forms a thin icosahedral cage with T=1 symmetry, with interior and exterior diameters of 20 and 24 nm, respectively. The multimer self-assembles from the monomers once they are expressed, as assembly leads to a lower energy state. The C-terminus points outward, allowing for easy conjugation of peptides after the C-terminus (Moon et al., 2014).
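As a quick sanity check on the figures quoted above, the arithmetic can be sketched in Python. The variable names are ours, and the idea that every monomer carries exactly one C-terminal fusion is an assumption made for illustration, not a measured result.

```python
# Back-of-the-envelope numbers for the Encapsulin cage, using only the
# figures quoted in the text (Sutter et al., 2008).
N_MONOMERS = 60           # identical subunits per cage
MONOMER_KDA = 31          # mass of one monomer, in kDa
INNER_DIAMETER_NM = 20
OUTER_DIAMETER_NM = 24

cage_mass_kda = N_MONOMERS * MONOMER_KDA
shell_thickness_nm = (OUTER_DIAMETER_NM - INNER_DIAMETER_NM) / 2

# Assumption: one peptide fused per monomer C-terminus, so a fully
# assembled cage would display at most 60 copies of that peptide.
max_displayed_peptides = N_MONOMERS

print(f"Cage mass: {cage_mass_kda} kDa (about {cage_mass_kda / 1000:.2f} MDa)")
print(f"Shell thickness: {shell_thickness_nm} nm")
print(f"Upper bound on displayed peptides per cage: {max_displayed_peptides}")
```

Under these assumptions the unmodified cage weighs roughly 1.9 MDa and its shell is about 2 nm thick, which is consistent with the "thin" cage described above.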
The Encapsulin monomer was modified by inserting a Hexahistidine linker (GGGGGGHHHHHHGGGGG) between residues 43 and 44 of the WT Encapsulin (Moon et al., 2014). This was shown to convey exceptional heat stability and better hydrodynamic properties for the Encapsulin multimer. These properties are crucial to obtain a simpler and more efficient purification of the Encapsulin protein.
Vaccine Design Project
The vaccine design process aimed at establishing a platform that receives a library of neoantigens from Ginga, and outputs a library of vaccines that incorporate these neoantigens on the surface of Encapsulin (Figure 3).
A major requirement of a neoantigen vaccine is allowing for the facile and secure introduction of neoantigen libraries onto the scaffold/carrier. Using Encapsulin, one accessible method for such a conjugation would be the genetic ligation of the neoantigen oligonucleotide sequence to the C-terminus of Encapsulin, as depicted in Figure 4.
After acquiring the raw Encapsulin sequence from the LBNC lab at EPFL (Cassidy-Amstutz et al., 2016; Addgene Catalogue # 86405), we genetically introduced a HexaHistidine linker between Amino Acids 43 & 44 to create HexaHistidine Encapsulin, which was reported to have higher heat resistance and better hydrodynamic properties (Moon et al., 2014). This modification was done using a Golden Gate assembly with BsaI as a type IIS restriction enzyme. The insert was assembled from two synthesized oligos (60 bp each which partially anneal) with BsaI cut sites. The insert was converted to dsDNA using PCR. The Original Encapsulin plasmid was amplified using primers incorporating BsaI cut sites and the insert was incorporated using Golden Gate.
To obtain a rapid, efficient, and reliable incorporation of neoantigens onto the HexaHistidine Encapsulin platform, we designed the plasmid HexaHistidine Encapsulin-CBsaI (Figure 5) (Registry Part BBa_K2686005). Starting from the HexaHistidine Encapsulin plasmid, we introduced at the C-terminus an sfGFP CDS under its native promoter, flanked by two BsaI cut sites.
The BsaI cut sites would allow for the rapid, scarless introduction of oligonucleotides encoding for the neoantigens using Golden Gate Assembly (Figures 5 & 6). These neoantigens would be fused to the C-terminus of Encapsulin, and displayed on its outer surface. Such a system allows for a reliable, but fast expression of libraries of encapsulin-neoantigens.
The insert between the two BsaI cut sites, consisting of sfGFP with a native promoter and terminator, makes it possible to check whether the neoantigen was inserted successfully after transformation of cells with the Golden Gate product (green colonies do not contain the desired peptide insert, but the original plasmid instead). This cloning strategy was useful in the initial characterization of the system and in the production of the encapsulin fused with OT-1 peptide. For high-throughput production of encapsulin-neoantigen constructs, different strategies that avoid in vivo cloning steps could be envisioned.
We exploited the fact that Encapsulin is made of protein exclusively, and thus, can be fully expressed as a recombinant protein in a bacterial expression system. However, accelerating the pace of the vaccine production requires a new approach for the rapid expression of proteins encoded on plasmid/linear DNA constructs. Current standard bacterial expression systems require days due to cloning and in-vivo transformations.
This is why CAPOEIRA uses a cell free expression approach, which preserves the protein production capability and regulatory mechanisms of E. coli. Cell-free systems (Figure 7) use all of the inner workings of a cell without the constricting boundary of the cell wall, and thus without the precondition of keeping cells alive (Rollin et al., 2013). This speeds up the design-expression process. When preparing the cell-free systems, all genomic DNA and membranes are eliminated, resulting in a solution containing all of the cell’s proteins without the limiting factors of a living cell.
The cell free expression has 2 advantages for CAPOEIRA:
Faster expression of proteins from DNA constructs (8 to 10 hours of expression), allowing for fast and easy expression of libraries of proteins
Faster & Easier purification of protein products from cell free expression reactions compared to purification from cells
The combination of a protein whose high heat resistance is further improved by the His-tag modification with a cell free expression system allows for an efficient one-step heat purification of our vaccine product. In short, after the expression of the vaccine construct using the cell free expression system (which takes around 10 hours), heat purification of the sample goes as follows (Figure 8):
Heating at 70 ºC for 20 min
Putting on ice for 15 min
Centrifugation at 12,000 xg for 10 min
Separation of the supernatant (containing the purified vaccine construct) from the pellet
This simple heat purification step allows for an exceptional purity of CAPOEIRA’s vaccine system in less than an hour. After the heat purification step, the obtained purity might be very close to a final formulation for vaccine delivery.
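The "less than an hour" figure can be checked by adding up the timed steps listed above. The sketch below is only that sum; the step labels are ours, and the final supernatant separation is left out because the text gives no duration for it.

```python
# Durations of the timed heat-purification steps listed above, in minutes.
# The supernatant/pellet separation has no stated duration, so it is not
# included (assumption: it is short compared to the other steps).
HEAT_PURIFICATION_STEPS = [
    ("Heating at 70 C", 20),
    ("Cooling on ice", 15),
    ("Centrifugation at 12,000 x g", 10),
]


def total_minutes(steps):
    """Sum the duration (in minutes) of a list of (label, minutes) steps."""
    return sum(minutes for _, minutes in steps)


timed_total = total_minutes(HEAT_PURIFICATION_STEPS)
print(f"Timed steps add up to {timed_total} min")
```

The timed steps add up to 45 minutes, which leaves some margin for handling within the hour.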
Alexandrov, Ludmil B., et al. "Signatures of mutational processes in human cancer." Nature, 500.7463 (2013): 415.
Amigorena, Sebastian, and Ariel Savina. "Intracellular mechanisms of antigen cross presentation in dendritic cells." Current opinion in immunology, 22.1 (2010): 109-117.
Cassidy-Amstutz, Caleb, et al. "Identification of a minimal peptide tag for in vivo and in vitro loading of encapsulin." Biochemistry, 55.24 (2016): 3461-3468.
Choi, Bongseo, et al. "Effective delivery of antigen–encapsulin nanoparticle fusions to dendritic cells leads to antigen-specific cytotoxic T cell activation and tumor rejection." ACS nano, 10.8 (2016): 7339-7350.
Fifis, Theodora, et al. "Size-dependent immunogenicity: therapeutic and protective properties of nano-vaccines against tumors." The Journal of Immunology, 173.5 (2004): 3148-3154.
Janssen, Edith M., et al. "CD4+ T-cell help controls CD8+ T-cell memory via TRAIL-mediated activation-induced cell death." Nature, 434.7029 (2005): 88.
Liu, Haipeng, et al. "Structure-based programming of lymph-node targeting in molecular vaccines." Nature, 507.7493 (2014): 519.
Matsushita, Hirokazu, et al. "Cancer exome analysis reveals a T-cell-dependent mechanism of cancer immunoediting." Nature, 482.7385 (2012): 400.
Moon, Hyojin, et al. "Developing genetically engineered encapsulin protein cage nanoparticles as a targeted delivery nanoplatform." Biomacromolecules, 15.10 (2014): 3794-3801.
Rollin, Joseph A., Tsz Kin Tam, and Y-H. Percival Zhang. "New biotechnology paradigm: cell-free biosystems for biomanufacturing." Green chemistry, 15.7 (2013): 1708-1719.
Sutter, Markus, et al. "Structural basis of enzyme encapsulation into a bacterial nanocompartment." Nature structural & molecular biology, 15.9 (2008): 939.
Wolfel, Thomas, et al. "A p16INK4a-insensitive CDK4 mutant targeted by cytolytic T lymphocytes in a human melanoma." Science, 269.5228 (1995): 1281-1284.
Zhu, Guizhi, et al. "Efficient nanovaccine delivery in cancer immunotherapy." ACS nano, 11.3 (2017): 2387-2392.
Through our interviews with health specialists and oncology experts (more information in Integrated human practices) we assessed the necessity to have a non-invasive treatment companion to determine our vaccine efficacy. Here, we want to provide a proof-of-concept that would allow us to monitor the patient’s response by using the same set of identified neoantigens used for our vaccine.
We also believe that it is important to be able to detect relapses in early melanoma stages, as the survival rates for patients dramatically drop to 20% in stage IV compared to 99% survival rate in stage I and II (Siegel et al., 2018).
To answer these needs, we envision a new generation of diagnostic tools by which a liquid peripheral blood draw could give an accurate prognosis regarding the elimination of the tumor cells and, by targeting specific biomarkers, be a good predictor of relapse. This requires a detection system that is both highly sensitive and specific since single base pair polymorphisms, barely detectable in the blood, can lead to tumorigenesis.
Our idea is to develop a Cas12a detection system coupled to an amplification step. This detection system is rapid, sensitive and specific enough to reliably detect these biomarkers.
Recently, several studies have shown that non-invasive liquid biopsy methods are a promising way to detect cancer relapse and monitor tumor regression (Heitzer et al., 2017). Liquid biopsies represent a fast, reliable and easy way to obtain samples compared to the invasive nature of solid biopsies which are generally time-consuming, difficult to perform frequently and not without some risks to the patient.
Circulating free DNA (cfDNA) is a common term that refers to all the DNA fragments that are present in the blood. This fragmented DNA is thought to originate from apoptotic cells (Harris et al., 2016). In cancer patients the proportion of cfDNAs from necrotic tumor cells - known as “circulating tumor DNA” (ctDNA) - represents a large part of the circulating DNA.
These short DNA fragments of size ranging from 100bp to 200bp - with a peak at 145bp (Underhill et al., 2016) - contain virtually all the possible genetic defects that can be found in the original tumor cell population, including somatic point mutations and translocations (Harris et al., 2016; Calapre et al., 2017). Moreover, literature has shown that levels of ctDNA in the blood are correlated with progression or remission of disease in several cancers, including melanoma (Gray et al., 2015; Girotti et al., 2016; Tsao et al., 2015; Calapre et al., 2017).
Our goal in using ctDNA as a biomarker is to provide a personalized follow-up, and the personalization again comes from our bioinformatic software, Ginga. Indeed, Ginga takes as input the genetic sequence of the tumor and generates not only a list of neoantigens that will form the basis of our vaccine, but also a library of another type of molecular alteration specific to the tumor, namely chromosomal rearrangements, that we will target for relapse detection.
Our goal here is to detect the point-mutated sequences that code for the neoantigens we have selected for our vaccine. More precisely, we seek to quantify the presence of these sequences in the bloodstream through ctDNA. This gives us the opportunity to monitor tumor remission directly by studying the patient’s blood.
As part of any cancer therapy, there is always a need to be vigilant against any recurrence, since it can occur at any time: indeed, although the targeted cell populations have been eliminated, other cells may have survived and resurface after some time. To address this problem, we want to detect the sequences of the individualized junctions identified using our bioinformatic pipeline directly in the blood, using our CRISPR-Cas12a based assay. The detection of such sequences will alert the patient of a potential relapse and the need for a closer follow-up, which can have a lead time of up to 11 months in detecting relapses over clinical established methods in some types of cancers, according to Olsson et al., 2015.
Cancer relapse detection through miRNA
MicroRNAs (miRNAs) are short (18-24 nt) non-coding RNA molecules which act as post-transcriptional regulators of gene expression. Over the years, miRNAs have been proved to play a critical role in a variety of different diseases, including cancer (Larrea et al., 2016). Moreover, miRNAs are remarkably stable in human plasma (Mitchell et al., 2008), and several miRNAs circulating in the blood have recently been shown to be dysregulated (either over- or under-expressed) in patients with certain cancers, including melanoma, with respect to healthy subjects (Mirzaei et al., 2016). For these reasons, miRNAs have been proposed as potential prognostic and diagnostic biomarkers for melanoma, which makes them suitable candidates for the follow-up part of our project as well.
Previous iGEM teams (e.g. NUDT China 2016 team) have shown promising results with Rolling Circle Amplification of miRNAs by means of dumbbell-shaped probes (details in “Amplification”). Our aim is to investigate whether it is possible to combine this dumbbell probe design with a Cas12a system to achieve a sensitive and specific detection assay.
To answer the need for a fast and robust detection method we chose to work with the newly characterized Cas12a (Cpf1) protein.
CRISPR-Cas (clustered regularly interspaced short palindromic repeats–CRISPR-associated) systems derive from an antiviral defense mechanism used by prokaryotes, which works by recognizing and cleaving foreign DNA/RNA. In recent years they have been widely used as a gene editing tool for their ability to find and cut at a specific site, allowing the insertion of a desired sequence. This target sequence is what we call the activator.
In the case of Cas12a this activator is composed of two different strands: the target strand (TS) and the non-target strand (NTS). The NTS requires a T-rich protospacer adjacent motif (PAM) sequence whereas the TS contains the sequence we want to detect. Cas12a scans the PAM sequences in the genome and compares its loaded guide RNA (gRNA) with the adjacent target sequences. When Cas12a finds its target, it undergoes a conformational change and cleaves the activator: its double stranded DNA (dsDNA) target.
It is also worth mentioning that Cas12a proteins retain the capacity to recognize and cleave ssDNA without any PAM sequence.
As a result of its conformational change upon target recognition, Cas12a unleashes a non-specific endonuclease activity (i.e. collateral cleavage) virtually against any single stranded DNA (ssDNA). Each activated Cas12a protein can cleave huge numbers of ssDNA molecules, and this is what makes this system so suitable for detection, as it greatly amplifies the signal. As explained more in detail in “Fluorescent readout”, by coupling this property to a single-stranded FQ reporter, we can hugely increase even very small signals, which means higher sensitivity for this system.
In our assays we worked with the purified Lba-Cas12a (type V-A CRISPR) extracted from Lachnospiraceae bacterium ND2006 and provided by New England BioLabs.
The gRNA must contain a 17 to 24 nt sequence complementary to the dsDNA of interest. To activate Cas12a and trigger collateral cleavage, it is crucial that the activator incorporates a T-rich PAM sequence, TTTN, 5’ of the target sequence. Once the protein has recognized the PAM sequence and the gRNA has bound the complementary sequence, a staggered cut occurs around 18 bases 3′ of the PAM and leaves 5′ overhanging ends (Zetsche et al., 2017).
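To make the PAM requirement concrete, here is a minimal Python sketch that scans one strand of a sequence for TTTN motifs and reports the bases immediately 3' of each. The function name, the example sequence and the choice to scan only the strand as given are our assumptions; this is not the pipeline used in the project.

```python
def find_cas12a_sites(pam_strand, spacer_len=20):
    """List candidate sites as (pam_start, pam, protospacer) tuples.

    pam_strand is read 5'->3' and is taken to be the strand carrying the
    TTTN PAM (the non-target strand). The protospacer is the stretch of
    spacer_len bases immediately 3' of the PAM on that same strand; a gRNA
    spacer with this sequence (U instead of T) would pair with the target
    strand. Only the strand as given is scanned (assumption).
    """
    if not 17 <= spacer_len <= 24:
        raise ValueError("spacer_len should be between 17 and 24")
    seq = pam_strand.upper()
    site_len = 4 + spacer_len  # PAM (4 nt) followed by the protospacer
    hits = []
    for i in range(len(seq) - site_len + 1):
        if seq[i:i + 3] == "TTT":  # TTTN: the fourth base can be anything
            hits.append((i, seq[i:i + 4], seq[i + 4:i + site_len]))
    return hits


# Hypothetical 31 nt fragment, used purely for illustration.
example = "GCATTTAGACCTGAACGTTGCATGCCAGTCA"
sites = find_cas12a_sites(example, spacer_len=17)
print(sites)
```

A sequence with no TTTN motif upstream of the region of interest returns an empty list, which is the situation where a PAM has to be introduced by other means.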
Our gRNAs were transcribed using T7 polymerase starting from a ssDNA with the coding sequence downstream of a T7 promoter.
An appropriate design of the gRNA-coding ssDNA consists of three separate parts in the following order:
T7 promoter (5’-ctTAATACGACTCACTATAgg-3’): This is needed for the transcription and the sequence will not appear in the final gRNA. To increase the polymerase efficiency, it is recommended to add 1, 2 or 3 G’s right after the promoter (New England BioLabs) as well as adding CT upstream of it (Baklanov et al., 1996)
Scaffold (5’-TAATTTCTACTAAGTGTAGAT-3’): This sequence can change according to the Cas12a species - the one shown here is specific for LBa Cas12a (Zetsche et al., 2017)
Spacer: It is the gRNA sequence that is complementary to the activator sequence (TS). For the ctDNA group we chose to use shorter guide sequences (17 bp rather than 20) for detecting both single base polymorphism and chromosomal rearrangements, based on the work done by Li et al., 2018, where they proved that shorter guide sequences yielded higher cleavage specificity
The T7 polymerase needs a double stranded region to bind to. It is thus necessary to order a primer for this region. The rest of the sequence can stay single stranded for a lower cost.
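The three-part layout described above can be sketched as a simple string assembly in Python. The promoter and scaffold sequences are the ones quoted in the list; the function names and the example spacer are hypothetical, and the assumption that transcription starts at the first G after the promoter's TATA is ours. The sketch assembles the coding-sense sequence in the order listed and does not address which physical strand is ordered.

```python
T7_PROMOTER = "ctTAATACGACTCACTATAgg"        # with upstream CT and two G's
LB_CAS12A_SCAFFOLD = "TAATTTCTACTAAGTGTAGAT"  # LbCas12a scaffold from the text


def build_grna_template(spacer_dna):
    """Concatenate promoter, scaffold and spacer in the order listed above."""
    spacer = spacer_dna.upper()
    if set(spacer) - set("ACGT"):
        raise ValueError("spacer must contain only A, C, G, T")
    if not 17 <= len(spacer) <= 24:
        raise ValueError("spacer should be 17 to 24 nt long")
    return T7_PROMOTER + LB_CAS12A_SCAFFOLD + spacer


def expected_transcript(spacer_dna):
    """RNA expected from the template, assuming transcription starts at the
    first G after ...TATA, so the promoter itself is not transcribed."""
    transcribed = "GG" + LB_CAS12A_SCAFFOLD + spacer_dna.upper()
    return transcribed.replace("T", "U")


# Hypothetical 17 nt spacer, used purely for illustration.
spacer = "GACCTGAACGTTGCATG"
template = build_grna_template(spacer)
grna = expected_transcript(spacer)
print(template)
print(grna)
```

The primer mentioned above would cover only the `T7_PROMOTER` portion of this template, leaving the scaffold and spacer single stranded.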
Following Chen et al., 2018, we designed a Cas12a detection assay based on the cleavage of DNaseAlert (IDT), which are fluorescence-quenched oligonucleotide probes that emit a fluorescent signal after DNAse degradation: when DNases are present, the linkage between the fluorophore and its quencher is cleaved, which leads to the emission of a bright signal upon excitation at 535-556 nm (Integrated DNA Technologies).
By exploiting indiscriminate cleavage of the Cas12a protein that is triggered upon target recognition, we were able to obtain a fluorescent reading following the cleavage of our reporter molecules. This allows for a rapid and sensitive detection of the dsDNA activator.
A simple blood draw is necessary for both our treatment companion and relapse detection.
The analysis of our biomarkers is done directly in the plasma, without the need to isolate them, sparing us precious time, costs and unnecessary contamination that can occur during nucleic acid extraction (Abe, 2003). The first step for our sample preparation is the isolation of plasma from whole blood. As part of our experiments on ctDNA, we used commercially ordered human plasma for both practical and ethical reasons. The next step is to treat it with PBS then heat it at 95°C for 3 minutes to precipitate proteins.
Sample preparation for miRNA can theoretically be achieved in a similar way: Qiu et al., 2018 showed that it is possible to perform amplification of miRNA directly in serum samples pre-diluted in DEPC-treated water and boiled at 95 °C for 10 minutes. We expect that a similar protocol could also be applied to plasma, as measurements of miRNA in plasma and serum have been found to be highly correlated (Mitchell et al., 2008).
Amplification of each biomarker is done afterwards, in order to have enough copies to be able to perform the Cas12a assay effectively.
Due to the very low concentration of ctDNA in blood it is necessary to amplify the target prior to Cas12a detection assay. We chose PCR as it is a common practice in most laboratories.
It is important to note that it is possible to replace this method with an isothermal amplification, like LAMP or RPA, to get this assay closer to point of care.
One of the limitations of Cas12a is the need for a PAM sequence near the target we want to detect. To overcome this limitation, and following Li et al., 2018, we designed primers that add the PAM sequence by introducing synthetic mutations. This enables us to target virtually any desired sequence regardless of whether a T-rich PAM sequence exists near the target.
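As a rough illustration of what introducing a PAM by primer mutation involves, the Python sketch below lists which of the four bases immediately 5' of a target would have to change to obtain TTTN. It only counts substitutions; it is not the primer design procedure of Li et al., 2018, and the function name and example inputs are hypothetical.

```python
def pam_substitutions(upstream4):
    """Substitutions needed to turn the 4 bases 5' of a target into TTTN.

    upstream4 is the 4 nt stretch directly upstream of the target, read
    5'->3' on the PAM-carrying strand. Returns (offset, old_base, new_base)
    tuples; the fourth base (N) never needs to change.
    """
    bases = upstream4.upper()
    if len(bases) != 4 or set(bases) - set("ACGT"):
        raise ValueError("expected exactly 4 bases from A, C, G, T")
    return [(offset, base, "T")
            for offset, base in enumerate(bases[:3])
            if base != "T"]


# Hypothetical upstream contexts, used purely for illustration.
print(pam_substitutions("GTCA"))  # two bases differ from TTT
print(pam_substitutions("TTTG"))  # already a valid TTTN PAM
```

The fewer substitutions a primer has to carry, the closer it stays to the native sequence, which is one reason to check the existing upstream context first.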
Although miRNAs are potentially very valid candidates as biomarkers, they are associated with some hurdles (particularly low abundance) which are not completely overcome by currently existing detection methods (Miao et al., 2015).
Among recent amplification techniques, Rolling Circle Amplification has proved to be one of the most suitable, thanks to its robustness, simplicity, specificity and high sensitivity (Cheng et al., 2009). Rolling Circle Amplification (RCA) is an isothermal amplification (unlike, for instance, Polymerase Chain Reaction) in which miRNA (or another short RNA or DNA sequence) is amplified by means of a circular DNA template (i.e. a probe) and a special DNA (or RNA) polymerase: the miRNA acts as a primer, and the RCA product (i.e. the amplicon) is a concatemer containing tens to hundreds of tandem repeats that are complementary to the probe (Ali et al., 2014).
Toehold-initiated Rolling Circle Amplification (tiRCA), in particular, employs phi-29 DNA polymerase and is based on structure-switchable dumbbell-shaped probes (Deng et al., 2014): upon hybridization with the specific target miRNA, one of the two strands of the double-stranded region of the probe is displaced, resulting in an "activated" circular form of the probe which triggers the start of the RCA reaction. The complete mechanism of RCA is shown in Figure 6:
Although it is the probe, and not the miRNA itself, that is amplified, RCA allows us to significantly increase the concentration of the miRNA sequence in solution: since a large portion of the probe is complementary to the miRNA, the amplicon of the probe will incorporate several copies of the original miRNA sequence. This can theoretically be exploited to increase the sensitivity of an assay for quantification of miRNA. As explained later, while our Amplification step was mostly inspired by Qiu et al., 2018, we explored a new, ambitious Detection step after RCA based on Cas12a (and not on Cas9 and split reporter proteins). This implied designing new probes with specific characteristics for Cas12a, as explained in the following sections.
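The argument that the amplicon carries repeated copies of the miRNA sequence can be illustrated with a short Python sketch. The probe below is a toy: it places the reverse complement of let-7a-5p between two arbitrary flanks and ignores the dumbbell structure, ligation and toehold entirely. The flank sequences, function names and repeat count are assumptions for illustration only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def rca_amplicon(probe, n_repeats):
    """Idealized RCA product: n tandem copies of the probe's reverse
    complement, as if the polymerase went around the circle n times."""
    return reverse_complement(probe) * n_repeats


LET7A_RNA = "UGAGGUAGUAGGUUGUAUAGUU"     # let-7a-5p
let7a_dna = LET7A_RNA.replace("U", "T")  # same sequence written as DNA

# Toy probe: arbitrary flanks around the region complementary to let-7a.
toy_probe = "ACGTAC" + reverse_complement(let7a_dna) + "GATCCA"

amplicon = rca_amplicon(toy_probe, n_repeats=5)
copies = amplicon.count(let7a_dna)
print(f"Amplicon length: {len(amplicon)} nt, let-7a copies: {copies}")
```

Each pass around the toy probe adds one DNA copy of the let-7a sequence to the amplicon, which is the property a downstream detection step would rely on.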
The first miRNA we decided to target is let-7a-5p: this miRNA is not among the ones found to be relevant as melanoma biomarkers (unlike other miRNAs of the let-7 family) (Larrea et al., 2016; Mirzaei et al., 2016); nonetheless, we thought it would be the best starting point for a proof of concept, because it was already well characterized for Rolling Circle Amplification (RCA) by Deng et al., 2014 and Qiu et al., 2018.
Qiu et al., 2018, as well as our colleagues from the 2016 iGEM team of NUDT China, designed their probes so that the amplicons would be recognized by a CRISPR-Cas9 system. Since our project instead uses CRISPR-Cas12a, we had to modify the sequences of our probes accordingly, even though the miRNA sequence is the same. More specifically, we had to adapt the PAM sequence (placed on the amplicon of the probe) to match our Cas protein (we worked with LbCpf1): while the requirement for Cas9 was NGG on the 3' of the amplicon, in our case we needed to have TTTN on the 5'. More details on the design are described in the section "Detailed design".
We wanted to test different probe designs: some were conceived to have the PAM at the beginning of the larger loop of the amplicon (as in the probes from NUDT China), but we also investigated the case where the PAM was placed on the double-stranded part (the stem) instead; the sequence on the unconstrained large loop was also varied among the probes.
We ordered 10 different probes; the sequence and related notes are described in the Table below.
Probe designed by our team for Cas 12a. PAM on the large loop of the amplicon. Single base mismatch on the stem with respect to the target miRNA sequence.
Note: The sequences of the probes include a phosphate group at the 5' end (in order to ligate the probes). We nonetheless always ordered the oligonucleotides without the phosphate (because the cost was significantly lower) and then performed phosphorylation by means of T4 Polynucleotide Kinase prior to ligation.
For each probe we analyzed the secondary structure by means of servers available online (NUPACK, MFold): in all cases the structure of the probe, of its amplicon and of a series of 4-5 copies of the amplicon was tested in order to check for the absence of unwanted secondary structures. We also used RNAstructure DuplexFold to test the secondary structure of the probe/miRNA dimer: we were not able to find a more suitable tool for the analysis of the duplex; nonetheless we believe that this server, despite its limitations with respect to our analysis (no possibility of having a circular probe, no possibility of having a DNA/RNA dimer), was enough to show qualitatively the interaction between our probe and let-7a.
Two main alternatives are suitable in order to test the efficacy of Rolling Circle Amplification (Deng et al., 2014; Qiu et al., 2018). First of all, the amplicons can be tested by means of an agarose gel to verify the size; nonetheless, this method shows some limitations because of the large size of the amplicons.
A more valid alternative is instead to perform a real-time fluorescence measurement by means of SYBR Green I.
SYBR Green I is an intercalating dye that preferentially binds to the minor groove of double-stranded DNA (dsDNA) (Zipper et al., 2004). It has also been shown to bind to single-stranded DNA (ssDNA) and RNA (for which SYBR Green II is a more suitable option (Sigma-Aldrich)), but with significantly lower performance (Vitzthum et al., 1999).
When complexed with nucleic acid, SYBR Green I absorbs blue light (maximum excitation wavelength is 497 nm) and emits green light (emission peak at 520 nm) (Sigma-Aldrich), which makes it suitable for quantification - by means of a plate reader - of the DNA amplicons (i.e. the reverse complement of the probes) from our Rolling Circle Amplification (RCA).
Indeed, since we verified in all cases the absence of unwanted secondary structures (more details in Detailed Design), the stems in the probes and in the amplicons are the only double-stranded targets to which SYBR Green I can preferentially bind: this allows us to observe the increase over time in the size of the amplicon during RCA.
the regions in italic are those belonging to the loops of the hairpin
the regions in orange and green are those belonging to the stem of the hairpin (and which are complementary with each other)
the underlined region is the one complementary to the miRNA (let-7a-5p: UGAGGUAGUAGGUUGUAUAGUU)
Such a probe consists of a double-stranded stem, a 10-base loop (which from now on we will refer to as the "small loop", on the right in the figure above) and a 16-base loop (the "large loop", on the left). As we can observe, the toehold region of the probe (i.e. the part on the small loop where the miRNA binds) is 7 bases long, in accordance with Deng et al., 2014, who showed it to be the optimal length to achieve both sensitivity and specificity.
the sequence in bold is the one which is complementary to the gRNA (except for two mismatches, which are highlighted) and the region in red is the PAM sequence (in this case single stranded).
We emphasize here that the PAM sequence is on a single-stranded part of the amplicon (the one complementary to the large loop of the probe): such a single-stranded PAM can therefore only be present on the amplicon, and not on the probe itself (as would be the case if the PAM were on a double-stranded part).
with the scaffold region indicated in parentheses. The region outside the parentheses is the spacer, binding to the amplicon, and the sequence in italic corresponds in particular to the part of the spacer binding to the loop of the amplicon (with the rest of the spacer binding to the stem). The sign | indicates the position where the gRNA binds to the point on the amplicon where each new "copy" of the amplicon is considered to start (i.e. the point where the 3' of a "subunit" of the amplicon and the 5' of the successive subunit are linked together).
More specifically, we can notice that in this design the spacer coincides with the reverse complement of let-7a, with the exception of the two mismatches and of a missing A at the beginning. The template of the gRNA for Cas9 would therefore be:
5'-[reverse complement of miRNA]-[scaffold]-3'
The expected interaction between amplicon and gRNA is outlined in Figure 9:
We can observe how the PAM sequence (in red in the figure) is located at the very beginning of the large loop in the amplicon, whereas the gRNA binds to the whole stem part and partially to the small loop.
*Here and after, when referring to the "amplicon sequence", we only show one single copy of the reverse complement of the probe. The actual amplicon, by definition of Rolling Circle Amplification, is of course made of sequential copies of this "unitary" sequence.
We then tried to design our own probes for Cas12a, working backwards from the gRNA.
Unlike Cas9, for which the PAM must be on the 3' side of the target, Cas12a requires the PAM to be on the 5’ side of the target. This implies that the scaffold part of the gRNA must also be on the 5’ side (instead of the 3’ side) (Figure 10).
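The difference in template orientation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scaffold sequences are not reproduced here and are replaced by the placeholder string `[scaffold]`, the helper names are hypothetical, and the spacer is taken as the exact reverse complement of let-7a-5p, ignoring the two mismatches and the missing A discussed for the Cas9 design.

```python
# Sketch only: helper names and the "[scaffold]" placeholder are assumptions,
# not the sequences actually used in the project.
LET7A_5P = "UGAGGUAGUAGGUUGUAUAGUU"  # miRNA sequence quoted earlier in this section

RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(rna: str) -> str:
    """Return the reverse complement of an RNA sequence, written 5'->3'."""
    return rna.translate(RNA_COMPLEMENT)[::-1]


def grna_template(spacer: str, nuclease: str, scaffold: str = "[scaffold]") -> str:
    """Put the scaffold 3' of the spacer for Cas9 and 5' of it for Cas12a."""
    if nuclease == "Cas9":
        return f"5'-{spacer}-{scaffold}-3'"
    if nuclease == "Cas12a":
        return f"5'-{scaffold}-{spacer}-3'"
    raise ValueError(f"unsupported nuclease: {nuclease}")


spacer = reverse_complement_rna(LET7A_5P)
print(grna_template(spacer, "Cas9"))
print(grna_template(spacer, "Cas12a"))
```

The only thing that changes between the two calls is the side on which the scaffold is attached; the spacer itself is identical.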
A direct comparison of the interaction between target amplicon and gRNA for Cas9 and Cas12a is shown below.
We therefore conclude that the template for our guide RNA for Cas12a should be:
where the sequence in parentheses indicates the scaffold of the gRNA for LbCas12a. The sequence outside the parentheses is the spacer, binding to the amplicon, and in particular the sequence in italic corresponds to the part binding to the loop of the amplicon.
The spacer is therefore 22 bases long (like let-7a-5p): 15 bases bind to the stem part of the amplicon and the remaining 7 bind to the small loop of the amplicon. Note that the gRNA for Cas9 from Qiu et al., 2018 was instead 21 bases long (15 and 6): we decided to add one more base at the end to completely match the length of the miRNA.
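As a quick check of this arithmetic, the sketch below splits the 22-base spacer into its stem-binding and loop-binding parts and lists the amplicon bases each part would pair with. It assumes that the spacer is the exact reverse complement of let-7a-5p and that the 15 stem-binding bases sit at its 5' end, which is what the amplicon templates given later in this section imply; the function and variable names are illustrative.

```python
# Sketch only: names are illustrative, and the spacer is assumed to be the
# exact reverse complement of let-7a-5p.
LET7A_5P = "UGAGGUAGUAGGUUGUAUAGUU"

RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")
RNA_TO_DNA_PARTNER = str.maketrans("ACGU", "TGCA")


def reverse_complement_rna(rna: str) -> str:
    """Reverse complement of an RNA sequence, written 5'->3'."""
    return rna.translate(RNA_COMPLEMENT)[::-1]


def dna_partner(rna: str) -> str:
    """DNA stretch (5'->3') that would base-pair with the given RNA."""
    return rna.translate(RNA_TO_DNA_PARTNER)[::-1]


spacer = reverse_complement_rna(LET7A_5P)
stem_part, loop_part = spacer[:15], spacer[15:]  # assumed 15 + 7 split

print(len(spacer), len(stem_part), len(loop_part))
print("stem partner on amplicon:", dna_partner(stem_part))
print("loop partner on amplicon:", dna_partner(loop_part))
```

Under these assumptions the loop-binding part pairs with the first 7 bases of the let-7a sequence, i.e. the same 7-base stretch that corresponds to the toehold region.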
We can notice that in this design, too, the spacer has to coincide with the reverse complement of let-7a (as for Cas9). The template of the gRNA for Cas12a would therefore be:
5'-[scaffold]-[reverse complement of miRNA]-3'
We then proceeded to define the bases for the Ns, aiming to avoid unwanted minor secondary structures (e.g. smaller loops) within the loops. This was done mostly by considering pairing principles, e.g. avoiding non-Watson-Crick interactions (e.g. T-G) which might be thermodynamically favoured, or trying not to have complementary bases with more than 1 base in between (which might lead to hairpin loops). In all cases, the minimum free energy (MFE) structure was plotted with the available software (NUPACK, Mfold), both for the amplicon and for the probe (i.e. its reverse complement), to check that the intended dumbbell shape was indeed achieved.
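The kind of manual screening described above can be approximated by a crude script. The sketch below is only a heuristic stand-in and not a replacement for the MFE predictions from NUPACK or Mfold: it flags short stretches in a candidate loop whose reverse complement appears further downstream, which is the pattern that could fold into a small hairpin. The function name and the default thresholds (3-base stems, at least 3 bases in between) are assumptions chosen for illustration.

```python
# Heuristic sketch only: thresholds and names are illustrative assumptions,
# and no thermodynamics (or T-G wobble pairing) is taken into account.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence, written 5'->3'."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def hairpin_candidates(seq: str, stem_len: int = 3, min_loop: int = 3):
    """List (i, j, stem) where seq[i:i+stem_len] could pair with seq[j:j+stem_len]."""
    hits = []
    last_start = len(seq) - stem_len
    for i in range(last_start + 1):
        left = seq[i:i + stem_len]
        partner = reverse_complement(left)
        # The partner must start after the stem plus a minimal loop.
        for j in range(i + stem_len + min_loop, last_start + 1):
            if seq[j:j + stem_len] == partner:
                hits.append((i, j, left))
    return hits


print(hairpin_candidates("GGGAAAACCC"))  # GGG at 0 can pair with CCC at 7
print(hairpin_candidates("AAAAAAAA"))    # no self-complementary stretch
```

A candidate loop with an empty result would still need to be checked with a proper folding tool, since this script only looks for perfect Watson-Crick matches.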
We also wanted to test the case of probes with the amplicon having the PAM sequence not on the large loop, but on the stem instead (i.e. a double-stranded PAM, as usually required in Cas systems, and not single-stranded). We considered in this case three different alternatives:
Changing 4 bases in the large loop in order for them to be complementary to the PAM sequence, without adding more bases. This leads to a 19 bases-long stem, a 10 bases-long "small" loop and an 8 bases-long "large" loop. The template sequence of the amplicon is the following one:
5’-ATAGTTN'AAANNNNNNNNTTTNAACTATACAACCTACNNNTGAGGTAGTAGGTTGT-3’ (with N' being the base complementary to the N in the PAM)
Inserting 4 more bases complementary to the PAM on one end of the large loop (after ATAGTT), without changing any base. This leads to a 19 bases-long stem, a 10 bases-long small loop and a 12 bases-long large loop. The template sequence of the amplicon is the following one:
Inserting 4 more bases complementary to the PAM on one end of the large loop (after ATAGTT) and 4 more bases at the other end of the large loop (before the PAM sequence), in order to keep the original length of the large loop (16 bases). This leads to a 19 bases-long stem, a 10 bases-long small loop and a 16 bases-long large loop. The template sequence of the amplicon is the following one:
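For the first of these alternatives, the stated stem and loop lengths can be checked by rebuilding the template from its parts. The sketch below is an illustration under assumptions: the segment names are invented here, the unspecified loop bases are kept as literal `N` characters, and the PAM is written as TTTC (i.e. N = C, so N' = G) purely as an example choice.

```python
# Sketch only: segment names and the choice N = "C" in the PAM are assumptions.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")  # 'N' is left unchanged

LET7A_DNA = "TGAGGTAGTAGGTTGTATAGTT"  # let-7a-5p written as DNA


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence, written 5'->3'."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def first_alternative(pam_n: str = "C") -> dict:
    """Segments of one amplicon copy, in 5'->3' order."""
    pam = "TTT" + pam_n
    return {
        "junction_tail": LET7A_DNA[16:],          # ATAGTT
        "pam_partner": reverse_complement(pam),   # N'AAA, pairs with the PAM
        "large_loop": "N" * 8,
        "pam": pam,
        "stem_strand": reverse_complement(LET7A_DNA)[:15],
        "small_loop_free": "NNN",
        "mirna_head": LET7A_DNA[:16],             # ends at the copy junction
    }


seg = first_alternative()
unit = "".join(seg.values())

# One stem strand spans the junction between two consecutive copies.
stem_a = seg["mirna_head"][7:] + seg["junction_tail"] + seg["pam_partner"]
stem_b = seg["pam"] + seg["stem_strand"]
small_loop = seg["small_loop_free"] + seg["mirna_head"][:7]

print(unit, len(unit))
print(len(stem_a), len(small_loop), len(seg["large_loop"]))
print(reverse_complement(stem_a) == stem_b)
```

With these assumptions the two stem strands are exact reverse complements of each other, which places the PAM inside the double-stranded stem as intended for this alternative.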
Halfway through our project (see Notebook for more details), after we started testing our amplicons with Cas12a and the fluorescent reporter (DNase Alert), we realized that the probe itself (more specifically the product of RCA in the absence of miRNA, i.e. with no amplicon) was triggering the Cas system, causing a very high fluorescence signal comparable to the signal obtained for the samples with miRNA (i.e. with probe+amplicon).
We hypothesized that this was due to the fact that our Cas12a was working PAM-independently (more details in "Promiscuous Cas12a activation: probes as a target" in Results). More specifically, our gRNA was meant to target the whole stem (and in addition 7 bases in the small loop) of the amplicon; since the stem is double-stranded, the target sequence for the gRNA is also present in the probe (in the opposite strand).
This would not have been a problem if the Cas had been working, as expected, PAM-dependently, because the PAM is only contained in the amplicon, not in the probe. Nonetheless, if the Cas does not need the PAM sequence, but simply recognizes a target from the sequence of the gRNA, then the probe itself is also recognized as a target. Moreover, since the concentration of the probe in the RCA reaction is higher than the expected concentration of amplicon, the signal from the probe behaves as noise, overwhelming the signal of interest (i.e. from the amplicon).
We therefore designed a new guide RNA with the aim of targeting only the amplicon and not the probe. Our idea was to have the gRNA bind not to the stem, but to the large loop of the amplicon instead. Since the loops of the amplicon are single-stranded (unlike the stem), the target sequence is contained only in the amplicon and not in its reverse complement, so the gRNA should target only the amplicon and not the probe. More specifically, we designed a guide RNA perfectly complementary to the large loop of the amplicon of Probe 1; Probe 1, which has exactly the same sequence as the gRNA in this region, should therefore never be targeted by this new gRNA.
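The reasoning behind the redesign can be illustrated with a small PAM-independent search. The sketch below is a toy model under loudly stated assumptions: the 16-base large loop and the 3 free bases of the small loop are HYPOTHETICAL placeholders (they are not the sequences of Probe 1), the spacers are written as DNA for simplicity, and "targeting" is reduced to finding a perfectly complementary stretch on a strand.

```python
# Toy model only. LARGE_LOOP and SMALL_LOOP_FREE are HYPOTHETICAL placeholders,
# not the real Probe 1 sequences; the PAM is deliberately ignored.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")

LET7A_DNA = "TGAGGTAGTAGGTTGTATAGTT"
LARGE_LOOP = "ACCCATCTCAACTTTC"   # hypothetical 16-base loop
SMALL_LOOP_FREE = "CAC"           # hypothetical free bases of the small loop


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence, written 5'->3'."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


# One amplicon copy, and the probe as its reverse complement.
amplicon_unit = (LET7A_DNA[16:] + LARGE_LOOP + reverse_complement(LET7A_DNA)[:15]
                 + SMALL_LOOP_FREE + LET7A_DNA[:16])
probe = reverse_complement(amplicon_unit)


def has_target(spacer: str, strand: str) -> bool:
    """True if the strand carries a site fully complementary to the spacer.

    The strand is doubled to cover the junction between amplicon copies
    and the circular nature of the probe.
    """
    return reverse_complement(spacer) in strand * 2


original_spacer = reverse_complement(LET7A_DNA)   # stem + 7 loop bases
stem_part = original_spacer[:15]                  # stem-binding bases only
loop_spacer = reverse_complement(LARGE_LOOP)      # large-loop design

for name, spacer in [("original", original_spacer),
                     ("stem part", stem_part),
                     ("large loop", loop_spacer)]:
    print(name, has_target(spacer, amplicon_unit), has_target(spacer, probe))
```

In this toy model the stem-binding part of the original spacer finds a matching site on both the amplicon and the probe, whereas the large-loop spacer only finds one on the amplicon, which is the behaviour the new design aims for.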
Following the template gRNA above (5'-[scaffold]-[reverse complement of miRNA]-3'), the spacer was therefore modified to bind (with perfect match) to the large loop of the amplicon of Probe 1.
Two different designs were tested, one - referred to elsewhere as "S_1" - binding to the whole large loop and to the first 4 bases after the large loop (for a total of a 20 bases-long spacer), and one - "L_1" elsewhere - binding only to the large loop (16 bases-long spacer). The complete sequences are the following ones:
The comparison between the mode of action of the previous, original gRNA and the "new" ones is better explained in Figure 12:
Our detection scheme
We envision a follow-up based on repeated liquid biopsies in order to track the sequences that have been identified using our bioinformatic software, amplified by either PCR, isothermal amplification or RCA, and finally detected directly in the plasma using our Cas12a based system.
In the following example, the patient receives our treatment based on a cocktail of neoantigens presented on the surface of the encapsulin. The target population decreases with time, which suggests a response to our immunotherapy-based vaccine for a certain period of time but, due to the emergence of resistance or the survival of another cell population, the patient relapses. Chromosomal rearrangements and miRNAs are then the object of our detection, and would suggest in this particular case a potential relapse. It is then strongly recommended for the patient to carry out a clinical test (biopsy, imaging, endoscopy) for confirmation.
Abe, Kenji. "Direct PCR from Serum." PCR Protocols. Humana Press, 2003. 161-166.
Ali, M. Monsur, et al. "Rolling circle amplification: a versatile tool for chemical biology, materials science and medicine." Chemical Society Reviews, 43.10 (2014): 3324-3341.
Baklanov, Michail M., Larisa N. Golikova, and Ernst G. Malygin. "Effect on DNA transcription of nucleotide sequences upstream to T7 promoter." Nucleic acids research, 24.18 (1996): 3659-3660.
Calapre, Leslie, et al. "Circulating tumor DNA (ctDNA) as a liquid biopsy for melanoma." Cancer letters, 404 (2017): 62-69.
Miao, Peng, et al. "Ultrasensitive detection of microRNA through rolling circle amplification on a DNA tetrahedron decorated electrode." Bioconjugate chemistry, 26.3 (2015): 602-607.
Mirzaei, Hamed, et al. "MicroRNAs as potential diagnostic and prognostic biomarkers in melanoma." European journal of cancer, 53 (2016): 25-32.
Mitchell, Patrick S., et al. "Circulating microRNAs as stable blood-based markers for cancer detection." Proceedings of the National Academy of Sciences, 105.30 (2008): 10513-10518.
Olsson, E., et al. "Serial monitoring of circulating tumor DNA in patients with primary breast cancer for detection of occult metastatic disease." EMBO Mol Med, 7 (2015): 1034-1047.
Qiu, Xin-Yuan, et al. "Highly Effective and Low-Cost MicroRNA Detection with CRISPR-Cas9." ACS synthetic biology, 7.3 (2018): 807-813.
Reuter, Jessica S., and David H. Mathews. "RNAstructure: software for RNA secondary structure prediction and analysis." BMC bioinformatics, 11.1 (2010): 129.
"sgRNA Synthesis Using the HiScribe™ Quick T7 High Yield RNA Synthesis Kit" - New England BioLabs website. URL:https://international.neb.com/protocols/2015/11/24/sgrna-synthesis-using-the-hiscribe-quick-t7-high-yield-rna-synthesis-kit-neb-e2050 (Accessed 14/10/2018)
Siegel, R. L., Miller, K. D. and Jemal, A. "Cancer statistics, 2018." CA: A Cancer Journal for Clinicians, (2018) 68: 7-30.
"SYBR Green I nucleic acid gel stain" - Sigma-Aldrich. Datasheet. URL: https://www.sigmaaldrich.com/content/dam/sigma-aldrich/docs/Sigma-Aldrich/Datasheet/s9430dat.pdf (Accessed 11/10/2018)
"SYBR Green II RNA Gel Stain" - Sigma-Aldrich. Datasheet. URL: https://www.sigmaaldrich.com/content/dam/sigma-aldrich/docs/Sigma/Datasheet/2/s9305dat.pdf (Accessed 11/10/2018)
Tsao, Simon Chang-Hao, et al. "Monitoring response to therapy in melanoma by quantifying circulating tumour DNA with droplet digital PCR for BRAF and NRAS mutations." Scientific reports, 5 (2015): 11198.
Underhill, Hunter R., et al. "Fragment length of circulating tumor DNA." PLoS genetics, 12.7 (2016): e1006162.
Vitzthum, Frank, et al. "A quantitative fluorescence-based microplate assay for the determination of double-stranded DNA using SYBR Green I and a standard ultraviolet transilluminator gel imaging system." Analytical biochemistry, 276.1 (1999): 59-64.
Xie, Kabin, and Yinong Yang. "RNA-guided genome editing in plants using a CRISPR–Cas system." Molecular plant, 6.6 (2013): 1975-1983.
Zadeh, Joseph N., et al. "NUPACK: analysis and design of nucleic acid systems." Journal of computational chemistry, 32.1 (2011): 170-173.
Zetsche, Bernd, et al. "Multiplex gene editing by CRISPR–Cpf1 using a single crRNA array." Nature biotechnology, 35.1 (2017): 31.
Zipper, Hubert, et al. "Investigations on DNA intercalation and surface binding by SYBR Green I, its structure determination and methodological implications." Nucleic acids research, 32.12 (2004): e103-e103.
Zuker, Michael. "Mfold web server for nucleic acid folding and hybridization prediction." Nucleic acids research, 31.13 (2003): 3406-3415.