Precise and high-throughput biological part characterization is at the very core of the Catalytic Activity Sequencing (CAT-Seq) system. It is a next-generation activity measurement system capable of simultaneously recording the activities of large libraries of catalytic biomolecules or regulatory parts. The system output is simple, yet extremely useful: every unique biomolecule sequence receives a parameter that denotes the activity of that biological part.
CAT-Seq is best explained with an example - we strongly suggest reading the design section first!
CAT-Seq provides a way to simultaneously measure the activities of billions of catalytic biomolecules in a single experiment by recording the activity information for each unique biomolecule into its coding DNA sequence. This is achieved by quickly encapsulating single biomolecule templates in separate droplets together with special Substrate Nucleotides. The substrate nucleotides have a substrate attached to them, and they act as a target for the catalytic biomolecule. The catalytic biomolecule can then turn the Substrate Nucleotide into the Product Nucleotide. During the DNA amplification in each droplet, this activity is recorded, and by quantifying the number of product nucleotides we can assess the activity of each individual biomolecule simultaneously.
CAT-Seq can also be effortlessly adjusted to measure the activities and cross-interactions of numerous transcriptional or translational regulatory parts instead. The amount of product nucleotide depends not only on the catalytic biomolecule activity, but also on the amount of the biomolecule.
If we took a library in which all of the constructs have exactly the same catalytic biomolecule sequence but unique regulatory sequences, each droplet would contain a different amount of catalytic biomolecule, depending on its regulatory sequence. The amount of product nucleotide produced would in turn depend on the amount of protein in the droplet. A stronger regulatory sequence, which allows greater catalytic biomolecule expression, would therefore show up as an increased amount of Product Nucleotide.
Therefore, the regulatory sequence strength can be assessed indirectly, but precisely, using CAT-Seq.
Finally, CAT-Seq can be used to assess the cross-interactions of different regulatory sequence combinations.
Let’s take Toehold Switches as an example. They consist of two parts - the Toehold Switch and the activating RNA. The Toehold Switch forms an RNA loop on the RBS of the controlled coding sequence and in turn inhibits translation. The expressed activating RNA can bind to the Toehold Switch RNA sequence and expose the RBS to the translation machinery. Testing a large number of toehold switch and activating RNA combinations while searching for orthogonal switches is an extremely time and resource consuming task, because each combination must be assessed separately.
Using CAT-Seq, the cross-interactions of large libraries of Toehold Switches and RNA activators can be recorded in a single experiment. In this example, two libraries would have to be prepared - the toehold switch library and the RNA activator library. Then, both of these libraries are ligated next to the same catalytic biomolecule sequence. The result is a new library of fragments that carry random toehold switch and RNA activator combinations and the same catalytic biomolecule, here an esterase.
Depending on how the activating RNA affects the toehold switch in each droplet, the Product Nucleotide amount will vary. After amplification and sequencing, each fragment would show three main things - the specific toehold switch sequence, the specific activating RNA sequence, and their performance or activity score in the form of Product Nucleotides.
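To illustrate why the combined library grows so quickly, the following minimal sketch enumerates every toehold switch and activating RNA pairing. The part names and the function `combinatorial_library` are hypothetical placeholders, not names from our actual library, and real ligation samples these pairings randomly rather than listing them in order.

```python
from itertools import product


def combinatorial_library(switches, activators):
    """Return every (toehold switch, activating RNA) pairing.

    Each pairing stands for one construct that would sit next to the
    same catalytic biomolecule sequence in the ligated library.
    """
    return list(product(switches, activators))


# Hypothetical part names, used only for illustration.
switches = ["toehold_A", "toehold_B", "toehold_C"]
activators = ["trigger_A", "trigger_B", "trigger_C"]

library = combinatorial_library(switches, activators)
print(len(library))  # number of pairings = len(switches) * len(activators)
```

Tested one at a time, every one of these pairings would need its own reaction, whereas in CAT-Seq each pairing occupies its own droplet within a single experiment.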
While ultra high-throughput is extremely important, the precision and accuracy of the system are just as critical. In order to assess whether our system measures activities properly in high-throughput, we first made low-throughput measurements using standard and well-established methods. Then, we compared them with the results acquired using CAT-Seq.
An in silico designed mutant library was subjected to catalytic activity sequencing. The DNA embedded with catalytic activity information (in the form of the ratio of incorporated reference nucleotides to catalytically converted nucleotides) was extracted and sequenced by nanopore sequencing. By applying the data preparation and analysis pipeline, the mean methylation scores arising from reference nucleotides were filtered and extracted for each barcoded mutant DNA template. The collected data was normalized to the Wild Type enzyme and the K227R mutant, which had the lowest activity.
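The averaging and normalization steps described above can be sketched as follows. This is an illustration under stated assumptions, not our actual pipeline: the read-count filter `min_reads`, the barcode labels, and the example scores are all made up, and the two-point rescaling (Wild Type mapped to 1, K227R mapped to 0) is one plausible reading of "normalized to the Wild Type enzyme and the K227R mutant".

```python
from collections import defaultdict
from statistics import mean


def mean_scores_by_barcode(reads, min_reads=2):
    """Average per-read methylation scores for each barcode.

    `reads` is an iterable of (barcode, score) pairs. Barcodes with
    fewer than `min_reads` reads are dropped (hypothetical filter).
    """
    grouped = defaultdict(list)
    for barcode, score in reads:
        grouped[barcode].append(score)
    return {b: mean(s) for b, s in grouped.items() if len(s) >= min_reads}


def relative_activity(mean_scores, high_ref="WT", low_ref="K227R"):
    """Rescale scores so that high_ref maps to 1.0 and low_ref to 0.0.

    A lower methylation score means higher activity, so the scale is
    inverted. The two-point formula is an assumption.
    """
    span = mean_scores[low_ref] - mean_scores[high_ref]
    if span == 0:
        raise ValueError("reference scores are identical; cannot normalize")
    return {b: (mean_scores[low_ref] - s) / span for b, s in mean_scores.items()}


# Made-up reads for illustration only; these are not measured values.
reads = [
    ("WT", 0.2), ("WT", 0.2),
    ("K227R", 0.8), ("K227R", 0.8),
    ("mutant_1", 0.5), ("mutant_1", 0.5),
    ("mutant_2", 0.3),  # only one read, removed by the filter
]
scores = mean_scores_by_barcode(reads)
activity = relative_activity(scores)
```

With this scaling, a mutant whose mean methylation score lies halfway between the two references receives a relative activity of about 0.5.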
Figure 1. Comparison of in bulk and CAT-Seq measured esterase mutant relative activity. An in silico generated Esterase mutant library was subjected to catalytic activity sequencing. The mean methylation scores for each barcoded mutant DNA template were filtered and extracted. The collected data was normalized to the Wild Type CAT-Seq Esterase and the K227R mutant (lowest activity). The relative activity, extracted from the mean methylation score of each mutant read, is compared to data gathered in standard sized reactions (in bulk).
The relative methylation score (reference nucleotide count) of each mutant read corresponds to the activity of the enzyme it encodes. The higher the activity of the expressed enzyme, the lower the methylation score that is assigned, due to the catalytic conversion of substrate nucleotides. The comparison of the results gathered with the CAT-Seq approach and with standard sized reactions (Figure 1) supports the viability of CAT-Seq as a biomolecule activity reading method. The activity reading extracted from the DNA sequence correlates closely with the kinetic measurement data. The activity of each Esterase mutant is measured accurately and is assigned to the corresponding DNA sequence. These results show that the CAT-Seq approach enables screening the activity of millions of enzyme variant sequences and accurately assigns the phenotype of each variant to the genotype it arises from.
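One common way to put a number on the agreement between the two measurement methods is the Pearson correlation coefficient. The sketch below is a generic implementation; the text above does not state which statistic was used for the comparison, so treat the choice of Pearson's r as an assumption, and the function name `pearson_r` as a placeholder.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

In use, `xs` would hold the in bulk relative activities and `ys` the CAT-Seq relative activities of the same mutants, listed in the same order; a value close to 1 indicates that the two methods rank and scale the mutants similarly.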
The constructed ribosome binding site (RBS) library (BBa_B0030, BBa_B0032, BBa_B0034 and BBa_K2621038, each with a downstream CAT-Seq esterase gene, BBa_K2621000) was subjected to catalytic activity sequencing, and DNA embedded with catalytic activity information was prepared. The mean methylation scores (reference nucleotide count) for each barcoded DNA template housing a different RBS were then filtered and extracted. The collected activity data was normalized to the BBa_B0034 data and is shown in Fig. 2.
Figure 2. Comparison of in bulk and CAT-Seq measured ribosome binding site relative strength. The catalytic activity of the esterase gene, regulated by a library of ribosome binding sites, was measured using a cell-free expression system in bulk or using the CAT-Seq approach, and the results were compared side by side. The mean methylation scores for each barcoded DNA template were filtered and extracted. The collected data was normalized to BBa_B0034, which corresponds to a mean strength of 1.
It is known that stronger ribosome binding sites increase the yield of translated protein and therefore increase the number of catalytically converted substrate nucleotides. This increase in converted nucleotides is inversely related to the mean methylation score assigned to the sequenced DNA template. Based on this, activity results can be extracted from the mean methylation scores (reference nucleotide count) and correspond to ribosome binding site strength. The catalytic activity sequencing results were compared to RBS strength results measured earlier in standard sized reactions. The comparison once again supports the viability of the CAT-Seq approach. The ribosome binding site strength, extracted from the DNA sequence in the form of a reference nucleotide count, correlates with measurements made with accurate standard assays. These results display the validity of CAT-Seq as a method for screening the strength of regulatory sequences and its ability to provide accurate phenotype to genotype linkage.
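The normalization to BBa_B0034 can be sketched as below. Two things here are assumptions rather than facts from our pipeline: that the mean methylation score is a fraction between 0 and 1, and that the converted fraction can be taken as 1 minus that score. The example scores are made up and the function name `relative_rbs_strength` is a placeholder.

```python
def relative_rbs_strength(mean_scores, reference="BBa_B0034"):
    """Express each RBS as a strength relative to the reference RBS.

    Assumes each score is a fraction in [0, 1], so that (1 - score)
    approximates the fraction of converted nucleotides.
    """
    ref_activity = 1.0 - mean_scores[reference]
    if ref_activity <= 0:
        raise ValueError("reference shows no conversion; cannot normalize")
    return {rbs: (1.0 - s) / ref_activity for rbs, s in mean_scores.items()}


# Made-up scores for illustration only; these are not measured values.
example_scores = {"BBa_B0034": 0.2, "BBa_B0030": 0.6}
strengths = relative_rbs_strength(example_scores)
```

By construction the reference RBS receives a strength of 1, and every other RBS is reported as a multiple of it.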
A toehold regulatory sequence library consisting of different toehold and trigger pairs was constructed and subjected to catalytic activity sequencing. The DNA embedded with catalytic activity information was sequenced by nanopore sequencing. The mean methylation scores (reference nucleotide count in the sequence) for each barcoded DNA template housing a different regulatory sequence were filtered and extracted.
Figure 3. The evaluation of Toehold-Trigger riboregulatory sequence orthogonality using CAT-Seq. The catalytic activity of esterase genes regulated by different Toehold switches was measured using CAT-Seq. The mean methylation scores for each barcoded regulatory construct DNA template were filtered and assigned. Low methylation scores correspond to actively expressed protein and are only assigned when both the Toehold and trigger sequences from the same group are present, verifying the previously measured orthogonality of the regulatory parts.
Figure 3 displays the mean methylation scores (reference nucleotide count) assigned to each barcoded toehold-trigger construct read. Based on the results, a low methylation score is only assigned when both the Toehold and trigger sequences from the same group are present. This means that the esterase was expressed and catalyzed the conversion of substrate nucleotides. These results agree closely with the standard (not in droplet) measurement results. Based on this, it can be concluded that the CAT-Seq activity sequencing method can be used as a precise and accurate way to screen and assign the activity and orthogonality of regulatory sequences in a high-throughput manner.
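The orthogonality criterion described above (a low score only when the toehold and trigger come from the same group) can be written as a simple check. This is a sketch under assumptions: the cut-off `threshold` separating "low" from "high" scores is hypothetical, as are the group labels and the example scores.

```python
def activated_pairs(scores, threshold):
    """Return the (toehold_group, trigger_group) pairs with a low score.

    A low mean methylation score indicates that the esterase was
    expressed, i.e. that the trigger opened the toehold switch.
    """
    return {pair for pair, score in scores.items() if score < threshold}


def is_orthogonal(scores, threshold):
    """True if exactly the same-group pairs are activated."""
    cognate = {(toehold, trigger) for toehold, trigger in scores if toehold == trigger}
    return activated_pairs(scores, threshold) == cognate


# Made-up scores for two groups, for illustration only.
example = {
    (1, 1): 0.20, (1, 2): 0.80,
    (2, 1): 0.75, (2, 2): 0.25,
}
print(is_orthogonal(example, threshold=0.5))
```

A library fails this check either when a non-cognate trigger activates a switch (cross-talk) or when a cognate pair stays off.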
We truly believe that Catalytic Activity Sequencing will be a valuable and frequently used tool for future iGEM teams, scientists and engineers. That is why we have created a detailed “Using CAT-Seq” protocol guide that allows users with a range of backgrounds and skill levels to start exploring the vast space of sequence and activity relationships right away!