Synthetic biology is currently one of the most rapidly developing fields in science. Recent advances, particularly in genetic engineering, allow us to tackle major societal challenges. Genetic engineering is even being applied to humans in the form of gene therapy, an experimental technique that uses genes to treat or prevent disease, including severe genetic disorders. However, major concerns have been raised about the misuse of gene editing techniques, particularly for human enhancement. Gene doping, the misuse of gene therapy to enhance athletes’ performance, is one example. To promote responsible use of synthetic biology and to help eliminate gene doping from sport, we developed a complete gene doping detection method: ADOPE, the Advanced Detection of Performance Enhancement (figure 1).
ADOPE is based on targeted Next Generation Sequencing (NGS), which reduces the amount of sequencing data generated and allows gene doping DNA to be identified efficiently. We accomplished this by creating an innovative fusion protein (BBa_K2643000) for use in the rapid library preparation required for sequencing. Our fusion protein consists of a cleavage-deficient, sequence-specific dxCas9 nuclease and a Tn5 transposase. The dxCas9 part, loaded with a single guide RNA (sgRNA), binds the specific target DNA sequence through complementary base pairing between the sgRNA and the target DNA, while the Tn5 part integrates two small DNA molecules (adapters) required for nanopore sequencing. Thus, the fusion protein is capable of performing dxCas9-guided adapter integration for targeted sequencing library preparation, allowing us to identify gene doping in blood samples with our method ADOPE.
We designed ADOPE to detect gene doping in blood samples by targeting the most striking difference between natural DNA and gene doping DNA, namely the exon-exon junctions that only exist in doping DNA (Beiter et al., 2011). ADOPE consists of four main steps: sample preparation, prescreening, library preparation and sequencing.
1. Sample preparation
Blood samples are commonly taken from athletes during regular doping tests. We used these blood samples to extract the DNA required for testing from serum or the buffy coat (serum and white blood cells) (Ni et al., 2011). We built an extensive gene doping kinetics model to predict the amount of gene doping fragments in blood over time, based on an input concentration of gene doping vectors and an injection frequency. We used this model to determine the appropriate time window for gene doping testing, given the measured sensitivity of DNA extraction.
Figure 3. Concentration of doping DNA in the blood over time after a single intramuscular injection of 141 billion viral vectors. The detection limit of 1000 copies per mL of blood is estimated based on the loss of DNA that occurs during sample preparation and targeted sequencing preparation.
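To make the idea of a testing window concrete, the sketch below uses a deliberately simplified one-compartment model with first-order (exponential) clearance. It is not the kinetics model described above. The dose of 141 billion vectors and the detection limit of 1000 copies per mL come from the figure caption, while the fraction of vector reaching the blood, the blood volume, and the half-life are hypothetical placeholder values chosen only for illustration.

```python
import math

def initial_concentration(copies_injected, blood_fraction, blood_volume_ml):
    """Copies per mL in blood right after one injection (simplified)."""
    return copies_injected * blood_fraction / blood_volume_ml

def concentration(t_days, injection_days, c0, half_life_days):
    """Sum of exponentially decaying contributions from each past injection."""
    k = math.log(2) / half_life_days  # first-order clearance rate
    return sum(c0 * math.exp(-k * (t_days - t0))
               for t0 in injection_days if t_days >= t0)

def detection_window_days(c0, detection_limit, half_life_days):
    """Time after a single injection during which the level stays above the limit."""
    if c0 <= detection_limit:
        return 0.0
    return half_life_days * math.log2(c0 / detection_limit)

# Values from the figure caption
DOSE = 141e9               # viral vectors per injection
DETECTION_LIMIT = 1000.0   # copies per mL of blood

# Hypothetical values, for illustration only
BLOOD_FRACTION = 0.001     # assumed share of vector DNA reaching the blood
BLOOD_VOLUME_ML = 5000.0   # assumed total blood volume
HALF_LIFE_DAYS = 2.0       # assumed clearance half-life

c0 = initial_concentration(DOSE, BLOOD_FRACTION, BLOOD_VOLUME_ML)
window = detection_window_days(c0, DETECTION_LIMIT, HALF_LIFE_DAYS)
print(f"Estimated detection window: {window:.1f} days")
```

Passing several days in `injection_days` mimics an injection frequency: the contributions add up, so the level stays above the detection limit for longer than after a single dose.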
2. Prescreening
We incorporated a prescreening step in our method based on the advice of Dr. Oliver de Hon of the Dutch Doping Authority. He emphasised the importance of a high-throughput assay that could screen thousands of athletes simultaneously. Therefore, we developed a colorimetric assay based on the extent of gold nanoparticle aggregation (Baetsen-Young et al., 2018). When target doping DNA is absent, the nanoparticles aggregate completely, resulting in a purple color. When target doping DNA is present, it forms a secondary structure with a targeting DNA probe, which stabilizes the nanoparticles against aggregation, resulting in a red color.
3. Targeted library preparation
Samples that test positive in the prescreening proceed to our novel rapid targeted next generation sequencing. Targeted library preparation relies on our innovative fusion protein consisting of a Tn5 transposase and a dxCas9. The fusion of these two proteins resulted in target-specific transposition. As a proof of concept, we showed that our fusion protein is target specific with an in vitro targeted integration assay, which we verified by visualising the amplified integration products with gel electrophoresis. Once we had established the functionality and optimal conditions, we implemented the fusion protein in the established rapid next generation sequencing library preparation protocol from Oxford Nanopore Technologies (ONT). We replaced the original transposase, responsible for random integration of the sequencing adapters, with our novel fusion protein. In our case, a specifically designed sgRNA guides the fusion protein to the exon-exon junction target site, so that only gene doping DNA is prepared for sequencing. Additionally, we developed an sgRNA model that identified the optimal exon-exon target site by searching for the fewest sgRNAs required to cover all possible variations of the target sites that arise from synonymous mutations of the EPO coding sequence. An sgRNA array of the resulting 12 sgRNAs can be used in a single library preparation, utilizing the multiplexing capability of dxCas9 (Cong et al., 2013). As a result, we could test for the gene doping variants simultaneously, improving the efficiency of our method.
Further, we implemented multiplexing with barcodes to improve method efficiency, reduce cost, and expand the throughput. We created a barcoding webtool to generate unique barcodes, which are integrated into the adapter sequences that are attached to the target sequence. This allows us to sequence samples from multiple different athletes in the same run and trace each output sequence back to the corresponding barcode (Bayliss et al., 2017).
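The search for a target site that needs the fewest sgRNAs can be illustrated with a small sketch. It assumes, purely for illustration, that one sgRNA is needed per distinct synonymous DNA encoding of a target window, that windows are measured in whole codons, and that the junction sits between two residues of a short peptide. The peptide `LSMWKR` and the junction position are hypothetical and are not taken from the EPO sequence. The real model may weigh other factors, such as PAM availability or mismatch tolerance, that are not covered here.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases T, C, A, G
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Map each amino acid to all codons that encode it
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)

def variant_count(peptide):
    """Number of distinct DNA sequences encoding the peptide."""
    return prod(len(SYNONYMS[aa]) for aa in peptide)

def synonymous_variants(peptide):
    """Enumerate every synonymous DNA encoding of the peptide."""
    return ["".join(codons)
            for codons in product(*(SYNONYMS[aa] for aa in peptide))]

def best_window(peptide, junction, width):
    """Return (start, count) for the junction-spanning window with fewest variants.

    `junction` is the index of the first residue after the exon-exon junction;
    a window [start, start + width) spans it when start < junction < start + width.
    """
    starts = range(max(0, junction - width + 1),
                   min(junction, len(peptide) - width + 1))
    count, start = min((variant_count(peptide[s:s + width]), s) for s in starts)
    return start, count

# Hypothetical peptide with a junction between residues 'M' and 'W'
start, count = best_window("LSMWKR", junction=3, width=3)
print(start, count, synonymous_variants("LSMWKR"[start:start + 3]))
```

Windows rich in methionine and tryptophan, which each have a single codon, allow little synonymous variation, which is why such a search tends to favour them.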
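As an illustration of how unique barcodes might be generated and used to trace reads back to a sample, the sketch below draws random barcodes and keeps only those that differ from every accepted barcode in a minimum number of positions (Hamming distance). This is an assumption about one reasonable approach, not a description of the webtool itself. The barcode length, minimum distance, and mismatch tolerance are hypothetical parameters.

```python
import random

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def generate_barcodes(n, length=12, min_distance=4, seed=0, max_tries=100_000):
    """Greedily collect n random barcodes that are pairwise at least min_distance apart."""
    rng = random.Random(seed)
    barcodes = []
    for _ in range(max_tries):
        if len(barcodes) == n:
            break
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        if all(hamming(candidate, b) >= min_distance for b in barcodes):
            barcodes.append(candidate)
    if len(barcodes) < n:
        raise ValueError("could not find enough barcodes; relax the constraints")
    return barcodes

def assign_sample(observed, barcodes, max_mismatches=1):
    """Return the index of the matching barcode, or None if no barcode is close enough."""
    distances = [hamming(observed, b) for b in barcodes]
    best = min(range(len(barcodes)), key=distances.__getitem__)
    return best if distances[best] <= max_mismatches else None

barcodes = generate_barcodes(8)
# Simulate a single sequencing error in the barcode of sample 3
original = barcodes[3]
replacement = "A" if original[0] != "A" else "C"
noisy = replacement + original[1:]
print(assign_sample(noisy, barcodes))
```

Keeping the minimum distance well above the tolerated number of mismatches means a read with a few sequencing errors in its barcode still maps to a single sample.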
4. Targeted sequencing
After library preparation, we sequence the samples using a MinION, a portable real-time sequencing device from ONT. Only targeted sequences with adapters are translocated through ONT’s nanopores by the motor protein. The remaining untagged DNA is not sequenced. By making a simple enzyme substitution, we transformed ONT’s established next generation sequencing platform into a targeted next generation sequencing platform.
We processed the data obtained from ONT MinION sequencing runs with our tailor-made data analysis software tool. The software consists of an algorithm that aligns all the sequences generated by a sequencing run against our pre-existing database of expected gene doping sequences. Based on the alignment score, the sequences are classified as gene doping DNA or non-gene doping DNA, which filters out false positives that might have been sequenced. The software strengthens the robustness and reliability of our method, allowing us to determine whether gene doping DNA was present in the athlete’s blood sample. Going a step further, our algorithm can expand the database as it detects new gene doping sequences, so that it evolves along with gene doping.
We proved that our designed fusion protein could direct DNA integration to a specific sequence by means of an sgRNA. We also demonstrated that this novel protein could be used to perform targeted next generation sequencing with the ONT MinION device. Therefore, we believe that by establishing our novel detection method and integrating it into established testing systems, we could discourage athletes from using high-risk gene doping technologies. Not only have the Dutch Doping Authority, the Delft Sports Engineering Institute and the Dutch Trotting and Flat Racing Association (NDR) shown interest in our technology, but so have various stakeholders outside the field of gene doping. Sanquin, the Dutch blood bank, and the Dutch Research Department for Food Safety (RIKILT) at Wageningen University also emphasized the potential and value of our targeted sequencing technology. These stakeholders’ interest highlights the possible applications of our targeted sequencing method in other fields, such as cancer detection, non-invasive prenatal screening, food safety regulation, and even strain identification.
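The classification step can be sketched as a local alignment of each read against every database entry, followed by a threshold on the normalised score. The example below uses a plain Smith-Waterman score with linear gap penalties as a stand-in, since the source does not specify the aligner. The scoring values, the threshold of 0.8, and the database entry `junction_A` with its sequence are all hypothetical. The sketch also ignores reverse-complement reads for brevity.

```python
def local_alignment_score(read, ref, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between read and ref."""
    best = 0
    prev = [0] * (len(ref) + 1)
    for r in read:
        curr = [0]
        for j, c in enumerate(ref, start=1):
            diag = prev[j - 1] + (match if r == c else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

def classify_read(read, database, threshold=0.8, match=2):
    """Return (entry_name, normalised_score) for the best hit above threshold, else None."""
    best_name, best_norm = None, 0.0
    for name, ref in database.items():
        # Normalise by the score of a perfect full-length match to the entry
        norm = local_alignment_score(read, ref, match=match) / (match * len(ref))
        if norm > best_norm:
            best_name, best_norm = name, norm
    return (best_name, best_norm) if best_norm >= threshold else None

# Hypothetical database entry; not a real exon-exon junction sequence
database = {"junction_A": "ACGTTGCAAGGCTTACGATC"}

doping_read = "TTTT" + database["junction_A"] + "GGGG"
background_read = "A" * 20

print(classify_read(doping_read, database))
print(classify_read(background_read, database))
```

Under this scheme, expanding the database as new sequences are detected amounts to adding entries to the `database` dictionary. How a newly observed sequence would be judged to be gene doping DNA in the first place is left open here.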
- Baetsen-Young, A. M., Vasher, M., Matta, L. L., Colgan, P., Alocilja, E. C., Day, B. (2018). Direct Colorimetric Detection of Unamplified Pathogen DNA by Dextrin-capped Gold Nanoparticles. Biosensors and Bioelectronics, 101, 29–36. doi:10.1016/j.bios.2017.10.011
- Bayliss, S. C., Hunt, V. L., Yokoyama, M., Thorpe, H. A., Feil, E. J. (2017). The use of Oxford Nanopore native barcoding for complete genome assembly. GigaScience, 6(3), 1–6. http://doi.org/10.1093/gigascience/gix001
- Beiter, T., Zimmermann, M., Fragasso, A., Hudemann, J., Niess, A. M., et al. (2011). Direct and long-term detection of gene doping in conventional blood samples. Gene Therapy, 18, 225–231.
- Cong, L., Ran, F. A., Cox, D., Lin, S., Barretto, R., Habib, N., Hsu, P. D., Wu, X., Jiang, W., Marraffini, L. A., Zhang, F. (2013). Multiplex Genome Engineering Using CRISPR/Cas Systems. Science, 15 Feb 2013, 819–823.
- Ni, W., Le Guiner, C., Gernoux, G., Penaud-Budloo, M., Moullier, P., Snyder, R. O. (2011). Longevity of rAAV vector and plasmid DNA in blood after intramuscular injection in nonhuman primates: implications for gene doping. Gene Therapy, 18, 709–718. doi:10.1038/gt.2011.19