Team:Vilnius-Lithuania-OG/Human Practices

Human Practices

None of us is as smart as all of us

Building the CAT-Seq characterization system was a highly dynamic process in which our team and selected stakeholders engaged in a year-long dialogue. This dialogue allowed us to precisely identify the most vital obstacles holding back the field of synthetic biology, helped us ask the right questions, which led to the idea of CAT-Seq, and enabled constant iteration and improvement of the system design throughout the year.

That is why we see Human Practices as the most important part of our engineering process and of the overall project. As you travel along the Human Practices timeline, you will find sections that correspond to the engineering stage our project was in at the time.

We strongly suggest reading the design section first, as it will help you fully appreciate the story of how the final design came to be.

Identifying the problem

Week 1

The first Overgraduate team from Lithuania was established, and we set out to explore the most pressing issues the field of synthetic biology is currently facing.

Week 2

We hosted a meeting and discussion with biotechnology experts from Lithuania to catalyse our exploration.

We talked about the opportunities synthetic biosystems may offer in the future and then discussed the biggest barriers currently slowing down the field of synthetic biology. These are the 4 things most of us agreed on:

    1. A huge lack of well-characterized biological parts.
    2. Biological circuits built from multiple parts are hard to predict.
    3. The lack of knowledge about the sequence-to-function relationship, which hinders the development of novel synthetic biological parts.
    4. Biological circuit testing is complicated and extremely time-consuming.

Week 5

We further discussed the issues we had outlined during the previous meeting.

In the end, the lack of well-described biological parts and the insufficient knowledge about the biomolecular sequence-to-function relationship seemed like the 2 most pressing problems to solve right now. That is because basic parts are the core of synthetic biology: if we cannot understand how the separate elements work, or cannot engineer completely novel parts for our needs, it is, and will continue to be, difficult to create the complex yet useful systems that we dream synthetic biology can achieve.

Week 6

We agreed to first investigate the current state-of-the-art methods for characterizing the sequence-to-function relationships of biological parts. We did this to understand whether our system should build on existing technological foundations or be constructed fully from scratch.

Week 7

Having narrowed down the problem we wanted to solve, we then determined the stakeholders who would enable us to solve it as efficiently as possible and who could help us reflect on how our designed solutions would affect the world.

Stakeholders Of CAT-Seq

1. Life Sciences Center in Vilnius - one of the main collaborative research facilities in Lithuania, where academics from all over Lithuania, and the world, come together to solve various fundamental and engineering problems. We see the LSC as an important stakeholder, as it can immensely influence our project by helping us foresee the right solutions to the problems we are trying to solve.

2. Droplet Genomics - a startup based in Lithuania specializing in droplet microfluidics and single-cell research. People at Droplet Genomics (DG) are worldwide experts in high-throughput research using microtechnologies. As we want to push the throughput and precision of part characterization even further, there was no better place to start than the company that performs state-of-the-art high-throughput research.

3. Thermo Fisher Scientific - Thermo Fisher’s R&D sector has a long history of working with directed evolution and protein engineering. We are extremely interested in navigating the space of sequence-to-function relationships, so we wanted to establish a line of communication with Thermo Fisher, so that they could share their experience and help us solve the engineering problems arising throughout the year.

Arriving at the Idea

Week 9

After an in-depth literature analysis, we learned that the two main state-of-the-art technological solutions for biological part characterization, microwell plates and microfluidics, both have positive but, more importantly, considerable negative aspects when it comes to acquiring high-throughput, sequence-linked activity data. Removing those negative aspects seems to require completely new approaches.

First of all, the microtiter plate approach offers a way to connect biomolecule sequence to activity by physically separating the biomolecule samples in individual plate wells. As the location of each biomolecule is known, a robot can quickly measure the activity of the biomolecule in each well. Yet the limiting factor in this case is extracting the DNA template encoding the biomolecule from each well and sequencing it without losing the information about which activity was recorded for that particular encoded biomolecule. Therefore, although this method offers a precise way to connect sequence and function, its throughput is rather low.

On the other hand, microfluidic approaches offer rapid encapsulation of biomolecules into small droplets, where each droplet then works as a separate microreactor. In this way the throughput is a lot higher than in the microtiter approach. For example, libraries of 10^9 different biomolecules can be screened in a single experiment. Yet the problem with this approach is that, even though it offers much higher throughput and droplets can even be sorted into a few different categories, the droplets need to be broken once a scientist wants to sequence the DNA molecules contained in those, for example, 10^9 droplets. As the water droplets are broken, the DNA encoding different biomolecules is mixed together, and the activity information about each particular biomolecule is lost.

Week 11

We decided to arrange a meeting with Karolis Leonavičius from Droplet Genomics, who specializes in microfluidic technologies and single-molecule research.

We talked about the advantages and disadvantages of microfluidic technologies as a whole and then went on to discuss the case of interest to us: biomolecule activity screening and characterization. According to Karolis, single-biomolecule encapsulation in nano-sized droplets is already highly optimized and efficient. There is also a lot of research describing efficient expression of biomolecules in those droplets.

Week 12

February 2018

After the discussion session with Karolis, we decided to use microfluidic technologies as the base for our system, as they seemed to be an extremely efficient starting point: the high throughput is already there, we just need to enable it. This discussion also led us to the core of the problem we wanted to solve with our project: the barrier to high-throughput acquisition of sequence-to-function information is the loss of that information once all of the microfluidic droplets are broken.

Week 13

After a short while we determined the main goal of our journey: creating a method that enables high-throughput recording of biomolecule activity and sequence inside each of the billion droplets, followed by quick and accurate retrieval of that information.

Designing the solution

Week 16

The first question we asked was: what is the object that we can write the activity information into? After hosting a discussion with Vilnius University students and later analysing the current workflows of various microfluidic techniques, we quickly came to the conclusion that such an object should be something that already contains half of the information we wanted to record: the DNA sequence of the biomolecule.

Week 17

We presented the earliest form of our idea to the researchers at the Life Sciences Center: the problems we are currently facing in synthetic biology, the solutions available right now, and our idea of how those problems could be solved by building a new method based on droplet microfluidics.

We received extremely positive comments about the idea. The comments focused on the future promise that such an activity recording system could bring to the field of synthetic biology and to multidisciplinary research. During the discussion, the researchers also shared some of their ideas on how the activity could be recorded into the DNA sequence. One of those we liked in particular: recording the activity into the DNA in the form of DNA modifications.

Week 19

After the discussion with the LSC researchers, we explored the literature and brainstormed possible ways activity could be linked to DNA modification. In addition, we met with Justas Vaitekūnas, who has in-depth expertise in DNA modification research. Justas told us about the various techniques he uses to modify DNA molecules in his research. As we discussed the possibilities of this approach, we all came to the conclusion that, even though we could modify DNA in some way, we still did not have a mechanism linking biomolecule activity to its sequence.

Week 20

We then decided to explore a different approach: instead of looking at modifications of the DNA molecule, we started to look at individual modified nucleotides. Following recommendations from Vilnius University professors, we met with Rūta Stanislauskienė.

She specializes in the molecular biology of modified nucleotides. During the meeting she told us about her research on the interactions between modified nucleotides and DNA polymerases. One of the most amazing things she told us is that there are combinations of modified nucleotide and DNA polymerase in which the polymerase cannot incorporate the nucleotide because of its modification.

Week 23

After the last meeting we did another round of literature analysis on modified nucleotides and their interactions with different DNA polymerases. It appeared that phi29 DNA polymerase, in many cases, does not incorporate nucleotides with modifications. We then figured out how we could link the DNA sequence and the activity of the catalytic biomolecule: by attaching the substrate of the catalytic biomolecule to a nucleotide.

Week 23

With the help of scientists from the Life Sciences Center, we successfully chose a specific substrate and attached it to cytidine. Then, using standard screening methods, we selected an esterase that can remove this modification.

Audrius Laurynėnas suggested that we should first produce a single well-working pair of substrate nucleotide and enzyme before starting to build the activity recording system. This was an amazing suggestion, as it gave us a way to test the performance of the system while we were building it.

At that design stage, the performance of the catalytic biomolecule was measured by the sequence length and number of amplicons. The amplicons had to be fragmented, because standard sequencing methods do not perform well with long reads. The amplicons also had to be barcoded so that we would know which amplicons came from the same droplet, as the amount of amplified DNA was crucial to us.

Week 26

In order to search for possible pitfalls in our current design, we met with Greta Stonytė from the Microtechnologies Sector at Vilnius University.

We talked about our current, theoretical system design. Having more than 5 years of experience in microfluidic technologies, she quickly pointed out potential weaknesses in our design:

First of all, if we encapsulated only a single template in each droplet, the template concentration would be too low for protein expression to occur inside the droplets. This was based on Greta’s own research: it was nearly impossible to express detectable amounts of green fluorescent protein in droplets when only single DNA templates were used.

For this reason we recognized another thing that required innovation: protein expression from single templates. To see how we solved this problem, please see the Rolling Circle Transcription section on the design page.

Secondly, as our design used the phi29 elongation reaction to record the activity information after protein expression, Greta noted that the phi29 mix might not be compatible with the in vitro transcription-translation (IVTT) mix. Because of that, we ran separate experiments early on to see how phi29 works with IVTT. You can see the results by clicking here. Phi29 amplification only started working after the IVTT mix was diluted at least 4 times. It was life-saving advice: had we not caught this huge problem early on, we would not have had enough time to solve it later.

Week 27

After some more research, we had the idea to add the phi29 reaction mix to each droplet after the protein synthesis reaction is finished, by merging the substrate droplets with another population of phi29 droplets. This both allowed the IVTT reaction to be as efficient as possible and, once the droplets were merged, diluted the IVTT mix enough for phi29 to be functional.
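
The dilution achieved by merging can be estimated from the two droplet volumes alone. The sketch below is only an illustration: the function names and the 10 pL droplet volume are hypothetical, and the only number taken from our experiments is the requirement that the IVTT mix be diluted at least 4 times.

```python
def dilution_factor(ivtt_volume_pl: float, phi29_volume_pl: float) -> float:
    """Fold-dilution of the IVTT mix after an IVTT droplet merges with a phi29 droplet."""
    if ivtt_volume_pl <= 0 or phi29_volume_pl < 0:
        raise ValueError("IVTT volume must be positive and phi29 volume non-negative")
    # The IVTT components end up spread over the combined volume.
    return (ivtt_volume_pl + phi29_volume_pl) / ivtt_volume_pl


def min_phi29_volume(ivtt_volume_pl: float, required_dilution: float = 4.0) -> float:
    """Smallest phi29 droplet volume that dilutes the IVTT mix by the required factor."""
    if required_dilution < 1.0:
        raise ValueError("a merge cannot concentrate the IVTT mix")
    return ivtt_volume_pl * (required_dilution - 1.0)


if __name__ == "__main__":
    ivtt_pl = 10.0  # hypothetical IVTT droplet volume in picolitres
    phi29_pl = min_phi29_volume(ivtt_pl)
    print(f"phi29 droplet of at least {phi29_pl} pL "
          f"gives a {dilution_factor(ivtt_pl, phi29_pl)}-fold dilution")
```

Under these assumptions, a 4-fold dilution requires the phi29 droplet to be at least three times the volume of the IVTT droplet.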

Linas Mažutis from the Sector of Microtechnologies (Institute of Biotechnology, Vilnius University) pointed out that our design might not be efficient at this point, as it incorporated difficult and inefficient DNA fragmentation and barcoding steps. He suggested that we find alternatives to this approach. The developed method should have as few steps as possible to avoid data loss, as our goal is not only high-throughput but also high-quality data.

Week 29

In order to avoid DNA fragmentation and barcoding, we thought of a solution: Nanopore sequencing. It can sequence long strands of DNA and, at the same time, detect modifications. That completely removes the need for fragmentation and barcoding and, in turn, eliminates at least 4 burdensome steps from the overall design.

Week 30

After multiple design iterations based on the literature and consultations with field experts, we decided it was time to start building the different parts of our system.

Building CAT-Seq

Week 31

We started to build our first activity recording system prototype and immediately found the first issue with it: the use of a circular ssDNA template.

Firstly, it was extremely difficult to make circularized ssDNA templates, as ssDNA is highly unstable, and both the strand removal from dsDNA and the circularization reactions gave low yields even after weeks of various optimizations.

Secondly, the final amount of elongated DNA we got after the prototype workflow was extremely low: each time, we obtained sub-nanogram amounts of DNA. That was because each droplet initially contained, on average, one template of circularized ssDNA. Although phi29 elongates that ssDNA template during the amplification step, this does not increase the DNA yield significantly, even when millions of droplets are generated in a single experiment. Moreover, the product of phi29 elongation is another long ssDNA molecule, so second-strand synthesis needs to be performed, which is also far from 100 percent efficient. Considering that nanopore sequencing requires at least 100 ng to 1000 ng of starting template, we immediately knew we had a serious prototype design flaw on our hands.
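
A back-of-the-envelope calculation shows why the yield stays so low. The sketch below is illustrative only: the droplet count and the product length per template are hypothetical numbers chosen for the example, and the mass of about 330 g/mol per ssDNA nucleotide is a common approximation. Only the 100 ng sequencing requirement comes from the text above.

```python
AVOGADRO = 6.022e23               # molecules per mole
SSDNA_G_PER_MOL_PER_NT = 330.0    # approximate average mass of one ssDNA nucleotide


def ssdna_mass_ng(length_nt: float, copies: float) -> float:
    """Total mass, in nanograms, of `copies` ssDNA molecules of `length_nt` nucleotides."""
    grams = length_nt * SSDNA_G_PER_MOL_PER_NT * copies / AVOGADRO
    return grams * 1e9


def fold_shortfall(yield_ng: float, required_ng: float = 100.0) -> float:
    """How many times more DNA is needed to reach the sequencing input requirement."""
    return required_ng / yield_ng


if __name__ == "__main__":
    droplets = 1_000_000          # hypothetical: one elongated template per droplet
    product_length_nt = 70_000    # hypothetical length of one elongated ssDNA product
    total_ng = ssdna_mass_ng(product_length_nt, droplets)
    print(f"total yield: {total_ng:.3f} ng")
    print(f"shortfall vs. 100 ng: about {fold_shortfall(total_ng):.0f}-fold")
```

With these assumed numbers, even a million droplets yield only a few hundredths of a nanogram, which is consistent with the sub-nanogram amounts we observed and far from the 100 ng minimum.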

Week 32

In the hope of finding a quick and elegant solution to the problem stemming from the use of a single-stranded DNA template, we met with Audrius Laurynėnas from the Institute of Biochemistry at Vilnius University.

We knew that Audrius has in-depth knowledge of phi29 and similar isothermal polymerases. We discussed different ways that phi29, or any other isothermal polymerase, could be applied to our case to solve our current issues.

The meeting concluded with the idea to try the Multiple Displacement Amplification (MDA) reaction with random hexamers instead of elongating one long single-stranded DNA molecule from the circular ssDNA template. Firstly, this would allow us to use double-stranded circular DNA, which would make the preparation for experiments much quicker and easier. Secondly, the MDA reaction would yield a lot more DNA in each droplet.

Audrius also suggested that we should model our MDA reaction first, in order to know what length distribution of product to expect and what effect the substrate nucleotide might have on DNA amplification. That was an important aspect to consider, because while exploring the literature we found that, even though phi29 does not incorporate many modified nucleotides, it is inhibited and falls off when it tries to do so. That might result in a large number of short fragments, which is not ideal when using Nanopore sequencing. Therefore, we wrote a mathematical description of phi29 DNA amplification that takes into account phi29 inhibition by the substrate nucleotide, in order to find the maximum amount of Substrate Nucleotides we can use without worrying about short sequencing reads. To see the model, please click here.
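
To make the trade-off concrete, here is a deliberately simplified toy sketch. It is not our actual model, which is described on the modelling page. It assumes that the polymerase stops with a fixed probability each time it needs a cytidine and picks up a substrate-modified one, so fragment lengths follow a geometric distribution. All parameter names and default values (`c_fraction`, `falloff_prob`) are hypothetical.

```python
def mean_fragment_length(substrate_fraction: float,
                         c_fraction: float = 0.25,
                         falloff_prob: float = 1.0) -> float:
    """Expected fragment length (nt) under a per-base geometric stopping model.

    substrate_fraction: assumed share of the cytidine pool still carrying the substrate.
    c_fraction: assumed share of template positions that require a cytidine.
    falloff_prob: assumed chance that the polymerase falls off at such an encounter.
    """
    p_stop = substrate_fraction * c_fraction * falloff_prob
    if p_stop <= 0:
        return float("inf")  # no modified nucleotides, so no model-induced stops
    return 1.0 / p_stop


def max_substrate_fraction(target_mean_length_nt: float,
                           c_fraction: float = 0.25,
                           falloff_prob: float = 1.0) -> float:
    """Largest substrate fraction that keeps the expected length at or above the target."""
    if target_mean_length_nt <= 0:
        raise ValueError("target length must be positive")
    return min(1.0, 1.0 / (target_mean_length_nt * c_fraction * falloff_prob))


if __name__ == "__main__":
    limit = max_substrate_fraction(10_000)
    print(f"substrate fraction limit for ~10 kb mean fragments: {limit:.4%}")
    print(f"mean length at that limit: {mean_fragment_length(limit):.0f} nt")
```

The useful point of the toy version is the inverse relationship: under these assumptions, doubling the share of substrate nucleotides halves the expected fragment length.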

Week 33

We redesigned our prototype by adding the MDA reaction. The downside of turning to MDA was the use of dsDNA as a template: we did not know whether Rolling Circle Transcription (RCT) would work, because all of the literature describes RCT only with a circular ssDNA template. Fortunately, we later demonstrated that RCT works well with double-stranded circular DNA molecules.

Week 34

Another issue we had with our first prototype was the inefficiency of droplet merging. It is very important to merge all of the droplets containing the Product Nucleotides, as each droplet that is not merged means lost information about a specific mutant. At that time, fewer than 1 in every 20 to 30 droplets was merged.

Week 35

To solve this huge issue, we met with physicist and microfluidic technology expert Valdemaras Milkus from the Institute of Biotechnology.

We hoped to gain insight into how we could solve that problem effectively. Valdemaras told us how he had approached similar problems; in his case it was generating droplets of a precise size. He solved that problem by modelling and suggested that we do the same. According to Valdemaras, even though microfluidic models are extremely difficult to write, it is a lot harder to find the right parameters experimentally when there are multiple unknown variables. It might be like searching for a needle in a haystack.

Week 37

Following Valdemaras's advice, we derived a full mathematical description of the active fusion of droplets inside a microfluidic chip. Using this model, we solved for multiple parameters, which we then tested experimentally. The predicted parameters successfully provided perfect droplet merging conditions, which we continued to use in the CAT-Seq system experiments. You can find our mathematical description of droplet merging by clicking this link.

Week 38

At our second meeting with Greta, we presented our results and discussed possible improvements to our system.

Greta pointed out that our system might suffer from immense sequencing bias. At that point the system discriminated between different biomolecule activities based on the amount of phi29 amplicons, which in turn depended on how many Product Nucleotides the particular biomolecule had produced. If the molecule variant was highly active, we had a good amount of DNA to sequence. But if the activity was low, there was a high chance of losing that information, for example, due to a low read count.

Week 39

The meeting with Greta pushed us to innovate our system even further. Soon afterwards, we came up with the idea of Reference Nucleotides.

These were specially modified nucleotides that phi29 could easily incorporate into DNA. The nanopore sequencer could also detect those nucleotide modifications while reading the full sequences. We added a specific amount of Reference Nucleotides to the amplification droplets, which are used to supply the other droplets with amplification mix.

After merging, each droplet contained Reference Nucleotides, which helped to balance the overall amount of amplified DNA, even for the mutants with extremely low activities.

Week 41

We presented our finalised design and results at the Science Days event organized by our sponsor Thermo Fisher Scientific. After that, we had a discussion session about the further development of this project after iGEM and received offers to collaborate with bioinformaticians from the Life Sciences Center, in order to start using the immense amount of sequence-to-function data in Artificial Intelligence algorithms.
