Difference between revisions of "Team:TJU China/Model"

 
(40 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{TJU_China}}
+
<!DOCTYPE >
 
<html>
 
<html>
  
 +
<head>
 +
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
 +
    <link rel="stylesheet" href="https://2018.igem.org/Template:TJU_China/default_CSS?action=raw&ctype=text/css" type="text/css">
  
 +
    <link rel="stylesheet" href="https://2018.igem.org/Template:TJU_China/nav_css?action=raw&ctype=text/css" type="text/css">
  
 +
    <style>
 +
        .head {
 +
            font-size: 35px;
 +
            color: #4e72b8;
 +
            text-align: center;
 +
            margin-top: 150px;
 +
        }
  
 +
        .subhead {
 +
            font-size: 25px;
 +
            text-align: center;
 +
            margin-top: 50px;
 +
            line-height: 1.5;
 +
        }
  
 +
        .title {
 +
            font-size: 30px;
 +
            color: #4e72b8;
 +
            text-align: left;
 +
            margin-top: 50px;
 +
            margin-left: 10%;
 +
            line-height: 1.5;
  
<div class="clear"></div>
+
        }
  
 +
        .subtitle {
 +
            font-size: 25px;
 +
            color: #4e72b8;
 +
            text-align: left;
 +
            margin-top: 50px;
 +
            margin-left: 10%;
 +
            line-height: 1.5;
 +
        }
  
<div class="column full_size">
+
        .word {
<h1> Modeling</h1>
+
            font-size: 20px;
 +
            margin-left: 10%;
 +
            margin-right: 10%;
 +
            text-align: left;
 +
            margin-top: 50px;
 +
            line-height: 1.5;
 +
        }
  
<p>Mathematical models and computer simulations provide a great way to describe the function and operation of BioBrick Parts and Devices. Synthetic Biology is an engineering discipline, and part of engineering is simulation and modeling to determine the behavior of your design before you build it. Designing and simulating can be iterated many times in a computer before moving to the lab. This award is for teams who build a model of their system and use it to inform system design or simulate expected behavior in conjunction with experiments in the wetlab.</p>
+
        /* .equation {
 +
            float: left;
 +
            font-size: 20px;
 +
            margin-left: 30%;
 +
            margin-right: 10%;
 +
            text-align: center;
 +
            margin-top: 30px;
 +
            line-height: 1.5;
 +
        }
 +
        .number{
 +
            float: left;
 +
            font-size: 20px;
 +
            margin-top: 40px;
 +
        } */
  
</div>
+
        .figure {
<div class="clear"></div>
+
            font-size: 15px;
 +
            text-align: center;
 +
            margin-top: 20px;
 +
            line-height: 1.5;
 +
        }
  
<div class="column full_size">
+
        .pic {
<h3> Gold Medal Criterion #3</h3>
+
            margin-left: 20%;
<p>
+
            margin-right: 20%;
Convince the judges that your project's design and/or implementation is based on insight you have gained from modeling. This could be either a new model you develop or the implementation of a model from a previous team. You must thoroughly document your model's contribution to your project on your team's wiki, including assumptions, relevant data, model results, and a clear explanation of your model that anyone can understand.
+
            width: 60%;
<br><br>
+
            margin-top: 30px;
The model should impact your project design in a meaningful way. Modeling may include, but is not limited to, deterministic, exploratory, molecular dynamic, and stochastic models. Teams may also explore the physical modeling of a single component within a system or utilize mathematical modeling for predicting function of a more complex device.
+
        }
</p>
+
  
<p>
+
        .doublepic {
Please see the <a href="https://2018.igem.org/Judging/Medals"> 2018
+
            float: left;
Medals Page</a> for more information.
+
            margin-left: 10%;
</p>
+
            width: 37%;
</div>
+
            margin-top: 30%;
 +
        }
  
<div class="column two_thirds_size">
+
        img {
<h3>Best Model Special Prize</h3>
+
            width: 100%;
 +
            height: auto;
 +
        }
 +
    </style>
 +
</head>
  
<p>
+
<body>
To compete for the <a href="https://2018.igem.org/Judging/Awards">Best Model prize</a>, please describe your work on this page  and also fill out the description on the <a href="https://2018.igem.org/Judging/Judging_Form">judging form</a>. Please note you can compete for both the gold medal criterion #3 and the best model prize with this page.  
+
    <div class="nav">
<br><br>
+
        <div class="logo">
You must also delete the message box on the top of this page to be eligible for the Best Model Prize.
+
            <img class="logo_picture" src="https://static.igem.org/mediawiki/2018/f/f7/T--TJU_China--TJU_China.png">
</p>
+
        </div>
 +
        <ul style="margin-left:80px;" class="main">
 +
            <li>
 +
                <div class="nav_logo_pic" style="width:24px;height:24px;margin-top:13px;">
 +
                    <img src="https://static.igem.org/mediawiki/2018/3/37/T--TJU_China--home.png">
 +
                </div>
 +
                <a href="https://2018.igem.org/Team:TJU_China">Home</a>
 +
            </li>
 +
            <li>
 +
                <div class="nav_logo_pic">
 +
                    <img src="https://static.igem.org/mediawiki/2018/9/91/T--TJU_China--project.png">
 +
                </div>
 +
                <a href="https://2018.igem.org/Team:TJU_China/Description">
 +
                    Project
 +
                </a>
 +
                <ul class="drop menu1">
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Description">Description</a>
 +
                    </li>
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Experiments">Experiments</a>
 +
                    </li>
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Notebook">Notebook</a>
 +
                    </li>
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Demonstrate">Demonstrate</a>
 +
                    </li>
 +
                </ul>
  
</div>
+
            </li>
 +
            <li>
  
 +
                <div class="nav_logo_pic">
 +
                    <img src="https://static.igem.org/mediawiki/2018/0/0b/T--TJU_China--parts.png">
 +
                </div>
 +
                <a href="https://2018.igem.org/Team:TJU_China/Parts">
 +
                    Parts
 +
                </a>
 +
                <ul class="drop menu2">
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Parts#Parts_Overview">Parts Overview</a>
 +
                    </li>
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Parts#Basic">Basic</a>
 +
                    </li>
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Parts#Composite">Composite</a>
 +
                    </li>
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Parts#Collection">Collection</a>
 +
                    </li>
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Parts#Improve">Improve</a>
 +
                    </li>
 +
                </ul>
 +
            </li>
 +
            <li>
 +
                <div class="nav_logo_pic">
 +
                    <img src="https://static.igem.org/mediawiki/2018/5/57/T--TJU_China--drylab.png">
 +
                </div>
 +
                <a href="https://2018.igem.org/Team:TJU_China/Model">Model</a>
 +
                <ul class="drop menu4">
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Model#Dynamic_Model">Dynamic Model</a>
 +
                    </li>
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Model#Off-Target_Model">Off-target Model</a>
 +
                    </li>
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Model#Code">Code</a>
 +
                    </li>
  
<div class="column third_size">
+
                </ul>
<div class="highlight decoration_A_full">
+
            </li>
<h3> Inspiration </h3>
+
            <li>
<p>
+
                <div class="nav_logo_pic">
Here are a few examples from previous teams:
+
                    <img src="https://static.igem.org/mediawiki/2018/e/ea/T--TJU_China--hp.png">
</p>
+
                </div>
<ul>
+
                <a href="https://2018.igem.org/Team:TJU_China/Human_Practices">
<li><a href="https://2016.igem.org/Team:Manchester/Model">2016 Manchester</a></li>
+
                    HP
<li><a href="https://2016.igem.org/Team:TU_Delft/Model">2016 TU Delft</li>
+
                </a>
<li><a href="https://2014.igem.org/Team:ETH_Zurich/modeling/overview">2014 ETH Zurich</a></li>
+
                <ul class="drop menu3">
<li><a href="https://2014.igem.org/Team:Waterloo/Math_Book">2014 Waterloo</a></li>
+
                    <li>
</ul>
+
                        <a href="https://2018.igem.org/Team:TJU_China/Human_Practices">Human Practices</a>
</div>
+
                    </li>
</div>
+
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Public_Engagement">Public Engagement</a>
 +
                    </li>
 +
                </ul>
 +
            </li>
 +
            <li>
 +
                <div class="nav_logo_pic">
 +
                    <img src="https://static.igem.org/mediawiki/2018/0/01/T--TJU_China--team.png">
 +
                </div>
 +
                <a href="https://2018.igem.org/Team:TJU_China/Team">
 +
                    Team
 +
                </a>
 +
                <ul class="drop menu7">
 +
                    <li style="list-style-type: none;">
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Team">Members</a>
 +
                    </li>
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Collaborations">Collaborations</a>
 +
                    </li>
 +
                    <li>
 +
                        <a href="https://2018.igem.org/Team:TJU_China/Attributions">Attributions</a>
 +
                    </li>
 +
                </ul>
 +
            </li>
 +
 
 +
        </ul>
 +
 
 +
    </div>
 +
    <div style=" margin-top:28px;z-index:10; border-top: solid #4e72b8 2px;width: 100%; position: fixed;"></div>
 +
 
 +
    <div class="head" id="Dynamic_Model">Dynamic Model of Heavy Metal Detection Biosensor</div>
 +
    <div class="title">1 Introduction</div>
 +
    <div class="word">Modeling is a powerful tool in synthetic biology. It provides us with a necessary engineering approach to characterize
 +
        our pathways quantitatively and predict their performance,thus help us test and modify our design.Through the dynamic
 +
        model of heavy-metal detection biosensor,we hope to gain insights into the characteristics of our whole circuit's
 +
        dynamics.
 +
    </div>
 +
    <div class="title">2 Methods</div>
 +
    <div class="subtitle">2.1 Analysis of metabolic pathways</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/0/01/T--TJU_China--y1.png">
 +
    </div>
 +
    <div class="figure"><b>Figure 1.</b> Metabolic pathways related to plasmid#1</div>
 +
    <div class="word">At the beginning, on the plasmid#1, the promoter $P_{arsR}$ isn't bound with ArsR,thus it is active.ArsR and smURFP are
 +
        transcribed and translated under the control of the promoters $P_{arsR_{u}}$ and $P_{arsR_{d}}$,with subscript u
 +
        and d representing upstream and downstream separately.The subscript l of smURFP in the equation means leaky expression
 +
        without the expression of $As^{3+}$.As ArsR is expressed gradually,it will bind with the promoter $P_{arsR}$ and
 +
        make it inactive.[1]</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/a/a6/T--TJU_China--m1.PNG">
 +
    </div>
 +
    <div class="word">On the plasmid#2,the fusion protein of dCas9 and RNAP(RNA polymerase) are produced after transcription and translation,and
 +
        sgRNA is produced after transcription.
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/2/26/T--TJU_China--m2.png">
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/2/2b/T--TJU_China--2.png">
 +
    </div>
 +
    <div class="figure"><b>Figure 2.</b> Metabolic pathways related to dCas9/RNAP</div>
 +
    <div class="word">dCas9(*RNAP) can bind with its target DNA sequence without cutting, which is at the upstream of the promoter $P_{arsR_{d}}$.Simulataneously,dCas9
 +
        can lead RNAP to bind with the promoter $P_{arsR_{d}}$ and enhance the transcription of smURFP.However,because the
 +
        promoter $P_{arsR_{d}}$ has already bound with ArsR,as a result,RNAP can't bind with the promoter $P_{arsR_{d}}$.
 +
        can’t bind with the promoter $P_{arsR_{d}}$.</div>
 +
    <div class="word">However,at the presence of $As^{3+}$,it can bind with ArsR,then dissociate ArsR and $P_{arsR_{d}}$ , which makes the
 +
        combination of RNAP and $P_{arsR_{d}}$ possible.</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/4/4d/T--TJU_China--m3.png">
 +
    </div>
 +
    <div class="word">We then take degradation into account: </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/a/a1/T--TJU_China--m4.png">
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/3/32/T--TJU_China--m5.png">
 +
    </div>
 +
    <div class="subtitle">2.2 Analysis of ODEs</div>
 +
    <div class="word">Applying mass action kinetic laws,we obtain the following set of differentiak equations.The several complexes involved:Ars$R^*$$P_{arsR}$,$As^{3+}$,${dCas9}^*$RNAP,${dCas9}^*$RNAP:sgRNA,${dCas9}^*$RNAP:${sgRNA}^*P_{arsR}$,
 +
        are respectively abbreviated as $cplx_1$,$cplx_2$,$cplx_3$,$cplx_4$,$cplx_5$.</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/e/e4/T--TJU_China--m6.png">
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/4/45/T--TJU_China--m7.png">
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/b/b7/T--TJU_China--m8.png">
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/a/ad/T--TJU_China--m9.png">
 +
    </div>
 +
    <div class="subtitle">2.3 Simulation</div>
 +
    <div class="word">Our simulation is based on two softwares: MATLAB (SimBiology Toolbox) and COPASI.
 +
        <br> SimBiology Toolbox provides functions for modeling,simulating and analyzing biochemical pathways by the powerful
 +
        computing engine of MATLAB.</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/a/a5/T--TJU_China--s3.png">
 +
    </div>
 +
    <div class="figure"><b>Figure 3.</b>Reaction map generated from the reaction sets above by SimBiology Toolbox</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/0/04/T--TJU_China--11.png">
 +
    </div>
 +
    <div class="figure"><b>Figure 4.</b>Simulation of smURFP production as a function of time by MATLAB Through the figure, we can see that the smURFP
 +
        can gradually increase and reach a steady state after a period in the presence of arsenic ions.</div>
 +
    <div class="subtitle">2.4 Sensitivity</div>
 +
    <div class="word">A good biosystem should have certain stability towards fluctuations in parameters.A good model should reflect this,and
 +
        hence a test for robustness can be essential to the model.
 +
        <br> Robustness analysis can also pinpoint which reactions/parameters that are important for obtaining a specific biological
 +
        behavior.A simple measure for sensitivity is to measure the relative change of a system feaure due to a change in
 +
        a parameter.As for our model,the feature can be the equilibrium concentration of the smURFP(C) for which the sensitivity(S)
 +
        to a parameter k is:
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/1/11/T--TJU_China--m10.png">
 +
    </div>
 +
    <div class="word">After analysis, we found that the concentration of smURFP is relatively sensitive to parameters such as ktx3,ktl3,ktx4,kb4,kb6,kd2,kd5,
 +
        kd6,kd7,kd8,kd11, etc. Among these parameters, except for the parameters that directly affect the production and
 +
        degradation of smURFP,the rest of them are all related to dCas9-RNAP:sgRNA. It shows that our model reflects the
 +
        critical role of dCas9-RNAP:sgRNA,which initially confirms our hypothesis:dCas0-RNAP can enhance transcription to
 +
        increase the concentration of smURFP. However, due to the lack of previous modeling studies on dCas9-RNAP,some kinetic
 +
        parameters may not be very accurate,and due to time limitation,we have not implemented experiments to measure related
 +
        parameters,which may lead to some deviations in our model.
 +
        <br> The sensitivity of each parameter is shown in the figures below.</div>
 +
 
 +
    <div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/f/f1/T--TJU_China--tx1.png">
 +
        </div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/e/e0/T--TJU_China--tl1.png">
 +
        </div>
 +
    </div>
 +
    <div class="figure">(a)sensitivity of ktx1                    (b)sensitivity of ktl1</div>
 +
 
 +
    <div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/8/8a/T--TJU_China--tx2.png">
 +
        </div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/4/41/T--TJU_China--tl2.png">
 +
        </div>
 +
    </div>
 +
    <div class="figure">(c)sensitivity of ktx2(d)sensitivity of ktl2</div>
 +
 
 +
    <div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/1/1a/T--TJU_China--tx3.png">
 +
        </div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/3/30/T--TJU_China--tl3.png">
 +
        </div>
 +
    </div>
 +
    <div class="figure">(e)sensitivity of ktx3 (f)sensitivity of ktl3</div>
 +
 
 +
    <div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/0/03/T--TJU_China--tx4.png">
 +
        </div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/c/c9/T--TJU_China--b1.png">
 +
        </div>
 +
    </div>
 +
    <div class="figure">(g)sensitivity of ktx4 (h)sensitivity of kb1</div>
 +
 
 +
    <div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/6/69/T--TJU_China--b2.png">
 +
        </div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/3/33/T--TJU_China--b3.png">
 +
        </div>
 +
    </div>
 +
    <div class="figure">(i)sensitivity of kb2 (j)sensitivity of kb3</div>
 +
 
 +
    <div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/a/a9/T--TJU_China--b4.png">
 +
        </div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/2/2b/T--TJU_China--b5.png">
 +
        </div>
 +
    </div>
 +
    <div class="figure">(a)sensitivity of kb4 (b)sensitivity of kb5</div>
 +
 
 +
    <div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/e/e9/T--TJU_China--b6.png">
 +
        </div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/5/5b/T--TJU_China--d1.png">
 +
        </div>
 +
    </div>
 +
    <div class="figure">(c)sensitivity of kb6 (d)sensitivity of kd1</div>
 +
 
 +
    <div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/5/56/T--TJU_China--d2.png">
 +
        </div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/0/0a/T--TJU_China--d3.png">
 +
        </div>
 +
    </div>
 +
    <div class="figure">(e)sensitivity of kd2 (f)sensitivity of kd3</div>
 +
 
 +
    <div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/4/44/T--TJU_China--d4.png">
 +
        </div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/2/2f/T--TJU_China--d5.png">
 +
        </div>
 +
    </div>
 +
    <div class="figure">(g)sensitivity of kd4 (h)sensitivity of kd5</div>
 +
 
 +
    <div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/e/e4/T--TJU_China--d6.png">
 +
        </div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/3/35/T--TJU_China--d7.png">
 +
        </div>
 +
    </div>
 +
    <div class="figure">(a)sensitivity of kd6 (b)sensitivity of kd7</div>
 +
 
 +
    <div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/c/c5/T--TJU_China--d8.png">
 +
        </div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/4/44/T--TJU_China--d9.png">
 +
        </div>
 +
    </div>
 +
    <div class="figure">(c)sensitivity of kd8 (d)sensitivity of kd9</div>
 +
 
 +
    <div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/3/3d/T--TJU_China--d10.png">
 +
        </div>
 +
        <div class="doublepic">
 +
            <img src="https://static.igem.org/mediawiki/2018/7/7a/T--TJU_China--d11.png">
 +
        </div>
 +
    </div>
 +
    <div class="figure">(e)sensitivity of kd10 (f)sensitivity of kd11</div>
 +
    <div class="figure">Note:The ordinate axis represents the sensitivity S,and the abscissa axis is the parameter k for which we want to evaluate
 +
        the sensitivity.</div>
 +
    <div class="subtitle">2.5 Application of the model</div>
 +
    <div class="word">Since the goal of our project is to increase the sensitivity of biosensors by introducing a complex of dCas9-RNAP and
 +
        sgRNA, and one of the purposes of our model is to explore whether this complex is effective.So we assume a reasonable
 +
        and large enough concentration value for this complex. We use the concentration of Glyceraldehyde-3-phosphate dehydrogenase
 +
        A as the assumed concentration.Glyceraldehyde-3-phosphate dehydrogenase A(gapA) is a crucial enzyme in the glycolytic
 +
        pathway,and the gene encoding this enzyme is a housekeeping gene in E.coli cells with high expression levels.We find
 +
        in the literature that the protein mass of gapA is 48645 fg/cell,and its molecular weight is 35492 Da.[4] The amount
 +
        of abundance of Glyceraldehyde-3-phosphate dehydrogenase A protein per cell can be calculated as follows:
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/c/c3/T--TJU_China--m11.png">
 +
    </div>
 +
    <div class="word">As for the size of E.coli,we found relevant data from the literature,as the figure below shows.[5]</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/a/a7/T--TJU_China--dc.png">
 +
    </div>
 +
    <div class="figure"><b>Figure 8.</b>Size of E.coli </div>
 +
    <div class="word">The volume of E.coli can be calculated as follows:</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/2/2a/T--TJU_China--m12.png">
 +
    </div>
 +
    <div class="word">Then the concentration of Glyceraldehyde-3-phosphate dehydrogenase A protein in the cell can be determined:</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/3/3a/T--TJU_China--m13.png">
 +
    </div>
 +
    <div class="word">With this concentration,we can get very nice results:</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/d/d1/T--TJU_China--23.png">
 +
    </div>
 +
    <div class="figure"><b>Figure 9.</b>smURFP production with enough dCas9-RNAP:sgRNA</div>
 +
    <div class="word">Compared to the diagram without introducing dCas9-RNAP:sgRNA:</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/0/0b/T--TJU_China--21.png">
 +
    </div>
 +
    <div class="figure"><b>Figure 10.</b>smURFP production within a reasonable time frame</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/6/6c/T--TJU_China--22.png">
 +
    </div>
 +
    <div class="figure"><b>Figure 11.</b>smURFP production reached equilibrium but it takes a long time</div>
 +
    <div class="word">From these three figures, we can conclude that dCas9-RNAP:sgRNA does have the effect of promoting transcription and increasing
 +
        fluorescence intensity,thereby increasing sensitivity,as long as its concentration is sufficient.This result enhances
 +
        the confidence of the experimental group,and they need to try to improve the expression of dCas9-RNAP:sgRNA in E.coli
 +
        without having to doubt its role.
 +
    </div>
 +
    <div class="head">References</div>
 +
    <div class="word">[1] LA Pola-Lopez et al."Novel arsenic biosensor "POLA" obtained by a genetically modified E.coli bioreporter cell" .In:Sensors
 +
        and Actuators B:Chemical254(2018),pp.1061-1068.
 +
        <br>[2] Yves Berset et al."Mechanistic Modeling of Genetic Circults for ArsR Arsenic Regulation".In:ACS synthetic biology
 +
        6.5(2017),pp.862-874.
 +
        <br>[3] Eyal Karzbrun et al."Coares-grained dynamics of protein synthesis in a cell-free system".In:Phtsical review letters
 +
        106.4(2011),p.048104.
 +
        <br> [4] Yasushi Ishihama et al."Exponentially modified protein abundance index(emPAI) for estimation of absolute protein
 +
        amount in proteomics by the number of sequenced peptides per protein".In:Molecular E Cellular Proteomics 4.9(2005),pp.1265-1272.
 +
        <br>[5] Nili Crossman,Eliora Z Ron,and Conrad L Woldringh."Changes in cell dimensions during amino acid starvation of
 +
        Escherichia coli."In:Journal of bacteriology 152.1(1982),pp.35-41.
 +
    </div>
 +
 
 +
 
 +
 
 +
    <div class="head" id="Off-Target_Model">Free Energy Model of Off-target Problem</div>
 +
 
 +
    <div class="title">1 Background</div>
 +
    <div class="word">Nowadays,the analysis of cleavage possibility can be devided into two type,i,e.meta-empirical and empirical.For the first
 +
        one, people develop the various score function based on experiment data to evaluate if a sgDNA is good or bad.Correspondingly,the
 +
        other group chooce set up a theoretical model based on kinetic theory.But because using many approximations,it has
 +
        drawbacks inevitably.
 +
        <br>Our model aims to investigate the off-target problem in gene editing by the CRISPR-Cas system,therefore finding efficient
 +
        ways to enhance the reliability of gene editing.The foundations of thsi model are mostly simple probability theory
 +
        and dynamic deduction,which make our model both convincing and pellucid.</div>
 +
 
 +
    <div class="title">2 Introduction</div>
 +
    <div class="word">
 +
        <br>Currently,people have constructed a similar model as illustrated in the following figure1.There are four common rules
 +
        when Cas nuclease cleaves the DNA[1].
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/4/4f/T--TJU_China--z1.png">
 +
    </div>
 +
    <div class="figure"><b>Figure 1.</b>schematic diagram</div>
 +
    <div class="word">(1)Seed region:single mismatch(es) within a PAM proximal seed region can completely disrupt interference.
 +
        <br> (2)Mismatch spread:when mismatches are outside the seed region,off-targets with spread out mismatches are targeted
 +
        most strongly.
 +
        <br> (3)Differential binding versus differential cleavage:binding is more tolerant of mismatched than cleavage.
 +
        <br>(4)Specificity-efficiency decoupling:weakened protein-DNA interatctions can improve target selectivity while still
 +
        maintaining efficiency.
 +
        <br>Based on these four rules,probability theory is applied in to explain it.As we know,there are always only two results
 +
        in an experiment,which are successful cleavage and unsuccessful cleavage.In math view,it can be one-hot encoded,and
 +
        they are corresponding to 1 and 0.
 +
    </div>
 +
    <div class="word">However,giving a 0/1 prediction is hard and unreliable.To solve this problem, one choice is to consider it as a cluster
 +
        problem;however,it is easier to find a continuous quantitative function rather than to find a suitable cluster distance
 +
        function.Sonaturally,finding an approximate probability distribution is a good choice.
 +
        <br> In many target design toolkits,they use a score function with several param eters which can generate a score to
 +
        evaluate whether the target is good or bad. Here we consider the score function has the similar ability to probability,which
 +
        is a description of ”better” or ”worse” while can’t affirm whether successful cleavage willappear.For our case,our
 +
        goal is to find a function indicating which target is BETTER.
 +
        <br> Considering the difference between model prediction and experimental data,our model consists of two aspects,which
 +
        are kinetic inference and an updating module.
 +
    </div>
 +
    <div class="title">3 Methods</div>
 +
    <div class="subtitle">3.1 Knietic module</div>
 +
    <div class="word">Figure 2 shows that the whole binding-cleavage process begins with the bind ing between PAM andprotein.Therefore,it corresponds
 +
        to rule1 mentioned before.And as the reaction proceeds,every step of it is reversible,and its irre versibility mainly
 +
        depends on the binding energy of two DNA bases. The boundary probability Pclv;N,representing the probability of matching
 +
        at the Nth position(the last position of sgRNA) of nucleotide base,is given by:
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/9/92/T--TJU_China--m14.png">
 +
    </div>
 +
    <div class="word">Where k is the reaction rate constant; f represents the forward reactions;b represents the backward reaction.And </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/e/ec/T--TJU_China--zm1.PNG">
 +
    </div>
 +
    <div class="word">So for a complete match:</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/0/00/T--TJU_China--zm2.png">
 +
    </div>
 +
    <div class="word">Consider the rate constant $K_f(i)$ and #k_b(i)$:</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/b/b2/T--TJU_China--zm3.png">
 +
    </div>
 +
    <div class="word">where $F_i$ means free energy of each metastable state,$T_{i,i+1}$means the highest free energy point on the reaction
 +
        path from position i to position i+1.Therefore,$T_{i,i+1}$-$F_i$ is the activation energy of forward reaction and
 +
        $T_{i,i+1}$-$F_i$ is activation energy of the backward reaction.
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/0/02/T--TJU_China--zm4.png">
 +
    </div>
 +
    <div class="word">We define</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/f/fe/T--TJU_China--zm5.png">
 +
    </div>
 +
    <div class="word">So</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/9/93/T--TJU_China--zm6.png">
 +
    </div>
 +
    <div class="word">From the above,it is clear that the matching probability depends only on the state transition energy,not on the free
 +
        energy of the metastable states.If we assume there is one dominant minimal bias,say for n = n ∗ ,then this equation
 +
        can be approximated as:</div>
 +
 
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/7/72/T--TJU_China--zm7.png">
 +
    </div>
 +
    <div class="word">
 +
        To sum up,the cleavage possibility mainly relies on the free energy change, and PAM appears as a significant energy decline.
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/7/7c/T--TJU_China--AT.png">
 +
    </div>
 +
    <div class="figure"><b>Figure 2.</b>AT</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/3/37/T--TJU_China--CG.png">
 +
    </div>
 +
    <div class="figure"><b>Figure 3.</b>CG</div>
 +
    <div class="word">The kinetic model sets up a framework to build the relationship between bind ing probability and the numbers of nucleotide
 +
        matches and mismatches.In con sideration of this problem more carefully,the binding probability becomes equal to
 +
        the analysis of energy change,because we know the binding energy of A/T and C/G is different due to the different
 +
        hydrogen bonds between them(figure 2 and 3),and the energy decrease in C/G is approximately 1.5 folds as A/T.Sim
 +
        ilarly,the mismatch has more types of variance because the sizes of nucleotides are various.Hence,the types of the
 +
        mismatched base pair are classified b ygroup volume,i.e.,two pyrimidines(such as C/T,“Large”),pyrimidine and purine(such
 +
        as C/A,"Medium"),two purine(such as G/T,"Small").Hence,the probability can be calculated using the following equation:
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/5/53/T--TJU_China--zm8.png">
 +
    </div>
 +
    <div class="subtitle">3.2 Parameter Optimization</div>
 +
    <div class="word">From the kinetic model,we can get an output,which is the binding probability.It needs to be noticed that the parameters
 +
        we choose should make results well discriminated,because in a cleavage experiment,we only have two outcomes,successful(1)
 +
        and unsuccessful(0).
 +
        <br>To make our predictions from the model more approximate to experiment(facts),we set a regression module and implement
 +
        parameter optimization.
 +
        <br> Here,the method we choose is stochastic gradient descent(SGD) and cross entropy.And their principle can be concluded
 +
        as follows.
 +
    </div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/a/a4/T--TJU_China--zm9.png">
 +
    </div>
 +
    <div class="word">where θ means the parameters array and J means the loss function. Considering the difference in gradient calculation,we
 +
        use difference to substi tute differential aim to accelerate operating speed.</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/f/f3/T--TJU_China--zm10.png">
 +
    </div>
 +
    <div class="word">By using this simple method,our model can be more vibrant,updating using newest data and becoming more reliable.</div>
 +
 
 +
    <div class="subtitle">3.3 Generating sgRNA Candidates</div>
 +
    <div class="word">Meanwhile,we designed a program to generate all the sgRNA candidates for a target gene,and combined with the previous
 +
        model,we can compare and rank all the candidates.The principle of the program is very simple.We use PAM as the input
 +
        and collect the arrays with a certain length which contain the same beginning code as PAM. </div>
 +
 
 +
    <div class="title">4 Results</div>
 +
    <div class="word">Here,we set the parameters as default values and observe its performance.Our default parameters are set based on three
 +
        simple rule.Firstly,because of the different numbers of hydrogen bonds between A/T and C/G,the energy decrease due
 +
        to A/T binding are 1.5 times to C/G binding.So the parameter#1 is 1.5 times parameter#2.Secondly,it's universally
 +
        known that big compounds with small distance will have a high energy because of repulsion.The mismatching combination
 +
        between two nucleotides can be classified into three parts-large,medium and small,and we should set other parameters
 +
        in the same order of values.Thirdly,we use the parameters from[1] as reference.In total,we set the parameters default
 +
        values as[0.06,0.09,0.3,0.6,0.03].
 +
        <br>As the following figure shows,the energy always decreases or has some turning point because of mismatch and is always
 +
        nagative.The yellow line represents complete match.The blue line and the red line are two examples of mismatched
 +
        at different positions.For example,the red line has a peak due to a mismatch, and in our model,we find that it doesn’t
 +
        make the energy positive.That means that in this reaction process there is some force like ”momentum” pushing it
 +
        to proceed and cross the energy peak.For a off-target examination to every location of all the DNA in a system(a
 +
        genome),there will also be positive energy result,which is obviously not a feasible solution(will not lead to off-target).This
 +
        kind of results is omitted in the plot.</div>
 +
 
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/1/1c/T--TJU_China--energy_change.png">
 +
    </div>
 +
    <div class="figure"><b>Figure 4.</b>Energy change of sgRNA candidates binding to DNA</div>
 +
    <div class="pic">
 +
        <img src="https://static.igem.org/mediawiki/2018/5/55/T--TJU_China--fig1.png">
 +
    </div>
 +
    <div class="figure"><b>Figure 5.</b> The possibility of sgRNA candidates binding to DNA's different locations</div>
 +
    <div class="word">After testing our code’s running time,we find its rate can reach approximately $2*10^8$ base/h(under parallel computing
 +
        in 4cores).
 +
        <br>Besides the default parameters,we hope our model can hit more true data.So after we get the experiment data,we can
 +
        use the parameter optimization method mentioned before to get more precise parameters. 2×108base/h(underparallelcomputingin4cores).
 +
        Besidesthedefaultparameters,wehopeourmodelcanhitmoretruedata.So afterwegettheexperimentdata,wecanusetheparameteroptimizationmethod
 +
    </div>
 +
 
 +
    <div class="head">References</div>
 +
    <div class="word">[1] family=Klein,familyi=K.,given=Misha,giveni=M.,"Hybridization kinet ics explains CRISPR-Cas off-targeting rules".In:Cell
 +
        reports 22.6(2018),pp.1413-1423.
 +
    </div>
 +
 
 +
    <div class="head" id="Code">Code</div>
 +
    <div class="word" style="line-height:1.5;">The codes for our dynamic model and off-target model are available at https://github.com/sherrybao222/TJU_China-iGEM-2018.
 +
 
 +
            <br>For the dynamic model, they include the MATLAB files used to generate our results.
 +
           
 +
            <br>The "whole model" file is a simulation of the heavy metal detection biosensor's pathways.
 +
           
 +
            <br>The "without CRISPR" file is a simulation of the biosensor without dCas9-RNAP and sgRNA plasmids.
 +
           
 +
            <br>By comparing the "whole model" with the "without CRISPR" output, the significant role of dCas9-RNAP in the biosensor can be determined.
 +
           
 +
            <br>The sensitivity file is a sensitivity analysis of the parameters involved in the whole model.
 +
           
 +
            <br>The assumed concentration file is a simulation of the biosensor in the case of a hypothetical concentration of dCas9-RNAP.
 +
           
 +
            <br>For the off-target model, they include the MATLAB files used to build our model and implement parameter optimization.
 +
           
 +
            <br>diff0: distinguish five kinds of base pairs and collect the information;
 +
           
 +
            <br>count1: record the numbers of five types of nucleotides combinations;
 +
           
 +
            <br>possible: for this function, its input is the target gene and the genome, and its output is a score which means the possibility of binding;
 +
           
 +
            <br>loss_function: used for calculating loss function;
 +
           
 +
            <br>training: used to optimize the parameter based on experimental data.
 +
            </div>
 +
 
 +
 
 +
 
 +
 
 +
 
 +
    <script src="https://2018.igem.org/common/MathJax-2.5-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
 +
    <script type="text/x-mathjax-config">
 +
    MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}});
 +
  </script>
 +
</body>
 +
<!-- <div>
 +
<div class="pic"><img src=""></div>
 +
<div class="word"></div>
 +
<div class="figure">Figure </div>
 +
    <div class="title"></div>
 +
    <div class="subtitle"></div>
 +
 
 +
 
 +
 
 +
$P_{arsR_{d}}$
 +
 
 +
$As^{3+}$
 +
<div class="equation">  \(P_{J23104} \xrightarrow {k_{tx1}}  P_{J23104} + mRNA_{ArsR}\)</div> <div class="number">(1)</div>
 +
</div> -->
  
 
</html>
 
</html>

Latest revision as of 01:07, 18 October 2018

<!DOCTYPE >

Dynamic Model of Heavy Metal Detection Biosensor
1 Introduction
Modeling is a powerful tool in synthetic biology. It provides us with a necessary engineering approach to characterize our pathways quantitatively and predict their performance,thus help us test and modify our design.Through the dynamic model of heavy-metal detection biosensor,we hope to gain insights into the characteristics of our whole circuit's dynamics.
2 Methods
2.1 Analysis of metabolic pathways
Figure 1. Metabolic pathways related to plasmid#1
At the beginning, on the plasmid#1, the promoter $P_{arsR}$ isn't bound with ArsR,thus it is active.ArsR and smURFP are transcribed and translated under the control of the promoters $P_{arsR_{u}}$ and $P_{arsR_{d}}$,with subscript u and d representing upstream and downstream separately.The subscript l of smURFP in the equation means leaky expression without the expression of $As^{3+}$.As ArsR is expressed gradually,it will bind with the promoter $P_{arsR}$ and make it inactive.[1]
On the plasmid#2,the fusion protein of dCas9 and RNAP(RNA polymerase) are produced after transcription and translation,and sgRNA is produced after transcription.
Figure 2. Metabolic pathways related to dCas9/RNAP
dCas9(*RNAP) can bind with its target DNA sequence without cutting, which is at the upstream of the promoter $P_{arsR_{d}}$.Simulataneously,dCas9 can lead RNAP to bind with the promoter $P_{arsR_{d}}$ and enhance the transcription of smURFP.However,because the promoter $P_{arsR_{d}}$ has already bound with ArsR,as a result,RNAP can't bind with the promoter $P_{arsR_{d}}$. can’t bind with the promoter $P_{arsR_{d}}$.
However,at the presence of $As^{3+}$,it can bind with ArsR,then dissociate ArsR and $P_{arsR_{d}}$ , which makes the combination of RNAP and $P_{arsR_{d}}$ possible.
We then take degradation into account:
2.2 Analysis of ODEs
Applying mass action kinetic laws,we obtain the following set of differentiak equations.The several complexes involved:Ars$R^*$$P_{arsR}$,$As^{3+}$,${dCas9}^*$RNAP,${dCas9}^*$RNAP:sgRNA,${dCas9}^*$RNAP:${sgRNA}^*P_{arsR}$, are respectively abbreviated as $cplx_1$,$cplx_2$,$cplx_3$,$cplx_4$,$cplx_5$.
2.3 Simulation
Our simulation is based on two softwares: MATLAB (SimBiology Toolbox) and COPASI.
SimBiology Toolbox provides functions for modeling,simulating and analyzing biochemical pathways by the powerful computing engine of MATLAB.
Figure 3.Reaction map generated from the reaction sets above by SimBiology Toolbox
Figure 4.Simulation of smURFP production as a function of time by MATLAB Through the figure, we can see that the smURFP can gradually increase and reach a steady state after a period in the presence of arsenic ions.
2.4 Sensitivity
A good biosystem should have certain stability towards fluctuations in parameters.A good model should reflect this,and hence a test for robustness can be essential to the model.
Robustness analysis can also pinpoint which reactions/parameters that are important for obtaining a specific biological behavior.A simple measure for sensitivity is to measure the relative change of a system feaure due to a change in a parameter.As for our model,the feature can be the equilibrium concentration of the smURFP(C) for which the sensitivity(S) to a parameter k is:
After analysis, we found that the concentration of smURFP is relatively sensitive to parameters such as ktx3,ktl3,ktx4,kb4,kb6,kd2,kd5, kd6,kd7,kd8,kd11, etc. Among these parameters, except for the parameters that directly affect the production and degradation of smURFP,the rest of them are all related to dCas9-RNAP:sgRNA. It shows that our model reflects the critical role of dCas9-RNAP:sgRNA,which initially confirms our hypothesis:dCas0-RNAP can enhance transcription to increase the concentration of smURFP. However, due to the lack of previous modeling studies on dCas9-RNAP,some kinetic parameters may not be very accurate,and due to time limitation,we have not implemented experiments to measure related parameters,which may lead to some deviations in our model.
The sensitivity of each parameter is shown in the figures below.
(a)sensitivity of ktx1 (b)sensitivity of ktl1
(c)sensitivity of ktx2(d)sensitivity of ktl2
(e)sensitivity of ktx3 (f)sensitivity of ktl3
(g)sensitivity of ktx4 (h)sensitivity of kb1
(i)sensitivity of kb2 (j)sensitivity of kb3
(a)sensitivity of kb4 (b)sensitivity of kb5
(c)sensitivity of kb6 (d)sensitivity of kd1
(e)sensitivity of kd2 (f)sensitivity of kd3
(g)sensitivity of kd4 (h)sensitivity of kd5
(a)sensitivity of kd6 (b)sensitivity of kd7
(c)sensitivity of kd8 (d)sensitivity of kd9
(e)sensitivity of kd10 (f)sensitivity of kd11
Note:The ordinate axis represents the sensitivity S,and the abscissa axis is the parameter k for which we want to evaluate the sensitivity.
2.5 Application of the model
Since the goal of our project is to increase the sensitivity of biosensors by introducing a complex of dCas9-RNAP and sgRNA, and one of the purposes of our model is to explore whether this complex is effective.So we assume a reasonable and large enough concentration value for this complex. We use the concentration of Glyceraldehyde-3-phosphate dehydrogenase A as the assumed concentration.Glyceraldehyde-3-phosphate dehydrogenase A(gapA) is a crucial enzyme in the glycolytic pathway,and the gene encoding this enzyme is a housekeeping gene in E.coli cells with high expression levels.We find in the literature that the protein mass of gapA is 48645 fg/cell,and its molecular weight is 35492 Da.[4] The amount of abundance of Glyceraldehyde-3-phosphate dehydrogenase A protein per cell can be calculated as follows:
As for the size of E.coli,we found relevant data from the literature,as the figure below shows.[5]
Figure 8.Size of E.coli
The volume of E.coli can be calculated as follows:
Then the concentration of Glyceraldehyde-3-phosphate dehydrogenase A protein in the cell can be determined:
With this concentration,we can get very nice results:
Figure 9.smURFP production with enough dCas9-RNAP:sgRNA
Compared to the diagram without introducing dCas9-RNAP:sgRNA:
Figure 10.smURFP production within a reasonable time frame
Figure 11.smURFP production reached equilibrium but it takes a long time
From these three figures, we can conclude that dCas9-RNAP:sgRNA does have the effect of promoting transcription and increasing fluorescence intensity,thereby increasing sensitivity,as long as its concentration is sufficient.This result enhances the confidence of the experimental group,and they need to try to improve the expression of dCas9-RNAP:sgRNA in E.coli without having to doubt its role.
References
[1] LA Pola-Lopez et al."Novel arsenic biosensor "POLA" obtained by a genetically modified E.coli bioreporter cell" .In:Sensors and Actuators B:Chemical254(2018),pp.1061-1068.
[2] Yves Berset et al."Mechanistic Modeling of Genetic Circults for ArsR Arsenic Regulation".In:ACS synthetic biology 6.5(2017),pp.862-874.
[3] Eyal Karzbrun et al."Coares-grained dynamics of protein synthesis in a cell-free system".In:Phtsical review letters 106.4(2011),p.048104.
[4] Yasushi Ishihama et al."Exponentially modified protein abundance index(emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein".In:Molecular E Cellular Proteomics 4.9(2005),pp.1265-1272.
[5] Nili Crossman,Eliora Z Ron,and Conrad L Woldringh."Changes in cell dimensions during amino acid starvation of Escherichia coli."In:Journal of bacteriology 152.1(1982),pp.35-41.
Free Energy Model of Off-target Problem
1 Background
Nowadays,the analysis of cleavage possibility can be devided into two type,i,e.meta-empirical and empirical.For the first one, people develop the various score function based on experiment data to evaluate if a sgDNA is good or bad.Correspondingly,the other group chooce set up a theoretical model based on kinetic theory.But because using many approximations,it has drawbacks inevitably.
Our model aims to investigate the off-target problem in gene editing by the CRISPR-Cas system,therefore finding efficient ways to enhance the reliability of gene editing.The foundations of thsi model are mostly simple probability theory and dynamic deduction,which make our model both convincing and pellucid.
2 Introduction

Currently,people have constructed a similar model as illustrated in the following figure1.There are four common rules when Cas nuclease cleaves the DNA[1].
Figure 1.schematic diagram
(1)Seed region:single mismatch(es) within a PAM proximal seed region can completely disrupt interference.
(2)Mismatch spread:when mismatches are outside the seed region,off-targets with spread out mismatches are targeted most strongly.
(3)Differential binding versus differential cleavage:binding is more tolerant of mismatched than cleavage.
(4)Specificity-efficiency decoupling:weakened protein-DNA interatctions can improve target selectivity while still maintaining efficiency.
Based on these four rules,probability theory is applied in to explain it.As we know,there are always only two results in an experiment,which are successful cleavage and unsuccessful cleavage.In math view,it can be one-hot encoded,and they are corresponding to 1 and 0.
However,giving a 0/1 prediction is hard and unreliable.To solve this problem, one choice is to consider it as a cluster problem;however,it is easier to find a continuous quantitative function rather than to find a suitable cluster distance function.Sonaturally,finding an approximate probability distribution is a good choice.
In many target design toolkits,they use a score function with several param eters which can generate a score to evaluate whether the target is good or bad. Here we consider the score function has the similar ability to probability,which is a description of ”better” or ”worse” while can’t affirm whether successful cleavage willappear.For our case,our goal is to find a function indicating which target is BETTER.
Considering the difference between model prediction and experimental data,our model consists of two aspects,which are kinetic inference and an updating module.
3 Methods
3.1 Knietic module
Figure 2 shows that the whole binding-cleavage process begins with the bind ing between PAM andprotein.Therefore,it corresponds to rule1 mentioned before.And as the reaction proceeds,every step of it is reversible,and its irre versibility mainly depends on the binding energy of two DNA bases. The boundary probability Pclv;N,representing the probability of matching at the Nth position(the last position of sgRNA) of nucleotide base,is given by:
Where k is the reaction rate constant; f represents the forward reactions;b represents the backward reaction.And
So for a complete match:
Consider the rate constant $K_f(i)$ and #k_b(i)$:
where $F_i$ means free energy of each metastable state,$T_{i,i+1}$means the highest free energy point on the reaction path from position i to position i+1.Therefore,$T_{i,i+1}$-$F_i$ is the activation energy of forward reaction and $T_{i,i+1}$-$F_i$ is activation energy of the backward reaction.
We define
So
From the above,it is clear that the matching probability depends only on the state transition energy,not on the free energy of the metastable states.If we assume there is one dominant minimal bias,say for n = n ∗ ,then this equation can be approximated as:
To sum up,the cleavage possibility mainly relies on the free energy change, and PAM appears as a significant energy decline.
Figure 2.AT
Figure 3.CG
The kinetic model sets up a framework to build the relationship between bind ing probability and the numbers of nucleotide matches and mismatches.In con sideration of this problem more carefully,the binding probability becomes equal to the analysis of energy change,because we know the binding energy of A/T and C/G is different due to the different hydrogen bonds between them(figure 2 and 3),and the energy decrease in C/G is approximately 1.5 folds as A/T.Sim ilarly,the mismatch has more types of variance because the sizes of nucleotides are various.Hence,the types of the mismatched base pair are classified b ygroup volume,i.e.,two pyrimidines(such as C/T,“Large”),pyrimidine and purine(such as C/A,"Medium"),two purine(such as G/T,"Small").Hence,the probability can be calculated using the following equation:
3.2 Parameter Optimization
From the kinetic model,we can get an output,which is the binding probability.It needs to be noticed that the parameters we choose should make results well discriminated,because in a cleavage experiment,we only have two outcomes,successful(1) and unsuccessful(0).
To make our predictions from the model more approximate to experiment(facts),we set a regression module and implement parameter optimization.
Here,the method we choose is stochastic gradient descent(SGD) and cross entropy.And their principle can be concluded as follows.
where θ means the parameters array and J means the loss function. Considering the difference in gradient calculation,we use difference to substi tute differential aim to accelerate operating speed.
By using this simple method,our model can be more vibrant,updating using newest data and becoming more reliable.
3.3 Generating sgRNA Candidates
Meanwhile,we designed a program to generate all the sgRNA candidates for a target gene,and combined with the previous model,we can compare and rank all the candidates.The principle of the program is very simple.We use PAM as the input and collect the arrays with a certain length which contain the same beginning code as PAM.
4 Results
Here,we set the parameters as default values and observe its performance.Our default parameters are set based on three simple rule.Firstly,because of the different numbers of hydrogen bonds between A/T and C/G,the energy decrease due to A/T binding are 1.5 times to C/G binding.So the parameter#1 is 1.5 times parameter#2.Secondly,it's universally known that big compounds with small distance will have a high energy because of repulsion.The mismatching combination between two nucleotides can be classified into three parts-large,medium and small,and we should set other parameters in the same order of values.Thirdly,we use the parameters from[1] as reference.In total,we set the parameters default values as[0.06,0.09,0.3,0.6,0.03].
As the following figure shows,the energy always decreases or has some turning point because of mismatch and is always nagative.The yellow line represents complete match.The blue line and the red line are two examples of mismatched at different positions.For example,the red line has a peak due to a mismatch, and in our model,we find that it doesn’t make the energy positive.That means that in this reaction process there is some force like ”momentum” pushing it to proceed and cross the energy peak.For a off-target examination to every location of all the DNA in a system(a genome),there will also be positive energy result,which is obviously not a feasible solution(will not lead to off-target).This kind of results is omitted in the plot.
Figure 4.Energy change of sgRNA candidates binding to DNA
Figure 5. The possibility of sgRNA candidates binding to DNA's different locations
After testing our code’s running time,we find its rate can reach approximately $2*10^8$ base/h(under parallel computing in 4cores).
Besides the default parameters,we hope our model can hit more true data.So after we get the experiment data,we can use the parameter optimization method mentioned before to get more precise parameters. 2×108base/h(underparallelcomputingin4cores). Besidesthedefaultparameters,wehopeourmodelcanhitmoretruedata.So afterwegettheexperimentdata,wecanusetheparameteroptimizationmethod
References
[1] family=Klein,familyi=K.,given=Misha,giveni=M.,"Hybridization kinet ics explains CRISPR-Cas off-targeting rules".In:Cell reports 22.6(2018),pp.1413-1423.
Code
The codes for our dynamic model and off-target model are available at https://github.com/sherrybao222/TJU_China-iGEM-2018.
For the dynamic model, they include the MATLAB files used to generate our results.
The "whole model" file is a simulation of the heavy metal detection biosensor's pathways.
The "without CRISPR" file is a simulation of the biosensor without dCas9-RNAP and sgRNA plasmids.
By comparing the "whole model" with the "without CRISPR" output, the significant role of dCas9-RNAP in the biosensor can be determined.
The sensitivity file is a sensitivity analysis of the parameters involved in the whole model.
The assumed concentration file is a simulation of the biosensor in the case of a hypothetical concentration of dCas9-RNAP.
For the off-target model, they include the MATLAB files used to build our model and implement parameter optimization.
diff0: distinguish five kinds of base pairs and collect the information;
count1: record the numbers of five types of nucleotides combinations;
possible: for this function, its input is the target gene and the genome, and its output is a score which means the possibility of binding;
loss_function: used for calculating loss function;
training: used to optimize the parameter based on experimental data.