Difference between revisions of "Team:Nanjing-China/Model"

Line 6: Line 6:
 
<link rel="stylesheet" type="text/css" href="https://2018.igem.org/Team:Nanjing-China/CSS:loader?action=raw&ctype=text/css" />
 
<link rel="stylesheet" type="text/css" href="https://2018.igem.org/Team:Nanjing-China/CSS:loader?action=raw&ctype=text/css" />
 
<style type="text/css">
 
<style type="text/css">
#HQ_page .word table{ border-top:rgba(102,102,102,1) double 3px ;border-bottom:rgba(102,102,102,1) double 3px; font-size:85%; }
+
#HQ_page .word table{ border-top:rgba(102,102,102,1) double 3px ;border-bottom:rgba(102,102,102,1) double 3px; font-size:85%; }
 
#HQ_page td{border:thin rgba(102,102,102,1) 1.5px;}
 
#HQ_page td{border:thin rgba(102,102,102,1) 1.5px;}
 
#HQ_page .word .word-1 p{text-align:center;}
 
#HQ_page .word .word-1 p{text-align:center;}
Line 81: Line 81:
 
       <ul><li><a href="#intro">Introduction</a></li>
 
       <ul><li><a href="#intro">Introduction</a></li>
 
       <li><a href="#method">Method</a></li>
 
       <li><a href="#method">Method</a></li>
 +
      <li><a href="#r">Refinement:</a></li>
 
       <li><a href="#document">Document</a></li></ul>
 
       <li><a href="#document">Document</a></li></ul>
 
</div>
 
</div>
Line 130: Line 131:
 
     <div class="contain" >
 
     <div class="contain" >
 
     <div class="word" id="intro">
 
     <div class="word" id="intro">
     <p>This year our team created a mathematical  model to optimize the arrangement of the nif gene cluster. This model helped we  optimized our design and provided some new perspectives of our  nitrogen-fixation system in transcriptional level.<br />
+
     <p>This year our team created a mathematical  model to optimize the arrangement of the <Em>nif</Em> gene cluster. This model helped we  optimized our design and provided some new perspectives of our  nitrogen-fixation system in transcriptional level.<br />
 
We developed this model with two goals in  mind:<br />
 
We developed this model with two goals in  mind:<br />
1.We want to achieve the best  stoichiometric proportion of each nif gene, which is  nifB:nifH:nifD:nifK:nifE:nifN:nifX:nifV=1:3:4:4:1:1:1:1.<br />
+
1.We want to achieve the best  stoichiometric proportion of each <Em>nif</Em> gene, which is  nifB:nifH:nifD:nifK:nifE:nifN:nifX:nifV=1:3:4:4:1:1:1:1.<br />
 
2.We want our system as simple as possible, that means  minimizing the number of promoters and copy number of each nif gene.<br />
 
2.We want our system as simple as possible, that means  minimizing the number of promoters and copy number of each nif gene.<br />
 
We made the following assumptions:<br />
 
We made the following assumptions:<br />
Line 198: Line 199:
 
     <div class="word" id="method">
 
     <div class="word" id="method">
 
     <h3> Method:</h3>
 
     <h3> Method:</h3>
       <p>To start with, we put all genes into two  groups. One group is under the strong promoter while the other is under the  weak one. We introduced some parameters shown in table2. </p>
+
       <p>To start with, we put all genes into two  groups. One group is under the strong promoter while the other is under the  weak one. We constructed two arrays,weak[i] and expected[i].</p>
 
     <div class="word-1">
 
     <div class="word-1">
    <table border="1" cellspacing="0" cellpadding="0" width="70%">
+
      <table border="1" cellspacing="0" cellpadding="0" width="90%">
      <tr>
+
        <tr>
        <td width="30%" valign="top">
+
          <td width="40%" valign="top"><p>Parameters(i=1,2,3,4,5,6,7,8)</p></td>
          <p>Parameters/data</p> </td>
+
          <td width="60%" valign="top"><p>Meanings</p></td>
        <td width="70%" valign="top"><p>Meanings</p></td>
+
        </tr>
      </tr>
+
        <tr>
      <tr>
+
          <td valign="top"><p>weak[i]</p></td>
        <td width="30%" valign="top"><p>weak[ ]</p></td>
+
          <td valign="top"><p>the relative expression level of each <Em>nif</Em> gene under the weak promoter</p></td>
         <td width="70%" valign="top"><p>the expression level of each nif gene under the weak promoter</p></td>
+
         </tr>
      </tr>
+
        <tr>
      <tr>
+
          <td valign="top"><p>weak[i]*</p></td>
        <td width="30%" valign="top"><p>strong[ ]</p></td>
+
          <td valign="top"><p>the relative expression level of each <Em>nif</Em> gene under the weak promoter after normalization</p></td>
         <td width="70%" valign="top"><p>the expression level of each nif gene under the strong promoter</p></td>
+
        </tr>
      </tr>
+
        <tr>
      <tr>
+
          <td valign="top"><p>expected[i]</p></td>
        <td width="30%" valign="top"><p>expected[ ]</p></td>
+
          <td valign="top"><p>the ideal stoichiometric proportion</p></td>
         <td width="70%" valign="top"><p>the ideal stoichiometric proportion</p></td>
+
        </tr>
      </tr>
+
        <tr>
      <tr>
+
          <td valign="top"><p>expected[i]*</p></td>
        <td width="30%" valign="top"><p>d</p></td>
+
          <td valign="top"><p>the ideal stoichiometric proportion after    normalization</p></td>
         <td width="70%" valign="top"><p>deviation between the expected expression level and the actual expression level</p></td>
+
         </tr>
      </tr>
+
        <tr>
    </table>
+
          <td valign="top"><p>strong[i]</p></td>
    <p align="center"><font size="-1">Table 2</font></p>
+
          <td  valign="top"><p>the relative expression level of each <Em>nif</Em> gene under the strong promoter after normalization</p></td>
 +
        </tr>
 +
        <tr>
 +
          <td valign="top"><p>e<sub>i</sub></p></td>
 +
          <td valign="top"><p>the ideal stoichiometric proportion of  the i<sup>th</sup> gene after all preprocessings </p></td>
 +
         </tr>
 +
        <tr>
 +
          <td valign="top"><p>a<sub>i</sub></p></td>
 +
          <td valign="top"><p>the relative expression level of the i<sup>th</sup> gene under the weak promoter after all preprocessings</p></td>
 +
        </tr>
 +
        <tr>
 +
          <td valign="top"><p>m<sub>i</sub></p></td>
 +
          <td valign="top"><p>the number of the i<Sup>th</Sup> gene under the strong promoter</p></td>
 +
         </tr>
 +
        <tr>
 +
          <td valign="top"><p>n<sub>i</sub></p></td>
 +
          <td valign="top"><p>The number of the i<Sup>th</Sup> gene under the weak promoter</p></td>
 +
        </tr>
 +
      </table>
 +
<p align="center"><font size="-1">Table 2 The table of parameters in our model</font></p>
 
     </div>
 
     </div>
     <p> Then we did some necessary preprocessing. Firstly, we presumed the smallest element in each array was 1 and normalized all the other data accordingly. In addition, to ensure there is at least one solution, we adjusted expected[] to make each element greater than or equal to the smallest expression level of the corresponding gene.<br />
+
     <p>Then we did some necessary preprocessings. Firstly, we found the smallest data in weak[i] and  called it &ldquo;min&rdquo;. We normalized all the other data accordingly by doing:<br />
      After that, we began the organization. In order to minimize the total number of genes, we arranged the strong promoter group first, and considered the weak group later. For each gene, we constantly added one copy of it to the strong promoter group, calculated the current deviation after each addition and compared the current deviation with the last one. If the deviation was decreasing ,we added one more copy and repeated the operation until the last deviation was smaller than the current one. In that way, we were able to determine the number of each gene with which the deviations were the smallest and completed the arrangement of the strong group. Similarly, we  arranged the weak group and finally received the result.</p>
+
      <em>weak[i]*=weak[i]/min</em>                                                  (1)                         <br />
      <div class="word-1" align="center">
+
      <em>expected[i]*=expected[i]/min</em>                                            (2)<br />
 +
We constructed  strong[i]:<br />
 +
<em>strong[i]=2*weak[i]*</em>                                                   (3)<br />
 +
Secondly, to guarantee the existence of a solution, we adjusted expected[i]* by examining whether it  is greater than or equal to the corresponding weak[i]*, if not, we did:<br />
 +
<em>expected[i]*=weak[i]* </em>                                                (4)    <br />
 +
      <br />
 +
After that, we began the organization. In order to minimize the total numbers of genes, we arranged the strong promoter group first, and considered the weak group later. Because each gene can be considered separately, here we only describe the organization of the i<sup>th</sup> gene as an example.<br />
 +
For the i<sup>th</sup>  gene, we tried adding one copy of it under the strong promoter. If <br />
 +
<em>|e<sub>i</sub>-2*a<sub>i</sub>|&lt;e<sub>i</sub>,     </em>                                                    (5)<br />
 +
we actually added it. Until we have added (m<sub>i</sub>+1) i<sup>th</sup> genes, and got<br />
 +
<em>|e<sub>i</sub>-2(m<sub>i</sub>+1)*a<sub>i</sub>|&gt;|e<sub>i</sub>-2mi<sub>i</sub>*a<sub>i</sub>| </em>                                           (6)<br />
 +
Then we stopped  adding it and recorded that we have added m<sub>i</sub> i<sup>th</sup> genes  under the strong promoter.<br />
 +
For the weak promoter group, we applied a similar method. For the i<sup>th</sup> gene, we tried adding one copy of it under the weak promoter. If<br />
 +
<em>|e<sub>i</sub>-2*m<sub>i</sub>*a<sub>i</sub>-a<sub>i</sub>|&lt;|e<sub>i</sub>-2*m<sub>i</sub>*a<sub>i</sub>|, </em>                                          (7)                                       <br />
 +
we actually added  it. Until we have added (n<sub>i</sub>+1) i<sup>th</sup> genes, and got <br />
 +
<em>|e<sub>i</sub>-2*m<sub>i</sub>*a<sub>i</sub>-(n<sub>i</sub>+1)*a<sub>i</sub>|&gt;|e<sub>i</sub>-2*mi<sub>i</sub>*a<sub>i</sub>-n<sub>i</sub>*a<sub>i</sub>|  </em>                                (8)<br />
 +
Then we stopped  adding it and recorded that we have added n<sub>i</sub> i<sup>th</sup> genes under the weak promoter.<br />
 +
In that way, we  were able to determine numbers of the i<sup>th</sup> gene under the two  promoters with which the deviation was the smallest.</p>
 +
<div class="word-1" align="center">
 
       <img src="https://static.igem.org/mediawiki/2018/8/8a/T--Nanjing-China--model-1.png"  width="100%"/>
 
       <img src="https://static.igem.org/mediawiki/2018/8/8a/T--Nanjing-China--model-1.png"  width="100%"/>
    <p><font size="-1">Fig 1. A flow diagram describing the idea of our modeling process</font><br />
+
    <p><font size="-1">Fig 1. A flow diagram describing the idea of our modeling process</font></p>
 
       </div>
 
       </div>
 
       <p>According to this flow diagram, we programmed with Python and got the following results:</p>
 
       <p>According to this flow diagram, we programmed with Python and got the following results:</p>
 
       <div class="word-1" align="center">
 
       <div class="word-1" align="center">
 
       <img src="https://static.igem.org/mediawiki/2018/e/ed/T--Nanjing-China--model-2.png"  width="100%"/>
 
       <img src="https://static.igem.org/mediawiki/2018/e/ed/T--Nanjing-China--model-2.png"  width="100%"/>
     <p><font size="-1">Fig 2. The best arrangement of nif genes according to our calculation</font><br />
+
     <p><font size="-1">Fig 2. The best arrangement of nif genes according to our calculation</font></p>
 +
      </div>
 +
      <p>With this arrangement, the proportion of nifB: nifH: nifD: nifK: nifE: nifN: nifX: nifV = 15.44: 46.93: 71.88: 62.10: 16.44: 16.04: 16.0: 15.94, which is most close to the ideal proportion among all the solutions.</p>
 +
    </div>
 +
    <div class="word" id="r">
 +
   
 +
      <p><strong>Refinement of  our model:</strong><br />
 +
        We modified the  putative best expression level of nifB:nifH:nifD:nifK:nifE:nifN:nifX:nifV to 5:3:4:4:1:1:1:1.  We believed in this way, we could better simulate the expression of nitrogenase  in our engineered <em>E.coli</em> strains. We made this change because of three  reasons.</p>
 +
      <p>Firstly, nifB is  indispensable for assembly nitrogenase no matter in diazotrophs or engineered <em>E.coli</em> strains. Apart from the minimal nitrogen fixation gene cluster, the genomic DNA  of wide type <em>Paenibacillus  polymyxa </em>includes analogues of nifM, nifU,  nifS and other genes which exist in other nitrogen-fixing microorganisms and  are essential for the correct folding of nitrogenase iron protein. However, the <em>E.coli </em>genome doesn&rsquo;t have such analogues. Nevertheless, it has been  reported that the excessive expression of nifB could compensate for the absence  of nifU and nifS. That is, if nifB is overexpressed in <em>E.coli</em>, these auxiliaries are not necessary. Therefore, the expression level  of nifB should be the highest 5.</p>
 +
      <p>Secondly, compared with  nitrogen-fixing microorganisms, <em>E.coli</em> also lacks some genes that provide electron  transfer function, such as nifF and niff. So the intracellular reductive power  of <em>E.coli</em> is insufficient to accomplish nitrogen fixation.  Thus it is necessary to overexpress nifH(nitrogenase reductase) and the value  is set to 3 instead of 5 because our semiconductor, the CdS part, can provide  additional electrons.</p>
 +
      <p>Thirdly, we set the expression  level of nifD and nifK to be 4 because molybdenum iron protein is an ɑ2β2 allotetramer and is the core of  nitrogenase.</p>
 +
      <p>Based on the new ideal stoichiometric proportion, we adjusted the code and received a more accurate result.</p>
 +
      <div class="word-1" align="center">
 +
      <img src="https://static.igem.org/mediawiki/2018/5/50/T--Nanjing-China--model-3.png"  width="100%"/>
 +
    <p><font size="-1">Fig 3 The best arrangement of nif genes version 2.0.</font></p>
 
       </div>
 
       </div>
       <p>With this arrangement, the proportion of nifB: nifH: nifD: nifK: nifE: nifN: nifX: nifV = 15.44: 46.93: 71.88: 62.10: 16.44: 16.04: 16.0: 15.94, which is most close to the ideal proportion among all the solutions.<br/>
+
       <p>The achieved stoichiometric proportion of nifB:nifH:nifD:nifK:nifE:nifN:nifX:nifV=77.23:46.93:71.88:62.10:16.44:16.04:16.0:15.94,which is the closest to the ideal 5:3:4:4:1:1:1:1. <br />
      This model provided a potential strategy for the improvement of the activity of the nitrogenase expressed in our engineered E.coli strain.</p>
+
        This model provided a potential strategy for the improvement of biological activity of nitrogenase expressed in our engineered <em>E.coli</em> strain and offered a  great help to our further experiments.</p>
 
     </div>
 
     </div>
 
     <div class="word" id="document">
 
     <div class="word" id="document">

Revision as of 14:42, 16 October 2018

Nanjing-China2018

This year our team created a mathematical model to optimize the arrangement of the nif gene cluster. The model helped us optimize our design and provided some new perspectives on our nitrogen-fixation system at the transcriptional level.
We developed this model with two goals in mind:
1. We want to achieve the best stoichiometric proportion of the nif genes, which is nifB:nifH:nifD:nifK:nifE:nifN:nifX:nifV = 1:3:4:4:1:1:1:1.
2. We want our system to be as simple as possible, which means minimizing the number of promoters and the copy number of each nif gene.
We made the following assumptions:
1. There are two kinds of promoters, both of which can successfully drive the expression of every nitrogen-fixation gene involved in our system.
2. One promoter is stronger (called H) while the other is relatively weak (called L). Under promoter H, each gene's transcription level is double its level under promoter L.
3. The order of the genes has little influence on their transcriptional level.
We conducted real-time quantitative PCR to measure the transcription level of the nif gene cluster, and the resulting experimental data became an important reference for our modeling.

gene       Average value of Cq    Relative expression level
16S DNA    6.33                   -
nifB       19.97                  7.80E-05
nifH       17.37                  4.74E-04
nifD       18.34                  2.42E-04
nifK       20.77                  4.48E-05
nifE       22.20                  1.66E-05
nifN       22.24                  1.62E-05
nifX       22.92                  1.01E-05
nifV       21.25                  3.22E-05

Table 1 The result of qPCR
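
The relative expression levels in Table 1 appear to be consistent with a 2^(-ΔCq) calculation against the 16S reference. The page does not state the formula, so the sketch below rests on that assumption; the helper name relative_expression is hypothetical and is not taken from the team's script.

```python
# Assumption: relative expression = 2 ** -(Cq_gene - Cq_reference).
# The page does not state this formula; it is inferred from Table 1.

CQ_REFERENCE = 6.33  # average Cq of the 16S reference (Table 1)

CQ_VALUES = {
    "nifB": 19.97, "nifH": 17.37, "nifD": 18.34, "nifK": 20.77,
    "nifE": 22.20, "nifN": 22.24, "nifX": 22.92, "nifV": 21.25,
}


def relative_expression(cq_gene, cq_reference=CQ_REFERENCE):
    """Return 2 ** -(delta Cq) for one gene (hypothetical helper)."""
    return 2.0 ** -(cq_gene - cq_reference)


if __name__ == "__main__":
    for gene, cq in CQ_VALUES.items():
        print(f"{gene}: {relative_expression(cq):.2e}")
```

Small differences from the tabulated values are expected, because the Cq values in Table 1 are rounded averages.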

Method:

To start with, we put all genes into two groups. One group is under the strong promoter while the other is under the weak one. We constructed two arrays, weak[i] and expected[i].

Parameters (i = 1, 2, ..., 8)    Meanings
weak[i]         the relative expression level of each nif gene under the weak promoter
weak[i]*        the relative expression level of each nif gene under the weak promoter after normalization
expected[i]     the ideal stoichiometric proportion
expected[i]*    the ideal stoichiometric proportion after normalization
strong[i]       the relative expression level of each nif gene under the strong promoter after normalization
e_i             the ideal stoichiometric proportion of the i-th gene after all preprocessing
a_i             the relative expression level of the i-th gene under the weak promoter after all preprocessing
m_i             the number of copies of the i-th gene under the strong promoter
n_i             the number of copies of the i-th gene under the weak promoter

Table 2 The table of parameters in our model

Then we did some necessary preprocessing. Firstly, we found the smallest value in weak[i] and called it “min”. We normalized all the data accordingly:
weak[i]* = weak[i] / min                (1)
expected[i]* = expected[i] / min        (2)
We then constructed strong[i]:
strong[i] = 2 * weak[i]*                (3)
Secondly, to guarantee the existence of a solution, we checked whether each expected[i]* was greater than or equal to the corresponding weak[i]*. If it was not, we set:
expected[i]* = weak[i]*                 (4)
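
The preprocessing in Eqs. (1)-(4) can be sketched in Python as follows. This is a literal reading of the equations as written on this page, not the team's script, which may scale expected[i] differently; the function name preprocess and its return order are assumptions made for illustration.

```python
def preprocess(weak, expected):
    """Apply Eqs. (1)-(4) as written on this page (hypothetical helper).

    weak     -- relative expression levels under the weak promoter
    expected -- ideal stoichiometric proportion
    Returns (weak*, strong, adjusted expected*).
    """
    smallest = min(weak)                              # "min" in the text
    weak_norm = [w / smallest for w in weak]          # Eq. (1)
    expected_norm = [e / smallest for e in expected]  # Eq. (2)
    strong = [2 * w for w in weak_norm]               # Eq. (3)
    # Eq. (4): raise any target that lies below the weak-promoter level.
    expected_adj = [max(e, w) for e, w in zip(expected_norm, weak_norm)]
    return weak_norm, strong, expected_adj


if __name__ == "__main__":
    weak_star, strong, expected_star = preprocess([2.0, 4.0], [1.0, 6.0])
    print(weak_star, strong, expected_star)
```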
     
After that, we began the organization. In order to minimize the total number of gene copies, we arranged the strong promoter group first and considered the weak group later. Because each gene can be considered separately, here we only describe the organization of the i-th gene as an example.
For the i-th gene, we tried adding one copy of it under the strong promoter. If
|e_i - 2*a_i| < e_i,                                        (5)
we added it and tried the next copy. We repeated this until adding the (m_i+1)-th copy gave
|e_i - 2*(m_i+1)*a_i| > |e_i - 2*m_i*a_i|                   (6)
Then we stopped and recorded that we had added m_i copies of the i-th gene under the strong promoter.
For the weak promoter group, we applied a similar method. For the i-th gene, we tried adding one copy of it under the weak promoter. If
|e_i - 2*m_i*a_i - a_i| < |e_i - 2*m_i*a_i|,                (7)
we added it and tried the next copy. We repeated this until adding the (n_i+1)-th copy gave
|e_i - 2*m_i*a_i - (n_i+1)*a_i| > |e_i - 2*m_i*a_i - n_i*a_i|    (8)
Then we stopped and recorded that we had added n_i copies of the i-th gene under the weak promoter.
In that way, we were able to determine the numbers of copies of the i-th gene under the two promoters for which the deviation was the smallest.
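
The per-gene procedure in Eqs. (5)-(8) can be sketched as the following greedy loop. This is a minimal illustration written from the description above, not the team's findSequence code; the function name arrange_gene is hypothetical, and stopping on a tie (equal deviations) is an assumption, since the text only specifies the strict inequalities.

```python
def arrange_gene(e_i, a_i):
    """Return (m_i, n_i) for one gene, following Eqs. (5)-(8).

    e_i -- target level of the gene after preprocessing
    a_i -- level of one copy under the weak promoter (must be > 0)
    Assumption: a copy is added only while it strictly reduces the
    deviation, so a tie stops the loop.
    """
    if a_i <= 0:
        raise ValueError("a_i must be positive")

    def deviation(m, n):
        # A strong copy contributes 2 * a_i, a weak copy contributes a_i.
        return abs(e_i - 2 * m * a_i - n * a_i)

    m_i = 0
    while deviation(m_i + 1, 0) < deviation(m_i, 0):      # Eqs. (5), (6)
        m_i += 1

    n_i = 0
    while deviation(m_i, n_i + 1) < deviation(m_i, n_i):  # Eqs. (7), (8)
        n_i += 1

    return m_i, n_i


if __name__ == "__main__":
    m, n = arrange_gene(3, 1)
    print(m, n, (2 * m + n) * 1)  # copy numbers and the achieved level
```

The achieved level of the gene is then (2*m_i + n_i)*a_i, which is the quantity compared with e_i.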

Fig 1. A flow diagram describing the idea of our modeling process

Following this flow diagram, we implemented the procedure in Python and obtained the following results:

Fig 2. The best arrangement of nif genes according to our calculation

With this arrangement, the proportion of nifB:nifH:nifD:nifK:nifE:nifN:nifX:nifV = 15.44:46.93:71.88:62.10:16.44:16.04:16.0:15.94, which is the closest to the ideal proportion among all the solutions.

Refinement of our model:
We modified the putative best expression level of nifB:nifH:nifD:nifK:nifE:nifN:nifX:nifV to 5:3:4:4:1:1:1:1. We believed that in this way we could better simulate the expression of nitrogenase in our engineered E. coli strains. We made this change for three reasons.

Firstly, nifB is indispensable for nitrogenase assembly, whether in diazotrophs or in engineered E. coli strains. Apart from the minimal nitrogen fixation gene cluster, the genomic DNA of wild-type Paenibacillus polymyxa includes analogues of nifM, nifU, nifS and other genes that exist in other nitrogen-fixing microorganisms and are essential for the correct folding of the nitrogenase iron protein. However, the E. coli genome does not have such analogues. Nevertheless, it has been reported that excessive expression of nifB can compensate for the absence of nifU and nifS. That is, if nifB is overexpressed in E. coli, these auxiliary genes are not necessary. Therefore, the expression level of nifB should be the highest, 5.

Secondly, compared with nitrogen-fixing microorganisms, E. coli also lacks some genes that provide electron transfer function, such as nifF and niff. The intracellular reducing power of E. coli is therefore insufficient to accomplish nitrogen fixation. It is thus necessary to overexpress nifH (nitrogenase reductase); the value is set to 3 instead of 5 because our semiconductor, the CdS part, can provide additional electrons.

Thirdly, we set the expression level of nifD and nifK to 4 because the molybdenum-iron protein is an α2β2 heterotetramer and is the core of nitrogenase.

Based on the new ideal stoichiometric proportion, we adjusted the code and obtained a more accurate result.

Fig 3 The best arrangement of nif genes version 2.0.

The achieved stoichiometric proportion is nifB:nifH:nifD:nifK:nifE:nifN:nifX:nifV = 77.23:46.93:71.88:62.10:16.44:16.04:16.0:15.94, which is the closest to the ideal 5:3:4:4:1:1:1:1.
This model provided a potential strategy for improving the biological activity of the nitrogenase expressed in our engineered E. coli strain and was a great help to our further experiments.
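
One way to compare the achieved proportion with the ideal one is to rescale it by a reference gene. The sketch below is only an illustration: choosing nifX as the reference and the helper name rescale are assumptions, not part of the team's code.

```python
GENES = ["nifB", "nifH", "nifD", "nifK", "nifE", "nifN", "nifX", "nifV"]
ACHIEVED = [77.23, 46.93, 71.88, 62.10, 16.44, 16.04, 16.0, 15.94]
IDEAL = [5, 3, 4, 4, 1, 1, 1, 1]


def rescale(values, reference_index):
    """Divide every value by the value at reference_index (hypothetical helper)."""
    reference = values[reference_index]
    return [v / reference for v in values]


if __name__ == "__main__":
    # Assumption: nifX (ideal level 1) is used as the reference gene.
    scaled = rescale(ACHIEVED, GENES.index("nifX"))
    for gene, got, want in zip(GENES, scaled, IDEAL):
        print(f"{gene}: achieved {got:.2f}, ideal {want}")
```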

Here is the code we wrote and used. TXT download: https://static.igem.org/mediawiki/2018/f/fe/T--Nanjing-China--model.txt

The inputs we used:

  1. findSequence([7.8,47.4,24.2,4.48,1.66,1.62,1.01,3.22],[1,3,4,4,1,1,1,1],['nifB','nifH','nifD','nifK','nifE','nifN','nifX','nifV'])
  2. findSequence([7.8,47.4,24.2,4.48,1.66,1.62,1.01,3.22],[5,3,4,4,1,1,1,1],['nifB','nifH','nifD','nifK','nifE','nifN','nifX','nifV'])