Difference between revisions of "Team:Toulouse-INSA-UPS/Model"

Line 28: Line 28:
 
<h1> Modeling</h1>
 
<h1> Modeling</h1>
  
<p>Mathematical models and computer simulations provide a great way to describe the function and operation of BioBrick Parts and Devices. Synthetic Biology is an engineering discipline, and part of engineering is simulation and modeling to determine the behavior of your design before you build it. Designing and simulating can be iterated many times in a computer before moving to the lab. This award is for teams who build a model of their system and use it to inform system design or simulate expected behavior in conjunction with experiments in the wetlab.</p>
+
<h2 id=”why”>Why do we want to model our protein?</h2>
 
+
<p>Our novel fusion protein contains three binding sites, connected by flexible regions of 30 to 50 amino acids each. We need to ensure that no non-specific interactions between these sites prevent them from binding to their intended ligands.</p>
</div>
+
<div class="clear"></div>
+
 
+
<div class="column full_size">
+
<h3> Gold Medal Criterion #3</h3>
+
 
<p>
 
<p>
Convince the judges that your project's design and/or implementation is based on insight you have gained from modeling. This could be either a new model you develop or the implementation of a model from a previous team. You must thoroughly document your model's contribution to your project on your team's wiki, including assumptions, relevant data, model results, and a clear explanation of your model that anyone can understand.  
+
If possible, defining the range of positions that the linkers can occupy would give us a better idea of how the protein behaves in situ. This is particularly challenging because the linkers have never been resolved by crystallography.</p>
<br><br>
+
<p>
The model should impact your project design in a meaningful way. Modeling may include, but is not limited to, deterministic, exploratory, molecular dynamic, and stochastic models. Teams may also explore the physical modeling of a single component within a system or utilize mathematical modeling for predicting function of a more complex device.
+
The final step of our modelling process is to evaluate the potential interactions between the ligands attached to our protein. For this, we must evaluate the average distance between domains and its variations, which will also provide information about the maximum size of the ligands that we can use.</p>
</p>
+
<p>
 +
All of these questions can be answered through molecular modelling. However, each step requires its own approach and software. On this wiki page, we will detail these steps and our thought process behind the solutions that we chose to use.</p>
 +
<h2 id=”molecular-modelling”>How does (protein) molecular modelling work?</h2>
 +
<ul>
 +
<li>We have the primary protein structure and need to calculate the secondary and then the tertiary structure</li>
 +
<li>List all the different rotations that can happen between 2-3-4 atoms (those nice drawings from Sophie’s classes)</li>
 +
<li>List interactions between distant atoms (van der Waals (VdW), electrostatics (EEL), any others?)</li>
 +
<li>Hydrophobic / hydrophilic amino acids</li>
 +
<li>Minimisation algorithms: theory, Steepest Descent vs Conjugate Gradient (cf. Sophie’s classes, 2nd chapter)</li>
 +
<li>Molecular dynamics: theory, Monte Carlo, explicit vs implicit solvent</li>
 +
<li>Limitations</li>
 +
</ul>
 +
###Be succinct for this part! We ain’t teaching them for an exam!###
  
 +
<h2 id=”how”>How did we model our protein?</h2>
 +
<p>3D structures of the CBM3a and SAv domains are available in the Protein Data Bank (PDB). We chose to use 4JO5 (CBM3a-L domain with flanking linkers from scaffoldin cipA of cellulosome of Clostridium thermocellum) and 4JNJ (Structure based engineering of streptavidin monomer with a reduced biotin dissociation rate), both obtained by X-ray diffraction, with resolutions of 1.98 and 1.90 Å respectively.</p>
 +
<p>After obtaining these files, we faced two problems. First, the linkers connecting our heads had never been resolved by crystallography. Second, azidophenylalanine had never been modelled, in its native state or clicked to another molecule.</p>
 
<p>
 
<p>
Please see the <a href="https://2018.igem.org/Judging/Medals"> 2018
+
We first attempted to model our Cerberus, with a simple phenylalanine replacing the AzF residue, using homology-driven folding algorithms. The three we chose were I-TASSER, SWISS-MODEL and MODELLER. This allowed us to obtain a 3D structure of our protein on which we could start calculations.</p>
Medals Page</a> for more information.
+
<p>
</p>
+
“I-TASSER is a hierarchical approach to protein structure and function prediction. It first identifies structural templates from the PDB by multiple threading approach LOMETS, with full-length atomic models constructed by iterative template fragment assembly simulations. Function insights of the target are then derived by threading the 3D models through protein function database BioLiP.” (From their website)</p>
</div>
+
<p>
 
+
“SWISS-MODEL is a fully automated protein structure homology-modelling server, accessible via the ExPASy web server, or from the program DeepView (Swiss Pdb-Viewer).” (From their website)</p>
<div class="column two_thirds_size">
+
<p>
<h3>Best Model Special Prize</h3>
+
“MODELLER is used for homology or comparative modeling of protein three-dimensional structures. The user provides an alignment of a sequence to be modeled with known related structures and MODELLER automatically calculates a model containing all non-hydrogen atoms.” (From their website)</p>
 +
<p>
 +
Only MODELLER provided us with a satisfactory structure, most likely due to its improved de novo prediction capacities and its use of restraints on the final model. The other two algorithms struggled to resolve the position of the linkers.
 +
With this model, we could then start the molecular dynamics phase. At the same time, we started working on a 3D structure of azidophenylalanine, clicked onto a DBCO-fluorescein molecule.</p>
 +
<hr />
  
 
<p>
 
<p>
To compete for the <a href="https://2018.igem.org/Judging/Awards">Best Model prize</a>, please describe your work on this page  and also fill out the description on the <a href="https://2018.igem.org/Judging/Judging_Form">judging form</a>. Please note you can compete for both the gold medal criterion #3 and the best model prize with this page.
+
“The term "Amber" refers to two things. First, it is a set of molecular mechanical force fields for the simulation of biomolecules (these force fields are in the public domain, and are used in a variety of simulation programs). Second, it is a package of molecular simulation programs which includes source code and demos.” (From their website)
<br><br>
+
AMBER suite for modelling</p>
You must also delete the message box on the top of this page to be eligible for the Best Model Prize.
+
<p>
 +
“NAMD, recipient of a 2002 Gordon Bell Award and a 2012 Sidney Fernbach Award, is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Based on Charm++ parallel objects, NAMD scales to hundreds of cores for typical simulations and beyond 500,000 cores for the largest simulations. NAMD uses the popular molecular graphics program VMD for simulation setup and trajectory analysis, but is also file-compatible with AMBER, CHARMM, and X-PLOR.” (From their website).</p>
 +
<p>
 +
CALMIP: SANDER (Simulated Annealing with NMR-Derived Energy Restraints) for minimisation, heating and production</p>
 +
<p>
 +
LAAS? AI methods, robotics, statistical sampling from existing 3D structures http://projects.laas.fr/Psf-Amc/ </p>
 +
<p>
 +
 
 +
Starting from a drawing of the AzF-DBCO-fluorescein molecule, we built its 3D structure in Avogadro, chosen for its ease of use when building molecules de novo.
 +
Adding AzF-DBCO took a while and a lot of effort
 
</p>
 
</p>
 +
<h2 id=”conclusion”>Conclusions</h2>
 +
So far: 30,000 hours of computation completed!
 +
No non-specific interactions found
 +
Linkers remain flexible and tend to cross; the N-terminus seems to stick to one side of the CBM3a
 +
The maximum ligand size remains to be determined, but so far it appears to be fairly large
  
</div>
 
  
 
<div class="column third_size">
 
<div class="highlight decoration_A_full">
 
<h3> Inspiration </h3>
 
<p>
 
Here are a few examples from previous teams:
 
</p>
 
<ul>
 
<li><a href="https://2016.igem.org/Team:Manchester/Model">2016 Manchester</a></li>
 
<li><a href="https://2016.igem.org/Team:TU_Delft/Model">2016 TU Delft</a></li>
 
<li><a href="https://2014.igem.org/Team:ETH_Zurich/modeling/overview">2014 ETH Zurich</a></li>
 
<li><a href="https://2014.igem.org/Team:Waterloo/Math_Book">2014 Waterloo</a></li>
 
</ul>
 
</div>
 
 
</div>
 
</div>
 
<!--CONTENT ENDS HERE-->
 
<!--CONTENT ENDS HERE-->

Revision as of 19:14, 9 September 2018

Modeling

Why do we want to model our protein?

Our novel fusion protein contains three binding sites, connected by flexible regions of 30 to 50 amino acids each. We need to ensure that no non-specific interactions between these sites prevent them from binding to their intended ligands.

If possible, defining the range of positions that the linkers can occupy would give us a better idea of how the protein behaves in situ. This is particularly challenging because the linkers have never been resolved by crystallography.

The final step of our modelling process is to evaluate the potential interactions between the ligands attached to our protein. For this, we must evaluate the average distance between domains and its variations, which will also provide information about the maximum size of the ligands that we can use.
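As an illustration of the distance measurement described above, here is a minimal Python sketch (not our actual analysis script). It assumes a trajectory is available as a list of frames, each frame being a list of (x, y, z) atom positions in ångströms, and that each domain is given as a list of atom indices. The function names, the toy coordinates and the index lists are all hypothetical.

```python
import math
from statistics import mean, pstdev

def centroid(coords):
    """Geometric centre of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def domain_distance_stats(frames, domain_a, domain_b):
    """Mean and standard deviation of the centroid-to-centroid distance.

    frames: list of frames, each a list of (x, y, z) positions.
    domain_a, domain_b: atom index lists (hypothetical selections).
    """
    distances = []
    for frame in frames:
        ca = centroid([frame[i] for i in domain_a])
        cb = centroid([frame[i] for i in domain_b])
        distances.append(math.dist(ca, cb))
    return mean(distances), pstdev(distances)

# Toy trajectory: two 2-atom "domains" drifting apart over three frames.
frames = [
    [(0, 0, 0), (2, 0, 0), (10, 0, 0), (12, 0, 0)],
    [(0, 0, 0), (2, 0, 0), (12, 0, 0), (14, 0, 0)],
    [(0, 0, 0), (2, 0, 0), (14, 0, 0), (16, 0, 0)],
]
avg, spread = domain_distance_stats(frames, [0, 1], [2, 3])
print(f"mean distance = {avg:.2f} A, std = {spread:.2f} A")
```

The mean gives the typical separation between two binding domains, while the standard deviation indicates how much room the flexible linkers leave for that separation to fluctuate.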

All of these questions can be answered through molecular modelling. However, each step requires its own approach and software. On this wiki page, we will detail these steps and our thought process behind the solutions that we chose to use.

How does (protein) molecular modelling work?

    We have the primary protein structure and need to calculate the secondary and then the tertiary structure
    List all the different rotations that can happen between 2-3-4 atoms (those nice drawings from Sophie’s classes)
    List interactions between distant atoms (van der Waals (VdW), electrostatics (EEL), any others?)
    Hydrophobic / hydrophilic amino acids
    Minimisation algorithms: theory, Steepest Descent vs Conjugate Gradient (cf. Sophie’s classes, 2nd chapter)
    Molecular dynamics: theory, Monte Carlo, explicit vs implicit solvent
    Limitations
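To make the minimisation idea in the list above concrete, here is a toy Python sketch of steepest descent applied to a single van der Waals pair described by a Lennard-Jones potential. It is an illustration only, not the algorithm as implemented in any modelling package: the reduced units (epsilon = sigma = 1), the step-size rule and the function names are assumptions made for the example.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones energy of two atoms separated by distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def lj_gradient(r, epsilon=1.0, sigma=1.0):
    """Derivative of the Lennard-Jones energy with respect to r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r

def steepest_descent(r, step=0.01, tol=1e-5, max_iter=10000):
    """Follow the negative gradient until it is smaller than tol.

    The step grows after each accepted move and is halved after each
    rejected one (a simple heuristic chosen for this sketch).
    """
    energy = lj_energy(r)
    for _ in range(max_iter):
        grad = lj_gradient(r)
        if abs(grad) < tol:
            break
        trial = r - step * grad
        if trial > 0 and lj_energy(trial) < energy:
            r, energy = trial, lj_energy(trial)
            step *= 1.2
        else:
            step *= 0.5
    return r, energy

# Start from a stretched pair and relax it to the energy minimum.
r_min, e_min = steepest_descent(1.5)
print(f"r_min = {r_min:.4f}, E_min = {e_min:.4f}")
```

Steepest descent always moves along the current gradient, which is robust far from a minimum but slow close to it. Conjugate gradient also uses information from previous search directions, which usually makes it converge in fewer steps near the minimum.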
###Be succinct for this part! We ain’t teaching them for an exam!###

How did we model our protein?

3D structures of the CBM3a and SAv domains are available in the Protein Data Bank (PDB). We chose to use 4JO5 (CBM3a-L domain with flanking linkers from scaffoldin cipA of cellulosome of Clostridium thermocellum) and 4JNJ (Structure based engineering of streptavidin monomer with a reduced biotin dissociation rate), both obtained by X-ray diffraction, with resolutions of 1.98 and 1.90 Å respectively.

After obtaining these files, we faced two problems. First, the linkers connecting our heads had never been resolved by crystallography. Second, azidophenylalanine had never been modelled, in its native state or clicked to another molecule.

We first attempted to model our Cerberus, with a simple phenylalanine replacing the AzF residue, using homology-driven folding algorithms. The three we chose were I-TASSER, SWISS-MODEL and MODELLER. This allowed us to obtain a 3D structure of our protein on which we could start calculations.

“I-TASSER is a hierarchical approach to protein structure and function prediction. It first identifies structural templates from the PDB by multiple threading approach LOMETS, with full-length atomic models constructed by iterative template fragment assembly simulations. Function insights of the target are then derived by threading the 3D models through protein function database BioLiP.” (From their website)

“SWISS-MODEL is a fully automated protein structure homology-modelling server, accessible via the ExPASy web server, or from the program DeepView (Swiss Pdb-Viewer).” (From their website)

“MODELLER is used for homology or comparative modeling of protein three-dimensional structures. The user provides an alignment of a sequence to be modeled with known related structures and MODELLER automatically calculates a model containing all non-hydrogen atoms.” (From their website)

Only MODELLER provided us with a satisfactory structure, most likely due to its improved de novo prediction capacities and its use of restraints on the final model. The other two algorithms struggled to resolve the position of the linkers. With this model, we could then start the molecular dynamics phase. At the same time, we started working on a 3D structure of azidophenylalanine, clicked onto a DBCO-fluorescein molecule.


“The term "Amber" refers to two things. First, it is a set of molecular mechanical force fields for the simulation of biomolecules (these force fields are in the public domain, and are used in a variety of simulation programs). Second, it is a package of molecular simulation programs which includes source code and demos.” (From their website) AMBER suite for modelling

“NAMD, recipient of a 2002 Gordon Bell Award and a 2012 Sidney Fernbach Award, is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Based on Charm++ parallel objects, NAMD scales to hundreds of cores for typical simulations and beyond 500,000 cores for the largest simulations. NAMD uses the popular molecular graphics program VMD for simulation setup and trajectory analysis, but is also file-compatible with AMBER, CHARMM, and X-PLOR.” (From their website).

CALMIP: SANDER (Simulated Annealing with NMR-Derived Energy Restraints) for minimisation, heating and production
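The heating and production phases mentioned above are molecular dynamics runs, in which Newton's equations of motion are integrated step by step. The following Python sketch shows the principle with the velocity Verlet scheme on a single particle attached to a spring. It is a generic textbook integrator chosen for illustration, not necessarily the one SANDER uses, and the spring constant, mass and time step are arbitrary assumptions.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one particle in 1D and return the list of (x, v) states."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory

# Hypothetical toy system: harmonic spring in reduced units.
k = 1.0
mass = 1.0

def spring_force(x):
    return -k * x

def total_energy(x, v):
    """Kinetic plus potential energy of the spring system."""
    return 0.5 * mass * v * v + 0.5 * k * x * x

traj = velocity_verlet(x=1.0, v=0.0, force=spring_force,
                       mass=mass, dt=0.01, n_steps=1000)
print(f"energy at start: {total_energy(*traj[0]):.6f}")
print(f"energy at end:   {total_energy(*traj[-1]):.6f}")
```

A useful sanity check for any such integrator is that the total energy stays close to its initial value over the run, as the two printed values show for this toy system.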

LAAS? AI methods, robotics, statistical sampling from existing 3D structures http://projects.laas.fr/Psf-Amc/

Starting from a drawing of the AzF-DBCO-fluorescein molecule, we built its 3D structure in Avogadro, chosen for its ease of use when building molecules de novo. Adding AzF-DBCO took a while and a lot of effort.

Conclusions

So far: 30,000 hours of computation completed!
No non-specific interactions found
Linkers remain flexible and tend to cross; the N-terminus seems to stick to one side of the CBM3a
The maximum ligand size remains to be determined, but so far it appears to be fairly large