Team:Paris Bettencourt/Model

Modeling

Our project is based on the consequences of a conformational change of antimicrobial peptides (AMPs). Results generated by the testing group showed that the MIC (minimum inhibitory concentration) is not a reliable criterion for understanding the activity of our StarCores, although it has previously been used for species. We therefore needed models to:

1. Determine which constructs would be interesting.

2. Interpret our experimental results.

RESULTS

Our modelling workflow can be summed up as:

a. Two hackathons to screen the most promising cores and AMPs based on rational criteria.

b. Homology modelling of the 210 constructs with Modeller (through Chimera) and PyMOL.

c. Molecular dynamics simulations to test the stability of the predicted structures using YASARA and the Inserm cluster.

I. Homology Modelling

After selecting the cores and AMPs, we wanted to determine a model of each fusion. This led us to the modelling expert Antoine Tally, who guided us in standardising a protocol for our analysis.

Aim: To construct an atomic-resolution model of the StarCore monomer via comparative homology modelling.

1. Making a fusion protein from 2 known PDB files using Chimera.

2. Building the monomer construct in Chimera using the Modeller graphical interface.

3. Optimizing the construct in Chimera.

Superposition in PyMOL using a custom script, Superpoz.py.



II. Molecular dynamics simulation

OBJECTIVE:

- Study the behaviour of the fusion molecule in the vicinity of the cell.

- Assess the stability of StarCores in terms of protein folding, structural characteristics and energy.

- Define constraints and parameters for the simulations.

- Use YASARA to analyze the MDS data.

CONCLUSION

In this project, we combinatorially fused a set of known AMPs to structurally diverse, self-assembling protein cores to produce star-shaped complexes.

We selected 14 existing cores and 15 AMPs, giving 14 × 15 = 210 possible combinations. Hence over 200 fusions were designed and expressed in a cell-free system, then screened for activity, biocompatibility, and membrane selectivity.
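The combinatorial design space can be enumerated in a few lines. The sketch below is only an illustration: the names `core_01`, `amp_01` and so on are placeholders, not the actual cores and AMPs used in the project.

```python
from itertools import product

# Placeholder identifiers (hypothetical): the real core and AMP names are not listed here.
cores = [f"core_{i:02d}" for i in range(1, 15)]   # 14 cores
amps = [f"amp_{j:02d}" for j in range(1, 16)]     # 15 AMPs

# Every core is paired with every AMP to give one fusion construct.
constructs = [f"{core}+{amp}" for core, amp in product(cores, amps)]

print(len(constructs))   # 14 * 15 = 210
```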

To study the behaviour of the molecules synthesised in vitro, we built models of the fusion molecules for all constructs using Chimera, Modeller and PyMOL. We visualized the assembly and the fusion monomer (core + AMP) and studied their behaviour, including changes in the folding of alpha helices and beta sheets, using YASARA, which mimics the real-life behaviour of atoms. Based on the visualization and MDS studies, we conclude that the proteins fold well and on average maintain a constant RMSF (root mean square fluctuation) across all amino acid residues. They were also expressed well in the cell-free system.
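For reference, the RMSF of an atom is the square root of its mean squared deviation from its own time-averaged position. The sketch below is a minimal pure-Python illustration, not the YASARA analysis itself. It assumes the trajectory frames have already been superposed on a common reference, and the function name `rmsf` and the toy trajectory are made up for the example.

```python
import math

def rmsf(trajectory):
    """Per-atom RMSF for a trajectory given as trajectory[frame][atom] = (x, y, z)."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for a in range(n_atoms):
        # Time-averaged position of this atom.
        mean = [sum(frame[a][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((frame[a][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result

# Toy trajectory (hypothetical): atom 0 is static, atom 1 hops between two positions.
toy = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
]
print(rmsf(toy))   # [0.0, 1.0]
```

A flat RMSF profile along the sequence, as reported above, would correspond to similar values for every residue in such a list.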

METHODS

These protocols were defined under the following assumptions:

AMPs are relatively small, so we assumed that the structural changes would be minimal, enabling us to perform homology modelling.

AMPs are fused only at the N- or C-terminus and are therefore oriented towards the outside of the homomultimeric self-assembling protein. The model should thus keep the geometry of the nude core and display the fused peptides on the surface of the core. This enabled us to build the assemblies by structure superposition.

However, we verified this hypothesis with our MD stability assay, developed with Marc Baaden.

Homology Modelling

What is homology modelling?

Homology modelling, also called comparative modelling, is a method designed to find the most probable structure for a protein sequence given its alignment with related structures. The three-dimensional (3D) model is obtained by optimally satisfying spatial restraints derived from the alignment and expressed as probability density functions (pdfs) for the restrained features. For example, the probabilities for the main-chain conformation of a modelled residue may be restrained by its residue type, the main-chain conformation of an equivalent residue in a related protein, and the local similarity between the two sequences.

Several such pdfs are obtained from the correlations between structural features in 17 families of homologous proteins that have been aligned on the basis of their 3D structures. The pdfs restrain Cα-Cα distances, main-chain N-O distances, and main-chain and side-chain dihedral angles. A smoothing procedure is used in the derivation of these relationships to minimize the problem of a sparse database.

The 3D model of a protein is obtained by optimizing the molecular pdf so that the model violates the input restraints as little as possible. The molecular pdf is derived as a combination of the pdfs restraining individual spatial features of the whole molecule. The optimization procedure is a variable target function method that applies the conjugate gradients algorithm to the positions of all non-hydrogen atoms. The method is automated.

We used Modeller to predict all our models. The steps are:

1. Obtain reference PDB structures representing the core and antimicrobial peptide protein monomers

2. Use MODELLER via CHIMERA interface for homology modelling

3. Choose the best fusion protein model that represents the Star core monomer
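The criterion used to choose the best model in step 3 is not stated here. As one possible approach, the Modeller results panel in Chimera reports a zDOPE score for each model, where more negative values indicate better models. The sketch below simply picks the model with the lowest zDOPE. The model names and scores are invented placeholders, not results from this project.

```python
# Hypothetical (made-up) zDOPE scores for five candidate models; lower is better.
scores = {
    "model_1": -0.42,
    "model_2": -0.87,
    "model_3": -0.15,
    "model_4": -0.63,
    "model_5": 0.10,
}

def best_model(zdope_by_model):
    """Return the name of the model with the lowest (most negative) zDOPE."""
    return min(zdope_by_model, key=zdope_by_model.get)

print(best_model(scores))   # model_2
```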

For this workflow we will use:

UCSF-CHIMERA: download link

MODELLER: online or installed

Core: Ferritin: PDB= 4XGS

AMP: Cg-Defensin: PDB= 2B68

AA seq in FASTA format: download

pDB.001_translation

MGFGCPGNQLKCNNHCKSISCRAGYCDAATLWLRCTCTDCNGKKESSHLKPEMIEKLNEQMNLELYSSLL

YQQMSAWCSYHTFEGAAAFLRRHAQEEMTHMQRLFDYLTDTGNLPRINTVESPFAEYSSLDELFQETYKLEQ

LITQKINELAHAAMTNQDYPTFNFLQWYVSEQHEEEKLFKSIIDKLSLAGKSGEGLYFIDKELSTLDTQN
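If the download is unavailable, the FASTA file can be recreated from the sequence above. The following sketch joins the three sequence lines, checks that only the 20 standard amino-acid letters occur, and formats a FASTA record. The helper name `to_fasta` and the 60-character line width are arbitrary choices for this example.

```python
# Sequence lines copied from the listing above (pDB.001_translation).
lines = [
    "MGFGCPGNQLKCNNHCKSISCRAGYCDAATLWLRCTCTDCNGKKESSHLKPEMIEKLNEQMNLELYSSLL",
    "YQQMSAWCSYHTFEGAAAFLRRHAQEEMTHMQRLFDYLTDTGNLPRINTVESPFAEYSSLDELFQETYKLEQ",
    "LITQKINELAHAAMTNQDYPTFNFLQWYVSEQHEEEKLFKSIIDKLSLAGKSGEGLYFIDKELSTLDTQN",
]
sequence = "".join(lines)

# Sanity check: only the 20 standard one-letter amino-acid codes are allowed.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")
unexpected = set(sequence) - STANDARD
if unexpected:
    raise ValueError(f"non-standard residues: {sorted(unexpected)}")

def to_fasta(name, seq, width=60):
    """Format one FASTA record with the sequence wrapped at `width` characters."""
    chunks = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(chunks) + "\n"

record = to_fasta("pDB.001_translation", sequence)
print(record)
```

Writing `record` to a text file gives a file that can be opened in Chimera as in Part 1.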

Part 1: Making a fusion protein from 2 known PDB files using CHIMERA & MODELLER

Open the fasta file in Chimera. Then from sequence window menu: Info… Blast Protein to search the PDB for matching structures. In the Blast Protein results, find 4XGS and 2B68 and choose both lines (click, ctrl-click in the results dialog), then click the “Show in MAV” and “Load Structure” buttons at the bottom of the dialog.

Now you will have a new sequence alignment window with 3 sequences in it: the fusion protein and the two protein structures, and in the main Chimera window, the two structures.

In Chimera, delete all extra protein chains. In other words, if there are extra copies of those structures, just delete them so that you have only one copy to use as the template. Make sure that the remaining copy of each is associated with its sequence in the alignment (sequence alignment window menu: Structures… Associations…)

Position the two structures so that the termini are in a reasonable place relative to each other to template the fusion protein. In our case, the C-terminus of the defensin monomer attaches to the N-terminus of the ferritin monomer. You can “freeze” one structure in place by deactivating it and move just the other with the mouse.

From the sequence alignment window menu choose: Structure… Modeller (homology) to show the Modeller dialog. Choose the query as the target and both structures as the template, etc. as in the modeling tutorials. You may also want to turn on “Use thorough optimization” in the Advanced Options section.

NOW YOU CAN SAVE ONE OF THE STRUCTURES AS A PDB FILE AND USE IT AS THE MONOMER FOR FURTHER STEPS.

Part 2: Make a multimer of the fusion protein using the reference PDB structure in PyMOL

Open the core assembly (4XGS) and the fusion monomer (4XGS + 2B68) in PyMOL.

Use the Python script superpoz.py.

We wrote a small script to generate the assembly structure of our StarCores based on the reference biological assembly of the nude core.
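The script itself is not reproduced on this page. As a rough sketch of how such an assembly step could look, assuming the core assembly is loaded as a single PyMOL object in which every subunit has its own chain ID, the function below copies the fusion monomer once per core chain and superposes each copy on that chain. The function name `build_assembly` and the object names `core`, `fusion` and `starcore` are hypothetical. Only the PyMOL commands `get_chains`, `create`, `super` and `alter` are used.

```python
def build_assembly(cmd, core="core", monomer="fusion", out="starcore"):
    """Place one copy of the fusion monomer on every chain of the core assembly.

    `cmd` is PyMOL's command module (from pymol import cmd). The object names
    are hypothetical and must match objects already loaded in the session.
    """
    copies = []
    for chain in cmd.get_chains(core):
        copy_name = f"{monomer}_{chain}"
        cmd.create(copy_name, monomer)                      # duplicate the monomer
        cmd.super(copy_name, f"{core} and chain {chain}")   # fit the copy onto this subunit
        cmd.alter(copy_name, f"chain='{chain}'")            # give the copy that chain ID
        copies.append(copy_name)
    cmd.create(out, " or ".join(copies))                    # merge all copies into one object
    return copies
```

Inside a PyMOL session this would be called as `build_assembly(cmd)` after `from pymol import cmd` and after loading the two structures under the object names `core` and `fusion`. Passing `cmd` as an argument keeps the function importable outside PyMOL.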

Result

Fusion monomers were built for all constructs for the molecular dynamics simulation studies, and each core was assembled from its monomer to visualize the scaffold proteins.

Molecular dynamics simulation

MD studies are based on Newton's second law of motion:

F = ma

where F is the force acting on an atom, m is its mass and a is its acceleration.

The force derives from the potential energy:

F(x) = −∇U(x)

where x represents the coordinates of all atoms and U is the potential energy function. Velocity is the derivative of position, and acceleration is the derivative of velocity.

We can thus write the equations of motion as:

dx/dt = v

dv/dt = F(x)/m

This is a system of ordinary differential equations. For n atoms, we have 3n position coordinates and 3n velocity coordinates. An analytical (algebraic) solution is impossible to calculate, but a numerical solution can be obtained by stepping forward in time with a small time step δt:

x_{i+1} = x_i + δt v_i

v_{i+1} = v_i + δt F(x_i)/m

This is a simplified presentation of the principle behind molecular dynamics simulations. Because atoms are in constant motion, both in real life and in MD, the simulated motion is used to estimate the probability of observing a particular arrangement of atoms as a function of its potential energy.
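The update rule can be tried on a toy system. The sketch below applies the explicit Euler scheme to a single particle on a spring, which is an assumption made only for illustration because its exact solution is known. The function name `euler_md` and all numerical values are made up for the example.

```python
import math

def euler_md(x, v, force, mass, dt, n_steps):
    """Integrate dx/dt = v, dv/dt = F(x)/m with the explicit Euler scheme."""
    for _ in range(n_steps):
        # Both updates use the values from step i (simultaneous assignment).
        x, v = x + dt * v, v + dt * force(x) / mass
    return x, v

# Toy system (assumption): U(x) = 0.5 * k * x**2, so F(x) = -dU/dx = -k * x.
k = 1.0
x_end, v_end = euler_md(x=1.0, v=0.0, force=lambda x: -k * x,
                        mass=1.0, dt=0.001, n_steps=1000)

# With x(0) = 1 and v(0) = 0 the exact solution at t = 1 is x = cos(1), v = -sin(1).
print(x_end, math.cos(1.0))
print(v_end, -math.sin(1.0))
```

The explicit Euler scheme slowly gains energy over long runs, which is why MD packages generally rely on more stable integrators such as leapfrog or velocity Verlet. The principle of stepping positions and velocities forward in small time increments is the same.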

Objective

Open YASARA

1. Go to options > Macro & Movie > Set Target pdb > md_runfast.mcr

2. Wait for the simulation to run up to 15 ns.

3. Record a video if required.

4. Analyse the result using MD_analyze.mcr

5. Analyse each residue using MD_analyzeres.mcr

Figure: Simulation of the fusion monomer ferritin + ovispirin (4XGS + 1HU5)

Parameters

- pH at which the simulation is run: physiological pH 7.4 by default.

- Ion concentration as a mass fraction: here we use 0.9% NaCl (physiological solution), ions=‘NaCl,0.9’.

- Simulation temperature: 298 K.

- Water density: 0.997 g/ml.

- Duration of the simulation: 15 ns.

- Extension of the cell on each side of the protein: ‘10’ means that the cell will be 20 Å larger than the protein.

- Shape of the simulation cell: ‘Cube’.

- Force field: ForceField AMBER14.

- Cell boundary: Boundary periodic.

- Simulation speed: ‘fast’ (maximize performance with 2*2.5 fs timestep and constraints).

- Save interval for snapshots: normally you don’t need more than 500-1000 snapshots of your simulation.

- Keep the solute from diffusing around and crossing periodic boundaries. Disable that for simulations of crystals.
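To see what these settings imply, the sketch below derives the number of integration steps and the snapshot save interval from the values listed above. It assumes that the 2.5 fs figure is the inner time step and that 500 to 1000 snapshots are wanted. The variable and function names are illustrative and are not YASARA macro variables.

```python
duration_ns = 15          # simulation length from the parameter list
timestep_fs = 2.5         # inner time step of the 'fast' setting (assumption)

duration_fs = duration_ns * 1_000_000            # 1 ns = 1,000,000 fs
n_steps = round(duration_fs / timestep_fs)

def save_interval_ps(duration_ns, n_snapshots):
    """Time between saved snapshots, in picoseconds."""
    return duration_ns * 1000 / n_snapshots      # 1 ns = 1000 ps

print(n_steps)                                   # 6000000
print(save_interval_ps(duration_ns, 1000))       # 15.0
print(save_interval_ps(duration_ns, 500))        # 30.0
```

Under these assumptions, saving a snapshot every 15 to 30 ps keeps a 15 ns run within the recommended 500-1000 snapshots.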

Centre for Research and Interdisciplinarity (CRI)
Faculty of Medicine Cochin Port-Royal, South wing, 2nd floor
Paris Descartes University
24, rue du Faubourg Saint Jacques
75014 Paris, France
paris-bettencourt-2018@cri-paris.org