# Modeling

Molecular dynamics (MD) simulations aim for the comparison with experimental lab works. Molecular modeling is used to describe atomic and molecular movements inside a system. With MD simulation we gained information which enzymes fit best for experimental work linking rhamnolipids to the chitosan matrix enzymatically.

## Introduction and Assumptions

The state of the art does not allow perfect simulations, therefore some simplifications are necessary. The possible states of the mechanical system are reduced into a microcanonical ensemble, which is a system that does not exchange particles (N) and energy (E) outside the defined box. It also has a fixed volume (V). Because of that a system of N particles with coordinates X and velocities V can be defined into two ordinary differential equations shown on the right side. These equations are termed as the potential in physics or the force field in chemistry. It can be defined through two different ways. First, as the mathematical form (i.e. the functional approach for the individual types of interaction, mostly borrowed from classical mechanics) or as the atom specific parameters. The parameters are obtained from spectroscopic experiments, diffraction experiments (X-ray diffraction) and/or quantum mechanical calculations (quantum chemistry). For this reason, there can be different parameter sets for a force field approach. The parameterization of a force field is a big challenge. The choice of the correct force field is an important decision. In general, force fields are only applicable to systems for which they are parameterized (e.g. proteins). The canonical ensemble is characterized by constant temperature. For production run, the system of isothermal-isobaric (NPT) was used. In the isothermal–isobaric ensemble, the amount of particle (N), pressure (P) and temperature (T) are conserved. For our simulation we used the npt condition.

$$F(X)=-\triangledown{}U(X)=M\dot{V}(t)$$ $$V(t)=\dot{X}(t)$$

## Methods

We used three different enzymes (Candida antarctica Lipase B (CALB), Pig Liver Esterase (PLE) and Fusarium solani Cutinase (FSC)) and two ligands, rhamnolipid and chitosan, to perform different tests. We used Gromacs [1] and the tutorial by Justin A. Lemkul. He provides a tutorial about protein ligand complex MD simulation (Mdtutorials). The selected enzymes were downloaded from the PDB database. The ligands were imported into a pymol session to place the ligands into the active site of the protein (Figure 1). Afterwards, the ligands and proteins including the new structure coordinates are extracted. The protein topology was prepared by using the pdb2gmx command of the gromacs version 2018.1-gnu7.12018.1-gnu7.1 installation. The force field of charmm36 all-atom force field (KJuly 2017) by the MacKerell lab website was chosen. The ligand topology file was created using the LigParGen server [2]. A complex.gro file is created that contains the processed, force field-compliant structure of our protein and the manually added ligands structures. Also a main topology file was created which contains the ligands parameters such as force field parameters, too. Furthermore a box including the solvent was created by using the commands gmx editconf and gmx solvate. To achieve a net free charge we added ions into the systems which contained an overall charge. Some error occured so we did trouble shooting e.g. we changed the lines which showed the wrong number of atoms. Afterwards we performed the energy minimization using an input parameter file (em.mdp) provided by the Emkul AB Virginiatech group. Subsequently an index file was created to get the position restraints defined by the ligands. As following step the system was equilibrated using a modified script of the Prof. Dr. Pleiss group. So the system has a constant number of atoms, same volume and the same temperature (NVT), so called a canonical ensemble. Furthermore the simulation, in which the temperature and pressure is constant, is started on the bwHP Cluster using a modified script provided by the group of Prof. Dr. Pleiss. The bwHPC is a high performance computing cluster of the state of Baden-Württemberg. We used the Tier 3 bwHPC clusters of Karlsruhe (bwUniCluster) shown in figure 6 and Tübingen (bwForCluster BinAC). The trajectory was analyzed using a script (modified from the IBTB bioinformatics group). RMSD, radius of gyration and SASA were calculated and plotted using xmgrace [3] in linux.

## Enzymes

An analysis of the three different proteins CALB, PLE and FSC was performed. CALB was chosen since it was reported to accept a broad range of substrates [4], [5]. The other two enzymes were chosen because they show a similar reaction mechanism with a comparable catalytic triad but a more open conformation. Therefore, we suggested that bigger substrates as chitosan and dirhamnolipids may be accepted better when less steric hindrance occurs.
For CALB, a tetrahedral intermediate can be observed when the carboxy group from one of the substrates is attacked by Ser 105 leading to an acylated intermediate. This intermediate can be further processed when the second substrate enters the active site leading to an ester bond formation between both substrates. In our case, dirhamnolipid provides the carboxy group while chitosan carries a hydroxy group for the second reaction step under water-free conditions (compare to Figure 2). PLE and FSC show a similar reaction mechanism since their catalytic triad also consists of a Serine (Ser), a Histidine and an acidic residue such as Glutamic acid or Asparagic acid.

## Results

All molecular dynamics steps were performed on these enzymes. Every simulation was performed 200 ns except PLE which was only simulated 165 ns in case of bigger size and therefore higher computing needs. For stability analysis, the root mean square deviation (RMSD), radius of gyration and solvent accessible surface area (SASA) were measured with the implemented tool in Gromacs [1]. RMSD can give an insight whether the protein is unfolding during simulation or is stable at the system parameters. The average distance between the protein backbone during simulation and a reference structure is calculated for every time step. As reference, the structure after equilibration is used. An increase of RMSD during time means the protein is unfolding. Radius of gyration and SASA usually increase in the same way when the protein unfolds when more of the protein’s surface is accessible for solvent molecules.
The result of the simulation for CALB with dirhamnolipid and a chitosan pentamer preplaced in the active site is shown in movie 1. For visualization purpose, solvent molecules are hidden. On top of that, we measured the distance during simulation between the two substrates and a catalytic important amino acid residue of the proteins. For CALB, Ser 105 was chosen since it is part of the active site by forming an ester bond with the carboxy group of dirhamnolipid. For PLE and FSC, the corresponding Serines were used since they share the same mechanism. These results are described in the following.

## RMSD

The root mean square deviation of CALB (Figure 3, A) indicates that the system is well equilibrated after about 30 ns when no change of RMSD occurs anymore. From this point on, RMSD stays constant at around 1.8 nm. For PLE (Figure 3, B), only a slight increase of RMSD is visible during simulation. With a value around 0.25, it is slightly bigger than the RMSD for CALB but still seems to be stable in the setup. For FSC (Figure 3, C), RMSD value is the lowest of all enzymes with values around 0.125 nm over time. These results are supported by radius of gyration and SASA measurements (data not shown) which do not increase for all three enzymes. Overall, it can be assumed that all simulations ran properly, and system setup was in a range where the protein is stable and therefore catalytic activity can be assumed.

## Active site accessibility of chitosan

For investigation of active site accessibility of chitosan for all three enzymes, chitosan was preplaced in the active site and distance between the ligand and an important residue in the active site of the enzyme was measured in every time step during simulation. In figure 4, the results for active site accessibility of chitosan are shown.
These results implicate that chitosan is not accepted perfectly by any of the investigated enzymes. For an active bound state of chitosan, the distance to Oγ of Serine should not exceed 0.5 nm. In case of CALB (Figure 4, A), chitosan leaves the active site after the first 15 ns and then stays bound to the protein in a non-active manner for about 150 ns with an average distance of 1.2 nm. After this time it leaves the binding pocket area.
For PLE (Figure 4, B), this distance is higher right from the beginning (approx. 2 nm) but stays bound to the enzyme for about 100 ns. In case of FSC (Figure 4, C) chitosan leaves the active site in the first 5 ns of production leading to the conclusion that it is not accepted very well as a product. Interestingly, chitosan seems to re-enter the active site at the end of simulation in the last 10 ns.

## Active site accessibility of rhamnolipids

For investigation of active site accessibility of rhamnolipids for all three enzymes, dirhamnolipid was preplaced in the active site and distance between the ligand and an important residue in the active site of the enzyme was measured in every time step during simulation.
Since the enzymes are only active when both substrates enter the active site, the distance of dirhamnolipid and Serine is also important. For investigation of active site accessibility of dirhamnolipid for all three enzymes, it was preplaced in the active site and distance between the ligand and important residue Serine in the active site of the enzyme was measured in every time step during simulation.
Dirhamnolipid leaves the active site of CALB after more than 100 ns but is not bound in contact distance so a reaction seems to be unlikely (Figure 5, A). Most of the time, it is more than 1 nm away from Ser 105 while it is bound much closer in PLE (Figure 5, B) leading to the conclusion, that PLE is a much better acceptor for this substrate. Here, it stays almost all 165 ns in contact distance slightly over 0.5 nm. For FSC (Figure 5, C), this distance is also bigger than for PLE. Interestingly, dirhamnolipid is reentering the active site at the end just like chitosan in this case. This observation can be further analyzed since this represents the substrate docking process and at the end, a productive conformation is reached.

## Conclusion and Outlook

To put it in a nutshell, all three simulations showed a well equilibrated system and no protein unfolding. PLE was the only enzyme able to bind dirhamnolipid within 0.5 nm. Therefore we suggest PLE for linking rhamnolipids to chitosan enzymatically.

For higher significance of simulations, replicates are useful since MD simulations always contain a statistical aspect. This work can be done in the future. Furthermore, other enzyme can be tested for better substrate acceptance. For example other Lipase like Lipase II from Rhizopus Niveus.