# Model

Modelling and simulations allow to determine the behavior of a designed system before it is even realized. Since synthetic biology is an engineering discipline, it is a tremendous tool complementing the work in the laboratory. Simulations allow precise information also in situations where observations in the lab cannot be obtained due to potential difficulties of the experimental setup. As a result, problems and unwanted effects can be identified with little effort already before the lab phase: Simulation and changes in the design can be iterated many times in a computer to develop an auspicious design.

One outstanding aspect in the characterization of S-layer proteins is the interaction between the S-layer proteins and the surface. Of special interest is the interaction with artificial surfaces for industrial use. Therefore, we aimed for information about the surface properties that allows self-assembly of the S-layer proteins. Additionally, we wanted to analyze the interaction of different S-layer protein types in mixture in order to determine the emerging local structure (clusters, homogeneous patterns or quasicrystals).

## How did we model?

Due to the proteins’ complexity, simulating protein behavior from scratch is impossible in a reasonable time frame. Therefore, we use a coarse graining algorithm considering the amino acids as the smallest subunits of a protein. The amino acids are represented as charged beads positioned as given within the tertiary structure of the protein of interest. With this simplification we follow a coarse graining method used in the publication “Monte Carlo study of the molecular mechanisms of surface-layer protein self-assembly” written by Christine Horejs, et all (The Journal of chemical physics 134.12 (2011): 03B616.), since it keeps the majority of properties of the proteins required for reliable statement.

We thus concentrate on the S-layer proteins with known tertiary structure: the s-layer protein SbsB from the organsim *Geobacillus stearothermophilus *(P1) and the protein RsaA from *C. crescentus*. The simulations of S-layer proteins with unknown tertiary structures, as p2, is impossible, since the knowledge about the position of the amino acids within the protein of interest is essential for modelling, since it clearly is a pre-requisite for the used coarse graining algorithm.

After the coarse graining the proteins are considered as rigid bodys.

## SbSB from *Geobacillus stearothermophilus *(P1)

Electron desity (source: RCSB)

Coarse grained protein

## RsaA from *C. crescentus*(P3)

Electron desity (source: RCSB)

Coarse grained protein

Since a simulation that considers every possible physical interaction is a hopeless endeavor due to the pure complexity we obviously need to concentrate on the most important ones. As previously described in the publication “Monte Carlo study of the molecular mechanisms of surface-layer protein self-assembly” by Christine Horejs, et al. (The Journal of chemical physics 134.12 (2011): 03B616.), a reasonable assumption is that almost all of the relevant information is contained in the electrostatic properties. Instead of a brute-force simulation of the force encoded in Maxwell’s theory of electrodynamics, which again would require enormous calculational effort, we relied on the well-known and reliable Debye-Hückel model for electrolyte solutions that naturally incorporates effects such as the statistic shielding of charges in solutions.

Since we are interested in the final arrangement of the S-layer proteins on a surface and not in the particular dynamics that leads to the assembly, we decided to use Markov chain Monte Carlo (MCMC) based simulations.

The simulations take place within fixed simulation boundaries in the form of a box. Furthermore, the simulation has variable macroscopic parameters as temperature and the charge of the surface that allow us to influence the behavior of the simulation. As in all systems in nature, the energy of our system is minimal and the Monte-Carlo simulation allows us to identify exactly those configurations of minimal energy.

The initial state of the system is a random arrangement of the proteins in the simulation box. In the following times steps the arrangements are randomly changed. Whether an arrangement is kept for next time step depends on the systems energy.

The total energy is composed of all the possible interaction terms in the system: first of all we clearly need to consider the electrostatical interaction terms between all objects in the simulation. Additionally we used a hard sphere model for the coarse grained versions of our proteins that prevents the proteins from intersecting. Moreover, if a protein leaves the simulation box, the systems energy is set to infinity to keep the proteins within the system boundaries.

The probability to accept an arrangement for the next time step is proportional to

Therefore, it depends on the systems energy of the new arrangement E_{new}, the previous arrangement E_{old} and the temperature T. If the arrangement is not excepted, the arrangement from the previous time step is kept. This procedure allows to homogenously sample the space of all possible configurations with a Markov chain and is guaranteed to converge at some point to a configuration with minimal energy.

Temperature

With a temperature larger than zero, it is possible to accept arrangements with a higher systems energy. Therefore, the system has the possibility to leave local minima and can reach the global minimum of the systems' energy due to the thermal fluctuations introduced by the Boltzmann distribution encoded in the given probability distribution. The higher the temperature, the higher the probability that an arrangement is taken even if it moves to a configuration with higher system energy.

Monte Carlo Simulation is a stochastic method which enables us to solve complex problems with the help of probability theory. Therefore, it is a good alternative to analytic approaches that has applications in almost all fields of research nowadays. The Monte Carlo simulation is based on a very large number of repeated random experiments. A very demonstrative example is the numerical approximation of an integral (area) using random numbers: the number of random points inside the area (blue dots) divided by the total number of all random points (blue dots and green dots). The obtained result tends to become closer to the real value as more random experiments are performed (Law of large numbers).

Since the simulation of protein interactions is a time-consuming endeavor, further optimizations were made. Firstly, we compute the Debye-Hückel potential only for charged amino acids. For those, which are not charged, the Debye-Hückel potential equals zero and cannot influence the systems energy anyway. This reduces the computation significantly since the number of charged amino acids is considerably smaller than the total number of amino acids. Furthermore, to verify whether the arrangement is allowed or not, it is not necessary to check all proteins whether they intersect with another, since only close proteins can intersect. This again reduced the simulation time noticeably.

Our model allows the study of surface-protein’s self-assembly in dependence on the macroscopic parameters like temperature and the electromagnetic charge of the surface. The gathered results can be used for guidance in the choice of system parameters in future lab experiments. It is also possible to draw conclusions about the local arrangements of the emerging S-layers patterns. This is of special interest for different S-layer types in mixture.

To prove that our model is correct and can produce reliable results, we did some basic and easily verifiable experiment that all lead to plausible results. Here are some selected examples:

1. 10 oppositely charged beads (5x-10e, 5x10e, 10000 time steps, Temperature of 1):

2. One very negatively charged bead and 10 slightly positively charged beads (1x-100e,10x1e,10000 timesteps, Temperature of 1)

3. Very negatively charged beads in the lattice (blue) and positively charged beads (red) (20 beads of 10e, 225 beads in the lattice of -1e,5000 timesteps, Temperature of 1