We created an advanced multiscale model of the ComCDE quorum sensing system and biofilm formation via WIG synthesis in
S. mutans. Employing the capabilities of the modelling environment Morpheus to simulate cell migration and adhesion as
well as the the diffusion of extracellular small molecules and proteins, we modelled the logarithmic phase of bacterial
growth in two dimensions. Our goal in modelling was to determine the most important factors in the bacterial biofilm
initiation system in order to predict the effectiveness of various methods of inhibiting biofilm growth.
Symbol
Meaning
Value
Units
ktx
Transcription Rate of a Gene
transcripts per gene per second
ktf
Transcription Factor Activity of Phosphorylated ComE
0.15
transcripts per ComEP per gene per second
δmCSP
Decay Rate of CSP mRNA
transcripts per second
δmComD
Decay Rate of ComD mRNA
transcripts per second
δmComE
Decay Rate of ComE mRNA
transcripts per second
δmGTFC
Decay Rate of mGTFC mRNA
transcripts per second
pComC
Number of ComC genes
1
genes
pComD
Number of ComD genes
1
genes
pComE
Number of ComE genes
1
genes
pGTFC
Number of GTFC genes
1
genes
αCSP
Translation Rate of CSP
proteins per second
αComD
Translation Rate of ComD
proteins per second
αComE
Translation Rate of ComE
proteins per second
αGTFC
Translation Rate of GTFC
proteins per second
δCSP
Decay Rate of CSP
∣∣
proteins per second
δComE
Decay Rate of ComE
∣∣
proteins per second
δComD
Decay Rate of ComD
1.5× 10-5
proteins per second
δGTFC
Decay Rate of GTFC
∣∣
proteins per second
δCSPComDP
Decay Rate of CSPComDP
1.5× 10-5
complexes per second
δComEP
Decay Rate of Phosphorylated ComE
proteins per second
kb
Binding Activity of CSP to ComD
complexes per CSP per ComD per second
kub
Unbinding Activity of CSP:Phosphorylated ComD Complex
dissasociations per CSPComDP per second
kk
Kinase Activity of Phosphorylated ComD
phosphorylations per CSPComDP per ComE per second
ke
Enzyme Activity of GTFC
glucans formed per GTFC per Sugar per second
ϕCSP
Export Rate of CSP from a Cell
1∕CSP × second
ϕGlucan
Export Rate of Glucans from a Cell
1∕Glucan × second
kab
Binding Activity of ScFv to GTFC
Complexes per ScFv per GTFC per second
δGTFCScFv
Decay Rate of GTFC:ScFv Complex
∣∣
complexes per second
δScFvcell
Decay Rate of ScFv
∣∣
proteins per second
kkc
Binding Activity of Kappa-Casein
1
1∕KCasein × Glucan × second
δKCaseincell
Decay Rate of Kappa-Casein
∣∣
proteins per second
2.1 Cell Energy and Migration
(2)
(3)
(4)
(5)
Morpheus is based on the Cellular Potts Model, which defines cells as spaces in a two
or three dimensional lattice (our model used a hexagonal lattice in 2d) and
determines changes to cell shape and size by using the Hamiltonian (Equations
2-4). Equation 2 determines changes to the volume of a cell σ with current
volume vσ in lattice sites and intended volume Vt where λV is a constant
parameter of elasticity governing the extent to which the difference between
the cell’s immediate and intended volume contributes to a rise in its free
energy H. Equation 3 incorporates a similar system for cell perimeter into the
equation.
In addition to modelling adjustments to cells’ shape and size as they migrate, the
Cellular Potts Model also uses the Hamiltonian to model the interactions between
cells. Equation 4 determines the interaction energies between different cell types
where τ(σi,σj) represents the cell types of two cells σi and σj and J specifies
said energies in matrix form. In order to prevent cells from returning values
for interactions with themselves, the term known as the Kronecker Delta
defined in Equation 5 is included. Each update to the configuration of cells in
the lattice as a result of equations 2-4 only occurs with a certain likelihood
governed by the Boltzmann probability in Equation 1. This equation states
that the cell’s chance of changing state is 100% if its change in energy ΔH
added to its resistance to change Y is favorable (ΔH+Y < 0) and decreases
exponentially with a rate of where T represents the amount of unfavorable
updates to the cell lattice, defined as modifications to the cell’s perimeter or
volume.
2.2 Diffusion and the Cell Lattice Space
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13)
(14)
(15)
Another important component of our model was treating sugar, CSP, and our
potential outputs not simply as cell-associated values but as scalar fields representing
the concentration of said molecules. Fortunately Morpheus also incorporates the
ability to overlay fields like these on simulated cell populations and allows the field to
be updated locally based on cell activity at a specific lattice site (cells can also report
on fields or other cells surrounding them, a function which is explained in the
next section). Diffusion Fields were evaluated using the Central Difference
Method to solve the 2-D Diffusion Equation (Equations 6-14). Equation 6 is a
differential equation modelling diffusion in one dimension where the concentration
c is a function of the X-coordinate and time T. D signifies the diffusion
coefficient and L represents the length between nodes in (X, T) space where
the domain of solutions is 0≤X≤L. By making the change of variables in
Equation 7, the Diffusion Equation can be rewritten in the form of Equation
8.
Next, a set of approximations are made in order to solve for the first and second
derivatives of concentration with respect to x (Equation 9). The central difference
method is a version of the finite difference method of approximations to solve the
Diffusion Equation, which assumes that there is a minimum distance in both
dimensions, δx and δt, for which the concentration changes. This creates
a grid in (x, t) space as shown in Figure . The central difference method
approximates the x-derivative of concentration based on concentration values on
either side of a point (i, j), one for each of the smallest change in x from
the starting point: (i+1, j) and (i-1, j) (Equation 10). This is more precise
than the forwards or backwards difference methods which only take into
account one direction of incrementation in x. Next, the second derivative of
concentration with respect to x is approximated using the values of the first
derivative on a range half as large as in the previous approximation (Equation
11).
Using these results, an approximation of the concentration at a point 1 time step
in the future can be obtained based on the concentrations at the surrounding points
at the initial time (Equation 12), which can be appended with terms based on the
approximations for the first and second derivatives in Equations 10 and 11 in order to
obtain a more accurate approximation. Of course, in our model this was done in two
dimensions, but the steps for the second dimension y are the same as those above and
can be combined into Equations 13 and 14 in order to obtain values for a 2-D
diffusion model.
In order to sense nearby concentrations or cells in Morpheus, each cell can take
advantage of the NeighborhoodReporter function, which maps values within a
specified node length from a cell to an average, variance, or sum specific to that cell.
In our model, cells employed the sum function of the NeighborhoodReporter to
interact with the various diffusion fields. Equation 15 is the equation for a lattice sum
in a 2-dimensional hexagonal lattice, which Morpheus approximates to write field
values to individual cells.
The first objective of accurately modelling a bacterial population was to model
population growth over time. We chose to define our time step in the model as one
second, and used the canonical doubling time of around 3800 seconds to
determine a probability of cell division at each time step. Next, in order to reflect
the constantly shifting environment of saliva on teeth, we programmed the
cells with random motion defined within a realistic range for cell velocity in
the salivary microbiome. This ensured that the cells would interact so that
they would be affected by the modified adhesion associated with biofilm
initiation.
3.1 Transcription and Translation of the Relevant Genes
(16)
(17)
(18)
(19)
(20)
(21)
(22)
(23)
The biggest hurdle in creating an accurate model of biofilm formation in S. mutans
was simulating the two-component signaling system known as ComCDE. We began
by creating differential equations for the transcription and translation of the ComC,
ComD, and ComE genes based on transcription rates, translation rates and decay
rates for mRNAs and proteins from the literature. We also researched the genome
size and operon structure of the bacteria in order to better reflect its production of
the relevant proteins in our model. Equations 16-18 represent the transcription of
mRNAs for the three genes we chose to study, and the first four terms in each of
Equations 20-22 represent the translation of those mRNAs. All terms and
constants are defined in the table at the beginning of this section. Now that we
had the cells generating values for these variables at each time step, we
could add terms to model the kinetics of the two-component sensing system
itself.
3.2 Two-Component Sensing System: Ligand Binding, Response Regulator
Phosphorylation and DNA Binding
(24)
(25)
(26)
We used the Law of Mass Action to create differential equations reflecting the
kinetics of ComD binding its ligand CSP, the autophosphorylation of ComD, the
phosphorylation of ComE by ComD, and finally the DNA binding and transcription
factor activity of phosphorylated ComE. For a full explanation of the way the
two-component signalling system works in live bacteria, please refer to our project
page, as this section assumes knowledge of the relevant components and their
functions. The first step in the signalling system is the binding of CSP to
ComD. In order to initiate this activity, we used the NeighborhoodReporter
function to update the variable CSPout at each time step. ComD binds
CSP at the rate kb, at which point the two are considered one complex,
CSPComDP for autophosphorylated ComD:CSP, and the amount of CSP and
ComD decrease accordingly (Equations 21 and 24). CSPComDP has an
unbinding rate kub which decreases the amount of ComD:CSP complexes while
increasing both CSPout and ComD by the same amount. In order to simplify
the system of equations, the assumption was made that once CSPComDP
phosphorylates its response regulator ComE, the complex dissociates and
returns to separate ComD and CSPout. kk represents the rate of the complex
phosphorylating ComE (Equation 25). Once ComE is phosphorylated, it is considered
a different variable, ComEP for phosphorylated ComE. Phosphorylated ComE
upregulates the ComC and Glycosyltransferase genes with the rate ktf in order
to create a positive feedback loop within the population characteristic of
quorum-sensing systems and produce the necessary enzymes for biofilm formation.
Once again, the assumption was made that after ComEP binds DNA it is
dephosphorylated by available phosphatases and returns to its initial state,
hence the amount of ComEP decreases with the rate ktf for each gene it
upregulates and ComE increases by the same amount (Equations 22 and
26).
3.3 Water-Insoluble Glucan Synthesis via Glucosyltransferase Activity
(27)
Biofilm formation in the model depends on two main factors: Sugar available to the
cell and the amount of Glycosyltransferase enzymes a cell possesses (GTFC). mRNAs
for GTFC (Equation 19) are only transcribed when ComEP binds the gene to
upregulate it, and GTFC proteins are translated according to Equation 23.
Sugar concentration (Sugarfield) is given an initial value as a diffusion field
in the model, which was defined as a homogeneous diffusion field due to
sugar’s high solubility in saliva. Sugarcell is defined at each time step via the
NeighborhoodReporter function based on the current concentration of the
Sugarcell.
The final differential equation based on the Law of Mass Action in the most basic
form of the model, Equation 27, governs the production of water-insoluble
glucans from available sugar. The enzymatic activity of GTFC is denoted
ke, and the glucans become the basis of the extracellular biofilm by being
incorporated into the extracellular field of the same name at the rate ϕGlucan. In
order to simulate the transition of S. mutans from free-floating planktonic
cells to biofilm-bound immobilized cells which grow in localized colonies,
each cell was programmed to also use the NeighborhoodReporter to return
a constantly updated value for how concentrated biofilm-forming glucans
are around that cell. Once the local biofilm reaches a threshold value of
5 arbitrary units, cells are considered to be adhered and their movement
ceases.
3.4 Extracellular Diffusion Fields
(28)
(29)
(30)
Equations 28-30 are evaluated at each time step at each point in the lattice in order
to affect the changes individual cells make to field values. Both the biofilm and CSP
field equations increase at an export rate times the number of proteins a cell at a
given point has, and the Sugar decreases as GTFC transforms it into glucans. Sugar
undergoes homogeneous or well-mixed diffusion, CSP was given a diffusion
constant of D = 1, and the Biofilm was given the diffusion constant D =
0.001, both in as employed in Equation 6 and its subsequent approximate
solutions.
Our model proved quite successful in qualitatively reproducing multiple aspects of
the formation of live S. mutans biofilms. The simulated cells form clear microcolonies
in the locations where glucan concentration is the highest which expand
from there. CSP increases everywhere throughout the simulation as the
time steps modelled only encompass the phase of growth when that is the
case.
4.1 Effect of Sugar Concentration on Biofilm Formation
The first parameter we varied was sugar by modifying the initial value of the
diffusion field. After fifteen thousand time steps, there is a clear difference in biofilm
density in plots of the cells themselves. Graphs of data collected from a
representative cell in each condition show a clear distinction in glucan production
between sugar concentrations. For the time period simulated, limiting sugar
concentration was between 10 and 25 arbitrary units, at which point the cells were
unable to generate enough glucans to adhere and create microcolonies in a
biofilm. These results reflect our experiments with live S. mutans in which
we varied sucrose concentration and measured its effect on biofilm growth
through image analysis of of colony-forming units as well as crystal violet
staining. As expected based on our differential equations, the concentration
of sugar decreases exponentially over time, whereas the total amount of
glucans produced increases exponentially over time. The rates of exponential
growth and decay also depend on the sugar available, with the difference
in glucans produced by the end of the simulation being around one order
of magnitude less than the difference in sugar available in all cases. This
data was useful in helping us match the arbitrary units in the model to real
experiments.
4.2 Inhibition of GTFC Activity by Single-Chain Variable Fragments
4.2.2 Additional Field Equations Governing ScFv Activity
(33)
4.2.3 Modified Equations Integating ScFv Activity
(34)
The primary purpose of our model was to preemptively compare the effectiveness of
our potential outputs by modelling their inhibition of different parts of the pathway
and its effects on overall biofilm growth. We began by implementing the above
differential equations (Equations 31-33), and modifying Equation 23 into equation 34.
The amount of ScFvs affecting the cell is determined via the NeighborhoodReporter
function, and the ScFvs bind GTFC at the rate kab. Once the ScFv binds a GTFC, it
forms a GTFC:ScFv complex represented in Equation 31, decreasing both
the cell’s GTFC and ScFv count accordingly (Equations 32 and 34). The
complex degrades quickly based on a rate from the literature, and once it has
degraded both the ScFv and GTFC bound have been eliminated from the
simulation.
The effects of the ScFv on biofilm formation are immediately obvious from plots
of the cells at the end of the simulation, with only 0.075 arbitrary units of
concentration to start with being enough to prevent the bacteria from forming
adherent microcolonies throughout the time modelled. The critical point
at which the amount of GTFC: ScFv complexes goes from increasing to
decreasing determines how soon the cells are able to begin initiating biofilm
formation, assuming they are still in the logarithmic growth phase where CSP
production is increasing. Since GTFC production is decreased, everything
downstream of it is also affected: cells consume less sugar, produce fewer
glucans, and create less of a biofilm field in the presence of ScFvs, as shown
in the following figures. In equation 33, the decrease of the ScFv field is
divided by 100 due to the membrane resolution being set to 100 for the
simulation.
Next, we sought to create a similar set of equations to model the effect of
Kappa-Casein on biofilm formation. Equation 35 governs the effect of K-Casein on
an individual cell. Cells use the NeighborhoodReporter to determine the
amount of K-Casein molecules affecting it, and those molecules bind glucans
at a rate of kkc. Rather than creating a new variable for the complex of
K-Casein and a glucan, both are removed from the simulation immediately.
Due to being outside the cell, the binding rate of K-Casein was set to 1.
Equation 37 is a modification of equation 27 integrating the decrease in
glucans.
It became clear from varying the initial K-Casein concentration that the
equations were successful in modelling the decrease in biofilm formation due to a
decrease in effective adherent glucans surrounding the cells. However, despite the
K-Casein being given a dramatically higher binding rate than the ScFvs, a much
higher concentration of K-Casein was required to limit the formation of a biofilm
during the time simulated. In fact, ten times as high a concentration of
K-Casein was required to achieve the same effects as ScFvs. The K-Casein
also affected fewer aspects of the overall system of equations because its
influence was only effective on one of the most downstream components of the
simulation.
The most obvious takeaway from the results of our computational model was that
ScFvs were more effective than K-Casein at inhibiting biofilm formation, or, more
broadly, inhibiting GTFC is a better method for limiting the virulence of S. mutans
than reducing the glucans themselves. This result had a profound effect on our
experimental design and our project as a whole. While Kappa-Casein can be readily
purchased and implemented in live experiments, the ScFvs we planned to
express and implement were not available for purchase anywhere, would be
very expensive to get synthesized, and would require additional training
and lab techniques to isolate from mammalian cells expressing the protein
itself. However, thanks to data from the model, we were able to confirm the
superiority of the ScFvs for our purposes. Based on these results, we decided to
move further ahead with characterization experiments for GTFC-inhibiting
ScFvs.