Team:MIT/Model

Modeling

We created an advanced multiscale model of the ComCDE quorum sensing system and biofilm formation via WIG synthesis in S. mutans. Employing the capabilities of the modelling environment Morpheus to simulate cell migration and adhesion as well as the the diffusion of extracellular small molecules and proteins, we modelled the logarithmic phase of bacterial growth in two dimensions. Our goal in modelling was to determine the most important factors in the bacterial biofilm initiation system in order to predict the effectiveness of various methods of inhibiting biofilm growth.


Symbol	Meaning	Value	Units


k_tx	Transcription Rate of a Gene		transcripts per gene per second

k_tf	Transcription Factor Activity of Phosphorylated ComE	0.15	transcripts per ComEP per gene per second

δ_mCSP	Decay Rate of CSP mRNA		transcripts per second

δ_mComD	Decay Rate of ComD mRNA		transcripts per second

δ_mComE	Decay Rate of ComE mRNA		transcripts per second

δ_mGTFC	Decay Rate of mGTFC mRNA		transcripts per second

pComC	Number of ComC genes	1	genes

pComD	Number of ComD genes	1	genes

pComE	Number of ComE genes	1	genes

pGTFC	Number of GTFC genes	1	genes

α_CSP	Translation Rate of CSP		proteins per second

α_ComD	Translation Rate of ComD		proteins per second

α_ComE	Translation Rate of ComE		proteins per second

α_GTFC	Translation Rate of GTFC		proteins per second

δ_CSP	Decay Rate of CSP	∣∣	proteins per second

δ_ComE	Decay Rate of ComE	∣∣	proteins per second

δ_ComD	Decay Rate of ComD	1.5 × 10^-5	proteins per second

δ_GTFC	Decay Rate of GTFC	∣∣	proteins per second

δ_CSPComDP	Decay Rate of CSPComDP	1.5 × 10^-5	complexes per second

δ_ComEP	Decay Rate of Phosphorylated ComE		proteins per second

k_b	Binding Activity of CSP to ComD		complexes per CSP per ComD per second

k_ub	Unbinding Activity of CSP:Phosphorylated ComD Complex		dissasociations per CSPComDP per second

k_k	Kinase Activity of Phosphorylated ComD		phosphorylations per CSPComDP per ComE per second

k_e	Enzyme Activity of GTFC		glucans formed per GTFC per Sugar per second

ϕ_CSP	Export Rate of CSP from a Cell		¹∕_{CSP × second}

ϕ_Glucan	Export Rate of Glucans from a Cell		¹∕_{Glucan × second}

k_ab	Binding Activity of ScFv to GTFC		Complexes per ScFv per GTFC per second

δ_GTFCScFv	Decay Rate of GTFC:ScFv Complex	∣∣	complexes per second

δ_{ScFv_cell}	Decay Rate of ScFv	∣∣	proteins per second

k_kc	Binding Activity of Kappa-Casein	1	¹∕_{KCasein × Glucan × second}

δ_{KCasein_cell}	Decay Rate of Kappa-Casein	∣∣	proteins per second

2.1 Cell Energy and Migration

P (σ ′ → σ ) = { 1-i(fΔ ΔHH+Y)+ Y > 0 (1) x x e---T---otherwise

∑ H = λV(νσ - Vt)2 σ>0

(2)

∑ H = [λV (νσ - Vt)2 + λP (ρσ - Pt)2] σ>0

(3)

∑ H = J[τ(σi),τ(σj)](1- δσiσj) i,j

(4)

δσiσj = {1,σi = σj;0,σi ⁄= σj}

(5)

Morpheus is based on the Cellular Potts Model, which defines cells as spaces in a two or three dimensional lattice (our model used a hexagonal lattice in 2d) and determines changes to cell shape and size by using the Hamiltonian (Equations 2-4). Equation 2 determines changes to the volume of a cell σ with current volume vσ in lattice sites and intended volume V _t where λ_V is a constant parameter of elasticity governing the extent to which the difference between the cell’s immediate and intended volume contributes to a rise in its free energy H. Equation 3 incorporates a similar system for cell perimeter into the equation.

In addition to modelling adjustments to cells’ shape and size as they migrate, the Cellular Potts Model also uses the Hamiltonian to model the interactions between cells. Equation 4 determines the interaction energies between different cell types where τ(σ_i,σ_j) represents the cell types of two cells σ_i and σ_j and J specifies said energies in matrix form. In order to prevent cells from returning values for interactions with themselves, the term known as the Kronecker Delta defined in Equation 5 is included. Each update to the configuration of cells in the lattice as a result of equations 2-4 only occurs with a certain likelihood governed by the Boltzmann probability in Equation 1. This equation states that the cell’s chance of changing state is 100% if its change in energy ΔH added to its resistance to change Y is favorable (ΔH+Y < 0) and decreases exponentially with a rate of 1T-

where T represents the amount of unfavorable updates to the cell lattice, defined as modifications to the cell’s perimeter or volume.

2.2 Diffusion and the Cell Lattice Space

-∂c -∂2c ∂T = D ∂X2

(6)

X- T- x = L ,t = τ

(7)

∂c τD-∂2c- ∂t = L2 ∂x2

(8)

c(x,t)  ci,j ≡ c(xi,tj),xi ≡ iδx,tj ≡ jδt

(9)

∂c-|xi,tj ci+1,j --ci--1,j= ci+1,j---ci-1,j- ∂x xi - xi-1 2δx

(10)

∂c ∂c ∂2c-|  -∂x |xi+-12 --∂x |xi--12= ci+1,j---2ci,j-+ci-1,j ∂x2 xi xi+12 - xi- 12 (δx)2

(11)

c = c + -δt--(c - 2c + c ) i,j+1 i,j (δx)2 i+1,j i,j i-1,j

(12)

( 2 2 ) ∂-c+ ∂-c  ci+1,j --2ci,j-+-ci-1,j-+ ck+1,j --2ck,j +-ck-1,j ∂x2 ∂y2 (δx)2 (δy)2

(13)

ci,k,j+1 = ci,j +-δt2(ci+1,j - 2ci,j +ci-1,j)+ --δt2(ck+1,j - 2ck,j + ck-1,j) (δx) (δy)

(14)

4 ∑∞ sin[(n-+-1)θ]sin[(m-+-1)θ]--sin-(nθ)sin[(m---1)θ] h2(2s) = 3 [(n+ 12m )2 + 3(12m)2]s m,n=-∞

(15)

Another important component of our model was treating sugar, CSP, and our potential outputs not simply as cell-associated values but as scalar fields representing the concentration of said molecules. Fortunately Morpheus also incorporates the ability to overlay fields like these on simulated cell populations and allows the field to be updated locally based on cell activity at a specific lattice site (cells can also report on fields or other cells surrounding them, a function which is explained in the next section). Diffusion Fields were evaluated using the Central Difference Method to solve the 2-D Diffusion Equation (Equations 6-14). Equation 6 is a differential equation modelling diffusion in one dimension where the concentration c is a function of the X-coordinate and time T. D signifies the diffusion coefficient and L represents the length between nodes in (X, T) space where the domain of solutions is 0≤X≤L. By making the change of variables in Equation 7, the Diffusion Equation can be rewritten in the form of Equation 8.

Next, a set of approximations are made in order to solve for the first and second derivatives of concentration with respect to x (Equation 9). The central difference method is a version of the finite difference method of approximations to solve the Diffusion Equation, which assumes that there is a minimum distance in both dimensions, δx and δt, for which the concentration changes. This creates a grid in (x, t) space as shown in Figure . The central difference method approximates the x-derivative of concentration based on concentration values on either side of a point (i, j), one for each of the smallest change in x from the starting point: (i+1, j) and (i-1, j) (Equation 10). This is more precise than the forwards or backwards difference methods which only take into account one direction of incrementation in x. Next, the second derivative of concentration with respect to x is approximated using the values of the first derivative on a range half as large as in the previous approximation (Equation 11).

Using these results, an approximation of the concentration at a point 1 time step in the future can be obtained based on the concentrations at the surrounding points at the initial time (Equation 12), which can be appended with terms based on the approximations for the first and second derivatives in Equations 10 and 11 in order to obtain a more accurate approximation. Of course, in our model this was done in two dimensions, but the steps for the second dimension y are the same as those above and can be combined into Equations 13 and 14 in order to obtain values for a 2-D diffusion model.

In order to sense nearby concentrations or cells in Morpheus, each cell can take advantage of the NeighborhoodReporter function, which maps values within a specified node length from a cell to an average, variance, or sum specific to that cell. In our model, cells employed the sum function of the NeighborhoodReporter to interact with the various diffusion fields. Equation 15 is the equation for a lattice sum in a 2-dimensional hexagonal lattice, which Morpheus approximates to write field values to individual cells.

The first objective of accurately modelling a bacterial population was to model population growth over time. We chose to define our time step in the model as one second, and used the canonical doubling time of around 3800 seconds to determine a probability of cell division at each time step. Next, in order to reflect the constantly shifting environment of saliva on teeth, we programmed the cells with random motion defined within a realistic range for cell velocity in the salivary microbiome. This ensured that the cells would interact so that they would be affected by the modified adhesion associated with biofilm initiation.

3.1 Transcription and Translation of the Relevant Genes

dmCSP-- = ktx × pComC + ktf × ComEP × pComC - δmCSP × mCSP dt

(16)

dmComD---= ktx × pComD - δmComD × mComD dt

(17)

dmComE---= ktx × pComE - δmComE × mComE dt

(18)

dmGT--FC-= k × ComEP × pGTF C - δ × mGT FC dt tf mGTFC

(19)

dCSPin- = α × mCSP - δ × CSP dt CSP CSP in

(20)

dComD--= αComD ×mComD - δComD×ComD+kub ×CSP ComDP - kb×CSPout ×ComD dt

(21)

dComE--= αComE ×mComE - δComE ×ComE - kk×CSP ComDP ×ComE+ktf ×pComC ×ComEP +ktf×pGT F C×ComEP dt

(22)

dGT-FC- = αGTFC × mGT F C - δGTFC × GT FC dt

(23)

The biggest hurdle in creating an accurate model of biofilm formation in S. mutans was simulating the two-component signaling system known as ComCDE. We began by creating differential equations for the transcription and translation of the ComC, ComD, and ComE genes based on transcription rates, translation rates and decay rates for mRNAs and proteins from the literature. We also researched the genome size and operon structure of the bacteria in order to better reflect its production of the relevant proteins in our model. Equations 16-18 represent the transcription of mRNAs for the three genes we chose to study, and the first four terms in each of Equations 20-22 represent the translation of those mRNAs. All terms and constants are defined in the table at the beginning of this section. Now that we had the cells generating values for these variables at each time step, we could add terms to model the kinetics of the two-component sensing system itself.

3.2 Two-Component Sensing System: Ligand Binding, Response Regulator Phosphorylation and DNA Binding

dCSPout ---dt---= kub×CSP ComDP - kb×CSPout ×ComD+kk ×CSP ComDP ×ComE

(24)

dCSP-ComD-- dt = kb×CSPout×ComD - kub×CSP ComDP - kk×CSP ComDP ×ComE - δCSP ComDP ×CSP comDP

(25)

dComEP---= kk×CSP ComD ×ComEP - ktf×pComC ×ComEP - ktf×pGT FC ×ComEP - δComEP×ComEP dt

(26)

We used the Law of Mass Action to create differential equations reflecting the kinetics of ComD binding its ligand CSP, the autophosphorylation of ComD, the phosphorylation of ComE by ComD, and finally the DNA binding and transcription factor activity of phosphorylated ComE. For a full explanation of the way the two-component signalling system works in live bacteria, please refer to our project page, as this section assumes knowledge of the relevant components and their functions. The first step in the signalling system is the binding of CSP to ComD. In order to initiate this activity, we used the NeighborhoodReporter function to update the variable CSPout at each time step. ComD binds CSP at the rate kb, at which point the two are considered one complex, CSPComDP for autophosphorylated ComD:CSP, and the amount of CSP and ComD decrease accordingly (Equations 21 and 24). CSPComDP has an unbinding rate kub which decreases the amount of ComD:CSP complexes while increasing both CSPout and ComD by the same amount. In order to simplify the system of equations, the assumption was made that once CSPComDP phosphorylates its response regulator ComE, the complex dissociates and returns to separate ComD and CSPout. kk represents the rate of the complex phosphorylating ComE (Equation 25). Once ComE is phosphorylated, it is considered a different variable, ComEP for phosphorylated ComE. Phosphorylated ComE upregulates the ComC and Glycosyltransferase genes with the rate ktf in order to create a positive feedback loop within the population characteristic of quorum-sensing systems and produce the necessary enzymes for biofilm formation. Once again, the assumption was made that after ComEP binds DNA it is dephosphorylated by available phosphatases and returns to its initial state, hence the amount of ComEP decreases with the rate ktf for each gene it upregulates and ComE increases by the same amount (Equations 22 and 26).

3.3 Water-Insoluble Glucan Synthesis via Glucosyltransferase Activity

dGlucan ---dt---= ke × GT F C ×Sugarcell - ϕGlucan × Glucan

(27)

Biofilm formation in the model depends on two main factors: Sugar available to the cell and the amount of Glycosyltransferase enzymes a cell possesses (GTFC). mRNAs for GTFC (Equation 19) are only transcribed when ComEP binds the gene to upregulate it, and GTFC proteins are translated according to Equation 23. Sugar concentration (Sugar_field) is given an initial value as a diffusion field in the model, which was defined as a homogeneous diffusion field due to sugar’s high solubility in saliva. Sugar_cell is defined at each time step via the NeighborhoodReporter function based on the current concentration of the Sugar_cell.

The final differential equation based on the Law of Mass Action in the most basic form of the model, Equation 27, governs the production of water-insoluble glucans from available sugar. The enzymatic activity of GTFC is denoted ke, and the glucans become the basis of the extracellular biofilm by being incorporated into the extracellular field of the same name at the rate ϕ_Glucan. In order to simulate the transition of S. mutans from free-floating planktonic cells to biofilm-bound immobilized cells which grow in localized colonies, each cell was programmed to also use the NeighborhoodReporter to return a constantly updated value for how concentrated biofilm-forming glucans are around that cell. Once the local biofilm reaches a threshold value of 5 arbitrary units, cells are considered to be adhered and their movement ceases.

3.4 Extracellular Diffusion Fields

dCSP --dt-- = ϕCSP ×CSPin

(28)

dBiofilm ---------= ϕGlucan ×Glucan dt

(29)

dSugarfield ----dt----= - ke × GT FC × Sugarcell

(30)

Equations 28-30 are evaluated at each time step at each point in the lattice in order to affect the changes individual cells make to field values. Both the biofilm and CSP field equations increase at an export rate times the number of proteins a cell at a given point has, and the Sugar decreases as GTFC transforms it into glucans. Sugar undergoes homogeneous or well-mixed diffusion, CSP was given a diffusion constant of D = 1, and the Biofilm was given the diffusion constant D = 0.001, both in as employed in Equation 6 and its subsequent approximate solutions.

Our model proved quite successful in qualitatively reproducing multiple aspects of the formation of live S. mutans biofilms. The simulated cells form clear microcolonies in the locations where glucan concentration is the highest which expand from there. CSP increases everywhere throughout the simulation as the time steps modelled only encompass the phase of growth when that is the case.

4.1 Effect of Sugar Concentration on Biofilm Formation

The first parameter we varied was sugar by modifying the initial value of the diffusion field. After fifteen thousand time steps, there is a clear difference in biofilm density in plots of the cells themselves. Graphs of data collected from a representative cell in each condition show a clear distinction in glucan production between sugar concentrations. For the time period simulated, limiting sugar concentration was between 10 and 25 arbitrary units, at which point the cells were unable to generate enough glucans to adhere and create microcolonies in a biofilm. These results reflect our experiments with live S. mutans in which we varied sucrose concentration and measured its effect on biofilm growth through image analysis of of colony-forming units as well as crystal violet staining. As expected based on our differential equations, the concentration of sugar decreases exponentially over time, whereas the total amount of glucans produced increases exponentially over time. The rates of exponential growth and decay also depend on the sugar available, with the difference in glucans produced by the end of the simulation being around one order of magnitude less than the difference in sugar available in all cases. This data was useful in helping us match the arbitrary units in the model to real experiments.

4.2 Inhibition of GTFC Activity by Single-Chain Variable Fragments

4.2.1 Additional Cellular Equations Governing ScFv Activity

dGT-F-CScF-v = kab × ScFvcell × GT FC - δGTFCScFv × GT FCScF v dt

(31)

dScF-vcell= - k ×ScF v × GT FC - δ × ScF v dt ab cell ScFvcell cell

(32)

4.2.2 Additional Field Equations Governing ScFv Activity

dScF-vfield-= (- k × ScF v ×GT F C - δ × ScFv ) ÷100 dt ab cell ScFvcell cell

(33)

4.2.3 Modified Equations Integating ScFv Activity

dGT FC --dt---= αGT FC × mGT F C - δGT FC ×GT F C - kab × ScFvcell × GT FC

(34)

The primary purpose of our model was to preemptively compare the effectiveness of our potential outputs by modelling their inhibition of different parts of the pathway and its effects on overall biofilm growth. We began by implementing the above differential equations (Equations 31-33), and modifying Equation 23 into equation 34. The amount of ScFvs affecting the cell is determined via the NeighborhoodReporter function, and the ScFvs bind GTFC at the rate kab. Once the ScFv binds a GTFC, it forms a GTFC:ScFv complex represented in Equation 31, decreasing both the cell’s GTFC and ScFv count accordingly (Equations 32 and 34). The complex degrades quickly based on a rate from the literature, and once it has degraded both the ScFv and GTFC bound have been eliminated from the simulation.

The effects of the ScFv on biofilm formation are immediately obvious from plots of the cells at the end of the simulation, with only 0.075 arbitrary units of concentration to start with being enough to prevent the bacteria from forming adherent microcolonies throughout the time modelled. The critical point at which the amount of GTFC: ScFv complexes goes from increasing to decreasing determines how soon the cells are able to begin initiating biofilm formation, assuming they are still in the logarithmic growth phase where CSP production is increasing. Since GTFC production is decreased, everything downstream of it is also affected: cells consume less sugar, produce fewer glucans, and create less of a biofilm field in the presence of ScFvs, as shown in the following figures. In equation 33, the decrease of the ScFv field is divided by 100 due to the membrane resolution being set to 100 for the simulation.

4.3 Inhibition of Glucan Activity by Kappa-Casein

4.3.1 Additional Cellular Equations Governing K-Casein Activity

dKCaseincell -----dt-----= - kkc × KCaseincell × Glucan - δKCaseincell × KCaseincell

(35)

4.3.2 Additional Fiels Equations Governing K-Casein Activity

dKCaseinfield -----dt------= (- kkc×KCaseincell×Glucan- δKCaseincell×KCaseincell)÷100

(36)

4.3.3 Modified Equations Integating K-Casein Activity

dGlucan-= ke×GT F C×Sugarcell- ϕGlucan×Glucan - kkc×KCaseincell×Glucan dt

(37)

Next, we sought to create a similar set of equations to model the effect of Kappa-Casein on biofilm formation. Equation 35 governs the effect of K-Casein on an individual cell. Cells use the NeighborhoodReporter to determine the amount of K-Casein molecules affecting it, and those molecules bind glucans at a rate of kkc. Rather than creating a new variable for the complex of K-Casein and a glucan, both are removed from the simulation immediately. Due to being outside the cell, the binding rate of K-Casein was set to 1. Equation 37 is a modification of equation 27 integrating the decrease in glucans.

It became clear from varying the initial K-Casein concentration that the equations were successful in modelling the decrease in biofilm formation due to a decrease in effective adherent glucans surrounding the cells. However, despite the K-Casein being given a dramatically higher binding rate than the ScFvs, a much higher concentration of K-Casein was required to limit the formation of a biofilm during the time simulated. In fact, ten times as high a concentration of K-Casein was required to achieve the same effects as ScFvs. The K-Casein also affected fewer aspects of the overall system of equations because its influence was only effective on one of the most downstream components of the simulation.

The most obvious takeaway from the results of our computational model was that ScFvs were more effective than K-Casein at inhibiting biofilm formation, or, more broadly, inhibiting GTFC is a better method for limiting the virulence of S. mutans than reducing the glucans themselves. This result had a profound effect on our experimental design and our project as a whole. While Kappa-Casein can be readily purchased and implemented in live experiments, the ScFvs we planned to express and implement were not available for purchase anywhere, would be very expensive to get synthesized, and would require additional training and lab techniques to isolate from mammalian cells expressing the protein itself. However, thanks to data from the model, we were able to confirm the superiority of the ScFvs for our purposes. Based on these results, we decided to move further ahead with characterization experiments for GTFC-inhibiting ScFvs.