Team:Chalmers-Gothenburg/ModelConstruction

Model- iGEM Chalmers-Gothenburg 2018

Model Design

This page serves as support to the model overview page. Here, the construction and initialization of the different models used in this iGEM project are discussed. None of our models are created from nothing, thus the sources and inspiration for our approach and models are presented on this page as well. In the following sections we will be referring to our scripts, functions, and models, which can all be found in our GitHub repository.

Kinetic Model - α Pheromone Feedback Loop

In order to simulate the α pheromone system and the production of anti-cancer agent, a model by Kofahl & Klipp (2004) was implemented. The model describes the α pheromone pathway from receptor activation to the production of Ste12. The system described by the model is illustrated in figure 1.

Model by Kofahl and Klipp (2004)
Figure 1: Illustration of the α pheromone pathway described by the model of Kofahl and Klipp (2004)

In this project, Bar1 is removed from the genome of the yeast. Because of this, the concentrations and rate of changes of Bar1 and all of its deriviatives included in the model by Kofahl & Klipp (2004) was set to zero. To get the anti-cancer agent production as a function of the α pheromone concentration in the environment, additional reactions where added to the model. The additional reactions include production of anti-cancer agent mRNA and protein as shown in the equations below.

As can be seen by the equations, it is assumed that the transcription and translation are linear to the Ste12 and the mRNA concentrations, respectively. The purpose of the constant Q is presented below.

The production of α pheromone via mRNA was added to the model in the same manner as for the anti-cancer agent production. In this way, a feedback loop was created where the concentration of α pheromone is regulated by itself.

Parameter values for the added reactions were found with optimization. The model was fit to the expression of GFP under the FUS1 promoter with the addition of 2000 nM α phermone. The data was obtained from iGEM Delft (2012). Prior to optimization, the GFP expression at the first time point was set as the baseline and the data was normalized. In order to fit the data, Bar1 was temporarily added to the model while Far1 was removed from it. In addition, pseudo reactions for mRNA and protein production replaced the separate productions of α pheromone and anti-cancer agent in order to fit the total protein production to the data. The initial concentration of α pheromone in the model was set to 2000 nM based on the concentration of α pheromone used in the experiments. For the optimization, particle swarm optimization as adapted from the course Stochastic Optimization Algorithms (Whade, M., 2017) was applied, using the squared difference between the data and the model output as the objective function. The following parameter values were obtained.

The figure below shows the model output, total protein concentration, together with the data used for optimization.

Figure 2: Normalized experimental data, GFP expression as a function of time, plotted together with the model output obtained after parameter estimation. The initial concentration of α pheromone was set to 2000 nM

It should be noted that the data used for optimization is not quantitative and that the model output as a result is only qualitative. In addition to this, an identifiability analysis should optimally have been performed in order to find out if the solution to the optimization problem is unique. In this case, it is highly suspected that the parameter values that were optimized for are linearly dependent, or correlated, which means that some of them should have been set prior to optimization in order to get a unique solution. Furthermore, it can be questioned wether it is reasonable that the transcription is so much faster than the translation as the parameter values indicate.

In order to find separate parameters for the production of anti-cancer agent and α pheromone, translation, transcription and degradation were assumed to be linear with respect to protein length. Q in the equations above was set to Q = r / (r + 1), where r = length(mfα2) / length(anti-cancer agent). The following lengths were used: 120 aa for mfα2, (Saccharomyces Genome Database, no date), 28 aa for p28 (Warso et al., 2013) and 527 aa for myrosinase.

The relative p28 production obtained by simulation with 20 nM, 200 nM and 2000 nM α pheromone is shown in figure 3.

Figure 3: Relative p28 production obtained by simulation for different start concentrations of α pheromone

The production of myrosinase is shown in figure 4.

Figure 4: Relative myrosinase production obtained by simulation for different start concentrations of α pheromone

Observing the figures of p28 and myrosinase production, it can be seen that as the initial α pheromone concentration is increased, the maximum concentration of anti-cancer agent produced increases. It can also be noted that the anti-cancer agent concentration becomes zero after some time, illustrating the need for continuous stimulation with α pheromone in order to get continuous production of anti-cancer agent.

Comparing figures 3 and 4, it can be seen that, as a direct consequence of the assumption that transcription, translation and degradation are linear to protein length, the production of p28 is higher than that of myrosinase. It can also be observed that the peak of the p28 production is reached within 500 minutes while the production of myrosinase is more prolonged.

To investigate how changes in the α pheromone concentration affect the anti-cancer agent production, the model was run for initial concentrations of α pheromone between 0 and 1000 nM. Figure 5 shows the maximum concentrations of p28 and myrosinase obtained as a function of initial α pheromone concentration.

Figure 5: Maximum concentrations of p28 and myrosinase as functions of initial α pheromone concentration

Figure 5 suggests that the anti-cancer agent production peaks at an initial concentration of approximately 20 nM of α pheromone. This would mean that the highest concentration of anti-cancer agent is obtained when the concentration of α pheromone in the environment is around this value. A value of 20 nM can seem low considering that we do not want production of anti-cancer agent unless several yeast cells have accumulated. However, it should be taken into account that as the yeast cells produce α pheromone, it will disperse in the colon making the effective α pheromone concentration lower than the one produced.

Kinetic Model - Cell Cycle

In order to determine the effect of p28 and myrosinase on cancer cell survival, an ODE model of the cell cycle was implemented. The cell cycle model is based on the work of Hamada et al. (2009) and consists of three parts; p53 dynamics, cell cycle arrest and apoptosis. The cell cycle arrest and apoptosis parts of the model were implemented according to Hamada et al. (2009) (figure 6) based on the models created by Bagci, Vodovotz, Billiar, Ermentrout, & Bahar (2000) and Aguda (1999), respectively (figures 7 och 8). Parameter values and initial conditions for these systems were taken from Hamada et al. (2009). An illustration of the cell cycle arrest system is shown in the figure below.

Cell cycle arrest system
Figure 6: An illustration of the cell cycle arrest system based on the work of Hamada et al. (2009)

The model by Hamada et al. (2009) was originally created to observe the effect of DNA damage on the cell cycle. For our purposes, the DNA damage part of the system was removed. An illustration of the apoptosis system is shown below.

Apoptosis system
Figure 7: An illustration of the apoptosis system based on the work of Hamada et al. (2009)

The cell cycle model is based on the idea that if the p53 levels are elevated, the cell will go into cell cycle arrest or undergo apoptosis. A relatively small increase in the p53 levels should make the cell go into cell cycle arrest, indicated by a delay in the peak of the active MPF levels (aMPF in the figure). Based on the work of Hamada et al. (2009), the condition for cell cycle arrest was set so that the delay should be at least 25 % of the initial peak time. If the p53 levels become high enough, the concentrations of the species involved in apoptosis should be elevated and the cell should undergo apoptosis. To simulate this, a threshold was set on the concentration of caspase-3 (abbreviated casp3 in the figure), a species involved in the later stages of the apoptosis system. It was decided that if the caspase-3 concentration becomes higher than the initial concentration of caspase-3, the cell undergoes apoptosis. This decision was based on the findings of Hamada et al. (2009). In conjunction to this, an extra term was added to the system with the motivation that the caspase-3 levels should be able to increase from zero. The term was constructed so that the p53 concentration has a direct effect on the concentration of caspase-8 (casp8), a species that indirectly cause the caspase-3 levels to be elevated.

The p53 threshold, p53thresh was set to 0.40 μM (Hamada et al., 2009).

The purpose of the cell cycle model was to find the effectiveness of the anti-cancer agents. To measure this, we decided to simulate the percentage of cancer cells that undergoes apoptosis or cell cycle arrest as a function of environmental anti-cancer agent levels. Stochasticity was introduced into the cell cycle model by two separate means. Firstly, the initial extracellular anti-cancer agent levels are drawn from a normal distribution around the actual environmental concentration. This is done in order to simulate fluctuations in the environment. Secondly, the uptake rate of anti-cancer agent into the cell is drawn from a normal distribution where the mean was found by optimization (see below). For both of the normal distributions, the standard deviation was set to half of the mean.

p53 Dynamics

For the p53 dynamics part of the cell cycle model, a new system of ODEs was created. The system describes the degradation of p53 both by Mdm2 and Cop1, as well as by other means not related to Mdm2 and Cop1. An illustration is shown in figure 8.

p53 dynamics
Figure 8: An illustration of p53 dynamics as adapted from Hamada et al. (2009)

The differential equations describing the illustrated p53 dynamics follow below.

The parameter values for the p53 dynamics were found with optimization (see below) except for the association, kCop11 and dissociation, kCop12, constants for Cop1. These values were set to kCop11 = 0.1 μM-1s-1 and kCop12 = 1.1 * 10-3 s-1 (Moscetti, Bizzarri, & Cannistraro, 2018).

The initial concentrations of the p53 system were found by the assumption that the system is in steady state prior to addition of anti-cancer agent. The initial concentrations depend on the parameter values as follows.

After the optimization, a stability analysis was performed to make sure that this set of initial concentrations gives a stable steady state.

Models of p28 and Myrosinase

To evaluate and compare the effects of p28 and myrosinase on cancer cell survival, two different models were made from the cell cycle model. The first model includes p28 and its effect on the p53 levels in the cell. The other model includes the enzymatic activity of myrosinase and the effect of sulforaphane on cell survival.

Model with p28

For the purpose of simulating the effect of p28 on the p53 levels and thereby cell survival, the following reactions were added to the cell cycle model.

The association, kp281, and dissociation, kp282, constants for p28 was set to kp281 = 0.86 μM-1 s-1 and kp282 = 5.6 * 10-3 s-1 (Moscetti, Cannistraro, & Bizzarri, 2017). The effect of p28 on the p53 levels are governed by two processes; the uptake of p28 and the binding of p28 to p53. It is assumed that the only reaction affected by p28 is the degradation of p53 via Cop1. This means that it is assumed that p53 can act even when p28 is bound.

The model with p28 was optimized using data for sulforaphane generated by Yamada et al. (2009) to find values for the parameters involved in the p53 system and the additional parameter added to the dynamics of caspase-8. The data concern breast cancer cells but for these purposes p28 is assumed to have the same effect on colon cancer cells. The optimization was performed with a genetic algorithm as adapted from the course Stochastic Optimization Methods (Wahde, M., 2017) and a set of model parameters were evaluated as follows. The model was run 100 times using the given parameter set to find the survival probability for several concentrations of p28. Preferably, the survival would have been estimated from a larger number of runs but in order to keep the running time within a reasonable range, 100 runs was deemed sufficient. The ODE system was solved for 86,400 time steps, corresponding to the 24 hours at which the data was recorded. The objective function was set to maximize the inverse of the squared difference between the generated and the experimental data. The parameter values obtained are shown below.

The mean uptake rate was found as 0.1326 s-1. Using the parameter values from the optimization, the initial concentrations of the p53 system were set as follows.

A stability analysis was performed in order to make sure that this set of initial values renders a stable steady state. The stability analysis showed that this steady state is indeed stable. A pdf of the Mathematica code used for the analysis can be found here.

Observing the parameter values and the initial concentrations obtained from the optimization it could be suspected that the initial concentration of Cop1 and the degradation of p53 via Cop1 is overestimated while those for Mdm2 is underestimated. The reason for this is that Mdm2 is supposedly the main regulator of p53 activity (Wang et al, 2011). It should also be noted that the cell never seems to go into cell cycle arrest, why the condition for cell cycle arrest could maybe be less strict. A less strict condition for cell cycle arrest might also affect the dynamics of Cop1. Another interpretation of the high values for the Cop1 activity could be that the regulation of p53 via Cop1 is more pronounced in a cancer cell. However, it is reasonable to assume that the overestimation of the Cop1 activity is a direct result of the fact that the objective function used during optimization favors parameters sets where p28 has a noticeable effect on the cell survival. The overestimation of Cop1 activity suggests that the model is faulty or that there are several solutions to the optimization problem. To investigate this further, an identifiability analysis should be performed. Based on the outcome of the analysis it could be decided if the model should be changed or if parameters should be set prior to optimization. Constraints could also be inferred on the Cop1 and Mdm2 activities, to make sure that the Mdm2 in fact becomes the main regulator of p53.

Model with Myrosinase

In order to test the effect of sulforaphane on cell survival, myrosinase and sulforaphane were added to the cell cycle model as follows.

It was assumed that the myrosinase concentration is constant during the time considered and that Michaelis-Menten kinetics apply (Hochkoeppler & Palmieri, 1992). The kcat value of myrosinase was set to kcat = 1.19 * 102 s-1 and the Km to Kmmyr = 160 nM * s-1 (Hochkoeppler & Palmieri, 1992). The concentration of glucosinolate was approximated based on an average content of 62.3 mg glucosinolate in 100 g of broccoli (Lewis & Fenwick, 1987). Assuming that the molecular weights of the glucosinolates are similar, the molecular weight of sinigrin was used to find the corresponding molar content. The molecular weight of sinigrin is 397.454 g/mol (Pubchem, n.d.), which gave a glucosinolate content of approximately 0.157 mmol per 100 g broccoli. Based on that a person eats a 100 g of broccoli in a day and that 10 liters of liquid passes through the gastrointestinal tract every 24 hours, the glucosinolate content could be approximated to 0.157/(10*24) mM ≈ 650 nM. Note that this is a rough approximation.

To see an effect of sulforaphane on cell survival, sulforaphane was assumed to linearly upregulate concentrations of Bax and p21 and to increase the levels of cytosomal cytochrome C (Tortorella, Royce, Licciardi, & Karagiannis, 2015). Bax, p21 and cytochrome C are all involved in apoptosis. The following reactions were added to the respective species in the model.

The parameter values for the uptake and action of sulforaphane was find with a genetic algorithm using data from Liu et al. (2017) and the same objective function as for the p28 model. The following parameter values were obtained.

The mean uptake rate of extracellular sulforaphane was found to be 2.073 * 10-4 s-1.

Outcome of Models

To test the effect of p28 and myrosinase on cell survival, the models were run for anti-cancer agent concentrations between 0 and 200 μM. For simplification, the concentration of sulforaphane was set directly and the conversion of glucosinolate by myrosinase was left out from the simulations. For each concentration of anti-cancer agent, cell survival was calculated based on a 1000 realizations evaluated for 86,400 time steps, corresponding to 24 hours. The results are shown in the figure below.

Cell survival
Figure 9: Cell survival as a function of anti-cancer agent concentration obtained with simulation during a time period corresponding to 24 hours

Observing figure 9, it can be seen that the effect of sulforaphane seems to be stronger than that of p28. It should be noted, however, that the effect of myrosinase might be less strong since it will be limited by the concentration of glucosinolate in the colon. Moreover, it should also be considered that the results from the α pheromone model suggest that the cells will produce more p28 than myrosinase at a given cell concentration.

Saccharomyces boulardii Genome Scale Metabolic Model

The first step in our genome scale metabolic modeling approach was to obtain a Saccharomyces boulardii model. Unfortunately, there was no readily available model for this organism in the literature, so we opted to generate our own. Since it can take anywhere from months to years to develop a well curated model, we decided to generate a Saccharomyces boulardii model using the closely related and well curated Saccharomyces cerevisiae GEM. Having access to this GEM allowed us to use a homology based approach to obtain the desired Saccharomyces boulardii GEM with the help of the RAVEN toolbox (Agren, 2013). Please refer to the script reconstruction.m to see our MATLAB code for this GEM reconstruction. The script is divided into the following sections:

1. Install RAVEN & Set Up Working Directory

We downloaded and installed the RAVEN toolbox. We used RAVEN 2.0.2. The exact version of RAVEN used for reconstruction, curation, and simulations is also available on our GitHub page. This GitHub repository also contains our community dFBA framework, the genome scale metabolic models we employed, and our simulation results.

2. Obtain Protein FASTA Files & Perform BLAST

Use following links to obtain protein FASTA files for our organisms of interest:

We ran a bi-directional BLAST using the FASTA files for our two organisms to obtain a BLAST structure. This structure contains information about which genes, and therefore reactions, from the template model are present in the target model.

3. Obtain High Quality Saccharomyces cerevisiae GEM

Download the yeast GEM here. We checked that model is functional and can produce biomass.

4. Get Saccharomyces boulardii Homology Model

We used the RAVEN function getModelFromHomology, which takes a template model and BLAST structure as inputs. A model is obtained by essentially subsetting reactions from the template model which are also present in the target model, based on the similarity of their protein FASTA files.

5. Saccharomyces boulardii GEM Curation & Gap-Filling

Next, we added non-gene annotated reactions from Saccharomyces cerevisiae GEM. This includes spontaneous reactions, exchange reactions, pseudo-reactions, etc. which were not captured in the BLAST structure. The model is still not able to produce biomass, therefore gap-filling needs to be done. The RAVEN toolbox offers the fillGaps function, which uses a template model as a reference to fill gaps so that biomass can be produced.

6. Add Proteins of Interest to Saccharomyces boulardii Model

After the draft model for Saccharomyces boulardii was reconstructed, we added protein synthesis and exchange reactions for our peptides of interest: MFalpha2, Myrosinase, and P28. The protein synthesis reactions were constructed based on the amino acid content of each peptide, and account for the ATP needed for the addition of each amino acid each (~4 ATP/amino acid (Amthor, 2000)). While the RAVEN toolbox offers functions to add reactions and metabolites to a model, it does not readily construct the lengthy stoichiometric equations needed for each protein. Instead of writing these expressions by hand we wrote a function to do this, called makeProteinEqn, which is under review to be added to the RAVEN toolbox. This function takes amino acid sequences and protein names as inputs, while outputting balanced protein forming equations such as:

'26 Ala-tRNA(Ala)[c] + 20 Arg-tRNA(Arg)[c] + 31 Asn-tRNA(Asn)[c] + 36 Asp-tRNA(Asp)[c] + 11 Cys-tRNA(Cys)[c] + 13 Gln-tRNA(Gln)[c] + 24 Glu-tRNA(Glu)[c] + 39 Gly-tRNA(Gly)[c] + 15 His-tRNA(His)[c] + 28 Ile-tRNA(Ile)[c] + 44 Leu-tRNA(Leu)[c] + 35 Lys-tRNA(Lys)[c] + 10 Met-tRNA(Met)[c] + 30 Phe-tRNA(Phe)[c] + 28 Pro-tRNA(Pro)[c] + 35 Ser-tRNA(Ser)[c] + 31 Thr-tRNA(Thr)[c] + 11 Trp-tRNA(Trp)[c] + 36 Tyr-tRNA(Tyr)[c] + 24 Val-tRNA(Val)[c] + 2107 ATP[c] => 26 tRNA(Ala)[c] + 20 tRNA(Arg)[c] + 31 tRNA(Asn)[c] + 36 tRNA(Asp)[c] + 11 tRNA(Cys)[c] + 13 tRNA(Gln)[c] + 24 tRNA(Glu)[c] + 39 tRNA(Gly)[c] + 15 tRNA(His)[c] + 28 tRNA(Ile)[c] + 44 tRNA(Leu)[c] + 35 tRNA(Lys)[c] + 10 tRNA(Met)[c] + 30 tRNA(Phe)[c] + 28 tRNA(Pro)[c] + 35 tRNA(Ser)[c] + 31 tRNA(Thr)[c] + 11 tRNA(Trp)[c] + 36 tRNA(Tyr)[c] + 24 tRNA(Val)[c] + 2107 ADP[c] + 2107 phosphate[c] + Myrosinase[c]'

Refer to makeProteinEqn.m to see the MATLAB code for this function.

7. Simulations

Checked that proteins can be feasibly synthesized while biomass is produced.

8. Final Clean-up of Saccharomyces boulardii Model

Added information about model reconstruction, export model to excel, and save model as a .mat file.

Representative Gut Microbiome Genome Scale Metabolic Models

To model the gut microbiome we decided to use three representative species in terms of composition and biosynthetic capabilities. A similar approach was taken in a gut microbiome study by Shoaie et al. (2013). In this paper, the authors generated genome scale metabolic models for three key gut microbiome member species: Bacteroides thetaiotaomicron, Eubacterium rectale and Methanobrevibacter smithii, which represent the main phyla Bacteroidetes, Firmicutes, and Euryarchaeota, respectively. We used these models, which were provided in the supplementary information section. In the subsequent sections, we will discuss the model curation performed on each model. Please refer to sections 1.2-1.4 of the simulationScript.m file to view our MATLAB code.

Bacteriodetes thetaiotaomicron GEM Curation

The objective function was set to growth. Exchange reactions for glutamine, histidine, lysine, phenylalanine, valine, threonine, isoleucine, leucine, methionine, and tryptophan were added to allow for uptake. Additionally, some metabolite names were modified for consistency with other models. For example, the metabolite “CO2” was changed to “carbon dioxide”. This facilitates the identification of relevant fluxes for the automatic formulation of metabolite mass balances in the dFBA simulations.

Eubacterium rectale GEM Curation

The objective function was set to growth. Uptake exchange reactions for the abovementioned amino acids, as well as for succinate, were added. Some metabolite names were modified for consistency, as with the previous model.

Methanobrevibacter smithii GEM Curation

The objective function was set to growth. Curation for this model was more extensive as it appears to be the least well-curated model. Running the gapReport revealed that the model contains many gaps and isolated subnetworks. By using the checkRxn function, it was discovered that a few reactants in the biomass pseudo reaction were not able to be synthesized by the model. To address this, we reformulated the biomass equation to exclude these unreachable metabolites, which allowed the model to grow. Next amino acid and formate uptake exchange reactions were added. Finally, some metabolite names were modified, as discussed previously.

Colorectal Cancer Genome Scale Metabolic Model

As we were unable to find a specific genome scale metabolic model for colorectal cancer in the literature, we consulted with a few researchers in our department. Lucky for us, Avlant Nilsson was working on a Hep G2 model and allowed us to use it. While the model is meant to be specific to liver cancer cells, it was still useful for us since different human cancers similar metabolic hallmarks (Hanahan, & Weinberg, 2011). This liver cancer model was derived from HMR 2.0, a well curated genome scale metabolic model for generic human cells (Mardinoglu, 2014). Curation for our cancer model was very similar to that of the previous ones, uptake exchange reactions were added for some amino acids, as well as the short chain fatty acids propanoate, butyrate, and acetate. The objective function was set to growth. Finally, some metabolite names were modified for consistency with other models. Please refer to section 1.5 of simulationScript.m to see the MATLAB code.

Human Gut Tissue Genome Scale Metabolic Model

We obtained a GEM of human gut tissue from the Human Metabolic Atlas tissue-specific model repository (Uhlen et al., 2015). This model was also derived from HMR 2.0. Since it was not appropriate to set the objective function of this model to growth, we used a modified human maintenance reaction as the objective function for this model. We took the human_proteinPool reaction and metabolite (a stoichiometric sum of amino acids needed for protein turnover) from the HMR 2.0 model, and added it as a necessary reactant for the maintenance reaction: ATP[s] + 6 H2O[s] + proteinPool[s] => ADP[s] + Pi[s] + H+[s]. Note that some additional water was added on the reactant side to make the model uptake water, as this is part of the biological function of gut cells. In the subsequent simulations, this additional water uptake will not have a significant influence as water is not limiting. Uptake exchange reactions for some amino acids, propanoate, butyrate, and acetate were added. Finally, some metabolite names were modified for consistency with other models. Please refer to section 1.6 of simulationScript.m to see the MATLAB code.

COM-dFBA: A Community Dynamic Flux Balance Analysis Tool

We decided to carry out some form of community dynamic flux balance analysis simulations early on in our modeling approach development. The advantage of dFBA over traditional FBA is that it allows us to simulate dynamic behavior. However it does require knowledge of certain kinetic parameters for the uptake of metabolites from literature. At first we considered compartmentalizing each model to generate one large network, but this would have been tedious and would have presented difficulties in determining a community objective function. To avoid this problem, we opted to not merge the models, and instead have them connect through mass balances of user-defined shared exchange metabolites. The scheme is presented in the modeling overview page. It consists of three interacting parts: the kinetic block, the FBA block, and the dynamic block. Based on user-defined initial metabolite concentrations, the kinetic block first determines the upper uptake rate of each metabolite j by each organism i, based on Michaelis-Menten kinetics. These rates are then used to update the allowed uptake bound of each exchange metabolite j by each organism i. Next, an FBA simulation is performed for each model in the FBA block. In the dynamic block, the fluxes for each exchange metabolite j is identified in the FBA solution of each organism i, and mass balances are generated for each exchange metabolite. An ODE solver then integrates these mass balances to determine the concentrations of each exchange metabolite j at a given time point t. These new metabolite concentrations are then used by the kinetic block to recalculate the allowed uptake bounds of metabolite j by organism i. This continues in a cyclic fashion until a steady state is reached, and thus dynamic simulation is obtained.

While there are existing tools which could be used for dFBA simulations, such as DyMMM, the COBRA toolbox, and DFBAlab, we decided to create our own dFBA tool, since the scheme is relatively straightforward to implement in MATLAB. Furthermore, the most recent and cutting edge tool, DFBAlab, does not seem to be easily accessible online. We requested access to the appropriate person, but we have yet to get a response from them.

COM-dFBA, short form COMmunity dynamic Flux Balance Analysis is a simple MATLAB framework for carrying out dFBA simulations for complex communities, such as the gut microbiome. It consists of six functions and one script, which can be found in our GitHub repository under the folders COM-dFBA/Scripts. It’s usage will now be described.

Initializing Models

Once the models have been curated we move on the initialization step. To view our code, please refer to section 2.0 of simulationScript.m. First, we create a cell structure containing the models to be used: modelSbo, modelBth, modelEre, modelMsi, modelCan, and modelCol representing Saccharomyces boulardii, Bacteroides thetaiotaomicron, Eubacterium rectale, Methanobrevibacter smithii, colorectal cancer cells, and colon cells, respectively. Next, we pass this cell structure to the initModels function, which outputs a structure containing the loaded models, called superModel, as well as other relevant information. Refer to initModels.m to see exactly what the function is doing. Next, we pass the superModel structure to the function superModelFluxes, which neatly summarizes the exchange fluxes of each model in a structure called fluxes. Note that these fluxes are calculated with no uptake limitations, aside from the ones imposed by each model's respective upper and lower bounds. Refer to superModelFluxes.m to see what the function is doing. Next, we create a cell structure with the names of the exchange metabolites for which mass balances will be generated. These include biomass for each model, substrates, and products. This structure, as well as the superModel structure, is passed to the checkFluxes function, which in turn outputs the structures metEquations, metIdx, bounds, blocked. This function is mostly meant to serve as a debugging tool to check that the metabolites have been correctly matched in each model, which can be double checked in the metEquations strucure. Refer to checkFluxes.m to see in detail what this function is doing.

Initializing Environment and Kinetic Parameters

Once the models have been initialized we move on the initializing the environment and kinetic parameters. To view our code, please refer to section 3.0 of simulationScript.m. First, we load data from the excel file called paramTables.xlsx, which can be found in the COM-dFBA/Scripts folder. This file contains four sheets, with information regarding Vmax values, Km values, initial and flow concentrations, and other miscellaneous parameters, respectively. The literature search and sources for these parameters are discussed in a further section. The excel data is loaded into a structure called params. Next, we pass the superModel and params structures to the function updateFluxes, which outputs the structures superModelConstrained, fluxesConstrained, boundsConstrained, and lbubConstrained. The function serves as a debugging tool and sanity check. It runs one iteration of the dFBA simulation to check that the bounds and fluxes are being adjusted properly, based on the chosen kinetic parameters and initial concentrations. The most interesting output to look at is perhaps fluxesConstrained, which displays similar information as the abovementioned fluxes structure, the difference being that the fluxes are now cosntrained by initial conditions and kinetic parameters. Refer to updateFluxes.m to see what the function is doing. Once everything is in working order, we finally move onto simulations.

Simulations

Once the environment and models have been initialized we can begin running simulations. Refer to section 4.0 of the file simulationScript.m to see the MATLAB code. To begin a simulation, we simply use the following snippet of code: [t,xa] = ode15s(@(t,x)f(t,x,superModel,params),[t0 tf], initialConditions ,odeoptions). As can be seen, we rely on one of MATLAB’s ODE solver for integration. More specifically, we use the variable step continuous implicit solver ode15s. This solver was determined to be the most efficient through trial and error of different solvers, including ode23 and ode113. Next, we see that the ode15s function calls on another function called f. Additionally, we provide the ODE solver with a time span, initial conditions, and some options. The function f takes in the superModel and params structures as input arguments. Please refer to the file f.m to inspect the code in detail. This function is the foundation of our dFBA framework, and contains three distinct sections: 1. KINETIC BLOCK, 2. FBA BLOCK, and 3. DYNAMIC BLOCK. The kinetic block uses Michaelis-Menten kinetics to calculate the maximum possible uptake bound for each metabolite and organism, which depends on how much of a given metabolite is present in the environment. The following expression is calculated in a nested for loop, for each organism i and for each metabolite j.

Next, the function uses these above calculated values to update the appropriate uptake bounds in each corresponding GEM. Finally, we compare the MFalpha2 concentration in the environment with the MFalpha2 threshold for anti-cancer protein production, and update the protein production in the S. boulardii model accordingly. The FBA block then solves the models with updated bounds to obtain exchange fluxes for each organism. If the solution for the S. boulardii model becomes infeasible due to protein production constraints, we attempt to successively diminish the protein production load until the model is solvable or protein production is halted. Finally, in the dynamic block we take the FBA solutions and construct mass balances for each exchange metabolite of interest. This is done in a loop, and the mass balances are formulated as follows:

In other words, the change in the concentration of metabolite X is equal to the sum of production of X, minus the sum of consumption of X, minus the difference between outflow and inflow of X multiplied by the dilution rate. Please note that the summation terms refer to production and consumption of metabolite X by all models, and they take into account the biomass concentration of each organism consuming or producing metabolite X. The only exception to this formulation is the mass balance for cancer biomass, which accounts for the effect of anti cancer proteins instead of having a dilution term. In the following equation the term ACE represents the Anti Cancer Efficiency (1/hr), MW represents the molecular weight of the anti cancer protein (g/mmol), and X(Myr) represents the anti cancer protein (in this case Myrosinase) concentration (mmol/L):

Note that the resulting expressions for biomass have units of g/L*hr, whereas the expressions for metabolites have units of mmol/L*hr.

The beauty of COM-dFBA is that it takes care of identifying exchange fluxes for user-defined metabolites in the FBA solution of each model i, it identifies if the fluxes corresponds to production or consumption, and automatically formulates a mass balance equation for each exchange metabolite j. To see exactly how COM-dFBA does this, please refer to the last part of the f function.

Parameter Determination

Important parameters to determine include dilution rate, initial concentrations of metabolites, inflow concentrations of metabolites, and kinetic parameters for the uptake of exchange metabolites for each organism. Let us first consider the nature of our system, i.e. the gastro-intestinal tract. This can be split up in three main compartments: the stomach, responsible for the acidic and enzymatic break down of the ingested food, the small intestine, which further digest the food and absorbs a large fraction of the nutrients and water, and finally, the large intestine, which includes the colon and hosts microorganisms that ferment non-digestable nutrients (Crane, 1975).

Figure 10: Global schematic of stomach and intestine functions (Crane, 1975)

We are specifically interested in modeling the last of these three compartments, the large intestine, which includes the colon. This compartment can be viewed as a continuous ideally stirred-tank reactor (CISTR), where the volume is constant. Two key assumptions apply: the concentration in the outflow is equal to the concentration in the reactor, and the inflow and outflow rates are constant and equal. Next, volumes and flow rates have to be determined. Further in the discussion, initial and inflow concentrations of metabolites and biomass will also be determined.

Volume and Flow Rate

A person ingests around 2 liters of food and drinks a day (Wikibooks, n.d.), to digest this, the body produces around 7l of secretions: 2,5l in the stomach, 4,5l in the small intestine and 0,2l in the large intestine (Wikibooks, n.d.). The small intestine in turn absorbs water accounting for 90% of this volume (Azzouz & Sharma, 2018). This sums up to an inflow of:

(4,5 + 2,5 + 2) * 0,1/24 l/day = 0,0375 l/h from the small intestine
0,2 l/day = 0,0083 l/h of secretions
Thus:0,0375 + 0,0083 = 0,0458 l/h round up to: 0,05 l/h

The large intestine absorbs a large part of the remaining fluids but this will not be included in order to have only one outflow to the large intestine; the faeces.

While in reality the volume of the content of the gastrointestinal tract is not constant, we estimate the volume of large intestine based on the transit time of its content. This transit time is highly variable amongst individuals and directly depending on the individual’s diet. The transit time for the large intestine will be approximated to 30h (Chaddock et al., 2014). Assuming that the content passes through the large intestine on a linear time scale we get a volume of: 0,05 l/h x 30h = 1,5l

Initial and Inflow Concentrations

With the framework of the simulation being set up, both initial and inflow concentrations of metabolites as well as biomass have to be determined. These will allow the model to compute final steady-state concentrations that are of interest.

Initial Biomass Concentrations

Six different biomasses are included in the model; gut microorganisms, including 3 representative species of the gut microflora, the human colon cells, the human cancer cells and Saccharomyces boulardii. All of these except the last one are assumed to have no inflow but only an initial concentration.

The total dry mass of all microorganisms in the colon is 0,100g (Karlsson et al., 2012). 98% of the microflora consist of Bacteria and the remaining 2% are Archea (Karlsson et al., 2012). Of the Bacteria, 29% belong to the Bacteroidetes phylum, 56% to the Firmicute phylum and the remaining 15% are other species that are not included in the model (Karlsson et al., 2012). To keep the same proportion of bacteria to archea, the percentages of Bacteriodetes and Firmicutes have been proportionally increased to 33% and 65% respectively. With a volume of 1,5l as calculated above, this results in 22,29 g/l of Bacteriodetes represented by B. thetaiotaomicron, 43,04 g/l of Firmicutes represented by E. rectale and 1,33 g/L of Euryarcheota species represented by M. smithii.

Due to a lack of literature data, we have assumed that the number of colonocites is in the same range as the enterocyte cell count in the small intestine, which is 1,67*1010 (Bianconi, 2013).
With a cell volume of 1,4*10-12 liter (Wiśniewski, 2012), a density of 1,05 (Pertoft & Laurent, 1977) and 40% of its mass being dry weight (Kretsinger, Uversky, & Permiakov, 2013) the total dry mass enterocytes, and thus the assumed dry mass of colonocites equals:
1.67*1010 * 1,4*10-12 * 1,05 * 0,40 = 0,98*10-2kg
With a volume of 1,5l this gives a concentration of: 6,55 gdw/L

Metabolite Inflow Concentrations

The metabolite inflow used in the model is based on daily intake according to a "standard British-type diet" (Stephen & Cummings, 1980). This diet contains 407 grams carbohydrates, including 22 grams of fibers, 85 grams of proteins and 108 grams of fats. Fats are not included in the simulation because they do not play a major role in the fermentation. Additionally, key metabolites such as ammonium and phosphate are included. Respectively 4,018 g/day (WHO, 2003), of which the majority is secreted by the body, and 1,5g/day (EFSA NDA Panel , 2015)

Carbohydrates

Dietary carbohydrates are a wide variety of molecules ranging from simple monosaccharides to complex and resistant fibers. The carbohydrates of interest in this model are those that are fermented in the colon, namely fermentable fibers and resistant starch (Chebrut, Barry, Lairon, & Durand, 1995). They are consumed by the gut microorganisms to produce Short Chain Fatty Acids (SCFA). The other types of carbohydrates will either be majoritarily taken up in the small intestine or passing through unaffected. The composition of carbohydrates varies greatly depending on diet type. Here, it is assumed that 20% by mass of dietary carbohydrates are fermentable. To keep the number of exchange metabolites in the model low, all carbohydrates are modeled as glucose.

Amino Acids

In the large intestine, amino acids are both used as building blocks for protein production in all present cells and as substrate for fermentation. 12 grams of proteinaceous material enter the large intestine every day (Hughes, Magee, & Bingham, 2000). It is mainly composed of proteins and peptides and around half of it originates from the diet while the body secretes the rest. All proteinaceous material is modeled as free amino acids and only essential amino acids are included as metabolites. The proportions of amino acids entering the large intestine are assumed to be the same as in food as determined by (Lenders et al., 2009) and scaled down to 12g/day. In our simulations we included the following amino acids: glutamine, histidine, lysine, phenylalanine, valine, threonine, tryptophan, methionine, leucine, isoleucine.

Oxygen

Since the inside of the colon is anaerobic, a mass balance for oxygen was not included in the simulations involving the gut microbiome. For simulations involving cancer and our engineered yeast, we allowed a small influx of oxygen. The rationale behind this was that the vascular remodeling induced by the cancer would provide some oxygen for our yeast, which is programmed to bind to the cancer.

Secondary Metabolites

Other metabolites included in the simulation are carbon dioxide, acetate, propanoate, butyrate, succinate, formate, ethanol and methane. These are all secondary metabolites, meaning they are not present in the inflow.

Michaelis-Menten Kinetic Parameters for Substrate Uptake

Uptake rates are based on Michaelis-Menten kinetics. Thus, they behave according to the equation:

Vs = (Vmax*S)/(KM+S)

For each substrate and each organism both a maximal uptake rate Vmax and a Michaelis constant KM are needed. These are only rarely well researched, thus in some cases rough assumptions had to be made. The values for these constant are summarized in the excel table accessible here as found in literature and converted into the right units for our purpose.

In order to make the simulations output realistic results, parameters have been changed to obtain final biomass and metabolite steady state concentrations that correspond to what would be expected from literature. The changes have been summarized in this excel file.

Simulation Design

To investigate the effect of our engineered organism, we set up two distinct in silico experiments:

  1. S. boulardii and Cancer
  2. Gut Community Interaction
    • Gut Microbiome and Colon before S. boulardii Inflow
    • Gut Microbiome and Colon with S. boulardii Inflow
    • Gut Microbiome and Colon after interrupting S. boulardii Inflow

In the first in silico experiment we can see the interaction dynamics between S.boulardii and colorectal cancer. More specifically, we aim to investigate: if our engineered yeast can compete with the cancer for substrates, if our engineered yeast can produce enough α pheromone to trigger the production of our anti-cancer agent, and if the resulting anti-cancer agent concentration is sufficient to disrupt the growth of the cancer. This simulation was run for 500 hours. The parameter values used for this simulation are summarized in this excel file.

In the second in silico experiment, which is actually three consecutive simulations, we aim to investigate the effect of adding S. boulardii to the gut microbiome. In the first sub-simulation, we determine the steady state composition of the gut microbiome without S. boulardii. In the second sub-simulation, we then introduce an inflow of S. boulardii to the steady state obtained in the first sub-simulation. Finally, in the third sub-simulation, we stop the inflow of S. boulardii. The rationale behind this is to see the extent to which S. boulardii disrupts the gut microbiome composition, to see if S. boulardii will become out competed by the gut microbiome once the dosage/inflow is halted, and to see if the gut microbiome can return to its pre-S. boulardii composition once the dosage/inflow is halted. The parameter values used for this simulation are summarized in this excel file.

The results of these simulations are discussed on the model overview page.

References

Aguda, B. D. (1999). A quantitative analysis of the kinetics of the G2 DNA damage checkpoint system. Proceedings of the National Academy of Sciences, 96(20), 11352–11357. https://doi.org/10.1073/pnas.96.20.11352

Azzouz, L. L., & Sharma, S. (2018). Physiology, Large Intestine. StatPearls. StatPearls Publishing. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/29939634

Bagci, E. Z., Vodovotz, Y., Billiar, T. R., Ermentrout, G. B., & Bahar, I. (2006). Bistability in Apoptosis: Roles of Bax, Bcl-2, and Mitochondrial Permeability Transition Pores. Biophysical Journal, 90(5), 1546–1559. https://doi.org/10.1529/biophysj.105.068122

Bianconi, E., Piovesan, A., Facchin, F., Beraudi, A., Casadei, R., Frabetti, F., … Canaider, S. (2013). An estimation of the number of cells in the human body. Annals of Human Biology, 40(6), 463–471. https://doi.org/10.3109/03014460.2013.807878

Bianconi, E., Piovesan, A., Facchin, F., Beraudi, A., Casadei, R., Frabetti, F., … Canaider, S. (2013). An estimation of the number of cells in the human body. Annals of Human Biology, 40(6), 463–471. https://doi.org/10.3109/03014460.2013.807878

Chaddock, G., Lam, C., Hoad, C. L., Costigan, C., Cox, E. F., Placidi, E., … Spiller, R. C. (2014). Novel MRI tests of orocecal transit time and whole gut transit time: studies in normal subjects. Neurogastroenterology and Motility : The Official Journal of the European Gastrointestinal Motility Society, 26(2), 205–214. https://doi.org/10.1111/nmo.12249

Crane, R. K. (1975). The Physiology of the Intestinal Absorption of Sugars This review has to do with the physiology of the intestinal absorption of sugars and should properly begin with a brief discussion of the components of the physiological system which carries out this i, 2–19. https://doi.org/10.1021/bk-1975-0015.ch001

EFSA NDA Panel. (2015). Scientific Opinion on Dietary Reference Values for phosphorus. EFSA Journal, 13(7), 4185. https://doi.org/10.2903/j.efsa.2015.4185

Frydoonfar, H. R., McGrath, D. R., & Spigelman, A. D. (2004). Sulforaphane inhibits growth of a colon cancer cell line. Colorectal Disease, 6(1), 28–31. https://doi.org/10.1111/j.1463-1318.2004.00488.x

Hamada, H., Tashima, Y., Kisaka, Y., Iwamoto, K., Hanai, T., Eguchi, Y., & Okamoto, M. (2009). Sophisticated Framework between Cell Cycle Arrest and Apoptosis Induction Based on p53 Dynamics. PLoS ONE, 4(3), e4795. https://doi.org/10.1371/journal.pone.0004795

Hochkoeppler, A., & Palmieri, S. (1992). Kinetic properties of myrosinase in hydrated reverse micelles. Biotechnology Progress, 8(2), 91–96. https://doi.org/10.1021/bp00014a001

Hughes, R., Magee, E. A., & Bingham, S. (2000). Protein degradation in the large intestine: relevance to colorectal cancer. Current Issues in Intestinal Microbiology, 1(2), 51–58. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/11709869

Karlsson, F. H., Fåk, F., Nookaew, I., Tremaroli, V., Fagerberg, B., Petranovic, D., … Nielsen, J. (2012). Symptomatic atherosclerosis is associated with an altered gut metagenome. Nature Communications, 3(1), 1245. https://doi.org/10.1038/ncomms2266

Kofahl, B., & Klipp, E. (2004). Modelling the dynamics of the yeast pheromone pathway. Yeast, 21(10), 831–850. https://doi.org/10.1002/yea.1122

Kretsinger, R. H., Uversky, V. N., & Permi︠a︡kov, E. A. (Evgeniĭ A. (2013). Encyclopedia of metalloproteins.

Ledley, R. S. (1960). Report on the Use of Computers in Biology and Medicine - Robert Steven Ledley - Google Boeken. Retrieved from https://books.google.se/books?id=J5grAAAAYAAJ&printsec=frontcover&hl=nl&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false

Lev Bar-Or, R., Maya, R., Segel, L. A., Alon, U., Levine, A. J., & Oren, M. (2000). Generation of oscillations by the p53-Mdm2 feedback loop: A theoretical and experimental study. Proceedings of the National Academy of Sciences, 97(21), 11250–11255. https://doi.org/10.1073/pnas.210171597

Lewis, J., & Fenwick, G. R. (1987). Glucosinolate content of brassica vegetables: Analysis of twenty-four cultivars of calabrese (green sprouting broccoli, Brassica oleracea L. var. botrytis subvar. cymosa Lam.). Food Chemistry, 25(4), 259–268. https://doi.org/10.1016/0308-8146(87)90012-4

Moscetti, I., Bizzarri, A. R., & Cannistraro, S. (2018). Imaging and kinetics of the bimolecular complex formed by the tumor suppressor p53 with ubiquitin ligase COP1 as studied by atomic force microscopy and surface plasmon resonance. International Journal of Nanomedicine, Volume 13, 251–259. https://doi.org/10.2147/IJN.S152214

Moscetti, I., Cannistraro, S., & Bizzarri, A. R. (2017, November 20). Surface plasmon resonance sensing of biorecognition interactions within the tumor suppressor P53 network. Sensors (Switzerland). https://doi.org/10.3390/s17112680

Pertoft, H., & Laurent, T. C. (1977). Isopycnic Separation of Cells and Cell Organelles by Centrifugation in Modified Colloidal Silica Gradients. In Methods of Cell Separation (pp. 25–65). Boston, MA: Springer US. https://doi.org/10.1007/978-1-4684-0820-1_2

Pubchem. (n.d.). Compound Summary for CID 23682211 - Sinigrin. Retrieved October 7, 2018, from https://pubchem.ncbi.nlm.nih.gov/compound/Sinigrin#section=Top

Shoaie, S., Karlsson, F., Mardinoglu, A., Nookaew, I., Bordel, S., & Nielsen, J. (2013). Understanding the interactions between bacteria in the human gut through metabolic modeling. Scientific Reports, 3(1), 2532. https://doi.org/10.1038/srep02532

Stephen, A. M., & Cummings, J. H. (1980). THE MICROBIAL CONTRIBUTION TO HUMAN FAECAL MASS. Journal of Medical Microbiology, 13(1), 45–56. https://doi.org/10.1099/00222615-13-1-45

Stephen, A. M., & Cummings, J. H. (1980). THE MICROBIAL CONTRIBUTION TO HUMAN FAECAL MASS. Journal of Medical Microbiology, 13(1), 45–56. https://doi.org/10.1099/00222615-13-1-45

Tortorella, S. M., Royce, S. G., Licciardi, P. V., & Karagiannis, T. C. (2015). Dietary Sulforaphane in Cancer Chemoprevention: The Role of Epigenetic Regulation and HDAC Inhibition. Antioxidants & Redox Signaling, 22(16), 1382–1424. https://doi.org/10.1089/ars.2014.6097

Wang, L., He, G., Zhang, P., Wang, X., Jiang, M., & Yu, L. (2011). Interplay between MDM2, MDMX, Pirh2 and COP1: the negative regulators of p53. Molecular Biology Reports, 38(1), 229–236. https://doi.org/10.1007/s11033-010-0099-x

Warso, M. A., Richards, J. M., Mehta, D., Christov, K., Schaeffer, C., Rae Bressler, L., … Das Gupta, T. K. (2013). A first-in-class, first-in-human, phase I trial of p28, a non-HDM2-mediated peptide inhibitor of p53 ubiquitination in patients with advanced solid tumours. British Journal of Cancer, 108(5), 1061–1070. https://doi.org/10.1038/bjc.2013.74

WHO. (2003). Ammonia in Drinking-water - Background document for development of WHO Guidelines for Drinking-water Quality. Guidelines for drinking-water quality (Vol. 2). World Health Organization. Retrieved from http://www.who.int/water_sanitation_health/dwq/ammonia.pdf

Wikibooks. (n.d.). Medical Physiology/Gastrointestinal Physiology/Secretions - Wikibooks, open books for an open world. Retrieved October 7, 2018, from https://en.wikibooks.org/wiki/Medical_Physiology/Gastrointestinal_Physiology/Secretions

Wiśniewski, J. R., Ostasiewicz, P., Duś, K., Zielińska, D. F., Gnad, F., & Mann, M. (2012). Extensive quantitative remodeling of the proteome between normal colon tissue and adenocarcinoma. Molecular Systems Biology, 8, 611. https://doi.org/10.1038/msb.2012.44

Yamada, T., Mehta, R. R., Lekmine, F., Christov, K., King, M. L., Majumdar, D., … Das Gupta, T. K. (2009). A peptide fragment of azurin induces a p53-mediated cell cycle arrest in human breast cancer cells. Molecular Cancer Therapeutics, 8(10), 2947–2958. https://doi.org/10.1158/1535-7163.MCT-09-0444