Metabolic engineering and synthetic biology in general are powerful areas of research, but they can only realize a leap in breakthroughs when computational models involved in the Design-Build-Test-Learn engineering paradigm are standardized and implemented in the field. As part of our iGEM project, the models developed were motivated by different objectives and approaches in order to inform and be shaped by our endeavors in the lab. To do so, two different computational models were used to determine the best approaches to improve PHA yield and productivity in recombinant E. coli.
One of the ways to model a system in silico is through the dynamic model, a favourite among systems biologists. By defining a pathway (such as the heterologous PHA biosynthetic pathway) or metabolic network as a series of ordinary differential equations (ODE), dynamic behaviors in the system over time may be predicted. A considerable limitation arises from the need for highly detailed information to parameterize the kinetics of the pathway; therefore, this type of model may be difficult to implement.
On the other hand, an organism as well-studied as E. coli enjoys a vast wealth of literature surrounding its genome and metabolic network. As a result, stoichiometric or constraint-based models such as genome-scale models (GEM) are available to be used for in silico simulations without necessitating the rigorous characterization associated with dynamic or kinetic models. Using this, we can investigate problems such as the best conditions to grow a recombinant recombinant Sleeping beauty mutase (SBM) operon-containing E. coli and how to improve the titer of PHA.
A metabolic network such as that of E.coli may be viewed in mathematical terms: a system of equations that follow the transcriptional, translational and enzymatic activity within the organism. However, for the purposes of looking at the PHA biosynthetic pathway, we will be looking predominantly at the biochemical reactions from an enzyme kinetics perspective. The two main ways to represent reactions and their rates are mass-action and Michaelis-Menten kinetics. The latter is the most common way to represent the enzyme kinetics of a biochemical pathway; thus, for our purposes we will use it as the main paradigm for further derivation.
Because the PHA biosynthetic pathway utilizes acetyl-CoA - a crucial component in core metabolism of the organism - as an initial substrate, we must ensure that the model also contextualizes PHA production in the scope of other relevant pathways: the two main ones are glycolysis and the TCA cycle. If we plan to consider the use of alternative feedstock with high quantities of short-chain fatty acids (acetate and propionate), it would be ideal to also provide information regarding the assimilatory pathway for these substrates. Having the most comprehensive dynamic model allows understanding of the control structure of the system (Moreno-Sánchez et al., 2008) as well as identifying bottlenecks arising from elements such as energy and reducing power (NADPH/NADH) for anabolic and catabolic reactions. However, building a dynamic model requires a wealth of information, which most of time is unavailable and requires extrapolation and inference.
To make predictions regarding the metabolic capacity of engineered strains of E. coli, for instance, the strain wherein the Sleeping beauty mutase operon is activated, allowing for in vivo biosynthesis of propionate (Srirangan et al., 2016; Gonzalez-Garcia et al., 2017) that may be used for producing the PHBV co-polymer. Understanding how a combination of feedstock such as sugars and fatty acids can influence the overall yield and composition of our PHBV end product can be an enabling tool; however, to achieve this goal is very labor- and resource-intensive. For this purpose, it was deemed more appropriate to use a constraint-based modeling approach (see below).
Understanding the relevant pathways for PHA biosynthesis
Using the dynamic model, we can investigate problems related to the behavior of the system and the consequences in production. The questions that we seek to investigate using this model are related most closely to the amount of feedstock (e.g. acetate and propionate) and their relative proportions that would achieve a desired end composition of PHBV co-polymer (Horng et al., 2013). In addition, we also used the model to theoretically survey the effects that can be achieved towards this end through the engineering of enzymes (that is, changing the binding affinity and catalytic activity). From this, we may also be able to speculate on the effect of change in gene arrangement in the operon (Hiroe et al., 2012). The resulting differences in transcriptional levels of the encoded enzymes can be represented in the system as different amounts of enzymes.
To make a cursory exploration into how we may approach the production of PHBV with tailor-made composition and properties, we deemed that the dynamic model may even be sufficient if we were to limit the scope to only the PHA biosynthetic pathway while we would effectively "black box" the peripheral metabolic network. That is, we will define the pathway of three subsequent reactions, condensing acetyl-CoA and propionyl-CoA into polymers of PHBV.
For the model, we must assume a number of aspects. First, we have to assume that the parallel pathways contributing to the PHB and PHV components in the co-polymer do not interact. While it is most likely not the case, representation of a pathway where there is competition between substrates for a common enzyme may difficult to do so via a Michaelis-Menten derivation without more data. On a first pass, we assume that this distinction is condonable as long as the differences in the parameters we are testing are significant. Second, we must provide an influx of material into the biosynthetic pathway. Because we are not intensively modeling the pathways for glycolysis, TCA and fatty acid assimilation (a problem for another day after iGEM!), we assume the introduction of acetyl-CoA and propionyl-CoA precursors can be represented via a heuristic means. The third assumption builds upon this in that, the amount of starting precursors that we test for simulations of the dynamic models are assumed to be within physiological ranges and would not be detrimental to the cell. Fourth, we assume that the system is homogeneous (no spatial element or compartments). Finally, we assume that the levels of enzymes are conserved throughout the experiment (note that the concentration of available enzyme does not fluctuate as would be the case in a mass-action representation that involves the formation of enzyme-intermediate complexes).
The biosynthetic pathway towards producing PHBV co-polymer
Based on these assumptions, a model was developed expanding on the work done by the Stanford-Brown 2015 iGEM team. Their model originally outlines the system using irreversible Michaelis-Menten kinetics for PHB biosynthesis. However, understanding that the catalytic activity of our encoded PHA enzymes can go both ways (for example, the β-ketothiolase can both form acetoacetyl-CoA from 2 acetyl-CoA molecules via a Claisen condensation or can degrade the acetoacetyl-CoA back in the reverse direction), we deemed it was more appropriate to represent the kinetics of the system with the reversible form of the Michaelis-Menten equation (Imperial and Centelles, 2014). This has been shown in past literature to be preferred for in vivo scenarios. This applied to the enzymes encoded by phaA, bktB and phaB; however, the phaC enzyme is excluded as literature has shown the synthase activity to be unidirectional. Another enzyme is responsible for the degradation of PHAs into usable subunits; however, since this element is not present in E. coli, we can effectively disregard this.
Defining the dynamic model using a system of ordinary differential equations (ODEs)
The reactions regarding the condensation and reduction of acetyl-CoA and propionyl-CoA are two-substrate reactions; thus, we decided to represent this mathematically with the corresponding formulae (Cornish-Bowden 1993) shown above. The Michaelis constants (Km) factor in the enzymes' relative affinity to each substrate. These are also considered in the enzyme's catalytic efficency (ks, kp ). You may find further details on the chosen parameters in our Github repository.
In the case of PHA synthase, since it directly incorporates 1 equivalent of 3HB-CoA or 3HV-CoA into a PHBV polymer, we represented this using a traditional irreversible Michaelis-Menten equation.
The aforementioned means to represent influx of material is derived also from the Stanford-Brown 2015 model. In their model, they incorporated an enzyme called acetate synthase (encoded by acs) that produces acetyl-CoA from acetate and coenzyme A. We adapted this formula to represent the assimilation of both acetate and propionate and their subsequent conversion into their CoA-ligated forms.
To read more about the dynamic model developed by the Stanford-Brown 2015 team, please click here!
From a theoretical standpoint, it was worth investigating how different combinations of precursors may influence the resulting composition of the co-polymer product. The time plot generated from solving the ODE system with different initial parameters reveals as we hypothesized that with increasing available propionyl-CoA, the PHBV was predicted to have an increasing percentage of the PHV component. The initial concentrations of all enzymes were held at 0.1 μM and acetate at 1000 μM. For each run, the initial concentration of propionate was increased by 500 μM.
Among the main observations from the model results were that the resulting predicted molar ratios may be inaccurate and significantly discordant from the actual physiologically relevant amounts of PHBV found in microorganisms. Even in the case where 500 μM propionate and 1000 μM acetate was fed into the system, the occurrent molar ratio of PHV was as high as approximately 40%. This, however, was to be expected given the limited scope of the dynamic model. We decided to test other parameter alterations to see whether relative changes in behavior may be reflected concordantly with experimental data.
Assessing the effect of increasing propionate vs acetate substrate concentration on final PHBV composition
Given that one of our endeavors in the lab was to test how phaA and bktB β-ketothiolases performed relatively to one another in facilitating PHBV biosynthesis, we decided to investigate this by testing various combinations of parameters related to the concentrations of each enzyme. While phaA is the common enzyme found in C. necator used for PHB biosynthesis, it has been shown that commonly co-expressed bktB was superior in producing the PHV component of the co-polymer. However, we were interested to see whether bktB can fully replace phaA without compromising on the total yield of PHBV.
From what is seen from the surface plot (left) comparing the resulting PHBV total after 24 hours (at steady-state) from different concentrations of phaA and bktB, the dynamic model predicts that along higher levels of bktB there was an overall greater amount of PHBV produced. However, this does not reflect the actual case that occurs in microbes as PHV is not the majority component of PHA produced but rather PHB. Recalling the dubious results we observed from the aforementioned experiment where we tested different amounts of starting substrate, the errors observed here suggest that there is a fundamental inaccuracy in the formulated system of ODEs that biases PHV biosynthesis through bktB.
Exploring the impact of phaA and bktB enzyme levels on final PHBV composition
Measured melting temperatures and inferred molar composition of PHBV
In the lab, we measured the melting temperatures of PHBV produced from strains expressed phaA and/or bktB. From the melting curves of the material, we may infer an approximate range of the molar composition of the PHBV polymer (table below). Comparing the inferred molar ratio based on literature (Anjum et al., 2016) with the surface plot (right) showing the PHBV composition from various enzyme levels, we can make a number of observations.
First, the maximal percentage of PHV in the co-polymer is approximately 30%. Compared to the significantly higher ratios predicted from our previous experiment with different propionate substrate levels, we may speculate that the enzyme level does not act as a great contributing factor to the %PHV as the amount of fatty acid supplied. Of course, this should be a tentative remark given the skeptical nature of the model.
Second, we may be able to optimize our model with the experimental data. From what we observe along the surface plot, the physiologically relevant range of PHV content (~20% max; Anjum et al., 2016) is achieved at as low as an enzyme concentration of bktB of 0.5 nM. It is worth noting that it may be due to the fact that the concentrations of other enzymes in the system (phaB, phaC are set at a “high” level of 0.1 μM, and perhaps these values should be adjusted were we able to obtain any new information on physiological conditions of the enzymes. On the other hand, the phaA enzyme level does not influence the end composition. This shows that it is imperative to refine our model to account for the phaA and bktB enzymes’ ability to react with common substrates. That is, phaA may be able to compete for propionyl-CoA from bktB.
Exploring the impact of different turnover rates for phaA and bktB on final PHBV composition
If we were to experiment with other parameters, we would also observe similar feats to the aforementioned. For instance, we may choose to use our model to predict the effects of enzyme engineering of phaA or bktB. This be in the form of modification of the turnover number or the enzyme’s catalytic capacity (kcat) or of its specificity to each of its substrates. For example, it may be possible to engineer the binding pocket of the enzyme to be less conducive for acetyl-CoA thus accentuating the preference for propionyl-CoA substrate.
Unfortunately, with the current model testing the various Michaelis constants (Km) is not possible. This is due to how the reactions producing PHB and PHV are parallel to each other (as commented on above). Thus, we decided to test a wide range of turnover rates for phaA and bktB. One of the main observations that resonates with previous simulations. The maximal predicted molar composition of PHV is shown to also be approximately 30%; the composition remains the same while the total PHBV produced increases as long as the reagents are not limiting.
Testing the dynamic model reveals its severe limitations when it does encompass the various relevant pathways for PHA biosynthesis. To optimize it would require a significant array of data. One of the main caveats in using this type of model to understand cell factory design is that it is does not account for cell growth. We represent the mixture of enzymes as a single-compartment, homogeneous system and thus cannot use it to predict growth phenotypes. For this latter purpose, it was decided to use a constraint-based modeling approach.
Methods based on constraint-based reconstruction and analysis (COBRA) stem from a set of fundamental concepts: the designation of physicochemical constraints on reaction networks, a mathematically-derived representation of cell objectives (e.g. growth and maintenance) and a genome-wide scope of metabolism (Lewis, Nagarajan and Palsson, 2012). Constraints in biochemical networks serve to generate accurate representation of the system. The main constraining factors are: substrate and enzyme availability, mass and charge balances and thermodynamics. Two critical assumptions of this type of model is that mass and energy are conserved and that the system is in steady-state (i.e. there is no internal accumulation of metabolites).
The constraints of a network are represented mathematically by a matrix delineated by coefficients representing the stoichiometries of metabolites in each reaction. For example, a coefficient of -1 indicates one molecule of the reactant is consumed in the corresponding reaction, a coefficient of 1 conversely signifies the production of one molecule and a coefficient of 0 indicates a metabolite that does not participate in the reaction. This along with constraints regarding the maximal and minimal fluxes of reactions demarcate a solution space of feasible distributions that satisfy the designated constraints, representing possible states of the microbial network.
We further define the reconstructed model with a statement of the cell or microorganism's objectives: in most cases, that would be the urge to grow as much as possible! There have been other disputed goals that the cell may wish to achieve such as the minimization of energy use (i.e. ATP storage). These objective functions are added onto the GEM to refine the possible states that are attained by the cell in the context of the goal (García Sánchez and Torres Sáez, 2014). For example, a biomass function (Feist and Palsson, 2010), a pseudo-reaction constructed to describe the rate and proportion at which precursors for biosynthesis of crucial components (e.g. amino acids) are made, can be used to predict the specific growth rate of the cell. These functions can then be maximized or minimized using optimization (e.g. linear programming).
Linear programming is a mathematical optimization method used to find the maximum or minimum of a mathematical model wherein the requirements are defined by linear relationships. The main algorithm that employs this technique - and is a staple in GEM analysis - is the flux balance analysis (FBA). In FBA, the objective function and environmental constraints (e.g. different carbon sources, available oxygen levels, etc.) are defined and the system is solved using linear programming (Orth, Thiele and Palsson, 2010). With FBA being the most basic form of analysis, there may be more than one optimal solution; these are often accounted for through more sophisticated techniques that employ mixed-integer linear programming (e.g. flux variability analysis, FVA) or even quadratic programming (e.g. minimization of metabolic adjustment, MOMA).
Using the constraint-based model, we can delve into questions concerning the holistic behavior of the microbial cell factory, especially facets such as growth and overall productivity. We may look into how the addition of new and more sustainable substrate sources, may impact the cell growth and PHA yield. We maybe able to identify a maximal yield without compromising growth. Additionally, using the COBRA Toolbox, we can also simulate the metabolic network in cases of overexpression or knockout of certain pathway enzymes such succinyl-CoA synthase (sucCD) and investigate it would have predicted effects on product yield in our strain of E. coli.
For the constraint-based modeling of growth and PHA biosynthesis in our recombinant strain of E. coli , we derived our GEM from iJO1366, the most comprehensive GEM of E. coli K-12 to date (Orth et al., 2011), and added onto it the heterologous pathway for PHA biosynthesis encoded by the pha operon.
To carry out these operations computationally, we used the COBRA Toolbox (Schellenberger et al., 2011) created by the lab of Bernhard Ørn Palsson at University of San Diego. Built for the MATLAB suite, the toolbox can be used to implement FBA as well as a plethora of other powerful (and awesome) functions. Using the toolbox, we added in the biochemical pathway for PHBV biosynthesis to develop the basis for projections into how we may optimize PHBV production in the E. coli chassis.
Using the COBRA Toolbox to reconstruct and analyze a genome-scale model
To learn more about the COBRA Toolbox developed by the Palsson Group, please click here!
To first investigate how to best optimize the growth of E. coli as a biological host for PHA production, uptake of different carbon sources into the iJO1366 GEM were compared in silico using the COBRA toolbox. In order to best characterize the growth phenotype (denoted by setting the biomass function as the objective in FBA) associated with uptake of the carbon source, separate simulations were operated per single source. This was carried out by setting the uptake of the carbon source of interest to -20 mmol/gDW-h (note the negative coefficient represents an influx of material into the system) while holding associated uptake rates of all other carbon sources at zero. The predicted specific growth rates were highest when grown on glucose or fructose. Based on the model, the PHB biosynthesis rates were proportional to the specific growth rates; thus, only the former is shown below.
The main proposed sustainable source of carbon is pot ale, by-products obtained locally from distilleries. The three main components that pot ale comprises are, from highest to lowest in abundance, acetic acid, lactic acid and propionic acid (Graham et al., 2012). Each component was tested independently (as their conjugate bases) at the aforementioned -20 mmol/gDW-h uptake rate; the results from the FBA showed that at equimolar amounts, propionate yielded the relative highest growth rate followed subsequently by lactate and acetate.
The results of pot ale utilization were initially dubious, which showed that the overall growth rate would be higher than with glucose. We deduced that the apparent outlier was due to the fact that pot ale was represented in the simulation by setting the uptake rates of each individual component (lactate, acetate and propionate) to -20 mmol/gDW-h. We then sought to normalize and weight the relative abundance of each compound in the pot ale. Using data on the relative composition of the by-product, the relative abundance of acetic acid, lactic acid and propionic acid were used as “weights” to compute the relative uptakes of the individual components, normalizing for the amount of material taken into the system per unit of pot ale.
In silico predictions of PHB production rate with various substrates
To affirm our predictions, we compiled experimental data on the growth of our strain in two conditions: a 3% glucose + M9 media vs. a 1% glucose + M9 media + pot ale. The pot ale was provided kindly by the local distillery in Edinburgh, UK.
Unfortunately, the time points along the growth curve due to the fact that the experiments were done several weeks apart and that our schedule did not permit as consistent collection of data. Thus, we may not be able to observe the various phases of microbial growth from the data. However, we may be able to note the maximal growth of the culture in the various media.
From what is shown, it is evident that glucose is the more significant contributor to biomass and growth as was predicted by the model. The conditions wherein M9 was supplemented with 1% glucose and pot ale showed that pot ale was not entirely conducive to growth. This should be verified with a control where the culture was grown only on pot ale. This experiment was initially carried out but terminated due to difficulties.
Growth curve on pot ale vs. glucose in M9 media
Taking a step further, we sought to predict the production rates of the co-polymer PHBV from various substrates. As was reflected in the above data, PHBV production rates among the lowest when the culture was grown on acetate/propionate and pot ale media. However, supplementation of the pot ale with glucose showed a significant increase in the rate of PHBV production, which matches what the data we collected demonstrated.
A study by Bhatia et al. looked into the effect of supplementation by succinate and glycerol. In their conclusion, they purported that glycerol was a strong contributing factor to expanded capacity for biomass growth, while propionate in the presence of succinate influenced the PHV content of the copolymer. However, these observations applied to a strain wherein the gene sucCD encoding succinyl-CoA synthase was overexpressed. We would like to test whether this may be expected in a strain without the over expression profile.
In silico predicted rates of PHBV production with various substrate combinations
We observed an increased production rate of PHBV when acetate and propionate was supplemented with glycerol instead of glucose; however, the projected rate did not increase when pot ale was the main substrate. Furthermore, when succinate was the main substrate with supplementation by glucose, the projected rate of synthesis was the highest. We speculate that this may be because glycerol and succinate may have roles in restoring the redox potential in the metabolism of the organism.
First, glycerol is known as a better carbon source for making reducing chemicals as its catabolism can restore the reducing equivalents in the cell. Also, succinate’s conversion to fumarate by succinate dehydrogenase is also a main contributor to restoring available NADH. From this, we surmise that redox may become a main bottleneck in PHA biosynthesis as phaB reductase enzyme will heavily depend on these conditions.
Additionally, overexpression of enzymes such as succinyl-CoA synthase may not favor PHA biosynthesis. Because PHA production does not favor the survival of the cell, it may prioritize other pathways to ensure maximal growth. Thus, the overexpression of the enzyme may in fact have the reverse effect as the reaction is reversible. Originally, increase in the enzyme activity was attempted using the COBRA Toolbox but showed no apparent differences from a native WT scenario. Thus, we decided to observe what changes would occur were the enzyme to be grossly deleted (representing a gene knockout).
In silico flux distribution of E. coli metabolism under sucCD deletion
Mapping out the redistributed flux showed that in order to promote growth, the cell would attempt to channel material through the glyoxylate cycle instead of the now incomplete TCA cycle. Of course, this would not be favorable as it would compromise on the production of GTP for energy. We were quite surprised to note that the reduction in the growth rate was less than 1%.
However, this turned out not to be the case when we grew various strains overexpressing sucAB (SC2) and sucCD (SC3). In fact, the growth of the strains were compromised when the TCA enzymes were overexpressed (seen in the plot).
With the presented data, we cannot yet discern between whether or not this impeded growth is due to the increased activity of the enzymes or the metabolic burden introduced by the plasmid. To investigate further, we should also test the capacity of the various strains to utilize propionate or the ability to synthesize higher titers of PHBV.
Growth of E. coli with various overexpression profiles (sucABCD)
For those interested in the code behind the implementation, please click here to access our GitHub repository!