Metabolic engineering and synthetic biology in general are powerful areas of research, but they can only realize a quantum leap in breakthroughs when computational models involved in the Design-Build-Test-Learn engineering paradigm are standardized and implemented in the field. As part of our iGEM project, the models developed are motivated by different objectives and approaches in order to inform and be shaped by our endeavors in the lab. To do so, two different computational models were used to determine the best approaches to improve PHA yield and productivity in recombinant E. coli.
One of the ways to model a system in silico is through the dynamic model, which many systems biologists should be familiar with. By defining a pathway (such as the heterologous PHA biosynthetic pathway) or metabolic network as a series of ordinary differential equations (ODE), dynamic behaviors in the system over time may be predicted. A considerable limitation arises from the need for highly detailed information to parameterize the kinetics of the pathway; therefore, this type of model may be difficult to implement.
On the other hand, an organism as well-studied as E. coli enjoys a vast wealth of literature surrounding its genome and metabolic network. As a result, stoichiometric or constraint-based models such as genome-scale models (GEM) are available to be used for in silico simulations without necessitating the rigorous characterization associated with dynamic or kinetic models. Using this, we can investigate problems such as the best conditions to grow a recombinant SBM E. coli and how to improve the titer of PHA.
A metabolic network such as that of E.coli may be viewed in mathematical terms: a system of equations that follow the transcriptional, translational and enzymatic activity within the organism. However, for the purposes of looking at the PHA biosynthetic pathway, we will be looking predominantly at the biochemical reactions from an enzyme kinetics perspective. The two main ways to represent reactions and their rates are mass-action and Michaelis-Menten kinetics. The latter is the most common way to represent the enzyme kinetics of a biochemical pathway; thus, for our purposes we will use it as the main paradigm for further derivation.
Because the PHA biosynthetic pathway utilizes acetyl-CoA - a crucial component in core metabolism of the organism - as an initial substrate, we must ensure that the model also contextualizes PHA production in the scope of other relevant pathways: the two main ones are glycolysis and the TCA cycle. If we plan to consider the use of alternative feedstock with high quantities of short-chain fatty acids (acetate and propionate), it would be ideal to also provide information regarding the assimilatory pathway for these substrates. Having the most comprehensive dynamic model allows understanding of the control structure of the system (Moreno-Sánchez et al., 2008) as well as identifying bottlenecks arising from elements such as energy and reducing power (NADPH/NADH) for anabolic and catabolic reactions. However, building a dynamic model requires a wealth of information, which most of time is unavailable and requires extrapolation and inference.
To make predictions regarding the metabolic capacity of engineered strains of E. coli, for instance, the strain wherein the Sleeping beauty mutase operon is activated, allowing for in vivo biosynthesis of propionate (Srirangan et al., 2016; Gonzalez-Garcia et al., 2017) that may be used for producing the PHBV co-polymer. Understanding how a combination of feedstock such as sugars and fatty acids can influence the overall yield and composition of our PHBV end product can be an enabling tool; however, to achieve this goal is very labor- and resource-intensive. For this purpose, it was deemed more appropriate to use a constraint-based modeling approach (see below).
Understanding the relevant pathways for PHA biosynthesis
Using the dynamic model, we can investigate problems related to the behavior of the system and the consequences in production. The questions that we seek to investigate using this model are related most closely to the amount of feedstock (e.g. acetate and propionate) and their relative proportions that would achieve a desired end composition of PHBV co-polymer (Horng et al., 2013). In addition, we also used the model to theoretically survey the effects that can be achieved towards this end through the engineering of enzymes (that is, changing the binding affinity and catalytic activity). From this, we may also be able to speculate on the effect of change in gene arrangement in the operon (Hiroe et al., 2012). The resulting differences in transcriptional levels of the encoded enzymes can be represented in the system as different amounts of enzymes.
To make a cursory exploration into how we may approach the production of PHBV with tailor-made composition and properties, we deemed that the dynamic model may even be sufficient if we were to limit the scope to only the PHA biosynthetic pathway while we would effectively "black box" the peripheral metabolic network. That is, we will define the pathway of three subsequent reactions, condensing acetyl-CoA and propionyl-CoA into polymers of PHBV.
For the model, we must assume a number of aspects. First, we have to assume that the parallel pathways contributing to the PHB and PHV components in the co-polymer do not interact. While it is most likely not the case, representation of a pathway where there is competition between substrates for a common enzyme may difficult to do so via a Michaelis-Menten derivation without more data. On a first pass, we assume that this distinction is condonable as long as the differences in the parameters we are testing are significant. Second, we must provide an influx of material into the biosynthetic pathway. Because we are not intensively modeling the pathways for glycolysis, TCA and fatty acid assimilation (a problem for another day after iGEM!), we assume the introduction of acetyl-CoA and propionyl-CoA precursors can be represented via a heuristic means. The third assumption builds upon this in that, the amount of starting precursors that we test for simulations of the dynamic models are assumed to be within physiological ranges and would not be detrimental to the cell. Fourth, we assume that the system is homogeneous (no spatial element or compartments). Finally, we assume that the levels of enzymes are conserved throughout the experiment (note that the concentration of available enzyme does not fluctuate as would be the case in a mass-action representation that involves the formation of enzyme-intermediate complexes).
The biosynthetic pathway towards producing PHBV co-polymer
Based on these assumptions, a model was developed expanding on the work done by the Stanford-Brown 2015 iGEM team. Their model originally outlines the system using irreversible Michaelis-Menten kinetics for PHB biosynthesis. However, understanding that the catalytic activity of our encoded PHA enzymes can go both ways (for example, the β-ketothiolase can both form acetoacetyl-CoA from 2 acetyl-CoA molecules via a Claisen condensation or can degrade the acetoacetyl-CoA back in the reverse direction), we deemed it was more appropriate to represent the kinetics of the system with the reversible form of the Michaelis-Menten equation (Imperial and Centelles, 2014). This has been shown in past literature to be preferred for in vivo scenarios. This applied to the enzymes encoded by phaA, bktB and phaB; however, the phaC enzyme is excluded as literature has shown the synthase activity to be unidirectional. Another enzyme is responsible for the degradation of PHAs into usable subunits; however, since this element is not present in E. coli, we can effectively disregard this.
Defining the dynamic model using a system of ordinary differential equations (ODEs)
The reactions regarding the condensation and reduction of acetyl-CoA and propionyl-CoA are two-substrate reactions; thus, we decided to represent this mathematically with the corresponding formulae (Cornish-Bowden 1993) shown above. The Michaelis constants (Km) factor in the enzymes' relative affinity to each substrate. These are also considered in the enzyme's catalytic efficency (ks, kp ). You may find further details on the chosen parameters in our Github repository.
In the case of PHA synthase, since it directly incorporates 1 equivalent of 3HB-CoA or 3HV-CoA into a PHBV polymer, we represented this using a traditional irreversible Michaelis-Menten equation. The aforementioned means to represent influx of material is derived also from the Stanford-Brown 2015 model. In their model, they incorporated an enzyme called acetate synthase (encoded by acs) that produces acetyl-CoA from acetate and coenzyme A. We adapted this formula to represent the assimilation of both acetate and propionate and their subsequent conversion into their CoA-ligated forms.
To read more about the dynamic model developed by the Stanford-Brown 2015 team, please click here!
Assessing the effect of increasing propionate vs acetate substrate concentration on final PHBV composition
Exploring the impact of phaA and bktB enzyme levels on final PHBV composition
Exploring the impact of different turnover rates for phaA and bktB on final PHBV composition
Moreno-Sánchez, R., Saavedra, E., Rodríguez-Enríquez, S. and Olín-Sandoval, V. 2008. Metabolic Control Analysis: A Tool for Designing Strategies to Manipulate Metabolic Pathways. Journal of Biomedicine and Biotechnology, 2008, pp.1-30.
Srirangan, K., Liu, X., Tran, T., Charles, T., Moo-Young, M. and Chou, C. 2016. Engineering of Escherichia coli for direct and modulated biosynthesis of poly(3-hydroxybutyrate-co-3-hydroxyvalerate) copolymer using unrelated carbon sources. Scientific Reports, 6(1).
Gonzalez-Garcia, R., McCubbin, T., Wille, A., Plan, M., Nielsen, L. and Marcellin, E. 2017. Awakening sleeping beauty: production of propionic acid in Escherichia coli through the sbm operon requires the activity of a methylmalonyl-CoA epimerase. Microbial Cell Factories, 16(1).
Horng, Y., Chien, C., Huang, C., Wei, Y., Chen, S., Lan, J. and Soo, P. (2013). Biosynthesis of poly(3-hydroxybutyrate-co-3-hydroxyvalerate) with co-expressed propionate permease (prpP), beta-ketothiolase B (bktB), and propionate-CoA synthase (prpE) in Escherichia coli. Biochemical Engineering Journal, 78, pp.73-79.
Hiroe, A., Tsuge, K., Nomura, C.T., Itaya, M. and Tsuge, T., 2012. Rearrangement of gene order in the phaCAB operon leads to effective production of ultra-high-molecular-weight poly [(R)-3-hydroxybutyrate] in genetically engineered Escherichia coli. Applied and environmental microbiology, pp.AEM-07715.
Imperial, S. and Centelles, J. 2014. Enzyme Kinetic Equations of Irreversible and Reversible Reactions in Metabolism. Journal of Biosciences and Medicines, 02(04), pp.24-29.
Cornish-Bowden, A. 1993. Enzyme specificity in reactions of more than one co-substrate. Biochemical Journal, 291(1), pp.323.2-324.
Methods based on constraint-based reconstruction and analysis (COBRA) stem from a set of fundamental concepts: the designation of physicochemical constraints on reaction networks, a mathematically-derived representation of cell objectives (e.g. growth and maintenance) and a genome-wide scope of metabolism (Lewis, Nagarajan and Palsson, 2012). Constraints in biochemical networks serve to generate accurate representation of the system. The main constraining factors are: substrate and enzyme availability, mass and charge balances and thermodynamics. Two critical assumptions of this type of model is that mass and energy are conserved and that the system is in steady-state (i.e. there is no internal accumulation of metabolites).
The constraints of a network are represented mathematically by a matrix delineated by coefficients representing the stoichiometries of metabolites in each reaction. For example, a coefficient of -1 indicates one molecule of the reactant is consumed in the corresponding reaction, a coefficient of 1 conversely signifies the production of one molecule and a coefficient of 0 indicates a metabolite that does not participate in the reaction. This along with constraints regarding the maximal and minimal fluxes of reactions demarcate a solution space of feasible distributions that satisfy the designated constraints, representing possible states of the microbial network.
We further define the reconstructed model with a statement of the cell or microorganism's objectives: in most cases, that would be the urge to grow as much as possible! There have been other disputed goals that the cell may wish to achieve such as the minimization of energy use (i.e. ATP storage). These objective functions are added onto the GEM to refine the possible states that are attained by the cell in the context of the goal (García Sánchez and Torres Sáez, 2014). For example, a biomass function (Feist and Palsson, 2010), a pseudo-reaction constructed to describe the rate and proportion at which precursors for biosynthesis of crucial components (e.g. amino acids) are made, can be used to predict the specific growth rate of the cell. These functions can then be maximized or minimized using optimization (e.g. linear programming).
Linear programming is a mathematical optimization method used to find the maximum or minimum of a mathematical model wherein the requirements are defined by linear relationships. The main algorithm that employs this technique - and is a staple in GEM analysis - is the flux balance analysis (FBA). In FBA, the objective function and environmental constraints (e.g. different carbon sources, available oxygen levels, etc.) are defined and the system is solved using linear programming (Orth, Thiele and Palsson, 2010). With FBA being the most basic form of analysis, there may be more than one optimal solution; these are often accounted for through more sophisticated techniques that employ mixed-integer linear programming (e.g. flux variability analysis, FVA) or even quadratic programming (e.g. minimization of metabolic adjustment, MOMA).
Using the constraint-based model, we can delve into questions concerning the holistic behavior of the microbial cell factory, especially facets such as growth and overall productivity. We may look into how the addition of new and more sustainable substrate sources, may impact the cell growth and PHA yield. We maybe able to identify a maximal yield without compromising growth. Additionally, using the COBRA Toolbox, we can also simulate the metabolic network in cases of overexpression or knockout of certain pathway enzymes such succinyl-CoA synthase (sucCD) and investigate it would have predicted effects on product yield in our strain of E. coli.
For the constraint-based modeling of growth and PHA biosynthesis in our recombinant strain of E. coli , we derived our GEM from iJO1366, the most comprehensive GEM of E. coli K-12 to date (Orth et al., 2011), and added onto it the heterologous pathway for PHA biosynthesis encoded by the pha operon.
To carry out these operations computationally, we used the COBRA Toolbox (Schellenberger et al., 2011) created by the lab of Bernhard Ørn Palsson at University of San Diego. Built for the MATLAB suite, the toolbox can be used to implement FBA as well as a plethora of other powerful (and awesome) functions. Using the toolbox, we added in the biochemical pathway for PHBV biosynthesis to develop the basis for projections into how we may optimize PHBV production in the E. coli chassis.
Using the COBRA Toolbox to reconstruct and analyze a genome-scale model
To learn more about the COBRA Toolbox developed by the Palsson Group, please click here!
In silico predictions of PHB production rate with various substrates
In silico flux distribution of E. coli metabolism under sucCD deletion
Lewis, N., Nagarajan, H. and Palsson, B. 2012. Constraining the metabolic genotype–phenotype relationship using a phylogeny of in silico methods. Nature Reviews Microbiology, 10(4), pp.291-305.
García Sánchez, C. and Torres Sáez, R. 2014. Comparison and analysis of objective functions in flux balance analysis. Biotechnology Progress, 30(5), pp.985-991.
Feist, A. and Palsson, B. 2010. The biomass objective function. Current Opinion in Microbiology, 13(3), pp.344-349.
Orth, J., Thiele, I. and Palsson, B. 2010. What is flux balance analysis?. Nature Biotechnology, 28(3), pp.245-248.
Orth, J., Conrad, T., Na, J., Lerman, J., Nam, H., Feist, A. and Palsson, B. 2011. A comprehensive genome-scale reconstruction of Escherichia coli metabolism--2011. Molecular Systems Biology, 7(1), pp.535-535.
Schellenberger, J., Que, R., Fleming, R., Thiele, I., Orth, J., Feist, A., Zielinski, D., Bordbar, A., Lewis, N., Rahmanian, S., Kang, J., Hyduke, D. and Palsson, B. 2011. Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nature Protocols, 6(9), pp.1290-1307.
For those interested in the code behind the implementation, please click here to access our GitHub repository!