Team:Rice/Model/Code

Modeling

Model Overview



Our modeling team accomplished three main tasks: first, we extended a model of orthogonal translation from the literature [1] to incorporate RNA polymerases. In the process, we fit parameters describing how often host cells transcribe RNA polymerase genes. Next, we conducted a sensitivity analysis to show that the PORTAL system can better maintain circuit protein output under metabolic changes than can a host-controlled circuit. Finally, we determined the optimal transcription rates for the components of the PORTAL system in order to allow the system to best buffer against changes in host metabolism. These results will help us improve the PORTAL system to allow even more consistent expression in different hosts.


Design



We had three main constraints in deciding the design of our model. First, the model should allow both orthogonal transcription and translation. Second, to account for different host species, the model should be tunable. Finally, since we need to compare our model to wet lab data, the model must produce physiological output. We could not find a model design in the literature to address all of these design constraints at once, but a model by [1] came closest to satisfying our constraints. However, they did not include RNA polymerases or orthogonal transcription, so we extended their model by incorporating these essential elements.

Model



Our model is based on the enzyme kinetics model of [1] and consists of linked biochemical reactions describing over twenty different molecular species. These reactions describe the synthesis and utilization of molecules in the cell, from ribosomes to RNAs to metabolic enzymes.

Various parts of our model
Figure 1: Main components of the model. Click on the different portions of the image to explore the details!

The model includes metabolism, host expression, circuit expression, and orthogonal expression. The metabolism section describes how the host cell import nutrients and converts them to energy, while host expression explains how the cell transcribes and translates its own proteins. Circuit expression addresses how the cell transcribes and translates circuit proteins without the PORTAL system (under host control), and orthogonal expression invokes the components of the PORTAL system to express circuit proteins.

Metabolism Host Expression Circuit Expression Orthogonal Expression

RNA Polymerase Inclusion



Our model for RNA polymerases is based on a mechanism where the polymerase binds reversibly to a promoter and the complex slowly generates an mRNA. The chemical scheme looks like:

The promoter reversibly binds a RNA polymerase and then the polymerase transcribes the gene.

Here the RNA polymerase binds reversibly to the promoter and then transcribes the gene to produce mRNAs. The situation is actually more complicated since the same polymerase can bind to the promoters of a large number of genes. However, given a few assumptions, we can find an analytic solution for transcription rate as a function of RNA polymerase concentration. First, we assume a quasi-steady state where the equilibrium between polymerases and promoters occurs much more quickly than transcription. This assumption is justified since free RNA polymerases in E. coli are in rapid equilibrium with non-specifically bound RNA polymerases at a time scale of a few milliseconds [2]. Second, we assume that the binding constants for all promoters are equal. This assumption does not mean that all promoters are equally active; promoters at different concentrations in the cell will still show differences in transcription. Under these two assumptions, we can solve the system exactly to figure out transcription rate as a function of the number of host RNA polymerases. For more technical details about this derivation, click here. At the end of this procedure, we still need to fit two parameters related to transcription of the RNA polymerase itself: the maximum transcription rate of the host RNA polymerase gene (wP) and the energy threshold for this transcription (oP).

Parameter Estimation



Estimating parameters describes the process of finding values to ensure the model is as close to data from real bacterial cells as possible. Since most of our parameter values come from [1], we use the same data set that they use to fit the two extra parameters. This data set, originating in [4], includes growth rates and RNA to protein ratios for a variety of environmental conditions. To estimate the error, we simulate the system until it reaches steady-state for each of the conditions in the data set and find the square of the difference between the data and the simulation results. A simple way to estimate parameters would be to simply search for the values that result in the minimum error. However, this approach can hide important information about the sensitivity of the model to the parameters or relationships between the parameters.

The technique we use is called Markov chain Monte Carlo parameter estimation (MCMC). Rather than finding single values for parameters, MCMC samples the entire parameter space and determines a probability distribution for the parameters. This probability distribution gives the chance that any given parameter values explain the data. As a result, MCMC better shows the model's sensitivity to parameter changes and any relationships between the parameters. The actual algorithm we use is called the Delayed Rejection Adaptive Metropolis algorithm (DRAM), and we used the implementation from [2] which applies several techniques to speed up the sampling. With this algorithm, we generated 10,000 estimates of the parameters and removed the first 2,000 estimates (our burn-in time) since the algorithm needs some time before it reaches the correct distribution. We can visualize this distribution by plotting the probability density in the parameter space.

Probability distribution for possible pairs of parameter values
Figure 2: Probability distribution for possible values of the parameters.

The distribution is shown as a heat map, where different colors imply different values for the parameters. The pair with the highest probability corresponds to the likeliest values. In this case, the maximum transcription rate is most likely 2.975 per minute, and the transcription energy threshold is most likely 0.7 energy molecules. We used these values in the rest of our simulations.

Sensitivity Analysis



Our overall goal in this section is to test whether expressing proteins using the orthogonal system leads to more consistent expression across strains. The wet lab group tested this hypothesis directly, but modeling of the situation can provide valuable insight since we have fine control over the details of our model. This same level of control is difficult to achieve in the wet lab.

The simplest way to test our hypothesis through the model would be to determine circuit protein production in different host strains, as specified by their unique parameter values. Unfortunately, finding these parameters would require extensive experimentation. Instead of finding specific values for the parameters, we vary model parameters and determine the effect of these changes on the protein production. Specifically, we address three parameters: φEm, and kP.

φE is the efficiency of energy conversion; we chose this parameter since we expected that different species would have different abilities to metabolize nutrients. In addition, changing φE affects the energy available to the cell, providing a possible model for starvation.

γm gives the maximum translation rate. Varying this parameter directly impacts translation, but it also leads to changes in the growth rate. As we expect different species to have different natural growth rates, modifying γm allows us to approximate this difference between species.

kP provides a measure for the number of active genes in the host cell. It also affects the global rate of transcription in the cell. We expect different species to have differing transcription rates and numbers of active genes; thus, modifying kP approximates this difference.

To quantify changes in protein production, we use sensitivity analysis. Sensitivity is defined as the ratio between the percent change of the output and the percent change of the input. We can calculate this quantity by changing a parameter (φE, γm, or kP) by a certain percent and then measuring the resulting change in the steady-state values of the circuit protein count per cell. To ensure that differences in sensitivity are only due to changes in our parameters, we perform controlled comparisons, where we ensure identical initial output between the three systems (host-controlled expression, the UBER circuit with only orthogonal transcription, and the PORTAL system with orthogonal ribosomes and T7 RNA polymerases). After we make sure that the circuit protein concentrations (our outputs) are the same, we can then modify φE, γm, or kP and measure the resulting change in circuit protein concentration.

Sensitivity of each circuit to changes in energy conversion efficiency
Figure 3: The PORTAL system is less sensitive than the host-controlled system to changes in metabolism.

When we calculate the sensitivity for different values of the energy conversion efficiency, we find that the PORTAL system, including both orthogonal ribosomes and T7 RNA polymerases, was the least sensitive over all conditions tested. This finding implies that PORTAL is better than a host-controlled circuit at maintaining circuit protein output when the metabolic ability of the host changes.

Sensitivity of each circuit to changes in maximum translation rate
Figure 4: The PORTAL system is sometimes less sensitive than the host-controlled system to changes in translation.

Here we plot the sensitivity for different maximum translation rates; changing the translation rate equally affects both circuit and host translation. We find that the PORTAL system was the least sensitive for lower translation rates, but host control becomes the least sensitive for large enough translation rates.

Sensitivity of each circuit to changes in number of active genes
Figure 5: The PORTAL system is always more sensitive than the host-controlled system to changes in the number of active host genes.

As we vary the number of active host genes, we see that the sensitivity of the host-controlled system is always less than that of the PORTAL system. The PORTAL is therefore not very effective at buffering against this kind of change.

Overall, we find that expression using the PORTAL system is more resistant to changes in metabolism and translation, but not as resistant to changes in transcription. This property reflects a complicated balancing act between the energy allocated to cellular growth (or replication) and energy allocated to cellular expression. When more energy is available to the cell - i.e. when the host metabolism is changed - the PORTAL system ensures that this excess energy is funnelled into translation of circuit proteins rather than simply replication or growth, as shown in figure 3. However, this ability comes with drawbacks: the PORTAL system expresses proteins orthogonally to the host and therefore cannot directly affect host transcription. Thus, it cannot effectively buffer against host transcriptional changes, as we see in figure 5. In contrast, the PORTAL system can somewhat affect host translation by requisitioning host ribosomal proteins, giving it the ability to indirectly buffer against changes in host translation. This ability leads to PORTAL's partial resistance against host translational change, as displayed in figure 4. All of these statements can be tested in the lab by assaying fluorescence of a reporter under metabolic, translational, and transcriptional changes. Thus, our model provides testable hypotheses to better understand the buffering ability of orthogonal expression systems.

Circuit Optimization



In this section, we want to determine the optimal transcription rates for the orthogonal 16S rRNA and the T7 RNA polymerase to minimize the sensitivity of the system. Since the PORTAL system buffered best against metabolic differences, we will try to maximize this ability even further. In a biological context, this process is equivalent to choosing the correct repressor in our circuit. From the viewpoint of our model, we are trying to minimize the average sensitivity of circuit protein production over all metabolic conditions by varying transcription rates of the orthogonal components.

Minimizing sensitivity by modifying transcription rate of T7 RNAP
Figure 6: Average sensitivity of each system to metabolic changes for different values of the T7 RNA polymerase transcription rate.

As long as we maintain polymerase transcription below 15 mRNAs/min, the PORTAL system is less sensitive than the host-controlled system. In fact, as we reduce the transcription rate even further, we obtain lower average sensitivities.

Minimizing sensitivity by modifying transcription rate of orthogonal 16S rRNA
Figure 7: Average sensitivity of each system to metabolic changes for different values of the orthogonal 16S rRNA transcription rate.

We can maintain this transcription rate below 0.8 rRNAs/min to further reduce the sensitivity of the PORTAL system. For both the orthogonal 16S rRNAs and the T7 RNA polymerase, we can ensure our transcription rates stay in the range we identified by adding an appropriate repressor to the PORTAL system. Our model therefore provides a simple method to determine how strongly we need to express the orthogonal components of our circuit.

Conclusions and Future Work



Our model provides a general framework from which to evaluate and test orthogonal expression systems, including both orthogonal transcription and translation. In addition, it also provides several testable hypotheses; for example, the model predicts the PORTAL system is less sensitive than the host-controlled system to changes in the substrate that the bacteria grows on. We can directly test this hypothesis in the lab by comparing fluorescence reporter levels produced by PORTAL and a host-controlled system when grown in different media. In this case, we would expect the fluorescence to be more consistent for the PORTAL system than the host-controlled system.

References



[1] Darlington, A.P.S., Kim, J., Jiménez, J.I., & Bates, D.G. (2018). Dynamic allocation of orthogonal ribosomes facilitates uncoupling of co-expressed genes. Nature Communications, 9, 695.

[2] Haario, H., Laine, M., Mira, A. & Saksman, E. (2006). DRAM: Efficient adaptive MCMC. Statistics and Computing, 16, 339-354.

[3] Patrick, M., Dennis, P.P., Ehrenberg, M., & Bremmer, H. (2015). Free RNA polymerase in Escherichia coli. Biochimie, 119, 80-91.

[4] Scott, M., Gunderson, C. W., Mateescu, E. M., Zhang, Z. & Hwa, T. (2010). Interdependence of cell growth and gene expression: origins and consequences. Science, 330, 1099–1102.

[5] Weiße, A. Y., Oyarzún, D. A., Danos, V., & Swain, P. S. (2015). Mechanistic links between cellular trade-offs, gene expression, and growth. Proc. Natl. Acad. Sci. USA 112, E1038–E1047