Team:Rice/Model/Polymerase

Adding RNA Polymerases

Derivation


We consider a simple model of RNA polymerase activity, described by the reaction scheme $$\ce{d_i + p <=>[f_i][r_i] c_i ->[\omega_i] p + d_i + m_i}$$ Here $d_i$ is the free promoter on the DNA, $p$ is the free RNA polymerase, $c_i$ is the complex between DNA and RNA polymerase, and $m_i$ is the finished mRNA. The index $i$ can, again, represent any of the protein components. The constants $f_i$, $r_i$, and $\omega_i$ are the rates of binding, unbinding, and transcription, respectively.

Used as written, this scheme leaves a large number of constants to fit (at least three for each type of mRNA simulated), many of which are hard to determine by experiment or simulation. We therefore make a few approximations to simplify the model. First, we assume that the binding/unbinding equilibrium is established much faster than the transcription reaction proceeds (a quasi-steady-state approximation). In other words, $r_i \gg \omega_i$. We also assume that $c_i = 0$ at the beginning. Under these assumptions, we find that \begin{equation*} \begin{aligned} c_i &= k_i p d_i \\ D_i &= c_i + d_i \\ P_0 &= p + \sum_i c_i \end{aligned} \end{equation*} where $k_i = \frac{f_i}{r_i}$, $D_i$ is the initial amount of promoter for each gene, and $P_0$ is the initial amount of RNA polymerase protein. The first equation is the binding equilibrium; the other two express conservation of promoters and of polymerase. The rate of mRNA production is simply $\omega_i c_i$ for each gene.

Although this is just a system of algebraic equations, it is nonlinear, and with a different $k_i$ for every gene it has no convenient closed-form solution. We therefore make another assumption: that $k_i$ is the same constant for all genes, i.e. $k_i = k$. In other words, all promoters bind RNA polymerase equally well. Although this assumption seems biologically unrealistic, we can redefine $D_i$ to be scaled by the activity of the promoter. As long as $D_i$ takes on this role, the assumption should be reasonable.
In that case, we can solve the equations to get \begin{equation*} c_i = D_i \left( \frac{1+k(P_0+D) - \sqrt{4 k P_0 + (1+k(D-P_0))^2}}{2 D k} \right) \end{equation*} where $D = \sum_i D_i$ is the total amount of active promoters. We call the expression in parentheses the 'polymerase activity'. The parameters $k$ and $D$ in this equation are relatively easy to relate to biological properties: $k$ is essentially the average transcription initiation rate, measured in initiations per minute per gene, and $D$ is essentially the number of genes in a bacterial cell. For E. coli, $k$ is approximately 20 per minute per gene [2], and $D$ is about 4494 genes [1].
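
As a cross-check on the closed form, the following Python sketch evaluates the polymerase activity and verifies that the resulting $c_i$ satisfy the equilibrium and conservation equations. With $k_i = k$, summing $c_i = D_i k p / (1 + k p)$ over genes and substituting $p = P_0 - C$, where $C = \sum_i c_i$, gives the quadratic $k C^2 - (1 + k(P_0 + D)) C + k D P_0 = 0$; the expression above is its smaller root divided by $D$. The function names and the example promoter pools are illustrative assumptions and are not taken from the original model code.

```python
import math


def polymerase_activity(P0, D, k):
    """Closed-form polymerase activity, i.e. the ratio c_i / D_i.

    P0: total RNA polymerase, D: total active promoters, k: f/r.
    """
    if D <= 0 or k <= 0:
        raise ValueError("D and k must be positive")
    b = 1.0 + k * (P0 + D)
    disc = 4.0 * k * P0 + (1.0 + k * (D - P0)) ** 2
    return (b - math.sqrt(disc)) / (2.0 * D * k)


def max_residual(P0, D_list, k):
    """Largest violation of c_i = k * p * d_i for the closed-form solution.

    D_list holds hypothetical per-gene promoter amounts D_i.
    """
    D = sum(D_list)
    activity = polymerase_activity(P0, D, k)
    c = [D_i * activity for D_i in D_list]              # complexes c_i
    p = P0 - sum(c)                                     # free polymerase
    d = [D_i - c_i for D_i, c_i in zip(D_list, c)]      # free promoters d_i
    return max(abs(c_i - k * p * d_i) for c_i, d_i in zip(c, d))


# Example with the values quoted in the text (k = 20, D = 4494);
# the split of D into three pools is made up for illustration.
example_residual = max_residual(2000.0, [1000.0, 2000.0, 1494.0], 20.0)
```

A residual close to zero indicates that the closed form is consistent with the three equations it was derived from.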

Figure: Polymerase activity increases linearly with the total number of RNA polymerases, reaches 1 at 4494, and then levels off.

Now, the exact polymerase activity is costly to evaluate in a simulation because it involves a square root, which is typically an expensive operation. The overall behavior of the term is to rise linearly with $P_0$ and flatten out at $P_0 = 4494$. To approximate this, we use $f(P_0) = \min\left(\frac{P_0}{4494}, 1\right)$, where $f(P_0)$ is the approximate polymerase activity. Since the rate of production of mRNAs by transcription is equal to the product of $\omega_i$ and $c_i$, we have \begin{equation*} \frac{d m_i}{dt} = \omega_i D_i f(P_0) \end{equation*}

Although it appears that both $\omega_i$ and $D_i$ must be fit from data, the product of the two has already been fit by [3] in the base model. We therefore assume that \begin{equation*} \omega_i D_i = \omega_i \frac{E_c}{E_c + o_i} \end{equation*} where the parameters on the right-hand side ($\omega_i$, the maximum transcription rate, and $o_i$, the transcription threshold) have already been found for $i = T,E,R,H$. The final expression for mRNA production is therefore \begin{equation*} \frac{d m_i}{dt} = \omega_i f(P_0) \frac{E_c}{E_c+o_i} \end{equation*} and the only parameters we still have to determine are $\omega_P$ and $o_P$, the maximum transcription rate and the transcription threshold of the RNA polymerase gene, respectively.

To account for RNA polymerase activity in the differential equations, we let $\alpha = f(P_0)$ and multiply the transcription rates by this factor. The differential equation for any of the mRNAs then reads \begin{equation*} \frac{d m_i}{dt} = \omega_i \alpha \frac{E_c}{E_c+o_i} + \dots \end{equation*} where "$\dots$" represents the other terms related to translation or growth.
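
To get a feel for how close the piecewise-linear approximation is to the exact expression, a short Python sketch can evaluate both side by side and compute the transcription term of the mRNA equation. The defaults $k = 20$ and $D = 4494$ are the values quoted in the text; the function names are hypothetical, and $\omega_i$, $o_i$, and $E_c$ are left as caller-supplied placeholders because their fitted values are not given here.

```python
import math

D_TOTAL = 4494.0  # total active promoters, as quoted in the text
K_BIND = 20.0     # value of k quoted in the text


def exact_activity(P0, D=D_TOTAL, k=K_BIND):
    """Exact polymerase activity (the closed form with the square root)."""
    b = 1.0 + k * (P0 + D)
    disc = 4.0 * k * P0 + (1.0 + k * (D - P0)) ** 2
    return (b - math.sqrt(disc)) / (2.0 * D * k)


def approx_activity(P0, D=D_TOTAL):
    """Piecewise-linear approximation f(P0) = min(P0 / D, 1)."""
    return min(P0 / D, 1.0)


def mrna_production_rate(omega_i, o_i, E_c, P0):
    """Transcription term omega_i * f(P0) * E_c / (E_c + o_i).

    Translation and growth terms of the full ODE are not included.
    """
    return omega_i * approx_activity(P0) * E_c / (E_c + o_i)


def max_approximation_error(P0_values):
    """Largest |exact - approximate| activity over the given P0 values."""
    return max(abs(exact_activity(P0) - approx_activity(P0))
               for P0 in P0_values)


# Scan polymerase counts from 0 to 10000 in steps of 50.
worst_gap = max_approximation_error(range(0, 10001, 50))
```

With these values the two curves differ most near $P_0 = D$, where the gap is roughly $1/\sqrt{kD}$, so the approximation gives up very little accuracy in exchange for avoiding the square root.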

For orthogonal polymerases, we use essentially the same equations, except with $\alpha = f(P_{T7})$, since the transcription rate then depends on the concentration of T7 RNA polymerase.
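
The switch between host and orthogonal polymerases can be sketched in Python as a choice of which polymerase pool feeds the activity factor $\alpha$. This is only an illustration under stated assumptions: the function and argument names are hypothetical, and the saturation point is kept at 4494 for both pools because the text applies the same $f$ to $P_{T7}$.

```python
def saturating_activity(polymerase_count, saturation=4494.0):
    """Approximate activity f(P) = min(P / saturation, 1)."""
    return min(polymerase_count / saturation, 1.0)


def transcription_rate(omega_i, o_i, E_c, host_rnap, t7_rnap,
                       orthogonal=False):
    """Transcription term omega_i * alpha * E_c / (E_c + o_i).

    alpha is computed from the T7 RNA polymerase pool for genes marked
    as orthogonal, and from the host RNA polymerase pool otherwise.
    """
    pool = t7_rnap if orthogonal else host_rnap
    alpha = saturating_activity(pool)
    return omega_i * alpha * E_c / (E_c + o_i)
```

In this sketch a gene driven by the orthogonal polymerase is not transcribed at all when no T7 RNA polymerase is present, regardless of how much host polymerase is available.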

References

[1] Keseler, I. M., Mackie, A., Santos-Zavaleta, A., Billington, R., Bonavides-Martinez, C., ..., Karp, P. D. (2017). The EcoCyc database: reflecting new knowledge about Escherichia coli K-12. Nucleic Acids Res, 45, D543–D550.

[2] Pai, A., & You, L. (2009). Optimal tuning of bacterial sensing potential. Mol Syst Biol, 5, 286.

[3] Weiße, A. Y., Oyarzún, D. A., Danos, V., & Swain, P. S. (2015). Mechanistic links between cellular trade-offs, gene expression, and growth. Proc Natl Acad Sci USA, 112, E1038–E1047.