Modeling 101

The purpose of this page is to offer a starting place for iGEM teams that are brand new to modeling. We cover the basics of deconstructing a system into a "model-able" form, basic kinetic models, and using differential equations.

1. What is modeling?

Modeling (n): the use of computer-aided constructs (algorithms, data structures, simulations, etc.) to recreate, predict, and/or analyze a biological system

One of the main pursuits of system biology is developing the ability to control and tune complex biological systems. Another is understanding how system properties (those within itself as well as those in relation to another system) arise.

Modeling is an interdisciplinary practice in systems biology that provides answers to these questions by using computing abilities. The first step in the process is to identify key components and interactions and then creating a model that replicates the interactions and enables one to gain meaningful insight about the system as a whole.

Modeling also has practical benefits (time $\equiv$ money as they say) when it comes to experimenting with complex systems. You might never see the same results twice after spending weeks in a lab but you can run a computer simulation and achieve reproduceable results within minutes. Learning how to conduct basic modeling is essential to reducing the strain on lab efforts and supporting your results with theoretical rigor.

   
	    Warning: From here on, "model" will generally refer to the representation of a biological system as a system of equations.

2. Creating a model

Key questions and concepts in model-building.

Define the key molecular players and interactions:

What is being consumed and what is being created in the system? Which molecules and biological features can be exploited by a model? Clarify what the key players in the cell are, what they do, and how they interact with each other.

Describe the molecules and interactions in a mathematical model:

Choose equations that reflect the types and directions of interaction. Drawing the system and writing the chemical equations can help you do this (see Fig. 1 for example). Keep in mind that it's not always necessary to represent every interaction to create a meaningful model.

Estimate parameter values for the model:

When or where does the model stop? As within the world, there are also limits within the molecular world cells work in. These are parameters. A system’s needs and limits are fixed in the model so its behavior can be studied under multiple states. To make the model exact, we recommend using parameters measured or estimated from experimental data. Online database tools like BioNumbers are available for this purpose.

Analyze the dynamical behavior:

How do the “blocks” of the model interact with each other and themselves? By analyzing these relationships, you may start to consider different rates and sensitivities. Testing the behavior is vital and may be done by altering vital factors within each block.

Determine your model’s position:

What can and can't be represented in your model? Understand the fundamental strengths and limitations of different types of models. If its behavior is unexpected, make sure the assumptions you made in the earlier steps are consistent with literature and experimental data

Fig. 1 System design of an optogenetic tool by the 2016 iGEM Wageningen Team

3. Kinetic Laws

Mass Action

Let $S_1$, $S_2$ be reactants and $P_1$, $P_2$ be products in a closed system. And let $s_1$, $s_2$, $p_1$, $p_2$ below represent the stoichiometric coefficients for a balanced chemical equation. Note: when building your model, a reaction may not necessarily have both directions.

$$s_1 S_1 + s_2 S_2 + \ldots \longleftrightarrow p_1 P_1 + p_2 P_2 + \ldots$$

The forward reaction rate is equivalent to $k_f *[S_1]^{s_1}*[S_2]^{s_2} \ldots $ where $k_f$ is a constant. Identically, the reverse reaction rate is equivalent to $k_r *[P_1]^{p_1}*[P_2]^{p_2} \ldots $ with $k_r$ a constant. For a system at equilibrium, the Law of Mass Action states that the forward rate = reverse rate and $K_{eq}$ is a constant given by

$$K_{eq} = \frac{[P_1]^{p_2} *[P_2]^{p_2}* \ldots}{[S_1]^{s_1} *[S_2]^{s_2}* \ldots}$$

Michaelis-Menten

The Michaelis-Menten model is used to calculate how long it takes for an enzyme and substrate to successfully bind and create a product.

$$S + {E}\underset{k_r}{\overset{k_f}{\longleftrightarrow}}{ES} \overset{k_{cat}}{\longrightarrow}{P+E}$$

Enzyme-substrate binding is a reversible process so the formation of $[ES]$ complex is represented with $\longleftrightarrow$. To simplify things, the formation of product $P$ is considered in one direction only at a catalytic rate $k_{cat}$. The following formula for the rate of formation of product is known as the Michaelis-Menten equation:

$v_0=\frac{d[P]}{dt} = V_{max}\frac{[S]}{K_m + [S]}$ where

$V_{max} = k_{cat}*[E]_0$ is the maximal reaction rate
$[E]_0$ is the initial enzyme concentration
$K_m$ is the substrate concentration $[S]$ known as the Michaelis-Menten constant at which the reaction rate is $\frac{V_{max}}{2}$

Hill Kinetics

The Hill equation is used to determine the likeliness of a ligand binding to a larger macromolecule. Assuming the initial concentration of macromolecule, $[P_0]$, was completely unbound, the ratio of $\frac{\text{bounded } [P]}{\text{total } [P]}$ is given by

$\frac{[PL]}{[P_0]} = \frac{[L]^n}{(K_d) + [L]^n} = \frac{[L]^n}{(K_A)^n + [L]^n} = \frac{1}{(\frac{K_A}{[L]})^n + 1}$ where

$PL$ is the bounded macromolecule-ligand concentration
$n$ is the Hill coefficient which represents cooperativity depending on the context of usage. For example, its value may represent the number of binding sites per macromolecule.
$K_d$ is the rate of dissociation of $[PL]$
$K_A$ is the ligand concentration which occupies half of the binding sites (possibly given by n)

Fig.2 Ratio of bound to total macromolecule changing with respect to n

4. Introduction to ODEs

A function takes in a number and spits another one out (in the case of real-valued functions of real numbers). In a biological system, we can represent the concentration of a molecule as a function of time. However, we typically don't know what this function looks like. What we do know is the rate at which a molecule is consumed and produced in a reaction. From Section 3, we know that the rate of change is a function of concentration at a given point in time.

A differential equation represents the relationship between a function $y$ and its derivative $y'$- a separate function that represents the rate of change of $y$. Generally, function $y$ can depend on many variables. However, in an ordinary differential equation (ODE), it is only allowed to depend on one. Thus the general form of an ODE is given by

$$f(t,y,y',\ldots ,y^n)=0$$

where $y$ is a function of $t$ and $y^n$ is the nth derivative of $y$. The cousin to an ODE is a PDE or partial differential equation which is an equation relating a function of multiple variables to its derivatives. We use the following examples to illustrate how ODE's are derived from kinetic models. Try this Extremely Fun Exercise$^{TM}$: for each equation, identify which of the 3 kinetic models is being used and why!

Examples

$\beta$ denotes the degradation rate.

	$\color{blue}{S_1} + {\color{red}{S_2}}\overset{k_f}{\longrightarrow}{\color{green}{P}}$
Here, S1, S2, and P are functions of time only while $\frac{d[P]}{dt}$ is a function of time and P. That is, a function of a function.	$\frac{d[\color{green}{P}]}{dt} = k_f [\color{blue}{S_1}] [\color{red}{S_2}] - \beta_P[\color{green}{P}]$

Sometimes the intermediate $[ES]$ step is ignored to further simplify the model.	$\color{blue}{S} + \color{red}{E}\underset{k_r}{\overset{k_f}{\longleftrightarrow}}\color{purple}{ES} \overset{k_{cat}}{\longrightarrow}{\color{green}{P}+\color{red}{E}}$
If this equality holds, what does it say about the relationship between the Michaelis-Menten and Hill models?	$k_{cat}[\color{purple}{ES}] = V_{max}\frac{[\color{blue}{S}]}{K_m + [\color{blue}{S}]}$
	$\frac{d[\color{green}{P}]}{dt} =k_{cat}[\color{purple}{ES}] - \beta_P [\color{green}{P}]$
	$\frac{d[\color{red}{E}]}{dt} = -k_f[\color{red}{E}][\color{blue}{S}]+k_r[\color{purple}{ES}]+k_{cat}[\color{purple}{ES}]-\beta_E[\color{red}{E}]$
	$\frac{d[\color{blue}{S}]}{dt} = -k_f[\color{red}{E}][\color{blue}{S}]+k_r[\color{purple}{ES}]-\beta_S[\color{blue}{S}]$
	$\frac{d[\color{purple}{ES}]}{dt} = k_f[\color{red}{E}][\color{blue}{S}]-k_{cat}[\color{purple}{ES}]-k_r[\color{purple}{ES}]-\beta_{ES}[\color{purple}{ES}]$

Myoglobin (Mb) is the primary oxygen-binding protein found in muscle tissue. Diving mammals such as seals, whales, otters, and beavers are able to remain submerged for long periods because they have 10x the myoglobin in their muscles compared to humans.

The binding reaction of $O_2$ to $Mb$ is described by the equilibrium equation $Mb + O_2 \longleftrightarrow MbO_2$. The dissociation constant is a ratio of bound $O_2Mb$ to unbound $O_2$ given by $K_d = \frac{[Mb][O_2]}{[O_2]}$. Solving for $[Mb]$ and substituting into the fractional saturation equation $Y_{O_2}=\frac{[MbO_s]}{[Mb]+[MbO_2]}$, we get

$\frac{[O_2]}{K_d + [O_2]} = \frac{1}{\frac{K_d}{[O_2]} + 1}$

"This model will be a simplification and an idealization, and consequently a falsification." - Alan Turing, The Chemical Basis of Morphogenesis

Sometimes the intermediate \([ES]\) step is ignored to further simplify the model.	\(\color{blue}{S} + \color{red}{E}\underset{k_r}{\overset{k_f}{\longleftrightarrow}}\color{purple}{ES} \overset{k_{cat}}{\longrightarrow}{\color{green}{P}+\color{red}{E}}\)
If this equality holds, what does it say about the relationship between the Michaelis-Menten and Hill models?	\(k_{cat}[\color{purple}{ES}] = V_{max}\frac{[\color{blue}{S}]}{K_m + [\color{blue}{S}]}\)
	\(\frac{d[\color{green}{P}]}{dt} =k_{cat}[\color{purple}{ES}] - \beta_P [\color{green}{P}]\)
	\(\frac{d[\color{red}{E}]}{dt} = -k_f[\color{red}{E}][\color{blue}{S}]+k_r[\color{purple}{ES}]+k_{cat}[\color{purple}{ES}]-\beta_E[\color{red}{E}]\)
	\(\frac{d[\color{blue}{S}]}{dt} = -k_f[\color{red}{E}][\color{blue}{S}]+k_r[\color{purple}{ES}]-\beta_S[\color{blue}{S}]\)
	\(\frac{d[\color{purple}{ES}]}{dt} = k_f[\color{red}{E}][\color{blue}{S}]-k_{cat}[\color{purple}{ES}]-k_r[\color{purple}{ES}]-\beta_{ES}[\color{purple}{ES}]\)

	\(\color{blue}{S_1} + {\color{red}{S_2}}\overset{k_f}{\longrightarrow}{\color{green}{P}}\)
Here, S1, S2, and P are functions of time only while \(\frac{d[P]}{dt}\) is a function of time and P. That is, a function of a function.	\(\frac{d[\color{green}{P}]}{dt} = k_f [\color{blue}{S_1}] [\color{red}{S_2}] - \beta_P[\color{green}{P}]\)

Team:CUNY Kingsborough/Modeling 101