Revision as of 18:38, 17 October 2018
Enzyme Kinetics and Least Squares Regression
The efficiency of our RESCUE system is likely to depend on multiple factors, such as mismatch distance and spacer length, as with ADAR-dCas13b constructs (Cox et al., 2016), as well as the relative concentrations of the substrates and enzymes. Regression models can be used to identify these factors and evaluate their significance from experimental results. As such, an early build of an enzyme kinetics regression model of dCas13b-APOBEC editing efficiency may help us gain further insight into the RESCUE system.
Goal
- 1. Simulate the RESCUE system under different relative concentrations of substrates and enzymes to determine the concentrations that might yield maximum efficiency.
- 2. Determine the change in the binding and catalytic efficiency when spacer length and mismatch distance are varied, which will help in designing gRNAs for more efficient base editing.
Our assumptions of the model are as follows:
- 1. Association between dCas13b and gRNA is reversible and precedes enzyme-gRNA complex association with substrate mRNA. This is because dCas13b requires the gRNA to bind to the correct target sequence.
- 2. Once the enzyme-gRNA-substrate mRNA ternary complex (ERS) is formed, the reaction proceeds in a single direction to produce the edited product.
First, we build a kinetic model for our fusion protein.
Where:
S - Unedited Substrate mRNA
R - Guide RNA
E - Enzyme for C to U editing
ER - Enzyme-gRNA complex
ERS - Enzyme-gRNA-Substrate complex
P - Edited Product mRNA
F - Green Fluorescent protein produced
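The reaction scheme itself is not reproduced in this text. Read back from the rate equations of this section and the two assumptions, it would plausibly take the form below. This is a reconstruction rather than the original figure, and the arrow labels are inferred from where each rate constant appears in the equations.

$$ E + R \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} ER $$
$$ ER + S \underset{k_{-2}}{\overset{k_{2}}{\rightleftharpoons}} ERS $$
$$ ERS \xrightarrow{k_{3}} ER + P $$
$$ P \xrightarrow{k_{4}} F $$

On the same reading, $$G_{s}$$ would be the production rate of S, while $$D_{s}$$ and $$D_{f}$$ would be the degradation rates of the mRNA species and of F respectively.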
Given the kinetic scheme defined above, the rate of change of each protein/RNA species can be described by the following differential equations, where $$\delta_{t}$$ denotes the derivative with respect to time:
$$\delta_{t}S = G_{s} - D_{s}\cdot S - k_{2}\cdot ER\cdot S + k_{-2}\cdot ERS \quad \textrm{(Equation 1)}$$
$$\delta_{t}ER = k_{1}\cdot E\cdot R - k_{-1}\cdot ER - k_{2}\cdot ER\cdot S + k_{-2}\cdot ERS \quad \textrm{(Equation 2)}$$
$$\delta_{t}ERS = k_{2}\cdot ER\cdot S - k_{3}\cdot ERS - k_{-2}\cdot ERS \quad \textrm{(Equation 3)} $$
$$\delta_{t}P = k_{3}\cdot ERS - D_{s}\cdot P \quad \textrm{(Equation 4)} $$
$$\delta_{t}F = k_{4}\cdot P - D_{f}\cdot F \quad \textrm{(Equation 5)} $$
where Equation 4 assumes that the reverse step, ER + P to ERS, is energetically unfavourable and hence unlikely to occur.
When solved numerically and plotted, the differential equations yield a complex, non-linear kinetic curve comparable to the standard Monod kinetics curve. Unlike Monod kinetics, however, a logarithmic transformation of the axes does not always give a linear plot. Non-linear least-squares regression (LSR) is therefore an alternative strategy for obtaining a best-fit curve for enzymatic assays, and it generates a set of possible values for the coefficients in the differential equations.
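To make the shape of the model concrete, the sketch below integrates Equations 1 to 5 with a simple forward-Euler scheme. It is an illustration under stated assumptions, not the script referred to on this page: the binding loss term in Equation 2 is taken as k2·ER·S (matching Equations 1 and 3), E and R are held constant because no rate equations are given for them, and every numeric value in `example_coeffs`, the time step and the zero initial state are placeholders chosen only so that the code runs.

```python
def rescue_rhs(state, E, R, c):
    """Right-hand sides of Equations 1-5 for state = (S, ER, ERS, P, F)."""
    S, ER, ERS, P, F = state
    dS = c["Gs"] - c["Ds"] * S - c["k2"] * ER * S + c["km2"] * ERS
    dER = c["k1"] * E * R - c["km1"] * ER - c["k2"] * ER * S + c["km2"] * ERS
    dERS = c["k2"] * ER * S - c["k3"] * ERS - c["km2"] * ERS
    dP = c["k3"] * ERS - c["Ds"] * P
    dF = c["k4"] * P - c["Df"] * F
    return (dS, dER, dERS, dP, dF)


def simulate(state, E, R, c, dt=0.01, t_end=50.0):
    """Forward-Euler integration; returns the list of states at every step."""
    steps = int(round(t_end / dt))
    trajectory = [tuple(state)]
    for _ in range(steps):
        derivs = rescue_rhs(state, E, R, c)
        state = tuple(x + dt * dx for x, dx in zip(state, derivs))
        trajectory.append(state)
    return trajectory


# Placeholder coefficients (assumptions, not measured or fitted values).
example_coeffs = {
    "Gs": 20.0, "Ds": 0.10,    # production / degradation of S
    "k1": 1.0, "km1": 0.5,     # E + R <-> ER
    "k2": 0.01, "km2": 0.1,    # ER + S <-> ERS
    "k3": 0.5,                 # ERS -> ER + P
    "k4": 15.0, "Df": 0.25,    # P -> F, degradation of F
}

# E and R are assumed constant (for example, supplied in excess).
trajectory = simulate((0.0, 0.0, 0.0, 0.0, 0.0), E=1.0, R=1.0, c=example_coeffs)
fluorescence = [s[4] for s in trajectory]  # F(t), the quantity a reporter assay would read
```

Forward Euler is used only to keep the sketch dependency-free; a stiff or adaptive solver would be the safer choice if the rate constants span several orders of magnitude.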
The regression minimises the sum of squared differences, computed from the vertical (y-axis) distance between each reading and its corresponding point on the current curve. Mathematically, the sum of squared differences (abbreviated as D) is given by Equation 6 below.
$$D = \sum_{i=1}^{n} (y_{i}-p_{i})^2 \quad \textrm{(Equation 6)} $$
where $$y_{i}$$ is the i-th observed value and $$p_{i}$$ is the corresponding predicted value.
Subsequently, each coefficient is increased by a small interval of 0.005 A.U. and also decreased by the same interval, and D is recomputed to check which change gives the lower value. The change that gives the lowest D is selected and the coefficients are updated accordingly. The simulation continues until no further decrease in D can be found, at which point the search has settled in a local minimum. To perform the simulation, we hard-coded the entire regression script in Python, which is shown in the document located at the bottom of this webpage.
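The update rule described above can be sketched as a fixed-step coordinate search. The code below is one possible reading of that description, not the original script: the function names, the choice to apply only the single best move per pass, and the toy straight-line data used to exercise it are all assumptions made for illustration.

```python
def sum_sq(observed, predicted):
    """Equation 6: sum of squared differences between data and prediction."""
    return sum((y - p) ** 2 for y, p in zip(observed, predicted))


def coordinate_search(loss, coeffs, step=0.005, max_iter=100000):
    """Nudge each coefficient by +/- step and keep the move that lowers the loss most.

    Stops when no single +/- step on any coefficient lowers the loss,
    i.e. the search has settled in a local minimum on the step grid.
    """
    coeffs = list(coeffs)
    best = loss(coeffs)
    for _ in range(max_iter):
        best_trial = None
        for i in range(len(coeffs)):
            for delta in (step, -step):
                trial = coeffs.copy()
                trial[i] += delta
                d = loss(trial)
                if d < best:
                    best, best_trial = d, trial
        if best_trial is None:  # no improving move left
            break
        coeffs = best_trial
    return coeffs, best


# Toy usage (hypothetical data): recover the slope of y = 2x from a first guess of 1.5.
xs = [1.0, 2.0, 3.0]
observed = [2.0, 4.0, 6.0]
fitted, final_d = coordinate_search(
    lambda c: sum_sq(observed, [c[0] * x for x in xs]), [1.5]
)
```

Because the step size is fixed, the result can only be resolved to within about one step of the minimum, and a search of this kind can stop in a local minimum that depends on the first guess.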
As a proof of concept, we simulated a positive control experiment that is likely to be run during EGFP reporter construct testing, in which cells are transfected with functional EGFP instead of the mutant EGFP. This thought experiment assumes that the enzyme (E) is absent and that the substrate (S) directly produces the functional fluorescent protein (F). The model therefore reduces to only 2 equations, with only 4 coefficients to tune.
$$ S \rightarrow F$$
$$ \delta_{t}S = G_{s} - D_{s}\cdot S $$
$$ \delta_{t}F = k_{4}\cdot S - D_{f}\cdot F $$
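As a quick sanity check on this reduced model, the long-time plateau can be worked out by setting both time derivatives to zero. The derivation below is an added illustration; the numbers use the "actual" coefficients listed in Table 1.

$$ \delta_{t}S = 0 \;\Rightarrow\; S^{*} = \frac{G_{s}}{D_{s}} = \frac{20.00}{0.10} = 200 $$
$$ \delta_{t}F = 0 \;\Rightarrow\; F^{*} = \frac{k_{4}\cdot S^{*}}{D_{f}} = \frac{k_{4}\cdot G_{s}}{D_{s}\cdot D_{f}} = \frac{15.00 \cdot 200}{0.25} = 12000 $$

The plateau depends on the four coefficients only through the ratio $$k_{4}\cdot G_{s} / (D_{s}\cdot D_{f})$$, so the early, time-dependent part of the curve is what separates the individual coefficients during fitting.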
We started by defining an initial set of coefficients for the simulation. A time-course graph was plotted for the fluorescence obtained using the defined “actual” coefficients (experimental fluo). Guess coefficients (1st guess fluo) were randomly assigned, and derived coefficients (theoretical fluo) were calculated using the Python script. The plots are shown in Figure 1 (left). To test our script in a more realistic situation, we added noise (assumed to follow a Gaussian distribution with a standard deviation of 0.5) to the simulation. The results are shown in Table 2 and Figure 1 (right). In general, the generated parameters were found to be a good fit to our initial input values.
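The test set-up described here can be sketched as follows. This is a minimal illustration, not the script used for the figures: the coefficient values are the "actual" and "1st guess" rows of Table 1 and the noise standard deviation of 0.5 comes from the text, while the zero initial conditions, the time step, the simulated duration, the fixed random seed and all function names are assumptions.

```python
import random


def simulate_reporter(Gs, Ds, k4, Df, dt=0.01, t_end=60.0, S0=0.0, F0=0.0):
    """Forward-Euler integration of the two-equation model; returns F at every step."""
    steps = int(round(t_end / dt))
    S, F = S0, F0
    fluo = [F]
    for _ in range(steps):
        dS = Gs - Ds * S
        dF = k4 * S - Df * F
        S, F = S + dt * dS, F + dt * dF
        fluo.append(F)
    return fluo


def add_gaussian_noise(values, sd=0.5, seed=0):
    """Add zero-mean Gaussian noise; the seed is fixed only for reproducibility."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sd) for v in values]


def sum_sq(observed, predicted):
    """Equation 6: sum of squared differences."""
    return sum((y - p) ** 2 for y, p in zip(observed, predicted))


actual_fluo = simulate_reporter(20.00, 0.10, 15.00, 0.25)  # "actual" row of Table 1
guess_fluo = simulate_reporter(10.00, 0.05, 12.50, 0.25)   # "1st guess" row of Table 1
noisy_fluo = add_gaussian_noise(actual_fluo, sd=0.5)       # stand-in for noisy readings

d_guess = sum_sq(noisy_fluo, guess_fluo)    # D the regression would start from
d_actual = sum_sq(noisy_fluo, actual_fluo)  # D at the true coefficients (noise only)
```

A regression run would then adjust the four coefficients to drive D down from `d_guess`; with noisy data D cannot be expected to reach zero, and `d_actual` gives a rough floor for comparison.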
Table 1. Actual coefficients and coefficients derived from the least-squares regression, without Gaussian noise

Without Gaussian Noise | $$G_{s}$$ | $$D_{s}$$ | $$k_{4}$$ | $$D_{f}$$ |
---|---|---|---|---|
Actual (Theoretical) | 20.00 | 0.10 | 15.00 | 0.25 |
Guess (1st Guess) | 10.00 | 0.05 | 12.50 | 0.25 |
Derived (Experimental) | 15.04 | 0.15 | 16.80 | 0.15 |