The goal of the model is to study the Incoherent Feed-Forward Loop (IFFL) network motif in three systems that use different components critical to proper network operation. This motif aims to keep the final product at a specific steady state, regardless of disruptions in the input. We then responded to questions raised by the Wet Lab, helping to steer experimental planning and to build a better understanding of the systems and the design choices that emerged. We applied sensitivity analysis to rank the system parameters by significance and, finally, explored the robustness of our predictions by identifying parameter sets and evaluating the cellular behavior they produce.
How IFFL works
A system in which an IFFL is applied is stable in the sense that its output settles to a constant level at equilibrium for every change of the input. In this type of network, the plasmid copy number is the (variable) input, and the final protein is the output, which must remain constant for every change of the input.
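As a toy illustration of this behaviour (all rate constants are arbitrary illustrative values, not parameters from our models), here is a minimal type-1 IFFL in which the input activates both a repressor Y and the output Z, while Y represses Z production:

```python
import numpy as np

def simulate_iffl(copy_number, t_end=50.0, dt=0.001):
    """Forward-Euler simulation of a type-1 IFFL: the input (plasmid copy
    number) drives both a repressor Y and the output Z, and Y represses
    Z production.  All rate constants are arbitrary illustrative values."""
    beta_y, beta_z, delta, K = 1.0, 1.0, 0.5, 1.0
    y, z = 1e-3, 0.0            # small initial Y avoids division by zero
    for _ in range(int(t_end / dt)):
        dy = beta_y * copy_number - delta * y
        dz = beta_z * copy_number * K / y - delta * z   # 1/Y repression
        y += dy * dt
        z += dz * dt
    return z

# A 10x change of the input leaves the steady-state output almost unchanged:
z_low, z_high = simulate_iffl(1.0), simulate_iffl(10.0)
```

With the 1/Y form of repression, the steady-state output cancels the input exactly, which is the idealized adaptation property the IFFL aims for.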
Dry Lab Workflow
As the Dry Lab, our work consists of:
- Accurately describe the mathematical representation of the biological system.
- Create simulations that predict the behaviour of the system.
- Characterize system parameters by:
  - working with previous studies,
  - analyzing and prioritizing the effect that parameters have on the system output,
  - analyzing the robustness of the system and making observations.
All of the above will be useful for your project, as you need a model that can estimate the in vivo behaviour of the system, in order to make and test the hypotheses that you and the Wet Lab pose.
In this section we are going to share the knowledge we gained from working on our project, in designing and analyzing biological network motifs. You can also view this section as a brief introduction to everything we did, as the workflow contains links that take you to our models.
In Figure 1, we present a flowchart that summarizes our understanding of how you should approach the modelling of a biological network.
Figure 1: Dry Lab workflow summarized in a flowchart.
The most important part is to understand perfectly how the system should work and what each component does. This means that you have to arrange sessions with the Wet Lab, where you discuss the system and ask as many questions as you have. These sessions never really end, but it is important to arrange as many as you can when starting off, in order to build a strong foundation for understanding your system and what it has to offer.
After that, you will be confident enough to map out the interactions happening in the system and create a mathematical representation of it. In our case, the mathematical representation was the system's Ordinary Differential Equations and the chemical reactions happening between the different components in the cell.
Furthermore, for the model to be accurate, it has to have well-characterized parameters. For this, you can do many things: we combined sensitivity and robustness analysis with a very extensive literature search. Our methods are described later in this section, and you can find implementations of them in both the TALE and dcas9-sgRNA models. As shown in Figure 1, keep in mind that the condition that breaks the characterization loop is different for each project and depends on what you want from your model and how accurate it should be.
And that's it: after all these steps you have a model that tries its best to describe your system's behaviour and estimate the concentrations of the different components. But we are not finished yet. Now you have to arrange new sessions with the Wet Lab, show them your system, describe what you did, answer all the questions they have and, most importantly, make the decisions about the system design that you could not make before. Now you can form any hypothesis about the system and actually test it with your model. Take into consideration, though, that the model is an estimation of reality, and depending on the effort you put into designing and characterizing it, you will get better or worse estimations. As examples of how we used our models to help in the system design, you can take a look at TALE tuning with IPTG-LacI, dcas9-sgRNA repressor analysis, and the dcas9 expression site decision. As a last part of our model, we also tested the implementation of the dcas9-sgRNA repressor IFFL at the RNA level. On the other hand, an example of how we integrated the Wet Lab's feedback into our model is shown in the Dcas9 subsystem.
Finally, when the experiments are finished, you will have the chance to take the experimental data and fit the model parameters to them. This could be as simple as using the data produced by robustness analysis and finding the parameter set with the minimum distance from the experimental data, as we did in the TALE robustness analysis, or using methods like linear/non-linear regression, Bayesian estimation, etc. The more experimental data you have, the better the model characterization. In our case, the data we used [1] to fit the model was the system error, so we had limited options regarding which parameters to fit to achieve the best results. To overcome this obstacle, we used the result of the TALE sensitivity analysis, which showed us which parameters were the most influential for the output, and fitted those while fixing the others.
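The minimum-distance idea can be sketched in a few lines; here a hypothetical one-line model and made-up "experimental" values stand in for the real system, and the candidate sets play the role of the ones produced by robustness analysis:

```python
import numpy as np

rng = np.random.default_rng(1)
copy_numbers = np.array([1.0, 2.0, 5.0, 10.0])
# Hypothetical "experimental" steady-state outputs for each copy number:
experimental = np.array([16.7, 25.0, 35.7, 41.7])

def simulate(beta, K):
    """Toy stand-in for the full model's steady-state output curve."""
    return beta * copy_numbers / (K + copy_numbers)

# Candidate parameter sets, e.g. the ones produced by robustness analysis.
candidates = np.column_stack([rng.uniform(10.0, 100.0, 5000),   # beta
                              rng.uniform(0.5, 20.0, 5000)])    # K

# Sum-of-squares distance of each candidate's curve from the data:
errors = [np.sum((simulate(b, K) - experimental) ** 2) for b, K in candidates]
best_beta, best_K = candidates[int(np.argmin(errors))]  # minimum-distance set
```

The same loop generalizes to any model: simulate each stored parameter set, score it against the data, and keep the best-scoring set as the fitted parameters.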
Now you have a model that is fitted to experimental results, and you can use it if you want to continue working on the system. If the experimental results are not as expected after fitting your model to them, you can optimize the parameters or use new components, and continue by giving new feedback to the Wet Lab. For optimization you can again perform a robustness analysis, because it lets you test extensive sets from the parametric space and save and analyze all the results as you wish. Another option we considered is a gradient descent algorithm, but as we did not use it in the end, we will not discuss it further. If computational cost is not a problem for you, we would recommend robustness analysis; otherwise gradient descent or other artificial intelligence algorithms are the way to go.
Sensitivity analysis
What is sensitivity analysis?
The study of how uncertainty in the output of a model (numerical or otherwise) can be apportioned to different sources of uncertainty in the model input [3].
Why do we use sensitivity analysis?
When we construct a computational model, many parameters cannot be identified, because it is likely that a similar experiment has not been carried out. Even when a similar experiment has been carried out, minor variations in the experimental setup can cause large changes in each parameter. It is therefore extremely useful to use a mathematical tool, sensitivity analysis, with which we can draw many conclusions that help us evolve the model. The most important of them are [3]:
Factor prioritization: One reason for using sensitivity analysis is to identify the most important parameters. The most important ones, i.e. those with the highest sensitivity, are the ones that cause the greatest variation in the final output. By knowing which these are, we can focus on measuring them precisely to reduce the variance of the final output, while providing a model that best approaches reality.
Factor Fixing: Having ranked the parameters from most to least important, we can fix each non-critical parameter at a point within its originally defined range, since it does not contribute much to the output variance. In this way, we can better observe the inter-relationships of the important parameters in system subunits that contain a large set of kinetic equations.
Variance cutting: To reduce the output variance, we can apply a series of sensitivity analyses, starting from all system inputs and investigating each time which parameters contribute most to the output variance. If we remove the most influential parameters, provided they are well characterized for the biological system we are studying, the variance that remains is the sum of the variations caused by the non-important parameters, which is small. This leaves the system with fewer input parameters, less output variation, and a simultaneous reduction in computational cost.
Factor Mapping: To achieve concrete outputs, we need to evaluate the relationships between inputs and determine which parameter combinations lead to excessive variation in the output. Some parameter interactions cannot achieve the desired system response within strictly set limits and conditions. For example, in a system that controls the amount of a toxic substance which, if present in large quantities, can lead to cell destruction, we need to explore the system's parameters and find which combinations lead to the desired results.
Global vs Local Sensitivity analysis
Sensitivity analysis can be applied in two variations:
Local sensitivity analysis evaluates changes in model outputs with respect to variations in a single input parameter [3] [4]. In particular, a small change is made, accompanied by a calculation of the local sensitivity index, for each parameter separately. The local sensitivity index of a parameter p is the partial derivative of the system output with respect to that parameter, evaluated for a very small change of the parameter value.
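This definition can be illustrated with a small finite-difference helper (the normalized form of the index; the model y = β/δ and its parameter values are toy examples, not taken from our systems):

```python
import numpy as np

def local_sensitivity(f, params, i, rel_step=1e-6):
    """Normalized local sensitivity index S_i = (p_i / f(p)) * df/dp_i,
    estimated with a central finite difference around the nominal point."""
    p = np.asarray(params, dtype=float)
    h = rel_step * p[i]
    up, down = p.copy(), p.copy()
    up[i] += h
    down[i] -= h
    return (f(up) - f(down)) / (2.0 * h) * p[i] / f(p)

# Toy steady-state output y = beta / delta (production over degradation):
output = lambda p: p[0] / p[1]
s_beta = local_sensitivity(output, [2.0, 0.5], 0)    # analytically +1
s_delta = local_sensitivity(output, [2.0, 0.5], 1)   # analytically -1
```

For y = β/δ the normalized indices are exactly +1 for β and -1 for δ, which the finite difference reproduces: a 1% increase in production raises the output by 1%, and a 1% increase in degradation lowers it by 1%.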
In contrast, global sensitivity analysis varies all parameters simultaneously over the entire parameter space, which allows the relative contributions of each individual parameter, as well as of the interactions between parameters, to the model output variance to be evaluated at the same time [3]. Because of this, global sensitivity analysis is better suited to characterizing the parameters when non-linear phenomena occur in the system.
The biological systems we study in synthetic biology show many non-linear phenomena among their components, and the ranges of parameter values cannot always be found in the literature and are often taken to be large. For this reason, we propose exploring the importance of the parameters globally.
Sobol sensitivity analysis
Below we describe the procedure followed in Sobol sensitivity analysis to calculate:
- First- and second-order indices, referring to the contribution of a particular parameter to the output and to the contribution of the interaction between a pair of parameters, respectively.
- Total-effect indices, which capture the contribution of a parameter together with the contribution of its interactions with the other parameters.
Consider a system whose output is characterized by a square-integrable function f(x) with inputs x = (x_1, x_2, ..., x_n). The first step is to decompose f(x) into sums of the effects of the various components of the system:

$$f(x) = f_0 + \sum_{i=1}^{n} f_i(x_i) + \sum_{i<j} f_{ij}(x_i, x_j) + \dots + f_{1,2,\dots,n}(x_1, \dots, x_n) \tag{1}$$

where n is the number of system inputs and i < j because f_{ij}(x_i, x_j) = f_{ji}(x_j, x_i). The constant term, defined as

$$f_0 = \int f(x)\, dx \tag{2}$$

is the average of f(x) over different parameter-set values. Treating each parameter as a random variable uniformly distributed over its interval [4], we define the variance D of the output. The result is that the total output variance is the sum of the variances caused by each parameter, or by each combination of parameters:

$$D = \sum_{i} D_i + \sum_{i<j} D_{ij} + \dots + D_{1,2,\dots,n} \tag{3}$$

Because f(x) is square-integrable, Equation (3) decomposes the total variance in the same way that Equation (1) decomposes f(x): it shows how much each variable, as well as the combined interactions between variables, contributes to the final overall system variance. The sensitivity indices are then defined as

$$S_{i_1 \dots i_s} = \frac{D_{i_1 \dots i_s}}{D} \tag{4}$$

where D_{i_1...i_s} is the variance contributed by the specific parameters; for example, for the first-order sensitivity index of variable x_i we would use only D_i as the numerator, S_i = D_i / D.

Computational cost is proportional to the type of problem, the number of parameters, and the non-linear phenomena that appear in the relationships between the interacting parameters. For small numbers of parameters, as in the models we built, variance-based methods are considered satisfactory and relatively quick. As the order of the sensitivity indices increases, the cost increases accordingly. As mentioned above, one way to use a variance-based method like Sobol is to reduce the number of parameters (variance cutting) by excluding from each calculation those that have no effect.
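The first-order indices S_i = D_i / D can be estimated by plain Monte Carlo without any special library. The sketch below uses a Saltelli-style estimator on a toy additive function with known analytic indices; the function, bounds, and sample size are illustrative choices, not our actual model:

```python
import numpy as np

def first_order_sobol(f, bounds, n=20000, seed=0):
    """Monte Carlo (Saltelli-style) estimate of the first-order Sobol
    indices S_i = D_i / D for a scalar model f whose inputs are
    independent and uniformly distributed within the given bounds."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    lo, span = bounds[:, 0], bounds[:, 1] - bounds[:, 0]
    d = len(bounds)
    A = lo + span * rng.random((n, d))   # two independent sample matrices
    B = lo + span * rng.random((n, d))
    fA, fB = f(A), f(B)
    total_var = np.var(np.concatenate([fA, fB]))   # D, the total variance
    S = np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]              # resample only the i-th coordinate
        # D_i estimated as E[ f(B) * (f(A_B^i) - f(A)) ]
        S[i] = np.mean(fB * (f(ABi) - fA)) / total_var
    return S

# Toy additive model f = x1 + 2*x2 on [0,1]^2: D1 = 1/12, D2 = 4/12,
# so the analytic indices are S1 = 0.2 and S2 = 0.8.
S = first_order_sobol(lambda X: X[:, 0] + 2.0 * X[:, 1], [[0, 1], [0, 1]])
```

In practice we relied on a library implementation (see the SALib section below); the point of the sketch is only that each index is a ratio of a partial variance to the total variance, estimated from paired sample matrices.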
Robustness analysis
When we start building a model, its structure contains many uncertainties that are not easy to assess. Such uncertainties include the mathematical expression of the kinetic equations, the stoichiometric coefficients of the kinetics, and the simplifications made in reducing a more complex network to a simpler model. In addition, the complexity of biological systems makes it impossible to fully analyze all the reactions that occur, inside and outside the cellular environment, among the various components they consist of. Robustness analysis is a useful tool for studying the general behavior of a system despite the uncertainties it contains.
Applying robustness analysis to the mathematical model under study, we set value ranges for each parameter that we do not know, choosing maximum and minimum values considered realistic for the biological system, estimated from the literature and experimental data. Then the model is exhaustively simulated over many parameter sets created by sampling. The sampling method used is the Latin hypercube, which divides the range given for each parameter into equal intervals; within each interval a value is randomly selected.
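As a sketch of how such a sampler works (an illustrative pure-NumPy implementation, not the exact code we used; the example bounds are hypothetical):

```python
import numpy as np

def latin_hypercube(n_samples, bounds, seed=None):
    """Latin hypercube sampling: each parameter's range is split into
    n_samples equal intervals, one value is drawn uniformly from every
    interval, and the intervals are randomly paired across parameters."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)          # shape (n_params, 2)
    n_params = bounds.shape[0]
    strata = np.tile(np.arange(n_samples), (n_params, 1))
    strata = rng.permuted(strata, axis=1).T           # (n_samples, n_params)
    u = (strata + rng.random((n_samples, n_params))) / n_samples
    return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])

# e.g. 10 sets for two hypothetical parameters with ranges [0, 1] and [10, 100]:
sets = latin_hypercube(10, [[0.0, 1.0], [10.0, 100.0]], seed=0)
```

Compared with plain uniform sampling, this guarantees that every interval of every parameter's range is represented exactly once, so the parameter space is covered evenly even with a modest number of sets.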
We use error bars to show the output variation around the mean value. They may correspond to a large range of parameter values, spread over several orders of magnitude. The error bars were calculated either from the standard deviation of all concentrations for each copy number, or from the corresponding standard error.
Once the sensitivity analysis described above has been applied to reduce the number of uncertain parameters in the system, and robustness analysis to the output values that are highly variable, we can evaluate the results and better study the behavior of the system. In our system, we studied the uncertainty of the protein-of-interest concentration, at a time when all components of the system have stabilized in steady state, relative to the copy number of the plasmids.
Implementation
SimBiology® (MATLAB®)
We created the biological networks, building up the dynamic systems in the MATLAB® language, specifically the SimBiology® toolbox, which allowed us to easily analyze the kinetic equations, with species, reactions, and dose-responses, through its simple-to-use graphical interface. We chose to solve the ordinary differential equations with the variable-step continuous solver ode15s, which is a good choice for stiff problems; in our simulation settings we set the absolute and relative tolerances. You can see our SimBiology diagrams here.
Python packages
After using SimBiology for modelling the chemical interactions, we used SciPy's odeint to construct our systems' ODE equations. We used Python because we wanted to integrate other packages (e.g. for sensitivity analysis) that helped us extend our analysis and compare the results. Working with both SimBiology and Python helped us understand the differences between them and practice in both environments.
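As a minimal illustration of this approach, here is a toy constitutive-expression system (DNA → mRNA → protein) written for odeint; the rate constants are placeholder values, not parameters from our models:

```python
import numpy as np
from scipy.integrate import odeint

def model(state, t, k_tx, k_tl, d_m, d_p):
    """Constitutive expression, DNA -> mRNA -> protein, with first-order
    degradation/dilution of both species."""
    m, p = state
    dm = k_tx - d_m * m        # transcription minus mRNA decay
    dp = k_tl * m - d_p * p    # translation minus protein decay
    return [dm, dp]

t = np.linspace(0.0, 100.0, 500)
# Illustrative rate constants, not values from our systems:
sol = odeint(model, [0.0, 0.0], t, args=(2.0, 5.0, 0.2, 0.05))
m_final, p_final = sol[-1]   # approaches m* = k_tx/d_m and p* = k_tl*m*/d_p
```

Each of our real systems follows the same pattern, just with more species and with production terms that include activation or repression instead of a constant rate.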
In addition, we used Python's built-in multiprocessing module to increase computational speed by dividing the robustness and sensitivity analysis tasks among many processes (as many as the cores of each PC).
SALib - Sensitivity Analysis Library in Python
This library helped us incorporate sensitivity analysis in all systems, calculating the effect of the model inputs (different parameters in each model) on the variation of the model output (the concentration of sfGFP). We specifically used Sobol sensitivity analysis, which calculates the sensitivity indices by generating model inputs with Saltelli's extension of the Sobol sequence sampling. This way, we were able to understand which parameters are the most influential and discuss with the Wet Lab if and what changes were needed. You can see all of our simulation code on our GitHub page.
Robustness analysis
For the robustness analysis, we used Python packages and the SALib.sample package to produce the sampling. The sampling uses the Latin hypercube method (LHS), which generates a random sample of N points for each uncertain input parameter. The number of parameter sets is 10,000 unless otherwise mentioned, and the parameter ranges take the same values as in the sensitivity analysis. For each result, the robustness analysis measures the standard deviation and the mean for every copy number, as well as the error between input and output, and writes these results to a CSV file.
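A condensed sketch of this loop, with a hypothetical one-line model standing in for the full ODE system, plain uniform sampling standing in for LHS, and made-up parameter ranges:

```python
import csv
import numpy as np

rng = np.random.default_rng(0)
copy_numbers = [1, 2, 5, 10, 20]
n_sets = 1000   # 10,000 in the actual analysis; fewer here for speed

def steady_state_output(copy_number, beta, K):
    """Hypothetical stand-in for the model's steady-state output."""
    return beta * copy_number / (K + copy_number)

# Parameter sets over the assumed ranges (the real analysis samples by LHS):
betas = rng.uniform(50.0, 150.0, n_sets)
Ks = rng.uniform(1.0, 10.0, n_sets)

with open("robustness.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerow(["copy_number", "mean", "std", "sem"])
    for cn in copy_numbers:
        out = steady_state_output(cn, betas, Ks)
        writer.writerow([cn, out.mean(), out.std(),
                         out.std() / np.sqrt(n_sets)])
```

The resulting CSV is exactly the kind of per-copy-number mean/spread summary used to draw the error bars described above.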
Digital Ocean
Furthermore, to increase the computational speed of the analyses and run more experiments in parallel, we used the DigitalOcean cloud computing platform.
Salis Lab RBS Calculator
We used the RBS Calculator from the Salis Lab [5] to estimate the mRNA translation rate for all proteins in our systems. The calculations are described on each model's page. See translation rate estimation.