Revision as of 13:31, 16 October 2018
Design of Experiments
During practical experiments in the laboratory, one is often left with a large number of factors that need to be tested in order to create a meaningful model. A meaningful model should relate the experimental factors (explanatory variables) to a response variable (experimental outcome) via a process or system, as shown in Figure 1.
Fig. 1:
The simplest linear response model is presented below:
\begin{equation}
y_{i}=\beta_{0}+\beta_{1}x_{i1}+\dots+\beta_{n}x_{in}+\varepsilon_{i}, \quad \varepsilon_{i} \sim N(0,\sigma^{2})
\end{equation}
Where $y_{i}$ is the response variable of the i’th observation, $x_{i1},\dots,x_{in}$ are the explanatory variables of the i’th observation, $\beta_{0},\dots,\beta_{n}$ are the regression coefficients, and the residuals $\varepsilon_{i}$ are assumed to be independent and normally distributed with a mean of 0 and a constant variance $\sigma^{2}$.
This model will, for convenience, be written in matrix notation:
\begin{equation}
Y=X\beta+\varepsilon, \varepsilon \sim N(0,\sigma^{2}I)
\end{equation}
Where $X$ is the design matrix of the model, of size $N \times (n+1)$, where $N$ is the number of observations and $n$ is the number of factors (the additional column of ones corresponds to the intercept $\beta_{0}$). $Y$ is a vector of response variables of size $N \times 1$, $\beta$ is a vector of regression coefficients of size $(n+1) \times 1$, and $\varepsilon$ is a vector of size $N \times 1$ containing the residual of each response. Since the residuals are independent with a common variance, their covariance matrix is $\sigma^{2}I$, with $I$ being the identity matrix. The model is the foundation of the experimental design: it determines which factors should be tested, and this selection is expressed directly in the design matrix (see later).
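As an illustration of the matrix notation, the following minimal sketch builds a design matrix and estimates the regression coefficients by ordinary least squares. The data are made up, the function name `fit_linear_model` is hypothetical, and least squares is assumed here as the estimation method; the matrix is stored with one row per observation and a leading column of ones for the intercept.

```python
import numpy as np

def fit_linear_model(factors, y):
    """Least-squares estimate of beta in Y = X beta + eps."""
    factors = np.asarray(factors, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept beta_0.
    X = np.column_stack([np.ones(len(factors)), factors])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta_hat
    return X, beta_hat, residuals

# Made-up data: N = 6 observations of n = 2 factors.
rng = np.random.default_rng(0)
factors = rng.uniform(-1.0, 1.0, size=(6, 2))
true_beta = np.array([2.0, 0.5, -1.5])  # assumed values, for illustration only
y = true_beta[0] + factors @ true_beta[1:] + rng.normal(0.0, 0.1, size=6)

X, beta_hat, residuals = fit_linear_model(factors, y)
print(X.shape)   # (6, 3): N rows, n + 1 columns
print(beta_hat)  # estimates of beta_0, beta_1, beta_2
```

With only six noisy observations the estimates will differ somewhat from the assumed coefficients; without noise they are recovered exactly.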
To understand the importance of an experimental design, one can look at previously employed methods. In traditional design of experiments (DOE), one would follow the principle of changing “one factor at a time until no improvements can be achieved” (2). However, this technique does not take possible interactions between factors into account. Why this is a problem can be illustrated by our experiment on the compressive strength of the fungal bricks (insert link to the experiment, not yet written). Had we only changed one factor at a time and kept everything else constant, a possible interaction between, for instance, burning temperature and burning time would not have been identified. A significant part of the explainable variance would then be left out, resulting in a worse fit of the compressive strength, our response variable.
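The effect of ignoring an interaction can be shown with a small numerical sketch. The response function `strength` below is entirely hypothetical (it is not our measured data); the two factors are coded as low (-1) and high (+1), standing in for burning temperature and burning time.

```python
import itertools

def strength(temperature, time):
    """Hypothetical response with an interaction term (coded levels -1/+1)."""
    return 10.0 + 2.0 * temperature + 1.0 * time + 3.0 * temperature * time

# One factor at a time: start at the low/low baseline, change one factor per run.
baseline = strength(-1, -1)
ofat_temperature_effect = strength(+1, -1) - baseline
ofat_time_effect = strength(-1, +1) - baseline
# Without an interaction, the high/high response would be the sum of both changes.
ofat_prediction = baseline + ofat_temperature_effect + ofat_time_effect

# Full factorial: every combination of levels is actually run.
runs = {levels: strength(*levels)
        for levels in itertools.product((-1, +1), repeat=2)}
actual = runs[(+1, +1)]

# Interaction coefficient estimated from the four factorial runs.
interaction = (runs[(+1, +1)] - runs[(+1, -1)]
               - runs[(-1, +1)] + runs[(-1, -1)]) / 4

print(ofat_prediction, actual, interaction)  # 4.0 16.0 3.0
```

In this made-up case the one-factor-at-a-time runs suggest that raising either factor lowers the response, while the factorial design reveals that raising both together gives the highest value.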
A general DOE that can provide the foundation of an experimental model (such as the linear model presented earlier) is the factorial design (4). With a factorial design, one can with relative ease model the entire design space of an experiment. The design works by setting each of several factors to either a high level or a low level, so that the runs span the largest possible design space. A simple example of such a design with two factors can be written as an ANOVA model:
\begin{equation}
y_{ijk}=\mu+\alpha_{i}+\beta_{j}+(\alpha\beta)_{ij}+\varepsilon_{ijk}, \quad \varepsilon_{ijk} \sim N(0,\sigma^{2})
\end{equation}
$\mu$ is the overall mean
$\alpha_{i}$ is the effect of factor A at the i’th level
$\beta_{j}$ is the effect of factor B at the j’th level
$(\alpha\beta)_{ij}$ is the interaction effect of factor A at the i’th level and factor B at the j’th level
The subscript k in the response variable $y_{ijk}$ and the residuals $\varepsilon_{ijk}$ indexes the replicates, $k=1,2,3,\dots,m$, where $m$ is the number of replicates.
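The terms of the ANOVA model can be estimated from cell means in a balanced design. The sketch below uses made-up numbers for a two-level, two-factor design with $m=2$ replicates, and assumes the usual sum-to-zero constraints on the effects (an assumption that is not stated above).

```python
import numpy as np

# y[i, j, k]: factor A level i, factor B level j, replicate k (made-up numbers).
y = np.array([
    [[10.0, 12.0], [14.0, 16.0]],
    [[20.0, 22.0], [36.0, 38.0]],
])

mu = y.mean()                          # overall mean
alpha = y.mean(axis=(1, 2)) - mu       # effect of factor A at each level
beta = y.mean(axis=(0, 2)) - mu        # effect of factor B at each level
cell_means = y.mean(axis=2)            # mean over replicates per (i, j) cell
interaction = cell_means - mu - alpha[:, None] - beta[None, :]
residuals = y - cell_means[:, :, None]

print(mu)           # 21.0
print(alpha, beta)  # [-8.  8.] [-5.  5.]
print(interaction)  # [[ 3. -3.] [-3.  3.]]
```

Adding the estimated terms back together reproduces every observation, which is a quick check that the decomposition matches the model equation.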
The theoretical flow of a solid DOE has three phases: screening, optimization and a robustness check.
For the initial screening, the number of replicates m is set to 1, creating an unreplicated factorial design that tests multiple factors as fast as possible using only main effects. Once the significant factors are found, the factor levels can be optimized using a replicated factorial design. Finally, a test of robustness must be carried out to show that small fluctuations in the factors do not significantly influence the experiment. The ANOVA type model is very useful for both the screening and the optimization phase of experimentation.
The robustness check requires other types of design and will not be discussed further here (see post hoc analysis in the statistical model section; link to be added).
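The screening phase can be sketched as follows: generate an unreplicated two-level full factorial design and estimate the main effect of each factor. The function names and the response values are hypothetical, and the levels are coded as -1 (low) and +1 (high).

```python
import itertools
import numpy as np

def full_factorial(n_factors):
    """All 2**n_factors combinations of coded low (-1) and high (+1) levels."""
    return np.array(list(itertools.product((-1, 1), repeat=n_factors)))

def main_effects(design, y):
    """Mean response at the high level minus mean response at the low level."""
    y = np.asarray(y, dtype=float)
    return np.array([y[col == 1].mean() - y[col == -1].mean()
                     for col in design.T])

design = full_factorial(3)  # 8 runs, one per factor combination (m = 1)

# Made-up responses: factor 0 matters a lot, factor 1 a little, factor 2 not at all.
y = 50.0 + 4.0 * design[:, 0] + 0.5 * design[:, 1]

effects = main_effects(design, y)
print(design.shape)  # (8, 3)
print(effects)       # [8. 1. 0.]
```

In a real screening experiment the responses contain noise, so deciding which effects are significant requires a statistical test rather than a direct comparison of the numbers.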
(1) Lejeune, R., Nielsen, J. and Baron, G. V. (1995) “Morphology of Trichoderma reesei QM 9414 in submerged cultures”, Biotechnology and Bioengineering, 47(5), pp. 609–615. doi: 10.1002/bit.260470513.
(2) Spohr, A., Dam-Mikkelsen, C., Carlsen, M., Nielsen, J. and Villadsen, J. (1998) “On-line study of fungal morphology during submerged growth in a small flow-through cell”, Biotechnology and Bioengineering, 58(5), pp. 541–553. doi: 10.1002/(SICI)1097-0290(19980605)58:5<541::AID-BIT11>3.0.CO;2-E.
(3) Lejeune, R. and Baron, G. V. (1996) “Simulation of growth of a filamentous fungus in 3 dimensions”, Biotechnology and Bioengineering, 53(2), pp. 139–150. doi: 10.1002/(SICI)1097-0290(19970120)53:2<139::AID-BIT3>3.0.CO;2-P.
(4) Monod, J. (1949) “The Growth of Bacterial Cultures”, Annual Review of Microbiology, 3(1), pp. 371–394. doi: 10.1146/annurev.mi.03.100149.002103.