Revision as of 13:30, 16 October 2018

Design of Experiments

During practical experiments in the laboratory, one is often left with a large number of factors, which need to be tested in order to create a meaningful model. A model which is considered meaningful should be capable of relating experimental factors (explanatory variables) to a response variable (experimental outcome) via a process/system as shown in Figure 1.

Fig. 1:

The simplest linear response model is presented below: \begin{equation} y_{i}=\beta_{0}+\beta_{1}x_{i1}+...+\beta_{n}x_{in}+\varepsilon_{i} , \quad \varepsilon_{i} \sim N(0,\sigma^{2}) \end{equation} where $y_{i}$ is the response variable of the i’th observation, $x_{i1},...,x_{in}$ are the explanatory variables of the i’th observation, $\beta_{0},...,\beta_{n}$ are the regression coefficients and the residuals $\varepsilon_{i}$ are assumed to be independent and normally distributed with a mean of 0 and a constant variance $\sigma^{2}$.
This model will, for convenience, be written in matrix notation: \begin{equation} Y=X\beta+\varepsilon, \quad \varepsilon \sim N(0,\sigma^{2}I) \end{equation} where X is the design matrix of size N x (n+1), with N the number of observations, n the number of factors and a leading column of ones for the intercept $\beta_{0}$. Y is a vector of responses of size N x 1, $\beta$ is a vector of regression coefficients of size (n+1) x 1, $\varepsilon$ is a vector of size N x 1 containing the residual for each response, and I is the N x N identity matrix. The model matters because it is the foundation of the experimental design: it shows which factors should be tested, and that selection determines the design matrix (see later).
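As a rough illustration of the matrix form, the sketch below builds a design matrix with one row per observation and a leading column of ones, then estimates the coefficients from the normal equations $(X^{T}X)\beta = X^{T}Y$. The data, the coefficient values and the helper names (`design_matrix`, `solve`, `least_squares`) are hypothetical and chosen only so that the result can be checked; it assumes $X^{T}X$ is invertible.

```python
def design_matrix(factor_rows):
    """Prepend a column of ones (intercept) to each row of factor settings."""
    return [[1.0] + [float(v) for v in row] for row in factor_rows]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting.

    Assumes A is square and invertible.
    """
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def least_squares(X, y):
    """Estimate beta from the normal equations (X^T X) beta = X^T y."""
    p = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    return solve(XtX, Xty)


# Hypothetical data: N = 5 observations of n = 2 factors, generated
# without noise from beta = (2.0, 0.5, -1.5) so the fit can be checked.
factors = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
true_beta = [2.0, 0.5, -1.5]
y = [true_beta[0] + true_beta[1] * a + true_beta[2] * b for a, b in factors]

X = design_matrix(factors)      # 5 rows, 3 columns: N x (n + 1)
beta_hat = least_squares(X, y)  # 3 estimated coefficients
print([round(b, 6) for b in beta_hat])
```

With real, noisy measurements the estimates would only approximate the underlying coefficients, and a numerical library would normally replace the hand-written solver.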

To understand the importance of an experimental design one can look at previously employed methods. In traditional design of experiments (DOE) one would use the “change one factor at a time until no improvements can be achieved” principle (2). However, this technique does not take possible interactions between factors into account. Why this is a problem can be illustrated by our experiment on the compressive strength of the fungal bricks (insert link to the experiment; not yet written). Had we changed only one factor at a time while keeping everything else constant, a possible interaction between, for instance, burning temperature and burning time would not have been identified. A significant part of the variance would then be left unexplained, resulting in a worse fit of the compressive strength, our response variable.
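The following sketch makes the argument concrete. The response function and all its numbers are invented for illustration and are not measurements from the brick experiment. Burning temperature and burning time are coded as -1 (low) and +1 (high). A one-factor-at-a-time plan visits three of the four corners and has to assume that the two effects add up, while the fourth run of the factorial plan reveals the interaction.

```python
def response(temp, time):
    """Hypothetical noise-free response in coded units (-1 = low, +1 = high)."""
    return 10 + 2 * temp + 1 * time + 3 * temp * time


# One factor at a time: start at (low, low) and change each factor separately.
baseline = response(-1, -1)
temp_only = response(+1, -1)
time_only = response(-1, +1)

# Without an interaction, the two separate changes would simply add up
# when both factors are set high.
ofat_prediction = baseline + (temp_only - baseline) + (time_only - baseline)

# The factorial design also runs the fourth corner (high, high).
both_high = response(+1, +1)

# Interaction coefficient estimated from all four corners of the design.
interaction = (both_high - temp_only - time_only + baseline) / 4

print(ofat_prediction, both_high, interaction)
```

In this made-up case the additive prediction (4) is far from the response actually obtained at the fourth corner (16), and the gap is exactly four times the interaction coefficient.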

A general DOE that can provide the foundation of an experimental model (such as the linear model presented earlier) is often set up as a factorial design (4). With a factorial design, one can with relative ease model the entire design space of an experiment. The design works by setting every factor at either a high level or a low level and running all combinations, thus spanning the largest design space possible. A simple example of such a design with two factors can be written as an ANOVA model: \begin{equation} y_{ijk}=\mu+\alpha_{i}+\beta_{j}+(\alpha\beta)_{ij}+\varepsilon_{ijk}, \quad \varepsilon_{ijk} \sim N(0,\sigma^{2}) \end{equation}
$\mu$ is the overall mean
$\alpha_{i}$ is the effect of factor A at the i’th level
$\beta_{j}$ is the effect of factor B at the j’th level
$(\alpha\beta)_{ij}$ is the interaction between factor A at the i’th level and factor B at the j’th level
The subscript k in the response variable $y_{ijk}$ and the residuals $\varepsilon_{ijk}$ indexes the replicates, $k=1,2,3,...,m$.
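A minimal sketch of how the terms of the ANOVA model can be estimated from a balanced, replicated two-factor design is given below. The responses are hypothetical, with m = 2 replicates per factor combination, and the estimates are the usual differences between cell means, marginal means and the overall mean. This is only valid as written when every combination has the same number of replicates.

```python
from itertools import product
from statistics import mean

levels_a = (0, 1)  # factor A: 0 = low, 1 = high
levels_b = (0, 1)  # factor B: 0 = low, 1 = high

# Hypothetical responses y[(i, j)] with m = 2 replicates per combination.
y = {
    (0, 0): [9.0, 11.0],
    (1, 0): [7.0, 9.0],
    (0, 1): [5.0, 7.0],
    (1, 1): [15.0, 17.0],
}

# All factor combinations of the full factorial design.
cells = list(product(levels_a, levels_b))
cell_mean = {cell: mean(y[cell]) for cell in cells}

# Overall mean (balanced design: the mean of the cell means).
mu = mean([cell_mean[cell] for cell in cells])

# Main effects: marginal mean of a level minus the overall mean.
alpha = {i: mean([cell_mean[(i, j)] for j in levels_b]) - mu for i in levels_a}
beta = {j: mean([cell_mean[(i, j)] for i in levels_a]) - mu for j in levels_b}

# Interaction: what remains of a cell mean after removing mu and both main effects.
alpha_beta = {(i, j): cell_mean[(i, j)] - mu - alpha[i] - beta[j] for i, j in cells}

print(mu, alpha, beta, alpha_beta)
```

The differences between the replicates and their cell mean are the residuals, which is why at least two replicates per combination are needed to estimate $\sigma^{2}$ alongside the interaction.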

The theoretical flow of a solid DOE has three phases: (1) screening, (2) optimization and (3) a robustness check. For the initial screening, the number of replicates is set to 1 (so k only takes the value 1), creating an unreplicated factorial design that tests multiple factors as fast as possible using only main effects. When the significant factors have been found, the factor levels can be optimized using a replicated factorial design. Finally, a test of robustness must be carried out to show that small fluctuations in the factors do not significantly influence the experiment. The ANOVA-type model is very useful for both the screening and the optimization phase.
The robustness check requires other types of designs and will not be discussed further here (see post hoc analysis in the statistical model section; link to be added).
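The screening phase can be sketched as follows: an unreplicated two-level design for three factors is generated, one response is recorded per run, and the factors are ranked by the size of their main effects. The factor names, the `measure` function and its coefficients are hypothetical stand-ins for real laboratory measurements.

```python
from itertools import product

factor_names = ["temperature", "time", "substrate"]  # hypothetical factors


def measure(run):
    """Hypothetical stand-in for one laboratory measurement (coded levels)."""
    temperature, time, substrate = run
    return 20 + 4 * temperature + 1.5 * time + 0 * substrate


# Unreplicated two-level factorial design: 2^3 = 8 runs, one response each.
runs = list(product((-1, +1), repeat=len(factor_names)))
responses = [measure(run) for run in runs]


def main_effect(index):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for run, y in zip(runs, responses) if run[index] == +1]
    low = [y for run, y in zip(runs, responses) if run[index] == -1]
    return sum(high) / len(high) - sum(low) / len(low)


effects = {name: main_effect(i) for i, name in enumerate(factor_names)}

# Factors with the largest absolute main effect come first.
ranked = sorted(effects, key=lambda name: abs(effects[name]), reverse=True)
print(effects, ranked)
```

Because each combination is run only once, there are no replicates from which to estimate the residual variance directly, which is one reason the optimization phase moves to a replicated design.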

(1) Lejeune, R., Nielsen, J. and Baron, G. V. (1995) “Morphology of Trichoderma reesei QM 9414 in submerged cultures”, Biotechnology and Bioengineering, 47(5), pp. 609–615. doi: 10.1002/bit.260470513.

(2) Spohr, A., Dam-Mikkelsen, C., Carlsen, M., Nielsen, J. and Villadsen, J. (1998) “On-line study of fungal morphology during submerged growth in a small flow-through cell”, Biotechnology and Bioengineering, 58(5), pp. 541–553. doi: 10.1002/(SICI)1097-0290(19980605)58:5<541::AID-BIT11>3.0.CO;2-E.

(3) Lejeune, R. and Baron, G. V. (1996) “Simulation of growth of a filamentous fungus in 3 dimensions”, Biotechnology and Bioengineering, 53(2), pp. 139–150. doi: 10.1002/(SICI)1097-0290(19970120)53:2<139::AID-BIT3>3.0.CO;2-P.

(4) Monod, J. (1949) “The Growth of Bacterial Cultures”, Annual Review of Microbiology, 3(1), pp. 371–394. doi: 10.1146/annurev.mi.03.100149.002103.

@@ Line 27: / Line 27: @@
 <div class="interlabspace">
-<h2>Under the lens</h2>
 <p>
 During practical experiments in the laboratory, one is often left with a large number of factors, which need to be tested in order to create a meaningful model. A model which is considered meaningful should be capable of relating experimental factors (explanatory variables) to a response variable (experimental outcome) via a process/system as shown in Figure 1.
@@ Line 62: / Line 62: @@
 \end{equation}
+$\mu$ is the overall mean
+$\alpha_{i}$ is the effect of factor A at the i’th level
+$\beta_{j}$ is the effect of factor B at the j’th level
+$(\alpha\beta)_{ij}$ is the interaction between factor A at the i’th level and factor B at the j’th level
+The subscript k in the response variable $y_{ijk}$ and the residuals $\varepsilon_{ijk}$ indexes the replicates, $k=1,2,3,...,m$.
-</div>
-<div class="interlabspace">
-<h2>Microscopic view</h2>
-<p>
-One of our models focuses on the morphology at the hyphal level by simulating the movement of hyphal tips, branching rates, extension rates and the density levels during a growth period in two dimensions. All of the code scripts can be found on our <a target="_blank" href="https://github.com/BioBuilders2018/mycelium-simulations">GitHub repository</a>.<br><br>
-The mycelium of a fungus consists of many interwoven hyphae, and the density depends on how many filaments there are in a given location. In figure 1 below, it is possible to see microscopic pictures of fungal mycelium. Three different levels of zoom illustrate how the network looks, where it can be observed how they interlink and how a fungal filament can branch into more.
-</p>
-<p id="interwovenfilaments" style="text-align:center;"> <img src="https://static.igem.org/mediawiki/2018/f/fc/T--DTU-Denmark--model-1.png" style="max-width: 100%;" > <figcaption><p style="text-align:center; font-size:14px;"><b>Fig. 1: </b> - Snapshots of mycelium development of Aspergillus oryzae. These are representative microscopic images of how a network of intertwined hyphal filaments could look in a microscope.</p></figcaption>
-</p>
-</div>
-<div class="verticalLineright textbreather interlabspace verticalrightscience">
-<h3 class="media-heading"  style="text-align: right;margin-bottom: 35px; color:#0C233F;">Simulation of the mycelium development
-</h3>
-<p>
-Fungal growth is initiated by $n$ number of spores, and a branch will start to extend from each of the spores added to the space. Following along one of these branches originating from a single spore, the hyphae will grow in a direction $\theta$ with a tip extension $r_{tip, i}$. A branching event, in which a new branch is formed from the first branch, can occur with a probability $q$.
-Tip extension rate is calculated by using the equation below, which considers growth kinetics for the fungi and the amount of substrate available. It essentially outputs the accelerated growth dependent on the amount of substrate available, where the accelerated growth equation depends on fungal kinetics and branch lengths (1, 3).
-\begin{equation}
-	r_{tip, i} = \bigg(k_{tip,1} + k_{tip,2}\cdot \frac{l_{br,i}}{l_{br, i} + K_t}\bigg)\cdot\bigg(\frac{S}{S+K_s}\bigg)
-\end{equation}
-$k_{tip,1}$ is the initial tip extension rate of the branch and $k_{tip,2}$ is the difference between the maximum extension rate and $k_{tip,1}$ (1, 3). $K_t$ symbolizes the time it takes to reach half of the maximum extension rate.  The length of branch $i$ described by $l_{br,i}$. $S$ is the substrate concentration and $K_S$ corresponds to the substrate concentration to reach half of the maximum growth level (4).
-As the starting coordinates of this simulation are $(x_0, y_0)$  and end coordinates $(x, y)$, the length of branch $i$ can be calculated as the distance between two points:
-\begin{equation}
-l_{br, i} = \sqrt{(x-x_0)^2 + (y-y_0)^2}
-\end{equation}
-By dividing the growth area into a grid of $w\cdot h$ areas, it is possible to investigate the uptake of substrate, hyphal movement and the development of biomass through the simulation.
-<br><br>
-At the start of the simulation, it can be assumed that the initial substrate concentration $S_0$ will be distributed evenly across the grid. When the spores are placed randomly, the branches start to develop by using the substrate available around the hyphal tip. So for each branch, it is checked which tip is in which grid and whether there is any substrate available. That means that one can track the substrate depletion in each grid, and when there is no more substrate in one of the areas of the grid, the hyphal tips located there can no longer grow and can be considered as inactive hyphae. If the total area around the mycelium no longer contains substrate, the growth will simply stop.
 <br><br>
-In the same way as tracking substrate depletion, biomass production can also be viewed in a grid. But instead of calculating a loss of substrate, each new branch extension is considered a gain in biomass or the density $d$. It should show the same mechanics as the substrate depletion, as these two are directly related to each other by the equation $l_{br,i}$.
+The theoretical flow of a solid DOE has 3 phases: 1 screening, 2 optimization and 3 robustness check.
-<br><br>
+For the initial screening, the number of replicates is set to 1 (so k only takes the value 1), creating an unreplicated factorial design that tests multiple factors as fast as possible using only main effects. When the significant factors have been found, the factor levels can be optimized using a replicated factorial design. Finally, a test of robustness must be carried out to show that small fluctuations in the factors do not significantly influence the experiment. The ANOVA-type model is very useful for both the screening and the optimization phase.<br>
-As the fungi grow in size, the computational cost also increases. We were limited in running the simulation due to the power of our computers, thus resulting in us introducing two restrictions: the number of hyphae $M$ and the number of steps in the simulation $N$.
+The robustness check requires other types of designs and will not be discussed further here (see post hoc analysis in the statistical model section; link to be added).<br><br>
-</p>
@@ Line 123: / Line 80: @@
 </div>
-<div class="verticalLine textbreather interlabspace verticalleftrealyellow">
-<h3 class="media-heading" style="text-align: left;margin-bottom: 35px; color:#F8A05B;">Results</h3>
-<p>
-The simulation ran with the parameters listed in table 1 [below], which painted a detailed picture of the time history of growth from the spore to the densely branched network of hyphal filaments. The parameters are based on growth of <i>Aspergillus oryzae</i> in Spohr (2).
-</p>
-<table class="tg">
-  <tr>
-    <th class="tg-0lax">Parameter</th>
-    <th class="tg-0lax">Value</th>
-    <th class="tg-0lax">Source or rationale</th>
-  </tr>
-  <tr>
-    <td class="tg-0lax">$k_{tip,1}$</td>
-    <td class="tg-0lax">80 $\mu m\cdot tip^{-1}\cdot h^{-1}$</td>
-    <td class="tg-0lax">Taken from Spohr (2)</td>
-  </tr>
-  <tr>
-    <td class="tg-0lax">$k_{tip,2}$</td>
-    <td class="tg-0lax">75 $\mu m\cdot tip^{-1}\cdot h^{-1}$</td>
-    <td class="tg-0lax">Taken from Spohr (2)</td>
-  </tr>
-  <tr>
-    <td class="tg-0lax">$K_t$</td>
-    <td class="tg-0lax">5 $\mu m$</td>
-    <td class="tg-0lax">Estimated</td>
-  </tr>
-  <tr>
-    <td class="tg-0lax">$S_0$</td>
-    <td class="tg-0lax">50000 mg/L</td>
-    <td class="tg-0lax">Estimated</td>
-  </tr>
-  <tr>
-    <td class="tg-0lax">$K_S$</td>
-    <td class="tg-0lax">200 mg</td>
-    <td class="tg-0lax">Estimated</td>
-  </tr>
-  <tr>
-    <td class="tg-0lax">$M$</td>
-    <td class="tg-0lax">100000</td>
-    <td class="tg-0lax">Set as a simulation limit</td>
-  </tr>
-  <tr>
-    <td class="tg-0lax">$N$</td>
-    <td class="tg-0lax">5000</td>
-    <td class="tg-0lax">Set as a simulation limit</td>
-  </tr>
-  <tr>
-    <td class="tg-0lax">$q$</td>
-    <td class="tg-0lax">00.05</td>
-    <td class="tg-0lax">Estimated</td>
-  </tr>
-</table><figcaption><p style="text-align:center; font-size:14px;"><b>Table 1: </b> - Parameters in a typical simulation run. </p></figcaption>
-<p>
-The end results can be viewed in figure 2 below, where three animations are shown of the hyphal development: hyphal movement and locations, biomass development and substrate depletion. It is the same mycelium simulation in all three animations, where we can observe that the density increases as the substrate level decreases.
-</p>
-<div style="width:100%;">
-<div class="col-xs-4">
-<p style="text-align:center;"> <img src="https://static.igem.org/mediawiki/2018/9/90/T--DTU-Denmark--modelgrowth-sim1-1.gif" style="max-width: 100%;" > <figcaption><p style="text-align:center; font-size:14px;"><b>Fig. 2a: </b> - Hyphal development over time from 10 spores.</p></figcaption>
-</div>
-<div class="col-xs-4">
-<p style="text-align:center;"> <img src="https://static.igem.org/mediawiki/2018/b/b3/T--DTU-Denmark--modelgrowth-sim-2.gif" style="max-width: 100%;" > <figcaption><p style="text-align:center; font-size:14px;"><b>Fig. 2b: </b> - Density development as the hyphae grows. The darker the color the higher the density.</p></figcaption>
-</div>
-<div class="col-xs-4">
-<p style="text-align:center;"> <img src="https://static.igem.org/mediawiki/2018/f/f3/T--DTU-Denmark--modelgrowth-sim-3.gif" style="max-width: 100%;" > <figcaption><p style="text-align:center; font-size:14px;"><b>Fig. 2c: </b> - The depletion of substrate over time, as the fungus uses the substrate to extend its hyphal network. The lighter color indicates lower levels of substrate.</p></figcaption>
-</div>
-</div>
-<p>
-Figure 3 shows how that the hyphal lengths in the beginning of the growth simulation are increasing, whereas there is initially not that many hyphal tips. But as time progresses, the hyphal tips start to increase drastically due to there being more and more hyphae and the original branches being longer. The frequencies of hyphal lengths and branching events can be seen in figure 4, where the majority of the hyphal elements are very short and haven’t branched that much. Those that are longer are also those which have branched more often than the rest. The substrate level decreases as the amount of biomass increases, where this pattern can be observed in figure 5. All of the substrate is not gone, but that is due to the fact that the mycelium growth was not that efficient in using all the substrate in the given simulation time.
-</p>
-<p style="text-align:center;"> <img src="https://static.igem.org/mediawiki/2018/1/13/T--DTU-Denmark--model-growth-3.png" style="max-width: 80%;" > <figcaption><p style="text-align:center; font-size:14px;"><b>Fig. 3: </b> - The number of hyphal tips (red) and total length of all hyphae (black) over time in the simulation.
-</p></figcaption>
-<div style="width:100%;">
-<div class="col-xs-6">
-<p style="text-align:center;"> <img src="https://static.igem.org/mediawiki/2018/e/e9/T--DTU-Denmark--model-growth-4a.png" style="max-width: 100%;" > <figcaption><p style="text-align:center; font-size:14px;"><b>Fig. 4a: </b> - The frequencies of hyphal lengths, where it can be seen that the majority of the lengths are lower than $\mu$m.</p></figcaption>
-</div>
-<div class="col-xs-6">
-<p style="text-align:center;"> <img src="https://static.igem.org/mediawiki/2018/d/da/T--DTU-Denmark--model-growth-4b.png" style="max-width: 100%;" > <figcaption><p style="text-align:center; font-size:14px;"><b>Fig. 4b: </b> - The frequencies of the number of branching events per hyphae, where most of the branches only have less than 5 events and those that have more are the oldest branches in the simulation.</p></figcaption>
-</div>
-</div>
-<p style="color:white">hi</p>
-<div style="width:100%;">
-<div class="col-xs-6">
-<p style="text-align:center;"> <img src="https://static.igem.org/mediawiki/2018/9/97/T--DTU-Denmark--model-growth-5a.png" style="max-width: 100%;" > <figcaption><p style="text-align:center; font-size:14px;"><b>Fig. 2b: </b> - The increase in mycelium biomass over time, where the curve follows an exponential growth mechanism.</p></figcaption>
-</div>
-<div class="col-xs-6">
-<p style="text-align:center;"> <img src="https://static.igem.org/mediawiki/2018/7/7d/T--DTU-Denmark--model-growth-5b.png" style="max-width: 100%;" > <figcaption><p style="text-align:center; font-size:14px;"><b>Fig. 2c: </b> - The substrate level over time decreases as the mycelium develops.
-</p></figcaption>
-</div>
-</div>
-<p style="color:white">hi</p>
-</div>
-<div class="verticalLineright textbreather interlabspace verticalrightscience">
-<h3 class="media-heading"  style="text-align: right;margin-bottom: 35px; color:#0C233F;">Simulation of the mycelium development
-</h3>
-<p>
-Fungal growth is initiated by $n$ number of spores, and a branch will start to extend from each of the spores added to the space. Following along one of these branches originating from a single spore, the hyphae will grow in a direction $\theta$ with a tip extension $r_{tip, i}$. A branching event, in which a new branch is formed from the first branch, can occur with a probability $q$.
-Tip extension rate is calculated by using the equation below, which considers growth kinetics for the fungi and the amount of substrate available. It essentially outputs the accelerated growth dependent on the amount of substrate available, where the accelerated growth equation depends on fungal kinetics and branch lengths (1, 3).
-\begin{equation}
-	r_{tip, i} = \bigg(k_{tip,1} + k_{tip,2}\cdot \frac{l_{br,i}}{l_{br, i} + K_t}\bigg)\cdot\bigg(\frac{S}{S+K_s}\bigg)
-\end{equation}
-$k_{tip,1}$ is the initial tip extension rate of the branch and $k_{tip,2}$ is the difference between the maximum extension rate and $k_{tip,1}$ (1, 3). $K_t$ symbolizes the time it takes to reach half of the maximum extension rate.  The length of branch $i$ described by $l_{br,i}$. $S$ is the substrate concentration and $K_S$ corresponds to the substrate concentration to reach half of the maximum growth level (4).
-As the starting coordinates of this simulation are $(x_0, y_0)$  and end coordinates $(x, y)$, the length of branch $i$ can be calculated as the distance between two points:
-\begin{equation}
-l_{br, i} = \sqrt{(x-x_0)^2 + (y-y_0)^2}
-\end{equation}
-By dividing the growth area into a grid of $w\cdot h$ areas, it is possible to investigate the uptake of substrate, hyphal movement and the development of biomass through the simulation.
-<br><br>
-At the start of the simulation, it can be assumed that the initial substrate concentration $S_0$ will be distributed evenly across the grid. When the spores are placed randomly, the branches start to develop by using the substrate available around the hyphal tip. So for each branch, it is checked which tip is in which grid and whether there is any substrate available. That means that one can track the substrate depletion in each grid, and when there is no more substrate in one of the areas of the grid, the hyphal tips located there can no longer grow and can be considered as inactive hyphae. If the total area around the mycelium no longer contains substrate, the growth will simply stop.
-<br><br>
-In the same way as tracking substrate depletion, biomass production can also be viewed in a grid. But instead of calculating a loss of substrate, each new branch extension is considered a gain in biomass or the density $d$. It should show the same mechanics as the substrate depletion, as these two are directly related to each other by the equation $l_{br,i}$.
-<br><br>
-As the fungi grow in size, the computational cost also increases. We were limited in running the simulation due to the power of our computers, thus resulting in us introducing two restrictions: the number of hyphae $M$ and the number of steps in the simulation $N$.
-</p>
-</div>
-<div class="verticalLine textbreather interlabspace verticalleftrealyellow">
-<h3 class="media-heading" style="text-align: left;margin-bottom: 35px; color:#F8A05B;">Future development</h3>
-<p>
-The model shows how the fungi grows in 2D, so it can be interpreted to work as looking at a petri dish in detail. If the model were expanded into 3D, the result could be related to filling out a form and still studying the hyphal interactions at a very detailed level. 3D models of fungal growth do already exist, for instance see Lejeune (3) that simulates the growth of the filamentous fungi <i>Trichoderma reesei</i>.<br><br>
-There are many different factors that influence the growth kinetics and this model only includes a few of them. Parameters such as temperature or oxygen levels could be implemented to get the simulation to work more realistically.
-</p>
-</div>
 <p style="color:#000; font-size:14px;">(1)  Lejeune, R., Nielsen, J. og Baron, G. V. (1995) “Morphology of Trichoderma reesei QM 9414 in submerged cultures”, Biotechnology and Bioengineering, 47(5), s. 609–615. doi: 10.1002/bit.260470513.<br><br>

Difference between revisions of "Team:DTU-Denmark/DesignOfExperiments"
