Difference between revisions of "Team:Bielefeld-CeBiTec/Software"

 
(26 intermediate revisions by the same user not shown)
Line 34: Line 34:
  
  
 
+
function myFunction() {
 +
    var x = document.getElementById("myDIV");
 +
    if (x.style.display === "block") {
 +
        x.style.display = "none";
 +
    } else {
 +
        x.style.display = "block";
 +
    }
 +
}
 
</script>
 
</script>
  
Line 43: Line 50:
  
 
<div class="title_picture">
 
<div class="title_picture">
<img src="https://static.igem.org/mediawiki/2017/6/6c/T--Bielefeld-CeBiTec--title-img-bielefeld.jpg" style="width:100%">
+
<img src="https://static.igem.org/mediawiki/2018/0/0f/T--Bielefeld-CeBiTec--code_vk.png" style="width:100%">
 
</div>  
 
</div>  
 +
 +
 +
              <div class="sidenavi" id="side_bar">
 +
 +
<li class="side_list">
 +
<a href="#intro">siRNAs short introduction</a>
 +
</li>
 +
                        <li class="side_list">
 +
<a href="#overh">Overhangs and scaffolds</a>
 +
</li>
 +
<li class="side_list">
 +
<a href="#dm">Choosing design methods</a>
 +
</li>
 +
<li class="side_list">
 +
<a href="#rat">Rational siRNA design</a>
 +
</li>
 +
                        <li class="side_list">
 +
<a href="#ui">Ui-Tei rule</a>
 +
</li>
 +
                        <li class="side_list">
 +
<a href="#prob">Calculating silencing probability</a>
 +
</li>
 +
                        <li class="side_list">
 +
<a href="#srnai">siRNA construction</a>
 +
</li>
 +
                        <li class="side_list">
 +
<a href="#check">Check siRNA</a>
 +
</li>
 +
                        <li class="side_list">
 +
<a href="#coma">Command line application</a>
 +
</li>
 +
                        <li class="side_list">
 +
<a href="#grai">Graphical Interface usage</a>
 +
</li>
 +
                        <li class="side_list">
 +
<a href="#outlo">Outlook</a>
 +
</li>
 +
</div>
 +
  
 
<div class="container">
 
<div class="container">
Line 51: Line 98:
 
<div class="title">siRCon - A siRNA Constructor</div>
 
<div class="title">siRCon - A siRNA Constructor</div>
  
 +
<h2>Short Summary</h2>
 
<article>
 
<article>
In our project we introduce RNA interference (RNAi) and silencing with small interfering (si)RNAs as an alternative to CRISPR/Cas. To use siRNA as silencing agents for the gene-of -interest we propose a two-step design process. At first potential siRNAs for prokaryotic organisms must be designed. In the second step the silencing effect of these siRNAs can be validated by our siRNA vector system <a href"https://2018.igem.org/Team:Bielefeld-CeBiTec/siRNA">Tace.</a> To facilitate the initial siRNA design step, we developed a siRNA construction tool, which can find possible siRNAs for a given gene sequence and calculate their gene silencing probability. It consists of the three modules siRNAs for RNAi, siRNA, and check siRNA. Obtained siRNAs are perfectly compatible with our siRNA vector system. To the best of our knowledge, this is the first tool dedicated to predicting customized siRNA for the application in prokaryotes. This Python tool comes in two versions: a command line application and an easy-to-use graphical interface.
+
In our project, we introduced RNA interference (RNAi) and translation repression with small interfering RNAs (siRNAs) as an alternative to CRISPR/Cas. To use siRNAs as silencing agents for the gene of interest, we proposed a two-step design process. First, potential siRNAs for prokaryotic organisms must be designed. In the second step, the silencing effect of these siRNAs can be validated by our siRNA vector system <a href="https://2018.igem.org/Team:Bielefeld-CeBiTec/siRNA">TACE</a>. To facilitate the initial siRNA design step, we developed an siRNA construction tool which identifies possible siRNAs for a given gene sequence, calculates their probability of silencing the target gene, and returns the candidates ranked by the calculated score. It consists of three modules: "siRNAs for RNAi", "siRNA", and "check siRNA". The siRNAs predicted by our software are fully compatible with our siRNA vector system. To the best of our knowledge, this is the first tool dedicated to predicting customized siRNAs for application in prokaryotes. This Python tool comes in two versions: a command line application and an easy-to-use graphical interface.
 
</article>
 
</article>
  
<h2>siRNAS short introduction</h2>
+
<a name="intro" id="intro" class="shifted-anchor"></a>
 +
<h2>siRNAs short introduction</h2>
  
 
<article>
 
<article>
siRNAs are small single- or double-stranded RNAs with an average length of 21-25 nucleotides. They are non-coding RNAs which can bind a specific complementary coding mRNA and silence its function. During eukaryotic RNAi siRNAs are loaded to Argonaute proteins, which carry out the repression, either by blocking mRNA translation or by degrading the mRNA (Siomi and Siomi, 2009). More about the siRNAs and the mechanisms are found <a href="https://2018.igem.org/Team:Bielefeld-CeBiTec/siRNA">here.</a>
+
siRNAs are small, non-coding single-stranded RNAs with an average length of 21-25 nucleotides which bind a specific complementary coding mRNA and silence its function. In eukaryotic RNAi, siRNAs are loaded onto Argonaute proteins which carry out the repression, either by blocking mRNA translation or by degrading the mRNA (Siomi and Siomi, 2009). More detailed information on both possible siRNA mechanisms can be found <a href="https://2018.igem.org/Team:Bielefeld-CeBiTec/siRNA">here.</a>
 
</article>
 
</article>
  
 +
<a name="overh" id="overh" class="shifted-anchor"></a>
 +
<h2>siRNA design</h2>
 +
 +
<article>
 +
In order to achieve effective gene silencing or knock-down, the 19 nt binding sequence must be flanked by special, non-binding 5' and 3' extensions (Figure 1). To trigger mRNA degradation by RNase E, the 5’-terminal triphosphate of the siRNA needs to be converted to a monophosphate by RNA pyrophosphohydrolase (RppH). For the siRNA to be recognized by RppH, the 5’ end of the siRNA has to start with the tetranucleotide AGNN which is not allowed to match the targeted mRNA (Foley et al., 2015). At the 3’ end of the siRNA, the small MicC scaffold is added which facilitates the hybridization of siRNA and target mRNA and protects the siRNA from degradation (Na et al., 2013).
 +
</article>
 +
 +
<figure role="group">
 +
                      <img class="figure hundred" src="https://static.igem.org/mediawiki/2018/4/46/T--Bielefeld-CeBiTec--RNAi_scaffolds_new2.png">
 +
                      <figcaption>
 +
                          <b>Figure 1:</b> Effects of siRNA design on RNAi effectiveness and siRNA stability. <b>A</b> If the siRNA does not carry suitable 5' or 3' extensions, it is quickly degraded. <b>B</b> siRNAs extended by the tetranucleotide AGNN are recognized and processed by the pyrophosphohydrolase RppH. This enzyme converts the 5' triphosphate to a monophosphate, which greatly reduces siRNA degradation. This allows the siRNA to hybridize to its target mRNA, which in turn is degraded by RNase E, thus leading to effective mRNA silencing. <b>C</b> Extending siRNAs with a 3' MicC scaffold in addition to the 5' tetranucleotide AGNN further enhances mRNA silencing. MicC facilitates the hybridization of siRNA and target mRNA and protects the siRNA from degradation.
 +
                      </figcaption>
 +
                  </figure>
 +
 +
 +
<article>
 +
In addition to degradation-based RNAi, siRNA can also be used to block mRNAs without degradation. This is achieved by adding the outer membrane protein A (OmpA) scaffold to the 5' end of the siRNA (Figure 2), enhancing its stability. In addition, the hybridization of the siRNA and the target mRNA can be facilitated by addition of MicC to the 3' terminus.
 +
</br>
 +
Both sequence extensions are also part of our vector system, enabling efficient design and construction of effective siRNAs. If our vector system is selected when using our tool, the fitting overlaps to our vectors are added automatically. More theoretical information about the overhangs and scaffolds can be found <a href="https://2018.igem.org/Team:Bielefeld-CeBiTec/siRNA">here</a>.
 +
</article>
 +
 +
<figure role="group">
 +
                      <img class="figure sixty" src="https://static.igem.org/mediawiki/2018/4/4c/T--Bielefeld-CeBiTec--siRNA_scaffolds_new_vk_2.png">
 +
                      <figcaption>
 +
                          <b>Figure 2:</b> siRNA design for silencing translation. <b>A</b> If the siRNA does not carry suitable 5' or 3' extensions, it is quickly degraded. <b>B</b> siRNAs supplemented with the outer membrane protein A (OmpA) scaffold are more stable and effectively silence the translation of target mRNAs. <b>C</b> If the siRNA is supplemented with both the OmpA and the MicC scaffold, the repression is enhanced further. </figcaption>
 +
                  </figure>
 +
 +
 +
<a name="dm" id="dm" class="shifted-anchor"></a>
 
<h2>Choosing appropriate design methods</h2>
 
<h2>Choosing appropriate design methods</h2>
  
 
<article>
 
<article>
In 2012 the <a href="https://2012.igem.org/Team:SYSU-Software/Models#pp2">iGEM team SYSU-Software</a> integrated an siRNA cDNA designer as a small part in their project. siRNAs designed with this tool were applicable in eukaryotic organisms. They included two different design methods: Tom Tuschl’s method and Rational siRNA design.
+
In 2012, the <a href="https://2012.igem.org/Team:SYSU-Software/Models#pp2">iGEM team SYSU-Software</a> integrated an siRNA cDNA designer as a small part of their project. siRNAs designed with this tool were applicable in eukaryotic organisms. They included two different design methods: Tom Tuschl’s method and Rational siRNA design.
 +
</br>
 +
In the following text as well as in our software tool siRCon, nucleotide sequences exclusively contain the letter 'T' for the sake of simplicity. Please note that in the case of RNA, the corresponding base is uracil.
 
</article>
 
</article>
  
Line 70: Line 150:
 
                       <img class="figure sixty" src="https://static.igem.org/mediawiki/2018/c/c6/T--Bielefeld-CeBiTec--Tom_Tuschl_small_vk.png">
 
                       <img class="figure sixty" src="https://static.igem.org/mediawiki/2018/c/c6/T--Bielefeld-CeBiTec--Tom_Tuschl_small_vk.png">
 
                       <figcaption>
 
                       <figcaption>
                           <b>Figure 1:</b> Structure of an siRNA designed with Tom Tuschl's method. Both siRNA have a characteristic 'TT' overhang at the 3'-terminus.  
+
                           <b>Figure 3:</b> Structure of an siRNA designed with Tom Tuschl's method. Both siRNA strands have a characteristic 'TT' overhang at the 3'-terminus (Elbashir et al., 2001).
 
                       </figcaption>
 
                       </figcaption>
 
                   </figure>
 
                   </figure>
  
 
<article>
 
<article>
Tom Tuschl’s method focuses mainly on the existence of 5’ and 3’ ‘TT’ overhangs (Figure X)(Elbashir et al., 2001). These are not compatible with overhangs and scaffold sequences necessary for the prokaryotic mechanisms. Therefore, we decided to use the Ui-Tei rules as an alternative design method (Naito and Ui-Tei, 2012). Furthermore, we adapted the Rational siRNA design since it was more suitable for our application (Reynolds et al., 2004). Both design rules apply only to the 19nt long target binding sequence.
+
Tom Tuschl’s method focuses mainly on the existence of 5’ and 3’ ‘TT’ overhangs (Figure 3) (Elbashir <i>et al.</i>, 2001). These are not compatible with overhangs and scaffold sequences required by the prokaryotic mechanisms. Therefore, we decided to use the rules published by Ui-Tei as an alternative design method (Naito and Ui-Tei, 2012). Furthermore, we adapted the rational siRNA design as it was more suitable for our application (Reynolds <i>et al.</i>, 2004). Both design rules apply only to the 19 nt long target binding sequence.
 
</article>
 
</article>
  
 
+
<a name="rat" id="rat" class="shifted-anchor"></a>
 
<h2>Rational siRNA design</h2>
 
<h2>Rational siRNA design</h2>
  
 
<article>
 
<article>
By a systematic analysis of 180 eukaryotic siRNAs Reynolds et al. identified eight criteria that are important for ther functionality (Reynolds et al., 2004). Each criterion gets a sciore that can be either positive or negative, corresponding to its effect on the siRNA. All siRNA candidates that have a score above six are potential high functional siRNAs.  
+
By a systematic analysis of 180 eukaryotic siRNAs, Reynolds <i>et al.</i> identified eight criteria that are important for their functionality (Reynolds <i>et al.</i>, 2004). Each criterion gets a score that is either positive or negative, corresponding to its effect on the siRNA. All siRNA candidates with a score above six are potential highly functional siRNAs.  
 
</article>
 
</article>
  
 
<table id="t01" class="centern" style="margin-top:30px; margin-bottom:30px;">
 
<table id="t01" class="centern" style="margin-top:30px; margin-bottom:30px;">
  <caption style="line-height:1.5; text.align:left;"><b>Table 1:</b>Rational siRNA design criteria with corresponding score</caption>
+
  <caption style="line-height:1.5; text.align:left;"><b>Table 1:</b> Rational siRNA design criteria with corresponding scores (Reynolds et al., 2004)</caption>
 
  <tr>
 
  <tr>
 
<th>Rule</th>
 
<th>Rule</th>
Line 96: Line 176:
 
  </tr>
 
  </tr>
 
  <tr>
 
  <tr>
<td>At least 3 'A/U' bases at positions 15-19</td>
+
<td>At least 3 'W' ('A' or 'T') at positions 15-19</td>
<td>+1 (for each 'A/U' base)</td>
+
<td>+1 (for each 'A' or 'T')</td>
 
  </tr>
 
  </tr>
 
  <tr>
 
  <tr>
Line 104: Line 184:
 
  </tr>
 
  </tr>
 
  <tr>
 
  <tr>
<td>An 'A' base at position 3</td>
+
<td>An 'A' at position 3</td>
 
<td>+1</td>
 
<td>+1</td>
 
  </tr>
 
  </tr>
 
  <tr>
 
  <tr>
<td>An 'A' base at position 19</td>
+
<td>An 'A' at position 19</td>
 
<td>+1</td>
 
<td>+1</td>
 
  </tr>
 
  </tr>
 
  <tr>
 
  <tr>
<td>An 'U' base at position 19</td>
+
<td>A 'T' at position 19</td>
 
<td>+1</td>
 
<td>+1</td>
 
  </tr>
 
  </tr>
 
  <tr>
 
  <tr>
<td>A base other than 'G' or 'C' at 19</td>
+
<td>An 'A' or 'T' at position 19</td>
 
<td>-1</td>
 
<td>-1</td>
 
  </tr>
 
  </tr>
 
  <tr>
 
  <tr>
<td>A base other than 'G' at position 13</td>
+
<td>An 'A', 'C' or 'T' at position 13</td>
 
<td>-1</td>
 
<td>-1</td>
 
  </tr>
 
  </tr>
Line 127: Line 207:
  
 
<article>
 
<article>
The melting Temperature Tm is calculated as followed (Kibbe, 2007):
+
The melting temperature Tm is calculated as follows (Kibbe, 2007):
  
 
$$ T_m = 79.8 + (18.5 * log_{10}[Na^+]) + (58.4 * [\text{G/C content}]) \\+ (11.8 * [\text{G/C content}]^2) - \left(\frac{820}{\text{[G/C content]}}\right)$$
 
$$ T_m = 79.8 + (18.5 * log_{10}[Na^+]) + (58.4 * [\text{G/C content}]) \\+ (11.8 * [\text{G/C content}]^2) - \left(\frac{820}{\text{[G/C content]}}\right)$$
Line 134: Line 214:
 
</article>
 
</article>
  
 
+
<a name="ui" id="ui" class="shifted-anchor"></a>
 
<h2>Ui-Tei rule</h2>
 
<h2>Ui-Tei rule</h2>
  
 
<article>
 
<article>
Ui-Tei et al analyzed 62 eukaryotic siRNAs and identified four design rules for effective siRNAs (Ui-Tei, 2004). Only siRNAs fulfilling all four criteria are considered functional siRNAs.
+
Ui-Tei <i>et al.</i> analyzed 62 eukaryotic siRNAs and identified four design rules for effective siRNAs (Ui-Tei, 2004). Only siRNAs fulfilling all four criteria are considered functional siRNAs.
 
</article>
 
</article>
  
Line 144: Line 224:
 
<li>An ‘A’ or ‘T’ at position 19</li>
 
<li>An ‘A’ or ‘T’ at position 19</li>
 
<li>A ‘G’ or ‘C’ at position 1</li>
 
<li>A ‘G’ or ‘C’ at position 1</li>
<li>At least five ‘U’ or ‘A’ residues from positions 13 to 19</li>
+
<li>At least five ‘T’ or ‘A’ residues from positions 13 to 19</li>
<li>No ‘GC’ stretch more than 9nt long</li>
+
<li>No ‘GC’ stretch more than 9 nt long</li>
 
</ol>
 
</ol>
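The four rules listed above can be sketched as a small Python check. This is only an illustrative sketch, not the code used in siRCon: the function name <code>passes_ui_tei</code> is hypothetical, positions are assumed to be counted from 1 at the 5' end of the 19 nt sequence passed in, and a 'GC' stretch is assumed to mean a run of consecutive 'G' or 'C' bases.

```python
def passes_ui_tei(binding_seq):
    """Return True if a 19 nt sequence fulfils all four rules listed above.

    Hypothetical helper for illustration; positions are 1-based in the
    text and therefore shifted by one in the Python indices below.
    """
    seq = binding_seq.upper().replace("U", "T")  # the text uses 'T' throughout
    if len(seq) != 19 or set(seq) - set("ACGT"):
        raise ValueError("expected a 19 nt sequence of A, C, G and T")

    rule_end = seq[18] in "AT"      # 'A' or 'T' at position 19
    rule_start = seq[0] in "GC"     # 'G' or 'C' at position 1
    # at least five 'A' or 'T' residues from positions 13 to 19
    rule_at_rich = sum(base in "AT" for base in seq[12:19]) >= 5

    # longest run of consecutive 'G'/'C' bases (assumed meaning of 'GC' stretch)
    longest = run = 0
    for base in seq:
        run = run + 1 if base in "GC" else 0
        longest = max(longest, run)
    rule_gc_stretch = longest <= 9  # no stretch longer than 9 nt

    return rule_end and rule_start and rule_at_rich and rule_gc_stretch
```

Since only sequences that fulfil all four criteria are considered functional, the function returns a single boolean rather than a score.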
  
 +
<a name="prob" id="prob" class="shifted-anchor"></a>
 
<h2>Calculating silencing probability</h2>
 
<h2>Calculating silencing probability</h2>
  
 
<article>
 
<article>
Not only the sequences of possible effective siRNAs are to be determined and returned by the tool, but also the probability with which they are effective. This probability can be calculated with the help of Bayes’ theorem by calculating probabilities of dependent events. The following calculations and formular are based on Takasaki (2009).
+
Our software siRCon should not only report the sequences of potentially effective siRNAs, but also rank them by the probability that they are effective. This probability is calculated with the help of Bayes’ theorem from the probabilities of dependent events. The following calculations and formulas are based on Takasaki (2009).
 
</br>
 
</br>
The initial hypothesis is that the given siRNA effectively silences an mRNA. To perform the calculations a prior probability is necessary. The prior probability for effective gene silencing of mammalian genes can be obtained from former siRNA experiments and is approximately 0.1 (Takasaki, 2009). Since we have no data on prokaryotic siRNAs, we use the same prior probability for our prediction. </br>
+
The initial hypothesis is that the given siRNA effectively silences an mRNA. To perform the calculations, a prior probability is necessary. The prior probability for effective gene silencing of mammalian genes can be obtained from former siRNA experiments and is approximately 0.1 (Takasaki, 2009). Since we have no data on prokaryotic siRNAs, we use the same prior probability for our predictions. </br>
The gene silencing probability can be described as:
+
The gene silencing probability \(P(eff|X)\) is described as:
  
  
  
$$ P(eff|X) = \frac{P^{eff} P(X|eff)}{P(X)}  \qquad (1)$$   
+
$$ P(eff|X) = \frac{P^{eff} P(X|eff)}{P^{eff} P(X|eff) + P^{inf} P(X|inf)}  \qquad (1)$$   
  
 
+
The 19 nt siRNA binding sequence is represented by X, where \(x_i^n\) corresponds to the base adenine, guanine, cytosine or uracil (index 1&le;n&le;4) at sequence position i. The probabilities P(X|eff) and P(X|inf) are calculated based on prior knowledge about siRNA sequences that were shown to be effective or ineffective, respectively, in silencing their target mRNAs. Based on the analysis of 833 effective and 847 ineffective siRNAs, Takasaki et al. determined the likelihood with which base n occurs at position i in effective and ineffective siRNA sequences, represented by the coefficients \(q_{x_i^n}^{eff}\) and \(q_{x_i^n}^{inf}\), respectively (Takasaki, 2009). These coefficients are often referred to as the frequency ratios of n at position i.
 
+
\(P^{eff}\) is the prior probability 0.1 as mentioned above. The siRNA sequence is represented by \(X\), where \(X_1, X_2 ... X_n\) belong to the possible nucleotides adenine, guanine, cytosine and thymine. As \(P(X|eff)\) is the probability, that the given siRNA sequence will effectively silence if the nucleotides belong to the frequent nucleotides of common effective siRNAs, it is computed as the product of the probabilities that a particular nucleotide is located at a particular position of the siRNA:
+
$$ P(X|eff) = \prod_{i=1}^{19} q_{x_i^n}^{eff} \qquad (2)$$
+
 
</article>
 
</article>
  
 
<article>
 
<article>
\(q_{x_i^n}^{eff}\) indicates how likely the occurrence of base \(n\) is at position \(i\) based on known effective siRNAs. It can also be called frequency ratio of \(n\) at position \(i\). The last element \(P(X)\) of formula \((1)\) is the possibility that \(X\) will effectively silence the target sequence. It is the sum of the probability that \(X\) is effective if its nucleotides are found in effective siRNAs plus the probability that \(X\) is effective if its nucleotides are found in ineffective siRNAs. Both probabilities are weighted with the prior probabilities \(P^{eff}\) and \(P^{inf} = 1-P^{eff}\).
+
\(P(X|eff)\) and \(P(X|inf)\) are computed as the product of the frequency ratios for each base n at position i in the siRNA binding sequence:
$$ P(X) = P^{eff} P(X|eff)+P^{inf} P(X|inf) \qquad  (3)$$
+
</article>
+
  
<article>
+
$$ P(X|eff) = \prod_{i=1}^{19} q_{x_i^n}^{eff} \qquad  (2)$$
\(P(X|inf)\) is calculated similar to \(P(X|eff)\) and is the probability that \(X\) will effectively silence if the nucleotides belong to the frequent nucleotides of common ineffective siRNAs.
+
$$ P(X|inf) = \prod_{i=1}^{19} q_{x_i^n}^{inf} \qquad  (3)$$
 
+
$$ P(X|inf) = \prod_{i=1}^{19} q_{x_i^n}^{inf} \qquad  (4)$$
+
 
</article>
 
</article>
 
  
  
 
<article>
 
<article>
In this case, \(q_{x_i^n}^{eff}\) indicates how likely the occurrence of base \(n\) is at position \(i\) based on known ineffective siRNAs.
+
Both probabilities are weighted with their prior probabilities, \(P^{eff}\) and \(P^{inf} = 1-P^{eff}\), where \(P^{eff}\) is set to 0.1 as mentioned previously. With all defined formulas (1), (2) and (3), the gene silencing probability \(P(eff|X)\) is calculated as follows:
</article>
+
 
+
<article>
+
With all defined formulas \((2)\),\((3)\) and \((4)\), formula \((1)\) can now be calculated as follows:
+
  
 
$$P(eff|X) = \frac{P^{eff} P(X|eff)}{P^{eff} P(X|eff)+P^{inf} P(X|inf)} \\\\= \frac{P^{eff} \prod_{i=1}^{19} q_{x_i^n}^{eff}}{P^{eff} \prod_{i=1}^{19} q_{x_i^n}^{eff}+P^{inf} \prod_{i=1}^{19} q_{x_i^n}^{inf}} $$
 
$$P(eff|X) = \frac{P^{eff} P(X|eff)}{P^{eff} P(X|eff)+P^{inf} P(X|inf)} \\\\= \frac{P^{eff} \prod_{i=1}^{19} q_{x_i^n}^{eff}}{P^{eff} \prod_{i=1}^{19} q_{x_i^n}^{eff}+P^{inf} \prod_{i=1}^{19} q_{x_i^n}^{inf}} $$
 
 
</article>
 
</article>
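The final formula above translates almost directly into code. The following Python sketch is an illustration under stated assumptions, not the siRCon implementation: the function name <code>silencing_probability</code> and the table layout (one dictionary per position, mapping each base to its frequency ratio) are hypothetical, each position is looked up on its own, and the real frequency ratios would have to be taken from Takasaki (2009).

```python
def silencing_probability(seq, q_eff, q_inf, prior_eff=0.1):
    """Compute P(eff|X) for a 19 nt binding sequence.

    q_eff[i][base] and q_inf[i][base] are assumed to hold the frequency
    ratio of `base` at position i + 1 in effective and ineffective
    siRNAs (hypothetical table layout, values not included here).
    """
    seq = seq.upper().replace("U", "T")
    if len(seq) != 19:
        raise ValueError("expected a 19 nt binding sequence")

    p_x_eff = 1.0  # P(X|eff), formula (2)
    p_x_inf = 1.0  # P(X|inf), formula (3)
    for i, base in enumerate(seq):
        p_x_eff *= q_eff[i][base]
        p_x_inf *= q_inf[i][base]

    prior_inf = 1.0 - prior_eff  # P^inf = 1 - P^eff
    numerator = prior_eff * p_x_eff
    denominator = numerator + prior_inf * p_x_inf
    if denominator == 0.0:
        raise ValueError("sequence has zero probability under both tables")
    return numerator / denominator
```

If both tables assign the same ratios to every base, the sequence carries no information and the result equals the prior probability of 0.1, which is a convenient sanity check for any table that is plugged in.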
  
 
<article>
 
<article>
In order to actually calculate the silencing probability, only the frequency ratios \(q_{x_i^n}^{eff}\) and \(q_{x_i^n}^{inf}\) of the individual nucleotides at positions 1 to 19 are missing. These could be taken from the same publication from Takasaki as the calculations.  
+
To actually calculate the silencing probability, only the frequency ratios \(q_{x_i^n}^{eff}\) and \(q_{x_i^n}^{inf}\) of the individual nucleotides at positions 1 to 19 are still required. These were taken from the same publication as the calculations (Takasaki, 2009).
 
</article>
 
</article>
  
 
<article>
 
<article>
For the frequency ratios 833 effective and 847 ineffective siRNAs from previous publications were analyzed. For each nucleotide, the probability of occurrence was determined for each position of the siRNA. Different models were taken into account in the calculation. First of all, the occurrence of the different nucleotides at positions 1 to 19 can be considered independently. The probabilities for each position are then calculated independently. However, the occurrences of the nucleotides can also be considered dependently. This means the occurrence of a nucleotide depends on the nucleotide at the position before. For the calculation of dependent probabilities, the Simple Markow Model was used. It has been found that the resulting silence probability is most accurate when the frequency ratios of the effective siRNAs are calculated dependent and the frequency ratios of the ineffective siRNAs are calculated independent. All frequency ratios can be looked up <a href="https://static.igem.org/mediawiki/2018/5/51/T--Bielefeld-CeBiTec--frequency_ratios_vk.pdf" style="padding-right:0; margin-right:0;">here</a>.
+
For each nucleotide, the probability of occurrence was determined for each position of the siRNA. Different models were taken into account in the calculation. First of all, the occurrences of the different nucleotides at positions 1 to 19 can be considered independent of each other. The probabilities for each position are then calculated independently. However, the occurrences of the nucleotides can also be considered dependent on each other. This means that the occurrence of a nucleotide depends on the nucleotide at the preceding position. For the calculation of dependent probabilities, a simple Markov model was used. The resulting silencing probability was found to be most accurate when the frequency ratios of the effective siRNAs are calculated dependently and the frequency ratios of the ineffective siRNAs are calculated independently. All frequency ratios can be looked up
 +
<a href="https://static.igem.org/mediawiki/2018/5/51/T--Bielefeld-CeBiTec--frequency_ratios_vk.pdf" style="padding-right:0; margin-right:0;">here</a>.
 
</br>
 
</br>
Together with the frequency ratios it is now possible to calculate the silencing probability for the 19 bp long binding site of siRNAs.  
+
In combination with the frequency ratios, it is now possible to calculate the silencing probability for the 19 nt long binding site of siRNAs.
 
</article>
 
</article>
  
<h2>siRNA overhangs and scaffolds</h2>
 
  
<article>
 
In order to achieve effective gene silencing or knockdown, the 19 nt binding sequence must be supplemented with overhangs. There are different sequences that can be added to the binding sequence for different functionalities.  </article>
 
  
<figure role="group">
 
                      <img class="figure sixty" src="https://static.igem.org/mediawiki/2018/1/14/T--Bielefeld-CeBiTec--RNAI_scaffold_vk.png">
 
                      <figcaption>
 
                          <b>Figure 2:</b> Possible siRNA constructs for RNAi.
 
                      </figcaption>
 
                  </figure>
 
<article>
 
In Figure 2, the scheme of the siRNAs after cunstruction is shown. To trigger the mRNA degradation by the RNase E, the 5’-terminal triphosphate of the siRNA is converted to a monophosphate by the RNA pyrophosphohydrolase (RppH). For the siRNA to be recognized by the RppH, the 5’ end of the siRNA have to start with the nucleotides adenine and guanine. Furthermore, the nucleotides at position three and four are not allowed to match with the target mRNA(Foley et al., 2015). At the 3’ end of the siRNA the small MicC scaffold is added, which facilitates the hybridization of siRNA and target mRNA and protects the siRNA from degradation (Na et al., 2013).
 
</article>
 
  
<figure role="group">
 
                      <img class="figure sixty" src="https://static.igem.org/mediawiki/2018/5/57/T--Bielefeld-CeBiTec--siRNA_scaffolds_vk.png">
 
                      <figcaption>
 
                          <b>Figure 3:</b> Possible siRNA constructs for gene silencing.
 
                      </figcaption>
 
                  </figure>
 
  
 +
 +
 +
 +
 +
 +
 +
 +
 +
 +
<a name="srnai" id="srnai" class="shifted-anchor"></a>
 +
<h2>siRNA selection for RNAi and repression of translation</h2>
 
<article>
 
<article>
Figure 3 shows the scheme of a siRNA that should only silence the mRNA target. To achieve a higher stability of the siRNA, the outer membrane protein (Omp) A scaffold is added at the 5’ end. In addition, the hybridization of the siRNA and the target mRNA should be facilitated by MicC again.
+
The procedures of siRNA selection for both mechanisms, RNAi and repression of translation, are very similar. Thus, the first two modules, RNAi and siRNA, work in much the same way. First, the mRNA binding sequence is determined using the rational design and the Ui-Tei rules. In the next step, the silencing probability is determined. Finally, the corresponding overhangs and scaffolds are added to the 19 nt long binding sequence to form the mature siRNA.
</br>
+
 
These overhang and scaffold sequences are also part of our vector system. If our vector system is selected when using our tool, the fitting overlaps to our vectors are added automatically. More theoretical information about the overhangs and scaffolds can be found <a href="">here</a>.
+
 
</article>
 
</article>
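The first two selection steps described above (filtering by design rules, then ranking by silencing probability) can be outlined as follows. This is a rough sketch under assumptions and does not reproduce siRCon itself: the function name <code>rank_candidates</code> is hypothetical, candidates are assumed to be all 19 nt windows of the input sequence, the rule check and the scoring function are passed in as parameters, and strand handling as well as the addition of overhangs and scaffolds are left out.

```python
def rank_candidates(gene_seq, passes_rules, score, length=19):
    """Return candidate windows that pass the rules, best score first.

    `passes_rules` and `score` are caller-supplied functions taking a
    window string (hypothetical interface for illustration).
    """
    seq = gene_seq.upper().replace("U", "T")
    # every window of the requested length, without duplicates
    windows = {seq[i:i + length] for i in range(len(seq) - length + 1)}
    kept = [w for w in windows if passes_rules(w)]
    # highest score first; ties are broken alphabetically for stable output
    return sorted(kept, key=lambda w: (-score(w), w))
```

Passing the checks and the scoring as functions keeps the outline independent of which design method is chosen.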
  
 +
<a name="check" id="check" class="shifted-anchor"></a>
 
<h2>Check siRNA</h2>
 
<h2>Check siRNA</h2>
 
<article>
 
<article>
Beside the construction of siRNAs, we also implemented a check siRNA functionality. For a given target sequence and a corresponding siRNA it is checked whether the siRNA might bind to its target and how well it fulfills the described criteria’s. Furthermore, its silence effectivity is calculated.
+
Besides the selection of siRNAs, we also implemented a functionality to check siRNAs derived by other methods. For a given target sequence and a corresponding siRNA, it is checked whether the siRNA might bind to its target and how well it fulfills the described criteria. Furthermore, its silencing efficiency is calculated.
 
</article>
 
</article>
  
 +
<a name="coma" id="coma" class="shifted-anchor"></a>
 
<h2>Command line application</h2>
 
<h2>Command line application</h2>
  
<article>
+
<div class="article">
The command line application can be obtained directly <a href="https://static.igem.org/mediawiki/2018/9/9a/T--Bielefeld-CeBiTec--siRCon_1_1_all_versions.zip" style="padding-right:0;">here</a> or downloaded from our <a href="https://github.com/iGEMBielefeldCeBiTec/iGEM_Bielefeld_CeBiTec_2018/releases" style="padding-right:0;">GitHub repository.</a> For the execution of this command line application Python 2.7 needs to be installed.
+
The command line application can be obtained directly <a href="https://static.igem.org/mediawiki/2018/9/9a/T--Bielefeld-CeBiTec--siRCon_1_1_all_versions.zip" style="padding-right:0;">here</a> or downloaded from our <a href="https://github.com/iGEMBielefeldCeBiTec/iGEM_Bielefeld_CeBiTec_2018/releases" style="padding-right:0;">GitHub&nbsp;repository.</a> To run the command line application, Python 2.7 needs to be installed.
</article>
+
</div>
  
 
<figure role="group">
 
<figure role="group">
                       <img class="figure seventy" src="https://static.igem.org/mediawiki/2018/c/c5/T--Bielefeld-CeBiTec--help_commandline_vk.png" style="width:120%">
+
                       <img class="figure seventy" src="https://static.igem.org/mediawiki/2018/c/c5/T--Bielefeld-CeBiTec--help_commandline_vk.png" style="width:100%">
 
                       <figcaption style="padding-top:10px;">
 
                       <figcaption style="padding-top:10px;">
                           <b>Figure 4:</b> Instruction on how to use the command line application.
+
                           <b>Figure 4:</b> Help message on how to use the command line application.
 
                       </figcaption>
 
                       </figcaption>
 
                   </figure>
 
                   </figure>
  
 
<article>
 
<article>
An overview of the necessary and optional arguments gives Figure 4. For more information a README is available in our repository. Figure X shows the execution of the tool with a GFP gene sequence.
+
When run without input, the application displays a help message listing the mandatory and optional input parameters (Figure 4). For more information, a README is available in our repository.
All resulting siRNAs are saved in one FASTA file. This simplifies the integration into different workflows. For example, it is possible to test the siRNAs on off-target bindings site using Blast. An exemplary call of the application can be seen in Figure 5.
+
All resulting siRNAs are saved in one FASTA file. This simplifies the integration into different workflows. For example, it is possible to test the siRNAs for off-target binding sites using BLAST. An exemplary call of the application as well as the results returned can be seen in Figure 5.
 
</article>
 
</article>
  
 
<figure role="group">
 
<figure role="group">
                       <img class="figure seventy" src="https://static.igem.org/mediawiki/2018/c/c5/T--Bielefeld-CeBiTec--help_commandline_vk.png" style="width:120%">
+
                       <img class="figure seventy" src="https://static.igem.org/mediawiki/2018/5/58/T--Bielefeld-CeBiTec--siRCon_ausgabe_vk.svg" style="width:100%">
 
                       <figcaption style="padding-top:10px;">
 
                       <figcaption style="padding-top:10px;">
                           <b>Figure 5:</b> Exemplaric call of the command line application.
+
                           <b>Figure 5:</b> Exemplary call and results of the command line application using a GFP gene sequence as input.
 
                       </figcaption>
 
                       </figcaption>
 
                   </figure>
 
                   </figure>
  
 +
<a name="grai" id="grai" class="shifted-anchor"></a>
 
<h2>Graphical Interface usage</h2>
 
<h2>Graphical Interface usage</h2>
  
<article>
+
<div class="article">
As the command line application, the graphical interface version can either be downloaded directly <a href="https://static.igem.org/mediawiki/2018/9/9a/T--Bielefeld-CeBiTec--siRCon_1_1_all_versions.zip" style="padding-right:0;">here</a>, or via our <a href="https://github.com/iGEMBielefeldCeBiTec/iGEM_Bielefeld_CeBiTec_2018/releases" style="padding-right:0; margin-right:0;">GitHub repository.</a>
+
Like the command line application, the graphical interface version can either be downloaded directly <a href="https://static.igem.org/mediawiki/2018/9/9a/T--Bielefeld-CeBiTec--siRCon_1_1_all_versions.zip" style="padding-right:0;">here</a>, or via our <a href="https://github.com/iGEMBielefeldCeBiTec/iGEM_Bielefeld_CeBiTec_2018/releases" style="padding-right:0; margin-right:0;">GitHub&nbsp;repository.</a>
In the graphical interface, the modules are divided into different tabs (Figure 6). The last tab contains usage and copyright information.
+
In the graphical interface, the modules are accessible via tabs (Figure 6). The last tab contains usage and copyright information.
</article>
+
</div>
  
 
<figure role="group">
 
<figure role="group">
                       <img class="figure sixty" src="https://static.igem.org/mediawiki/2018/1/1f/T--Bielefeld-CeBiTec--tabs_siRCon_vk.png" style="width:120%">
+
                       <img class="figure sixty" src="https://static.igem.org/mediawiki/2018/1/1f/T--Bielefeld-CeBiTec--tabs_siRCon_vk.png" style="width:100%">
 
                       <figcaption style="padding-top:10px;">
 
                       <figcaption style="padding-top:10px;">
                           <b>Figure 6:</b> The different modules are diveded into different tabs.
+
                           <b>Figure 6:</b> The different modules are accessible via tabs.
 
                       </figcaption>
 
                       </figcaption>
 
                   </figure>
 
                   </figure>
  
  
<h2>1. siRNA for RNAi</h2>
+
<h3>Tab 1: siRNA for RNAi</h3>
  
<ol>
+
<ol style="font-size:16px; line-height:1.5em; padding-left:5%; padding-bottom:10px;">
 
<li>Insert gene sequence</li>
 
<li>Insert gene sequence</li>
<li>Choose Tace vector system (optionally)</li>
+
<li>Choose TACE vector system (optionally)</li>
 
<li>Constructions of siRNAs</li>
 
<li>Constructions of siRNAs</li>
 
<li>View resulting siRNAs (sense and antisense sequence) and their corresponding probability</li>
 
<li>View resulting siRNAs (sense and antisense sequence) and their corresponding probability</li>
<li>Decide if siRNAs should be saved with MicC scaffold (only if Tace is not used)</li>
+
<li>Decide if siRNAs should be saved with MicC scaffold (only if TACE is not used)</li>
 
<li>Save results as FASTA file</li>
 
<li>Save results as FASTA file</li>
 
</ol>
 
</ol>
  
 
<figure role="group">
 
<figure role="group">
                       <img class="figure hundred" src="https://static.igem.org/mediawiki/2018/9/92/T--Bielefeld-CeBiTec--RNAi_overview_vk.png" style="width:120%">
+
                       <img class="figure hundred" src="https://static.igem.org/mediawiki/2018/9/92/T--Bielefeld-CeBiTec--RNAi_overview_vk.png" style="width:100%">
 
                       <figcaption style="padding-top:10px;">
 
                       <figcaption style="padding-top:10px;">
 
                           <b>Figure 7:</b> Overview and steps of the siRNA for RNAi module.
 
                           <b>Figure 7:</b> Overview and steps of the siRNA for RNAi module.
Line 291: Line 355:
 
                   </figure>
 
                   </figure>
  
<h2>2. siRNA for silencing</h2>
+
<h3>Tab 2: siRNA for silencing</h3>
  
<ol>
+
<ol style="font-size:16px; line-height:1.5em; padding-left:5%; padding-bottom:10px;">
 
<li>Insert gene sequence</li>
 
<li>Insert gene sequence</li>
<li>Choose Tace vector system (optionally)</li>
+
<li>Choose TACE vector system (optionally)</li>
 
<li>Constructions of siRNAs</li>
 
<li>Constructions of siRNAs</li>
 
<li>View resulting siRNAs (sense and antisense sequence) and their corresponding probability</li>
 
<li>View resulting siRNAs (sense and antisense sequence) and their corresponding probability</li>
<li>Decide if siRNAs should be saved with MicC scaffold (only if Tace is not used)</li>
+
<li>Decide if siRNAs should be saved with MicC scaffold (only if TACE is not used)</li>
<li>Decide if siRNAs should be saved with OmpA scaffold (only if Tace is not used)</li>
+
<li>Decide if siRNAs should be saved with OmpA scaffold (only if TACE is not used)</li>
 
<li>Save results as FASTA file</li>
 
<li>Save results as FASTA file</li>
 
</ol>
 
</ol>
  
 
<figure role="group">
 
<figure role="group">
                       <img class="figure hundred" src="https://static.igem.org/mediawiki/2018/3/33/T--Bielefeld-CeBiTec--siRNAmodule_Tool_overview_vk.png" style="width:120%">
+
                       <img class="figure hundred" src="https://static.igem.org/mediawiki/2018/3/33/T--Bielefeld-CeBiTec--siRNAmodule_Tool_overview_vk.png" style="width:100%">
 
                       <figcaption style="padding-top:10px;">
 
                       <figcaption style="padding-top:10px;">
 
                           <b>Figure 8:</b> Overview and steps of the siRNA for silencing module.
 
                           <b>Figure 8:</b> Overview and steps of the siRNA for silencing module.
Line 310: Line 374:
 
                   </figure>
 
                   </figure>
  
<h2>3. Check siRNA</h2>
+
<h2>Tab 3: Check siRNA</h2>
  
<ol>
+
<ol style="font-size:16px; line-height:1.5em; padding-left:5%; padding-bottom:10px;">
 
<li>Insert gene sequence</li>
 
<li>Insert gene sequence</li>
 
<li>Insert siRNA sequences</li>
 
<li>Insert siRNA sequences</li>
 
<li>Choose method the siRNA was constructed for (siRNA for RNAi or siRNA for silencing)</li>
 
<li>Choose method the siRNA was constructed for (siRNA for RNAi or siRNA for silencing)</li>
<li>Choose if siRNA was constructed for Tace (optionally)</li>
+
<li>Choose if siRNA was constructed for TACE (optionally)</li>
 
<li>Validation of entered siRNA for given target gene sequences</li>
 
<li>Validation of entered siRNA for given target gene sequences</li>
 
<li>View results</li>
 
<li>View results</li>
Line 323: Line 387:
  
 
<figure role="group">
 
<figure role="group">
                       <img class="figure hundred" src="https://static.igem.org/mediawiki/2018/7/74/T--Bielefeld-CeBiTec--check_siRNA_vk.png" style="width:120%">
+
                       <img class="figure hundred" src="https://static.igem.org/mediawiki/2018/7/74/T--Bielefeld-CeBiTec--check_siRNA_vk.png" style="width:100%">
 
                       <figcaption style="padding-top:10px;">
 
                       <figcaption style="padding-top:10px;">
 
                           <b>Figure 9:</b> Overview and steps of the check siRNA module.
 
                           <b>Figure 9:</b> Overview and steps of the check siRNA module.
Line 330: Line 394:
  
  
 
+
<a name="outlo" id="outlo" class="shifted-anchor"></a>
 
<h2>Outlook</h2>
 
<h2>Outlook</h2>
  
 
<article>
 
<article>
To help future iGEM teams to control gene expression, we developed siRCon, a bioinformatic application for generation of high-fidelity siRNA sequences in prokaryotic organisms. We introduce this method as an alternative to CRISPR/Cas, since it is open source and free of charge.
+
To help future iGEM teams to control gene expression, we developed siRCon, a bioinformatic application to generate high-fidelity siRNA sequences in prokaryotic organisms. We introduce this method as an alternative to CRISPR/Cas, since it is open source and free of charge.
 
In the future, further improvements and extensions of this application are intended. On the one hand, eukaryotic siRNAs will also be constructed, so that siRCon becomes a universal tool for siRNAs. On the other hand, we want to improve the already existing features, especially the check siRNA functionality.
 
In the future, further improvements and extensions of this application are intended. On the one hand, eukaryotic siRNAs will also be constructed, so that siRCon becomes a universal tool for siRNAs. On the other hand, we want to improve the already existing features, especially the check siRNA functionality.
 
</article>
 
</article>
Line 358: Line 422:
  
 
<b>Takasaki, S. (2009).</b> Selecting effective siRNA target sequences by using Bayes’ theorem. Computational Biology and Chemistry 33: 368–372. <br>
 
<b>Takasaki, S. (2009).</b> Selecting effective siRNA target sequences by using Bayes’ theorem. Computational Biology and Chemistry 33: 368–372. <br>
 +
 +
<b>Ui-Tei, K., Naito, Y., Takahashi, F., Haraguchi, T., Ohki-Hamazaki, H., Juni, A., Ueda, R. and Saigo, K. (2004).</b> Guidelines for the selection of highly effective siRNA sequences for mammalian and chick RNA interference. Nucleic Acids Res. 32: 936-948. <br>
  
 
                
 
                

Latest revision as of 03:18, 18 October 2018

siRCon - A siRNA Constructor

Short Summary

In our project, we introduced RNA interference (RNAi) and translation repression with small interfering RNAs (siRNAs) as an alternative to CRISPR/Cas. To use siRNAs as silencing agents for the gene of interest, we proposed a two-step design process. First, potential siRNAs for prokaryotic organisms must be designed. In the second step, the silencing effect of these siRNAs can be validated by our siRNA vector system TACE. To facilitate the initial siRNA design step, we developed a siRNA construction tool which identifies possible siRNAs for a given gene sequence, calculates their probability to silence the target gene, and returns the candidates ranked by the calculated score. It consists of three modules: "siRNA for RNAi", "siRNA for silencing", and "check siRNA". The siRNAs predicted by our software are compatible with our siRNA vector system. To the best of our knowledge, this is the first tool dedicated to predicting customized siRNAs for application in prokaryotes. This Python tool comes in two versions: a command line application and an easy-to-use graphical interface.

siRNAs short introduction

siRNAs are small, non-coding single-stranded RNAs with an average length of 21-25 nucleotides which bind a specific complementary coding mRNA and silence its function. In eukaryotic RNAi, siRNAs are loaded onto Argonaute proteins, which carry out the repression either by blocking mRNA translation or by degrading the mRNA (Siomi and Siomi, 2009). More detailed information on both possible siRNA mechanisms can be found here.

siRNA design

In order to achieve effective gene silencing or knock-down, the 19 nt binding sequence must be flanked by special, non-binding 5' and 3' extensions (Figure 1). To trigger mRNA degradation by RNase E, the 5’-terminal triphosphate of the siRNA needs to be converted to a monophosphate by RNA pyrophosphohydrolase (RppH). For the siRNA to be recognized by RppH, the 5’ end of the siRNA has to start with the tetranucleotide AGNN which is not allowed to match the targeted mRNA (Foley et al., 2015). At the 3’ end of the siRNA, the small MicC scaffold is added which facilitates the hybridization of siRNA and target mRNA and protects the siRNA from degradation (Na et al., 2013).
Figure 1: Effects of siRNA design on RNAi effectiveness and siRNA stability. A If the siRNA does not carry suitable 5' or 3' extensions, it is quickly degraded. B siRNAs extended by the tetranucleotide AGNN are recognized and processed by the pyrophosphohydrolase RppH. This enzyme converts the 5' triphosphate to a monophosphate, which greatly reduces siRNA degradation. This allows the siRNA to hybridize to its target mRNA, which in turn is degraded by RNase E, thus leading to effective mRNA silencing. C Extending siRNAs with a 3' MicC scaffold in addition to the 5' tetranucleotide AGNN further enhances mRNA silencing. MicC facilitates the hybridization of siRNA and target mRNA and protects the siRNA from degradation.
In addition to degradation-based RNAi, siRNA can also be used to block mRNAs without degradation. This is achieved by adding the outer membrane protein A (OmpA) scaffold to the 5' end of the siRNA (Figure 2), enhancing its stability. In addition, the hybridization of the siRNA and the target mRNA can be facilitated by addition of MicC to the 3' terminus.
Both sequence extensions are also part of our vector system, enabling efficient design and construction of effective siRNAs. If our vector system is selected when using our tool, the fitting overlaps to our vectors are added automatically. More theoretical information about the overhangs and scaffolds can be found here.
Figure 2: siRNA design for silencing translation. A If the siRNA does not carry suitable 5' or 3' extensions, it is quickly degraded. B siRNAs supplemented with the outer membrane protein A (OmpA) scaffold are more stable and effectively silence the translation of target mRNAs. C If the siRNA is supplemented with the OmpA as well as the MicC scaffold the repression is enhanced further.

Choosing appropriate design methods

In 2012, the iGEM team SYSU-Software integrated an siRNA cDNA designer as a small part of their project. siRNAs designed with this tool were applicable in eukaryotic organisms. They included two different design methods: Tom Tuschl’s method and Rational siRNA design.
In the following as well as in our software tool siRCon, nucleotide sequences exclusively contain the letter 'T' for the sake of simplicity. Please note that in the case of RNA, the corresponding base is uracil.
Figure 3: Structure of an siRNA designed with Tom Tuschl's method. Both siRNA strands have a characteristic 'TT' overhang at the 3'-terminus (Elbashir et al., 2001).
Tom Tuschl’s method focuses mainly on the existence of 5’ and 3’ ‘TT’ overhangs (Figure 3) (Elbashir et al., 2001). These are not compatible with overhangs and scaffold sequences required by the prokaryotic mechanisms. Therefore, we decided to use the rules published by Ui-Tei as an alternative design method (Naito and Ui-Tei, 2012). Furthermore, we adapted the rational siRNA design as it was more suitable for our application (Reynolds et al., 2004). Both design rules apply only to the 19 nt long target binding sequence.

Rational siRNA design

By a systematic analysis of 180 eukaryotic siRNAs, Reynolds et al. identified eight criteria that are important for their functionality (Reynolds et al., 2004). Each criterion gets a score that is either positive or negative, corresponding to its effect on the siRNA. All siRNA candidates with a score above six are potential highly functional siRNAs.
Table 1: Rational siRNA design criteria with corresponding score (Reynolds et al., 2004)
Rule Score
30%-52% G/C content +1
At least 3 'A' or 'T' at positions 15-19 +1 (for each 'A' or 'T')
Absence of internal repeats (\(T_m \lt 20\) °C) +1
An 'A' at position 3 +1
An 'A' at position 19 +1
A 'T' at position 10 +1
A 'G' or 'C' at position 19 -1
A 'G' at position 13 -1
The melting temperature \(T_m\) is calculated as follows (Kibbe, 2007), where the G/C content is given as a fraction between 0 and 1 and \(N\) is the length of the sequence: $$ T_m = 79.8 + (18.5 \cdot \log_{10}[Na^+]) + (58.4 \cdot [\text{G/C content}]) \\+ (11.8 \cdot [\text{G/C content}]^2) - \left(\frac{820}{N}\right)$$
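The melting temperature calculation can be sketched in a few lines of Python. This is a minimal illustration and not the siRCon source code: the function names are hypothetical, the default sodium concentration of 50 mM is an assumption, and the sketch assumes that the G/C content is a fraction between 0 and 1 and that the final term divides 820 by the sequence length, as in the RNA formula of OligoCalc.

```python
import math


def gc_fraction(seq):
    """Return the G/C content of a sequence as a fraction between 0 and 1."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / float(len(seq))


def melting_temperature(seq, na_molar=0.05):
    """Salt-adjusted melting temperature in degrees Celsius.

    na_molar is the sodium concentration in mol/l; the default of
    0.05 is an assumption made for this sketch.
    """
    gc = gc_fraction(seq)
    return (79.8
            + 18.5 * math.log10(na_molar)
            + 58.4 * gc
            + 11.8 * gc ** 2
            - 820.0 / len(seq))
```

A candidate would satisfy the internal repeat criterion of Table 1 if the melting temperature of its potential internal repeat stays below 20 °C; how such repeats are detected is not covered by this sketch.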

Ui-Tei rule

Ui-Tei et al. analyzed 62 eukaryotic siRNAs and identified four design rules for effective siRNAs (Ui-Tei et al., 2004). Only siRNAs fulfilling all four criteria are considered functional siRNAs.
  1. An ‘A’ or ‘T’ at position 19
  2. A ‘G’ or ‘C’ at position 1
  3. At least five ‘T’ or ‘A’ residues from positions 13 to 19
  4. No ‘GC’ stretch more than 9 nt long
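The four rules above translate directly into a small filter. The following sketch is only an illustration of the rules as listed, not the siRCon implementation; the function name is hypothetical, and it assumes that positions are counted from 1 at the 5' end of the 19 nt binding sequence written with the letters A, C, G and T.

```python
import re


def passes_ui_tei(seq):
    """Return True if a 19 nt sequence fulfills all four Ui-Tei rules."""
    seq = seq.upper()
    if len(seq) != 19 or set(seq) - set("ACGT"):
        raise ValueError("expected a 19 nt sequence over A, C, G, T")
    # Rule 1: 'A' or 'T' at position 19
    rule1 = seq[18] in "AT"
    # Rule 2: 'G' or 'C' at position 1
    rule2 = seq[0] in "GC"
    # Rule 3: at least five 'A' or 'T' residues in positions 13 to 19
    rule3 = sum(base in "AT" for base in seq[12:19]) >= 5
    # Rule 4: no uninterrupted run of 'G'/'C' longer than 9 nt
    rule4 = re.search(r"[GC]{10,}", seq) is None
    return rule1 and rule2 and rule3 and rule4
```

Because only candidates fulfilling all four rules are kept, the function returns a single boolean rather than a score.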

Calculating silencing probability

siRCon reports not only the sequences of potentially effective siRNAs, but also ranks them by the probability that they are effective. This probability is calculated with Bayes’ theorem. The following calculations and formulas are based on Takasaki (2009).
The initial hypothesis is that the given siRNA effectively silences an mRNA. To perform the calculations, a prior probability is necessary. The prior probability for effective gene silencing of mammalian genes can be obtained from former siRNA experiments and is approximately 0.1 (Takasaki, 2009). Since we have no data on prokaryotic siRNAs, we use the same prior probability for our predictions.
The gene silencing probability \(P(eff|X)\) is described as: $$ P(eff|X) = \frac{P^{eff} P(X|eff)}{P^{eff} P(X|eff) + P^{inf} P(X|inf)} \qquad (1)$$ The 19 nt siRNA binding sequence is represented by X, where \(x_i^n\) corresponds to the bases adenine, guanine, cytosine or uracil (indexes 1≤n≤4) at sequence position i. The probabilities P(X|eff) and P(X|inf) are calculated based on prior knowledge about siRNA sequences that were shown to be effective or ineffective in silencing their target mRNAs. Based on the analysis of 833 effective and 847 ineffective siRNAs, Takasaki determined the likelihood with which base n occurs at position i in effective and ineffective siRNA sequences, represented by the coefficients \(q_{x_i^n}^{eff}\) and \(q_{x_i^n}^{inf}\) respectively (Takasaki, 2009). These coefficients are often referred to as frequency ratios of n at position i.
\(P(X|eff)\) and \(P(X|inf)\) are computed as the product of the frequency ratios for each base n at position i in the siRNA binding sequence: $$ P(X|eff) = \prod_{i=1}^{19} q_{x_i^n}^{eff} \qquad (2)$$ $$ P(X|inf) = \prod_{i=1}^{19} q_{x_i^n}^{inf} \qquad (3)$$
Both probabilities are weighted with their prior probabilities, \(P^{eff}\) and \(P^{inf} = 1-P^{eff}\), where \(P^{eff}\) is set to 0.1 as mentioned previously. With all defined formulas (1), (2) and (3), the gene silencing probability \(P(eff|X)\) is calculated as follows: $$P(eff|X) = \frac{P^{eff} P(X|eff)}{P^{eff} P(X|eff)+P^{inf} P(X|inf)} \\\\= \frac{P^{eff} \prod_{i=1}^{19} q_{x_i^n}^{eff}}{P^{eff} \prod_{i=1}^{19} q_{x_i^n}^{eff}+P^{inf} \prod_{i=1}^{19} q_{x_i^n}^{inf}} $$
To actually calculate the silencing probability, only the frequency ratios \(q_{x_i^n}^{eff}\) and \(q_{x_i^n}^{inf}\) of the individual nucleotides at positions 1 to 19 are missing. These can be taken from the same publication as the calculations (Takasaki, 2009).
For each nucleotide, the probability of occurrence was determined for each position of the siRNA. Two models were taken into account in the calculation. In the first, the occurrences of the nucleotides at positions 1 to 19 are considered independent of each other, and the probabilities for each position are calculated independently. In the second, the occurrence of a nucleotide depends on the nucleotide at the preceding position; these dependent probabilities were calculated with a simple Markov model. The resulting silencing probability was found to be most accurate when the frequency ratios of the effective siRNAs are calculated dependently and the frequency ratios of the ineffective siRNAs are calculated independently. All frequency ratios can be looked up here.
In combination with the frequency ratios, it is now possible to calculate the silencing probability for the 19 nt long binding site of siRNAs.
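The calculation of formula (1) can be sketched as follows. This is a simplified illustration and not the siRCon code: all names are hypothetical, the frequency ratio tables are passed in as lists of 19 dictionaries, and the uniform table defined at the end is a placeholder, not the published values of Takasaki (2009). For simplicity, the sketch treats the positions as independent for both the effective and the ineffective table.

```python
BASES = "ACGT"


def sequence_likelihood(seq, freq):
    """Product of the per-position frequency ratios, see formulas (2) and (3)."""
    likelihood = 1.0
    for position, base in enumerate(seq):
        likelihood *= freq[position][base]
    return likelihood


def silencing_probability(seq, q_eff, q_inf, prior_eff=0.1):
    """Posterior probability P(eff|X) of formula (1) for a 19 nt sequence.

    q_eff and q_inf are lists with one dict per position that map each
    base to its frequency ratio (hypothetical data layout).
    """
    seq = seq.upper()
    if len(seq) != 19:
        raise ValueError("expected a 19 nt binding sequence")
    weighted_eff = prior_eff * sequence_likelihood(seq, q_eff)
    weighted_inf = (1.0 - prior_eff) * sequence_likelihood(seq, q_inf)
    return weighted_eff / (weighted_eff + weighted_inf)


# Placeholder table: every base is equally likely at every position.
uniform = [{base: 0.25 for base in BASES} for _ in range(19)]
```

With identical tables for both classes the sequence carries no information, so the result equals the prior probability of 0.1. Any position at which a base is more frequent in effective than in ineffective siRNAs raises the result above the prior.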

siRNA selection for RNAi and repression of translation

The procedures of siRNA selection for both mechanisms, RNAi and repression of translation, are very similar, and so are the first two modules of the tool. First, the mRNA binding sequence is determined using the rational design and the Ui-Tei rules. In the next step, the silencing probability is determined. Finally, the corresponding overhangs and scaffolds are added to the 19 nt long binding sequence to form the mature siRNA.

Check siRNA

Besides the selection of siRNAs, we also implemented a functionality to check siRNAs derived by other methods. For a given target sequence and a corresponding siRNA, it is checked whether the siRNA might bind to its target and how well it fulfills the described criteria. Furthermore, its silencing efficiency is calculated.
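The binding check can be illustrated with a simple search for perfectly complementary sites. This is a sketch under assumptions and not necessarily how siRCon performs the check: the function names are hypothetical, only perfect complementarity is considered, and both sequences are expected in the 'T' alphabet used throughout this page, with the target given in the orientation of the mRNA.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a sequence over A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_binding_sites(target, binding_seq):
    """Return the 1-based start positions in target to which
    binding_seq is perfectly complementary."""
    site = reverse_complement(binding_seq)
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start + 1)
        # continue one base further so that overlapping sites are found
        start = target.find(site, start + 1)
    return positions
```

An empty result means that the entered binding sequence has no perfectly complementary site in the given target.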

Command line application

The command line application can be obtained directly here or downloaded from our GitHub repository. To run the command line application, Python 2.7 needs to be installed.
Figure 4: Help message on how to use the command line application.
Used without input, a help message is displayed listing the mandatory and optional input parameters (Figure 4). For more information a README is available in our repository. All resulting siRNAs are saved in one FASTA file. This simplifies the integration into different workflows. For example, it is possible to test the siRNAs for off-target binding sites using BLAST. An exemplary call of the application as well as the results returned can be seen in Figure 5.
Figure 5: Exemplary call and results of the command line application using a GFP gene sequence as input.
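To illustrate why a single FASTA file is easy to integrate into other workflows, the following sketch writes a list of siRNA records in FASTA format. It is not taken from siRCon: the function name and the record names are hypothetical, and the actual header layout used by the tool may differ.

```python
def format_fasta(records, line_width=60):
    """Format (name, sequence) pairs as FASTA text."""
    lines = []
    for name, sequence in records:
        lines.append(">" + name)
        # wrap long sequences to a fixed line width
        for start in range(0, len(sequence), line_width):
            lines.append(sequence[start:start + line_width])
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # hypothetical records; real names and sequences come from the tool
    records = [("siRNA_1", "GACGTCAGCTGATATTATA")]
    with open("sirnas_example.fasta", "w") as handle:
        handle.write(format_fasta(records))
```

A file of this kind can be passed directly as a query to a BLAST search against the genome of the host organism.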

Graphical Interface usage

Like the command line application, the graphical interface version can either be downloaded directly here, or via our GitHub repository. In the graphical interface, the modules are accessible via tabs (Figure 6). The last tab contains usage and copyright information.
Figure 6: The different modules are accessible via tabs.

Tab 1: siRNA for RNAi

  1. Insert gene sequence
  2. Choose TACE vector system (optional)
  3. Construction of siRNAs
  4. View resulting siRNAs (sense and antisense sequence) and their corresponding probability
  5. Decide if siRNAs should be saved with MicC scaffold (only if TACE is not used)
  6. Save results as FASTA file
Figure 7: Overview and steps of the siRNA for RNAi module.

Tab 2: siRNA for silencing

  1. Insert gene sequence
  2. Choose TACE vector system (optional)
  3. Construction of siRNAs
  4. View resulting siRNAs (sense and antisense sequence) and their corresponding probability
  5. Decide if siRNAs should be saved with MicC scaffold (only if TACE is not used)
  6. Decide if siRNAs should be saved with OmpA scaffold (only if TACE is not used)
  7. Save results as FASTA file
Figure 8: Overview and steps of the siRNA for silencing module.

Tab 3: Check siRNA

  1. Insert gene sequence
  2. Insert siRNA sequences
  3. Choose the method the siRNA was constructed for (siRNA for RNAi or siRNA for silencing)
  4. Choose whether the siRNA was constructed for TACE (optional)
  5. Validation of the entered siRNAs against the given target gene sequence
  6. View results
  7. Save results (optional)
Figure 9: Overview and steps of the check siRNA module.

Outlook

To help future iGEM teams to control gene expression, we developed siRCon, a bioinformatic application to generate high-fidelity siRNA sequences in prokaryotic organisms. We introduce this method as an alternative to CRISPR/Cas, since it is open source and free of charge. In the future, further improvements and extensions of this application are intended. On the one hand, eukaryotic siRNAs will also be constructed, so that siRCon becomes a universal tool for siRNAs. On the other hand, we want to improve the already existing features, especially the check siRNA functionality.
Elbashir, S.M., Harborth, J., Lendeckel, W., Yalcin, A., Weber, K., and Tuschl, T. (2001). Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells. Nature 411: 494–498.
Foley, P.L., Hsieh, P., Luciano, D.J., and Belasco, J.G. (2015). Specificity and evolutionary conservation of the Escherichia coli RNA pyrophosphohydrolase RppH. J. Biol. Chem. 290: 9478–9486.
Kibbe, W.A. (2007). OligoCalc: an online oligonucleotide properties calculator. Nucleic Acids Res 35: W43–W46.
Na, D., Yoo, S.M., Chung, H., Park, H., Park, J.H., and Lee, S.Y. (2013). Metabolic engineering of Escherichia coli using synthetic small regulatory RNAs. Nat. Biotechnol. 31: 170–174.
Naito, Y. and Ui-Tei, K. (2012). siRNA Design Software for a Target Gene-Specific RNA Interference. Front Genet 3.
Reynolds, A., Leake, D., Boese, Q., Scaringe, S., Marshall, W.S., and Khvorova, A. (2004). Rational siRNA design for RNA interference. Nature Biotechnology 22: 326–330.
Siomi, H. and Siomi, M.C. (2009). On the road to reading the RNA-interference code. Nature 457: 396–404.
Takasaki, S. (2009). Selecting effective siRNA target sequences by using Bayes’ theorem. Computational Biology and Chemistry 33: 368–372.
Ui-Tei, K., Naito, Y., Takahashi, F., Haraguchi, T., Ohki-Hamazaki, H., Juni, A., Ueda, R. and Saigo, K. (2004). Guidelines for the selection of highly effective siRNA sequences for mammalian and chick RNA interference. Nucleic Acids Res. 32: 936-948.