Difference between revisions of "Team:Uppsala/Transcriptomics/Bioinformatics"

Line 229: Line 229:
 
<p>After a successful sequencing run, you are left with raw data containing millions and millions (and millions) of lines of base sequences, all of which need to be processed and interpreted. This is where the interdisciplinary field of bioinformatics comes in. A vast range of software tools is available, each tailored to a particular kind of analysis and often specific to the sequencing method used.<br><br>
 
<p>After a successful sequencing run, you are left with raw data containing millions and millions (and millions) of lines of base sequences, all of which need to be processed and interpreted. This is where the interdisciplinary field of bioinformatics comes in. A vast range of software tools is available, each tailored to a particular kind of analysis and often specific to the sequencing method used.<br><br>
 
   
 
   
Most of the tools we used were available through the free website Usegalaxy.org, which also let us do the processing on its servers. Because we also used nanopore sequencing, we relied on tools tailored to MinION data, which were available from the Nanopore community hub and could be run from a terminal window. </p><br><br>
+
Most of the tools we used were available through the free website Usegalaxy.org, which also let us do the processing on its servers. Because we also used nanopore sequencing, we relied on tools tailored to MinION data, which were available from the Nanopore community hub and could be run from a terminal window. </p>
  
 
<h2>Experiment</h2>
 
<h2>Experiment</h2>
Line 264: Line 264:
 
                  
 
                  
 
                 <!--End of template with side picture -->
 
                 <!--End of template with side picture -->
<br><br>
+
 
  
 
<div class="card-holder">
 
<div class="card-holder">
Line 301: Line 301:
  
 
<h3>Validating our Transcriptomics Pipeline</h3>
 
<h3>Validating our Transcriptomics Pipeline</h3>
<p>The transcriptomics pipeline was tested and validated using publicly available read files. The files comprised two datasets of <i>E. coli</i> (each in triplicate), cultured in regular LB and in a sugar solution, respectively.</p><br><br>
+
<p>The transcriptomics pipeline was tested and validated using publicly available read files. The files comprised two datasets of <i>E. coli</i> (each in triplicate), cultured in regular LB and in a sugar solution, respectively.</p><br>
 
                      
 
                      
 
                      
 
                      
Line 351: Line 351:
 
                  
 
                  
 
                 <!--End of template with side picture -->
 
                 <!--End of template with side picture -->
<br><br>
 
  
<p>Searching for the genes in the NCBI database showed that the most highly expressed gene in the sugar-cultured <i>E. coli</i> is involved in a type of sugar system, indicating that the pipeline was working as intended.</p><br><br>
+
 
 +
<p>Searching for the genes in the NCBI database showed that the most highly expressed gene in the sugar-cultured <i>E. coli</i> is involved in a type of sugar system, indicating that the pipeline was working as intended.</p><br>
 
                 </div>
 
                 </div>
 
                  
 
                  
Line 399: Line 399:
 
       <div class="card-holder">
 
       <div class="card-holder">
  
<br><br>
+
<br>
  
 
<h3>Analyzing Our Own Sequencing Data</h3>
 
<h3>Analyzing Our Own Sequencing Data</h3>
Line 458: Line 458:
  
 
<!-- End of Code For TABLE -->
 
<!-- End of Code For TABLE -->
<br><br>
+
 
  
 
           <p>Unfortunately, our own runs did not produce results as good as those seen above. The effects of the major issues with sequencing, and with generating enough data, can be seen in <b>figure 6</b>. Judging by the adjusted p-values, it is clear that even though the genes can indeed be identified, as seen in <b>Table 1</b>, their statistical significance is very weak (the accepted threshold is an adjusted p-value of &#60;&#61; 0.05). No up- or downregulation with a fold-change of interest could be identified either. We assume that both the lack of any major fold-change and the low significance are due to too little data being generated in the prior sequencing step. As a result, no gene could be identified as a possible candidate for our reporter system.</p>
 
           <p>Unfortunately, our own runs did not produce results as good as those seen above. The effects of the major issues with sequencing, and with generating enough data, can be seen in <b>figure 6</b>. Judging by the adjusted p-values, it is clear that even though the genes can indeed be identified, as seen in <b>Table 1</b>, their statistical significance is very weak (the accepted threshold is an adjusted p-value of &#60;&#61; 0.05). No up- or downregulation with a fold-change of interest could be identified either. We assume that both the lack of any major fold-change and the low significance are due to too little data being generated in the prior sequencing step. As a result, no gene could be identified as a possible candidate for our reporter system.</p>
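<p>The candidate criteria described above (an adjusted p-value of at most 0.05 together with a fold-change of interest) can be sketched as a simple filter. This is only an illustrative sketch, not the tool we ran: the gene names and values are invented, the field names <code>log2_fold_change</code> and <code>padj</code> are assumptions, and the fold-change cutoff of 1.0 (a two-fold change) is an assumed value that does not come from our analysis.</p>

```python
# Hypothetical differential-expression rows. Gene names and numbers are
# invented for illustration and are NOT taken from our data.
rows = [
    {"gene": "geneA", "log2_fold_change": 2.4, "padj": 0.003},
    {"gene": "geneB", "log2_fold_change": -1.8, "padj": 0.04},
    {"gene": "geneC", "log2_fold_change": 0.2, "padj": 0.01},
    {"gene": "geneD", "log2_fold_change": 3.1, "padj": 0.62},
    {"gene": "geneE", "log2_fold_change": -2.2, "padj": None},  # no padj reported
]

PADJ_MAX = 0.05        # threshold named in the text
MIN_ABS_LOG2FC = 1.0   # assumed cutoff (two-fold change); not from the text


def candidate_genes(rows, padj_max=PADJ_MAX, min_abs_log2fc=MIN_ABS_LOG2FC):
    """Return rows that are significant AND show a large enough change."""
    selected = []
    for row in rows:
        padj = row["padj"]
        if padj is None:
            continue  # skip genes without an adjusted p-value
        if padj <= padj_max and abs(row["log2_fold_change"]) >= min_abs_log2fc:
            selected.append(row)
    # Most significant genes first
    return sorted(selected, key=lambda r: r["padj"])


for row in candidate_genes(rows):
    direction = "up" if row["log2_fold_change"] > 0 else "down"
    print(row["gene"], direction, row["padj"])
```

<p>With too little sequencing data, as in our runs, a filter like this would be expected to return few or no genes, because hardly any adjusted p-value falls below the threshold.</p>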

Revision as of 20:19, 17 October 2018




