Elinramstrom (Talk | contribs) |
Revision as of 15:48, 17 October 2018
Bioinformatics
After sequencing has been completed successfully, you are left with raw data containing millions and millions (and millions) of lines of base sequences, all of which need to be processed and interpreted. This is where the interdisciplinary field of bioinformatics comes in. A vast range of software tools is available, each tailored to a particular kind of analysis and often to a specific sequencing method.
Most of the tools we used were available through the free website Usegalaxy.org, which also let us run the processing on its servers. Because we also used nanopore sequencing, we relied on tools tailored to MinION data, which were available from the nanopore community hub and could be run from a terminal window.
Experiment
We decided to build our bioinformatics pipeline from scratch. This was not an easy task, as nanopore technology is novel and many of the available pipelines are tailored to Illumina sequencing. In general, a basic transcriptomics pipeline consists of the following steps: alignment to a reference genome, gene counting, and differential gene expression analysis [1]. The nanopore data also required a couple of processing steps beforehand, namely demultiplexing and adapter trimming.
Demultiplexing and adapter trimming
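As a rough illustration of what demultiplexing and adapter trimming do, the following minimal sketch sorts reads by a leading barcode and then strips the barcode and an adapter. The barcode and adapter sequences are hypothetical, and exact prefix matching is a simplifying assumption; dedicated nanopore tools use error-tolerant matching instead.

```python
from collections import defaultdict

# Hypothetical barcodes and adapter, chosen only for illustration.
BARCODES = {"BC01": "AAGG", "BC02": "CCTT"}
ADAPTER = "TTTGCA"


def demultiplex_and_trim(reads, barcodes=BARCODES, adapter=ADAPTER):
    """Group reads by their leading barcode, then strip barcode and adapter."""
    samples = defaultdict(list)
    for read in reads:
        for name, barcode in barcodes.items():
            if read.startswith(barcode):
                insert = read[len(barcode):]
                # Remove the adapter only if it directly follows the barcode.
                if insert.startswith(adapter):
                    insert = insert[len(adapter):]
                samples[name].append(insert)
                break
        else:
            # No barcode matched exactly; set the read aside unmodified.
            samples["unclassified"].append(read)
    return dict(samples)


reads = ["AAGGTTTGCAACGTACGT", "CCTTTTTGCAGGGCCC", "GGGGACGT"]
result = demultiplex_and_trim(reads)
print(result)
```

Each key of the returned dictionary corresponds to one sample, so the downstream alignment step can be run per sample.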
Gene counting
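Conceptually, gene counting tallies how many aligned reads fall within each annotated gene. The sketch below shows the idea on toy data; the gene coordinates and alignments are hypothetical, and skipping reads that overlap more than one gene is an assumed policy rather than a description of the tool we used.

```python
from collections import Counter

# Hypothetical annotation: gene ID -> (start, end), 1-based inclusive.
GENES = {"geneA": (100, 500), "geneB": (800, 1200)}


def count_reads_per_gene(alignments, genes=GENES):
    """Count aligned reads that overlap exactly one annotated gene."""
    counts = Counter({gene: 0 for gene in genes})
    for read_start, read_end in alignments:
        hits = [
            gene
            for gene, (start, end) in genes.items()
            if read_start <= end and read_end >= start
        ]
        # Reads overlapping zero or several genes are left uncounted.
        if len(hits) == 1:
            counts[hits[0]] += 1
    return dict(counts)


# Each alignment is the (start, end) position of a read on the reference.
alignments = [(120, 300), (450, 600), (900, 1000), (600, 700), (400, 900)]
counts = count_reads_per_gene(alignments)
print(counts)
```

The resulting per-gene counts, one table per sample, are the input to the differential gene expression step.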
Result
The transcriptomics pipeline was tested and validated using publicly available read files. The files consisted of two datasets of E. coli (each in triplicate), cultured in regular LB and in a sugar solution, respectively.
Figure 3: Results of the differential gene expression analysis using DESeq2 on the test files. Each gene is listed by its gene ID together with its base mean (the mean of normalized counts) and several statistical results.
Figure 4: Results of the differential gene expression after filtering for statistical significance and fold change.
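The filtering step behind Figure 4 can be sketched as follows. The rows, the column names, and the log2 fold change cutoff of 1.0 are assumptions made for illustration; only the adjusted p-value threshold of 0.05 is taken from our analysis.

```python
# Hypothetical rows mimicking a differential expression results table.
results = [
    {"gene_id": "g1", "log2_fold_change": 3.2, "padj": 0.001},
    {"gene_id": "g2", "log2_fold_change": 0.4, "padj": 0.01},
    {"gene_id": "g3", "log2_fold_change": -2.5, "padj": 0.03},
    {"gene_id": "g4", "log2_fold_change": 4.0, "padj": 0.40},
    {"gene_id": "g5", "log2_fold_change": 1.8, "padj": None},
]


def filter_significant(rows, padj_cutoff=0.05, min_abs_log2fc=1.0):
    """Keep genes that are both significant and strongly up- or downregulated."""
    kept = [
        row
        for row in rows
        # Genes without an adjusted p-value cannot be called significant.
        if row["padj"] is not None
        and row["padj"] <= padj_cutoff
        and abs(row["log2_fold_change"]) >= min_abs_log2fc
    ]
    # Most significant genes first.
    return sorted(kept, key=lambda row: row["padj"])


significant = filter_significant(results)
print([row["gene_id"] for row in significant])
```

Taking the absolute value of the log2 fold change keeps both upregulated and downregulated genes.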
Searching for the genes in the NCBI database showed that the most highly expressed gene from the sugar-cultured E. coli is involved in a sugar-related system, which indicates that the pipeline was working as intended.
Figure 5: A highly expressed gene reported by the pipeline, matching a glucose-specific gene.
Figure 6: Results of the differential gene expression analysis performed on our own data.
Unfortunately, our own runs did not produce results as good as those shown above. Figure 6 shows the effect of the major issues we had with sequencing and with generating enough data. Judging by the adjusted p-values, the genes can be identified, but their statistical significance is far too weak (the accepted threshold is an adjusted p-value of <= 0.05). No up- or downregulation with a fold change of interest could be identified either. We attribute both the lack of notable fold changes and the low significance to the insufficient amount of data generated in the prior sequencing step. As a result, no gene could be identified as a possible candidate for our reporter system.
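To show why adjusted p-values are stricter than raw ones, here is a minimal sketch of the Benjamini-Hochberg procedure, which DESeq2 uses by default to adjust p-values. The input p-values are made up for illustration and are not taken from our data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotonic.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
p_values = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(p_values)
print(adjusted)
```

In this toy example the raw p-values 0.04 and 0.03 both rise above 0.05 after adjustment, so only the first gene would pass the threshold.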
References
[1] Galaxyproject, 2018. Reference-based RNA-Seq data analysis. https://galaxyproject.github.io/training-material/topics/transcriptomics/tutorials/ref-based/tutorial.html (accessed 2018-10-15)