Difference between revisions of "Team:Uppsala/Transcriptomics/Bioinformatics"

 
(17 intermediate revisions by 5 users not shown)
Line 19: Line 19:
 
                  
 
                  
 
             }
 
             }
 +
 +
            .side-img a{
 +
              display: inline-block;
 +
              color: black;
 +
              padding-left: 5px;
 +
              text-decoration: none;
 +
            }
 +
 +
              .inner-card-text a{
 +
              display: inline-block;
 +
              color: black;
 +
              padding-left: 5px;
 +
              text-decoration: none;
 +
            }
 +
   
 
         </style>
 
         </style>
  
Line 30: Line 45:
  
  
  <div class="svg-wrapper">
+
  <div class="svg-wrapper" id="Project_Description">
  
 
      
 
      
Line 177: Line 192:
  
  
</div>
+
 
  
 
<!-- CONTENT OF WHATS ON THE PAGE -->
 
<!-- CONTENT OF WHATS ON THE PAGE -->
Line 184: Line 199:
 
             <div id="toctitle"></div>
 
             <div id="toctitle"></div>
 
             <ul>
 
             <ul>
                 <li class="toclevel tocsection"><a href="#Project_Description" class="scroll"> <span id="whereYouAre"> Project Description  </span> </a>
+
                 <li class="toclevel tocsection"><a href="#Project_Description" class="scroll"> <span id="whereYouAre"> Bioinformatics</span> </a>
 
                         <ul>
 
                         <ul>
                             <li class="toclevel nav-item active"><a href="#top" class="nav-link scroll"> Overview </a></li>
+
                             <li class="toclevel nav-item active"><a href="#Exp" class="nav-link scroll"> Experiment</a></li>
                             <li class="toclevel nav-item"><a href="#Problem" class="nav-link scroll">  Problem  </a></li>
+
                             <li class="toclevel nav-item"><a href="#Results" class="nav-link scroll">  Results</a></li>
                            <li class="toclevel nav-item"><a href="#Solution" class="nav-link scroll">  Solution </a></li>
+
 
                             <li class="toclevel nav-item"><a href="#References" class="nav-link scroll"> References </a></li>
 
                             <li class="toclevel nav-item"><a href="#References" class="nav-link scroll"> References </a></li>
 
                         </ul>
 
                         </ul>
Line 231: Line 245:
 
Most of the tools we used were available through the free website Usegalaxy.org, which also let us run the processing on its servers. Because we also used nanopore sequencing, we relied on tools tailored to MinION data as well. These were available from the nanopore community hub and could be run from a terminal window. </p>
 
Most of the tools we used were available through the free website Usegalaxy.org, which also let us run the processing on its servers. Because we also used nanopore sequencing, we relied on tools tailored to MinION data as well. These were available from the nanopore community hub and could be run from a terminal window. </p>
  
<h2>Experiment</h2>
+
<h2 id="Exp">Experiment</h2>
  
 
<p>We decided to create our bioinformatics pipeline from scratch. This was not an easy task, however, as nanopore technology is novel and many of the available pipelines are tailored to Illumina sequencing. In general, a basic transcriptomics pipeline consists of the following steps: alignment to a reference genome, gene counting and differential gene expression analysis [1]. The nanopore data also needed a couple of processing steps beforehand, namely demultiplexing and adapter trimming.</p><br>
 
<p>We decided to create our bioinformatics pipeline from scratch. This was not an easy task, however, as nanopore technology is novel and many of the available pipelines are tailored to Illumina sequencing. In general, a basic transcriptomics pipeline consists of the following steps: alignment to a reference genome, gene counting and differential gene expression analysis [1]. The nanopore data also needed a couple of processing steps beforehand, namely demultiplexing and adapter trimming.</p><br>
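<p>The two pre-processing steps named above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool we ran: the barcode names and sequences are made up, the function name <code>demultiplex_and_trim</code> is hypothetical, and exact prefix matching is a simplifying assumption (real nanopore demultiplexers tolerate sequencing errors and also check the read ends).</p>

```python
from collections import defaultdict

# Hypothetical barcodes, invented for this illustration only.
EXAMPLE_BARCODES = {
    "barcode01": "AAGGTTAA",
    "barcode02": "CCTTGGCC",
}


def demultiplex_and_trim(reads, barcodes=EXAMPLE_BARCODES):
    """Sort reads into per-barcode bins and strip the matched barcode.

    A read is assigned to the first barcode that matches its start exactly.
    Reads without a matching barcode go to the "unclassified" bin untouched.
    """
    bins = defaultdict(list)
    for read in reads:
        for name, sequence in barcodes.items():
            if read.startswith(sequence):
                # Demultiplex (choose the bin) and trim (drop the barcode).
                bins[name].append(read[len(sequence):])
                break
        else:
            bins["unclassified"].append(read)
    return dict(bins)
```

<p>Calling <code>demultiplex_and_trim</code> on a list of read strings returns a dictionary with one list of trimmed reads per barcode. Each list would then go through alignment, gene counting and differential expression analysis as a separate sample.</p>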
Line 255: Line 269:
 
                           <!-- Here goes the big image to the right -->  
 
                           <!-- Here goes the big image to the right -->  
 
                           <img src="https://static.igem.org/mediawiki/2018/3/3b/T--Uppsala--Transcriptomics-Demultiplexing.png">  
 
                           <img src="https://static.igem.org/mediawiki/2018/3/3b/T--Uppsala--Transcriptomics-Demultiplexing.png">  
                             <p><b>Figure 1:</b> Running demultiplexing and barcode trimming from the terminal. The programme first separates the reads according to barcode and then searches for any barcodes that can be trimmed off.</p>
+
                             <a href="https://static.igem.org/mediawiki/2018/3/3b/T--Uppsala--Transcriptomics-Demultiplexing.png"><p><b>Figure 1.</b> Running demultiplexing and barcode trimming from the terminal. The programme first separates the reads according to barcode and then searches for any barcodes that can be trimmed off.</p></a>
 
                              
 
                              
 
                         </div>
 
                         </div>
Line 287: Line 301:
 
                         <div class="side-img" style="background-color:darkolivegreen;">
 
                         <div class="side-img" style="background-color:darkolivegreen;">
 
                           <!-- Here goes the big image to the right -->  
 
                           <!-- Here goes the big image to the right -->  
                          <img src="https://static.igem.org/mediawiki/2018/a/a9/T--Uppsala--Transcriptomics-Bioinformatics2.png">
+
                       
                             <p><b>Figure 2:</b> Results of a differential gene expression analysis using DESeq2 on test files. The genes (shown with their gene ID), their base mean (the mean of the normalized counts) and several statistical results can be seen.</p>
+
                      <img src="https://static.igem.org/mediawiki/2018/a/a9/T--Uppsala--Transcriptomics-Bioinformatics2.png">
 +
                             <a href="https://static.igem.org/mediawiki/2018/a/a9/T--Uppsala--Transcriptomics-Bioinformatics2.png"><p><b>Figure 2.</b> Results of a differential gene expression analysis using DESeq2 on test files. The genes (shown with their gene ID), their base mean (the mean of the normalized counts) and several statistical results can be seen.</p></a>
 
                         </div>
 
                         </div>
  
Line 298: Line 313:
  
  
<h2>Results</h2>
+
<h2 id="Results">Results</h2>
  
 
<h3>Validating our Transcriptomics Pipeline</h3>
 
<h3>Validating our Transcriptomics Pipeline</h3>
<p>The transcriptomics pipeline was tried out and validated using read files available from the internet. The files consisted of two datasets of <i>E. Coli</i> (triplicates) cultured in regular LB and a sugar solution respectively.</p><br>
+
<p>The transcriptomics pipeline was tried out and validated using read files available from the internet. The files consisted of two datasets of <i>E. coli</i> (triplicates) cultured in regular LB and a sugar solution respectively.</p><br>
 
                      
 
                      
 
                      
 
                      
Line 321: Line 336:
 
                             <div class="inner-card-text">  
 
                             <div class="inner-card-text">  
 
                                 <!-- start of paragraph-->
 
                                 <!-- start of paragraph-->
                                 <p><b>Figure 3:</b> Results of the differential gene expression analysis using DESeq2 on test files. The genes (shown with their gene ID), their base mean (the mean of the normalized counts) and several statistical results can be seen.</p>
+
                                 <a href="https://static.igem.org/mediawiki/2018/a/a9/T--Uppsala--Transcriptomics-Bioinformatics2.png"><p><b>Figure 3.</b> Results of the differential gene expression analysis using DESeq2 on test files. The genes (shown with their gene ID), their base mean (the mean of the normalized counts) and several statistical results can be seen.</p></a>
 
                             </div>
 
                             </div>
 
                             <!-- end of paragraph -->
 
                             <!-- end of paragraph -->
Line 328: Line 343:
 
                          
 
                          
 
                             <br>
 
                             <br>
                           
 
 
 
                             <img class="content-card-img" src="https://static.igem.org/mediawiki/2018/8/81/T--Uppsala--Transcriptomics-Bioinformatics3.png">
 
                             <img class="content-card-img" src="https://static.igem.org/mediawiki/2018/8/81/T--Uppsala--Transcriptomics-Bioinformatics3.png">
 
                             <div class="inner-card-text">  
 
                             <div class="inner-card-text">  
 
                                 <!-- start of paragraph -->
 
                                 <!-- start of paragraph -->
                                <p><b>Figure 4:</b> Results of the differential gene expression after filtering for statistical significance and fold change.</p>
+
                              <a href="https://static.igem.org/mediawiki/2018/8/81/T--Uppsala--Transcriptomics-Bioinformatics3.png">          <p><b>Figure 4.</b> Results of the differential gene expression after filtering for statistical significance and fold change.</p></a>
 
                                 <!-- End of paragraphs -->
 
                                 <!-- End of paragraphs -->
 
                             </div>
 
                             </div>
Line 353: Line 366:
  
  
<p>Searching for the genes in the NCBI database showed that the most highly expressed gene from the sugar-cultured <i>E. Coli</i> is involved in a type of sugar system, which indicates that the pipeline was indeed working.</p><br>
+
<p>Searching for the genes in the NCBI database showed that the most highly expressed gene from the sugar-cultured <i>E. coli</i> is involved in a type of sugar system, which indicates that the pipeline was indeed working.</p><br>
 
                 </div>
 
                 </div>
 
                  
 
                  
Line 374: Line 387:
 
                             <div class="inner-card-text">  
 
                             <div class="inner-card-text">  
 
                                 <!-- start of paragraph-->
 
                                 <!-- start of paragraph-->
                                 <p><b>Figure 5:</b> A highly expressed gene identified by the pipeline, matching a glucose-specific gene.</p>
+
                                 <a href="https://static.igem.org/mediawiki/2018/4/4c/T--Uppsala--Transcriptomics-Bioinformatics4.png"><p><b>Figure 5.</b> A highly expressed gene identified by the pipeline, matching a glucose-specific gene.</p></a>
 
                             </div>
 
                             </div>
 
                             <!-- end of paragraph -->
 
                             <!-- end of paragraph -->
Line 386: Line 399:
 
                             <div class="inner-card-text">  
 
                             <div class="inner-card-text">  
 
                                 <!-- start of paragraph -->
 
                                 <!-- start of paragraph -->
                                <p><b>Figure 6:</b> Results of the differential gene expression analysis performed on our own data.</p>
+
                              <a href="https://static.igem.org/mediawiki/2018/3/3c/T--Uppsala--Transcriptomics-Bioinformatics5.png"> <p><b>Figure 6.</b> Results of the differential gene expression analysis performed on our own data.</p></a>
 
                                 <!-- End of paragraphs -->
 
                                 <!-- End of paragraphs -->
 
                             </div>
 
                             </div>
Line 399: Line 412:
 
       <div class="card-holder">
 
       <div class="card-holder">
  
<br>
+
                <br>
  
<h3>Analyzing Our Own Sequencing Data</h3>
+
            <h3>Analyzing Our Own Sequencing Data</h3>
<p><b>Table 1</b>: The first few genes from the differential gene expression analysis shown in <b>Figure 6</b>, together with their 
+
            <p><b>Table 1.</b> The first few genes from the differential gene expression analysis shown in Figure 6,
promoter sequences and functions in the organism.</p>
+
                together with their promoter sequences and functions in the organism.</p>
<!--Start of template with side picture -->
+
                <!--Start of template with side picture -->
  
 
                             <!-- Here you put your paragraphs -->  
 
                             <!-- Here you put your paragraphs -->  
 
                             <table class="pgrouptable tablesorter our-table" style="width: 100%;" cellspacing="0" cellpadding="0">
 
                             <table class="pgrouptable tablesorter our-table" style="width: 100%;" cellspacing="0" cellpadding="0">
    <thead><tr>
+
                        <thead><tr>
<th style="width: auto">Gene ID</th>
+
                    <th style="width: auto">Gene ID</th>
<th style="width: auto">Gene name</th>
+
                    <th style="width: auto">Gene name</th>
<th style="width: auto">Promoter sequence</th>
+
                    <th style="width: auto">Promoter sequence</th>
<th style="width: auto">Function</th>
+
                    <th style="width: auto">Function</th>
<th style="width: auto">Fold change</th>
+
                    <th style="width: auto">Fold change</th>
</tr></thead>
+
                    </tr></thead>
<tbody><tr>
+
                    <tbody><tr>
  
<td>
+
                    <td>
ER3413_45<br>
+
                    ER3413_45<br>
ER3413_70<br>
+
                    ER3413_70<br>
ER3413_87<br>
+
                    ER3413_87<br>
ER3413_126<br>
+
                    ER3413_126<br>
ER3413_173
+
                    ER3413_173
</td>
+
                    </td>
<td >
+
                    <td >
apaG<br>
+
                    apaG<br>
leuA<br>
+
                    leuA<br>
murG<br>
+
                    murG<br>
panD<br>
+
                    panD<br>
frr
+
                    frr
</td>
+
                    </td>
<td>
+
                    <td>
ggcaccatgcagggtcactacgaaatgatcgatgaaa<br>
+
                    ggcaccatgcagggtcactacgaaatgatcgatgaaa<br>
ttgacatccgtttttgtatccagtaactctaaaagc<br>
+
                    ttgacatccgtttttgtatccagtaactctaaaagc<br>
<p>-</p><br>
+
                    <p>-</p><br>
tagacactaaacaaaaatcgggcaatactgcgtga<br>
+
                    tagacactaaacaaaaatcgggcaatactgcgtga<br>
ttacccgtaatatgtttaatcagggctatacttagcac
+
                    ttacccgtaatatgtttaatcagggctatacttagcac
</td>
+
                    </td>
<td>
+
                    <td>
protein associated with Co2+ and Mg2+ efflux<br>
+
                    protein associated with Co2+ and Mg2+ efflux<br>
2-isopropylmalate synthase<br>
+
                    2-isopropylmalate synthase<br>
N-acetylglucosaminyl transferase<br>
+
                    N-acetylglucosaminyl transferase<br>
putative inner membrane protein<br>
+
                    putative inner membrane protein<br>
inner membrane protein, UPF0118 family
+
                    inner membrane protein, UPF0118 family
</td>
+
                    </td>
<td>
+
                    <td>
0.40<br>
+
                    0.40<br>
0.40<br>
+
                    0.40<br>
0.40<br>
+
                    0.40<br>
0.40<br>
+
                    0.40<br>
0.40
+
                    0.40
</td>
+
                    </td>
</tr><tr>
+
                    </tr><tr>
</tr></tbody></table>
+
                    </tr></tbody></table>
  
  
Line 460: Line 473:
  
  
           <p>The results from our own runs were unfortunately not as good as those seen above. Major issues with sequencing meant that not enough data was generated, and the effects of this can be seen in <b>Figure 6</b>. Judging by the adjusted p-values, even though genes can indeed be identified, as seen in <b>Table 1</b>, their statistical significance is very uncertain (the accepted threshold is an adjusted p-value of &#60;&#61; 0.05). No up- or downregulation of interest could be identified from the fold changes either. We assume that both the lack of any major fold change and the low significance are due to too little data being generated in the preceding sequencing step. Because of this, no gene could be identified as a possible candidate for our reporter system.</p>
+
           <p>The results from our own runs were unfortunately not as good as those seen above. Major issues with sequencing meant that not enough data was generated, and the effects of this can be seen in Figure 6. Judging by the adjusted p-values, even though genes can indeed be identified, as seen in Table 1, their statistical significance is very uncertain (the accepted threshold is an adjusted p-value of &#60;&#61; 0.05). No up- or downregulation of interest could be identified from the fold changes either. We assume that both the lack of any major fold change and the low significance are due to too little data being generated in the preceding sequencing step. Because of this, no gene could be identified as a possible candidate for our reporter system.</p>
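<p>The filtering criterion discussed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code behind our figures: the function names, the row layout (<code>gene</code>, <code>log2fc</code>, <code>padj</code>) and the fold-change cut-off of 1.0 on the log2 scale are hypothetical. Only the adjusted p-value threshold of 0.05 comes from the text. The first function shows the Benjamini-Hochberg procedure, which is the default multiple-testing adjustment in DESeq2.</p>

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


def filter_significant(rows, padj_max=0.05, min_abs_log2fc=1.0):
    """Keep rows that pass both the significance and the fold-change cut-off.

    padj_max follows the threshold named in the text; min_abs_log2fc is an
    assumed example value, not one taken from our analysis.
    """
    return [
        row for row in rows
        if row["padj"] <= padj_max and abs(row["log2fc"]) >= min_abs_log2fc
    ]
```

<p>With too few reads, most adjusted p-values stay well above 0.05, so a filter of this kind returns an empty list. That corresponds to what we observed for our own data.</p>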
 
                      
 
                      
                    <h1>References</h1>
+
                     
                    <p><b>[1]</b> Galaxyproject, 2018. Reference-based RNA-Seq data analysis https://galaxyproject.github.io/training-material/topics/transcriptomics/tutorials/ref-based/tutorial.html Date of visit 2018-10-15</p>                   
+
 
                      
 
                      
 
                 </div>
 
                 </div>
  
 +
<div class="card-holder">
 +
<h2 id="References">References</h2>
 +
                   
 +
<p><b>[1]</b> Galaxyproject, 2018. Reference-based RNA-Seq data analysis <a href="https://galaxyproject.github.io/training-material/topics/transcriptomics/tutorials/ref-based/tutorial.html">Galaxyproject</a> Date of visit 2018-10-15</p>
 +
</div>
 
                 <!-- HERE ENDS THE PORTION WHERE YOU PUT IN YOUR CONTENT-->
 
                 <!-- HERE ENDS THE PORTION WHERE YOU PUT IN YOUR CONTENT-->
 
                 <div style="height:5em;"></div>
 
                 <div style="height:5em;"></div>
 
             </div>
 
             </div>
 
         </div>
 
         </div>
          
+
         </div>
 
                    
 
                    
                    </body>
+
    </body>
 
</html>
 
</html>

Latest revision as of 23:37, 17 October 2018