Team:NCTU Formosa/Dry Lab/NGS Data Analysis

Correlation Model

Overview

Biostimulators affect the entire soil microbiota. The resulting bacterial distribution depends on interactions between bacteria, which can be characterized through correlation. From our initial NGS data, we determine the nature of the relationships between bacteria and use these properties, together with machine learning, to predict how the soil microbiota changes when biostimulators are added.

Figure 1: The process of correlation analysis

Original OTU Table

16S NGS sequences the gene for 16S rRNA, a component of the small subunit of the bacterial ribosome, to differentiate bacteria of different genera. Through this technology we obtain the bacterial distribution of our soil, summarized in the operational taxonomic unit table (OTU table) below:

Figure 2: Schematic diagram of the Operational Taxonomic Unit table

Ratio Analysis

NGS data include several unclassified entries, which consist of incomplete genomic segments that don't represent functional bacteria. We first delete these entries, then calculate the ratio of each remaining entry within its sample to produce the following bar chart:
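The filtering and ratio step can be sketched in a few lines of Python. This is a minimal illustration, not the team's actual pipeline: the genus names, the read counts, the `unclassified` label, and the helper `relative_abundance` are all hypothetical, and the OTU table is assumed to map each genus to a list of per-sample read counts.

```python
# Hypothetical OTU counts: genus -> read counts per sample (illustrative numbers only).
otu_counts = {
    "Bacillus": [120, 80],
    "Pseudomonas": [60, 100],
    "unclassified": [20, 20],
}

def relative_abundance(counts, drop_label="unclassified"):
    """Drop unclassified rows, then divide each count by its sample's total."""
    # Assumption: unclassified entries are identifiable by a label in the row name.
    kept = {g: row for g, row in counts.items() if drop_label not in g.lower()}
    n_samples = len(next(iter(kept.values())))
    # Per-sample totals are computed after filtering, so ratios sum to 1 per sample.
    totals = [sum(row[j] for row in kept.values()) for j in range(n_samples)]
    return {
        g: [row[j] / totals[j] if totals[j] else 0.0 for j in range(n_samples)]
        for g, row in kept.items()
    }

ratios = relative_abundance(otu_counts)
```

Because the totals are taken after the unclassified rows are removed, the ratios of the remaining genera in each sample sum to 1, which is what a stacked bar plot of ratios expects.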

Figure 3: Stacked bar plot of the ratios of the top 20 bacteria in different samples

Spearman's Rank Correlation

To predict the final microbial distribution of soil after biostimulators are added, we need to understand the interbacterial relationships that exist within the microbiome. These relationships can be summarized by calculating the Spearman correlation coefficient for each pair of bacteria using the following formula:

$$\rho_s=1-\frac{6\sum d_i^2}{n(n^2-1)}$$

Table 1: Variables and parameters in Spearman's correlation equation.

| Symbol | Unit | Explanation |
|---|---|---|
| $\rho_s$ | - | Spearman's correlation coefficient |
| $d_i$ | - | The difference between the ranks of the $i$th observation in the two groups |
| $n$ | - | The sample size |
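As a worked illustration of the formula, the following Python sketch ranks two series and applies $1-\frac{6\sum d_i^2}{n(n^2-1)}$ directly. The abundance values and the helper names `ranks` and `spearman_rho` are hypothetical. The closed-form expression is only exact when there are no tied values; with ties, the coefficient is usually computed as the Pearson correlation of average ranks (for example with `scipy.stats.spearmanr`).

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rho via 1 - 6*sum(d_i^2) / (n*(n^2 - 1)); valid without ties."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    # d_i is the difference between the two ranks of the i-th observation.
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical ratios of two genera across five samples (illustrative only).
genus_a = [0.10, 0.20, 0.30, 0.40, 0.50]
genus_b = [0.50, 0.35, 0.30, 0.15, 0.05]
rho = spearman_rho(genus_a, genus_b)  # ranks are exactly reversed, so rho is -1.0
```

Here the ranks of `genus_b` are the exact reverse of those of `genus_a`, so $\sum d_i^2 = 40$ and $\rho_s = 1 - 240/120 = -1$. Applying the same function to every pair of genera would fill a correlation matrix of the kind a heat map displays.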

The Spearman correlation coefficient is a value ranging from -1 to 1…
A computer program can then visualize the results in a heat map. A map of the 20 most abundant bacteria of our soil is shown below:

Figure 4: Correlation table of top-20 bacteria

α Diversity Analysis

Use of biostimulators to manipulate soil factors requires careful consideration of the microbiota. Certain stimulators may cause specific genera of bacteria to become overly dominant, damaging soil integrity. As a method of monitoring the balance of the microbial ecosystem, we investigate both the richness and the evenness of the soil.

Richness

Triplicate Analysis

Figure 5: The box plot of richness triplicate analysis

Evenness--Shannon index

$$H'=-\sum_{i=1}^{S}p_i\ln p_i$$

Table 2: Variables and parameters in the Shannon index equation.

| Symbol | Unit | Explanation |
|---|---|---|
| $H'$ | - | Shannon index |
| $S$ | - | The total number of genera in the sample |
| $p_i$ | - | The ratio of the amount of bacteria of the $i$th genus to the total in the sample |
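The Shannon index can be computed directly from the per-genus counts of one sample, as in the minimal Python sketch below. The counts and the function names are hypothetical, and `richness` is taken here to mean the number of genera with at least one read, which is an assumption, since the richness measure is not defined above.

```python
import math

def richness(counts):
    """Number of genera with at least one read (assumed definition of richness)."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts):
    """H' = -sum(p_i * ln p_i), skipping genera with zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    # p_i is the ratio of genus i to the sample total; zero counts contribute nothing.
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)

# Two hypothetical samples with the same richness but different evenness.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]
h_even = shannon_index(even_sample)      # equals ln(4), the maximum for S = 4
h_skewed = shannon_index(skewed_sample)  # lower, because one genus dominates
```

Both samples contain four genera, yet the skewed one has a much lower $H'$. This is the situation described above, where one genus becomes overly dominant: richness is unchanged while the Shannon index drops.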

Triplicate Analysis

Figure 6: The box plot of Shannon index triplicate analysis