The use of biostimulators affects the entire microbiota. The resulting bacterial distribution depends on interactions between bacteria, which can be illustrated through correlation. From our initial NGS data we can determine the nature of the relationships between bacteria and use these properties, together with machine learning, to predict how the soil microbiota changes when biostimulators are added.
Marker Gene Amplicon Analysis
Microbiome data are generated from the 16S ribosomal RNA (rRNA) gene. The PCR primers were designed to amplify the V4 region of the bacterial 16S ribosomal DNA. After 16S rRNA sequencing, we used QIIME to generate operational taxonomic unit (OTU) tables. We then used bioinformatics tools and statistical methods to analyze microbial diversity in the soil samples. We also used machine learning to predict how the soil microbiota changes when biostimulators are added.
Operational Taxonomic Units Table (OTU Table)
Figure 1 is an example of an OTU table. Each column lists the types and amounts of bacteria (OTU1, OTU2, …, OTU7) found in one soil sample (A1, A2, A3, B1, B2, and B3). We generate a table for each taxonomic level: Phylum, Class, Order, Family, Genus, and Species.
Data Analysis Process
The OTU tables produced by the open-source QIIME pipeline contain unclassified entries. We therefore rearrange the data to facilitate analysis, according to the following steps:
Step 1. Delete unclassified genomic segments.
Step 2. Calculate the ratios of the remaining entries.
Step 3. Select the most abundant bacteria within the soil samples and observe their distribution using bar charts (Fig. 3).
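The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline actually used: the table layout (a dictionary mapping each taxon name to its counts per sample), the `"unclassified"` label, the function name `clean_otu_table`, and the example counts are all assumptions made for the sketch. Ratios are taken per sample over the classified entries only, and taxa are ranked by their mean ratio across samples.

```python
def clean_otu_table(counts, top_n=20, marker="unclassified"):
    """Filter, normalize, and rank a taxon -> per-sample counts table.

    `counts`, `marker`, and `top_n` are hypothetical names for this sketch.
    """
    # Step 1: delete entries whose name marks them as unclassified.
    kept = {name: row for name, row in counts.items()
            if marker not in name.lower()}
    if not kept:
        return {}
    n_samples = len(next(iter(kept.values())))

    # Step 2: convert counts to ratios within each sample.
    totals = [sum(row[j] for row in kept.values()) for j in range(n_samples)]
    ratios = {
        name: [row[j] / totals[j] if totals[j] else 0.0
               for j in range(n_samples)]
        for name, row in kept.items()
    }

    # Step 3: keep the taxa with the highest mean ratio across samples.
    ranked = sorted(ratios, key=lambda name: sum(ratios[name]) / n_samples,
                    reverse=True)
    return {name: ratios[name] for name in ranked[:top_n]}


# Made-up counts for two samples, for illustration only.
example = {
    "Bacillus": [30, 10],
    "Pseudomonas": [10, 30],
    "Streptomyces": [10, 10],
    "unclassified": [50, 50],
}
top = clean_otu_table(example, top_n=2)
for name, row in top.items():
    print(name, [round(r, 2) for r in row])
```

The returned ratios can be passed directly to any plotting library to draw stacked bar charts per sample.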
Spearman's Rank Correlation
The strength of co-occurrence between bacteria in the soil samples was evaluated with Spearman's rank correlation coefficient, which ranges from -1 to 1. The formula for the coefficient is as follows:
$$\rho_s = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2-1)}$$
Symbol | Unit | Explanation |
---|---|---|
$\rho_s$ | - | Spearman's correlation value |
$d_i$ | - | The difference between the two ranks of the $i$th observation |
$n$ | - | The sample size |
We then use heat maps to visualize the correlation strengths. Figure 4 shows the heat map for the 20 most abundant bacteria in our soil samples.
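As a rough illustration of how such correlation values can be computed, the sketch below implements the rank-difference form of the Spearman coefficient with the symbols from the table ($d_i$, $n$). The function names and the abundance values are assumptions for this example. The rank-difference form is exact only when there are no tied values; ties are given average ranks here, which makes the result an approximation in that case.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """rho_s = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), exact without ties."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def correlation_matrix(table):
    """Pairwise Spearman coefficients for a taxon -> per-sample ratios table."""
    names = list(table)
    return {a: {b: spearman(table[a], table[b]) for b in names} for a in names}


# Made-up relative abundances across six samples, for illustration only.
ratios = {
    "taxon_1": [0.30, 0.25, 0.28, 0.10, 0.12, 0.08],
    "taxon_2": [0.05, 0.08, 0.06, 0.20, 0.18, 0.22],
}
matrix = correlation_matrix(ratios)
print(matrix["taxon_1"]["taxon_2"])
```

The nested dictionary returned by `correlation_matrix` is symmetric with ones on the diagonal, so it can be handed to a plotting library as the input for a heat map.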
Alpha-Diversity Analysis
Use of biostimulators to manipulate soil factors requires careful consideration of the microbiota. Certain stimulators may cause specific genera of bacteria to become overly dominant, damaging soil integrity. As a method of monitoring the balance of the microbial ecosystem, we investigate the evenness of the soil.
Evenness: Shannon Index
Microbial diversity is measured by alpha-diversity (α-diversity). In our study, α-diversity refers to richness and the Shannon diversity index. Richness is the number of OTUs, and the evenness of the bacterial community is measured by the Shannon diversity index, as shown below:
$$H' = -\sum_{i=1}^{S} p_i \ln p_i$$
Symbol | Unit | Explanation |
---|---|---|
$H'$ | - | Shannon index |
$S$ | - | The total number of genera in the sample |
$p_i$ | - | The proportion of bacteria in the sample that belong to the $i$th genus |
A higher Shannon index indicates greater evenness. The exponential of the index gives an estimate of the degree of evenness. For example, for a soil sample with a Shannon index of 2.85, $$e^{2.85} \approx 17$$ which means that the sample is approximately equivalent to a community of 17 kinds of bacteria present in equal numbers. The Shannon index can therefore be used to monitor whether biostimulators decrease the overall evenness, and thus the health and stability, of the soil.
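The calculation can be sketched as follows, assuming the natural logarithm (consistent with the exponential used in the example) and a plain list of per-genus counts for one sample. The function names and the counts are hypothetical and chosen only for illustration.

```python
import math


def richness(counts):
    """Number of genera with a non-zero count in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """H' = -sum(p_i * ln(p_i)) over genera with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no counts")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def effective_number(h):
    """exp(H'): number of equally abundant genera giving the same index."""
    return math.exp(h)


# Made-up samples: one perfectly even, one dominated by a single genus.
even_sample = [10] * 17
skewed_sample = [90, 5, 5]

for label, sample in (("even", even_sample), ("skewed", skewed_sample)):
    h = shannon_index(sample)
    print(label, richness(sample), round(h, 3), round(effective_number(h), 2))
```

For the perfectly even sample, the effective number equals the richness. For the skewed sample, it falls well below the richness, which is the kind of drop that would signal one genus becoming overly dominant.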
Triplicate Analysis