16S rRNA Metagenomic analysis

Experiment_protocol

Figure 1. Pipeline of Experiment

After purification, the amplicon DNA was end-repaired and A-tailed using the polymerase activity of the Klenow fragment. Indexed adapters were then ligated to the DNA fragments with DNA ligase. After the libraries were validated by qPCR, Experion, and Qubit, they were sequenced on an Illumina HiSeq.

bioinfo_pipeline

Figure 2. Bioinformatic pipeline

  • Sequencing Results

Library Name                              Sample Pool
Required data (bp)                        200,000,000
Number of total Illumina reads (R1+R2)    2,563,120
Average read length (bp)                  126
Total Illumina bases                      322,953,120
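As a quick consistency check, the total base count in the table should equal the number of reads multiplied by the average read length. The short sketch below reproduces that arithmetic from the reported figures; the variable names are illustrative only.

```python
# Figures copied from the Sequencing Results table.
total_reads = 2_563_120        # R1 + R2
avg_read_length = 126          # bp
required_bases = 200_000_000   # bp requested for the library

# Total yield is reads x read length.
total_bases = total_reads * avg_read_length
# How much data was obtained relative to the requirement.
yield_ratio = total_bases / required_bases

print(f"Total bases: {total_bases:,}")                       # Total bases: 322,953,120
print(f"Yield relative to requirement: {yield_ratio:.2f}x")  # 1.61x
```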

 The CLC Genomics Workbench (CLC Bio, Aarhus, Denmark) supports automatic grouping of samples by sequence tag (barcode). By assigning a unique barcode to each sample, multiple samples can be pooled and sequenced in parallel, and the resulting reads can later be binned according to sample. The “Not grouped” bin contains reads whose barcodes are inconsistent with any sample.
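The binning idea can be sketched in a few lines. This is a minimal illustration, not the CLC Genomics Workbench implementation: the barcode sequences, their length, and the exact-match rule at the start of each read are all assumptions, since the report does not list the actual barcodes or matching settings.

```python
from collections import defaultdict

# Hypothetical barcodes; the report does not give the real sequences.
BARCODES = {"ACGT": "T1", "TGCA": "T2", "GATC": "T3"}
BARCODE_LEN = 4

def demultiplex(reads):
    """Bin reads by their leading barcode; unmatched reads go to 'Not grouped'."""
    bins = defaultdict(list)
    for read in reads:
        sample = BARCODES.get(read[:BARCODE_LEN], "Not grouped")
        if sample == "Not grouped":
            bins[sample].append(read)                # keep the read untouched
        else:
            bins[sample].append(read[BARCODE_LEN:])  # strip the barcode
    return dict(bins)

reads = ["ACGTTTGACC", "TGCAGGATTA", "GATCCCATGA", "NNGTTTGACC"]
bins = demultiplex(reads)
for sample, seqs in bins.items():
    print(sample, seqs)
```

The last read starts with an unrecognized tag, so it lands in the “Not grouped” bin.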

 

  • Demultiplex Results

Sample         Barcode    No. of demultiplexed reads    Percentage of total reads
T1             1          780,920                       30.5%
T2             2          618,270                       24.1%
T3             3          672,196                       26.2%
Not grouped    -          491,734                       19.2%
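The percentages in this table are each sample's share of all sequenced reads. The sketch below recomputes them from the read counts reported above; only the variable names are invented.

```python
# Read counts copied from the Demultiplex Results table.
demultiplexed_reads = {
    "T1": 780_920,
    "T2": 618_270,
    "T3": 672_196,
    "Not grouped": 491_734,
}

# The bins together account for every sequenced read (R1 + R2).
total = sum(demultiplexed_reads.values())

# Share of the total reads per bin, rounded to one decimal place.
percentages = {
    sample: round(100 * count / total, 1)
    for sample, count in demultiplexed_reads.items()
}

print(f"Total reads: {total:,}")
for sample, pct in percentages.items():
    print(f"{sample}: {pct}%")
```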

 The barcode-labeled paired reads were merged using the “fastq_mergepairs” function of the UPARSE software.

Sample    Barcode    No. of input pairs    No. of merged pairs    Percentage of merged pairs
T1        1          390,460               386,804                99.1%
T2        2          309,135               300,186                97.3%
T3        3          336,098               320,731                96.3%
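To illustrate what merging a read pair means, the sketch below joins a forward read and a reverse read on their longest exact overlap. It is a deliberately simplified stand-in, not the algorithm used by fastq_mergepairs, which also takes base quality scores into account. The function names, the `min_overlap` parameter, and the example sequences are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def merge_pair(forward, reverse, min_overlap=5):
    """Merge a read pair on the longest exact overlap; return None if none is found."""
    rev_rc = reverse_complement(reverse)
    max_len = min(len(forward), len(rev_rc))
    # Try the longest candidate overlap first.
    for overlap in range(max_len, min_overlap - 1, -1):
        if forward[-overlap:] == rev_rc[:overlap]:
            return forward + rev_rc[overlap:]
    return None

# Hypothetical 20 bp fragment sequenced from both ends with 14 bp reads.
fragment = "ACGTACGGTTAACCGGATTC"
forward = fragment[:14]
reverse = reverse_complement(fragment[-14:])

merged = merge_pair(forward, reverse)
print(merged == fragment)
```

Pairs with no sufficiently long overlap return `None`, which corresponds to the input pairs that are not counted as merged.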

 UPARSE (v8.0.1623_i86linux32; Edgar RC, 2013) is a method for clustering globally trimmed sequences into operational taxonomic units (OTUs), with a focus on reducing OTU inflation. The results reported here are based on analysis with UPARSE. We followed the recommended UPARSE pipeline: merging of paired reads, quality filtering, dereplication, discarding of singletons, OTU clustering, chimera filtering against a reference database, and estimation of OTU richness.
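Two of the steps listed above, dereplication and discarding singletons, are simple enough to sketch directly. The code below is an illustration of the idea only and does not call UPARSE; the function name, the `min_size` parameter, and the toy reads are assumptions.

```python
from collections import Counter

def dereplicate(sequences, min_size=2):
    """Collapse identical sequences and drop those seen fewer than min_size times."""
    counts = Counter(sequences)
    # min_size=2 discards singletons (sequences observed exactly once).
    kept = [(seq, n) for seq, n in counts.items() if n >= min_size]
    # Order by decreasing abundance; ties are broken alphabetically for determinism.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return kept

# Toy reads: one sequence seen three times, one twice, one singleton.
reads = ["ACGT", "ACGT", "ACGT", "TTGA", "TTGA", "GGCC"]
unique = dereplicate(reads)
print(unique)  # [('ACGT', 3), ('TTGA', 2)]
```

Sorting by abundance puts the most frequently observed sequences first, so they are considered first when OTUs are clustered.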

 

Region    Number of OTUs
16S V6    3,844

 QIIME (Caporaso et al., 2010) is used to classify reads into operational taxonomic units (OTUs), which gives an indication of the diversity of species found in a particular sample. At present, we use a ribosomal RNA gene database to classify the 16S region in archaea and bacteria.
 The taxonomy summary results are shown in the HTML file. By default, the relative abundance of each taxonomic group is reported. Because the number of sequences differs between samples, each sample must be standardized to make the samples comparable. The most natural normalization is to convert absolute abundances to relative abundances, so that each number in the OTU table represents the proportion of sequences from that sample belonging to that OTU.
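The normalization described above amounts to dividing each count by its sample total. The sketch below shows this on a toy OTU table. The table layout (a dictionary of samples mapping OTU names to counts), the OTU names, and the counts are hypothetical and are not taken from the report's data.

```python
def to_relative_abundance(otu_table):
    """Convert {sample: {otu: count}} into {sample: {otu: proportion}}."""
    relative = {}
    for sample, counts in otu_table.items():
        total = sum(counts.values())
        # Guard against empty samples to avoid division by zero.
        relative[sample] = {
            otu: (count / total if total else 0.0)
            for otu, count in counts.items()
        }
    return relative

# Hypothetical counts: T1 has 100 sequences, T2 only 20.
otu_table = {
    "T1": {"OTU_1": 60, "OTU_2": 40},
    "T2": {"OTU_1": 5, "OTU_2": 15},
}
relative = to_relative_abundance(otu_table)
print(relative["T1"])
print(relative["T2"])
```

After normalization each sample sums to 1, so the two samples can be compared even though one has five times as many sequences as the other.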


Figure 3. Taxonomy Area chart and bar chart at Current level.

taxonomy_plot

Figure 4. Taxonomy detail at Current level.

taxonomy_text

 Community ecologists are often interested in computing rarefaction curves for samples or groups of samples in their study. Here, rarefaction curves were computed using QIIME’s alpha_rarefaction.py workflow; the results are shown in the HTML files in the “Community_analysis” folder.
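A rarefaction curve plots the number of OTUs observed against the number of sequences sampled. The sketch below illustrates the idea by repeatedly subsampling a toy sample without replacement and averaging the observed OTU count at each depth. It is not the alpha_rarefaction.py implementation; the function name, its parameters, and the OTU counts are assumptions.

```python
import random

def rarefaction_curve(otu_counts, depths, n_iter=10, seed=0):
    """Return (depth, mean observed OTUs) for each subsampling depth."""
    rng = random.Random(seed)
    # Expand the counts into one OTU label per sequence.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    curve = []
    for depth in depths:
        if depth > len(reads):
            break  # cannot sample more sequences than the sample contains
        # Subsample without replacement and count the distinct OTUs seen.
        observed = [len(set(rng.sample(reads, depth))) for _ in range(n_iter)]
        curve.append((depth, sum(observed) / n_iter))
    return curve

# Hypothetical sample with 100 sequences spread over four OTUs.
otu_counts = {"OTU_1": 50, "OTU_2": 30, "OTU_3": 15, "OTU_4": 5}
curve = rarefaction_curve(otu_counts, depths=[10, 50, 100])
for depth, mean_otus in curve:
    print(depth, mean_otus)
```

A curve that flattens as depth increases suggests that further sequencing would reveal few additional OTUs.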


Figure 5. Rarefaction curve.

curve

Figure 6. Rarefaction curve detail.

curve_text

  1. Edgar RC. (2013) UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nat Methods. 10:996-998.
  2. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, Fierer N, Peña AG, Goodrich JK, Gordon JI, Huttley GA, Kelley ST, Knights D, Koenig JE, Ley RE, Lozupone CA, McDonald D, Muegge BD, Pirrung M, Reeder J, Sevinsky JR, Turnbaugh PJ, Walters WA, Widmann J, Yatsunenko T, Zaneveld J, Knight R. (2010) QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 7:335-336.

© 2017 Yourgene BioScience Inc.