The following Sample Information Table shows information for up to 10 samples. For projects with more than 10 samples, the Complete Sample Information Table can be accessed by clicking on the first link below the table. The Read Processing Summary Table can be accessed by clicking on the second link below the table.
Sample Information Table:
1. sample_id: unique identification code that was assigned to each sample.
2. customer_label: sample name provided for the project and used in analyses.
3. raw_seq_files: names of the associated raw sequencing files for each sample (available from the raw data package download).
4. Subgroup Columns: group comparison information. If group comparison information was provided for the project, it is displayed in the remaining columns of the table.
Read Processing Summary Table:
1. internal_id: unique identification code that was assigned to each sample.
2. customer_label: sample name provided for the project and used in analyses.
3. rawseqs(R1+R2): number of raw sequences generated for each sample.
4. trimmed_seqs(R1+R2): number of sequences retained after quality trimming.
5. dada2_infered: number of sequences retained after DADA2 quality control trimming.
6. chimera_seqs: number of chimeric sequences identified in the dada2_infered sequences.
7. chimera_free_seqs: number of chimera-free sequences identified in the dada2_infered sequences.
8. unique_seqs: number of unique sequences identified in the chimera-free sequences.
9. seqs(after_size_filtration): number of chimera-free sequences retained after additional amplicon size filtration. These are the data analyzed with QIIME for the rest of the report.
10. final_unique_seqs: number of unique sequences identified in size-filtered chimera-free sequences.
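The counts in the Read Processing Summary Table can be turned into per-step retention fractions to see where reads are lost. The following is a minimal sketch: the dictionary keys mirror the column names listed above, but the function name and the example counts are hypothetical and not taken from this report.

```python
def retention_summary(counts):
    """Return the fraction of raw reads retained at each processing step."""
    raw = counts["rawseqs(R1+R2)"]
    if raw <= 0:
        raise ValueError("rawseqs(R1+R2) must be positive")
    # Every column other than the raw count is expressed relative to the raw count.
    return {step: n / raw for step, n in counts.items() if step != "rawseqs(R1+R2)"}


# Hypothetical counts for a single sample (illustrative only).
example_counts = {
    "rawseqs(R1+R2)": 100000,
    "trimmed_seqs(R1+R2)": 90000,
    "dada2_infered": 80000,
    "chimera_free_seqs": 70000,
    "seqs(after_size_filtration)": 65000,
}

summary = retention_summary(example_counts)
for step, fraction in summary.items():
    print(f"{step}: {fraction:.1%} of raw reads")
```

A sharp drop at one step (for example, at chimera removal) may point to the stage that deserves a closer look.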
sample_id | customer_label | raw_seq_files | Subgroup1 |
---|---|---|---|
in459_61 | F1 | in459_61_R1.fastq.gz;in459_61_R2.fastq.gz | groupC |
in459_67 | F2 | in459_67_R1.fastq.gz;in459_67_R2.fastq.gz | groupC |
in459_68 | F3 | in459_68_R1.fastq.gz;in459_68_R2.fastq.gz | groupC |
in459_86 | F4 | in459_86_R1.fastq.gz;in459_86_R2.fastq.gz | groupC |
in459_96 | F5 | in459_96_R1.fastq.gz;in459_96_R2.fastq.gz | groupC |
in459_106 | F6 | in459_106_R1.fastq.gz;in459_106_R2.fastq.gz | groupC |
in459_62 | F7 | in459_62_R1.fastq.gz;in459_62_R2.fastq.gz | groupD |
in459_63 | F8 | in459_63_R1.fastq.gz;in459_63_R2.fastq.gz | groupD |
in459_74 | F9 | in459_74_R1.fastq.gz;in459_74_R2.fastq.gz | groupD |
in459_76 | F10 | in459_76_R1.fastq.gz;in459_76_R2.fastq.gz | groupD |
The plot below shows the absolute abundance of bacterial (16S) or fungal (ITS) DNA measured in the samples (based on the service requested). For analyses without group comparison, a histogram of gene copies per microliter in each sample is shown. For analyses with group comparison, a box-and-whisker plot of gene copies per microliter in each group is shown. The Absolute Abundance Table, which contains data for gene copies, calculated genome copies, and calculated amount of DNA, can be accessed by clicking on the first link below the table. More information about the table can be found by clicking on the More Information button below.
Absolute Abundance Table:
1. sample_id: unique identification code that was assigned to each sample.
2. customer_label: sample name provided for the project and used in analyses.
3. gene_copies: number of gene copies measured in one microliter of DNA sample.
4. genome_copies: number of genome copies in one microliter of DNA sample calculated using an assumed number of four (4) 16S copies per genome or two hundred (200) ITS copies per genome.
5. DNA_ng: amount of DNA in one microliter of DNA sample calculated using genome_copies and an assumed genome size of 4.64 x 10^6 bp (Escherichia coli) for 16S or 1.20 x 10^7 bp (Saccharomyces cerevisiae) for ITS.
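The calculated columns can be approximated from gene_copies as sketched below. The copies-per-genome and genome-size values come from the descriptions above. The conversion from base pairs to mass is not stated in the report, so the average mass of about 650 g/mol per double-stranded base pair used here is an assumption, as are the function names.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
BP_MASS_G_PER_MOL = 650.0     # ASSUMPTION: average mass of one double-stranded base pair

# Values stated in the Absolute Abundance Table description.
COPIES_PER_GENOME = {"16S": 4, "ITS": 200}
GENOME_SIZE_BP = {"16S": 4.64e6, "ITS": 1.20e7}


def genome_copies(gene_copies, marker):
    """Genome copies per microliter from measured gene copies per microliter."""
    return gene_copies / COPIES_PER_GENOME[marker]


def dna_ng(gene_copies, marker):
    """Estimated nanograms of DNA per microliter."""
    genomes = genome_copies(gene_copies, marker)
    grams = genomes * GENOME_SIZE_BP[marker] * BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e9  # grams to nanograms


# Hypothetical measurement: 400,000 16S gene copies per microliter.
print(genome_copies(4e5, "16S"))        # 100000.0 genome copies
print(round(dna_ng(4e5, "16S"), 3))     # roughly 0.5 ng under the assumptions above
```

Because the copy number and genome size are fixed assumptions, these values are best read as order-of-magnitude estimates rather than exact quantities.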
Absolute Abundance Boxplot By Groups: Subgroup1
Taxa composition plots illustrate the microbial composition at different taxonomy levels from phylum to species. The interactive figure below shows the microbial composition at species level. Additional composition barplots and abundance tables can be accessed by clicking on the link below the figure.
The taxonomy abundance heatmap with sample clustering is a quick way to help identify patterns of microbial distribution among samples. Heatmaps at different taxonomic levels and with or without sample clustering can be found by clicking the links below the figure.
The following heatmap shows the microbial composition of the samples at the species level for the fifty most abundant species identified. Each row represents the abundance of one taxon, with the taxonomy ID shown on the right. Each column represents one sample, with the sample ID shown at the bottom. If available, group information is indicated by the colored bar at the top of each column. Hierarchical clustering was performed on samples based on Bray-Curtis dissimilarity. Hierarchical clustering was also performed on the taxa so that taxa with similar distributions are grouped together.
Heatmaps with Sample Clustering:
Subgroup1: Phylum Class Order Family Genus Species
Heatmaps without Sample Clustering:
Subgroup1: Phylum Class Order Family Genus Species
The amplicon sequence variant (ASV) abundance heatmap is built directly from the abundance of unique amplicon sequences inferred from raw sequencing data. Heatmaps with or without sample clustering can be found by clicking the links below the figure.
Heatmaps with Sample Clustering: Subgroup1
Heatmaps without Sample Clustering: Subgroup1
Alpha diversity is a measurement of the microbial diversity of each sample. The plot below shows the number of observed species in the samples. For analyses without group comparison, a histogram of observed species in each sample is shown. For analyses with group comparison, a box-and-whisker plot of observed species in each group is shown. Alpha diversity graphs generated with other metrics can be found by clicking the last link below the figure.
Normally, alpha diversity increases with sequencing depth, because more low-abundance taxa are identified. Alpha diversity rarefaction graphs generated with other metrics can be found by clicking the link given below the figure.
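To illustrate why observed richness depends on depth, the sketch below counts observed taxa in a sample and then recounts them after subsampling the reads to a smaller depth without replacement. It is a simplified illustration with hypothetical counts and function names, not the procedure used to generate the report's graphs.

```python
import random


def observed_features(counts):
    """Number of taxa with at least one read."""
    return sum(1 for c in counts.values() if c > 0)


def rarefied_richness(counts, depth, seed=0):
    """Observed taxa after subsampling `depth` reads without replacement."""
    # Expand the count table into one entry per read.
    pool = [taxon for taxon, c in counts.items() for _ in range(c)]
    if depth > len(pool):
        raise ValueError("depth exceeds the total number of reads")
    rng = random.Random(seed)
    return len(set(rng.sample(pool, depth)))


# Hypothetical read counts per taxon for one sample.
sample_counts = {"taxon_a": 50, "taxon_b": 30, "taxon_c": 0, "taxon_d": 1}

print(observed_features(sample_counts))      # 3 taxa have at least one read
for depth in (5, 20, 81):
    print(depth, rarefied_richness(sample_counts, depth))
```

Rare taxa such as `taxon_d` are often missed at shallow depths, which is the pattern a rarefaction curve makes visible.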
Alpha Diversity Boxplots: Subgroup1
Beta diversity is a measurement of microbial diversity differences between samples. The figure below is the 3-dimensional principal coordinate analysis (PCoA) plot created using the matrix of pairwise distances between samples, calculated as the Bray-Curtis dissimilarity of unique amplicon sequence variants (ASVs). Interactive 3-dimensional plots of beta diversity with different distance metrics can be accessed by clicking the links given below the figures.
Each dot on the beta diversity plot represents the whole microbial composition profile of one sample. Samples with similar microbial composition profiles are closer to each other, while samples with different profiles are farther away from each other.
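The pairwise distance matrix behind such a plot can be sketched as follows, using the standard Bray-Curtis definition: the sum of absolute count differences divided by the sum of all counts in the two samples. The sample names and ASV counts are hypothetical, and the PCoA ordination step itself is not shown.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two samples given as {feature: count}."""
    features = set(u) | set(v)
    numerator = sum(abs(u.get(f, 0) - v.get(f, 0)) for f in features)
    denominator = sum(u.get(f, 0) + v.get(f, 0) for f in features)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator


def pairwise_distances(samples):
    """Distance for every ordered pair of samples, keyed by (name_a, name_b)."""
    names = sorted(samples)
    return {(a, b): bray_curtis(samples[a], samples[b]) for a in names for b in names}


# Hypothetical ASV counts for three samples.
samples = {
    "S1": {"asv1": 6, "asv2": 4},
    "S2": {"asv1": 4, "asv2": 6},
    "S3": {"asv3": 10},
}

distances = pairwise_distances(samples)
print(distances[("S1", "S2")])   # 0.2: similar profiles
print(distances[("S1", "S3")])   # 1.0: no shared ASVs
```

A value of 0 means identical profiles and 1 means no shared features, which is why similar samples end up close together in the ordination.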
Beta Diversity 3D Emperor Plot View:
LEfSe analysis helps to identify taxa whose distributions differ significantly among pre-defined groups.
LEfSe uses statistical analysis to identify taxa whose distributions among pre-defined groups are significantly different. It also uses the concept of effect size so that researchers can focus on the taxa with the largest differences. By default, LEfSe identifies taxa whose distributions among groups differ with a p-value < 0.05 and an effect size (LDA score) greater than 2. LEfSe analysis is only possible if group information is given. It can help researchers identify biomarkers among or between groups (e.g., control group vs. disease group). Major outputs from LEfSe analysis include the following:
1. Interactive Biomarkers Plot: This plot shows the distribution of the abundance of identified biomarkers among all samples. Click on the bars of biomarkers on the Interactive Biomarkers Plot to access the abundance distribution profile among groups.
2. Biomarkers Plot: This plot lists biomarkers by group definition and effect size.
3. Cladogram Plot: This plot illustrates identified biomarkers (colored by group) in the context of a phylogenetic tree.
4. LEfSe Statistics Table (Output): This Excel file stores the raw effect sizes (4th column, column D) and p-values (5th column, column E) from the statistical analysis. The group in which each taxon was more abundant is in the 3rd column (column C).
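The default thresholds described above (p-value < 0.05 and LDA score greater than 2) can be applied to rows of the statistics table as in the sketch below. The field names, taxa, and values are hypothetical, and loading the Excel file is left out.

```python
def select_biomarkers(rows, p_cutoff=0.05, lda_cutoff=2.0):
    """Keep rows that pass both the p-value and the LDA effect-size thresholds."""
    return [
        row for row in rows
        if row["p_value"] < p_cutoff and row["lda_score"] > lda_cutoff
    ]


# Hypothetical rows; in the statistics table the group is in column C,
# the effect size in column D, and the p-value in column E.
rows = [
    {"taxon": "taxon_a", "group": "groupC", "lda_score": 3.4, "p_value": 0.01},
    {"taxon": "taxon_b", "group": "groupD", "lda_score": 1.5, "p_value": 0.02},
    {"taxon": "taxon_c", "group": "groupD", "lda_score": 4.1, "p_value": 0.20},
]

for row in select_biomarkers(rows):
    print(row["taxon"], row["group"], row["lda_score"])
```

Only `taxon_a` passes here: `taxon_b` has too small an effect size and `taxon_c` is not statistically significant.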
Interactive Biomarkers Plot: Subgroup1
Biomarkers Plot (PDF): Subgroup1
Cladogram Plot (PDF): Subgroup1
Taxa2ASV stands for taxonomy to amplicon sequence variants. In this analysis, a taxon of interest can be decomposed into its unique amplicon sequences to facilitate further analyses.
This section is for demonstration purposes only. Your generated report will include Taxa2ASV Decomposer Outputs organized by both Family and Genus.