The samples were processed and analyzed with the ZymoBIOMICS® Shotgun Metagenomic Sequencing Service for Microbiome Analysis (Zymo Research, Irvine, CA). Specific details for the project can be found in the final report PDF.
DNA Extraction: One of three DNA extraction kits was used, depending on the sample type and sample volume. In most cases, the ZymoBIOMICS®-96 MagBead DNA Kit (D4302, Zymo Research, Irvine, CA) was used to extract DNA on an automated platform. In some cases, the ZymoBIOMICS® DNA Miniprep Kit (D4300, Zymo Research, Irvine, CA) was used. For some low-biomass samples, such as skin swabs, the ZymoBIOMICS® DNA Microprep Kit (D4301, Zymo Research, Irvine, CA) was used because it permits a lower elution volume, resulting in more concentrated DNA samples.
Library Preparation: Genomic DNA samples were profiled with shotgun metagenomic sequencing. Sequencing libraries were prepared with the Illumina DNA Prep Kit (Illumina, San Diego, CA) following the manufacturer's protocol, using 10 bp unique dual indexes. All libraries were quantified with Qubit (Thermo Fisher Scientific) and then pooled in equal abundance. The final pool was quantified using qPCR.
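The equal-abundance pooling step can be illustrated with a minimal sketch. It assumes that "equal abundance" means each library contributes the same DNA mass, and that Qubit readings are available in ng/µL. The function name `pooling_volumes`, the sample names, and the 50 ng target are hypothetical and are not taken from the service protocol.

```python
def pooling_volumes(conc_ng_per_ul, target_ng=50.0):
    """Return the volume (in µL) of each library needed to contribute target_ng of DNA.

    conc_ng_per_ul: dict mapping library name -> Qubit concentration in ng/µL.
    target_ng: mass each library should contribute (hypothetical default).
    """
    volumes = {}
    for name, conc in conc_ng_per_ul.items():
        if conc <= 0:
            # A library with no measurable DNA cannot be pooled by mass.
            raise ValueError(f"non-positive concentration for {name}: {conc}")
        # volume = mass / concentration
        volumes[name] = target_ng / conc
    return volumes


if __name__ == "__main__":
    # Hypothetical Qubit readings for three libraries.
    readings = {"lib_A": 10.0, "lib_B": 25.0, "lib_C": 4.0}
    for name, vol in pooling_volumes(readings).items():
        print(f"{name}: {vol:.2f} µL")
```

More dilute libraries receive proportionally larger volumes, so every library contributes the same mass to the final pool.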
Sequencing: The final library was sequenced on either the Illumina NextSeq® 2000 or the Illumina NovaSeq® X.
Bioinformatics Analysis: Raw sequence reads were trimmed to remove low-quality regions and adapters with Trimmomatic-0.33 (Bolger et al., 2014), using sliding-window quality trimming with a 6 bp window and a quality cutoff of 20; reads shorter than 70 bp were removed. Host-derived reads were then removed using Kraken2 (Wood et al., 2019) against some common eukaryotic host genomes. Low-complexity reads were detected and removed using sdust (https://github.com/lh3/sdust; Morgulis et al., 2006). The surviving reads were subjected to taxonomic and functional analyses as follows. Antimicrobial resistance and virulence factor genes were identified with the DIAMOND sequence aligner (Buchfink et al., 2015) against reference databases internally curated from NCBI repositories. Microbial composition was profiled using Sourmash (Brown and Irber, 2016). The GTDB species representative database (RS207) was used for bacterial and archaeal identification. Pre-formatted GenBank databases (v. 2022.03) provided by Sourmash (https://sourmash.readthedocs.io/en/latest/databases.html) were used for virus, protozoa, and fungi identification. Reads were mapped back to the genomes identified by Sourmash using BWA-MEM (Li, 2013), and microbial abundance was determined from the counts of mapped reads. The resulting taxonomy and abundance information was further analyzed: (1) to perform alpha- and beta-diversity analyses; (2) to create microbial composition barplots with QIIME (Caporaso et al., 2010); (3) to create taxa abundance heatmaps with hierarchical clustering (based on Bray-Curtis dissimilarity); and (4) for biomarker discovery with LEfSe (Segata et al., 2011) with default settings (p<0.05 and LDA effect size >2). Functional profiling was performed using HUMAnN 3 (Beghini et al., 2021), including identification of UniRef gene families and MetaCyc metabolic pathways.
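The read-trimming parameters stated above (6 bp window, quality cutoff of 20, minimum length of 70 bp) can be illustrated with a simplified sketch of sliding-window trimming. This is not Trimmomatic's exact implementation: it assumes the read is cut at the start of the first window whose mean Phred quality falls below the cutoff, and the function names are hypothetical.

```python
def sliding_window_trim(quals, window=6, min_avg_q=20):
    """Return how many leading bases to keep from a read.

    quals: list of per-base Phred quality scores.
    The read is scanned from the 5' end; it is cut at the start of the
    first window whose mean quality is below min_avg_q (simplifying assumption).
    """
    n = len(quals)
    if n < window:
        # Assumption: a read shorter than one window is judged as a whole.
        return n if n > 0 and sum(quals) / n >= min_avg_q else 0
    for start in range(n - window + 1):
        if sum(quals[start:start + window]) / window < min_avg_q:
            return start
    return n


def passes_length_filter(kept_length, min_len=70):
    """Reads shorter than min_len after trimming are discarded."""
    return kept_length >= min_len


if __name__ == "__main__":
    # Hypothetical read: 80 high-quality bases followed by 20 low-quality ones.
    read_quals = [30] * 80 + [10] * 20
    kept = sliding_window_trim(read_quals)
    print(kept, passes_length_filter(kept))
```

A read whose quality collapses early is trimmed to fewer than 70 bases and is therefore dropped by the length filter, while a read that degrades only near its 3' end loses just its tail.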
Beghini, F., McIver, L. J., et al. (2021). Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife, 10, e65088.
Bolger, A.M., Lohse, M., and Usadel, B. (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30: 2114-2120.
Morgulis, A., Gertz, E. M., Schäffer, A. A., & Agarwala, R. (2006). A fast and symmetric DUST implementation to mask low-complexity DNA sequences. Journal of Computational Biology, 13(5), 1028-1040.
Buchfink, B., Xie, C., Huson, D.H. (2015) Fast and sensitive protein alignment using DIAMOND. Nature Methods 12:59-60.
Brown, C. T., & Irber, L. (2016). sourmash: a library for MinHash sketching of DNA. Journal of open source software, 1(5), 27.
Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K. et al. (2010) QIIME allows analysis of high-throughput community sequencing data. Nat Methods 7: 335-336.
Li, H. (2013). Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv preprint arXiv:1303.3997.
Segata, N., Izard, J., Waldron, L., Gevers, D., Miropolsky, L., Garrett, W.S., and Huttenhower, C. (2011) Metagenomic biomarker discovery and explanation. Genome Biol 12: R60.
Wood, D. E., Lu, J., & Langmead, B. (2019). Improved metagenomic analysis with Kraken 2. Genome biology, 20, 1-13.