Sequencing platform and library preparation choices impact viral metagenomes
1 Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, USA
2 Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA
3 CEA, DSV, IG, Genoscope, 2 rue Gaston Crémieux CP5706, Evry, Cedex, 91057, France
4 Department of Microbiology and Immunology, University of British Columbia, Vancouver, BC, Canada
5 Department of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA, USA
6 Australian Center for Ecogenomics, University of Queensland, Brisbane, QLD, Australia
BMC Genomics 2013, 14:320 doi:10.1186/1471-2164-14-320
Published: 10 May 2013
Microbes drive the biogeochemistry that fuels the planet. Microbial viruses modulate their hosts directly through mortality and horizontal gene transfer, and indirectly by re-programming host metabolisms during infection. However, our ability to study these virus-host interactions is limited by methods that are low-throughput and heavily reliant upon the subset of organisms that are in culture. One way forward is to use culture-independent metagenomic approaches, but these novel methods are rarely rigorously tested, especially for studies of environmental viruses, air microbiomes, extreme environment microbiology and other areas with constrained sample amounts. Here we perform replicated experiments to evaluate Roche 454, Illumina HiSeq, and Ion Torrent PGM sequencing and library preparation protocols on virus metagenomes generated from as little as 10 pg of DNA.
Using %G + C content to compare metagenomes, we find that (i) metagenomes are highly replicable, (ii) some treatment effects are minimal, e.g., sequencing technology choice has 6-fold less impact than varying input DNA amount, and (iii) when restricted to a limited DNA concentration (<1 μg), changing the amount of amplification produces little variation. These trends were also observed when examining the metagenomes for gene function and assembly performance, although the latter was more closely aligned with sequencing effort and read length than with the preparation steps tested. Among Illumina library preparation options, transposon-based libraries diverged from all others and adaptor ligation was a critical step for optimizing sequencing yields.
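As an illustration of what a %G + C comparison between two metagenomes can look like, the following is a minimal sketch and not the analysis pipeline used in this study. It assumes reads are already available as plain strings, bins per-read %G + C into 5-point intervals, and compares the two resulting profiles with a Euclidean distance. The function names (`gc_percent`, `gc_profile`, `profile_distance`), the bin width, and the choice of distance are all assumptions made for the example.

```python
from math import sqrt


def gc_percent(read):
    """Return the %G+C of one read, or None if it has no A/C/G/T bases.

    Ambiguous bases such as N are ignored (an assumption of this sketch).
    """
    seq = read.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else None


def gc_profile(reads, bin_width=5):
    """Return a normalized histogram of per-read %G+C values.

    Bins are bin_width percentage points wide; a read at exactly 100% G+C
    is placed in the last bin.
    """
    n_bins = 100 // bin_width
    counts = [0] * n_bins
    for read in reads:
        pct = gc_percent(read)
        if pct is None:
            continue  # skip reads with no unambiguous bases
        idx = min(int(pct // bin_width), n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def profile_distance(p, q):
    """Euclidean distance between two %G+C profiles of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


if __name__ == "__main__":
    # Toy reads standing in for two library preparations of one sample.
    library_a = ["ATGCATGC", "GGCCATAT", "ATATATGC"]
    library_b = ["ATGCATGC", "GGCCGGCC", "ATATATAT"]
    dist = profile_distance(gc_profile(library_a), gc_profile(library_b))
    print(f"Profile distance: {dist:.3f}")
```

Under these assumptions, replicate metagenomes would give a distance near zero, while a treatment that shifts the %G + C distribution would give a larger value.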
These data guide researchers in generating systematic, comparative datasets to understand complex ecosystems, and suggest that neither varied amplification nor sequencing platforms will deter such efforts.