Open Access Software

Strainer: software for analysis of population variation in community genomic datasets

John M Eppley1,3, Gene W Tyson2,3, Wayne M Getz2 and Jillian F Banfield2*

Author Affiliations

1 Department of Bioengineering, University of California, Berkeley, CA 94720, USA

2 Department of Environmental Science, Policy and Management, University of California, Berkeley, CA 94720, USA

3 Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

BMC Bioinformatics 2007, 8:398  doi:10.1186/1471-2105-8-398

Published: 17 October 2007

Abstract

Background

Metagenomic analyses of microbial communities that are comprehensive enough to provide multiple samples of most loci in the genomes of the dominant organism types will also reveal patterns of genetic variation within natural populations. New bioinformatic tools will enable visualization and comprehensive analysis of this sequence variation and inference of recent evolutionary and ecological processes.

Results

We have developed a software package for analysis and visualization of genetic variation in populations and reconstruction of strain variants from otherwise co-assembled sequences. Sequencing reads can be clustered by matching patterns of single nucleotide polymorphisms to generate predicted gene and protein variant sequences, identify conserved intergenic regulatory sequences, and determine the quantity and distribution of recombination events.
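
The idea of clustering reads by matching single nucleotide polymorphism patterns can be illustrated with a minimal sketch. This is not the Strainer implementation: the representation of a read as a mapping from variable position to observed base, the greedy assignment rule, and the names `compatible`, `cluster_reads` and `min_shared` are all assumptions made for illustration only.

```python
from typing import Dict, List

# Hypothetical representation: each read is reduced to the bases it shows
# at variable (SNP) positions of the co-assembled sequence.
Read = Dict[int, str]  # SNP position -> observed base


def compatible(read: Read, pattern: Read, min_shared: int = 1) -> bool:
    """True if the read overlaps the pattern at enough SNP positions
    and agrees with it at every position they share."""
    shared = read.keys() & pattern.keys()
    return len(shared) >= min_shared and all(read[p] == pattern[p] for p in shared)


def cluster_reads(reads: List[Read], min_shared: int = 1) -> List[dict]:
    """Greedy clustering: each read joins the first cluster whose SNP
    pattern it matches, otherwise it starts a new cluster."""
    clusters: List[dict] = []
    for idx, read in enumerate(reads):
        for cluster in clusters:
            if compatible(read, cluster["pattern"], min_shared):
                cluster["members"].append(idx)
                cluster["pattern"].update(read)  # extend the variant pattern
                break
        else:
            clusters.append({"pattern": dict(read), "members": [idx]})
    return clusters


# Toy data: five overlapping reads drawn from two variants.
reads = [
    {10: "A", 25: "G"},
    {25: "G", 40: "T"},
    {10: "C", 25: "A"},
    {25: "A", 40: "C"},
    {40: "T", 55: "G"},
]
clusters = cluster_reads(reads)
for c in clusters:
    print(c["members"], c["pattern"])
```

In this toy example the reads fall into two groups, each carrying a consistent pattern across the SNP positions it covers. A greedy rule of this kind depends on read order and ignores sequencing error, so it should be read only as a picture of the concept, not as a description of how the software assigns reads.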

Conclusion

The Strainer software, a first-generation metagenomic bioinformatics tool, facilitates comprehension and analysis of the heterogeneity intrinsic to natural communities. The program reveals the degree of clustering among closely related sequence variants and provides a rapid means to generate gene and protein sequences for functional, ecological, and evolutionary analyses.