Open Access Highly Accessed Software

Automated genome mining for natural products

Michael HT Li (1,2), Peter MU Ung (1,2), James Zajkowski (1), Sylvie Garneau-Tsodikova (1,2) and David H Sherman (1,2,3)*

Author Affiliations

1 Life Sciences Institute, University of Michigan, Ann Arbor, MI, USA

2 Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA

3 Departments of Chemistry, and Microbiology & Immunology, University of Michigan, Ann Arbor, MI, USA

BMC Bioinformatics 2009, 10:185  doi:10.1186/1471-2105-10-185

Published: 16 June 2009

Abstract

Background

Discovery of new medicinal agents from natural sources has largely been an adventitious process based on screening of plant and microbial extracts combined with bioassay-guided identification and natural product structure elucidation. Increasingly rapid and more cost-effective genome sequencing technologies coupled with advanced computational power have converged to transform this trend toward a more rational and predictive pursuit.

Results

We have developed a rapid method for scanning genome sequences for multiple polyketide, nonribosomal peptide, and mixed-combination natural products, with output in a text format that can be readily converted to two- and three-dimensional structures using conventional software. Our open-source, web-based program reads a hidden Markov model of DNA to assemble small molecules composed of the twenty standard amino acids and twenty-two other chain-elongation intermediates used in nonribosomal peptide systems, as well as four acyl-CoA extender units incorporated into polyketides. This process evaluates and selects the substrate specificities along the assembly line of nonribosomal peptide synthetases and modular polyketide synthases.
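
The assembly step described above can be illustrated with a minimal sketch. It is not the program's actual code. It assumes that the text output is a SMILES-like line notation, that each module has already been assigned one substrate, and that the product is a linear chain released as a free acid. The fragment table, the function name assemble_linear_product, and the choice to leave polyketide keto groups unreduced and to omit stereochemistry are all hypothetical simplifications.

```python
# Hypothetical fragment table: each entry is a SMILES piece that starts at the
# atom bonded to the previous unit and ends with the carbonyl bonded to the
# next unit. Stereochemistry is omitted and keto groups are left unreduced.
FRAGMENTS = {
    "Gly": "NCC(=O)",
    "Ala": "NC(C)C(=O)",
    "Ser": "NC(CO)C(=O)",
    "Val": "NC(C(C)C)C(=O)",
    "malonyl-CoA": "CC(=O)",
    "methylmalonyl-CoA": "C(C)C(=O)",
}


def assemble_linear_product(substrates):
    """Join per-module substrate predictions into one linear SMILES string.

    `substrates` is the ordered list of substrate names predicted for the
    modules of an assembly line (an assumed input format).
    """
    if not substrates:
        raise ValueError("at least one module prediction is required")
    unknown = [name for name in substrates if name not in FRAGMENTS]
    if unknown:
        raise KeyError(f"no fragment defined for: {unknown}")
    # Concatenate the fragments in module order, then close the chain with
    # the hydroxyl oxygen of the terminal carboxylic acid.
    return "".join(FRAGMENTS[name] for name in substrates) + "O"


if __name__ == "__main__":
    # A two-module nonribosomal peptide example (Gly then Ala).
    print(assemble_linear_product(["Gly", "Ala"]))
    # A mixed example: one peptide module followed by one polyketide module.
    print(assemble_linear_product(["Ser", "methylmalonyl-CoA"]))
```

A string produced this way can be passed to conventional cheminformatics software to generate two- or three-dimensional coordinates. A real predictor would also need to handle tailoring domains, reduction states, and cyclization, which this sketch ignores.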

Conclusion

Using this approach, we have predicted the structures of natural products from a diverse range of bacteria based on a limited number of signature sequences. By accelerating direct DNA-to-metabolomic analysis, this method bridges the gap between chemists and biologists and enables rapid scanning for compounds with potential therapeutic value.