Open Access Methodology article

An algorithm for the determination and quantification of components of nucleic acid mixtures based on single sequencing reactions

Alexander Pozhitkov1,2*, Kathryn Stemshorn1 and Diethard Tautz1

Author Affiliations

1 Institut für Genetik, der Universität zu Köln, Zülpicherstrasse 47, 50674 Köln, Germany

2 Civil and Environmental Engineering, University of Washington, Seattle, 98195, WA, USA

BMC Bioinformatics 2005, 6:281  doi:10.1186/1471-2105-6-281

Published: 29 November 2005

Abstract

Background

Determination and quantification of nucleic acid components in a mixture is usually accomplished by microarray approaches, where the mixtures are hybridized against specific probes. As an alternative, we propose here that a single sequencing reaction from a mixture of nucleic acids holds enough information to potentially distinguish the different components, provided it is known which components can occur in the mixture.

Results

We describe an algorithm based on a set of linear equations that can be solved when the sequencing profiles of the individual components are known and the number of sequenced nucleotides is larger than the number of components in the mixture. We have implemented the procedure for one type of sequencing approach, pyrosequencing, which produces a stepwise output of peaks that is particularly suitable for the procedure. As an example, we use signature sequences from ribosomal RNA to distinguish and quantify several different species in a mixture. Using simulations, we show that the procedure may also be applicable to dideoxy sequencing on capillary sequencers, requiring only some instrument-specific adaptations of protocols and software.
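
The abstract does not spell out the equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each peak height in the mixture is a weighted sum of the corresponding peak heights of the known components, and it recovers the weights by ordinary least squares (normal equations solved by Gaussian elimination). The function names `solve_linear_system` and `estimate_fractions`, and all peak heights, are hypothetical and chosen for illustration.

```python
def solve_linear_system(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Build the augmented matrix [a | b] as floats so the inputs are not modified.
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("profiles are linearly dependent; mixture cannot be resolved")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the entries below the pivot.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def estimate_fractions(profiles, mixture):
    """Least-squares estimate of the amount of each known component.

    profiles: one list of peak heights per known component (equal lengths)
    mixture:  peak heights observed for the mixture (same length)
    """
    n = len(profiles)
    # Normal equations (P^T P) x = P^T y, where the columns of P are the profiles.
    ptp = [[sum(p * q for p, q in zip(profiles[i], profiles[j])) for j in range(n)]
           for i in range(n)]
    pty = [sum(p * y for p, y in zip(profiles[i], mixture)) for i in range(n)]
    return solve_linear_system(ptp, pty)


# Hypothetical peak heights at six positions for three known components.
profiles = [
    [1, 0, 2, 0, 1, 0],
    [0, 1, 1, 0, 0, 2],
    [1, 1, 0, 2, 0, 0],
]
# Simulated mixture built from assumed proportions (illustrative values only).
true_amounts = [0.5, 0.3, 0.2]
mixture = [sum(w * p[i] for w, p in zip(true_amounts, profiles)) for i in range(6)]

estimated = estimate_fractions(profiles, mixture)
print([round(v, 3) for v in estimated])
```

Because there are more peak positions (six) than components (three), the system is overdetermined, which mirrors the condition stated above that the number of sequenced nucleotides must exceed the number of components. With real, noisy data one would probably also want to constrain the estimates to be non-negative, which this sketch does not do.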

Conclusion

The parallel sequencing approach described here may become a simple and cheap alternative to microarray experiments that aim at routine re-determination and quantification of known nucleic acid components from environmental or tissue samples.