Methodology article (Open Access)

Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory

Mark J Chaisson1 and Glenn Tesler2*

Author Affiliations

1 Department Secondary Analysis, Pacific Biosciences, 1005 Hamilton Rd, Menlo Park, CA, USA

2 Department of Mathematics, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA, USA

BMC Bioinformatics 2012, 13:238  doi:10.1186/1471-2105-13-238

Published: 19 September 2012

Abstract

Background

Methods have recently been developed to perform high-throughput sequencing of DNA by Single Molecule Sequencing (SMS). While Next-Generation sequencing methods may produce reads up to several hundred bases long, SMS produces reads up to tens of kilobases long. Existing alignment methods are either too inefficient for high-throughput datasets or not sensitive enough to align SMS reads, which have a higher error rate than Next-Generation sequencing reads.

Results

We describe BLASR (Basic Local Alignment with Successive Refinement), a method for mapping SMS reads that are thousands of bases long, where the divergence between the read and the genome is dominated by insertion and deletion errors. The method is benchmarked using both simulated reads and reads from a bacterial sequencing project. We also present a combinatorial model of sequencing error that explains why our approach is effective.

Conclusions

The results indicate that it is possible to map SMS reads with high accuracy and speed. Furthermore, the inferences about the mappability of SMS reads drawn from our combinatorial model of sequencing error agree with the mapping accuracy demonstrated on simulated reads.