BMC Bioinformatics

This article is part of the supplement: Highlights from the Fifth International Society for Computational Biology (ISCB) Student Council Symposium

Open Access Oral presentation

Paired-end read length lower bounds for genome re-sequencing

Rayan Chikhi* and Dominique Lavenier

  • * Corresponding author: Rayan Chikhi

Author Affiliations

ENS Cachan/IRISA, Campus de Beaulieu, 35042 Rennes, France

BMC Bioinformatics 2009, 10(Suppl 13):O2 doi:10.1186/1471-2105-10-S13-O2

Published: 19 October 2009

First paragraph (this article has no abstract)

Next-generation sequencing technology is enabling massive production of high-quality paired-end reads. Several platforms (Illumina Genome Analyzer, Applied Biosystems SOLiD, Helicos HeliScope) can currently produce "ultra-short" paired reads of lengths starting at 25 nt. An analysis by Whiteford et al. [1] of sequencing with unpaired reads shows that ultra-short reads theoretically allow whole-genome re-sequencing and de novo assembly of only small eukaryotic genomes. Chaisson, Brinza and Pevzner [2] recently determined that the paired read length threshold for de novo assembly is ≈ 35 nt for the E. coli genome and ≈ 60 nt for the S. cerevisiae genome. The latter read length is infeasible for some next-generation technologies. By extending the analysis of Whiteford et al., we investigate to what extent genome re-sequencing is feasible with ultra-short paired reads. We obtain theoretical read length lower bounds for re-sequencing that are also applicable to paired-end de novo assembly.
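To make the notion of a read length lower bound concrete, the sketch below shows one simple way such a question can be framed for unpaired reads: for a given read length, count the fraction of genome positions whose read occurs exactly once in the reference, then look for the smallest length that reaches a target fraction. This is an illustrative toy only, not the method used in this work; the function names `unique_read_fraction` and `min_read_length`, the `target` parameter, and the restriction to the forward strand are all assumptions made for the example, and pairing information is not modeled.

```python
from collections import Counter
from typing import Optional


def unique_read_fraction(genome: str, read_length: int) -> float:
    """Fraction of start positions whose read of `read_length` occurs exactly once.

    Assumption: forward strand only, error-free reads, no pairing information.
    """
    n_positions = len(genome) - read_length + 1
    if read_length <= 0 or n_positions <= 0:
        raise ValueError("read_length must be between 1 and len(genome)")
    # Count every substring of the requested length in the reference.
    counts = Counter(genome[i:i + read_length] for i in range(n_positions))
    # A position is uniquely mappable if its read appears once in the genome.
    unique = sum(
        1 for i in range(n_positions)
        if counts[genome[i:i + read_length]] == 1
    )
    return unique / n_positions


def min_read_length(genome: str, target: float, max_length: int) -> Optional[int]:
    """Smallest read length whose unique fraction reaches `target`, or None.

    `target` is a hypothetical threshold chosen by the user (e.g. 1.0).
    """
    for length in range(1, min(max_length, len(genome)) + 1):
        if unique_read_fraction(genome, length) >= target:
            return length
    return None


if __name__ == "__main__":
    toy_genome = "ACGTACGA"  # made-up sequence, for illustration only
    for length in (3, 4):
        print(length, unique_read_fraction(toy_genome, length))
    print(min_read_length(toy_genome, target=1.0, max_length=8))
```

On a real genome this brute-force count would need a more memory-efficient index, and a paired-read analysis would additionally have to account for the insert size between mates.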