Open Access Highly Accessed Short Report

A cost-effective and universal strategy for complete prokaryotic genomic sequencing proposed by computer simulation

Jingwei Jiang1, Jun Li1, Hoi Shan Kwan2,3,4, Chun Hang Au2, Patrick Tik Wan Law4,5, Lei Li2, Kai Man Kam6, Julia Mei Lun Ling7 and Frederick C Leung1*

Author Affiliations

1 School of Biological Sciences, Faculty of Science, The University of Hong Kong, Hong Kong, China

2 Biology Programme, School of Life Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China

3 Food and Nutritional Sciences Programme, School of Life Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China

4 Food Research Centre, The Chinese University of Hong Kong, Hong Kong SAR, China

5 Core Facilities - Genome Sequencing Laboratory, The Chinese University of Hong Kong, Hong Kong SAR, China

6 Microbiology Division, Public Health Laboratory Services Branch, Centre for Health Protection, Department of Health, Hong Kong SAR, China

7 Department of Microbiology, The Chinese University of Hong Kong, Prince of Wales Hospital, Hong Kong SAR, China

BMC Research Notes 2012, 5:80  doi:10.1186/1756-0500-5-80

Published: 31 January 2012

Abstract

Background

Pyrosequencing techniques allow scientists to obtain draft prokaryotic genome sequences within a few days. However, assemblies produced by shotgun sequencing are usually composed of hundreds of contigs. A further multiplex PCR procedure is needed to fill all the gaps and link the contigs into a complete chromosomal sequence, which is the basis for prokaryotic comparative genomic studies. In this article, we study various pyrosequencing strategies by simulated assembly of 100 prokaryotic genomes.

Findings

The simulation study shows that a single end 454 Jr. run combined with a paired end 454 Jr. run (8 kb library) produces assemblies with the following properties: 1) ~90% of the 100 assemblies have < 10 scaffolds and ~95% have < 150 contigs; 2) the average contig N50 size is over 331 kb; 3) the average single base accuracy is > 99.99%; 4) the average false gene duplication rate is < 0.7%; 5) the average false gene loss rate is < 0.4%.
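For readers unfamiliar with the contig N50 statistic reported above, the following minimal sketch shows how it is commonly computed. It uses the usual definition (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length). The function name `n50` and the example lengths are illustrative assumptions, not taken from the article, and the article's own evaluation pipeline may differ.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Assumes the common definition: sort contigs from longest to
    shortest and return the length of the contig at which the
    running total first reaches half of the summed length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    if total <= 0:
        raise ValueError("need at least one contig with positive length")

    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (in bp), for illustration only.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
example_n50 = n50(example_lengths)
```

A larger N50 means that more of the assembly sits in long contigs, which is why it is reported alongside the contig and scaffold counts.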

Conclusions

A single end 454 Jr. run combined with a paired end 454 Jr. run (8 kb library) is a cost-effective strategy for prokaryotic whole genome sequencing. This strategy can produce high quality draft assemblies for most prokaryotic organisms within days. Because the number of assembled scaffolds is small, the subsequent multiplex PCR procedure (for gap filling) should be easy. As a result, large scale prokaryotic whole genome sequencing projects may be finished within weeks.