Methodology article (Open Access, Highly Accessed)

Viral quasispecies inference from 454 pyrosequencing

Wan-Ting Poh [1], Eryu Xia [7], Kwanrutai Chin-inmanu [2,3], Lai-Ping Wong [1], Anthony Youzhi Cheng [1], Prida Malasit [2,4,5], Prapat Suriyaphol [2,3], Yik-Ying Teo [1,10,6,7,8,9]* and Rick Twee-Hee Ong [1,10]

Author Affiliations

1 Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore

2 Center for Emerging and Neglected Infectious Diseases, Mahidol University, Bangkok, Thailand

3 Division of Bioinformatics and Data Management for Research, Office for Research and Development, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand

4 Medical Biotechnology Research Unit, National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, Bangkok, Thailand

5 Dengue Hemorrhagic Fever Research Unit, Office for Research and Development, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand

6 Department of Statistics and Applied Probability, National University of Singapore, Singapore, Singapore

7 NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore, Singapore

8 Life Sciences Institute, National University of Singapore, Singapore, Singapore

9 Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore

10 Center for Infectious Disease Epidemiology and Research, National University of Singapore, Singapore, Singapore

BMC Bioinformatics 2013, 14:355  doi:10.1186/1471-2105-14-355

Published: 5 December 2013

Abstract

Background

Many potentially life-threatening infectious viruses are highly mutable. Characterizing the fittest variants within a quasispecies from infected patients is expected to open unprecedented opportunities to investigate the relationship between quasispecies diversity and disease epidemiology. The advent of next-generation sequencing technologies has made high-throughput study of virus diversity possible, although these methods come with higher error rates, which can artificially inflate the apparent diversity.

Results

Here we introduce a novel computational approach that incorporates base quality scores from next-generation sequencers to reconstruct viral genome sequences while simultaneously inferring the number of variants present within a quasispecies. Comparisons on simulated and clinical dengue virus data suggest that the novel approach provides a more accurate inference of the underlying number of variants within the quasispecies, which is vital for clinical efforts to map within-host viral diversity. Sequence alignments generated by our approach are also found to exhibit lower error rates.
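As a rough illustration of why base quality scores can help separate sequencing errors from true variants, the sketch below weights each base call at a single alignment position by its probability of being correct. It is not the authors' algorithm: the function names and the example column are hypothetical, and the only established fact it relies on is the standard Phred relation between a quality score Q and an error probability, p = 10^(-Q/10).

```python
from collections import defaultdict


def phred_to_error_prob(q):
    """Convert a Phred quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


def weighted_base_support(column):
    """Sum (1 - error probability) per base for one alignment column.

    `column` is a list of (base, phred_quality) pairs, one pair per read.
    This weighting scheme is an illustrative assumption, not the paper's model.
    """
    support = defaultdict(float)
    for base, q in column:
        support[base] += 1.0 - phred_to_error_prob(q)
    return dict(support)


# Hypothetical column: three low-quality 'G' calls against two high-quality 'A' calls.
column = [("A", 40), ("A", 40), ("G", 3), ("G", 3), ("G", 3)]
support = weighted_base_support(column)
best_base = max(support, key=support.get)
```

In this toy column a simple majority vote would pick 'G', whereas the quality-weighted support favours 'A', because each Q3 call is wrong roughly half the time. Ignoring quality in this way is one route by which sequencing errors can be mistaken for additional variants.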

Conclusions

The ability to infer the viral quasispecies colony that is present within a human host provides the potential for a more accurate classification of the viral phenotype. Understanding the genomics of viruses will be relevant not just to studying how to control or even eradicate these viral infectious diseases, but also in learning about the innate protection in the human host against the viruses.

Keywords:
Virus quasispecies; Sequence alignment; 454 pyrosequencing