Methodology article (Open Access)

Optimizing Illumina next-generation sequencing library preparation for extremely AT-biased genomes

Samuel O Oyola1*, Thomas D Otto1, Yong Gu1, Gareth Maslen1, Magnus Manske1, Susana Campino1, Daniel J Turner2, Bronwyn MacInnis1, Dominic P Kwiatkowski1, Harold P Swerdlow1 and Michael A Quail1

Author Affiliations

1 Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK

2 Oxford Nanopore Technologies, Edmund Cartwright House, 4 Robert Robinson Avenue, Oxford OX4 4GA, UK

BMC Genomics 2012, 13:1  doi:10.1186/1471-2164-13-1

Published: 3 January 2012

Abstract

Background

Massively parallel sequencing technology is revolutionizing approaches to genomic and genetic research. Since its advent, the scale and efficiency of Next-Generation Sequencing (NGS) have rapidly improved. In spite of this success, sequencing genomes or genomic regions with extremely biased base composition is still a great challenge for the currently available NGS platforms. The genomes of some important pathogenic organisms, such as Plasmodium falciparum (high AT content) and Mycobacterium tuberculosis (high GC content), display extremes of base composition. The standard library-preparation procedures that employ PCR amplification have been shown to cause uneven read coverage, particularly across AT-rich and GC-rich regions, leading to problems in genome assembly and variation analyses. Alternative library-preparation approaches that omit PCR amplification require large quantities of starting material and hence are not suitable for small amounts of DNA/RNA such as those from clinical isolates. We have developed and optimized library-preparation procedures that are suitable for low-quantity starting material and tolerant of extremely AT-rich sequences.
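To make "base composition" concrete, the following is a minimal Python sketch of how the AT fraction of a sequence, and of consecutive windows along it, could be computed. The function names `at_fraction` and `windowed_at`, the default window size, and the toy sequences are illustrative assumptions and are not taken from the article.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A/T among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return at / (at + gc)


def windowed_at(seq: str, window: int = 100):
    """Return [(start, AT fraction)] for consecutive non-overlapping windows.

    Windows that contain only ambiguous bases (e.g. N) are skipped.
    The window size is an arbitrary illustrative default.
    """
    result = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        try:
            result.append((start, at_fraction(chunk)))
        except ValueError:
            continue  # no unambiguous bases in this window
    return result


if __name__ == "__main__":
    toy = "AAAATTTTGGGGCCCC"  # hypothetical toy sequence, not real genome data
    print(at_fraction(toy))
    print(windowed_at(toy, window=8))
```

Windows whose AT fraction approaches 1.0 correspond to the extremely AT-rich regions discussed here, and windows approaching 0.0 to GC-rich ones.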

Results

We used our optimized conditions in parallel with standard methods to prepare Illumina sequencing libraries from a non-clinical isolate and a clinical isolate (the latter containing ~53% host contamination). By analyzing and comparing the quality of the sequence data generated, we show that our optimized conditions, which involve a PCR additive (tetramethylammonium chloride, TMAC), produce amplified libraries with improved coverage of extremely AT-rich regions and reduced bias toward GC-neutral templates.
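One common way to inspect this kind of coverage bias is to bin genomic windows by GC content and compare the mean read depth in each bin with the overall mean. The sketch below illustrates that idea only; it is not the analysis performed in the article, and the function name, the 10-percentage-point bin width, and the input values are hypothetical.

```python
from collections import defaultdict


def normalized_coverage_by_gc(windows, bin_width=10):
    """Mean depth per GC bin divided by the overall mean depth.

    windows   -- iterable of (gc_percent, mean_depth) pairs, gc_percent in 0..100
    bin_width -- bin size in percentage points (illustrative default)

    Returns {bin_lower_bound_percent: relative_coverage}. Values near 1.0
    indicate even coverage; values well below 1.0 indicate under-represented
    base-composition classes.
    """
    windows = list(windows)
    if not windows:
        raise ValueError("no windows supplied")
    overall = sum(depth for _, depth in windows) / len(windows)
    if overall == 0:
        raise ValueError("overall mean depth is zero")

    last_bin = 100 // bin_width - 1
    bins = defaultdict(list)
    for gc_percent, depth in windows:
        # Put 100% GC into the top bin instead of creating an extra one.
        idx = min(int(gc_percent) // bin_width, last_bin)
        bins[idx * bin_width].append(depth)

    return {low: (sum(d) / len(d)) / overall for low, d in sorted(bins.items())}


if __name__ == "__main__":
    # Hypothetical (gc_percent, mean_depth) values, not data from the study.
    example = [(10, 5.0), (15, 15.0), (50, 40.0), (55, 20.0)]
    print(normalized_coverage_by_gc(example))
```

Comparing such per-bin profiles between two libraries prepared from the same sample would show whether one amplification condition under-represents AT-rich windows more than the other.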

Conclusion

We have developed a robust and optimized Next-Generation Sequencing library amplification method suitable for extremely AT-rich genomes. The new amplification conditions significantly reduce bias and retain the complexity of both extremes of base composition. This development will greatly benefit the sequencing of clinical samples, which often require amplification because of the low mass of starting DNA.

Keywords:
Next-Generation Sequencing; Illumina; Library; Plasmodium falciparum; AT-rich; Malaria; Clinical isolate; PCR; Tetramethylammonium chloride; PCR-free; Isothermal; Linear; Exponential