Open Access Highly Accessed Research article

Pacific Biosciences sequencing technology for genotyping and variation discovery in human data

Mauricio O Carneiro1*, Carsten Russ2, Michael G Ross2, Stacey B Gabriel1, Chad Nusbaum2 and Mark A DePristo1

Author affiliations

1 Broad Institute of MIT and Harvard, Medical and Population Genetics Program, 301 Binney St, Cambridge, MA, 02141, USA

2 Broad Institute of MIT and Harvard, Genome Sequencing and Analysis Program, 320 Charles St, Cambridge, MA, 02141, USA

Citation and License

BMC Genomics 2012, 13:375  doi:10.1186/1471-2164-13-375

Published: 5 August 2012

Abstract

Background

Pacific Biosciences technology produces a fundamentally new data type with the potential to overcome some limitations of current next-generation sequencing platforms: it offers significantly longer reads, single-molecule sequencing, low composition bias, and an error profile that is orthogonal to that of other platforms. With these potential advantages in mind, we evaluate the utility of the Pacific Biosciences RS platform for human medical amplicon resequencing projects.

Results

We evaluated the Pacific Biosciences technology for SNP discovery in medical resequencing projects using the Genome Analysis Toolkit, and observed high sensitivity and specificity for calling differences in amplicons containing known true or false SNPs. We also assessed data quality: most errors were indels (~14%), with few apparent miscalls (~1%). Finally, we define a custom processing pipeline for analyzing Pacific Biosciences data from human samples.
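To make the indel and miscall figures concrete, the following is a minimal sketch of how such an error profile could be tallied from a gapped pairwise alignment of a read against its reference. It is not the authors' pipeline. The function name `error_profile`, the input representation (two equal-length strings with `-` for gaps), and the choice of denominator (all informative alignment columns) are assumptions made for illustration; the abstract does not state how the reported rates were normalized.

```python
from collections import Counter


def error_profile(ref_aln: str, read_aln: str) -> dict:
    """Tally error classes from one gapped pairwise alignment.

    Hypothetical helper: both strings must have equal length and use
    '-' to mark a gap. Rates are fractions of informative columns,
    which is an assumed normalization, not one taken from the paper.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned strings must have equal length")

    counts = Counter()
    for ref_base, read_base in zip(ref_aln, read_aln):
        if ref_base == "-" and read_base == "-":
            continue  # column carries no information
        if ref_base == "-":
            counts["insertion"] += 1  # extra base in the read
        elif read_base == "-":
            counts["deletion"] += 1   # base missing from the read
        elif ref_base != read_base:
            counts["mismatch"] += 1   # miscall
        else:
            counts["match"] += 1

    total = sum(counts.values())
    if total == 0:
        return {}
    classes = ("match", "mismatch", "insertion", "deletion")
    return {name: counts[name] / total for name in classes}


if __name__ == "__main__":
    # Toy alignment: one insertion, one deletion, one mismatch.
    profile = error_profile("ACGT-ACGTA", "ACGTTAC-TC")
    indel_rate = profile["insertion"] + profile["deletion"]
    print(f"indel rate: {indel_rate:.2f}, miscall rate: {profile['mismatch']:.2f}")
```

In practice the per-column comparison would be driven by the aligner's output (for example, CIGAR operations in a BAM file) rather than by pre-gapped strings, and the counts would be summed over many reads before dividing.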

Conclusion

Critically, the error properties were largely free of the context-specific effects seen in other sequencing technologies. These data show excellent utility for follow-up validation and extension studies in human data and medical genetics projects, and the approach can be extended to other organisms with a reference genome.