This article is part of the supplement: Proceedings of the First Annual RECOMB Satellite Workshop on Massively Parallel Sequencing (RECOMB-seq)
Genotyping common and rare variation using overlapping pool sequencing
1 Department of Computer Science, University of California, Los Angeles, CA 90095, USA
2 Department of Epidemiology, Harvard School of Public Health, Harvard University, Boston, MA 02115, USA
3 The Blavatnik School of Computer Science, and the Molecular Microbiology and Biotechnology Department, Tel-Aviv University, Tel-Aviv, 69978, Israel
4 International Computer Science Institute, 1947 Center St., Berkeley, CA 94704, USA
5 Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02142, USA
BMC Bioinformatics 2011, 12(Suppl 6):S2 doi:10.1186/1471-2105-12-S6-S2
Published: 28 July 2011
Recent advances in sequencing technologies set the stage for large, population-based studies, in which the DNA or RNA of thousands of individuals will be sequenced. Currently, however, such studies are still infeasible using a straightforward sequencing approach. As a result, a few multiplexing schemes have recently been suggested, in which a small number of DNA pools are sequenced and the results are then deconvoluted using compressed sensing or similar approaches. These methods, however, are limited to the detection of rare variants.
In this paper we provide a new algorithm for the deconvolution of DNA pool multiplexing schemes. The presented algorithm utilizes a likelihood model and linear programming. The approach allows for the addition of external data, particularly imputation data, resulting in a flexible environment that is suitable for different applications.
In particular, we demonstrate that both low and high allele frequency SNPs can be accurately genotyped when the DNA pooling scheme is performed in conjunction with microarray genotyping and imputation. Additionally, we demonstrate the use of our framework for the detection of cancer fusion genes from RNA sequences.
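To make the idea of likelihood-based pool deconvolution concrete, the following is a minimal sketch for a single SNP, not the algorithm of the paper. It assumes a binomial read-count model with a fixed per-read error rate, and it replaces the linear programming step with an exhaustive maximum-likelihood search, which is feasible only for a handful of individuals. The pool design, the read counts, the error rate and all function names (`pool_alt_fraction`, `log_likelihood`, `deconvolve`) are hypothetical and chosen for illustration only.

```python
import itertools
import math


def pool_alt_fraction(genotypes, pool, error=0.01):
    """Expected fraction of alternate-allele reads in one pool.

    genotypes[i] is the alternate-allele count (0, 1 or 2) of individual i;
    pool is the list of individuals whose DNA was mixed in equal amounts.
    """
    alt = sum(genotypes[i] for i in pool)
    frac = alt / (2 * len(pool))
    # Assumed error model: each read reports the wrong allele with
    # probability `error`. A nonzero error keeps the result inside (0, 1).
    return frac * (1 - error) + (1 - frac) * error


def log_likelihood(genotypes, pools, alt_reads, total_reads, error=0.01):
    """Binomial log-likelihood of the observed read counts over all pools."""
    ll = 0.0
    for pool, k, n in zip(pools, alt_reads, total_reads):
        p = pool_alt_fraction(genotypes, pool, error)
        # log of the binomial coefficient C(n, k)
        ll += math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        ll += k * math.log(p) + (n - k) * math.log(1 - p)
    return ll


def deconvolve(n_individuals, pools, alt_reads, total_reads, error=0.01):
    """Brute-force search over all 3^n genotype vectors (toy sizes only)."""
    return max(
        itertools.product((0, 1, 2), repeat=n_individuals),
        key=lambda g: log_likelihood(g, pools, alt_reads, total_reads, error),
    )


# Hypothetical design: 6 individuals, 4 overlapping pools, and each
# individual appears in a distinct pair of pools.
pools = [[0, 1, 2], [0, 3, 4], [1, 3, 5], [2, 4, 5]]
total_reads = [600, 600, 600, 600]
# Made-up counts consistent with individual 3 being heterozygous.
alt_reads = [6, 104, 104, 6]

best = deconvolve(6, pools, alt_reads, total_reads)
print(best)  # (0, 0, 0, 1, 0, 0)
```

Because each individual in this toy design has a unique pool signature, a single carrier is identified from only four sequenced pools. The exhaustive search grows as 3^n and does not scale, which is the kind of cost that a linear programming formulation is intended to avoid. External information such as imputation data could enter a model of this form as a per-individual prior term added to the log-likelihood, although the sketch omits it.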