This article is part of the supplement: Selected articles from the International Conference on Intelligent Biology and Medicine (ICIBM 2013): Genomics
QChIPat: a quantitative method to identify distinct binding patterns for two biological ChIP-seq samples in different experimental conditions
- Equal contributors
1 Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
2 School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, 518055, P.R. China
3 Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, 518055, China
4 Shanghai Key Laboratory of Intelligent Information Processing, Shanghai, P.R. China
5 Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA
6 Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA
BMC Genomics 2013, 14(Suppl 8):S3 doi:10.1186/1471-2164-14-S8-S3
Published: 9 December 2013
Many computational programs have been developed to identify enriched regions in a single biological ChIP-seq sample. Because many biological questions involve comparing two conditions, programs that compare two biological ChIP-seq samples are also needed. Several programs address this problem, but they have drawbacks: they cannot determine whether the identified differentially enriched regions are themselves significantly enriched, they do not distinguish binding patterns, and they neglect normalization between samples.
In this study, we developed a novel quantitative method, called QChIPat, for comparing two biological ChIP-seq samples. Our method employs a new global normalization method, nonparametric empirical Bayes (NEB) correction normalization; utilizes pre-defined enriched regions identified by single-sample peak-calling programs; uses statistical methods to define differentially enriched regions; and then defines binding (histone modification) pattern information for those regions. Our program was tested on a benchmark dataset, the histone modification data used by ChIPDiffs. It was then applied to two case studies: one to identify differential histone modification sites from H3K27me3 and H3K9me2 ChIP-seq data in AKT1-transfected MCF10A cells, and the other to identify differential TCF7L2 binding sites from ChIP-seq data in MCF7 and PANC1 cells.
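To make the general idea of a two-sample comparison over pre-defined regions concrete, the following is a minimal sketch and not the QChIPat implementation. It assumes that read counts per region are already available for both samples, substitutes simple library-size scaling for the NEB correction normalization (whose details are not given here), and uses an exact binomial test as a stand-in for the statistical methods. The function names, the labels `gain_in_B`, `loss_in_B` and `unchanged`, and the thresholds are all hypothetical.

```python
from math import comb, log2


def binomial_two_sided_p(k, n, p):
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k.
    Intended for small counts only; large n may overflow float arithmetic."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


def classify_regions(counts_a, counts_b, total_a, total_b, alpha=0.05, pseudo=1.0):
    """Label each pre-defined region by comparing sample B against sample A.

    Assumption: library-size scaling replaces the NEB normalization, and a
    binomial test replaces the method's own statistics. Labels are illustrative.
    """
    # Under the null hypothesis, a read falls in sample A with this probability.
    p_null = total_a / (total_a + total_b)
    # Scale sample B counts to the sequencing depth of sample A.
    scale_b = total_a / total_b
    results = []
    for a, b in zip(counts_a, counts_b):
        pval = binomial_two_sided_p(a, a + b, p_null)
        # Pseudocount avoids division by zero and log of zero.
        lfc = log2((b * scale_b + pseudo) / (a + pseudo))
        if pval >= alpha:
            label = "unchanged"
        elif lfc > 0:
            label = "gain_in_B"
        else:
            label = "loss_in_B"
        results.append((label, lfc, pval))
    return results


if __name__ == "__main__":
    # Toy counts for three regions; both libraries have the same depth.
    for row in classify_regions([50, 5, 30], [5, 50, 30], 1_000_000, 1_000_000):
        print(row)
```

A real analysis would also take the control (input) experiment into account and correct for multiple testing across regions, both of which this sketch omits.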
Our program has several advantages: 1) it considers a control (or input) experiment; 2) it incorporates a novel global normalization strategy, nonparametric empirical Bayes correction normalization; 3) it provides the binding pattern information among different enriched regions. QChIPat is implemented in R, Perl and C++, and has been tested under Linux. The R package is available at http://motif.bmi.ohio-state.edu/QChIPat.