Effect of false positive and false negative rates on inference of binding target conservation across different conditions and species from ChIP-chip data
* Corresponding author: Hongyu Zhao hongyu.zhao@yale.edu
1 Department of Biomedical Engineering, Yale University, New Haven, CT 06520, USA
2 Department of Epidemiology and Public Health, Yale University, New Haven, CT 06520, USA
3 Department of Genetics, Yale University, New Haven, CT 06520, USA
BMC Bioinformatics 2009, 10:23 doi:10.1186/1471-2105-10-23
Published: 19 January 2009
Abstract
Background
ChIP-chip data are routinely used to identify transcription factor binding targets. However, false positives and false negatives in ChIP-chip data complicate downstream analyses, especially when the binding targets of a specific transcription factor are compared across conditions or species.
Results
We propose an Expectation-Maximization-based approach to infer the underlying true counts of "positives" and "negatives" from the observed counts. Using this approach, we study the effect of false positives and false negatives on inferences related to transcription regulation.
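To make the idea concrete, the following is a minimal sketch of how an EM iteration could recover expected true counts from an observed 2x2 table of binding calls across two conditions. It is not the authors' implementation: the abstract does not specify the model, so the sketch assumes that the false positive rate (fpr) and false negative rate (fnr) are known, identical in both conditions, strictly between 0 and 1, and that errors in the two conditions are independent. All names (call_prob, em_true_counts, STATES) and the example counts are hypothetical.

```python
from itertools import product

# Joint status over two conditions: (condition 1, condition 2), 1 = bound.
STATES = list(product((0, 1), repeat=2))


def call_prob(obs, true, fpr, fnr):
    """P(observed call | true binding status) for a single condition."""
    p_called_bound = (1.0 - fnr) if true == 1 else fpr
    return p_called_bound if obs == 1 else 1.0 - p_called_bound


def em_true_counts(observed, fpr, fnr, n_iter=5000, tol=1e-12):
    """Estimate expected true 2x2 counts from observed 2x2 counts.

    observed maps (o1, o2) -> count. Assumption (not from the source):
    0 < fpr < 1 and 0 < fnr < 1 are known, shared by both conditions,
    and errors are independent across conditions.
    """
    total = float(sum(observed.get(o, 0) for o in STATES))
    # Likelihood of each observed cell given each true cell.
    lik = {
        o: {
            t: call_prob(o[0], t[0], fpr, fnr) * call_prob(o[1], t[1], fpr, fnr)
            for t in STATES
        }
        for o in STATES
    }
    pi = {t: 0.25 for t in STATES}  # uniform start over true cells
    for _ in range(n_iter):
        # E-step: distribute each observed count over the true cells
        # in proportion to the posterior P(true cell | observed cell).
        expected = {t: 0.0 for t in STATES}
        for o in STATES:
            joint = {t: pi[t] * lik[o][t] for t in STATES}
            norm = sum(joint.values())
            for t in STATES:
                expected[t] += observed.get(o, 0) * joint[t] / norm
        # M-step: update the true-cell proportions.
        new_pi = {t: expected[t] / total for t in STATES}
        change = max(abs(new_pi[t] - pi[t]) for t in STATES)
        pi = new_pi
        if change < tol:
            break
    return {t: pi[t] * total for t in STATES}


if __name__ == "__main__":
    # Made-up observed counts, for illustration only.
    observed = {(1, 1): 54, (1, 0): 95, (0, 1): 95, (0, 0): 5756}
    for cell, count in em_true_counts(observed, fpr=0.005, fnr=0.4).items():
        print(cell, round(count, 1))
```

Because the error rates are treated as fixed, the only unknowns are the four true-cell proportions, and each iteration preserves the total number of genes.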
Conclusion
Our results indicate that if the binding targets are strongly associated across conditions or species (log odds ratio > 4), moderate false positive and false negative rates (0.005 and 0.4, respectively) do not change the qualitative inference drawn from the observed experimental data (i.e., the presence or absence of conservation), even though they change the observed counts substantially. However, if the underlying association is marginal, with odds ratios close to 1, moderate to large false positive and false negative rates (0.01 and 0.2, respectively) could mask it.
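The attenuation described here can be explored with a small sketch that pushes a true 2x2 table through the error rates and compares log odds ratios before and after. This is an illustration under stated assumptions, not the paper's analysis: errors are assumed independent across conditions with the same rates in both, the natural logarithm is used (the abstract does not state the base), and the function names and example counts are hypothetical.

```python
import math

STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (condition 1, condition 2)


def expected_observed(true_counts, fpr, fnr):
    """Expected observed 2x2 counts given true counts and error rates.

    Assumption (not from the source): errors are independent across the
    two conditions and the same fpr and fnr apply to both.
    """
    def call_prob(obs, true):
        p_called_bound = (1.0 - fnr) if true == 1 else fpr
        return p_called_bound if obs == 1 else 1.0 - p_called_bound

    return {
        o: sum(
            true_counts[t] * call_prob(o[0], t[0]) * call_prob(o[1], t[1])
            for t in STATES
        )
        for o in STATES
    }


def log_odds_ratio(counts):
    """Natural-log odds ratio of a 2x2 table with nonzero cells."""
    return math.log(
        counts[(1, 1)] * counts[(0, 0)] / (counts[(1, 0)] * counts[(0, 1)])
    )


if __name__ == "__main__":
    # Made-up true counts with a strong association, for illustration only.
    true_counts = {(1, 1): 150, (1, 0): 50, (0, 1): 50, (0, 0): 5750}
    observed = expected_observed(true_counts, fpr=0.005, fnr=0.4)
    print("true log odds ratio:    ", log_odds_ratio(true_counts))
    print("observed log odds ratio:", log_odds_ratio(observed))
```

Varying the hypothetical counts toward an odds ratio near 1 shows how little room remains for a weak association once the same error rates are applied.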