Positive predictive value (the fraction of bases within predicted sites that overlap a ChIP-Seq peak for the same TF) curves for 20 non-developmental TFs, as a function of the number of TFBSs predicted. Three prediction approaches are considered: (i) Blue curves: PWM scanning; (ii) Red curves: PWM scanning limited to highly conserved regions identified by PhastCons on the eutherian genome alignment; (iii) Green curves: PWM scanning limited to regions identified as conserved binding loci by our algorithm. For the blue and red curves, the desired number of predicted sites is obtained by varying the LLR threshold (while always keeping it above the minimum threshold chosen for each TF). For the green curves, the desired number of predicted sites is obtained by varying the threshold on the binding locus scores and reporting all sites within the retained regions whose LLR score is above the minimum threshold.
Blanchette BMC Bioinformatics 2012 13(Suppl 19):S2 doi:10.1186/1471-2105-13-S19-S2
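The positive predictive value defined in the caption can be illustrated with a minimal sketch. It is not the authors' code: the function names `merge_intervals` and `ppv`, the half-open `(start, end)` coordinate convention, the restriction to a single chromosome, and the choice to count each base only once when predicted sites overlap are all assumptions made for illustration.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def ppv(predicted_sites, chip_peaks):
    """Fraction of bases in predicted sites that fall inside a ChIP-Seq peak.

    Both arguments are lists of half-open (start, end) intervals on the same
    chromosome (an assumed convention). Returns 0.0 if no bases are predicted.
    """
    sites = merge_intervals(predicted_sites)
    peaks = merge_intervals(chip_peaks)
    total = sum(end - start for start, end in sites)
    if total == 0:
        return 0.0

    covered = 0
    j = 0  # index of the first peak that may still overlap the current site
    for start, end in sites:
        # Skip peaks that end before this site begins.
        while j < len(peaks) and peaks[j][1] <= start:
            j += 1
        # Accumulate the overlap with every peak that starts before the site ends.
        k = j
        while k < len(peaks) and peaks[k][0] < end:
            covered += min(end, peaks[k][1]) - max(start, peaks[k][0])
            k += 1
    return covered / total
```

Evaluating `ppv` on the sites retained at each threshold setting, and plotting the result against the number of retained sites, would produce one curve of the kind shown in the figure.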