Sp1 results. Top left: ROC curves for cross-validation on the Sp1 data set, using shuffled versions of the held-out test sequences as counter-examples. Top right: ROC curves for the motifs found on the small data set when applied to a large Sp1 binding data set from TRANSFAC. The AUC statistics are given in the legend. Bottom: a known TRANSFAC motif for Sp1 (the reverse complement of M00196) and the motifs found by the methods we tested. In our model, 32% of binding sites have a T inserted after the fifth base. Modelling this optional base allows our method to avoid some of the ambiguity present in the Cs preceding the central G in the TRANSFAC motif.
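
The evaluation described in the caption (scoring held-out sequences against shuffled copies of themselves and summarising the ROC curve by its AUC) can be sketched roughly as follows. This is a minimal illustration, not the procedure actually used for the figure: the function names `shuffled`, `auc`, and `toy_score`, the example sequences, and the scoring rule are all hypothetical placeholders, and a real motif model would supply its own score.

```python
import random


def shuffled(seq, rng):
    """Return a random permutation of seq, preserving its base composition."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def auc(pos_scores, neg_scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive scores higher,
    with ties counted as one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def toy_score(seq):
    """Hypothetical stand-in for a motif model's score (assumption):
    1.0 if the sequence contains the string GGGCGG, else 0.0."""
    return 1.0 if "GGGCGG" in seq else 0.0


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the shuffles are reproducible
    # Made-up held-out sequences, for illustration only.
    held_out = ["ATTAGGGCGGTATA", "TTAAGGGCGGATAT", "ATATGGGCGGTTAA"]
    # Counter-examples: shuffled versions of the same held-out sequences.
    negatives = [shuffled(s, rng) for s in held_out]
    pos_scores = [toy_score(s) for s in held_out]
    neg_scores = [toy_score(s) for s in negatives]
    print("AUC:", auc(pos_scores, neg_scores))
```

Shuffling each test sequence keeps its length and base composition fixed, so a model cannot separate positives from counter-examples on composition alone. The pairwise AUC above is quadratic in the number of sequences, which is acceptable for a sketch but not for large data sets.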

