Open Access Research article

Comparing segmentations by applying randomization techniques

Niina Haiminen1, Heikki Mannila1,2 and Evimaria Terzi1

1HIIT Basic Research Unit, Department of Computer Science, P.O.Box 68, FI-00014 University of Helsinki, Finland

2Laboratory of Computer and Information Science, Helsinki University of Technology, FI-02015 TKK, Finland

BMC Bioinformatics 2007, 8:171 doi:10.1186/1471-2105-8-171

Published: 23 May 2007

Abstract

Background

Many segmentation techniques exist for genomic sequences, and segmentations can be based on many different biological features. We show how to evaluate and compare the quality of segmentations obtained with different techniques and alternative biological features.

Results

We apply randomization techniques for evaluating the quality of a given segmentation. Our example applications include isochore detection and the discovery of coding-noncoding structure. We obtain segmentations of relevant sequences by applying different techniques and by segmenting on alternative features. We show that some of the obtained segmentations are very similar to the underlying true segmentations, and that this similarity is statistically significant. For some other segmentations, we show that equally good results are likely to appear by chance.

Conclusion

We introduce a framework for evaluating segmentation quality, and demonstrate its use on two examples of segmental genomic structures. We transform the process of quality evaluation from simply viewing the segmentations to obtaining p-values that denote the significance of segmentation similarity.
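
As a rough illustration of how a randomization test can turn segmentation similarity into a p-value, the sketch below compares a candidate segmentation against a reference one and counts how often a random segmentation does at least as well. It is not the method used in the article: the similarity measure (mean distance from each reference boundary to the nearest candidate boundary), the null model (boundaries drawn uniformly at random, same count as the candidate), and all function names are assumptions chosen for brevity.

```python
import random
from bisect import bisect_left


def boundary_distance(reference, candidate):
    """Mean distance from each reference boundary to its nearest candidate boundary.

    Assumed similarity measure (lower is better); both arguments are
    non-empty lists of integer boundary positions.
    """
    candidate = sorted(candidate)
    total = 0
    for b in reference:
        i = bisect_left(candidate, b)
        # Only the neighbours on either side of the insertion point can be nearest.
        total += min(abs(b - c) for c in candidate[max(i - 1, 0):i + 1])
    return total / len(reference)


def random_segmentation(n, k, rng):
    """Assumed null model: k distinct boundaries drawn uniformly from 1..n-1."""
    return sorted(rng.sample(range(1, n), k))


def empirical_p_value(reference, candidate, n, trials=1000, seed=0):
    """Fraction of random segmentations that score at least as well as the candidate."""
    rng = random.Random(seed)
    observed = boundary_distance(reference, candidate)
    hits = sum(
        boundary_distance(reference, random_segmentation(n, len(candidate), rng)) <= observed
        for _ in range(trials)
    )
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Hypothetical example: a sequence of length 1000 with four reference boundaries.
reference = [100, 250, 400, 700]
close_candidate = [102, 248, 405, 690]
p_close = empirical_p_value(reference, close_candidate, n=1000)
```

A small p-value here would suggest that the candidate's agreement with the reference is unlikely to arise by chance under the assumed null model; a large one would suggest that a random segmentation of the same size often does equally well.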


© 1999-2008 BioMed Central Ltd unless otherwise stated