A comparative study of S/MAR prediction tools
1 School of Crystallography, Birkbeck College, Malet Street, London, WC1E 7HX, UK
2 Functional Genomics Laboratory, Wolfson Institute for Biomedical Research, University College London, The Cruciform Building, Gower Street, London WC1E 6AU, UK
BMC Bioinformatics 2007, 8:71 doi:10.1186/1471-2105-8-71
Published: 2 March 2007
S/MARs (scaffold/matrix attachment regions) are regions of the DNA that are attached to the nuclear matrix. These regions are known to affect gene expression substantially. Computational prediction of S/MARs is an important task that could contribute to our understanding of chromatin organisation, the number and distribution of boundary elements, and gene regulation in eukaryotic cells. However, although a number of S/MAR predictors have been proposed, their accuracy has so far not come under scrutiny.
We selected S/MARs with sufficient experimental evidence and used them to evaluate existing methods of S/MAR prediction. Our main results are: (1) all existing methods have little predictive power; (2) a simple rule based on AT-percentage is generally competitive with the other methods; (3) in practice, the different methods usually identify different sub-sequences as S/MARs; (4) more research on the H-Rule would be valuable.
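To make the idea of an AT-percentage rule concrete, the following is a minimal sketch of a sliding-window predictor that flags windows whose A+T fraction meets a threshold and merges overlapping hits. It is an illustration only: the function names and the window size, step, and threshold values are assumptions chosen for the example, not the parameters used in this study.

```python
def at_fraction(seq):
    """Fraction of bases in seq that are A or T (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "AT") / len(seq)


def predict_at_rich_regions(seq, window=300, step=10, threshold=0.70):
    """Return merged half-open (start, end) intervals of AT-rich windows.

    window, step and threshold are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    regions = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        if at_fraction(chunk) >= threshold:
            end = start + len(chunk)
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous hit: extend it.
                regions[-1] = (regions[-1][0], max(regions[-1][1], end))
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    # Toy sequence: GC-rich flanks around a 100 bp AT-only core.
    toy = "GC" * 50 + "AT" * 50 + "GC" * 50
    print(predict_at_rich_regions(toy, window=20, step=10, threshold=0.9))
```

A rule of this kind has only a few tunable parameters, which makes it a convenient baseline against which more elaborate predictors can be compared.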
New insight is needed to design a method that predicts S/MARs well. Our data, including the control data, have been deposited as additional material, which may help later researchers test new predictors.