BMC Bioinformatics


Open Access Software

XRate: a fast prototyping, training and annotation tool for phylo-grammars

Peter S Klosterman (1), Andrew V Uzilov (1), Yuri R Bendaña (1), Robert K Bradley (1), Sharon Chao (1), Carolin Kosiol (3,2), Nick Goldman (2) and Ian Holmes (1)*

Author Affiliations

1 Department of Bioengineering, University of California, Berkeley CA, USA

2 European Bioinformatics Institute, Hinxton, Cambridgeshire, UK

3 Department of Biological Statistics and Computational Biology, Cornell University, Ithaca NY, USA


BMC Bioinformatics 2006, 7:428 doi:10.1186/1471-2105-7-428

Published: 3 October 2006

Abstract

Background

Recent years have seen the emergence of genome annotation methods based on the phylo-grammar, a probabilistic model combining continuous-time Markov chains and stochastic grammars. Until now, phylo-grammars have required considerable effort to implement, which has limited their adoption by computational biologists.

Results

We have developed an open-source software tool, xrate, for working with reversible, irreversible or parametric substitution models combined with stochastic context-free grammars. xrate efficiently estimates maximum-likelihood parameters and phylogenetic trees using a novel "phylo-EM" algorithm, which we describe. The grammar is specified in an external configuration file, so users can design new grammars, estimate rate parameters from training data and annotate multiple sequence alignments without recompiling the source code. We have used xrate to measure codon substitution rates and to predict protein and RNA secondary structures.
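As a rough illustration of what a reversible continuous-time Markov substitution model involves, the sketch below builds a rate matrix Q from symmetric exchangeabilities and equilibrium frequencies, approximates the transition probabilities P(t) = exp(Qt), and checks detailed balance. It is not xrate code and does not use xrate's grammar format; the function names (`rate_matrix`, `transition_probs`, `is_reversible`) and the Jukes-Cantor parameter values are assumptions chosen only for illustration.

```python
import math


def rate_matrix(exch, pi):
    """Build Q with Q[i][j] = exch[i][j] * pi[j] for i != j; rows sum to zero."""
    n = len(pi)
    q = [[exch[i][j] * pi[j] if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])  # diagonal balances the off-diagonal rates
    return q


def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probs(q, t, squarings=10, terms=12):
    """Approximate P(t) = exp(Q t) by a Taylor series with scaling and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        # term holds A^k / k! after this step
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)  # undo the scaling: exp(A)^(2^s)
    return result


def is_reversible(q, pi, tol=1e-12):
    """Detailed balance: pi[i] * Q[i][j] == pi[j] * Q[j][i] for all i, j."""
    n = len(pi)
    return all(abs(pi[i] * q[i][j] - pi[j] * q[j][i]) <= tol
               for i in range(n) for j in range(n))


# Jukes-Cantor model (assumed example): uniform frequencies, equal exchangeabilities,
# scaled so that one unit of time is one expected substitution per site.
pi = [0.25] * 4
exch = [[4.0 / 3.0] * 4 for _ in range(4)]
jc = rate_matrix(exch, pi)
p = transition_probs(jc, 0.5)

# Known closed form for the probability of observing the same nucleotide after time t.
closed_form = 0.25 + 0.75 * math.exp(-4.0 * 0.5 / 3.0)
print(abs(p[0][0] - closed_form) < 1e-9)
print(is_reversible(jc, pi))
```

The pure-Python matrix exponential keeps the sketch dependency-free; for real work a tested routine from a numerical library would be the safer choice. An irreversible model would simply be a rate matrix for which the detailed-balance check fails.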

Conclusion

Our results demonstrate that xrate estimates biologically meaningful rates and makes predictions whose accuracy is comparable to that of more specialized tools.