Statistical power of phylo-HMM for evolutionarily conserved element detection
1 Department of Statistics, Harvard University, Boston, MA, USA
2 Genetics, Rosetta Inpharmatics LLC, a wholly owned subsidiary of Merck & Co., Inc. Seattle, WA, USA
BMC Bioinformatics 2007, 8:374 doi:10.1186/1471-2105-8-374
Published: 5 October 2007
An important goal of comparative genomics is the identification of functional elements through conservation analysis. The phylogenetic hidden Markov model (phylo-HMM) was recently introduced to detect conserved elements from multiple genome alignments, but the method has not been rigorously evaluated.
We report a simulation study investigating the power of phylo-HMM. We show that the power of the phylo-HMM approach depends on many factors, the most important being the number of species-specific genomes used and the evolutionary distances between pairs of species. This finding is consistent with results reported by other groups for simpler comparative genomics models. The conservation ratio of conserved elements and their expected length are also major factors. In contrast, the tree topology and the nucleotide substitution model have relatively minor influence.
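To make the notion of power concrete, the following is a minimal toy sketch of this kind of simulation, not the model or code used in the study. It assumes a two-state HMM (conserved vs. non-conserved), a reference sequence compared against `n_species - 1` other species at Jukes-Cantor distance `t`, and a conservation ratio `rho` that scales that distance inside conserved elements. All function names, parameters, and default values are hypothetical. Power is measured as the fraction of truly conserved columns that Viterbi decoding labels as conserved; the false positive rate is returned alongside it because it is not held fixed here.

```python
import math
import random


def jc_mismatch_prob(t):
    """Jukes-Cantor probability that two sequences at distance t differ at a site."""
    return 0.75 * (1.0 - math.exp(-4.0 * t / 3.0))


def simulate(n_cols, n_others, p_cons, p_neut, len_cons, len_neut, rng):
    """Simulate hidden states (1 = conserved) and per-column mismatch counts."""
    states, counts = [], []
    state = 0  # assumption: the alignment starts in the non-conserved state
    for _ in range(n_cols):
        p = p_cons if state == 1 else p_neut
        # Number of non-reference species that differ from the reference base.
        counts.append(sum(rng.random() < p for _ in range(n_others)))
        states.append(state)
        # Geometric segment lengths with the requested expected length.
        if rng.random() < 1.0 / (len_cons if state == 1 else len_neut):
            state = 1 - state
    return states, counts


def viterbi(counts, n_others, p_cons, p_neut, len_cons, len_neut):
    """Most probable state path for the two-state HMM (0 = neutral, 1 = conserved)."""
    p = [p_neut, p_cons]
    leave = [1.0 / len_neut, 1.0 / len_cons]
    log_trans = [
        [math.log(1.0 - leave[0]), math.log(leave[0])],
        [math.log(leave[1]), math.log(1.0 - leave[1])],
    ]

    def log_emit(s, k):
        # Binomial log-likelihood; the binomial coefficient is state-independent.
        return k * math.log(p[s]) + (n_others - k) * math.log(1.0 - p[s])

    pi_cons = len_cons / (len_cons + len_neut)  # stationary probability of state 1
    score = [
        math.log(1.0 - pi_cons) + log_emit(0, counts[0]),
        math.log(pi_cons) + log_emit(1, counts[0]),
    ]
    back = []
    for k in counts[1:]:
        new, ptr = [], []
        for s in (0, 1):
            cands = [score[r] + log_trans[r][s] for r in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            ptr.append(best)
            new.append(cands[best] + log_emit(s, k))
        score = new
        back.append(ptr)
    # Trace back from the best final state.
    path = [0 if score[0] >= score[1] else 1]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def power(n_species, t=0.3, rho=0.3, len_cons=50, len_neut=200,
          n_cols=20000, seed=0):
    """Return (base-level sensitivity, false positive rate) for one simulated alignment."""
    rng = random.Random(seed)
    n_others = n_species - 1
    p_neut, p_cons = jc_mismatch_prob(t), jc_mismatch_prob(rho * t)
    truth, counts = simulate(n_cols, n_others, p_cons, p_neut, len_cons, len_neut, rng)
    calls = viterbi(counts, n_others, p_cons, p_neut, len_cons, len_neut)
    tp = sum(1 for a, b in zip(truth, calls) if a == 1 and b == 1)
    fp = sum(1 for a, b in zip(truth, calls) if a == 0 and b == 1)
    n_cons = sum(truth)
    return tp / n_cons, fp / (len(truth) - n_cons)


if __name__ == "__main__":
    for n in (2, 5, 10):
        sens, fpr = power(n)
        print(n, round(sens, 3), round(fpr, 3))
```

Varying `n_species`, `t`, `rho`, and `len_cons` in this sketch gives a rough feel for how each factor affects detection. The reference-versus-others comparison ignores the tree topology entirely, so it cannot say anything about that factor.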
Our results provide general guidelines on how to select the number of genomes and their evolutionary distances in comparative genomics studies, as well as the level of power that can be expected under different parameter settings.