How accurately is ncRNA aligned within whole-genome multiple alignments?
1 Department of Computer Science and Engineering, University of Washington, Box 352350, Seattle, WA 98195, USA
2 Department of Genome Sciences, University of Washington, Box 355065, Seattle, WA 98195, USA
BMC Bioinformatics 2007, 8:417 doi:10.1186/1471-2105-8-417
Published: 26 October 2007
Multiple alignment of homologous DNA sequences is of great interest to biologists since it provides a window into evolutionary processes. At present, the accuracy of whole-genome multiple alignments, particularly in noncoding regions, has not been thoroughly evaluated.
We evaluate the alignment accuracy of certain noncoding regions using noncoding RNA (ncRNA) alignments from Rfam as a reference. We examine how the MULTIZ 17-vertebrate alignment from the UCSC Genome Browser aligns each human sequence in the Rfam seed alignments. We find 638 instances of chimeric and partial alignments to human ncRNA elements, of which at least 225 can be improved by straightforward means. As a byproduct of our procedure, we predict many novel instances of known ncRNA families that are suggested by the alignment.
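To make the idea of scoring one alignment against a reference concrete, the following is a minimal sketch of one common way to do it: count how many residue pairs aligned in the reference are also aligned in the alignment under test. This summary does not state the measure actually used in the study, so the metric, the function names `aligned_pairs` and `pair_agreement`, and the toy sequences are all assumptions made for illustration. The sketch also assumes that both alignments cover exactly the same ungapped sequences, so that residue indices are comparable.

```python
def aligned_pairs(row_a, row_b, gap_chars="-."):
    """Return the set of (i, j) residue-index pairs that share a column.

    row_a and row_b are two rows of the same alignment (equal length).
    i and j index residues in the ungapped sequences, starting at 0.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    pairs = set()
    i = j = 0  # positions in the ungapped versions of row_a and row_b
    for a, b in zip(row_a, row_b):
        a_is_residue = a not in gap_chars
        b_is_residue = b not in gap_chars
        if a_is_residue and b_is_residue:
            pairs.add((i, j))
        i += a_is_residue
        j += b_is_residue
    return pairs


def pair_agreement(ref_a, ref_b, test_a, test_b):
    """Fraction of reference-aligned residue pairs also aligned in the test.

    Hypothetical metric for illustration only; not taken from the paper.
    """
    ref = aligned_pairs(ref_a, ref_b)
    if not ref:
        return 0.0
    test = aligned_pairs(test_a, test_b)
    return len(ref & test) / len(ref)


# Toy example: a reference alignment with gaps versus an ungapped test alignment.
score = pair_agreement("ACGU-A", "AC-UGA", "ACGUA", "ACUGA")
print(score)
```

A score of 1.0 means every residue pair aligned in the reference is reproduced by the test alignment; lower values indicate columns where the two disagree.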
MULTIZ does a fairly accurate job of aligning these genomes in these difficult regions. However, our experiments indicate that better alignments exist in some regions.