RibAlign: a software tool and database for eubacterial phylogeny based on concatenated ribosomal protein subunits
1 Microbial Genomics Group, Max Planck Institute for Marine Microbiology, D-28359 Bremen, Germany
2 International University Bremen, D-28759 Bremen, Germany
BMC Bioinformatics 2006, 7:66 doi:10.1186/1471-2105-7-66Published: 13 February 2006
Until today, analysis of 16S ribosomal RNA (rRNA) sequences has been the de-facto gold standard for the assessment of phylogenetic relationships among prokaryotes. However, the branching order of the individual phlya is not well-resolved in 16S rRNA-based trees. In search of an improvement, new phylogenetic methods have been developed alongside with the growing availability of complete genome sequences. Unfortunately, only a few genes in prokaryotic genomes qualify as universal phylogenetic markers and almost all of them have a lower information content than the 16S rRNA gene. Therefore, emphasis has been placed on methods that are based on multiple genes or even entire genomes. The concatenation of ribosomal protein sequences is one method which has been ascribed an improved resolution. Since there is neither a comprehensive database for ribosomal protein sequences nor a tool that assists in sequence retrieval and generation of respective input files for phylogenetic reconstruction programs, RibAlign has been developed to fill this gap.
RibAlign serves two purposes: First, it provides a fast and scalable database that has been specifically adapted to eubacterial ribosomal protein sequences and second, it provides sophisticated import and export capabilities. This includes semi-automatic extraction of ribosomal protein sequences from whole-genome GenBank and FASTA files as well as exporting aligned, concatenated and filtered sequence files that can directly be used in conjunction with the PHYLIP and MrBayes phylogenetic reconstruction programs.
Up to now, phylogeny based on concatenated ribosomal protein sequences is hampered by the limited set of sequenced genomes and high computational requirements. However, hundreds of full and draft genome sequencing projects are on the way, and advances in cluster-computing and algorithms make phylogenetic reconstructions feasible even with large alignments of concatenated marker genes. RibAlign is a first step in this direction and may be particularly interesting to scientists involved in whole genome sequencing of representatives of new or sparsely studied eubacterial phyla. RibAlign is available at http://www.megx.net/ribalign webcite