t2prhd: a tool to study the patterns of repeat evolution
1 Institute of Genetics, Biological Research Center of the Hungarian Academy of Sciences, P.O. Box 521, H-6701 Szeged, Hungary
2 Department of Ecology, University of Szeged, Egyetem street 2, H-6721, Szeged, Hungary
BMC Bioinformatics 2008, 9:27. doi:10.1186/1471-2105-9-27. Published: 18 January 2008
The models developed to characterize the evolution of multigene families (such as the birth-and-death and concerted evolution models) have also been applied at the level of sequence repeats within a gene or protein. Phylogenetic reconstruction is the method of choice for studying the evolution of both gene families and sequence repeats in the light of these models. Gene family evolution is characterized under these models by evaluating how the sequences cluster in the tree with respect to their loci of origin. Because the locus represents positional information, the corresponding information for repeats is their exact position in the sequence; simple numbering according to repeat order can be misleading.
We have developed a novel, rapid visual approach to study repeat evolution that takes into account the exact position of each repeat in a sequence. The "pairwise repeat homology diagram" visualizes sequence repeats detected by a profile HMM in a pair of sequences and highlights their homology relations as inferred from a phylogenetic tree. The method is implemented in a Perl script (t2prhd) available for download at http://t2prhd.sourceforge.net and is also accessible as an online tool at http://t2prhd.brc.hu. The power of the method is demonstrated on the EGF-like and fibronectin-III-like (Fn-III) domain repeats of three selected mammalian Tenascin sequences.
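The idea of combining repeat positions with tree-derived homology can be illustrated with a minimal sketch. It is not the t2prhd implementation (which is a Perl script) and makes several assumptions that are not taken from the paper: the tree is a strictly binary nested tuple, each leaf is a hypothetical `Repeat` record carrying the sequence name and residue coordinates, and two repeats are treated as homologous only when they are sister leaves from the two different sequences. The function name `homologous_pairs` and the coordinates are likewise invented for illustration.

```python
from typing import List, NamedTuple, Tuple


class Repeat(NamedTuple):
    """One detected repeat: source sequence and residue coordinates (hypothetical record)."""
    seq: str
    start: int
    end: int


def homologous_pairs(tree, seq_a: str, seq_b: str) -> List[Tuple[Repeat, Repeat]]:
    """Collect sister-leaf pairs that join one repeat of seq_a with one of seq_b.

    Assumption: `tree` is a binary tree written as nested 2-tuples whose
    leaves are Repeat records. Sister leaves stand in for "homologous".
    """
    pairs: List[Tuple[Repeat, Repeat]] = []

    def visit(node) -> None:
        # Repeat is itself a tuple subclass, so test for leaves first.
        if isinstance(node, Repeat):
            return
        left, right = node
        if isinstance(left, Repeat) and isinstance(right, Repeat):
            if (left.seq, right.seq) == (seq_a, seq_b):
                pairs.append((left, right))
            elif (left.seq, right.seq) == (seq_b, seq_a):
                pairs.append((right, left))
            return
        visit(left)
        visit(right)

    visit(tree)
    # Order the pairs by position along seq_a, as a diagram would draw them.
    return sorted(pairs, key=lambda pair: pair[0].start)


# Invented coordinates: both sequences have two repeats, but the second
# repeat of B lies much further along the sequence than that of A.
a1, a2 = Repeat("A", 10, 40), Repeat("A", 55, 85)
b1, b2 = Repeat("B", 12, 42), Repeat("B", 120, 150)

tree = ((a1, b1), (a2, b2))
for rep_a, rep_b in homologous_pairs(tree, "A", "B"):
    print(f"A:{rep_a.start}-{rep_a.end} <-> B:{rep_b.start}-{rep_b.end}")
```

Reporting coordinates rather than ordinal repeat numbers keeps the positional shift of the second repeat in B visible. If the repeats of each sequence clustered together instead, as in `((a1, a2), (b1, b2))`, the function would return no cross-sequence pairs.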
Although pairwise repeat homology diagrams do not carry all the information provided by the phylogenetic tree, they allow a rapid and intuitive assessment of repeat evolution. We believe that t2prhd is a helpful tool for studying the pattern of repeat evolution. The method can be particularly useful for large datasets (such as large gene families), because the command line interface makes it possible to automate the generation of pairwise repeat homology diagrams with scripts.