This article is part of the supplement: Italian Society of Bioinformatics (BITS): Annual Meeting 2007
FunClust: a web server for the identification of structural motifs in a set of non-homologous protein structures
1 Centre for Molecular Bioinformatics, Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
2 Department of Biochemical Sciences, University of Rome “La Sapienza”, Rome, Italy
BMC Bioinformatics 2008, 9(Suppl 2):S2 doi:10.1186/1471-2105-9-S2-S2Published: 26 March 2008
The occurrence of very similar structural motifs brought about by different parts of non homologous proteins is often indicative of a common function. Indeed, relatively small local structures can mediate binding to a common partner, be it a protein, a nucleic acid, a cofactor or a substrate. While it is relatively easy to identify short amino acid or nucleotide sequence motifs in a given set of proteins or genes, and many methods do exist for this purpose, much more challenging is the identification of common local substructures, especially if they are formed by non consecutive residues in the sequence.
Here we describe a publicly available tool, able to identify common structural motifs shared by different non homologous proteins in an unsupervised mode. The motifs can be as short as three residues and need not to be contiguous or even present in the same order in the sequence. Users can submit a set of protein structures deemed or not to share a common function (e.g. they bind similar ligands, or share a common epitope). The server finds and lists structural motifs composed of three or more spatially well conserved residues shared by at least three of the submitted structures. The method uses a local structural comparison algorithm to identify subsets of similar amino acids between each pair of input protein chains and a clustering procedure to group similarities shared among different structure pairs.
FunClust is fast, completely sequence independent, and does not need an a priori knowledge of the motif to be found. The output consists of a list of aligned structural matches displayed in both tabular and graphical form. We show here examples of its usefulness by searching for the largest common structural motifs in test sets of non homologous proteins and showing that the identified motifs correspond to a known common functional feature.