Open Access Highly Accessed Methodology article

H2r: Identification of evolutionary important residues by means of an entropy based analysis of multiple sequence alignments

Rainer Merkl1* and Matthias Zwick1,2

Author Affiliations

1 Institut für Biophysik und Physikalische Biochemie, Universität Regensburg, D-93040 Regensburg, Germany

2 Biozentrum, Universität Basel, CH-4056 Basel, Switzerland

BMC Bioinformatics 2008, 9:151  doi:10.1186/1471-2105-9-151

Published: 18 March 2008



A multiple sequence alignment (MSA) generated for a protein can be used to characterise residues by means of a statistical analysis of single columns. Beyond the examination of individual positions, investigating the co-variation of amino acid frequencies between positions offers insights into the function and evolution of the protein and its residues.
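As a rough illustration of a single-column statistic, the sketch below computes the Shannon entropy of each MSA column. It assumes the alignment is given as a list of equal-length strings and treats every character, including a gap symbol, as an ordinary symbol. The function name `column_entropy` and the toy alignment are hypothetical and are not taken from the H2r implementation.

```python
import math
from collections import Counter


def column_entropy(msa, k):
    """Shannon entropy (in bits) of column k of an MSA.

    msa is assumed to be a list of aligned, equal-length sequences.
    """
    column = [seq[k] for seq in msa]
    n = len(column)
    counts = Counter(column)
    # p * log2(1/p), summed over the symbols observed in the column
    return sum((c / n) * math.log2(n / c) for c in counts.values())


# Toy alignment: column 0 is strictly conserved, the others are variable.
msa = ["ACDA", "ACEA", "AGDC", "AGEC"]
entropies = [column_entropy(msa, k) for k in range(len(msa[0]))]
```

A strictly conserved column yields an entropy of zero, while a column split evenly between two residues yields one bit.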


We introduce conn(k), a novel parameter for the characterisation of individual residues. For each residue k, conn(k) counts how many of the most extreme signals of co-evolution involve k. These signals were deduced from a normalised mutual information (MI) value U(k, l) computed for all pairs of residues k, l. We demonstrate that conn(k) is a more robust indicator than an individual MI-value for the prediction of residues most plausibly important for the evolution of a protein. This proposition was inferred by means of statistical methods and further confirmed by the analysis of several proteins. A server that computes conn(k)-values is available online.
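The idea behind conn(k) can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the H2r implementation: it assumes U(k, l) is normalised as 2 (H(k) + H(l) - H(k, l)) / (H(k) + H(l)), and it takes the "most extreme signals" to be the `top_n` highest-scoring pairs. Both the normalisation and the cutoff parameter `top_n` are assumptions made for this example, as are all function names.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(symbols):
    """Shannon entropy (in bits) of a list of hashable symbols."""
    n = len(symbols)
    return sum((c / n) * math.log2(n / c) for c in Counter(symbols).values())


def normalised_mi(msa, k, l):
    """Assumed normalisation: U = 2 * (H(k) + H(l) - H(k,l)) / (H(k) + H(l))."""
    col_k = [seq[k] for seq in msa]
    col_l = [seq[l] for seq in msa]
    h_k, h_l = entropy(col_k), entropy(col_l)
    h_kl = entropy(list(zip(col_k, col_l)))  # joint entropy of the column pair
    if h_k + h_l == 0:
        return 0.0  # two strictly conserved columns carry no co-variation signal
    return 2 * (h_k + h_l - h_kl) / (h_k + h_l)


def conn(msa, top_n):
    """For each column, count its occurrences among the top_n highest U(k, l) pairs."""
    length = len(msa[0])
    scores = {
        (k, l): normalised_mi(msa, k, l)
        for k, l in combinations(range(length), 2)
    }
    top_pairs = sorted(scores, key=scores.get, reverse=True)[:top_n]
    counts = [0] * length
    for k, l in top_pairs:
        counts[k] += 1
        counts[l] += 1
    return counts


# Toy alignment: columns 1 and 3 co-vary perfectly, column 2 varies independently.
msa = ["ACDA", "ACEA", "AGDC", "AGEC"]
u_13 = normalised_mi(msa, 1, 3)
u_12 = normalised_mi(msa, 1, 2)
conn_values = conn(msa, top_n=1)
```

In this toy alignment only columns 1 and 3 take part in the single strongest pair, so only they receive a non-zero count. A real alignment would need far more sequences for the pairwise frequencies to be estimated reliably.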


The algorithm H2r, which analyses MSAs and computes conn(k)-values, characterises a specific class of residues. In contrast to strictly conserved ones, these residues possess some flexibility in the composition of side chains. However, their allocation is finely balanced with several other positions, as indicated by conn(k).