Protein family comparison using statistical models and predicted structural information
Department of Computer Science, Cornell University, Ithaca, NY 14850, USA
BMC Bioinformatics 2004, 5:183 doi:10.1186/1471-2105-5-183
Published: 25 November 2004
This paper presents a simple method to increase the sensitivity of protein family comparisons by incorporating secondary structure (SS) information. We build on the effective information-theoretic approach to profile-profile comparison described in [Yona & Levitt 2002]. Our method augments profile columns with PSIPRED secondary structure predictions and assesses statistical similarity using information-theoretic principles.
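As a rough illustration of what augmenting a profile column with secondary structure could look like, the sketch below concatenates a 20-element amino acid distribution with a 3-element SS distribution (helix, strand, coil) and compares two such columns with the Jensen-Shannon divergence, a standard information-theoretic measure. This summary does not specify the augmentation scheme or the scoring function, so the weighted concatenation, the `ss_weight` parameter, and all function names are assumptions made for illustration only.

```python
import numpy as np


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; ranges from 0 (identical) to 1 (disjoint)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)  # midpoint distribution

    def kl(a, b):
        # Kullback-Leibler divergence, skipping zero-probability terms of a.
        mask = a > 0
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def augment_column(aa_probs, ss_probs, ss_weight=0.5):
    """Hypothetical augmentation: weighted concatenation of the amino acid
    distribution (20 values) and the SS distribution (3 values: H, E, C),
    renormalized so the 23 values sum to 1."""
    aa = np.asarray(aa_probs, dtype=float)
    ss = np.asarray(ss_probs, dtype=float)
    col = np.concatenate([(1.0 - ss_weight) * aa, ss_weight * ss])
    return col / col.sum()


def column_similarity(col_a, col_b):
    """Assumed similarity score: 1 minus the divergence, so higher means more alike."""
    return 1.0 - js_divergence(col_a, col_b)


# Two columns with the same residue distribution but different predicted SS.
uniform_aa = np.full(20, 1.0 / 20)
col_helix = augment_column(uniform_aa, [1.0, 0.0, 0.0])
col_strand = augment_column(uniform_aa, [0.0, 1.0, 0.0])
print(column_similarity(col_helix, col_helix))
print(column_similarity(col_helix, col_strand))
```

In this sketch, two columns that agree on residue composition but disagree on secondary structure score lower than two columns that agree on both, which is the kind of extra signal the augmented profile is meant to contribute. The value of `ss_weight` controls how strongly the structural component influences the score and would need to be tuned on real data.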
Our tests show that this tool detects more similarities between distantly homologous protein families than the previous method, which relies on primary sequence alone. Performance improves markedly when the real secondary structure is used instead of the predicted one.
Integration of primary and secondary structure information can substantially improve detection of relationships between remotely related protein families.