This article is part of the supplement: Italian Society of Bioinformatics (BITS): Annual Meeting 2012
Identification and analysis of conserved pockets on protein surfaces
1 Department of Biology, University "Federico II", Via Cinthia, 80126, Naples, Italy
2 Istituto di Chimica Biomolecolare -CNR, Comprensorio Olivetti, 80078, Pozzuoli, Italy
3 Istituto di Biostrutture e Bioimmagini-CNR, Via Mezzocannone, 80134, Napoli, Italy
BMC Bioinformatics 2013, 14(Suppl 7):S9 doi:10.1186/1471-2105-14-S7-S9
Published: 22 April 2013
The interaction between proteins and ligands occurs at pockets that are often lined by conserved amino acids. These pockets can be targets for low molecular weight drugs. To make the search for new medicines as productive as possible, it is necessary to exploit in silico techniques, high-throughput screening and fragment-based screening, which require the identification of druggable pockets on the protein surface. Such pockets may or may not correspond to active sites.
We developed a tool to evaluate the conservation of each pocket detected on the protein surface by CastP. The tool was named DrosteP because it recursively searches for the optimal input sequences to be used to calculate conservation. DrosteP uses a descriptor of statistical significance, the Poisson p-value, as the target when optimizing the choice of input sequences. To benchmark DrosteP we used monomeric or homodimeric human proteins with known 3D structure whose active site had been annotated in UniProt. DrosteP detects the active site with high accuracy: in 81% of the cases the active site coincides with the most conserved pocket. Comparing DrosteP with analogous programs is difficult because their outputs differ. Nonetheless, we could assess the efficacy of the recursive algorithm in identifying active site pockets by calculating conservation with the same input sequences used by other programs.
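The idea of using a Poisson p-value as the target for choosing input sequences can be illustrated with a minimal sketch. The abstract does not state how DrosteP computes the p-value or how the recursion proceeds, so everything below is an assumption made for illustration: the names `poisson_sf`, `pocket_p_value` and `best_depth` are hypothetical, a residue is treated as simply conserved or not, and the "depth" labels stand for alternative sets of homologs. The sketch scores each pocket by the probability of seeing at least the observed number of conserved residues by chance, then keeps the homolog set that gives the smallest p-value.

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def pocket_p_value(pocket, conserved, n_surface):
    """Chance of observing at least this many conserved residues in a pocket.

    Assumption: the expected count is pocket size times the fraction of
    surface residues that are conserved. This is not taken from the paper.
    """
    lam = len(pocket) * len(conserved) / n_surface
    observed = len(pocket & conserved)
    return poisson_sf(observed, lam)


def best_depth(pockets, conserved_by_depth, n_surface):
    """Pick the (p-value, depth, pocket) triple with the smallest p-value."""
    best = None
    for depth, conserved in conserved_by_depth.items():
        for name, residues in pockets.items():
            p = pocket_p_value(residues, conserved, n_surface)
            if best is None or p < best[0]:
                best = (p, depth, name)
    return best


# Toy data (invented for illustration): residue indices on a 100-residue surface.
pockets = {"A": {1, 2, 3, 4}, "B": {10, 11, 12, 13}}
conserved_by_depth = {
    "close": set(range(1, 51)),   # close homologs only: half the surface looks conserved
    "distant": {1, 2, 3, 4, 20},  # distant homologs: few residues stay conserved
}
best = best_depth(pockets, conserved_by_depth, n_surface=100)
print(best)
```

With close homologs almost every residue appears conserved, so no pocket stands out. With more distant homologs the conserved residues concentrate in pocket A and its p-value drops sharply, which is the kind of signal an optimization over input sequences can exploit.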
We analyzed the amino acid composition of the conserved pockets identified by DrosteP and found that it differs significantly from the amino acid composition of non-conserved pockets.
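A composition comparison of this kind can be sketched as follows. The abstract does not say which statistic was used, so this is only an assumed illustration: the function names `composition` and `log2_enrichment` are hypothetical, the pseudocount is a convenience to avoid division by zero, and the input strings are invented toy data rather than results from the study.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(residues, pseudocount=1.0):
    """Relative frequency of each of the 20 amino acids, with a pseudocount."""
    counts = Counter(r for r in residues if r in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def log2_enrichment(conserved_residues, other_residues):
    """log2 ratio of frequencies: positive means enriched in conserved pockets."""
    conserved = composition(conserved_residues)
    other = composition(other_residues)
    return {aa: math.log2(conserved[aa] / other[aa]) for aa in AMINO_ACIDS}


# Toy input: one-letter codes of residues lining each class of pocket.
enrichment = log2_enrichment("HHHHDDSC", "AAAALLKK")
for aa, value in sorted(enrichment.items(), key=lambda item: -item[1])[:3]:
    print(aa, round(value, 2))
```

A log-ratio only ranks amino acids by over- or under-representation. Establishing that a difference is significant would additionally require a statistical test on the underlying counts.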
Several methods that combine 3D structure and evolutionary sequence conservation have been proposed for predicting ligand-binding sites on protein surfaces. Any method relying on conservation depends mainly on the choice of the input sequences. DrosteP chooses how distant the collected homologs should be when evaluating conservation, and thus optimizes the identification of active site pockets. Moreover, it recognizes conserved pockets other than those coinciding with the sites annotated in UniProt, and these might represent useful druggable sites. The distinctive amino acid composition of conserved pockets provides useful hints on the fundamental principles underlying protein-ligand interaction.