Mapping the distribution of packing topologies within protein interiors shows predominant preference for specific packing motifs
1 Crystallography and Molecular Biology Division, Saha Institute of Nuclear Physics, 1/AF, Bidhannagar, Kolkata - 700 064, India
2 Biophysics Division, Saha Institute of Nuclear Physics, 1/AF, Bidhannagar, Kolkata - 700 064, India
BMC Bioinformatics 2011, 12:195 doi:10.1186/1471-2105-12-195Published: 24 May 2011
Mapping protein primary sequences to their three dimensional folds referred to as the 'second genetic code' remains an unsolved scientific problem. A crucial part of the problem concerns the geometrical specificity in side chain association leading to densely packed protein cores, a hallmark of correctly folded native structures. Thus, any model of packing within proteins should constitute an indispensable component of protein folding and design.
In this study an attempt has been made to find, characterize and classify recurring patterns in the packing of side chain atoms within a protein which sustains its native fold. The interaction of side chain atoms within the protein core has been represented as a contact network based on the surface complementarity and overlap between associating side chain surfaces. Some network topologies definitely appear to be preferred and they have been termed 'packing motifs', analogous to super secondary structures in proteins. Study of the distribution of these motifs reveals the ubiquitous presence of typical smaller graphs, which appear to get linked or coalesce to give larger graphs, reminiscent of the nucleation-condensation model in protein folding. One such frequently occurring motif, also envisaged as the unit of clustering, the three residue clique was invariably found in regions of dense packing. Finally, topological measures based on surface contact networks appeared to be effective in discriminating sequences native to a specific fold amongst a set of decoys.
Out of innumerable topological possibilities, only a finite number of specific packing motifs are actually realized in proteins. This small number of motifs could serve as a basis set in the construction of larger networks. Of these, the triplet clique exhibits distinct preference both in terms of composition and geometry.