Evolution, substrate specificity and subfamily classification of glycoside hydrolase family 5 (GH5)
1 Division of Glycoscience, School of Biotechnology, KTH - Royal Institute of Technology, AlbaNova University Center, Stockholm SE-106 91, Sweden
2 Michael Smith Laboratories and Department of Chemistry, University of British Columbia, 2185 East Mall, Vancouver V6T 1Z4, Canada
3 Architecture et Fonction des Macromolécules Biologiques, Aix-Marseille Université, CNRS, UMR 7257, 163 Avenue de Luminy, Marseille 13288, France
BMC Evolutionary Biology 2012, 12:186 doi:10.1186/1471-2148-12-186
Published: 20 September 2012
The large glycoside hydrolase family 5 (GH5) groups together a wide range of enzymes, from a large spectrum of organisms, that act on β-linked oligo- and polysaccharides and on glycoconjugates. The long and complex evolution of this family of enzymes and its broad sequence diversity limit functional prediction. With the objective of improving the differentiation of enzyme specificities in a knowledge-based context, and of obtaining new evolutionary insights, we present here a new, robust subfamily classification of family GH5.
About 80% of the current sequences were assigned to 51 subfamilies in a global analysis of all publicly available GH5 sequences and associated biochemical data. Examination of subfamilies with catalytically active members revealed that one third are monospecific (containing a single enzyme activity), although new functions may be discovered through future biochemical characterization. Furthermore, twenty subfamilies are presently entirely uncharacterized, and many others have only limited structural and biochemical data. Mapping of functional knowledge onto the GH5 phylogenetic tree revealed that this knowledge is far from evenly distributed across the sequence space of this historically and industrially important family, highlighting targets in need of further study. The analysis also uncovered a number of GH5 proteins that have lost their catalytic machinery, indicating evolution towards novel functions.
Overall, the subfamily division of GH5 provides an actively curated resource for large-scale protein sequence annotation for glycogenomics; the subfamily assignments are openly accessible via the Carbohydrate-Active Enzyme database at http://www.cazy.org/GH5.html.