Open Access Research article

Structure- and context-based analysis of the GxGYxYP family reveals a new putative class of Glycoside Hydrolase

Daniel J Rigden1*, Ruth Y Eberhardt2,3, Harry J Gilbert4, Qingping Xu5, Yuanyuan Chang6 and Adam Godzik7

Author Affiliations

1 Institute of Integrative Biology, University of Liverpool, Liverpool, UK

2 Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK

3 European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridgeshire CB10 1SD, UK

4 Institute for Cell and Molecular Biosciences, The Medical School, Newcastle University, Framlington Place, Newcastle Upon Tyne NE2 4HH, UK

5 Joint Center for Structural Genomics, Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, Menlo Park CA 94025, USA

6 Joint Center for Structural Genomics, Program on Bioinformatics and Systems Biology, Sanford-Burnham Medical Research Institute, La Jolla CA 92037, USA

7 Joint Center for Structural Genomics, Center for Research in Biological Systems, University of California, San Diego, La Jolla CA 92093, USA

BMC Bioinformatics 2014, 15:196. doi:10.1186/1471-2105-15-196

Published: 17 June 2014

Abstract

Background



Gut microbiome metagenomics has revealed many protein families and domains found largely or exclusively in that environment. Proteins containing the GxGYxYP domain are over-represented in the gut microbiota, and are found in Polysaccharide Utilization Loci in the gut symbiont Bacteroides thetaiotaomicron, suggesting their involvement in polysaccharide metabolism, but little else is known of the function of this domain.

Results


Genomic context and domain architecture analyses support a role for the GxGYxYP domain in carbohydrate metabolism. Sparse occurrences in eukaryotes are the result of lateral gene transfer. The structure of the GxGYxYP domain-containing protein encoded by the BT2193 locus reveals two structural domains, the first composed of three divergent repeats with no recognisable homology to previously solved structures, the second a more familiar seven-stranded β/α barrel. Structure-based analyses including conservation mapping localise a presumed functional site to a cleft between the two domains of BT2193. Matching to a catalytic site template from a GH9 cellulase and other analyses point to a putative catalytic triad composed of Glu272, Asp331 and Asp333.

Conclusions


We suggest that GxGYxYP-containing proteins constitute a novel glycoside hydrolase family of as yet unknown specificity.
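
As an illustration of the motif that gives the family its name, the following minimal sketch scans a protein sequence for GxGYxYP-like matches. It assumes that "x" stands for any single residue, as in conventional motif notation; the function name `find_motif` and the toy sequence are hypothetical and are not taken from the study. Family membership in practice would be assessed with profile-based domain models rather than a literal pattern, so this is only a rough first-pass filter.

```python
import re

# Assumption: "x" in GxGYxYP denotes any single amino acid residue.
# A literal pattern like this is far less sensitive than a domain profile.
MOTIF = re.compile(r"G[A-Z]GY[A-Z]YP")


def find_motif(sequence):
    """Return (1-based start position, matched text) for each motif hit."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence, not derived from BT2193 or any real protein.
toy = "MKTAGSGYRYPLLNGAGYTYPQ"
print(find_motif(toy))
```

Positions are reported 1-based to match the residue numbering style used for sites such as Glu272, Asp331 and Asp333.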

Keywords: Carbohydrate metabolism; Glycoside hydrolase; Polysaccharide Utilization Locus; PUL; Protein function prediction; JCSG; 3D structure; Protein family; Gut microbiota