|
Figure 6.
SCs identified by FASST within the β-lactamases. Applying FASST to expose the substructural diversity of a catalytic substructure
among the β-lactamases reveals many distinct clusters within the family. The GMM clustering step
of FASST identifies 13 sub-groups within the family, and the colors/shapes of points
in the SCs correspond to cluster assignment. MESH then constructs one consensus motif
for each cluster identified, resulting in an ensemble of 13 motifs. Function prediction
sensitivity improves from 35.0% (single-structure motif) to 81.2% when using the motif
ensemble constructed by FASST-MESH. For the highly diverse family of β-lactamases, the SCs output by FASST show that many distinct sub-groups exist within
the family. MESH takes advantage of this information to more completely model the
geometric diversity present, thereby improving functional annotation coverage of the
family. Mapping Family- and Phylum-level phylogenetic data onto each of the substructures, as shown in the corresponding
plots on the right, reveals that some, but not all, of the clusters identified are
due to evolutionary distance between proteins. For example, the Bacillaceae proteins form a single sub-group, while Enterobacteriaceae proteins are distributed throughout the SCs in several clusters, indicating that another
biological factor is working in concert with phylogenetic distance among the family
of β-lactamases to produce the structural diversity uncovered by FASST.
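
The effect of a motif ensemble on sensitivity can be illustrated with a minimal sketch. It is not the FASST-MESH implementation: the toy coordinates, the cluster labels (standing in for the GMM assignments), the use of a coordinate average as the consensus motif, the pre-aligned RMSD match test, and the 0.5 threshold are all assumptions made for illustration.

```python
from math import sqrt

def rmsd(a, b):
    # Root-mean-square deviation between two pre-aligned lists of (x, y, z) points
    sq = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return sqrt(sq / len(a))

def consensus(members):
    # Hypothetical consensus motif: average each atom position over the cluster
    n = len(members)
    return [tuple(sum(m[i][k] for m in members) / n for k in range(3))
            for i in range(len(members[0]))]

def sensitivity(motifs, family, threshold):
    # Fraction of family members matched by at least one motif
    hits = sum(1 for s in family if any(rmsd(m, s) <= threshold for m in motifs))
    return hits / len(family)

# Toy family: four two-atom substructures forming two well-separated sub-groups
family = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.2, 0.0, 0.0), (1.2, 0.0, 0.0)],
    [(5.0, 0.0, 0.0), (6.0, 0.0, 0.0)],
    [(5.2, 0.0, 0.0), (6.2, 0.0, 0.0)],
]
labels = [0, 0, 1, 1]  # assumed cluster assignments, one per substructure

# One consensus motif per cluster, collected into an ensemble
ensemble = [
    consensus([s for s, l in zip(family, labels) if l == c])
    for c in sorted(set(labels))
]

single_sens = sensitivity([family[0]], family, threshold=0.5)
ensemble_sens = sensitivity(ensemble, family, threshold=0.5)
print(single_sens, ensemble_sens)  # 0.5 1.0
```

In this toy case a motif taken from a single structure matches only its own sub-group, while one motif per cluster covers the whole family, which mirrors the direction of the improvement reported in the caption.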
Bryant et al. BMC Bioinformatics 2010 11:242 doi:10.1186/1471-2105-11-242 |