Figure S1. Rectangular phylogram view of the phylogenetic tree of family GH5. Branches corresponding to subfamilies 1–53 are shown in color, and each subfamily is labeled with its subfamily number as indicated in Figure 1. Branches corresponding to sequences not assigned to any subfamily are in black. Each individual protein module node is identified by a varying number of fields separated by “|” indicating: (i) the organism, with 3 letters for the genus and either 5 letters for the species or the full strain code; (ii) the protein accession in public databases, typically GenBank; (iii) if attributed, the subfamily number or other information; (iv) if available, EC numbers (node in bold) or a “*” (node in bold and italic) to indicate precise enzyme characterizations or simple activity tests, respectively. A suffix like “_2” may indicate the module position if more than one GH5 module is present in the same polypeptide. Lower-confidence nodes, with an SH-like local support below 0.7 (on a scale from 0, weak, to 1, strong), are indicated with a black dot. Identified sequences without complete catalytic machinery are in red. Individual subfamily trees are also included in this file.
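For readers who wish to process the node labels programmatically, the field layout described in the caption can be sketched as a small parser. This is only an illustrative sketch, not the procedure used to build the figure: the names `NodeLabel` and `parse_node_label` are hypothetical, the “_2”-style suffix is assumed to sit at the end of the accession field, and the sample label in the usage note is a made-up placeholder rather than a node from the tree.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# EC numbers such as 3.2.1.4; the last position may be "-" when unassigned.
EC_PATTERN = re.compile(r"\b\d+\.\d+\.\d+\.(?:\d+|-)")
# Assumption: a module-position suffix is "_" plus one or two digits
# at the very end of the accession field.
SUFFIX_PATTERN = re.compile(r"^(.*)_(\d{1,2})$")


@dataclass
class NodeLabel:
    organism: str                       # field (i)
    accession: Optional[str]            # field (ii), suffix removed
    module_position: Optional[int]      # from a suffix like "_2", if any
    annotations: List[str] = field(default_factory=list)  # fields (iii), (iv)
    ec_numbers: List[str] = field(default_factory=list)
    activity_tested: bool = False       # True when a "*" field is present


def parse_node_label(label: str) -> NodeLabel:
    """Split a "|"-separated node label into its fields (hypothetical helper)."""
    parts = [p.strip() for p in label.split("|")]
    organism = parts[0]
    accession = parts[1] if len(parts) > 1 else None
    module_position = None

    if accession is not None:
        match = SUFFIX_PATTERN.match(accession)
        if match:
            accession = match.group(1)
            module_position = int(match.group(2))

    # Fields (iii) and (iv) are optional, so they are kept as a plain list
    # instead of being assigned to fixed positions.
    annotations = parts[2:]
    ec_numbers = [ec for a in annotations for ec in EC_PATTERN.findall(a)]
    activity_tested = any(a == "*" for a in annotations)

    return NodeLabel(organism, accession, module_position,
                     annotations, ec_numbers, activity_tested)
```

With the placeholder label `"Xxx_yyyyy|ABC12345.1_2|2|3.2.1.4"`, the parser would report organism `Xxx_yyyyy`, accession `ABC12345.1`, module position 2 and one EC number. Because fields (iii) and (iv) are optional, the sketch does not try to decide which annotation is the subfamily number.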

