Optimal phenotypic clustering of the clinical data set. The optimal set of clusters was obtained using constrained k-medoids clustering with integer programming. It comprises 36 multi-membered clusters and 14 single-membered "clusters," or outliers. Each row represents one cluster. The second column indicates the cluster size. The next 9 columns represent the cluster centroids' phenotypic drug resistance scores, colored according to the legend. The columns at right indicate mutations at selected positions in the sequence chosen to represent the cluster. Because isolates with mixtures at any of the specified positions were not allowed to represent a cluster, certain single-membered clusters do not have a representative "sequence." The representative sequences chosen for clusters 29, 31, 34, and 36 show no mutations at the positions listed here, but they have substitutions at other positions (Table S1, Additional File 1).
Doherty et al. BMC Bioinformatics 2011 12:477 doi:10.1186/1471-2105-12-477