Grammar-based cluster formations. Forty sequences are processed by a vector quantizer, with each of the four genera represented by ten sequences. Every sequence is grammatically compared to the same two sequences from within the set. Each resulting pair of distances forms a two-dimensional vector, so every sequence corresponds to one point in a two-dimensional space. For each cluster in this space, the representative sequence should be the sequence nearest to the cluster center.
Russell et al. BMC Bioinformatics 2010 11:601 doi:10.1186/1471-2105-11-601
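The selection of a representative sequence described in the caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the cluster center is the arithmetic mean of the cluster's points and that nearness is Euclidean, and it takes cluster labels as given rather than running a vector quantizer. The function names and the distance values are hypothetical, and the grammar-based distance itself is not computed here.

```python
from math import dist  # Euclidean distance, Python 3.8+


def cluster_centers(points, labels):
    """Return {label: centroid} for 2-D points grouped by cluster label."""
    sums = {}
    for (x, y), lab in zip(points, labels):
        sx, sy, n = sums.get(lab, (0.0, 0.0, 0))
        sums[lab] = (sx + x, sy + y, n + 1)
    return {lab: (sx / n, sy / n) for lab, (sx, sy, n) in sums.items()}


def representatives(points, labels):
    """Return {label: index of the point nearest that cluster's centroid}."""
    centers = cluster_centers(points, labels)
    best = {}  # label -> (smallest distance so far, index of that point)
    for i, (p, lab) in enumerate(zip(points, labels)):
        d = dist(p, centers[lab])
        if lab not in best or d < best[lab][0]:
            best[lab] = (d, i)
    return {lab: i for lab, (_, i) in best.items()}


# Hypothetical (distance to sequence A, distance to sequence B) pairs.
# These are made-up values for illustration, not data from the paper.
points = [
    (0.10, 0.80), (0.20, 0.90), (0.15, 0.85),  # cluster 0
    (0.70, 0.20), (0.90, 0.30), (0.80, 0.25),  # cluster 1
]
labels = [0, 0, 0, 1, 1, 1]

reps = representatives(points, labels)
print(reps)
```

Each key of `reps` is a cluster label and each value is the index of the sequence chosen to represent that cluster.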