|
Figure 2.
Distributions over adjective lemmas as tagged by the C&C parser trained on Genia.
Clockwise from the top left: the heatmap shows the pairwise Jensen-Shannon Divergence
(top half) and statistical significance (bottom half), as well as the homogeneity
(diagonal). The dendrogram shows hierarchical clustering based on cosine difference
between each subdomain's JSD values. The scatter plot is colored according to the
best K-means clustering (determined by the Gap statistic) projected onto the first
two principal components (normalized). The line plot shows the intra-subdomain spread
of JSD values generated by random sampling.
Lippincott et al. BMC Bioinformatics 2011 12:212 doi:10.1186/1471-2105-12-212 |
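
The pairwise Jensen-Shannon Divergence (JSD) shown in the heatmap can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary representation of a lemma distribution, and the choice of base-2 logarithms (which bounds the JSD between 0 and 1) are all assumptions made for illustration.

```python
import math


def to_distribution(counts):
    """Normalize raw lemma counts (hypothetical input format) into probabilities."""
    total = sum(counts.values())
    return {lemma: n / total for lemma, n in counts.items()}


def jensen_shannon_divergence(p, q):
    """JSD between two distributions given as dicts mapping lemma -> probability.

    Assumes base-2 logarithms, so the result lies in [0, 1].
    """
    lemmas = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the union of lemmas.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in lemmas}

    def kl_to_mixture(a):
        # KL(A || M); terms with zero probability contribute nothing.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)
```

Under these assumptions, two subdomains with identical adjective distributions give a JSD of 0, and two subdomains that share no adjectives give a JSD of 1. Computing the function for every pair of subdomains would fill one half of a heatmap like the one described in the caption.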