Test for significant co-duplication of genes. A) Gene duplications and losses are mapped to the branches (B1-B36) of the assumed species phylogeny (see Methods for details). Circles represent gene duplication events; different shades represent different gene families. B) Gene duplication data illustrated in matrix form. Rows are gene families (G1-G21); columns are branches of the phylogeny (B1-B36). A gene duplication event on a branch is represented in the matrix as 1, and the absence of a duplication on a branch as 0. C) All pairwise comparisons were made between gene families (rows), and p-values were calculated using Spearman's rho. This panel shows the number of significantly similar gene family pairs (see Methods). D) To test whether the observed number of significantly co-duplicating gene family pairs could be due to chance, we shuffled each row of the data matrix (G1S-G21S), thus randomizing the gene duplication events on the tree. In this way, we created 1000 shuffled matrices. E) Using the shuffled matrices (D), we calculated p-values for the similarity of each pairwise comparison of shuffled matrix rows. The number of significantly similar row pairs was counted for each of the 1000 shuffled matrices to form a null distribution (Fig 4C), to which the observed value was compared.
Rivera et al. BMC Evolutionary Biology 2010 10:123 doi:10.1186/1471-2148-10-123
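The procedure in panels B to E can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the significance threshold of 0.05, the restriction to positive correlations, the large-sample normal approximation for the p-value of Spearman's rho, and the empirical p-value formula are all assumptions introduced here; the caption specifies only Spearman's rho, row-wise shuffling, and 1000 shuffled matrices.

```python
import math
import random
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_p(x, y):
    """Return (rho, two-sided p) for Spearman's rho.

    Assumption: the p-value uses the normal approximation
    z = rho * sqrt(n - 1). Constant rows have no defined rho and are
    reported as (0.0, 1.0), i.e. never significant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return 0.0, 1.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    rho = sxy / math.sqrt(sxx * syy)
    z = rho * math.sqrt(n - 1)
    return rho, math.erfc(abs(z) / math.sqrt(2))


def count_significant_pairs(matrix, alpha=0.05):
    """Count row pairs with a positive, significant correlation (panel C).

    Assumption: alpha = 0.05 and only positive rho counts as 'similar'.
    """
    count = 0
    for row_a, row_b in combinations(matrix, 2):
        rho, p = spearman_p(row_a, row_b)
        if rho > 0 and p < alpha:
            count += 1
    return count


def permutation_test(matrix, n_shuffles=1000, alpha=0.05, seed=0):
    """Shuffle each row independently (panel D) and build the null
    distribution of significant-pair counts (panel E)."""
    rng = random.Random(seed)
    observed = count_significant_pairs(matrix, alpha)
    null = []
    for _ in range(n_shuffles):
        shuffled = [rng.sample(row, len(row)) for row in matrix]
        null.append(count_significant_pairs(shuffled, alpha))
    # Assumption: one-sided empirical p-value with a +1 correction.
    exceed = sum(1 for c in null if c >= observed)
    return observed, null, (1 + exceed) / (1 + n_shuffles)


if __name__ == "__main__":
    # Toy 3 x 12 matrix (hypothetical data, not from the paper).
    toy = [
        [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
        [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
        [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
    ]
    observed, null, p_emp = permutation_test(toy, n_shuffles=1000)
    print(observed, p_emp)
```

Shuffling within a row keeps the number of duplications per gene family fixed while randomizing the branches on which they fall, which matches the randomization described in panel D. For the real data, the input would be the 21 x 36 matrix of panel B.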