Examples of the data processing inequality (DPI). (a) g1, g2, g3, and g4 are connected in a linear chain. Although all six gene pairs will likely have enriched mutual information, the DPI will infer the most likely path of information flow. For example, g1 ↔ g3 will be eliminated because I(g1, g2) > I(g1, g3) and I(g2, g3) > I(g1, g3). g2 ↔ g4 will be eliminated because I(g2, g3) > I(g2, g4) and I(g3, g4) > I(g2, g4). g1 ↔ g4 will be eliminated in two ways: first, because I(g1, g2) > I(g1, g4) and I(g2, g4) > I(g1, g4), and second, because I(g1, g3) > I(g1, g4) and I(g3, g4) > I(g1, g4). (b) If the underlying interactions form a tree (and MI can be measured without errors), ARACNE will reconstruct the network exactly by removing all false candidate interactions (dashed blue lines) and retaining all true interactions (solid black lines).
Margolin et al. BMC Bioinformatics 2006 7(Suppl 1):S7 doi:10.1186/1471-2105-7-S1-S7
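The elimination rule in panel (a) can be sketched in a few lines of Python. This is a minimal illustration, not the ARACNE implementation: the function name `dpi_prune` and the mutual information values are hypothetical, chosen only so that MI decays with distance along the chain g1, g2, g3, g4. The sketch also leaves out any MI significance threshold or tolerance that a full implementation might apply.

```python
from itertools import combinations


def dpi_prune(mi, genes):
    """Return the gene pairs that survive the DPI test from panel (a).

    `mi` maps frozenset({a, b}) to a mutual information estimate.
    A pair (a, b) is dropped when some third gene c satisfies
    I(a, c) > I(a, b) and I(b, c) > I(a, b).
    """
    def info(a, b):
        return mi[frozenset((a, b))]

    kept = []
    for a, b in combinations(genes, 2):
        # Look for a third gene that explains the pair as an indirect link.
        indirect = any(
            info(a, c) > info(a, b) and info(b, c) > info(a, b)
            for c in genes
            if c not in (a, b)
        )
        if not indirect:
            kept.append((a, b))
    return kept


genes = ["g1", "g2", "g3", "g4"]

# Hypothetical MI values (illustrative only): neighbors in the chain share
# the most information, and MI falls off with distance along the chain.
pair_values = {
    ("g1", "g2"): 0.8,
    ("g2", "g3"): 0.8,
    ("g3", "g4"): 0.8,
    ("g1", "g3"): 0.5,
    ("g2", "g4"): 0.5,
    ("g1", "g4"): 0.3,
}
mi = {frozenset(pair): value for pair, value in pair_values.items()}

edges = dpi_prune(mi, genes)
print(edges)
```

With these assumed values, only the three chain edges g1 ↔ g2, g2 ↔ g3, and g3 ↔ g4 remain. Every test reads the original MI values rather than a partially pruned network, so the result does not depend on the order in which pairs are examined, and g1 ↔ g4 is rejected by both g2 and g3, as the caption describes.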