Table 1

Parameters used for prediction analysis and their properties

| Parameter | Information it gives | Drawback |
| --- | --- | --- |
| Total = Frequency of gene pair co-expression | Total number of times a gene pair is expressed, excluding missing values | Some genes are expressed more frequently than others |
| MIM = Mutual Information Measure | Specificity of co-expression | When Total is small, MIM can be artificially high |
| R2 = Pearson’s Correlation Coefficient | Correlation between gene pair expression levels | Will detect global, but not conditional, co-regulation. Also, non-expression is far more common than expression, biasing R2 (e.g., 2 genes never expressed will show perfect correlation) |
| P = Purity | When co-expression “behavior” is described in terms of discrete categories, purity reflects a relative breakdown of behavioral observations | Information can be lost when discretizing a continuous variable |
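
Two of the drawbacks listed in the table can be illustrated with a small numerical sketch. The code below is not the authors' implementation: the function names, the toy data, and the specific formulas are assumptions made for illustration. In particular, MIM is approximated here by a pointwise mutual information on binary expression calls (log2 of the observed co-expression frequency over the frequency expected under independence), and missing values are represented as `None`.

```python
import math

def total_coexpressed(a, b):
    """Count experiments where both genes are called expressed; None marks a missing value."""
    return sum(1 for x, y in zip(a, b)
               if x is not None and y is not None and x and y)

def pointwise_mi(a, b):
    """Assumed stand-in for MIM: log2(P(a,b) / (P(a) * P(b))) on binary calls."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    p_a = sum(1 for x, _ in pairs if x) / n
    p_b = sum(1 for _, y in pairs if y) / n
    p_ab = sum(1 for x, y in pairs if x and y) / n
    if p_ab == 0:
        return float("-inf")  # never co-expressed
    return math.log2(p_ab / (p_a * p_b))

def pearson_r(x, y):
    """Pearson correlation; returns NaN when either series has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

# Toy data (hypothetical): a rarely expressed pair seen together once in 1000 experiments.
rare_a = [1] + [0] * 999
rare_b = [1] + [0] * 999

# A frequently expressed pair: each gene on in 500 experiments, both on in 400.
common_a = [1] * 500 + [0] * 500
common_b = [1] * 400 + [0] * 100 + [1] * 100 + [0] * 400

# Expression levels: silent in 95 experiments, anti-correlated in the 5 where expressed.
levels_a = [0.0] * 95 + [5.0, 1.0, 4.0, 2.0, 3.0]
levels_b = [0.0] * 95 + [1.0, 5.0, 2.0, 4.0, 3.0]

print(total_coexpressed(rare_a, rare_b), pointwise_mi(rare_a, rare_b))
print(total_coexpressed(common_a, common_b), pointwise_mi(common_a, common_b))
print(pearson_r(levels_a, levels_b), pearson_r(levels_a[95:], levels_b[95:]))
```

With these toy inputs, the rare pair (Total = 1) scores about 9.97 bits while the well-supported pair (Total = 400) scores about 0.68 bits, which is the small-Total inflation noted for MIM. Likewise, the shared block of non-expression pushes the correlation across all 100 experiments to about 0.62, even though the two genes are perfectly anti-correlated (r = -1) in the experiments where they are expressed.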


Dozmorov et al. BMC Bioinformatics 2011 12(Suppl 10):S14   doi:10.1186/1471-2105-12-S10-S14
