This article is part of the supplement: Third Annual MCBIOS Conference. Bioinformatics: A Calculated Discovery
A method for computing the overall statistical significance of a treatment effect among a group of genes
1 Division of Biometry and Risk Assessment, National Center for Toxicological Research, Jefferson, Arkansas 72079 USA
2 School of Public Health, LSU Health Science Center, New Orleans, LA 70112 USA
BMC Bioinformatics 2006, 7(Suppl 2):S11 doi:10.1186/1471-2105-7-S2-S11
Published: 26 September 2006
In studies that use DNA arrays to assess changes in gene expression, our goal is to evaluate the statistical significance of treatment effects on sets of genes. Genes can be grouped by molecular function, biological process, or cellular component, e.g., by gene ontology (GO) terms. An affected GO group is often easier to interpret than a list of individually significant genes.
Computer simulations demonstrated that correlations among genes invalidate many of the statistical methods commonly used to assign significance to GO terms. Ignoring these correlations overstates the statistical significance of a GO term. Meta-analysis methods for combining p-values were modified to adjust for correlation. One of these methods is elaborated in the context of a comparison between two treatments. The form of the correlation adjustment depends upon the alternative hypothesis.
Reliable corrections for the effect of correlations among genes on the significance level of a GO term can be constructed for an alternative hypothesis where all transcripts in the GO term increase (decrease) in response to treatment. For general alternatives, which allow some transcripts to increase and others to decrease, the bias of naïve significance calculations can be greatly decreased although not eliminated.
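To make the one-sided case concrete, the following is a minimal sketch of one standard way to combine p-values while allowing for correlation: the inverse-normal (Stouffer) combination, with the variance of the summed z-scores taken from a correlation matrix. It is an illustration under stated assumptions and not necessarily the adjustment developed in this article. The function name `combined_p_value`, the example p-values, and the common correlation of 0.5 are hypothetical and chosen only for demonstration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def combined_p_value(p_values, corr=None):
    """Combine one-sided p-values with the inverse-normal (Stouffer) method.

    p_values: one-sided p-values for the genes in a group, each in (0, 1).
    corr: assumed correlation matrix of the per-gene z-scores (list of lists).
          If None, the genes are treated as independent.
    """
    # Convert each p-value to a z-score; small p-values give large positive z.
    z = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    k = len(z)

    # Under the null hypothesis the sum of the z-scores has variance equal to
    # the sum of all entries of the correlation matrix (k if independent).
    if corr is None:
        variance = float(k)
    else:
        variance = sum(corr[i][j] for i in range(k) for j in range(k))

    z_combined = sum(z) / sqrt(variance)
    return 1.0 - _STD_NORMAL.cdf(z_combined)


# Hypothetical example: four genes in one GO term, all shifted the same way.
p = [0.04, 0.03, 0.05, 0.02]
rho = 0.5  # assumed common pairwise correlation, for illustration only
corr = [[1.0 if i == j else rho for j in range(4)] for i in range(4)]

naive = combined_p_value(p)           # ignores correlation
adjusted = combined_p_value(p, corr)  # accounts for correlation
print(f"naive p = {naive:.2e}, adjusted p = {adjusted:.2e}")
```

With positively correlated genes the variance of the summed z-scores exceeds the number of genes, so the adjusted p-value is larger than the naive one. This is consistent with the observation above that ignoring correlation overstates significance. In practice the correlation matrix would have to be estimated from the data, which this sketch does not attempt.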