Open Access Research article

Handling multiple testing while interpreting microarrays with the Gene Ontology Database

Michael V Osier1*, Hongyu Zhao2,3 and Kei-Hoi Cheung3,4

Author Affiliations

1 Department of Biological Sciences, Rochester Institute of Technology, 85 Lomb Memorial Drive, Rochester, NY 14623, USA

2 Dept. of Epidemiology and Public Health, Yale University, New Haven, CT 06520, USA

3 Dept. of Genetics, Yale University, New Haven, CT 06520, USA

4 Yale Center for Medical Informatics, 300 George St. Suite 501, New Haven, CT 06511, USA

BMC Bioinformatics 2004, 5:124  doi:10.1186/1471-2105-5-124

Published: 6 September 2004



Multiple research groups are developing software tools that analyze microarray data in the context of genetic knowledgebases, each using different methods. A common problem for many of these tools is how to correct for multiple statistical testing, since simple corrections are overly conservative and more sophisticated corrections are currently impractical. A careful study of the distribution one would expect by chance, for example through simulation, may guide the development of an appropriate correction that is not overly time-consuming to compute.
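To make the "simple correction" concrete, here is a minimal Python sketch of a Bonferroni adjustment. Bonferroni is used only as a familiar example of a simple correction; the text above does not name the specific correction it has in mind, and the function name `bonferroni` is hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjust a list of raw p-values.

    Each p-value is multiplied by the number of tests and capped at 1.0.
    Illustrative only: the article does not say which simple correction it means.
    """
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # With thousands of Gene Ontology terms tested at once, m is large,
    # so even fairly small raw p-values are pushed toward 1.0.
    raw = [0.0001, 0.01, 0.04]
    print(bonferroni(raw))
```

Because the adjustment scales with the number of tests and ignores any dependence between them, it tends to become very strict when many overlapping terms are tested, which is the sense in which such corrections are conservative.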


We present the results of a preliminary study of the distribution one would expect when analyzing sets of genes extracted from the Drosophila, S. cerevisiae, Wormbase, and Gramene databases using the Gene Ontology Database.


We found that the estimated distribution is not regular and is not predictable outside of a particular set of genes. Permutation-based simulations may be necessary to determine the confidence in results of such analyses.
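As an illustration of what a permutation-based simulation could look like, the following Python sketch estimates an adjusted p-value for a gene set by comparing its smallest per-term enrichment p-value against the smallest p-values obtained from random gene sets of the same size. This is a generic minimum-p permutation scheme under stated assumptions, not the authors' procedure: the one-sided hypergeometric test, the function names, and the `annotations` mapping (term to set of annotated genes) are all hypothetical choices made for the example.

```python
import random
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def min_p_value(gene_set, annotations, all_genes):
    """Smallest enrichment p-value over all terms for one gene set."""
    selected = set(gene_set)
    N, n = len(all_genes), len(selected)
    best = 1.0
    for members in annotations.values():
        k = len(selected & members)
        best = min(best, hypergeom_sf(k, N, len(members), n))
    return best


def permutation_adjusted_p(gene_set, annotations, all_genes, n_perm=1000, seed=0):
    """Fraction of random gene sets whose best p-value is at least as small
    as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = min_p_value(gene_set, annotations, all_genes)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_genes, len(gene_set))  # all_genes must be a list
        if min_p_value(sample, annotations, all_genes) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: gene and term identifiers are made up for illustration.
    genes = [f"g{i}" for i in range(50)]
    terms = {"TERM_A": set(genes[:5]), "TERM_B": set(genes[10:30])}
    print(permutation_adjusted_p(genes[:5], terms, genes, n_perm=200))
```

The null distribution here is rebuilt for each gene set and annotation collection, which matches the observation that the chance distribution is not predictable outside of a particular set of genes. The cost is that every analysis requires its own set of permutations.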