Research article

Sources of variation in baseline gene expression levels from toxicogenomics study control animals across multiple laboratories

Michael J Boedigheimer (1), Russell D Wolfinger (2), Michael B Bass (1), Pierre R Bushel (3), Jeff W Chou (3), Matthew Cooper (4), J Christopher Corton (5), Jennifer Fostel (3), Susan Hester (5), Janice S Lee (5), Fenglong Liu (6), Jie Liu (7), Hui-Rong Qian (8), John Quackenbush (6, 9), Syril Pettit (10) and Karol L Thompson (11)*

Author Affiliations

1 Amgen Inc., Thousand Oaks, CA 91320, USA

2 SAS Institute Inc., Cary, NC 27513, USA

3 NIEHS, Research Triangle Park, NC 27709, USA

4 Roche Palo Alto LLC, Palo Alto, CA 94304, USA

5 US EPA, Research Triangle Park, NC 27711, USA

6 Dana-Farber Cancer Institute, Boston, MA 02115, USA

7 ICS, NCI at NIEHS, Research Triangle Park, NC 27709, USA

8 Eli Lilly and Co., Indianapolis, IN 46285, USA

9 Harvard School of Public Health, Boston, MA 02115, USA

10 ILSI/HESI, Washington, DC 20005, USA

11 CDER, US FDA, Silver Spring, MD 20993, USA


BMC Genomics 2008, 9:285  doi:10.1186/1471-2164-9-285

Published: 12 June 2008



The use of gene expression profiling in both clinical and laboratory settings would be enhanced by better characterization of variance due to individual, environmental, and technical factors. Meta-analysis of microarray data from untreated or vehicle-treated animals within the control arm of toxicogenomics studies could yield useful information on baseline fluctuations in gene expression, although control animal data have not been available on a scale and in a form well suited to data mining.


A dataset of control animal microarray expression data was assembled by a working group of the Health and Environmental Sciences Institute's Technical Committee on the Application of Genomics in Mechanism Based Risk Assessment to provide a public resource for assessments of variability in baseline gene expression. Data from over 500 Affymetrix microarrays from control rat liver and kidney were collected from 16 different institutions. Thirty-five biological and technical factors describing a wide range of study characteristics were obtained for each animal, and a subset of these factors was evaluated in detail for its contribution to total variability using multivariate statistical and graphical techniques.


The study factors that emerged as key sources of variability included gender, organ section, strain, and fasting state. These and other study factors were identified as key descriptors that should be included in the minimal information about a toxicogenomics study needed for interpretation of results by an independent source. Genes that are the most and least variable, gender-selective, or altered by fasting were also identified and functionally categorized. Better characterization of gene expression variability in control animals will aid in the design of toxicogenomics studies and in the interpretation of their results.