Regulon organization of Arabidopsis
1 CRS4 Bioinformatics Laboratory, Parco Scientifico e Technologico POLARIS, 09010 Pula (CA), Italy
2 Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50011, USA
BMC Plant Biology 2008, 8:99 doi:10.1186/1471-2229-8-99Published: 30 September 2008
Despite the mounting research on Arabidopsis transcriptome and the powerful tools to explore biology of this model plant, the organization of expression of Arabidopsis genome is only partially understood. Here, we create a coexpression network from a 22,746 Affymetrix probes dataset derived from 963 microarray chips that query the transcriptome in response to a wide variety of environmentally, genetically, and developmentally induced perturbations.
Markov chain graph clustering of the coexpression network delineates 998 regulons ranging from one to 1623 genes in size. To assess the significance of the clustering results, the statistical over-representation of GO terms is averaged over this set of regulons and compared to the analogous values for 100 randomly-generated sets of clusters. The set of regulons derived from the experimental data scores significantly better than any of the randomly-generated sets. Most regulons correspond to identifiable biological processes and include a combination of genes encoding related developmental, metabolic pathway, and regulatory functions. In addition, nearly 3000 genes of unknown molecular function or process are assigned to a regulon. Only five regulons contain plastomic genes; four of these are exclusively plastomic. In contrast, expression of the mitochondrial genome is highly integrated with that of nuclear genes; each of the seven regulons containing mitochondrial genes also incorporates nuclear genes. The network of regulons reveals a higher-level organization, with dense local neighborhoods articulated for photosynthetic function, genetic information processing, and stress response.
This analysis creates a framework for generation of experimentally testable hypotheses, gives insight into the concerted functions of Arabidopsis at the transcript level, and provides a test bed for comparative systems analysis.