Computational identification of transcriptionally co-regulated genes, validation with the four ANT isoform genes
1 INRA, UMR 1019, Unité de Nutrition Humaine, 63122, St Genès-Champanelle, France
2 Université d'Auvergne, Unité de Nutrition Humaine, Clermont Université, BP 10448, 63000, Clermont-Ferrand, France
3 Institut des Neurosciences, Equipe Nanomédecine et Cerveau, Inserm U836, 38700, La Tronche, France
4 Université Joseph Fourier 1, Grenoble, 38041, France
5 Plate-forme Transcriptome et Protéome Cliniques, Institut de Biologie et Pathologie, CHU Grenoble, 38043, Grenoble, France
6 CNRS, 38042, Grenoble, France
BMC Genomics 2012, 13:482 doi:10.1186/1471-2164-13-482
Published: 15 September 2012
The analysis of gene promoters is essential to understanding the mechanisms of transcriptional regulation that operate in response to physiological processes, nutritional intake or pathologies. In higher eukaryotes, transcriptional regulation involves the recruitment of a set of regulatory proteins that bind to combinations of nucleotide motifs. We developed a computational analysis of promoter nucleotide sequences to identify co-regulated genes, combining several programs that allowed us to build regulatory models and perform a cross-analysis of several databases. This strategy was tested on a set of four human genes encoding isoforms 1 to 4 of the mitochondrial ADP/ATP carrier ANT. Each isoform has a specific tissue expression profile linked to its role in cellular bioenergetics.
From the promoter sequences of these ANT genes and their phylogenetic evolution in mammals, we constructed combinations of specific regulatory elements. These models were screened against the full human genome and databases of promoter sequences from human and several other mammalian species. For each of the transcriptionally regulated ANT1, ANT2 and ANT4 genes, a set of co-regulated genes was identified and their over-expression was verified in microarray databases.
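The idea of screening promoters against a combination of regulatory elements can be illustrated with a minimal sketch. It is not the pipeline used in this study: it assumes a model is simply an ordered list of IUPAC consensus motifs that must occur on one strand with a bounded gap between consecutive sites. The function names (`find_sites`, `match_model`, `screen_promoters`), the two motifs (a GC-box and a CCAAT-box consensus) and the toy promoter sequences are hypothetical and chosen only for illustration.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(sequence, motif):
    """Return (start, end) for every occurrence of an IUPAC motif, overlaps included."""
    core = "".join(IUPAC[base] for base in motif.upper())
    # The lookahead lets finditer report overlapping occurrences.
    pattern = re.compile("(?=(" + core + "))")
    return [(m.start(1), m.end(1)) for m in pattern.finditer(sequence.upper())]


def match_model(sequence, motifs, max_gap):
    """Return the start positions of the first ordered match of all motifs, or None.

    Consecutive sites must not overlap and must be separated by at most
    max_gap nucleotides. Only the given strand is searched.
    """
    sites = [find_sites(sequence, motif) for motif in motifs]

    def extend(index, prev_end, chosen):
        if index == len(motifs):
            return chosen
        for start, end in sites[index]:
            if prev_end is not None and not (0 <= start - prev_end <= max_gap):
                continue
            result = extend(index + 1, end, chosen + [start])
            if result is not None:
                return result
        return None

    return extend(0, None, [])


def screen_promoters(promoters, motifs, max_gap):
    """Return the names of promoters that satisfy the model, in input order."""
    return [name for name, seq in promoters.items()
            if match_model(seq, motifs, max_gap) is not None]


# Hypothetical model and toy promoter sequences (assumptions, not data from the study).
MODEL = ["GGGCGG", "CCAAT"]
PROMOTERS = {
    "geneA": "TTGGGCGGAAACCAATTT",           # both motifs, correct order, short gap
    "geneB": "TTCCAATAAAGGGCGGTT",           # both motifs, wrong order
    "geneC": "GGGCGG" + "A" * 30 + "CCAAT",  # correct order, gap too large
}

if __name__ == "__main__":
    print(screen_promoters(PROMOTERS, MODEL, max_gap=10))
```

In this toy setting only `geneA` satisfies the model. A realistic screen would typically score sites with position weight matrices rather than exact consensus strings, search both strands, and restrict the search to a defined window upstream of the transcription start site.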
Most of the identified genes encode proteins with a cellular function and specificity in agreement with those of the corresponding ANT isoform. Our in silico study shows that tissue-specific gene expression is mainly driven by promoter regulatory sequences located up to about a thousand base pairs upstream of the transcription start site. Moreover, this computational strategy for studying regulatory pathways should provide, along with transcriptomics and metabolomics, data to construct cellular metabolic networks.