This article is part of the supplement: NIPS workshop on New Problems and Methods in Computational Biology

Proceedings

ARACNE: An Algorithm for the Reconstruction of Gene Regulatory Networks in a Mammalian Cellular Context

Adam A Margolin (1,2), Ilya Nemenman (2), Katia Basso (3), Chris Wiggins (2,4), Gustavo Stolovitzky (5), Riccardo Dalla Favera (3) and Andrea Califano (1,2)*

Author affiliations

1 Department of Biomedical Informatics, Columbia University, New York, NY 10032

2 Joint Centers for Systems Biology, Columbia University, New York, NY 10032

3 Institute for Cancer Genetics, Columbia University, New York, NY 10032

4 Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY 10032

5 IBM T.J. Watson Research Center, Yorktown Heights, NY 10598


Citation and License

BMC Bioinformatics 2006, 7(Suppl 1):S7  doi:10.1186/1471-2105-7-S1-S7

Published: 20 March 2006

Abstract

Background

Elucidating gene regulatory networks is crucial for understanding normal cell physiology and complex pathologic phenotypes. Existing computational methods for the genome-wide "reverse engineering" of such networks have been successful only for lower eukaryotes with simple genomes. Here we present ARACNE, a novel algorithm that uses microarray expression profiles and is specifically designed to scale up to the complexity of regulatory networks in mammalian cells, yet is general enough to address a wider range of network deconvolution problems. This method uses an information-theoretic approach to eliminate the majority of indirect interactions inferred by co-expression methods.
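To make the idea of eliminating indirect interactions concrete, the following is a minimal sketch of triplet-based pruning on a matrix of pairwise mutual information (MI) estimates. It assumes the pruning rule is the data processing inequality as commonly described for ARACNE: in every fully connected triplet of genes, the weakest of the three edges is treated as indirect. The function name prune_indirect_edges, the parameters threshold and tolerance, and the toy MI values are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations


def prune_indirect_edges(mi, threshold=0.0, tolerance=0.0):
    """Return the edges (i, j), with i < j, that survive triplet pruning.

    mi        -- symmetric square matrix (list of lists) of pairwise MI estimates
    threshold -- hypothetical cutoff; pairs with MI at or below it are not candidates
    tolerance -- hypothetical slack; 0.0 means the weakest edge is always dropped
                 when it is strictly smaller than the other two
    """
    n = len(mi)

    # Step 1: candidate edges are all gene pairs whose MI exceeds the threshold.
    edges = {(i, j) for i, j in combinations(range(n), 2) if mi[i][j] > threshold}

    # Step 2: examine every triplet whose three edges are all candidates and
    # mark the weakest edge. Edges are only marked here, so the outcome does
    # not depend on the order in which triplets are visited.
    to_remove = set()
    for i, j, k in combinations(range(n), 3):
        triplet = [(i, j), (i, k), (j, k)]
        if not all(edge in edges for edge in triplet):
            continue
        weakest = min(triplet, key=lambda e: mi[e[0]][e[1]])
        second_weakest = min(mi[a][b] for a, b in triplet if (a, b) != weakest)
        if mi[weakest[0]][weakest[1]] < second_weakest * (1.0 - tolerance):
            to_remove.add(weakest)

    return edges - to_remove


# Toy example (made-up values): a chain gene0 -> gene1 -> gene2, where
# gene0 and gene2 are correlated only through gene1.
example_mi = [
    [0.0, 0.8, 0.3],
    [0.8, 0.0, 0.7],
    [0.3, 0.7, 0.0],
]
kept_edges = prune_indirect_edges(example_mi)
print(sorted(kept_edges))
```

In this toy chain the pair (gene0, gene2) has the smallest MI in the triplet, so it is the edge that gets pruned, while a plain co-expression threshold below 0.3 would have kept all three.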

Results

We prove that ARACNE reconstructs the network exactly (asymptotically) if the effect of loops in the network topology is negligible, and we show that the algorithm works well in practice, even in the presence of numerous loops and complex topologies. We assess ARACNE's ability to reconstruct transcriptional regulatory networks using both a realistic synthetic dataset and a microarray dataset from human B cells. On synthetic datasets ARACNE achieves very low error rates and outperforms established methods, such as Relevance Networks and Bayesian Networks. Application to the deconvolution of genetic networks in human B cells demonstrates ARACNE's ability to infer validated transcriptional targets of the cMYC proto-oncogene. We also study the effects of misestimation of mutual information on network reconstruction, and show that algorithms based on mutual information ranking are more resilient to estimation errors.
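One intuition for the claim about resilience of ranking-based methods can be sketched as follows. The example assumes the estimation error is a systematic, monotone distortion of the true MI values (a simplification chosen for illustration; real estimation errors also include noise). Under that assumption the ordering of gene pairs is unchanged, whereas a selection based on a fixed absolute cutoff is not. All names and numbers below are hypothetical.

```python
import math


def rank_pairs(mi_values):
    """Return gene pairs ordered from highest to lowest MI."""
    return sorted(mi_values, key=mi_values.get, reverse=True)


def pairs_above(mi_values, cutoff):
    """Return the set of gene pairs whose MI exceeds a fixed cutoff."""
    return {pair for pair, value in mi_values.items() if value > cutoff}


# Made-up "true" MI values for three gene pairs.
true_mi = {("A", "B"): 0.80, ("B", "C"): 0.70, ("A", "C"): 0.30}

# Hypothetical biased estimator: a monotone distortion of the true values.
biased_mi = {pair: 0.5 * math.sqrt(value) for pair, value in true_mi.items()}

# The ranking of pairs is identical under the distortion...
same_ranking = rank_pairs(true_mi) == rank_pairs(biased_mi)

# ...but a fixed cutoff of 0.5 selects different pairs.
selected_true = pairs_above(true_mi, 0.5)
selected_biased = pairs_above(biased_mi, 0.5)

print(same_ranking, selected_true, selected_biased)
```

Any decision that depends only on which of several MI values is smallest, such as pruning the weakest edge in a triplet, would therefore be unaffected by this kind of distortion.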

Conclusion

ARACNE shows promise in identifying direct transcriptional interactions in mammalian cellular networks, a problem that has challenged existing reverse engineering algorithms. This approach should enhance our ability to use microarray data to elucidate functional mechanisms that underlie cellular processes and to identify molecular targets of pharmacological compounds in mammalian cellular networks.