BMC Bioinformatics

This article is part of the supplement: Asia Pacific Bioinformatics Network (APBioNet) Sixth International Conference on Bioinformatics (InCoB2007)

Open Access Proceedings

Motif-directed network component analysis for regulatory network inference

Chen Wang1, Jianhua Xuan1*, Li Chen1, Po Zhao2, Yue Wang1, Robert Clarke3 and Eric Hoffman2

Author Affiliations

1 Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, USA

2 Research Center for Genetic Medicine, Children's National Medical Center, Washington, DC, USA

3 Departments of Oncology and Physiology & Biophysics, Georgetown University School of Medicine, Washington, DC, USA

BMC Bioinformatics 2008, 9(Suppl 1):S21 doi:10.1186/1471-2105-9-S1-S21

Published: 13 February 2008

Abstract

Background

Network Component Analysis (NCA) has proven effective at discovering regulators and inferring transcription factor activities (TFAs) when both microarray data and ChIP-on-chip data are available. However, an NCA scheme is not applicable to many biological studies because the available topology information is limited, for example when ChIP-on-chip data are lacking. We propose a new approach, motif-directed NCA (mNCA), that integrates motif information and gene expression data to infer regulatory networks.

Results

We develop motif-directed NCA (mNCA) to incorporate motif information into NCA for regulatory network inference. Although motif information is readily available from knowledge databases, it is a "noisy" source of network topology information that contains many false positives. To overcome this problem, we embed a stability analysis procedure in mNCA to resolve inconsistencies between motif information and gene expression data and to enable the identification of stable TFAs. We applied mNCA to a time-course microarray data set of muscle regeneration. The experimental results show that the inferred TFAs are not only numerically stable but also biologically relevant to the muscle differentiation process. In particular, several inferred TFAs, such as those of MyoD, myogenin and YY1, are well supported by biological experiments.
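
The idea of a topology-constrained decomposition followed by a stability check can be illustrated with a minimal sketch. It assumes the standard NCA model, in which a (log) expression matrix E (genes x samples) is approximated by A P, where A (genes x TFs) may be nonzero only where the topology allows a TF-gene link and P (TFs x samples) holds the TFAs. The abstract does not specify the authors' stability procedure, so the perturbation scheme below (randomly dropping a fraction of motif-derived links and scoring each TF by the absolute correlation of its re-estimated profile with a reference profile) is an assumption made for illustration only. The function names `nca_als` and `tfa_stability` and the parameters `drop_frac` and `n_runs` are hypothetical.

```python
import numpy as np


def nca_als(E, mask, n_iter=200, seed=0):
    """Fit E ~ A @ P by alternating least squares, with A restricted to mask.

    E: genes x samples expression matrix.
    mask: genes x TFs array, nonzero where a TF-gene link is allowed
    (for example, a motif hit in the gene's promoter).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_tfs = mask.shape
    A = mask * rng.normal(size=mask.shape)  # random start on allowed links
    P = np.zeros((n_tfs, E.shape[1]))
    for _ in range(n_iter):
        # Update the TFAs with A fixed (ordinary least squares).
        P = np.linalg.lstsq(A, E, rcond=None)[0]
        # Update each gene's row of A using only the TFs its mask allows.
        for g in range(n_genes):
            idx = np.flatnonzero(mask[g])
            if idx.size:
                A[g, idx] = np.linalg.lstsq(P[idx].T, E[g], rcond=None)[0]
    return A, P


def _abs_corr(x, y):
    """Absolute Pearson correlation; 0.0 if either profile is constant."""
    x = x - x.mean()
    y = y - y.mean()
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return 0.0 if denom == 0 else abs(x @ y) / denom


def tfa_stability(E, mask, n_runs=20, drop_frac=0.1, seed=0):
    """Score each TF by how well its TFA survives random link removal.

    ASSUMPTION: this perturbation scheme is illustrative and is not
    taken from the paper. Absolute correlation is used because NCA
    determines each TFA only up to scale and sign.
    """
    rng = np.random.default_rng(seed)
    _, P_ref = nca_als(E, mask, seed=seed)
    n_tfs = mask.shape[1]
    scores = np.zeros((n_runs, n_tfs))
    for r in range(n_runs):
        keep = rng.random(mask.shape) >= drop_frac  # drop ~drop_frac of links
        _, P_r = nca_als(E, mask * keep, seed=seed + r + 1)
        for t in range(n_tfs):
            scores[r, t] = _abs_corr(P_ref[t], P_r[t])
    return scores.mean(axis=0)  # one score in [0, 1] per TF
```

Under these assumptions, a TF whose score stays close to 1 across perturbations would be a candidate "stable" TFA, while a low score would suggest that its motif-derived targets are inconsistent with the expression data.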

Conclusion

A novel computational approach, mNCA, has been developed to integrate motif information and gene expression data for regulatory network reconstruction. Specifically, motif analysis is used to obtain initial network topology, and stability analysis is developed and applied with mNCA to extract stable TFAs. Experimental results on muscle regeneration microarray data have demonstrated that mNCA is a practical and reliable computational method for regulatory network inference and pathway discovery.