This article is part of the supplement: Proceedings of the Sixth Annual MCBIOS Conference. Transformational Bioinformatics: Delivering Value from Genomes

Open Access Proceedings

Threshold selection in gene co-expression networks using spectral graph theory techniques

Andy D Perkins1* and Michael A Langston2

Author Affiliations

1 Department of Computer Science and Engineering, Mississippi State University, Mississippi State, MS, USA

2 Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, USA

BMC Bioinformatics 2009, 10(Suppl 11):S4  doi:10.1186/1471-2105-10-S11-S4

Published: 8 October 2009

Abstract

Background

Gene co-expression networks are often constructed by computing some measure of similarity between the expression levels of gene transcripts and then applying a high-pass filter that removes all but the relationships most likely to be biologically significant. The choice of this threshold necessarily has a significant effect on any conclusions drawn from the resulting network. Many approaches have been taken to choose an appropriate threshold, among them computing levels of statistical significance, accepting only the top one percent of relationships, and selecting an arbitrary cutoff.
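As a rough illustration of the construction described above, the following sketch builds a thresholded network from an expression matrix. It assumes the absolute Pearson correlation as the similarity measure, which the abstract does not specify, and the function names `coexpression_adjacency` and `top_fraction_threshold` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def coexpression_adjacency(expr, threshold):
    """Return a boolean adjacency matrix for a genes-by-samples array.

    Assumption: similarity is the absolute Pearson correlation between
    the expression profiles (rows) of two genes.
    """
    corr = np.corrcoef(expr)            # gene-by-gene Pearson correlation
    adj = np.abs(corr) >= threshold     # high-pass filter on similarity
    np.fill_diagonal(adj, False)        # drop self-relationships
    return adj


def top_fraction_threshold(expr, fraction=0.01):
    """Cutoff that keeps roughly the top `fraction` of gene pairs."""
    corr = np.abs(np.corrcoef(expr))
    upper = np.triu_indices_from(corr, k=1)   # each gene pair counted once
    return np.quantile(corr[upper], 1.0 - fraction)


# Toy data: four genes measured in four samples (made-up values).
expr = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, -1.0, 1.0, -1.0],
])
adj = coexpression_adjacency(expr, threshold=0.9)
```

In this toy data the first three genes are perfectly correlated or anti-correlated, so they stay connected at a cutoff of 0.9, while the fourth gene is filtered out. Lowering the cutoff would admit it, which is why the choice of threshold shapes the network.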

Results

We apply spectral graph theory methods to develop a systematic method for threshold selection. Eigenvalues and eigenvectors are computed for a transformation of the adjacency matrix of the network constructed at various threshold values. From these, we use a basic spectral clustering method to examine the set of gene-gene relationships and select a threshold dependent upon the community structure of the data. This approach is applied to two well-studied microarray data sets from Homo sapiens and Saccharomyces cerevisiae.
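A minimal sketch of a basic spectral clustering step may help make this concrete. It assumes the transformation of the adjacency matrix is the combinatorial graph Laplacian L = D - A and splits the nodes by the sign of the Fiedler vector (the eigenvector of the second-smallest eigenvalue). The abstract names neither choice, so both are assumptions, and `fiedler_partition` is a hypothetical name used only here.

```python
import numpy as np


def fiedler_partition(adj):
    """Split a graph in two using the Laplacian's Fiedler vector.

    Assumption: the adjacency matrix is symmetric and the spectral
    transformation is the combinatorial Laplacian L = D - A.
    Returns (second-smallest eigenvalue, boolean cluster labels).
    """
    adj = np.asarray(adj, dtype=float)
    degree = adj.sum(axis=1)
    laplacian = np.diag(degree) - adj
    # eigh handles symmetric matrices and returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    fiedler = eigvecs[:, 1]
    return eigvals[1], fiedler >= 0


# Toy network: two triangles (nodes 0-2 and 3-5) joined by one edge (2-3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

connectivity, labels = fiedler_partition(adj)
```

On the toy network the sign pattern separates the two triangles. One could call such a function on the network built at each candidate threshold and compare the resulting spectra and clusters, though the specific selection criterion used in the paper is not reproduced here.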

Conclusion

This method offers a systematic, data-based alternative to more artificial cutoff values. It yields a more conservative threshold than some other popular techniques, such as retaining only statistically significant relationships or setting a cutoff to include a fixed percentage of the highest correlations.