Abstract
Background
Networks are ubiquitous in modern cell biology and physiology. A large literature exists for inferring/proposing biological pathways/networks using statistical or machine learning algorithms. Despite these advances a formal testing procedure for analyzing networklevel observations is in need of further development. Comparing the behaviour of a pharmacologically altered pathway to its canonical form is an example of a salient onesample comparison. Locating which pathways differentiate disease from nodisease phenotype may be recast as a twosample network inference problem.
Results
We outline an inferential method for performing one and twosample hypothesis tests where the sampling unit is a network and the hypotheses are stated via network model(s). We propose a dissimilarity measure that incorporates nearby neighbour information to contrast one or more networks in a statistical test. We demonstrate and explore the utility of our approach with both simulated and microarray data; random graphs and weighted (partial) correlation networks are used to form network models. Using both a wellknown diabetes dataset and an ovarian cancer dataset, the methods outlined here could better elucidate coregulation changes for one or more pathways between two clinically relevant phenotypes.
Conclusions
Formal hypothesis tests for gene or proteinbased networks are a logical progression from existing genebased and geneset tests for differential expression. Commensurate with the growing appreciation and development of systems biology, the dissimilaritybased testing methods presented here may allow us to improve our understanding of pathways and other complex regulatory systems. The benefit of our method was illustrated under select scenarios.
Background
Networks, a construct emphasizing the interrelations between objects, have application in studying human behaviour, mathematics, physics, econometrics, etc. With the introduction of microarrays and other highthroughput systems, networks increasingly provide a means to organize and study the interdependencies of genes, proteins, metabolites, etc. Gene transcription/regulatory networks, metabolic pathways, proteinprotein interaction systems (PPIs), signal transduction pathways, and phylogenetic trees are established tools in biology. Pronetonoise experiments are routinely coupled with computational algorithms to infer relationships. Given the empirical reliance on uncertain data it seems natural to ask, “Do these networks differ from one another?” This paper outlines and demonstrates an inferential process for performing one and twosample hypothesis tests when the sample data are biological networks.
Numerous books are devoted to networks. Bornholdt et al.[1] and Junker et al. [2] provide a broad introduction to networks with application to biology, e.g., correlation profiles and motifs, network behaviour in nematode development, etc. EmmertStreib et al. [3] is a collection devoted to inferring microarraybased networks. Kolaczyk [4] appears to be the first statistics text devoted solely to networks. Brandes et al. [5] provide an overview of analysis methods from a computer science perspective.
Testing graphs is not trivial; comparing two static graphs is conceptually different from comparing a sample of stochastic graphs under one or more treatments. Defining a suitable null probability model for a network is a difficult consideration [6]. ErdősRényi random graphs, where the probability of an edge between any two vertices is a fixed constant p, play an important conceptual role in our understanding of graphs [7]. The use of these graphs in forming statistical tests has received some criticism [6]. Chung et al. [8] explore a complex hybridgraph model to mimic observed smallworld networks. Simple models, e.g., random, scalefree, or smallworld graphs, etc., although useful for comparing features across a class of networks, may have less utility for weighted biological networks. Schwöbbermeyer [9], in discussing network motifs, makes a troubling comment regarding the formation of actual biological networks, “A single network generation mechanism may not be sufficient to resemble the structure of these networks.” This difficulty in defining a suitably rich or ‘realistic’ parametric model poses a challenge in forming a hypothesis testing procedure for network parameters. Exponential random graph models (ERGMs), referred to as p* models in social network analysis, offer a theoretical model for stochastic networks and have been used in biological applications [10]. ERGM parameterizations can contain attractive motiflike structures. But, these models do not typically assume that the nodes of a graph are aligned and suffer from model degeneracy concerns [4]. Motif frequencies have been used to compare biological networks via a statistical test [9]. Wiuf et al. [11] outline a fulllikelihood probability model approach to estimating the parameters of a C. elegans protein interaction growth model. Cardelli [12] suggests that qualitative models provide more insight than quantitative models due to parameter estimation and criticality concerns. In contrast, Steinhauser et al. [13] argue and demonstrate that correlation networks, a form of weighted network, provide more understanding of cellular systems relative to qualitative edge/noedge models or nodebased cell inventory quantitative models. Unlike traditional parametercentric testing procedures, this dichotomy suggests that a testing procedure might benefit from distinguishing structural (edge) distinctions from weight (quantitative) distinctions. The methods presented here will accommodate this need for a flexible model.
Approaches for both two and onesample comparisons occur in the literature. Faust et al. [14] use a combination of p* models and correspondence analysis to compare networks. Banks et al. [15], using a symmetric difference based on a Hamming distance, outline an estimation/hypothesis testing approach for labeled unweighted loopfree graphs. Sanil et al. [16] extend [15] to networks whose edge set evolves in time. Kahlem et al. [17], in a reverse engineering application, suggest three approaches (experimentally testable perturbations, training/validation test sets, using synthetic system data) for comparing two networks that conceptually differs from the approach pursued here. Stolovitzky et al. [18] propose ROC and precisionrecall curves as the methodofchoice for validating inferred models, a motivating example for a onesample comparison. Chen et al. [19] use an additive elementwise score to compare a gene regulatory network estimate to a known network, a parallel to the onesample procedure developed here.
While our focus here is on forming statistical comparisons of network model parameters, computer scientists can divide the graph comparison problem into two areas – exact graph matching and graph similarity [5]. Graph similarity, tailored to deal with errors in the network data, has been addressed with three broad strategies: compare the difference of path lengths using all pairs of vertices, locate the maximal common subgraph between the two graphs, or use an edit distance. Edit distance, which motivated and resembles the dissimilarity index proposed here, has been used in string matching applications and uses graph operations, e.g., node/edge insertions/deletions, to transform one graph into a second graph [5]. Insertions/deletions can be implemented using set operators. XulviBrunet et al. [20] propose a bootstrapped degree of similarity via union/intersection operations. Gill et al. [21] combine a partial least squaresbased connectivity score with an intersection/union measure to test for differential modular structures via permutation. Expanding beyond strings to motif or neighbourhoodlike objects has demonstrated biological benefit. Li et al. [22] extend a nodebased topological overlap dissimilarity to assist in defining relevant gene neighbourhoods. Chen et al. [19] capture a study where the statistical accuracy of protein function prediction was improved by incorporating information beyond the protein’s immediate neighbours in the network. Such precedent helped motivate the definition of our dissimilarity measure. Graph matching is typically limited to unlabeled graphs. We assume here that the graphs are labeled, an assumption critical for computational and interpretation reasons.
The use and analysis of weighted (correlationbased) networks is increasingly relevant in biological applications. Zhang et al. [23], in a close parallel to pure correlationbased networks, form weighted gene coexpression networks. Langfelder et al. [24], as part of an R package for analyzing weighted correlation networks, provide several measures for comparing network topologies. Here, we utilize network models based on two forms of correlation matrices. Anderson [25] contains largesample statistical tests for (partial) correlation coefficients, canonical correlations, and various tests for covariance matrices. Anderson [26] uses a distancebased dissimilarity measure combined with a permutation procedure to compare dispersion matrices.
Resampling methods are common in network analysis, e.g., see [27], and will be used extensively here. Their use in validating network models is less common. Perkins [28] uses crossvalidation and resampling methods for validating a gap gene development model. Toh et al. [29] use bootstrap samples to assess edge reliability in a partial correlation network. EmmertStreib [30] combines a permutationbased procedure with a graphedit distance to compare disease pathways. Xiong et al.[31] use a permutation procedure to test the largest elementwise difference in a matrix of genetic network parameter estimates.
Hypothesis tests for networks have a variety of obvious applications. A onesample comparison can occur when comparing a network estimate to a known ‘gold standard’ model. The standard could reside in an online ontology, be defined via a data processing algorithm (e.g., RMA normalization), or reflect the starting t_{0} state of a signal transduction network. Algorithms that derive networks using in silico or in vitro/vivo data could be tested for the comparability of their network model estimates in either a one or twosample context. For obvious reasons, twosample comparisons have a broad application range. Twosample tests are conducted by drug developers to compare pathways between an investigative compound at various doses or to a competitor’s compound. We conduct one and twosample tests using both simulated and microarraybased network data. To evaluate both Type I error control and the power of our procedure, we examine both null and nonnull cases under a small set of network models. We assume that each sampling unit is an independent realization of a network. Our testing approach follows the traditional hypothesis testing route: define the null and alternate hypotheses (e.g., η=η_{0} versus η>η_{0} or η_{1}=η_{2} versus η_{1}≠η_{2}) and the risk associated with a decision; define a ‘nearby neighbour’ dissimilaritybased test statistic for testing the hypothesis; compute the test statistic; compute the distribution of the test statistic under the null hypothesis; and, make a decision using the sampling distribution and the calculated test statistic. Our approach applies to edge/noedge graphs, weighted graphs, and can be extended to directional networks. In the onesample hypothesis testing context a null network model is assumed or based on resamples from an a priori null sample to generate the sampling distribution of our test statistic under the null hypothesis. To implement a twosample comparison we rely on the permutation testing principle as the formal basis of our inferential approach [32].
Methods
Differential testing via dissimilarity
We consider an observed network as a realization of a stochastic process. A sample {x_{i}, i = 1, …, n} of independent observations is therefore a set of networks. In contrast to social networks, where the sampling unit is often the node, we assume that the biological network is inherent to each sample observation. We propose a dissimilarity measure as a test statistic to capture the separation between two networks. We assume that we can align the nodes across the networks to be compared. Since each node in the network represents a gene or other molecular entity, the ability to align nodes across a set of networks implies that we know a unique identifier for each node to allow for a moleculebymolecule comparison across a set of networks. We determine the sampling distribution of the test statistic under the null hypothesis via resampling techniques.
First, we begin with some definitions. Our definitions are consistent with Bollobás [7]. A graph G is an ordered pair of disjoint sets (V,E) where both V and E are finite sets. V = V(G) is the set of vertices and E = E(G) is the set of edges. An edge {x,y} is said to join, or tie, the vertices x and y and is denoted xy. If xy ∈ E(G) then x and y are adjacent, or neighbouring, vertices of G and the vertices x and y are incident with the edge xy. Two edges are adjacent if they have exactly one common endvertex. G’ = (V’,E’) is a subgraph of G = (V,E) if V’ ⊂ V and E’ ⊂ E. If x is a vertex of G we write x ∈ G in place of x ∈ V(G). The order of G is the number of vertices in G; the size of G is the number of edges of G. G(n,m) denotes an arbitrary graph of order n and size m. A graph of order n and size
is a complete ngraph. A covariance matrix of nonzero elements with dimension n, Σ_{n}, is a complete ngraph. The set of vertices adjacent to x ∈ G, the neighbourhood of x, is Γ(x). For adjacent vertices x and y we have x ∈ Г(y) and y ∈ Г(x). The degree of x is Γ(x). A vertex of degree 0 is an isolated vertex (or isolate). A path is a graph P where V(P) = {x_{0}, x_{1}, …, x_{l}} and E(P) = {x_{0}x_{1}, x_{1}x_{2}, …, x_{l1}x_{l}}. The length of P is the size of P. A graph without any cycles, a path with length greater than or equal to three and comprised of distinct vertices, is an acylic graph. Unlike trees, we allow for cyclic graphs with isolates. Paths have an obvious tie to motifs and other regulatory functions. By definition, a loop (xx ∈ E(G)) is not allowed; multiple edges joining the same two vertices are not allowed. A graph G can contain a subgraph G’ that is not connected to the remainder of G. Isolated nodes and subgraphs occur in biological networks. Accommodating isolates is necessary to align nodes between two graphs.It is common to represent a graph G in matrix form. The adjacency matrix A = A(G) = (a_{ij}) of a graph G is the nxn matrix given by
To extend the definition to a weighted graph replace 1 with w_{ij}, where w_{ij} is the strength, covariance, etc., between vertices v_{i} and v_{j}. Given nxn network matrices G = (g_{ij}) and H = (h_{ij}) we define GH in the standard algebraic sense, i.e., g_{ij}h_{ij}. We do not require a matrix be square; some directed network forms are nxm matrices. But, our use of elementwise subtraction is key; we are not suggesting a definition based on subspaces/subgraphs or set complements. We need to map an nxn network onto the real line in order to define a measure of separation. Under this definition of matrix subtraction, GH = 0, where 0 is an appropriately dimensioned matrix of zeros, implies no separation between two networks.The concept of dissimilarity (or similarity) is standard fare, especially in cluster analysis and pattern recognition. The dissimilarity measure d^{rs} between r and s satisfies the following: d^{rs}≥ 0 for every r, s, d^{rr}= 0 for every r, and d^{rs}= d^{sr} for every r, s. Refer to Gan et al. [33] for an excellent catalogue of measures. Dissimilarity measures for categorical data x and y are generally based on a simple matching distance,
. The wellknown Hamming distance [34] is a symmetrical form of a simple matching distance for binary strings and is used in communication theory. To craft our dissimilarity measure we propose a modified elementwise version of a matrix norm. Elementwise measures do not account for the interrelations present between nodes. Similar to linkage measures in genetics (e.g., LOD score), where markers are often correlated, we desire a measure that incorporates these interrelationships. Since networks are defined using interrelationships the inclusion of information for both a node and its neighbours is an intuitive concept.Let W^{O}= (w_{ij}^{O}) be a (weighted) adjacency (or directed incidence) matrix for the observed network estimate and W^{T}= (w_{ij}^{T}) the target network. In a onesample comparison W^{T} represents the true network model. For a twosample comparison the distinction between the two labels is arbitrary. Both W^{O} and W^{T} represent graphs of order n; the nodes are labeled, common to, and aligned between both graphs. For node i define the dissimilarity at that node to be a combination of that node’s dissimilarity,
, and the dissimilarity for node i’s neighbours, , for nodes j ≠ i, j ∈ Г(i). For the overall network, the dissimilarity D is defined as, where for weighted networks and specified by the researcher for unweighted networks. I is defined as the standard mathematical indicator function. For a graph of order n a set/neighbourhood is formed at each node w_{ii}, i = 1,…,n. We measure the dissimilarity between a node and its adjacent neighbours between the two graphs. To account for the intrinsic network structure the neighbourhood is then extended to those neighbouring nodes that are incident to nodes in Γ(w_{ii}) for both the target and observed networks. I.e., we now measure the dissimilarity for the two subgraphs induced by Γ(w_{jj}), where i ≠ j and w_{jj} is an element of Γ(w_{ii}). This ‘extended’ neighbourhood dissimilarity is added to the dissimilarity measured at w_{ii}. The contribution of the second nearest neighbours is weighted by c_{ij} in the definition of D. In a weighted network, e.g., a correlation network, this weight is easily motivated. In an unweighted network c_{ij} is set by the researcher. Assuming a weight of c_{ij}= 0 for an unweighted 0–1 graph reduces D to an unscaled version of the familiar Hamming distance and is comparable to the edit distance approach listed in [30].
We form D using separate edge and weight L_{1}norms. We elaborate on this choice later. The need to align the two graphs is critical to calculating D. Our approach does result in additional computational overhead since edge xy will be counted for nodes x and y. But, the counting is consistent and avoids the need for complex singlecount network partitioning schemes. Only those nodes with a path length of 2 or less from w_{ii} are included; D can easily be extended to include path lengths greater than 2.
One and twosample differential network comparisons
Defining an appropriate hypothesis in the context of networks can be nontrivial. For an ErdősRényi graph of order n, G(n,p), the obvious parameter to test is p. Apart from ERGMs and (partial) correlation networks, explicit network parametric models may not be readily apparent. The basic form of a onesample network hypothesis test here is H_{0}: η = η_{0} versus H_{1}: η ≠ η_{0}, where η is anticipated to be a vectorvalued parameter for most realistic (weighted) biological networks. For a G(n,p) graph we have η = p and one could test H_{0}:G(n,p) = G(n,p_{0}) versus H_{1}:G(n,p) ≠ G(n,p_{0}). We intentionally provide no explicit guidance for how to determine p. And, we make explicit the probability model for the graph rather than state the hypothesis in a more compact manner, i.e., H_{0}:p = p_{0}. For a network defined using a correlation matrix Ω we construct hypotheses of the form H_{0}:Ω = Ω_{0} versus H_{1}:Ω ≠ Ω_{0}. One needs to make explicit the procedure used to establish the edges in the network and the probability model for the observation data. In an example we test H_{0}:G(n,p) = G(n,p_{0}) versus H_{1}:G(n,p) > G(n,p_{0}). Here, one needs to recognize that the amount of randomness/entropy for a G(n,p) graph is less when p is close to 0 or 1 relative to p close to 0.5 to successfully perform a onesided test. Primarily, we are interested in the question, “Does this differ from the target?”
We employ a resampling approach to perform one and twosample network hypothesis tests. Following the fivestep procedure outlined in Good [35] we first analyze the problem. We identify the null and alternate hypotheses under an assumed probability model and choose a suitable Type I error rate. Second, we select a test statistic to test the hypothesis. Here, D may be suitable in its stock form or require customization for the problem at hand. Third, compute the test statistic. Fourth, determine the distribution of the test statistic under the null hypothesis via a suitable resampling procedure. Finally, make a decision using the sampling distribution of the test statistics as a guide.
To generate a null distribution for D in the onesample case we assume a parametric model or explicit generative algorithm is available in order to draw samples from the null network model. In most customary testing situations a test statistic is an estimate for a parameter of interest. However, in some cases the sample itself is the statistic – concise reductions of the data may be limited or not obvious/possible. Biological networks, where each edge or weight may be associated with transcription activity or a regulatory cascade, are inherently highdimensional objects and may therefore lack parsimonious model parameterizations. In many applications a network algorithm F is required to produce an observed network. In these instances we may need to resample from F(x) instead of resampling from the observed data {x_{i}}. In other cases, e.g., the ErdősRényi G(n,p) graph, the role of the x_{i} may be suppressed or not apparent since we observe F(x). For networks based on explicit probability models, e.g., (partial) correlation networks, parametric bootstraps or Monte Carlo procedures may be possible under suitable assumptions [35].
Although a onesample comparison has application for biomedical researchers, relative comparisons are of broad practical relevance. Research clinicians and pharmacologists are interested in exploring standardofcare and new treatment comparisons for therapeutics. Transitioning to the H_{0}: η_{1}= η_{2} versus H_{1}: η_{1}≠ η_{2} twosample problem allows one to draw upon established parametric and nonparametric comparisons. A null hypothesis of network equality versus an alternate hypothesis of network inequality is expected to be commonplace. Alternate hypotheses such as H_{1}: η_{1}> η_{2} are possible but not explored here. We make the standard assumption of two independent and identically distributed samples {x_{1}, …, x_{n}} and {y_{1}, …, y_{m}}, where x_{i} and y_{j} are networkvalued. Following the notation of [35], let P be a family of distributions for {X_{1}, …, X_{n}} that are symmetric in the sense that for a permutation π of the subscripts {1, …, n} we have P{( X_{1}, …, X_{n}) ∈ B} = P{( X_{π(1)}, …, X_{π(n)}) ∈ B} for all Borel sets B. The random variables X_{i} are said to be exchangeable – a condition established under the assumption of independent and identically distributed samples or via the principle of randomization/random allocation in experimental design. As noted in Good, permutation tests rely on the assumption of exchangeability under the null hypothesis.
Permutation testing, a procedure which relies on samples drawn from the combined pool of experimental units and the random assignment of a treatment label to each unit, is common in the bioinformatics literature due to the prevalence of ‘n<p’ wide data and the lack of closedform sampling distributions for various test statistics proposed. Pesarin [32] lists a variety of settings where these conditional inference procedures are useful. Some of the items listed that may apply to biological networks are: the distributional models for the responses are nonparametric, distributional models are not wellspecified or may rely on too many nuisance parameters, the asymptotic null sampling distribution is unknown or depends on unknown quantities, or the sample size is less than the number of responses. To continue, these procedures might prove useful for multivariate problems where some variables are categorical (e.g., edge) and others quantitative (e.g., weight), in select multivariate inference problems where the component variables have different degrees of importance (e.g., edges discrepancies may be more severe than weight differences), and when treatment effects impact more than one aspect of the network. Applying the permutation testing principle, as stated in [32], to the twosample network comparison problem via the customary mechanics serves as the inferential foundation for our twosample testing strategy.
Computer simulation
We first demonstrate D using an ErdősRényi G(n,p) graph. We test H_{0}:G(n,p) = G(n,p_{0}) versus H_{1}:G(n,p) > G(n,p_{0}). The order of G is 25 and p_{0} is 0.20. We simulate four cases with 100 hypothesis tests (or experiments) in each case. For the null case we assume that the observed network is a p = p_{0} = 0.20 model. For the remaining two cases we assume that p = 0.25. We set c_{ij} = 0 for both a null and an alternate case and c_{ij} = exp(−2) for the remaining two cases. These four cases illustrate the Type I and II error rates of our procedure both with and without the inclusion of the neighbouring information in calculating D. 1,000 resamples were used to estimate the null distribution for D. The execution time for the set of 100 experiments using 1,000 resamples was approximately 1 hour on a standard personal computer. The R package Statnet [36] was used to generate the G(n,p) graphs. All of our computations were conducted using R (http://www.rproject.org/ webcite). For testing large networks or large numbers of networks a more computationally efficient language such as Fortran or C++ is recommended. The representation diversity and size of networks, combined with a need or interest to tailor D for a particular application, poses a challenge for software developers.
To evaluate D for a correlationbased network model we assume that the pdimensional observations follow a multivariate normal distribution, N_{p}(μ,Σ), with mean vector μ and positive definite covariance matrix Σ. Transforming Σ into correlation form Ω allows us to form a partial correlation matrix. Applying a threshold to each ρ estimate or a testing procedure to the entries of Ω can be used to define a correlation network. Given Ω^{1}= (ω_{ij}) we compute the partial correlation matrix Π = (π_{ij}) via (π_{ij}) = −ω_{ij}/√(ω_{ii}ω_{jj}). Under multivariate normality two variables are conditionally independent given the remaining variables if and only if the partial correlation vanishes. The zeros in Ω^{1} determine the conditional independence graph. As before, a threshold or testing procedure is used to define the partial correlation network. We evaluate both the Type I error rate and the power of D under an alternate hypothesis.
Apart from the ErdősRényi example, the one and twosample simulation comparisons assume that the observations are in their correlation form, i.e., x_{i}~ N(0, Ω). To mimic a sparse biological network Ω consists of 6 nonzero 5x5 blocks along the diagonal. A sample rejection scheme guaranteed that the magnitude of each block entry exceeded a predefined threshold. The number of nodes (30) and the threshold (ρ = 0.2) was common to all simulations. The same threshold ρ was used to define the data generation model and to estimate the observed correlation network in the onesample case. For both comparisons we evaluated D using sample sizes of n_{1}= n_{2} = 200. For the onesample comparison the target correlation network Ω_{0} is estimated from the sample data and the resamples are drawn from the observed samples to determine the null distribution for D. Admittedly, this approach violates the true spirit of a onesample test under an assumed null model. But, if historical sample observations are available for analysis and reflect the null hypothesis then these samples might provide the most scientifically defensible null distribution for Ω_{0}. The same approach is used for the derived microarraybased correlation network comparison; a case where the true null model is unknown. To simulate the alternate hypothesis of network inequality at least 10% of the 5 × 5 blocks for a 30 × 30 correlation matrix, with a minimum of one block per experiment, were varied using a random number generator. A total of 100 experiments were performed and 1,000 resamples used in calculating each pvalue. For both the correlation and partial correlation networks we calculate D using only the weight portion of the index since the existence of an edge was defined via the (partial) correlation estimate. The execution time for the 100 experiments using 1,000 resamples was approximately 1–2 hours on a standard personal computer for the onesample comparison. The twosample setting execution time was on the order of 4–6 hours. For select simulation data the algorithm used to estimate the partial correlation network, see below, would abruptly terminate. All of the resample pvalues shown here were obtained upon a successful completion of the estimation process. The computation time for the actual biological data was negligible. This is likely due to the small networks involved; network comparisons involving a large number of nodes and/or edges present a nontrivial computational burden. Refer to Additional file 1 for the R code used.
Format: DOC Size: 92KB Download file
This file can be viewed with: Microsoft Word Viewer
We selected partial correlation networks as presented in [37] (commonly referred to as Gaussian graphical models, or GGMs, for multivariate normal observations) for the twosample comparisons due to their use in the literature, e.g., De la Fuente et al. [38] use partial correlations up to order 2 to model genomic data, and partial correlations are formed using a plurality of variables – an intuitive appeal for defining a network. Markowetz et al. [39] suggest that partial correlations may better reflect interdependencies in a network relative to a standard correlation coefficient. We fit these networks using the GeneNet algorithm presented in OpgenRhein et al. [40], an extension of the algorithm from Schäfer et al. [37]. The GeneNet R package is available from the CRAN R archive (http://cran.rproject.org webcite). The algorithm broadly consists of 3 steps. First, the correlation (or covariance) matrix Ω is converted to a partial correlation matrix Π; pseudoinverses may be involved. The second step tests for the presence of ‘significant’ edges. The final step extracts the significant edges based on a userdefined criterion. We used the magnitude of the estimated (shrunken) partial coefficient to establish this cutoff – the cutoff.ggm parameter for π_{ij} was set to 0.5. Default settings were used for the remainder of the GeneNet settings. We do not claim that GGMs are superior for modeling gene/protein networks. [37] demonstrate via simulation that the quality of various partial correlation estimators can vary according to the sample size and dimensionality of the observed data. [39] document the need for larger samples in the practical use of GGMs.
Type II diabetes mellitus data: onesample comparison
Type II diabetes mellitus (DM2) is a serious metabolic disorder affecting a large number of people worldwide. Mootha et al. [41], using microarray transcriptional profiles obtained from 17 normal and 17 DM2 muscle biopsy samples, presented a study in conjunction with the Gene Set Enrichment Analysis (GSEA) tool to detect differential expression patterns among functionallyrelated gene sets. They identified a single gene set, OXPHOS – an oxidative phosphorylation pathway, which exhibited differential gene expression levels between the two phenotypes and linked this gene set to clinically important variation in human metabolism. Mootha et al. analyzed 149 gene sets – 113 were selected based on their involvement in metabolic pathways and the remainder based on coregulated gene clusters from a mouse expression atlas. The transcript expression data and the gene set definitions from the original GSEA study were obtained from the authors’ website [42].
The original study examined average expression levels across the two groups. Here, we explore the differences in covariance or correlation structures between the two phenotypes as expressed in a correlation network formed via a threshold for ρ. To facilitate a onesample comparison the normal tissue samples are used to define a target network. The resampling procedure was outlined in the previous section. Rather than analyze severely illconditioned correlation matrices (numerous gene sets contained over 100 genes), we formed correlation networks only for those gene sets with fewer than 18 probes in the pathway. A significant onesample finding for one or more gene sets may suggest differences in coregulation network behaviour of a person with DM2 relative to a normal subject.
Ovarian cancer: twosample comparison
Ovarian cancer is the foremost lethal neoplasm of the female genital tract. We examine here the gene expression signatures of ovarian serous carcinomas (SCAs) relative to serous borderline tumors (SBTs) based on three recent studies. Sieben et al. [43] confirmed the activated role of a mitogenic pathway in SBTs and uncovered downstream genes that helped differentiate SBTs and SCAs. De Meyer et al. [44] investigated the role of the E2F/Rb pathway in SBTs and SCAs. Chien et al. [45] demonstrated the significance of the p53 and E2F pathways in serous carcinomas and reinforced the role of E2Fs listed in [44]. These three studies demonstrate that differential expression patterns exist between SBTs and SCAs. We examine a subset of these data to test whether or not covariation patterns differ between SBTs and SCA1s (lowgrade carcinomas) and between SCA1s and SCA3s (highgrade carcinomas). Changes in covariation patterns may assist in the design of followup studies, suggest a novel biomarker test or a better categorization of SCAs, or provide insight into a patient’s responsiveness to chemotherapy agents.
The microarray data analyzed here was obtained from the NCBI GEO database [46] via accession number GSE12471. These data were originally presented in [43] and contain the mRNA expression profiles of 11 SBT, 10 SCA1, and 15 SCA3 samples. 2 micropapillary pattern SBT samples were omitted from our analysis. De Meyer et al. [44] screened the original expression profiles to reduce the number of genes examined and crossreferenced their E2F target genes with the studies of Bracken et al. [47] and Bieda et al. [48]. 43 of these genes were classified by 6 biological processes in [47]: 5 for the G1/S phase of the cell cycle, 13 from the S/G2 phase of the cell cycle, 6 checkpoint genes (e.g., BRCA2), 1 development gene, 5 DNA damage and repair genes, and 13 DNA synthesis/replication genes. A table of the specific genes included is in Additional file 2. Excluding the singleton subset, we estimated partial correlation networks for each of the 5 processdefined gene subsets across the three carcinoma subtypes.
Additional file 2:. Ovarian cancer genes analyzed. Subset of analyzed genes as categorized by Bracken et al. [47].
Format: DOC Size: 35KB Download file
This file can be viewed with: Microsoft Word Viewer
Results
Simulation study
For the ErdősRényi G(n=25, p=0.20) random graph comparison refer to Figure 1. Apart from the appearance of a weak bias for both the c_{ij} = 0 and c_{ij} = exp(−2) cases the pvalues are approximately uniformly distributed under the null hypothesis. In panels (a) and (c) we see that the nominal Type I α level is approximately 0.05. Comparing panel (b) to (d) we see the improvement in the power of D when p = 0.25 and c_{ij} = exp(−2). When the neighbouring information was not used in calculating D 34% of the resample pvalues were below a nominal α level of 0.05. When the neighbouring information was included in D and scaled by c_{ij} = exp(−2) 55% of the pvalues were below the nominal 0.05 level. The likelihood of D detecting the true alternative hypothesis increased from 34% to 55% just by including the neighbouring information here.
Figure 1. Onesample tests for an ErdősRényi graph. Pvalue results from 100 independent tests of H_{0}:G(25,p) = G(25,0.20) versus H_{1}:G(25,p) > G(25,0.20). The yaxis is the observed resample pvalue; the xaxis is the expected pvalue under the null hypothesis. Panels (a) and (c), via a uniform distribution qqplot, illustrate the Type I error rate using 2 settings for c_{ij}. Panels (b) and (d) illustrate the performance of D under the alternate hypothesis for 2 settings of c_{ij}; a horizontal line corresponding to an α = 0.05 level is provided.
For the onesample correlation network comparison we examined both the Type I error rate and the power of D to reject H_{0}: Ω = Ω_{0} versus H_{1}: Ω ≠ Ω_{0}. The tests were performed using c_{ij} = 0, i.e., excluding the neighbouring information, and c_{ij}= r_{ij}, the thresholded correlation estimate. Under the assumed null model the distribution of pvalues, for both c_{ij} = 0 and c_{ij}= r_{ij}, had less mass at the extremes of the pvalue range. Additional simulation work, see [49], suggested that when n_{1}= n_{2} = 100 the Type I error rate was conservative, produced an inflated error rate when n_{1}= n_{2} = 2,000, and the test achieved a proper size when the a priori sample size was a factor of 10 larger than the observed network based on a sample of n = 200. These results suggest caution when trying to determine the null distribution for D using a finite set of a priori samples. Figure 2 graphs pairs of pvalues obtained from 100 experiments under the alternate hypothesis. This figure demonstrates that incorporating the neighbouring information in the calculation of D via c_{ij}= r_{ij} penalized our ability to reject the null hypothesis when compared to the use of c_{ij} = 0.
Figure 2. Onesample comparison for a correlation network under H_{1}. 100 resample pvalue for a test of H_{0}: Ω = Ω_{0} versus H_{1}: Ω ≠ Ω_{0} under an assumed alternate hypothesis. The yaxis indicates the pvalue obtained excluding the use of the neighbouring information in calculating D; the xaxis corresponds to the pvalue obtained using the neighbouring information in calculating D.
Additional file 3 graphs the Type I error performance for the twosample comparison under the null hypothesis of partial correlation network equality, H_{0}: Π_{1}= Π_{2}. In Figure 3 we plot the pairs of pvalues obtained under an alternate hypothesis. In both figures we illustrate the inclusion/exclusion of the neighbouring information in calculating D (i.e.,
). Under the null hypothesis we see that the pvalues are approximately uniformly distributed with reasonable Type I error control. Including the neighbouring information does suggest more lack of fit; this is not surprising given the data’s correlated block structure and the correlated components used in D. Under the alternate hypothesis simulated here 45 of the pvalues determined with the neighbouring information were less than the corresponding pvalue calculated excluding the neighbouring information, a result comparable to the flip of a fair coin. The more dramatic result is comparing the number of times we reject H_{0} at an alpha level of 0.05. Under H_{1}, 40 of the pvalues were less than 0.05 when D included the neighbouring information; 24 of the pvalues were less than 0.05 when D excluded the neighbouring information. This gain in power by including the neighbouring information differs from the onesample correlation network simulations. But, as the pvalues shift away from 0, and move in favour of H_{0}, excluding the neighbouring information consistently produced smaller pvalues. This suggests the behaviour of D may vary according to the network inference method employed.Figure 3. Twosample comparison for partial correlation networks under H_{1}. 100 resample pvalue for a test of H_{0}: Π_{1}= Π_{2} versus H_{1}: Π_{1}≠ Π_{2} under an assumed alternate hypothesis. The yaxis indicates the pvalue obtained excluding the use of the neighbouring information in calculating D; the xaxis corresponds to the pvalue obtained using the neighbouring information in calculating D.
Additional file 3:. Twosample comparison for partial correlation networks under H_{0.} A uniform qqplot of the 100 resample pvalues for a test of H_{0} Π_{1}= Π_{2} versus H_{1} Π_{1}≠ Π_{2} under the null hypothesis. The yaxis is the observed pvalue; the xaxis the expected pvalue.
Format: PDF Size: 50KB Download file
This file can be viewed with: Adobe Acrobat Reader
Comparison of derived biological networks
We first compare the correlation networks for the DM2 phenotype to the Normal phenotype. We test H_{0}: Ω_{DM2} = Ω_{Normal} versus H_{1}: Ω_{DM2} ≠ Ω_{Normal} where we assume Ω_{Normal} is known. We form Pearson productmomentbased correlation networks only for those gene sets with fewer than 18 probes in the pathway. The 17 normal samples were used to form Ω_{Normal}. We emphasize here the potential power of our test since the true state of the null or alternate hypothesis is unknown. To generate the null distribution for D we resample, with replacement, from the original 17 normal samples. 1,000 resamples were used throughout. As in the simulation study, the level of the test may be unreliable due to the resampling procedure employed. Estimating an overparameterized correlation matrix based on small samples is a general challenge for the practical estimation of network models.
Comparable to the earlier simulation study Figure 4 illustrates the performance of D including/excluding the edge portion of D and including/excluding the neighbouring information at a correlation threshold of ρ = 0.5. Panel (a) suggests that the edge portion of D is redundant to the weight component and panels (bc) suggest that including the neighbouring information detracts from the power of D. Table 1 lists the resample pvalues for the 37 network comparisons performed at correlation thresholds ρ = 0.35 and ρ = 0.5 (see [49] for additional results). The edge portion was excluded from D and, despite the potential loss in power, the neighbouring information was included. Even at less conservative Type I error levels we fail to declare a network difference in the majority of the cases. The pvalues have not been adjusted for the number of comparisons. To illustrate the estimated correlation networks for the two phenotypes near the pvalue extremes we provide the Normal and DM2 correlation network estimates for the MAP290 and MAP472 gene sets in Table 2. Yates [49] provides a post hoc analysis of the MAP290 correlation network under the assumption of a declared significant network difference.
Figure 4. Onesample correlation network comparison of Type II diabetes versus Normal phenotype. Resample pvalues for the 37 gene sets analyzed. Edge/noedge indicates the inclusion/exclusion of the edge portion in calculating D. Neighbour/noneighbour indicates the inclusion/exclusion of the neighbouring information in calculating D.
Table 1. Onesample correlation network comparison of Type II diabetes versus Normal phenotype
Table 2. Valine leucine and isoleucine biosynthesis (MAP290) and DArginine and D  Ornithine Metabolism (MAP472) gene sets for the Normal and Type II (DM2) phenotypes
For the twosample comparisons we examined a limited ordered comparison of the SBT, SCA1, and SCA3 phenotypes for the 5 biological processes. Our subjective rationale was that a comparison of the SBT and SCA1 phenotypes might suggest a biomarker candidate; comparing the SCA1 and SCA3 phenotypes might better characterize disease progression or provide insight regarding resistance to chemotherapy agents. Table 3 lists the number of edges in the estimated partial correlation networks obtained using GeneNet for the subsets categorized by Bracken et al. [47]. Apart from the DNA synthesis and replication process the estimated GGMs are either empty or sparse. As a side note – some of our simulation work suggests that GeneNet tends to underfit a network. Table 4 lists the resample pvalues for a test of H_{0}: Π_{1}= Π_{2} versus H_{1}: Π_{1}≠ Π_{2} across the eight comparisons. All of these comparisons used the neighbouring information in calculating D; apart from the smallest pvalue listed the differences between the neighbour/noneighbour pvalues were negligible. At a standard alpha level of 0.05, a conservative value for a comparison of covariancebased matrices, none of the hypotheses would be rejected. Rather than adopt such a conservative view, and ignoring the topic of multiple comparisons, we chose to provide the partial correlation networks for the 13 genes in the SCA1 and SCA3 phenotypes (pvalue 0.142) in Table 5. This table suggests an observable difference between the two networks.
Discussion
The definition of D was inspired by the question, “How long is the coast of Britain?” This question motivates the definition of fractal dimension, a concept suited for complex objects. Cutler’s [50] definitions of packing and pointwise dimension suggest measures based on an additive decomposition of sets/neighbourhoods. The neighbourhood size and scaling behaviour can vary across the points in the set. Combining these ideas with a Riemannlike sum is our basis for a topological comparison of networks. Nacu et al. [51] suggest the use of neighbourhoods, of potentially varying radius, for identifying differentially expressed pathways. Choosing a node as the centre of the local neighbourhood, and not an edge, allows for several advantages: a reduction in combinatorial complexity, facilitates molecular nodebased post hoc tests, avoids the need for an ‘optimal’ tiling or network partition, and it limits the number of relational features/motiflike structures to compare. We limited our set/neighbourhoodbased definition of D to internodal path lengths strictly less than three since this is the minimum distance necessary to capture a feedforward/feedback loop, it limits the range of topological comparisons, and D is prevented from revisiting the ‘centre’ of a neighbourhood in a cyclic graph. D can be extended to include a larger neighbourhood. Apart from obvious computational implications, defining a second set of c_{ij}’s is necessary and the efficiency of D may deteriorate as D integrates a larger set of imprecise or variable estimates. In contrast to fractals, viewing a network as an inhomogeneous mixture of subgraphs suggests that a local distance may be preferable to large distances that span a large set of nodes.
We do not claim that D is or will be optimal under a broad range of network models. Apart from the literature related to ERGMs, weighted biological network models with welldeveloped inferential theory analogous to maximum likelihood, uniformly minimumvariance unbiased, or minimax estimators is largely unavailable. The ability to apply D to a broad range of edge/noedge and weighted/unweighted networks is sure to involve tradeoffs. Banks et al. [52] illustrate a case where a metric for a clustering application is illsuited for a phylogenetic inference problem. [53] state the need for fixed/absent edges in their informative prior approach to network inference. A decomposable additive measure can be tailored to reflect meaningful comparisons, e.g., [54] modify an L_{1}based edit distance for unweighted binary networks using protein signaling logic. While not constituting a definitive argument, we speculate that the value of including the neighbouring information in the G(n,p) simulation example is due to the uniformity of p across the set of nodes. The loss in performance using the neighbouring information for a pairwise correlation network model is likely due to D unnecessarily integrating across a larger set of imprecise estimates. For the twosample partial correlation network comparison, the gain in performance via the use of neighbours may be influenced by the fact that each edge is formed using a plurality of variables. For researchers interested in alternate network models, e.g., preferential attachment or smallworld models, evaluating D (or comparing it to a competing procedure) under various scenarios is recommended.
Basing D on a more general form of a Hamming distance parallels efforts in the area of global network alignment scoring. Others have proposed alignmentbased operations for use with biological networks, e.g., [55] for comparing phylogenetic trees. The use of set algebraic operations, e.g., union, intersection, strict and symmetric differences, is commonplace [20,21,56]. Having D incorporate neighbouring information might be obvious to a systems biologist; see [57] for an example in classifying breast cancer metastasis and [58] for a pathway differential expression application. [38], using partial correlations, found it useful to restrict the number of genes/nodes to condition on. [59] use level1 and level2 neighbours to predict protein function. [23] outline a weighted topological overlap measure to cluster gene modules. [22] present a topological overlap measure that generalizes pairwise similarity to one based on shared neighbours. The precedent of incorporating neighbouring information for network objects is clear; the question of, “But how far do we go?” is less clear. As mentioned earlier, the matter of neighbourhood size is liable to impact the behaviour of D and may interact with network estimation procedures, data collection requirements, etc.
Despite D’s simplicity and potential for use across a broad set of network models, the importance (and weighting) of D’s components should receive serious consideration by the investigator. Gower [60] concedes that weighting components of a measure is challenging. A robustness study is recommended to address this topic, see Yates [49]. For both one and twosample comparisons, tuning D to gage or improve its performance for a specific network model is advised. The selection of the weight constant c_{ij} for weighted graphs was motivated by the idea of conditional probabilities. If A and B represent two adjacent edges and information flows through their common vertex then it is reasonable to assume that some form regarding the state of B is meaningful to A. It is plausible to assume that the force two objects exert on one another is proportional to their proximity. Preferential attachment networks assume that new edges are formed at a node conditional on the number of existing edges at that node. Wei et al. [61] employ genespecific prior probabilities in a spatially correlated mixture model application. But, the quality of the weights may be suspect. Ashyraliyev et al. [62] found quantitative parameter estimates to be unreliable in modeling a gap gene circuit but that inferring a reliable qualitative network topology was possible. The relative weighting of edge and weight differences is a topic to discuss here. The idea of scaling or normalizing portions of D is complicated. Gao et al. [63] allow hubs to exert an unequal influence; Yip et al. [64] normalize their generalized topological overlap measure to the unit interval. Given a rough similarity between D and the total sums of squares in regression modeling we allow for nodes with a high degree (i.e., ‘large degree of freedom’ tests) to contribute more to the calculation of D. If one chooses to normalize portions of D at each node by some topological property then one has to justify the scaling factor. Does one scale by the node’s (weighted) degree, a clustering coefficient, in or outdegree for a directed network, etc.? If a network model has an efficient estimator whose sampling distribution is well characterized, the use of D may be contraindicated. At present, we do not expect this to be the case for experimentbased weighted biological networks. As a conservative measure, D could be reduced to its Hamminglike form and only the inferential procedures outlined here applied. Evaluating D using an in silico model under plausible scientific scenarios of interest to the investigator could also reduce concerns about aspects of the definition of D.
Our choice of network architectures to evaluate with D was limited. Computational expediency, an ability to explore and contrast common parametric models and their associated weighted network forms, and the use of a canonical structure (e.g., G(n,p) graphs) motivated our network model selection. Both Marbach et al. [65] and Altay et al. [66] discuss difficulties associated with currently available network inference algorithms. MüllerLinow et al. [67] found that the proximity of metabolites in a correlation network did not align with metabolite proximity observed in genome databases. Hubert et al. [68] suggest that final structural representations are unlikely to be global optima since the selection procedure, while reasonable, does not make use of a verifiable optimal search strategy. Even the limited simulation work presented here suggests the nuance that network algorithms or models can inject into the process. Relying on a priori samples or an assumed null model to generate a sampling distribution for D may be restrictive in the onesample case; but, the complexities surrounding network probability models and the intractability of networkbased test statistics does not suggest easy alternatives. Large (weighted) networks are likely to be costly in terms of the data needed to estimate a family of network parameters. The literature on the analysis of large networks, with its emphasis on select topological comparisons, typically imposes a vast reduction in network complexity. Thorne et al. [69], in proposing a method to generate confidence intervals for networkrelated correlations and motifabundances reinforced the complexities in defining a suitable null model for a biological process.
General criticisms leveled against the use of resampling methods are applicable here; see Berger [70]. The practical matter of using D for observation studies raises the topic of partial exchangeability. It is unreasonable to assume that either of the biological datasets presented here met the assumption of exchangeability. However, unconditional procedures also struggle when confronted with observational data and/or missing or hidden covariates. Our one and twosample comparison approaches require or assume a hypothesized network model for the purpose of sampling; we do not employ fixednetwork random rewiring schemes or variations that seek to align a set of topological measures across a set of stochastic networks.
A significant one or twosample finding naturally suggests the question, “Where do the networks differ?” Similar to individual effect tests in regression, the answer to this question will involve one or more nodes or subgraphs. Answering targeted questions routinely appear in the literature, see [71,72]. Dong and Horvath [73] suggest approximately factorizable networks; [74] locate regulatory hotspots by integrating network topology and transcriptome data. Since D is a sum of nodebased dissimilarities, a test for dissimilarity between two aligned nodes or subgraphs can be formed using the portion of D attributable to that node or subgraph. Such an approach may help isolate molecular targets or subnetworks for closer study. Determining the sampling distribution for the modified test statistic is identical to the procedure developed here. See Yates [49] for more details and examples.
Conclusions
In this paper we have demonstrated an inferential framework for performing one and twosample hypothesis tests for biological networks. We proposed a suitable test statistic and evaluated it with both simulated and microarray data under a variety of situations. The dissimilarity test statistic proposed is a logical extension of existing measures used in sequence alignment/nominal data comparisons. In generating the null distribution of the test statistic we used an assumed parametric model, resamples from a collection of null samples, and made use of permutation testing principles. Resampling methods suggest that counterintuitive behaviours may occur when dealing with a (potentially complex) network model. While not explicitly demonstrated here, our procedure easily facilitates nodelevel or subgraph post hoc tests.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
This work is based on the PhD dissertation of PDY. NDM suggested the problem and supervised the research. All authors read and approved the final manuscript.
Acknowledgements
The authors thank Drs. EL Boone, RK Elswick, Jr., L. Dumenci, and V. Ramakrishnan for serving on the dissertation committee. The authors also thank the referees for their helpful comments and suggestions.
References

Bornholdt S, Schuster HG: Handbook of Graphs and Networks: From the Genome to the Internet. WileyVCH; 2003.

Junker B, Schreiber F: Analysis of Biological Networks. Wiley; 2008.

EmmertStreib F, Dehmer M: Analysis of Microarray Data: A NetworkBased Approach. WileyVCH Verlag; 2008.

Kolaczyk E: Statistical Analysis of Network Data: Methods and Models. Springer Science+Business Media; 2009.

Brandes U, Erlebach T: Network Analysis: Methodological Foundations. SpringerVerlag; 2005.

Steuer R, López G: Global network properties. In Analysis of Biological Networks. Edited by Junker B, Schreiber F. Wiley; 2008:3163.

Chung F, Lu L: Complex Graphs and Networks. American Mathematical Society; 2006.

Schwöbbermeyer H: Network motifs. In Analysis of Biological Networks. Edited by Junker B, Schreiber F. Wiley; 2008:85111.

Saul Z, Filkov V: Exploring biological network structure using exponential random graph models.
Bioinformatics 2007, 23(19):26042611. PubMed Abstract  Publisher Full Text

Wiuf C, Brameier M, Hagberg O, Stumpf M: A likelihood approach to analysis of network data.
Proc Natl Acad Sci U S A 2006, 103:75667570. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Cardelli L: Abstract machines of systems biology. In Transactions on Computational Systems Biology III. Edited by Priami C, Merelli E, Gonzalez P, Omicini A. SpringerVerlag; 2005:145168.

Steinhauser D, Krall L, Müssig C, Büssis D, Usadel B: Correlation networks. In Analysis of Biological Networks. Edited by Junker B, Schreiber F. Wiley; 2008:305333.

Faust K, Skvoretz J: Comparing networks across space and time, size and species.
Sociol Methodol 2002, 32:267299. Publisher Full Text

Banks D, Carley K: Metric inference for social networks.
J Classif 1994, 11:121149. Publisher Full Text

Sanil A, Banks D, Carley K: Models for evolving fixed node networks: model fitting and model testing.
Social Networks 1995, 17:6581. Publisher Full Text

Kahlem P, Birney E: ENFIN – a network to enhance integrative systems biology. In Reverse Engineering Biological Networks: Opportunities and Challenges in Computational Methods for Pathway Inference. Edited by Stolovitzky G, Califano A. New York Academy of Sciences; 2007:2331.

Stolovitzky G, Monroe D, Califano A: Dialogue on reverseengineering assessment and methods: the DREAM of highthroughput pathway inference. In Reverse Engineering Biological Networks: Opportunities and Challenges in Computational Methods for Pathway Inference. Edited by Stolovitzky G, Califano A. New York Academy of Sciences; 2007:122.

Chen L, Wang R, Zhang X: Biomolecular Networks: Methods and Applications in Systems Biology. Wiley; 2009.

XulviBrunet R, Li H: Coexpression networks: graph properties and topological comparisons.
Bioinformatics 2010, 26:205214. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Gill R, Datta S, Datta S: A statistical framework for differential network analysis from microarray data.
BMC Bioinforma 2010, 11:95. BioMed Central Full Text

Li A, Horvath S: Network neighborhood analysis with the multinode topological overlap measure.
Bioinformatics 2007, 23:222231. PubMed Abstract  Publisher Full Text

Zhang B, Horvath S: A general framework for weighted gene coexpression network analysis.

Langfelder P, Horvath S: WGCNA: an R package for weighted correlation network analysis.
BMC Bioinforma 2008, 9:559. BioMed Central Full Text

Anderson T: An Introduction to Multivariate Statistical Analysis. 3rd edition. Wiley; 2003.

Anderson M: Distancebased tests for homogeneity of multivariate dispersions.
Biometrics 2006, 62:245253. PubMed Abstract  Publisher Full Text

Holmes S: Bootstrapping phylogenetic trees: theory and methods.
Stat Sci 2003, 18(2):241255. Publisher Full Text

Perkins T: The gap gene system of Drosophila melanogaster: modelfitting and validation. In Reverse Engineering Biological Networks: Opportunities and Challenges in Computational Methods for Pathway Inference. Edited by Stolovitzky G, Califano A. New York Academy of Sciences; 2007:116131.

Toh H, Horimoto K: Inference of a genetic network by a combined approach of cluster analysis and graphical Gaussian modelling.
Bioinformatics 2002, 18(2):287297. PubMed Abstract  Publisher Full Text

EmmertStreib F: The chronic fatigue syndrome: a comparative pathway analysis.
J Comput Biol 2007, 14(7):961972. PubMed Abstract  Publisher Full Text

Xiong M, Li J, Fang X: Identification of genetic networks.
Genetics 2004, 166(2):10371052. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Pesarin F: Multivariate Permutation Tests: With Application in Biostatistics. Wiley; 2001.

Gan G, Ma C, Wu J: Data Clustering: Theory, Algorithms, and Applications. ASASIAM: ASASIAM Series on Statistics and Applied Probability; 2007.

Lewis T: Network Science: Theory and Applications. Wiley; 2009.

Good P: Permutation, Parametric, and Bootstrap Tests of Hypotheses. 3rd edition. Springer Science+Business Media; 2005.

Handcock M, Hunter D, Butts C, Goodreau S, Morris M: statnet: software tools for the representation, visualization, analysis and simulation of network data.
J Stat Softw 2008, 24:1. PubMed Abstract  PubMed Central Full Text

Schäfer J, Strimmer K: An empirical Bayes approach to inferring largescale gene association networks.
Bioinformatics 2005, 21:754764. PubMed Abstract  Publisher Full Text

De La Fuente A, Bing N, Hoeschele I, Mendes P: Discovery of meaningful associations in genomic data using partial correlation coefficients.
Bioinformatics 2004, 20:35653574. PubMed Abstract  Publisher Full Text

Markowetz F, Spang R: Inferring cellular networks – a review.
BMC Bioinforma 2007, 8(Suppl 6):S5. BioMed Central Full Text

OpgenRhein R, Strimmer K: From correlation to causation networks: a simple approximate learning algorithm and its application to highdimensional plant gene expression data.
BMC Syst Biol 2007, 1:37. PubMed Abstract  BioMed Central Full Text  PubMed Central Full Text

Mootha VK, Lindgren CM, Eriksson KF, Subramanian A, Sihag S, Lehar J, Puigserver P, Carlsson E, Ridderstrale M, Laurila E: PGC1 alpharesponsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes.
Nat Genet 2003, 34(3):267273. PubMed Abstract  Publisher Full Text

The Broad Institute.
http://www.broadinstitute.org/publications/broad991s webcite

Sieben NLG, Oosting J, Flanagan AM, Prat J, Roemen GMJM, KolkmanUljee SM, van Eijk R, Cornelisse CJ, Fleuren GJ, van Engeland M: Differential gene expression in ovarian tumors reveals Dusp 4 and Serpina 5 as key regulators for benign behavior of serous borderline tumors.
J Clin Oncol 2005, 23(29):72577264. PubMed Abstract  Publisher Full Text

De Meyer T, Bijsmans IT, Van de Vijver KK, Bekaert S, Oosting J, Van Criekinge W, van Engeland M, Sieben N: E2Fs mediate a fundamental cellcycle deregulation in highgrade serous ovarian carcinomas.
J Pathol 2009, 217:1420. PubMed Abstract  Publisher Full Text

Chien J, Fan JB, Bell DA, April C, Klotzle B, Ota T, Lingle WL, Gonzalez Bosquet J, Shridhar V, Hartmann LC: Analysis of gene expression in stage I serous tumors identifies critical pathways altered in ovarian cancer.
Gynecol Oncol 2009, 114:311. PubMed Abstract  Publisher Full Text

Edgar R, Domrachev M, Lash AE: Gene Expression Omnibus: NCBI gene expression and hybridization array data repository.
Nucleic Acids Res 2002, 30:207210. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Bracken AP, Ciro M, Cocito A, Helin K: E2F target genes: unraveling the biology.
Trends Biochem Sci 2004, 29(8):409417. PubMed Abstract  Publisher Full Text

Bieda M, Xu X, Singer MA, Green R, Farnham PJ: Unbiased location analysis of E2F1binding sites suggests a widespread role for E2F1 in the human genome.
Genome Res 2006, 16:595605. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Yates PD: An inferential framework for network hypothesis tests: with applications to biological networks. Virginia Commonwealth University, Department of Biostatistics; 2010.

Cutler C: A review of the theory and estimation of fractal dimension. In Dimension Estimation and Models. Edited by Tong H. World Scientific Publishing; 1993:1107.

Nacu S, CritchleyThorne R, Lee P, Holmes S: Gene expression network analysis and applications to immunology.
Bioinformatics 2007, 23:850858. PubMed Abstract  Publisher Full Text

Banks D, Constantine G: Metric models for random graphs.
J Classif 1998, 15:199223. Publisher Full Text

Mukherjee S, Speed T: Network inference using informative priors.
Proc Natl Acad Sci U S A 2008, 105:1431314318. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Trusina A, Sneppen K, Dodd I, Shearwin K, Egan J: Functional alignment of regulatory networks: a study of temperate phages.
PLoS Comput Biol 2005, 1(7):e74. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Diaconis P, Holmes S: Matchings and phylogenetic trees.
Proc Natl Acad Sci U S A 1998, 95:1460014602. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Forst C, Flamm C, Hofacker I, Stadler P: Algebraic comparison of metabolic networks, phylogenetic inference, and metabolic innovation.
BMC Bioinforma 2006, 7:67. BioMed Central Full Text

Chuang H, Lee E, Liu Y, Lee D, Ideker T: Networkbased classification of breast cancer metastasis.
Mol Syst Biol 2007, 3:140. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Draghici S, Khatri P, Tarca A, Amin K, Done A, Voichita C, Georgescu C, Romero R: A systems biology approach for pathway level analysis.
Genome Res 2007, 17:15371545. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Chua H, Sung W, Wong L: Exploiting indirect neighbours and topological weight to predict protein function from proteinprotein interactions.
Bioinformatics 2006, 22:16231630. PubMed Abstract  Publisher Full Text

Gower J: A general coefficient of similarity and some of its properties.
Biometrics 1971, 27:857871. Publisher Full Text

Wei P, Pan W: Incorporating gene networks into statistical tests for genomic data via a spatially correlated mixture model.
Bioinformatics 2008, 24:404411. PubMed Abstract  Publisher Full Text

Ashyraliyev M, Jaeger J, Blom J: Parameter estimation and determinability analysis applied to Drosophila gap gene circuits.
BMC Syst Biol 2008, 2:83. PubMed Abstract  BioMed Central Full Text  PubMed Central Full Text

Gao S, Wang X: TAPPA: topological analysis of pathway phenotype association.
Bioinformatics 2007, 23:31003102. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Yip A, Horvath S: Gene network interconnectedness and the generalized topological overlap measure.
BMC Bioinforma 2007, 8:22. BioMed Central Full Text

Marbach D, Prill RJ, Schaffter T, Mattiussi C, Floreano D, Stolovitzky G: Revealing strengths and weaknesses of methods for gene network inference.
Proc Natl Acad Sci U S A 2010, 107:62866291. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Altay G, EmmertStreib F: Revealing differences in gene network inference algorithms on the network level by ensemble methods.
Bioinformatics 2010, 26:17381744. PubMed Abstract  Publisher Full Text

MüllerLinow M, Weckwerth W, Hütt MT: Consistency analysis of metabolic correlation networks.
BMC Syst Biol 2007, 1:44. PubMed Abstract  BioMed Central Full Text  PubMed Central Full Text

Hubert L, Arabie P, Meulman J: The Structural Representation of Proximity Matrices with MATLAB. ASA/SIAM; 2006.

Thorne T, Stumpf M: Generating confidence intervals on biological networks.
BMC Bioinforma 2007, 8:467. BioMed Central Full Text

Berger VW: Pros and cons of permutation testing in clinical trials.
Statistics in Medicine 2000, 19:13191328. PubMed Abstract  Publisher Full Text

Fuller TF, Ghazalpour A, Aten JE, Drake TA, Lusis AJ, Horvath S: Weighted gene coexpression network analysis strategies applied to mouse weight.
Mammalian Genome 2007, 18:463472. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Zhang B, Li H, Riggins RB, Zhan M, Xuan J, Zhang Z, Hoffman EP, Clarke R, Wang Y: Differential dependency network analysis to identify conditionspecific topological changes in biological networks.
Bioinformatics 2009, 25:526532. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Dong J, Horvath S: Understanding network concepts in modules.
BMC Syst Biol 2007, 1:24. PubMed Abstract  BioMed Central Full Text  PubMed Central Full Text

Oliveira A, Patil K, Nielsen J: Architecture of transcriptional regulatory circuits is knitted over the topology of biomolecular interaction networks.
BMC Syst Biol 2008, 2:17. PubMed Abstract  BioMed Central Full Text  PubMed Central Full Text