Network motifs: structure does not determine function

1 Department of Mathematics, Imperial College London, 180 Queen's Gate, London, SW7 2AZ, UK
2 Centre for Bioinformatics, Division of Molecular Biosciences, Wolfson Building, Imperial College London, South Kensington Campus, London, SW7 2AY, UK

BMC Genomics 2006, 7:108 doi:10.1186/1471-2164-7-108
The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2164/7/108

© 2006 Ingram et al; licensee BioMed Central Ltd.

Abstract

Background
A number of publications have recently examined the occurrence and properties of the feed-forward motif in a variety of networks, including those that are of interest in genome biology, such as gene networks. The present work looks in some detail at the dynamics of the bi-fan motif, using systems of ordinary differential equations to model the populations of transcription factors, mRNA and protein, with the aim of extending our understanding of what appear to be important building blocks of gene network structure.

Results
We develop an ordinary differential equation model of the bi-fan motif and analyse variants of the motif corresponding to its behaviour under various conditions. In particular, we examine the effects of different steady and pulsed inputs to five variants of the bi-fan motif, based on evidence in the literature of bi-fan motifs found in Saccharomyces cerevisiae (commonly known as baker's yeast). Using this model, we characterize the dynamical behaviour of the bi-fan motif for a wide range of biologically plausible parameters and configurations. We find that there is no characteristic behaviour for the motif, and with the correct choice of parameters and of internal structure, very different, indeed even opposite behaviours may be obtained.

Conclusion
Even with this relatively simple model, the bi-fan motif can exhibit a wide range of dynamical responses. This suggests that it is difficult to gain significant insights into biological function simply by considering the connection architecture of a gene network, or its decomposition into simple structural motifs. It is necessary to supplement such structural information by kinetic parameters, or dynamic time series experimental data, both of which are currently difficult to obtain.
Background
The concept of a network motif, introduced by Alon and co-workers [1], has rapidly become one of the central topics of interest in the analysis of complex networks. These networks promise to provide a framework for the understanding of biological processes involving many components such as intra- or inter-cellular networks of interacting genes or proteins. The analysis of such frameworks is one of the key techniques in the rapidly emerging field of systems biology, which makes extensive use of protein interaction, metabolic and gene regulatory networks. Several authors have argued that knowing the structure of these networks, that is knowing the pattern of interactions, will allow us to understand how combinations of genes or proteins interact to achieve specific functional outcomes, and to predict new functional relationships.

A network motif in the sense introduced by Alon and co-workers is a pattern or small sub-graph that occurs more often (at some statistically significant level) in the true network than in an ensemble of networks generated by randomly rewiring the edges in the true network, where the number of nodes and the degree of each node is kept fixed. Of interest are the differences in the frequencies with which network motifs occur in real (biological as well as technological) networks. The recurrent presence of certain motifs has been linked to systematic differences [2] in the functional properties required from networks. In analogy to electric circuits which are built up of smaller modules, such as logic gates, it has been suggested that the motifs in biological networks reflect functional or computational units which combine to regulate the cellular behaviour as a whole. Recently the work of Prill et al. [3] has looked at how one aspect of motifs, their stability, influences biological network organisation and specifically the abundance of different motifs in the network.
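The null model in the motif definition above (random rewiring with the degree of every node held fixed) can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the procedure used in [1]: the function names `degrees` and `rewire`, the swap-based scheme and the toy edge list are all hypothetical, and self-loops and duplicate edges are simply rejected.

```python
import random
from collections import Counter


def degrees(edges):
    """Return (out-degree, in-degree) counters for a directed edge list."""
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    return out_deg, in_deg


def rewire(edges, n_swaps, seed=0):
    """Randomise a directed network by swapping the targets of edge pairs.

    Each accepted swap turns a->b, c->d into a->d, c->b, so every node
    keeps its in-degree and out-degree. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    edges = list(edges)
    if len(edges) < 2:
        return edges
    present = set(edges)
    done = 0
    attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b:
            continue  # the swap would create a self-loop
        if (a, d) in present or (c, b) in present:
            continue  # the swap would duplicate an existing edge
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


if __name__ == "__main__":
    # Toy network (assumed for illustration only).
    toy = [("X", "Z"), ("X", "W"), ("Y", "Z"), ("Y", "W"),
           ("Z", "V"), ("V", "X"), ("W", "U"), ("U", "Y")]
    shuffled = rewire(toy, n_swaps=20, seed=1)
    print(sorted(shuffled))
    print(degrees(shuffled) == degrees(toy))
```

A motif count in the real network would then be compared with the distribution of counts over many such rewired networks.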
The detection and enumeration of network motifs has now been followed up by studying the dynamics of corresponding mathematical models of these motifs, especially in the context of transcription regulation networks. These networks aim to describe the links between those genes which code for transcription factors and the genes whose products they control. At the moment, due to the diversity of stimuli a cell or organism can experience, our understanding of the complete sets of regulatory relationships is only preliminary and, because of the apparent importance of post-transcriptional regulation, captures only one aspect of the regulatory machinery. Additionally, it must be recalled that these motifs do not exist in isolation within the network, and their behaviour will be heavily influenced by both global and local changes in the cellular environment and the state of the network as a whole. These considerations alone may make attempts to draw positive conclusions about how a motif will behave overly optimistic.

Network motifs and transcriptional regulation
To a great extent, the control of gene transcription is performed by regulatory proteins known as transcription factors, which bind to specific sites on the DNA. Transcription factors may regulate a gene in isolation, but more commonly there are multiple transcription factors acting in concert. The transcription factors are of course themselves products of other (or possibly the same) genes, resulting in a network of interacting regulatory genes. Milo [1] and others have recently looked at ways in which such networks can be broken down into smaller functional units in order to more easily identify structures within the network. It is hoped that the appearance of such smaller units may be indicative of modular structure and efficient design. One of the most important motifs that has hitherto been identified is the feed-forward motif (see Figure 1(a)).
A number of recent papers have examined the dynamics of mathematical models of the feed-forward motif [4-6]. Recently, however, it was noted that while the feed-forward loop motif is unusually common, other motifs may be even more prevalent [7]. In particular it was emphasized that when determining the relative statistical significance of the abundance of various motifs, it was important to use an appropriate null "random" model [8]. It was suggested that previously the background structure, that is, physical distance and compartmentalization, had not been adequately taken into account when generating random networks. By using a more sophisticated null model which took into account spatial separation when considering whether nodes would be connected it was found that the "bi-fan" motif was the most prevalent in the transcription regulation networks of both E. coli and S. cerevisiae. Thus far, however, there has been no detailed study of the dynamics of this motif.
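To make the bi-fan pattern concrete, the following Python sketch counts bi-fan occurrences in a directed edge list, taking a bi-fan to be two regulators that both regulate the same two target genes. It is a minimal sketch under stated assumptions: the function name `count_bifans` and the toy edges are hypothetical, self-loops are ignored, and occurrences are counted whether or not extra edges exist among the four nodes.

```python
from itertools import combinations
from math import comb


def count_bifans(edges):
    """Count bi-fans: pairs of regulators sharing a pair of targets.

    For each pair of regulators with k shared targets there are
    C(k, 2) ways to choose the two regulated genes.
    """
    targets = {}
    for src, dst in edges:
        if src != dst:  # ignore auto-regulation in this sketch
            targets.setdefault(src, set()).add(dst)
    total = 0
    for x, y in combinations(sorted(targets), 2):
        shared = (targets[x] & targets[y]) - {x, y}
        total += comb(len(shared), 2)
    return total


if __name__ == "__main__":
    # Toy example (assumed): X and Y both regulate Z and W.
    toy = [("X", "Z"), ("X", "W"), ("Y", "Z"), ("Y", "W")]
    print(count_bifans(toy))
```

Whether such a count is unusually high depends entirely on the null model against which it is compared, which is the point made in [7,8].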
Bi-fan motifs in S. cerevisiae
In Table 1 we list bi-fan motifs extracted from the TRANSFAC database, and in Figure 2 we highlight the regulatory relationships reported in the literature for some of these motifs. As is apparent from Table 1, several genes are involved in more than one bi-fan motif. Also, the regulatory interactions documented in the database have been ascertained in a non-uniform way; this simply reflects the non-exhaustive nature of present molecular interaction data sets.

Table 1. Regulators and regulated genes observed in bi-fan motifs in S. cerevisiae and references to the relevant literature.
In Figure 2 repressive interactions are shown in red, while promoting interactions are depicted in green. Even in the small subset of genes for which interaction data was available we were able to find exemplars for a range of distinct bi-fan architectures. We consider and contrast the dynamics of each of these variants.

Dynamical models of motifs
Given a particular transcription regulation motif, such as those in Figure 1, and straightforward assumptions about the binding kinetics of its constituent molecular species (genes and proteins), we can derive a mathematical model for its dynamics. This then allows analysis of the characteristic responses of these constituents of a motif following an external stimulus. Both deterministic and stochastic models are possible. A number of recent papers have constructed and analyzed models for the feed-forward motif. In particular Mangan and Alon [4] have shown that this single simple motif can exhibit a vast range of different dynamical behaviours. We believe that these attempts to link structure (of motifs) to function in terms of mathematical models raise a number of interesting questions and problems. The present work looks in some detail at the dynamics of the motifs identified in Table 1. Our analysis follows the approach of [5] for the feed-forward motif and in particular we model the bi-fan motif using a system of ordinary differential equations (ODEs). We use the number of molecules of the two second tier proteins as a measure of motif behaviour. We look in turn at the four variants of the motif identified in yeast under a series of example dynamical scenarios which reflect the diversity of behaviour that can be demonstrated by this simple motif, starting with coherent motifs, that is, motifs in which every transcription factor acts to promote transcription.
Results
The motifs in Figure 2 can broadly be separated into two categories: A, B and D are generally referred to as "incoherent motifs", while C is a coherent motif. This nomenclature reflects the arrangement in C, in which both inputs act as promoters, in comparison to A and D, which are both fully incoherent (both second tier proteins have both promoter and repressor inputs), whilst B may be considered to be partially incoherent, as one of the second tier proteins has incoherent inputs whilst the other has coherent inputs (the model used in this case is given in equation (1), in which we consider co-operative binding in the case that the order of binding is unimportant). In addition we consider a derivative of the C model, which we denote as C', in which both inputs still act as promoters but the assumption about the way the promoters act is different: we no longer require both promoters to bind for PW to be expressed (in practice, C and C' operate as an AND and an OR gate respectively). Further discussion of the many ways in which the action of the transcription factors can be modelled can be found in the Methods section.

Constant inputs
We first consider the effect of providing the motif with steady inputs, a biological scenario which corresponds to continued exposure either to an environmental condition triggering independent factors, or to a constant signal which is split into two signals by the network structure. To examine the various responses, we consider the effect of both signals being turned on continuously at a high level (which we denote in Table 3 as a green square), and the result of one of the factors occurring at a high level whilst the other is either low (denoted by a yellow square) or off (denoted by red). In this way one can see the variation in the responses of the two output proteins, which we denote here as PZ and PW for all motifs.
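The contrast between AND-like (variant C) and OR-like (variant C') behaviour under steady inputs can be illustrated with a deliberately reduced Python sketch. It is not the mass-action model of this paper: it assumes a single output protein whose production is a Hill function of the two inputs and whose loss is first order, and every name and parameter value (`hill`, `and_gate`, `or_gate`, `steady_level`, `beta`, `alpha`, `K`, `n`, the input levels) is a hypothetical choice for illustration.

```python
def hill(u, K=1.0, n=2):
    """Fraction of promoter occupancy for input level u (assumed Hill form)."""
    return u**n / (K**n + u**n)


def and_gate(ix, iy):
    """Both factors must be bound for expression (C-like assumption)."""
    return hill(ix) * hill(iy)


def or_gate(ix, iy):
    """Either factor alone is enough for expression (C'-like assumption)."""
    return 1.0 - (1.0 - hill(ix)) * (1.0 - hill(iy))


def steady_level(gate, ix, iy, beta=1.0, alpha=1e-3, dt=1.0, t_end=20000.0):
    """Integrate dP/dt = beta * gate(ix, iy) - alpha * P by forward Euler
    from P = 0 and return the level reached at t_end."""
    p = 0.0
    for _ in range(int(t_end / dt)):
        p += dt * (beta * gate(ix, iy) - alpha * p)
    return p


if __name__ == "__main__":
    high, low, off = 10.0, 0.5, 0.0  # assumed input levels
    for ix, iy in [(high, high), (high, low), (high, off)]:
        print(ix, iy,
              round(steady_level(and_gate, ix, iy), 1),
              round(steady_level(or_gate, ix, iy), 1))
```

With one input off, the AND-like output stays at zero while the OR-like output is close to its maximum, which is the kind of opposite behaviour that connectivity alone cannot distinguish.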
In this table, we have simply provided the transcription factors at a reasonably high concentration and quantified the response of each protein after the system has stabilized. All of the simulations are performed with the same kinetic parameters, as detailed in the Methods section, and the qualification of a high or low response is based on comparison with the response of other motifs and the response of the other protein. Again, these results are colour-coded for easy comparison, with green, yellow and red corresponding to high, low or no expression. We can see that there is significant variation in the characterization of the response of the variants of the motif.

Response to simultaneous pulses
We next look at the responses of the motif in the case in which both transcription factors are turned on simultaneously for 3600 seconds (an hour), and then turned off again. Again, the precise amplitudes and durations of the protein responses vary greatly, and are dependent on the precise values of the kinetic parameters. Two illustrative cases can be seen in Figure 5, again corresponding to variants A and B of the motif. The qualitative results are summarized in Figure 4.
Staggered pulses
Finally we look at the effect of offset pulses in the levels of transcription factors. In the first case, we provide the first transcription factor (denoted by IX) for 3600 seconds, then turn it off and turn the second factor (IY) on for 3600 seconds. The order is then reversed. The results can be seen in Figure 6, again using the colour coding scheme described above.
Two illustrative cases can be seen in Figure 7, corresponding to variants A and B, in the case in which first IX and then IY are activated.
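The pulse experiments can be mimicked with a reduced two-tier Python sketch in which the inputs IX and IY drive first tier proteins PX and PY, which in turn drive one second tier protein through either AND-like or OR-like logic. This is an assumption-laden illustration rather than the paper's model: it uses Hill-function production and first-order decay instead of mass-action kinetics, and all names and values (`hill`, `pulse`, `peak_output`, `beta`, `alpha`, `k_in`, `k_tf`, the input level 10.0) are hypothetical. Only the 3600 second pulse length is taken from the text.

```python
def hill(u, K, n=2):
    """Assumed Hill-type activation by a factor at level u."""
    return u**n / (K**n + u**n)


def pulse(t, start, length=3600.0, level=10.0):
    """Square input pulse: `level` during [start, start + length), else 0."""
    return level if start <= t < start + length else 0.0


def peak_output(gate, ix_start, iy_start, beta=1.0, alpha=1e-3,
                k_in=1.0, k_tf=500.0, dt=1.0, t_end=10800.0):
    """Forward-Euler simulation of a reduced bi-fan branch.

    IX -> PX and IY -> PY (first tier); PX and PY jointly drive PZ
    (second tier) with AND or OR logic. Returns the peak level of PZ.
    """
    px = py = pz = peak = 0.0
    for step in range(int(t_end / dt)):
        t = step * dt
        hx, hy = hill(px, k_tf), hill(py, k_tf)
        if gate == "AND":
            drive = hx * hy
        else:  # "OR"
            drive = 1.0 - (1.0 - hx) * (1.0 - hy)
        dpx = beta * hill(pulse(t, ix_start), k_in) - alpha * px
        dpy = beta * hill(pulse(t, iy_start), k_in) - alpha * py
        dpz = beta * drive - alpha * pz
        px += dt * dpx
        py += dt * dpy
        pz += dt * dpz
        peak = max(peak, pz)
    return peak


if __name__ == "__main__":
    for gate in ("AND", "OR"):
        simultaneous = peak_output(gate, 0.0, 0.0)
        staggered = peak_output(gate, 0.0, 3600.0)  # IX first, then IY
        print(gate, round(simultaneous, 1), round(staggered, 1))
```

In this sketch the AND-like output responds to staggered pulses only through the window in which PX has not yet decayed and PY has already accumulated, so its peak depends strongly on the assumed decay rate; this mirrors the parameter sensitivity described above.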
Discussion
We have seen that the bi-fan motif exhibits a rich variety of dynamical behaviour and has the ability to perform a number of potentially useful functions. Considering, for example, Figure 3 (the results of steady inputs), we can see that the motif can pass through the inputs as outputs (variant A), act as a logical AND gate (variant C) or act as an OR gate (variant C'). From this table of responses to the simplest dynamical stimulation we can already see that it is very hard to draw any firm conclusions about the role of a bi-fan motif without knowing a great deal about its internal structure. Furthermore, the simple distinction of coherent or incoherent is also insufficient, as we can see from the opposing behaviours of C and C'. There is also great variation in the detailed dynamical response, as we have seen from Figures 5 and 7, with differences in total expression levels, steepness of response and timing of peak expression. It must be noted that the majority of the variation in behaviour is not unexpected, and arises as a consequence of the parameters used; however, this only exacerbates the difficulties of trying to think of, and use, motifs as higher level "functional modules".

Conclusion
Analysis of many networks in a large number of scenarios, from biological and social networks to technological networks, has revealed the presence of motifs, simple patterns which occur with a greater than expected frequency. In the context of biological networks, and specifically transcription regulatory networks, it has been argued [9] that motifs have evolved independently, indicating optimal design. It has been suggested that such motifs may represent "computational elements", and in the case of the feed-forward motif, the possibility of the motif acting as a Boolean AND or OR gate has been investigated [4]. Here we have systematically studied a range of dynamical behaviours possible for bi-fan motifs.
It had previously been demonstrated that even for the simpler feed-forward motif there is already a vast range of possible dynamical behaviours. For the bi-fan, which is only slightly more complex, we found again that there is a large range of possible response behaviours. Most notably we observed that entirely opposing behaviours (for example, in the case of the responses of variants C and C') can be elicited depending on the nature and strengths of individual interactions within the motif. We were able to identify a variety of combinations of such interactions in the bi-fan motifs found in the transcriptional network of S. cerevisiae. In particular, we found examples of both coherent and incoherent architectures. This suggests that simply identifying the presence of particular motifs, without a detailed experimental evaluation of their respective dynamics, is unlikely to offer much insight into the functional properties of real transcriptional networks. In essence this means that knowing the structure of a network, or an inventory of the discrete modules making up that structure, does not provide enough information to predict how functional processes occur or how biochemical reactions proceed in a biological system. Admittedly, our analysis does not take into account the full complexity of real biological bi-fan motifs. This, however, makes the interpretation of bi-fan motif occurrences in nature even more difficult: if simple mathematical models can already demonstrate such different types of behaviour then it is likely that real bi-fan motifs exhibit an even richer repertoire of behaviour. One should also remember that motifs are themselves generally only artificially identified local structures; there is no good reason to believe that their dynamics can necessarily lead to a modularization in understanding the behaviour of transcription networks.
Many of the databases used to mine for motifs are based on yeast two-hybrid experimental data, which gives no indication of whether elements of identified structures are active at the same time. Furthermore, we usually lack information about the behaviour of the input signals, which is essential to understand the relationship of the motif to the network as a whole. Thus, we can conclude that simply knowing the connection structure of this motif is insufficient to give much insight into its function or even its dynamical response. In order to do this, we would need much more detailed experimental information about binding and unbinding rates and other kinetic information. Currently, such experimental data is difficult to determine in a wholesale fashion, making the large scale analysis of the function of transcription networks very problematic.

Methods
Following [5] we only consider deterministic models and hence use systems of ordinary differential equations (ODEs) to describe the time evolution of the number of molecules of the various constituent species of a bi-fan motif. In many cases the copy numbers of some of these (e.g. transcription factors) will be too low for ODEs to capture the dynamics of the real system. In such cases we can use a stochastic model which can be simulated using, for instance, the Gillespie algorithm [10]. However, [5] found that for the feed-forward motif the average behaviour of an ensemble of such simulations is well characterised by ODEs describing the mean behaviour. We confirmed this for the bi-fan motif (data not shown). Following the custom in the literature for feed-forward motifs, we refer to the bi-fan network in which all interactions act directly to promote expression as "coherent", and use the term "incoherent" otherwise. We analyse the dynamical response of the motif by observing the expression levels of the proteins in the motif.
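The relationship between a stochastic simulation and the ODE for the mean, mentioned above, can be illustrated on the simplest possible system: one species produced at a constant rate and degraded with first-order kinetics. The Python sketch below applies the Gillespie algorithm to this birth-death process and compares the ensemble average with the ODE solution. It is an illustration only and not the bi-fan model; the function names and the rate values `k` and `gamma` are assumptions.

```python
import math
import random


def gillespie_birth_death(k, gamma, t_end, rng):
    """Gillespie simulation of 0 -> P (rate k) and P -> 0 (rate gamma * n).

    Starts from n = 0 and returns the copy number at time t_end.
    """
    t, n = 0.0, 0
    while True:
        total = k + gamma * n            # total propensity
        t += rng.expovariate(total)      # waiting time to the next reaction
        if t > t_end:
            return n
        if rng.random() * total < k:     # choose which reaction fires
            n += 1
        else:
            n -= 1


def ode_mean(k, gamma, t):
    """Solution of dn/dt = k - gamma * n with n(0) = 0."""
    return (k / gamma) * (1.0 - math.exp(-gamma * t))


if __name__ == "__main__":
    k, gamma, t_end = 0.5, 0.01, 1000.0   # assumed rates, per second
    rng = random.Random(0)
    runs = [gillespie_birth_death(k, gamma, t_end, rng) for _ in range(200)]
    print(sum(runs) / len(runs), ode_mean(k, gamma, t_end))
```

For this linear system the ensemble mean follows the ODE exactly in expectation; for nonlinear binding kinetics the agreement is only approximate and has to be checked, as was done for the feed-forward motif in [5].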
We assume that the motif is initially at equilibrium with zero input so that in the absence of basal transcription none of the proteins in the motif are initially expressed. We then stimulate the system at one or both inputs for varying periods and at varying strengths, corresponding to different dynamical situations. Our models of transcription regulation were translated into systems of approximately 30 ordinary differential equations [see Additional file 1], which were subsequently solved using Mathematica (Wolfram Research, Urbana-Champaign, Illinois).

Additional File 1. Supplementary material with details of the differential equations used to model a coherent bi-fan motif in which there is full cooperativity.

The bi-fan model
The bi-fan motif, see Figure 1, consists of four regulatory systems, denoted as X, Y, Z and W. It is necessary to represent each of the systems in a biologically relevant way and with realistic parameter choices. The model used here follows that of [5], in that each of the systems is composed of a transcriptional part whereby one or more transcription factors bind to promoter regions and regulate the production of mRNA, as well as a translational part whereby the mRNA is translated into protein, which may act as a transcription factor for another regulatory system. The model is certainly not the simplest available: we could instead have modelled the motif as a Boolean network or with weighted functions between nodes. However, we believe that to do so would not take advantage of the knowledge that has been gained in the study of the physical mechanism of gene regulation. Furthermore, the model we use has the advantage that there is considerable experimental data available to justify the choice of rate constants.
Equally, this model does not attempt to model the many hundreds of intermediate steps involved in each process such as transcription, as the steps introduced would necessarily have been arbitrary, and without experimental data to justify rate constants. The model attempts to describe the process of gene regulation from transcription factor binding to protein production in a physically reasonable way. Each system, for example X, is represented as having a section of DNA DX which codes for the mRNA MX. First, the transcription factors, which may either be proteins produced by one of the systems in the motif or be supplied by an external signal, bind to the promoter region to form a complex QX. RNA polymerase molecules RX then bind to this complex as they read the DNA, forming a second complex.

Modelling the interactions
The detailed modelling of the interactions between the two tiers of proteins can be carried out in a number of ways, all of which will have an effect on the resultant dynamics.

Co-operative binding – order unimportant
In this case, considered in the paper [5], both transcription factors act to promote transcription; however, they are both needed for transcription to take place. This is modelled by introducing an intermediate species, T. The equations for this model are then as follows:

Co-operative binding – order important

Independent promoters – order unimportant
This is a simpler case than the above, and is simply the case in which both transcription factors act to promote transcription independently. The equations are then as follows:

Promoter/repressor combination
In this case the repressor sequesters DNA, making it unavailable for the promoter to bind.

Promoter/repressor combination – binding order important
In this case, the binding order of PX and PY to the DNA for gene W now plays a role.
If PY binds first it will block the production of PW, perhaps by altering the conformation of the binding site; if, however, PX binds first then PY is still required for production of the protein. This has the effect of making PY a repressor when it binds first to the regulation site. Examples of where this order specific behaviour may occur include the effect of the transcription factor p53 on chromatin structure [11]. Other papers which discuss the importance of binding order include [12] and [13]. We see the effect that this can have on motif behaviour in Figure 8, where the two graphs correspond to high IX and low IY in (a), and the reverse in (b). Here the motif is apparently able to differentiate between signal combinations.
Illustrative case
With these considerations in mind, we give the full model for variant C, the coherent motif (the other cases are similar). The basic model for X and Y, where IX and IY represent the amounts of externally produced transcription factors:

For protein X:

For protein Y:

In the case in which both transcription factors act as promoters, and in which binding order is unimportant, that is, Equation (1), the equations for proteins Z and W become:

The parameters used for the rates of transcription and translation are based on the derivations in [14], and were obtained from experimentally determined rates [15]. The values used for the basic model are shown in Table 2.

Table 2. Kinetic parameters used in the model. Other parameters not given here are low number multiples of the corresponding parameter, and were altered as described in the text.

Authors' contributions
PJI performed the analysis. PJI, MPHS and JS designed the study and jointly wrote the paper.

Appendix: motifs and networks
A network or a graph is a mathematical object consisting of a set V of vertices, and a set E of edges connecting vertices. If the edges have arrows or directions then the network is called a directed network. Many different types of data and relationships may be represented in this way. Examples of undirected networks are networks of pairs of proteins which are known to interact. The networks considered in this paper have as their vertices genes which regulate or are regulated by the product of other genes. The edges then represent the relationship of control, and therefore the network is a directed network, with the direction of the edges indicating the direction of control. Figure A1 is an example of a directed network. Motifs, introduced in [1], are small sub-networks or patterns of vertices and edges which occur within the network. A motif is considered to be interesting if it occurs unusually frequently in the network.
In Figure A1 a (feed-forward) motif is highlighted in colour.

Appendix figure
A directed network with a feed-forward motif highlighted in colour. In the case of a transcription regulation network the arrows indicate the direction of transcriptional control.

Acknowledgements
We thank the Wellcome Trust for a research studentship (PJI) and a research fellowship (MPHS). We would also like to thank the reviewers for their helpful comments.

References
Figure 1.
Figure 2.
Figure 3.
Figure 4.
Figure 5.
Figure 6.
Figure 7.




Figure 8.



