Abstract
Background
Probabilistic Boolean Networks (PBNs) are a popular model for studying genetic regulatory networks. An important and practical problem is to find an optimal control policy for a PBN that prevents the network from entering undesirable states. A number of studies have addressed this problem with dynamic programming-based (DP) methods. However, because of the high computational complexity of PBNs, the DP method is computationally inefficient for large networks. It is therefore natural to seek approximation methods.
Results
Inspired by state reduction strategies, we consider using dynamic programming in conjunction with a state reduction approach to reduce the computational cost of the DP method. Numerical examples are given to demonstrate both the effectiveness and the efficiency of the proposed method.
Conclusions
Finding the optimal control policy for PBNs is meaningful. The proposed problem has been shown to be . By taking the state reduction approach into consideration, the proposed method can reduce the computational time of the dynamic programming-based algorithm. In particular, the proposed method is effective for larger networks.
Background
An important goal in studying genetic regulatory networks is to understand gene behavior and to develop optimal control policies for potential applications in medical therapy. While many models have been proposed for gene regulatory networks, Boolean Networks (BNs) [1-3] and their extension, Probabilistic Boolean Networks (PBNs) [4], have received much attention because they form a class of models that can capture the logical interactions of genes and are also effective in modeling pathways for drug discovery [5]. Recent applications in medical treatment for Parkinson's disease can also be found in [6]. A PBN can be considered as a collection of BNs driven by a Markov chain, and therefore its dynamics and behavior can be studied by using Markov chain theory. For reviews on BNs and PBNs, we refer interested readers to [7-9] and the references therein.
Many methods from control theory are available for the intervention of PBNs. A gene control model has been proposed in [10]. The control model is formulated as a mixed integer programming problem and it aims at driving the PBN from undesirable states to desirable ones. A class of PBN control problems with hard constraints has been proposed in [11,12]. The motivation of the control model is to reduce the side effects of medical treatment. In [11], hard constraints are included in the optimal control problem, and an approximation method is then proposed in [12] to obtain the optimal controls efficiently.
Datta et al. [13] proposed an external intervention method based on optimal control theory. In their work, genes are classified as internal nodes and external nodes (control nodes). One can drive the values of the internal nodes in a desirable manner by controlling the values of certain external nodes. By defining a control cost for each control input and a terminal cost for each state, the problem is to find a sequence of control inputs that leads the network into desirable states at the terminal step with minimum average cost. The classical technique of dynamic programming is then employed to solve the optimal control problem.
Chen et al. [14] then considered an external intervention problem based on optimal control theory and dynamic programming. Given the terminal cost of each state, the objective is to apply external controls so that the maximum terminal cost the network can incur is minimized. The problem is important from the viewpoint of medical therapy because one would like to minimize the damage to patients/organisms even in the worst case. They proved that both minimizing the maximum cost and minimizing the average cost are . A dynamic programming-based algorithm was then proposed for finding a control sequence that minimizes the maximum cost in the control of a PBN. The above dynamic programming-based methods still have high computational complexity, since the size of the underlying transition probability matrix increases exponentially with the number of nodes in the PBN. A possible remedy is to consider a network reduction approach.
Several reduction methods have been proposed recently. In [15], a CoD-based reduction algorithm is introduced. The Coefficient of Determination (CoD) helps to evaluate the influence of a candidate node for deletion on the target node and to find the optimal candidate node for deletion. The proposed algorithm preserves the attractor structure and long-run dynamics of the original network well.
Qian et al. [16] proposed a state reduction method that deletes states directly. Instead of deleting nodes in a network, they delete the outmost states, which have less influence on the network. Here we consider a transition probability-based reduction strategy. This strategy is easy to implement as we do not need to compute the stationary distribution of the PBN beforehand.
We consider the problem of minimizing the maximum cost in the control of a PBN, and we employ the transition probability-based reduction strategy to reduce the network complexity of a PBN. We show that under some condition and in many of our numerical examples, the optimal control sequence obtained from the reduced network is the same as the one in the original network. Then we apply the dynamic programming-based algorithm to the reduced network. The computational complexity of the dynamic programming-based algorithm when applied to the original network is O(2^{n}) (depending on the number of network states) when the number of control nodes m and the number of steps M are fixed. When our state reduction method is applied, the computational complexity is reduced to O(|R|), where R is the set of states after reduction.
The remainder of the paper is structured as follows. We first give a brief review of PBNs and the dynamic programming method. We then introduce our state reduction approach together with some theoretical results to support it. Numerical examples are given to demonstrate both the effectiveness and the efficiency of our proposed method. Finally, some discussion is given to conclude the paper.
A brief review on BNs and PBNs
A BN consists of a set of n nodes (genes) {v_{1}, v_{2},..., v_{n}}, v_{i} ∈ {0,1}, and a set of Boolean functions {f_{1}, f_{2},..., f_{n}}. Here v_{i}(t) denotes the state of node i at time t. The rules of regulatory interactions among nodes are represented by the Boolean functions: v_{i}(t + 1) = f_{i}(v_{i_1}(t), v_{i_2}(t),..., v_{i_k}(t)), where {v_{i_1}, v_{i_2},..., v_{i_k}} are the input nodes of f_{i}, called the parent nodes of node v_{i}. We define IN(v_{i}) = {v_{i_1}, v_{i_2},..., v_{i_k}}. The number of parent nodes of v_{i} is called the indegree of v_{i}. The largest indegree among {v_{1}, v_{2},..., v_{n}} is called the maximum indegree of the BN and is denoted by K.
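As a small illustration of the update rule above, the following Python sketch steps a 3-node BN synchronously. The particular Boolean functions, the 0-based node indices and the helper name `bn_step` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[int, ...]

def bn_step(state: State,
            parents: Sequence[Sequence[int]],
            functions: Sequence[Callable[..., int]]) -> State:
    """Apply v_i(t+1) = f_i(v_{i_1}(t), ..., v_{i_k}(t)) to every node at once."""
    return tuple(f(*(state[p] for p in ps)) for f, ps in zip(functions, parents))

# Hypothetical 3-node network (0-based node indices):
#   v1(t+1) = v2(t) AND v3(t),  v2(t+1) = NOT v1(t),  v3(t+1) = v2(t)
parents = [(1, 2), (0,), (1,)]
functions = [lambda a, b: a & b, lambda a: 1 - a, lambda a: a]

max_indegree = max(len(ps) for ps in parents)  # K in the text

state = (1, 0, 1)
trajectory = [state]
for _ in range(3):
    state = bn_step(state, parents, functions)
    trajectory.append(state)
```

Here `parents[i]` plays the role of IN(v_{i}), and `max_indegree` corresponds to K.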
Since a BN is a deterministic model, a stochastic model is preferable in view of the measurement noise involved in inferring a gene regulatory network. A stochastic version of the BN, the PBN [4,9], was introduced to cope with this weakness. A PBN can be regarded as an extension of a BN to a probabilistic setting. In a PBN, each node v_{i} has a set of Boolean functions:
$$F_i = \{ f_1^{(i)}, f_2^{(i)}, \ldots, f_{l(i)}^{(i)} \}. \qquad (1)$$
The state of v_{i} at time t + 1 is predicted by one of the Boolean functions in (1), where f_{j}^{(i)} is chosen with selection probability c_{j}^{(i)}. Here
$$\sum_{j=1}^{l(i)} c_j^{(i)} = 1.$$
A PBN can be regarded as a finite collection of BNs over a fixed set of nodes, where each BN has a fixed set of Boolean functions, one for each node. The BN having Boolean function set f_{j} (j = 1,2,...,N) is called the jth BN. At each time step t, the selection of the Boolean functions is assumed to be independent across nodes, so the selection probability q_{j} of the jth BN is the product of the selection probabilities of its n Boolean functions, and the states {v_{1}(t + 1), v_{2}(t + 1),..., v_{n}(t + 1)} are predicted by the Boolean function set f_{j}. We now introduce the decimal representation of states. Suppose the current state is {v_{1}(t), v_{2}(t),..., v_{n}(t)}; we define
$$w(t) = 1 + \sum_{i=1}^{n} 2^{n-i} v_i(t).$$
Since v_{i}(t) ∈ {0,1}, w(t) can take any integral value in [1, 2^{n}].
The dynamics of a PBN can be studied by using Markov chain theory, see for instance [17]. The one-step transition probabilities can be collected in the transition probability matrix A, where each entry A_{ij} is given by
$$A_{ij} = \sum_{k \in I_{ij}} q_k.$$
Here i = w(t + 1), j = w(t), q_{k} is the selection probability of the kth BN, and I_{ij} is the set of BNs under which the network enters state i from state j. We remark that A is a column stochastic matrix, i.e., \sum_{i=1}^{2^{n}} A_{ij} = 1 for every j.
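To make the construction of A concrete, the following Python sketch assembles the transition matrix of a small PBN from its constituent BNs. It is a minimal sketch under stated assumptions: the decimal index treats v_{1} as the most significant bit, the two constituent BNs and their selection probabilities are hypothetical, and the function names `encode` and `transition_matrix` are introduced here only for illustration.

```python
from itertools import product

def encode(bits):
    """Decimal index in 1..2^n, taking v_1 as the most significant bit (assumed)."""
    n = len(bits)
    return 1 + sum(b << (n - 1 - i) for i, b in enumerate(bits))

def transition_matrix(n, networks, probs):
    """Column-stochastic A with A[i][j] = P(next state = i+1 | current state = j+1).

    networks: constituent BNs, each a function mapping a bit tuple to a bit tuple.
    probs: selection probability of each constituent BN (should sum to 1).
    """
    size = 2 ** n
    A = [[0.0] * size for _ in range(size)]
    for bits in product((0, 1), repeat=n):
        j = encode(bits) - 1
        for net, q in zip(networks, probs):
            i = encode(net(bits)) - 1
            A[i][j] += q          # accumulate over all BNs that send j to i
    return A

# Hypothetical 2-gene PBN made of two BNs.
bn1 = lambda v: (v[1], v[0])        # swap the two genes
bn2 = lambda v: (v[0], 1 - v[0])    # both genes driven by gene 1
A = transition_matrix(2, [bn1, bn2], [0.7, 0.3])
column_sums = [sum(A[i][j] for i in range(4)) for j in range(4)]
```

Each column of `A` sums to one up to rounding, which is the column stochastic property remarked on above.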
A review on dynamic programming
In this section, we first introduce several definitions to facilitate the discussion. We then introduce the dynamic programming-based algorithm. Suppose a PBN has a set of internal nodes {v_{1},v_{2},...,v_{n}}, which is the same as the node set defined in the previous section, and a set of external nodes (control nodes) {v_{n+1},v_{n+2},...,v_{n+m}}. At time t + 1, the state of each v_{i}, i = 1,2,..., n, is predicted by one of its Boolean functions, whose input nodes v_{i_k} can be either internal or external nodes. This provides a possible way of intervening in the states of the internal nodes by controlling the values of the external nodes. To facilitate our discussion, we use the decimal representation of the internal nodes, z_{t} ∈ {1,2,...,2^{n}}, as the state of the network at time t, and we define the control input u_{t} as the corresponding decimal representation of the values of the external nodes.
Here we are interested in the following problem: Minimizing the maximum cost in control of PBN.
Given the terminal cost C(z_{M}) for each state z_{M} ∈ {1,2,...,2^{n}} at the terminal time step M, find a sequence of control inputs u_{0}, u_{1},..., u_{M} such that, starting from the given initial state, the network enters a state at time step M for which the maximum cost is minimized. In [14], a dynamic programming-based method is proposed for the above problem:
Step 0: Set t = M;
J(z_{M}, h_{M}) = C(z_{M}) for all h_{M} ∈ {0,..., M}.
Step 1: t := t - 1.
Step 2: For any z_{t}∈ {1,..., 2^{n}} and h_{t}∈ {0,..., M}, compute
and
Step 3: If t >0, go back to Step 1; Otherwise, stop.
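The backward recursion of Steps 0 to 3 can be sketched in Python as follows. This is a simplified illustration and not the exact algorithm of [14]: it assumes one column-stochastic transition matrix per control input, takes the worst case over all successor states reachable with positive probability, and omits the bookkeeping variable h_{t}. The function name `minimax_dp` and the 3-state example are hypothetical.

```python
def minimax_dp(matrices, terminal_cost, horizon):
    """Backward recursion for the min-max terminal cost (simplified sketch).

    matrices[u][i][j] > 0 means control input u can move state j to state i
    (0-based indices, column-stochastic convention).
    Returns (J, policy): J[t][j] is the optimal worst-case cost from state j at
    time t, and policy[t][j] is a control input that attains it.
    """
    size = len(terminal_cost)
    J = [None] * (horizon + 1)
    policy = [None] * horizon
    J[horizon] = list(terminal_cost)              # Step 0: J(z_M) = C(z_M)
    for t in range(horizon - 1, -1, -1):          # Step 1: t := t - 1
        J[t], policy[t] = [], []
        for j in range(size):                     # Step 2: every current state
            best_cost, best_u = None, None
            for u, A in enumerate(matrices):
                # Worst case over successors reachable with positive probability.
                worst = max(J[t + 1][i] for i in range(size) if A[i][j] > 0)
                if best_cost is None or worst < best_cost:
                    best_cost, best_u = worst, u
            J[t].append(best_cost)
            policy[t].append(best_u)
    return J, policy                              # Step 3: stop once t = 0

# Hypothetical example: 3 states, 2 control inputs, C(z) = z, M = 2.
A0 = [[0.5, 0.0, 0.0],
      [0.0, 1.0, 0.0],
      [0.5, 0.0, 1.0]]
A1 = [[0.0, 1.0, 0.0],
      [1.0, 0.0, 0.5],
      [0.0, 0.0, 0.5]]
J, policy = minimax_dp([A0, A1], [1, 2, 3], 2)
```

For fixed numbers of control inputs and time steps, the work per time step is proportional to the number of states times the cost of scanning a column, which is why shrinking the state set pays off directly.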
The state reduction approach
In this section we propose our state reduction method.
Transition probability-based state reduction strategy
Due to the high network complexity of a PBN, one has to deal with matrices whose size increases exponentially with the number of internal nodes. Network reduction is therefore an important issue to be addressed. In [16], a transition probability-based state reduction strategy is proposed. In a PBN, we consider all attractor states and the initial state as critical states, and they are preserved during state reduction. A state i can be deleted if the following condition is satisfied:
$$\sum_{j=1}^{2^n} A_{ij} \le \xi, \qquad (5)$$
where ξ > 0 is a parameter to be predetermined. The value of ξ depends on the perturbation probability and it is usually not large. When we consider PBNs without perturbation, Equation (5) can be rewritten as
$$\sum_{j=1}^{2^n} A_{ij} = 0,$$
which means that the network will never enter state i from other states. Hence, deleting state i will not influence the steady-state distribution of the network.
The dynamic programming-based algorithm on the reduced network
Since the computational complexity of the dynamic programming algorithm is O(2^{n}) when the number of control nodes m and the number of steps M are fixed, using state reduction may reduce the computational complexity to O(|R|), where R is the set of states after reduction. We have the following proposition.
Proposition 1 The result of the dynamic programming-based algorithm on the reduced network is the same as that on the original network.
It is straightforward to see that, starting from the initial state, the network will never enter the transient states to be deleted. Therefore the network will never stop at those states at the terminal time step. This means that the deleted states are not included in the optimal route, and the costs of the deleted states are not counted. Hence deleting these transient states does not influence the result obtained from the DP method when applied to the reduced network.
Based on the transition probability-based strategy, one can iteratively delete those transient states until all the remaining states are critical states. In each step, we need to update the transition matrix for the reduced network by deleting the corresponding row and column from the transition matrix. After the reduction, one obtains a reduced network with a set of states R and an |R|-by-|R| transition probability matrix B. Then we can apply the dynamic programming-based algorithm to the reduced network. In the following, we give a theoretical result on the reduction method when the indegree of the network is one.
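The iterative deletion described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: a non-critical state is deleted when its total inflow from the states still present does not exceed ξ, the loop stops when no such state remains, and the function name `reduce_states` and the 5-state chain are hypothetical.

```python
def reduce_states(A, critical, xi=0.0):
    """Iteratively delete non-critical states whose total inflow is <= xi.

    A is column-stochastic with A[i][j] = P(j -> i); `critical` holds 0-based
    indices (attractor states and the initial state) that must be kept.
    Returns the list of surviving states and the reduced matrix B.
    """
    keep = list(range(len(A)))
    changed = True
    while changed:
        changed = False
        for i in list(keep):
            if i in critical:
                continue
            inflow = sum(A[i][j] for j in keep)   # row sum over surviving states
            if inflow <= xi:
                keep.remove(i)                    # drop row i and column i
                changed = True
    B = [[A[i][j] for j in keep] for i in keep]
    return keep, B

# Hypothetical deterministic chain 0 -> 1 -> 2 -> 3 -> 4 -> 4.
# State 4 is an attractor and state 2 is the initial state.
A = [[0.0] * 5 for _ in range(5)]
for j, i in enumerate([1, 2, 3, 4, 4]):
    A[i][j] = 1.0
keep, B = reduce_states(A, critical={2, 4})
```

In this example state 0 is removed first, after which state 1 has no remaining inflow and is removed as well. With ξ = 0 no probability mass is lost, so the columns of `B` still sum to one.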
An analysis of the reduction method when indegree K = 1
In a PBN of n genes, there are in total 2^{n} states in the network. When K = 1, each gene is controlled by only a single gene. Table 1 gives an example when the number of genes is two. It is straightforward to compute the number of all possible BNs, which is 4 × 4 = 16.
Table 1. All the possible BNs for 2 genes when K = 1
In general, we can also compute the number of all the possible BNs for n genes: (2n)^{n}. For example, from Table 1, one can calculate there are totally (2 × 2)^{2 }= 16 networks with 2^{2 }× 2^{2 }sizes. When every row contains 1, it means the number of nonzero rows is 2^{2}. To satisfy this condition, we have to choose 2 genes as parent genes and consider every gene has two possible states. Thus, we can deduce that the number of such networks is where . But when the number of nonzero rows is 2^{1}, we just select only one gene as the parent gene and the corresponding selected possibilities are where . Since for each gene, there exist two states to be selected. Therefore, the total number of such networks is . In determination of the linear combination of BNs for construction of PBN, the intrinsic structure of BNs plays an utmost role. Here we study the distribution of nonzero rows in BNs and we give the following distribution theory.
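The counts discussed above can be checked by brute force for small n. The following Python sketch enumerates all K = 1 BNs, assuming that each gene either copies or negates its single parent (which gives the (2n)^{n} networks counted in the text), and tallies them by the number of nonzero rows of the transition matrix. The function name `nonzero_row_distribution` is hypothetical.

```python
from itertools import product
from collections import Counter

def nonzero_row_distribution(n):
    """Count K = 1 BNs on n genes by the number of nonzero rows.

    Each gene picks one parent (n choices) and copies or negates it
    (2 choices), giving (2n)^n networks in total.
    """
    states = list(product((0, 1), repeat=n))
    rules = list(product(range(n), (0, 1)))       # (parent index, negate flag)
    counts = Counter()
    for network in product(rules, repeat=n):
        # A row is nonzero exactly when its state is the image of some state.
        images = {tuple(s[p] ^ neg for p, neg in network) for s in states}
        counts[len(images)] += 1
    return dict(counts)

dist2 = nonzero_row_distribution(2)   # 2 genes: 16 networks
dist3 = nonzero_row_distribution(3)   # 3 genes: 216 networks
```

For two genes, 8 of the 16 networks have 2^{2} nonzero rows and 8 have 2^{1}. For three genes, 3! × 2^{3} = 48 of the 216 networks have no zero row.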
Proposition 2 When the indegree of a BN is one, the distribution of zero row is given in Table 2. Moreover, the probability of getting a BN having no zero row decreases to zero at a fast rate of as the number of genes n increases to infinite.
Table 2. Distribution of number of nonzero rows in BNs when K = 1
In Table 2, when the number of zero rows is 0, i.e., there is no zero row, there are n!2^{n} such BNs. This means that after a transition, all the states can still be visited. In counting the BNs satisfying this condition, we should ensure that the n genes have n distinct parent nodes. Therefore it is easy to deduce that there are n!2^{n} BNs having no zero row. In fact, if we define a function F_{dis} mapping the number of nonzero rows in a BN to the number of parent nodes for an n-gene set, we have
$$F_{dis}(2^{n-k}) = n - k, \qquad k = 0, 1, \ldots, n-1.$$
Therefore, to compute the number of BNs whose number of nonzero rows is 2^{n-k}, one should select n - k out of n genes as parent nodes, which is the reason for the binomial factor \binom{n}{n-k}. Since the n - k parent nodes fill n positions, we should take all possible selection patterns into account. This gives the double summation part for calculating the number of BNs when the number of nonzero rows is 2^{n-k}, k = 1,2,..., (n - 1).
Furthermore, since there are 2^{n} states for n genes, the number of rows in the transition matrix of a BN is 2^{n}. One can observe that as n increases, the ratio of the number of BNs with no zero row to the total number (2n)^{n} of BNs decreases fast because
$$\frac{n!\,2^n}{(2n)^n} = \frac{n!}{n^n} \sim \sqrt{2\pi n}\, e^{-n} \to 0 \quad \text{as } n \to \infty.$$
Hence this guarantees the efficiency of state reduction.
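To give a feeling for how quickly this ratio shrinks, the following short Python sketch evaluates n!2^{n}/(2n)^{n} exactly for a few values of n. The helper name `full_row_ratio` and the chosen values of n are hypothetical.

```python
from fractions import Fraction
from math import factorial

def full_row_ratio(n):
    """Exact ratio n! * 2^n / (2n)^n, which simplifies to n! / n^n."""
    return Fraction(factorial(n) * 2 ** n, (2 * n) ** n)

ratios = {n: float(full_row_ratio(n)) for n in (2, 4, 8, 12)}
```

The ratio is 1/2 for n = 2 and 3/32 for n = 4, and it is already below 10^{-4} for n = 12.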
State reduction for PBN with random perturbation
In this section we discuss the state reduction strategy for PBNs with random perturbation. Let p be the perturbation probability of a single gene (flipping the value of the gene from 1 to 0 or from 0 to 1). Suppose the current state is v(t); then the state at the next time step is determined by the transition matrix without perturbation, A, with probability (1 - p)^{n}, or by random perturbation with probability 1 - (1 - p)^{n}. Therefore the transition matrix with perturbation is given by
$$\tilde{A} = (1-p)^n A + P,$$
where P is the perturbation matrix [18]:
$$P_{ij} = \begin{cases} p^{h(i,j)} (1-p)^{n-h(i,j)}, & i \ne j, \\ 0, & i = j, \end{cases}$$
where h(i,j) is the Hamming distance between the binary representations of states i and j.
To carry out the state reduction strategy, we need to delete all the states which can only be entered by random perturbation. Here we set the threshold ξ to the row sum of P: ξ = 1 - (1 - p)^{n}. If for some state i the inequality
$$\sum_{j=1}^{2^n} \tilde{A}_{ij} \le \xi$$
is satisfied, then we can delete the state.
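The perturbed case can be sketched in Python as follows. The sketch rests on assumptions that should be checked against [18]: the perturbation matrix has entries p^{h}(1 - p)^{n-h} off the diagonal, where h is the Hamming distance between two states, the perturbed matrix is (1 - p)^{n} A + P, and a state is deletable when its row sum does not exceed ξ = 1 - (1 - p)^{n}. All function names and the 2-gene example are hypothetical.

```python
def perturbation_matrix(n, p):
    """P[i][j] = p^h (1-p)^(n-h) for i != j, where h is the Hamming distance
    between the n-bit codes of states i and j; the diagonal is zero."""
    size = 2 ** n
    P = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i != j:
                h = bin(i ^ j).count("1")
                P[i][j] = p ** h * (1 - p) ** (n - h)
    return P

def perturbed_matrix(A, n, p):
    """Transition matrix with perturbation, assumed to be (1-p)^n * A + P."""
    P = perturbation_matrix(n, p)
    keep_prob = (1 - p) ** n
    size = 2 ** n
    return [[keep_prob * A[i][j] + P[i][j] for j in range(size)]
            for i in range(size)]

def deletable_states(A_tilde, n, p, critical=(), tol=1e-12):
    """Non-critical states whose row sum does not exceed xi = 1 - (1-p)^n."""
    xi = 1 - (1 - p) ** n
    return [i for i, row in enumerate(A_tilde)
            if i not in critical and sum(row) <= xi + tol]

# Hypothetical 2-gene BN: 00 -> 01, 01 -> 01, 10 -> 10, 11 -> 10 (0-based codes).
A = [[0.0] * 4 for _ in range(4)]
for j, i in enumerate([1, 1, 2, 2]):
    A[i][j] = 1.0
A_tilde = perturbed_matrix(A, 2, 0.01)
removable = deletable_states(A_tilde, 2, 0.01)
```

In this example states 00 and 11 can only be entered through perturbation, so their row sums equal ξ and they are flagged for deletion. The small tolerance `tol` guards against floating-point rounding in the comparison.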
Table 3 gives the reduction rates (percentage of states deleted after network reduction) for PBNs with random perturbation. In the experiment, each PBN has 4 BNs, and the maximum indegree is K = 2. We consider the cases p = 0.001, 0.002, 0.005, 0.01 and n = 6, 8, 10, 12. For each case, we perform the simulation 10 times and report the average results. From Table 3, one can see that more rows (states) can be deleted when the perturbation probability p increases.
Table 3. Reduction Rates for PBN with Gene Perturbations
Results and discussions
In this section, we give some numerical examples to compare the result of the dynamic programming-based algorithm on the reduced network with that on the original network.
A 6-gene example
We first consider a 6-node example. We consider the cases m = 1,2, N = 2,4,8 and K = 2,3. The Boolean function sets of the PBNs are randomly generated. We let M = 20 and C(z_{M}) = z_{M}. When m = 1, there are 5 internal nodes and 1 control node. The original network size is 2^{5}. When m = 2, there are 4 internal nodes and 2 control nodes. The original network size is 2^{4}. Table 4 gives the numerical results for PBNs without perturbation. Table 5 gives the numerical results for PBNs with random perturbation. The second column gives the network size before and after reduction. The third column gives the minimized maximum cost obtained by using the dynamic programming-based algorithm on the original and the reduced network. The last column records the CPU time of the dynamic programming-based algorithm before and after reduction.
Table 4. A 6-Node Example for PBN Without Perturbation
Table 5. A 6-Node Example for PBN With Random Perturbation
A 12-gene example
We then consider a 12-node example. We consider the cases m = 1,2, N = 2,4,8 and K = 2,3. Again the Boolean function sets of the PBNs are randomly generated. We let M = 40 and C(z_{M}) = z_{M}. When m = 1, there are 11 internal nodes and 1 control node. The original network size is 2^{11}. When m = 2, there are 10 internal nodes and 2 control nodes. The original network size is 2^{10}. Table 6 gives the numerical results for PBNs without perturbation. Table 7 gives the numerical results for PBNs with random perturbation. We see that our proposed reduction method is both efficient and effective.
Conclusions
From the experimental results, one can see that applying the dynamic programming-based algorithm to the reduced network can reduce the computational cost. The performance of the algorithm on the reduced network depends on the parameters n, m, N and K. For n = 6, Table 4 and Table 5 show that, in general, there are some improvements in computational time when the reduction method is applied. For n = 12, Table 6 and Table 7 indicate that when the number of nodes is large and K = 2, the algorithm on the reduced network performs much better than the one on the original network. Therefore, our proposed method is effective for larger networks. Future research will address the statistical analysis of the distribution of zero rows in the transition matrix in terms of n. Moreover, we will keep exploring ways of reducing the computational complexity of intervention strategies.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
XC came up with the idea. XC and WKC designed the research. XC, HJ and YQ performed the research and analyzed the results. XC, HJ and WKC wrote the paper. All authors read and approved the final manuscript.
Acknowledgements
The authors would like to thank the three anonymous referees for their helpful and encouraging comments and suggestions. A preliminary version was presented at the IEEE Conference on Systems Biology (ISB), Zhuhai, China, and published in the conference proceedings [19]. Research supported in part by GRF grant and HKU CERG Grants, National Natural Science Foundation of China Grant No. 10971075 and Guangdong Provincial Natural Science Grant No. 9151063101000021.
This article has been published as part of BMC Systems Biology Volume 6 Supplement 1, 2012: Selected articles from The 5th IEEE International Conference on Systems Biology (ISB 2011). The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/6/S1.
References

1. Kauffman S: Metabolic Stability and Epigenesis in Randomly Constructed Genetic Nets. Journal of Theoretical Biology 1969, 22:437-467.
2. Kauffman S: Homeostasis and Differentiation in Random Genetic Control Networks. Nature 1969, 224:177-178.
3. Kauffman S: The Origins of Order: Self Organization and Selection in Evolution. Oxford University Press; 1993.
4. Shmulevich I: Probabilistic Boolean Networks: the Modeling and Control of Gene Regulatory Networks. Philadelphia: Society for Industrial and Applied Mathematics 2010.
5. Watterson S, Marshall S, Ghazal P: Logic Models of Pathway Biology. Drug Discovery Today 2008, 13:447-456.
6. Ma Z, Wang J, McKeown M: Probabilistic Boolean Network Analysis of Brain Connectivity in Parkinson's Disease.
7. Akutsu T, Hayashida M, Tamura T: Algorithms for Inference, Analysis and Control of Boolean Networks. Proc 3rd International Conference on Algebraic Biology (AB 2008), Lecture Notes in Computer Science, Austria 2008, 1-15.
8. Bornholdt S: Boolean Network Models of Cellular Regulation: Prospects and Limitations. J R Soc Interface 2008, 5:85-94.
9. Shmulevich I, Dougherty E, Kim S, Zhang W: Probabilistic Boolean Networks: A Rule-based Uncertainty Model for Gene Regulatory Networks. Bioinformatics 2002, 18:261-274.
10. Ng M, Zhang S, Ching W, Akutsu T: A Control Model for Markovian Genetic Regulatory Networks. Transactions on Computational Systems Biology 2006, 4070:36-48.
11. Ching W, Zhang S, Jiao Y, Akutsu T, Tsing N, Wong A: Optimal Control Policy for Probabilistic Boolean Networks with Hard Constraints. IET Systems Biology 2009, 3:90-99.
12. Y Cong NT, Ching W, Leung H: On Finite-Horizon Control of Genetic Regulatory Networks with Multiple Hard-Constraints. BMC Systems Biology 2010, 4(Suppl 2):S14.
13. Datta A, Pal R, Choudhary A, Dougherty E: Control Approaches for Probabilistic Gene Regulatory Networks.
14. Chen X, Akutsu T, Tamura T, Ching W: Finding Optimal Control Policy in Probabilistic Boolean Networks with Hard Constraints by Using Integer Programming and Dynamic Programming. International Journal of Data Mining and Bioinformatics, in press.
15. Ghaffari N, Ivanov I, Qian X, Dougherty E: A CoD-based Reduction Algorithm for Designing Stationary Control Policies on Boolean Networks. Bioinformatics 2010, 26:1556-1563.
16. Qian X, Ghaffari N, Ivanov I, Dougherty E: State Reduction for Network Intervention in Probabilistic Boolean Networks. Bioinformatics 2010, 26:3098-3104.
17. Ching W, Zhang S, Ng M, Akutsu T: An Approximation Method for Solving the Steady-state Probability Distribution of Probabilistic Boolean Networks. Bioinformatics 2007, 23:1511-1518.
18. Xu W, Ching W, Zhang S, Li W, Chen X: A Matrix Perturbation Method for Computing the Steady-state Probability Distributions of Probabilistic Boolean Networks with Gene Perturbations. Journal of Computational and Applied Mathematics 2011, 235:2242-2251.
19. Chen X, Ching W: Finding Optimal Control Policy by Using Dynamic Programming in Conjunction with State Reduction. Proceedings of the IEEE Conference on Systems Biology (ISB) Zhuhai, China 2-4; 2011:274-278. IEEE