The primary metabolism of organisms is very well conserved, whereas secondary metabolism depends on the lifestyle and living environment of the organism. Thus, there is a need to trace the evolution of metabolism in organisms. The aim of this project is to construct a "species-independent" network database of metabolic pathways based on the Enzyme database, which will help us trace the evolution of metabolism for any organism. The benefits of having such a network database are:
• Users will be able to decide how the network should be expressed, i.e., whether or not it should contain the ubiquitous molecules (e.g. H2O, ATP).
• Metabolic pathways can be represented graphically with either enzymes or substrates as nodes, or as a bipartite graph with both enzyme and substrate nodes.
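The two options above can be illustrated with a minimal Python sketch. It is not the project's implementation: the two toy reactions, the `UBIQUITOUS` set and the function names are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: EC number -> (substrates, products).
REACTIONS = {
    "2.7.1.1": (["D-glucose", "ATP"], ["D-glucose 6-phosphate", "ADP"]),
    "5.3.1.9": (["D-glucose 6-phosphate"], ["D-fructose 6-phosphate"]),
}

# Assumed list of ubiquitous molecules; the real list is a user choice.
UBIQUITOUS = {"ATP", "ADP", "H2O"}


def bipartite_graph(reactions, keep_ubiquitous=True):
    """Directed bipartite graph with edges substrate -> enzyme -> product."""
    graph = defaultdict(set)
    for ec, (substrates, products) in reactions.items():
        for s in substrates:
            if keep_ubiquitous or s not in UBIQUITOUS:
                graph[s].add(ec)
        for p in products:
            if keep_ubiquitous or p not in UBIQUITOUS:
                graph[ec].add(p)
    return dict(graph)


def enzyme_graph(reactions, keep_ubiquitous=True):
    """Enzyme-only projection: EC1 -> EC2 if a product of EC1 feeds EC2."""
    graph = defaultdict(set)
    for ec1, (_, products) in reactions.items():
        for ec2, (substrates, _) in reactions.items():
            if ec1 == ec2:
                continue
            shared = set(products) & set(substrates)
            if not keep_ubiquitous:
                shared -= UBIQUITOUS
            if shared:
                graph[ec1].add(ec2)
    return dict(graph)
```

A substrate-only projection could be derived in the same way by linking each substrate directly to the products of the enzymes that consume it.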
1. Systematically calculate the shortest paths between the vertices of the network.
2. Briefly discuss the interactions between metabolisms.
1. MySQL database
As queries on the existing graph models are very complex, it might be necessary to move to an object-relational model. The data for this project will be stored in a relational database to facilitate rapid searching and comparison. The data contain EC numbers together with their respective substrates and products. All data are taken from the public-domain Enzyme database. The ubiquitous molecules are given unique identifiers to avoid systematic errors and the loss of significant information, and to deal with the hub molecule problem.
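One possible relational layout for such data is sketched below. The table and column names are hypothetical, and Python's built-in `sqlite3` module is used only as a self-contained stand-in for MySQL; the actual schema of the project is not described here.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enzyme (ec_number TEXT PRIMARY KEY);
CREATE TABLE compound (
    compound_id   INTEGER PRIMARY KEY,
    name          TEXT NOT NULL UNIQUE,
    is_ubiquitous INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE participant (
    ec_number   TEXT NOT NULL REFERENCES enzyme(ec_number),
    compound_id INTEGER NOT NULL REFERENCES compound(compound_id),
    role        TEXT NOT NULL CHECK (role IN ('substrate', 'product')),
    PRIMARY KEY (ec_number, compound_id, role)
);
""")

# Toy rows for a single reaction (illustrative data only).
conn.execute("INSERT INTO enzyme VALUES (?)", ("2.7.1.1",))
conn.executemany(
    "INSERT INTO compound (name, is_ubiquitous) VALUES (?, ?)",
    [("D-glucose", 0), ("ATP", 1), ("D-glucose 6-phosphate", 0), ("ADP", 1)],
)
conn.executemany(
    "INSERT INTO participant "
    "SELECT ?, compound_id, ? FROM compound WHERE name = ?",
    [
        ("2.7.1.1", "substrate", "D-glucose"),
        ("2.7.1.1", "substrate", "ATP"),
        ("2.7.1.1", "product", "D-glucose 6-phosphate"),
        ("2.7.1.1", "product", "ADP"),
    ],
)


def compounds_for(conn, ec_number, role, include_ubiquitous=True):
    """Return the compound names playing `role` for one EC number."""
    sql = (
        "SELECT c.name FROM participant AS p "
        "JOIN compound AS c ON c.compound_id = p.compound_id "
        "WHERE p.ec_number = ? AND p.role = ?"
    )
    if not include_ubiquitous:
        sql += " AND c.is_ubiquitous = 0"
    sql += " ORDER BY c.name"
    return [row[0] for row in conn.execute(sql, (ec_number, role))]
```

Flagging ubiquitous molecules in a column, rather than deleting them, lets each query decide whether they appear in the resulting network.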
The project will use Dijkstra's algorithm, a well-established algorithm for finding shortest paths, to create the hierarchy between EC numbers and metabolites (substrates or products).
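As a rough illustration of how Dijkstra's algorithm could be applied to such a network, the sketch below runs it on a small bipartite graph. The toy graph, the unit edge weights and the helper names are assumptions; the weighting scheme used in the project is not specified here.

```python
import heapq

# Hypothetical bipartite network: node -> {neighbour: edge weight}.
# Every edge is given weight 1, which is an assumption.
GRAPH = {
    "D-glucose": {"2.7.1.1": 1},
    "2.7.1.1": {"D-glucose 6-phosphate": 1},
    "D-glucose 6-phosphate": {"5.3.1.9": 1},
    "5.3.1.9": {"D-fructose 6-phosphate": 1},
}


def dijkstra(graph, source):
    """Return (distances, predecessors) for all nodes reachable from source."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for neighbour, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                prev[neighbour] = node
                heapq.heappush(heap, (nd, neighbour))
    return dist, prev


def path_to(prev, source, target):
    """Rebuild the path from source to target, or None if unreachable."""
    path = [target]
    while path[-1] != source:
        if path[-1] not in prev:
            return None
        path.append(prev[path[-1]])
    return path[::-1]
```

For example, `dijkstra(GRAPH, "D-glucose")` followed by `path_to(prev, "D-glucose", "D-fructose 6-phosphate")` walks through both enzymes and the intermediate metabolite.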
Some parts of the existing enzyme-centric model have been lost in the automated processing of the Enzyme database. Some of the lost reactions do not represent genuine metabolic functions, for example cytoskeletal proteins consuming ATP. In some cases, an equation has been lost because it gives only a generic description of the reaction that could not be resolved: the equation is available only as a natural-language description, so it is not machine-readable and cannot be parsed into our dataset. In other cases, an equation would be lost when the ubiquitous molecules it involves are removed to create a more realistic representation. This problem has been resolved by individualising each instance of a hub, adding the EC number of the enzyme catalysing the reaction to its name. In general, a solution to all of these problems is the incorporation of other biological knowledge to create a context-dependent system. In addition, do reactions that are separate from the central pathways make biological sense, or do they indicate that we are missing some nodes of the network?
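The hub individualisation described above might look like the following sketch. The `@` separator, the `UBIQUITOUS` set and the function names are assumptions; the text only states that the EC number is added to the name of the hub molecule.

```python
# Assumed list of hub (ubiquitous) molecules.
UBIQUITOUS = {"ATP", "ADP", "H2O"}


def individualise(name, ec_number, ubiquitous=UBIQUITOUS):
    """Tag a hub molecule with the EC number of the reaction it occurs in."""
    return f"{name}@{ec_number}" if name in ubiquitous else name


def individualise_reaction(ec_number, substrates, products):
    """Return the reaction with every hub molecule made reaction-specific."""
    return (
        [individualise(s, ec_number) for s in substrates],
        [individualise(p, ec_number) for p in products],
    )
```

Because each tagged hub belongs to exactly one reaction, it no longer creates shortcuts between unrelated pathways, while the reaction itself stays in the network.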
Nucleic Acids Res 2006 Jan 1, 34(Database):D354-7. doi:10.1093/nar/gkj102 pmid:16381885.
2004 Sep 1; doi:10.1093/bioinformatics/bth199 pmid:15073012.
Brief Bioinform 2003, 4(3):246-59. pmid:14582519.
J Chem Phys 2005 Jun 15, 122(23):234903. doi:10.1063/1.1931587 pmid:16008483.