Predicting the role of a protein in a network of regulatory and metabolic interactions requires analyzing many complex data sets in combination, as no experimental source alone is completely and generally reliable. To discriminate the most promising pathways related to a protein of interest from background noise, we integrate multiple bioinformatics data by means of a graphical designer/database of protein-protein interactions, supported by data mining and logical modeling of the resulting networks.
Materials and methods
The SEED database http://theseed.uchicago.edu/FIG/index.cgi is an open-source genomic platform provided by the Fellowship for Interpretation of Genomes (FIG) http://www.figresearch.com/, which supports the encoding and projection of metabolic subsystems across the entire collection of integrated genomes. It provides two similar but distinct approaches, which were applied to compare chromosome regions. The first method, 'Compare region', is similar to the one applied to the analysis of clustering of functionally related genes on prokaryotic chromosomes [4,6]. The approach involves computation of "pairs of close bidirectional best hits". Using these pairs, one can compose evidence (based on the number of distinct genomes and the phylogenetic distance between the orthologous pairs) that a pair of genes is potentially functionally coupled. The second approach underlies the 'Pinned regions' resource. It allows one to align chromosome loci that contain open reading frames for homologous proteins or, in other words, to 'pin' these loci through homologous genes. The similarity threshold can be customized.
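The notion of "pairs of close bidirectional best hits" can be illustrated with a minimal Python sketch. This is not the SEED implementation: the function names, the gene identifiers, the use of a single coordinate per gene on one contig, and the 5000 bp proximity threshold are all assumptions chosen for illustration.

```python
from itertools import combinations


def bidirectional_best_hits(best_ab, best_ba):
    """Keep gene pairs (a, b) where b is a's best hit and a is b's best hit."""
    return {a: b for a, b in best_ab.items() if best_ba.get(b) == a}


def close_bbh_pairs(pos_a, pos_b, bbh, max_gap=5000):
    """Find gene pairs that are close in genome A and whose bidirectional
    best hits are also close in genome B.

    pos_a, pos_b: gene -> coordinate (bp) on a single contig (assumption).
    max_gap: hypothetical proximity threshold in bp.
    """
    pairs = []
    for g1, g2 in combinations(sorted(bbh), 2):
        h1, h2 = bbh[g1], bbh[g2]
        close_in_a = abs(pos_a[g1] - pos_a[g2]) <= max_gap
        close_in_b = abs(pos_b[h1] - pos_b[h2]) <= max_gap
        if close_in_a and close_in_b:
            pairs.append(((g1, g2), (h1, h2)))
    return pairs


# Hypothetical best-hit tables and gene coordinates for two genomes.
best_ab = {"a1": "b1", "a2": "b2", "a3": "b9"}
best_ba = {"b1": "a1", "b2": "a2", "b9": "a7"}
pos_a = {"a1": 1000, "a2": 3000, "a3": 90000}
pos_b = {"b1": 500, "b2": 4000, "b9": 7000}

bbh = bidirectional_best_hits(best_ab, best_ba)
pairs = close_bbh_pairs(pos_a, pos_b, bbh)
print(pairs)
```

In this toy input, a3 is dropped because its best hit is not reciprocal, and the remaining pair is reported because both genes and both orthologs lie within the assumed threshold. Counting such pairs across many genomes, weighted by phylogenetic distance, would give the kind of coupling evidence described above.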
GeneSpring was used for the initial analysis of microarray data; the SEED and NCBI databases were used for the analysis of sequences derived from two-hybrid experiments. All data were integrated by means of PathBlazer https://bioinformatics.ccr.buffalo.edu/software/vector.
Results and conclusions
In the process of network reconstruction, we evaluated the predictive capacities of three data sources: gene expression data, potential protein-protein interaction data derived from two-hybrid system experiments, and gene positional clustering [4,5]. The aim was to show that gene co-localization on a chromosome can be a reliable indication of a functional relationship between the encoded proteins.
We demonstrate the capability of the SEED database and its unique tools for eukaryotic chromosome loci alignment and phylogenetic analysis to retrieve conserved gene patterns and uncover potential functional partners of a protein of interest (see Figure 1).
Figure 1. Left: An example of conserved gene clustering. Chromosome loci (50 kb) from different vertebrate genomes are 'pinned' through SIRT2 homologs. Arrows correspond to open reading frames: red, SIRT2 homologs; green, NF-kappaB inhibitor beta. Right: A proposed model of interactions between SIRT proteins and the NF-kappaB regulatory network. Dashed lines: putative links derived from gene positional clustering.
Here we present procedures for unraveling the functional connections between sirtuin family proteins and the NF-kappaB regulatory network, and the function of GSTPi as a carrier of a nitric oxide and iron complex in the regulation of mitochondrial respiration.
Smith JS, Brachmann CB, Celic I, Kenna MA, Muhammad S, Starai VJ, Avalos JL, Escalante-Semerena JC, Grubmeyer C, Wolberger C, Boeke JD: A phylogenetically conserved NAD+-dependent protein deacetylase activity in the Sir2 protein family.
Ruscoe JE, Rosario LA, Wang T, Gaté L, Arifoglu P, Wolf CR, Henderson CJ, Ronai Z, Tew KD: Pharmacologic or genetic manipulation of glutathione S-transferase P1-1 (GSTpi) influences cell proliferation pathways.
Overbeek R, Begley T, Butler RM, Choudhuri JV, Chuang H, Cohoon M, de Crécy-Lagard V, Diaz N, Disz T, Edwards R, et al.: The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes.