<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-7-30</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Software</dochead>
      <bibl>
         <title>
            <p>WholePathwayScope: a comprehensive pathway-based analysis tool for high-throughput data</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Yi</snm>
               <fnm>Ming</fnm>
               <insr iid="I1"/>
               <email>myi@ncifcrf.gov</email>
            </au>
            <au id="A2">
               <snm>Horton</snm>
               <mi>D</mi>
               <fnm>Jay</fnm>
               <insr iid="I2"/>
               <email>Jay.Horton@utsouthwestern.edu</email>
            </au>
            <au id="A3">
               <snm>Cohen</snm>
               <mi>C</mi>
               <fnm>Jonathan</fnm>
               <insr iid="I2"/>
               <insr iid="I3"/>
               <email>Jonathan.Cohen@utsouthwestern.edu</email>
            </au>
            <au id="A4">
               <snm>Hobbs</snm>
               <mi>H</mi>
               <fnm>Helen</fnm>
               <insr iid="I2"/>
               <insr iid="I3"/>
               <insr iid="I4"/>
               <email>Helen.Hobbs@utsouthwestern.edu</email>
            </au>
            <au id="A5" ca="yes">
               <snm>Stephens</snm>
               <mi>M</mi>
               <fnm>Robert</fnm>
               <insr iid="I1"/>
               <email>bobs@ncifcrf.gov</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Advanced Biomedical Computing Center, National Cancer Institute-Frederick/SAIC-Frederick Inc., Frederick, MD 21702, USA</p>
            </ins>
            <ins id="I2">
               <p>McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center at Dallas, TX 75390-9046, USA</p>
            </ins>
            <ins id="I3">
               <p>Departments of Internal Medicine and Molecular Genetics, University of Texas Southwestern Medical Center at Dallas, TX 75390-9046, USA</p>
            </ins>
            <ins id="I4">
               <p>The Howard Hughes Medical Institute, University of Texas Southwestern Medical Center at Dallas, TX 75390-9046, USA</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2006</pubdate>
         <volume>7</volume>
         <issue>1</issue>
         <fpage>30</fpage>
         <url>http://www.biomedcentral.com/1471-2105/7/30</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">16423281</pubid>
               <pubid idtype="doi">10.1186/1471-2105-7-30</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>09</day>
               <month>5</month>
               <year>2005</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>19</day>
               <month>1</month>
               <year>2006</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>19</day>
               <month>1</month>
               <year>2006</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2006</year>
         <collab>Yi et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Analysis of High Throughput (HTP) Data such as microarray and proteomics data has provided a powerful methodology to study patterns of gene regulation at genome scale. A major unresolved problem in the post-genomic era is to assemble the large amounts of data generated into a meaningful biological context. We have developed a comprehensive software tool, WholePathwayScope (WPS), for deriving biological insights from analysis of HTP data.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>WPS extracts gene lists with shared biological themes through color cue templates. WPS statistically evaluates global functional category enrichment of gene lists and pathway-level pattern enrichment of data. WPS incorporates well-known biological pathways from KEGG (Kyoto Encyclopedia of Genes and Genomes) and Biocarta, GO (Gene Ontology) terms as well as user-defined pathways or relevant gene clusters or groups, and explores gene-term relationships within the derived gene-term association networks (GTANs). WPS simultaneously compares multiple datasets within biological contexts either as pathways or as association networks. WPS also integrates Genetic Association Database and Partial MedGene Database for disease-association information. We have used this program to analyze and compare microarray and proteomics datasets derived from a variety of biological systems. Application examples demonstrated the capacity of WPS to significantly facilitate the analysis of HTP data for integrative discovery.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>This tool represents a pathway-based platform for discovery integration to maximize analysis power. The tool is freely available at <url>http://www.abcc.ncifcrf.gov/wps/wps_index.php</url>.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>In today's post-genomic era, the sequencing projects and the development of High Throughput (HTP) technologies such as microarray and proteomics provide great opportunities to uncover and explore the complexity of biological problems using systems biology. HTP technologies have provided a powerful approach to address a diverse array of biological questions by allowing analysis of the complete transcriptional and translational repertoire of cells or tissues. Pathologically identical tumors can be differentiated into clinically meaningful subgroups by microarray analysis <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>, and new pathways perturbed in disease states have been identified using microarray analysis <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>. Expression arrays also reveal new participants in biological pathways <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>, new gene targets for pharmaceutical agents, and new functions of genes <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>.</p>
         <p>Today, the use of DNA microarrays is increasingly widespread and affordable, and great expectations have been placed on technological advances in proteomics. However, analyzing the enormous quantity of data generated from such HTP experiments remains a major challenge. A variety of software tools, most of which focus on microarray data, are available to extract and analyze HTP data. Two major strategies are used: 1) unsupervised clustering, in which genes are clustered according to changes in expression pattern with no accommodation for biological context, and 2) supervised classification, in which genes are classified according to an underlying or previously known biology. Numerous existing microarray analysis tools such as GeneCluster <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, TreeView <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, TM4 <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, and GeneSpring <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> primarily use clustering algorithms, which require significant user effort to connect with biological information. Current HTP data analysis methods, which are primarily based on the computation of data values for each individual gene, such as clustering and classification (Hierarchical, Principal Component Analysis <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, and Significance Analysis of Microarray <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>), do provide great insight into many aspects of experimental analysis. However, a more comprehensive way to integrate and analyze HTP data in the context of biological pathways and networks is now needed in both academia and industry. 
As the amount of HTP data has increased and more insightful analysis approaches have been identified, the exploration of the underlying gene regulatory and biochemical networks of pathways to analyze data derived from a variety of HTP technologies has become one of the major challenges in the fields of bioinformatics and computational biology.</p>
         <p>Many software tools capable of analyzing HTP data within the context of biological pathways have been developed <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp>. Recently released commercial software packages including PathwayAssist&#8482; <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, PathArt <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, Ingenuity Pathways Analysis tool <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, and MetaCore <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> also compete in the field of pathway-based HTP analysis. These tools provide an assortment of interfaces for the visualization of gene networks, draw on biological pathway/association network databases that are either extracted by natural language processing (NLP) or hand-curated, and accept gene-list based data input. Each of these tools has one or more unique features that distinguish it from the others. Some open source or publicly accessible tools, such as GenMAPP <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, Cytoscape <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, Pathway Processor <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> and ViMac <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>, display microarray data within the context of pathways annotated in the Kyoto Encyclopedia of Genes and Genomes (KEGG) <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B22">22</abbr></abbrgrp>, and provide statistical assessment of the reliability of each differentially expressed gene <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. However, one of the limitations of these tools is the inability to handle multiple datasets simultaneously in an intuitive way. There is a need for more flexible and comprehensive HTP data analysis software tools in the public domain that are accessible to the academic community and can provide a suite of utilities to analyze HTP data in biological contexts, such as pathways.</p>
         <p>To facilitate the simultaneous analysis and comparison of multiple HTP experiments in the context of biological pathways and association networks, and allow pattern extraction of a selected gene list with biological themes, we developed a stand-alone, Windows-based software tool called WholePathwayScope, or WPS. This software program not only provides many unique ways to analyze and visualize HTP data, but also combines advantages of clustering methodology with a more intuitive pathway or association network-based analysis, and many other features that allow for more comprehensive data analysis.</p>
      </sec>
      <sec>
         <st>
            <p>Implementation</p>
         </st>
         <sec>
            <st>
               <p>WPS provides a pathway-based platform for integrative data analysis</p>
            </st>
            <p>WholePathwayScope or WPS is a software tool that displays HTP data in user-defined or stored gene groups or pathways. The program incorporates a suite of pre-defined biological pathways and allows for the construction of additional user-defined pathways or collections of genes. It also allows generation of biological association networks composed of gene-pathway/term relationships, which can be further manipulated and converted to subnetworks, gene-gene, or pathway/term-pathway/term networks. Results from multiple HTP experiments can be visualized simultaneously, both as summary data from multiple pathways (WSCP) and as detailed data for individual pathways (PSCP). Results can be displayed numerically and can be color-coded according to user-defined criteria to facilitate visual analysis. The program also offers statistical evaluation of global functional category (GO term, pathway etc.) enrichment in a user's gene list, or of user-defined pattern enrichment of choice genes that have been color-coded with HTP data directly.</p>
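            <p>The statistical evaluation of category enrichment described above can be illustrated with a minimal sketch. The Python code below is not taken from WPS, which is written in Visual Basic 6. It assumes a one-sided Fisher's exact test computed as the upper tail of the hypergeometric distribution, whereas the exact variant of the test used by WPS is not specified in this section. The function names enrichment_p_value and rank_categories, and the dictionary layout of the categories, are hypothetical.</p>
            <p>
```python
from math import comb


def enrichment_p_value(hits_in_list, list_size, hits_in_population, population_size):
    """One-sided Fisher's exact test (upper hypergeometric tail).

    Returns the probability of observing at least `hits_in_list` category
    members in a gene list of `list_size` genes drawn without replacement
    from `population_size` genes, of which `hits_in_population` belong to
    the category.
    """
    max_hits = min(list_size, hits_in_population)
    total = comb(population_size, list_size)
    tail = sum(
        comb(hits_in_population, k)
        * comb(population_size - hits_in_population, list_size - k)
        for k in range(hits_in_list, max_hits + 1)
    )
    return tail / total


def rank_categories(gene_list, population, categories):
    """Rank categories (e.g. GO terms or pathways) by ascending p-value.

    `categories` maps a term name to the genes annotated with that term
    (an assumed layout, not the WPS database schema). Genes outside the
    population list are ignored.
    """
    population = set(population)
    genes = set(gene_list).intersection(population)
    results = []
    for term, members in categories.items():
        members = set(members).intersection(population)
        hits = len(genes.intersection(members))
        p = enrichment_p_value(hits, len(genes), len(members), len(population))
        results.append((term, hits, p))
    # The smaller the p-value, the higher the category ranks.
    return sorted(results, key=lambda row: row[2])
```
            </p>
            <p>In this sketch the population list plays the role of the total gene list chosen by the user (for example, all genes present in one CRI file), and the returned rows are ordered so that the most enriched category comes first.</p>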
            <p>The program is written in Microsoft Visual Basic 6 and runs in the Microsoft Windows environment. It utilizes Microsoft Access Databases including the internal databases for gene annotations, pathways, gene ontology and disease association information as well as designated criteria (CRI) files for HTP data. Pathways and association networks are created and presented in windows or graphical user interfaces (GUIs), and stored and accessed either in individual files or dynamically within Microsoft Access Databases. Users control the program through a user interface involving GUIs provided from a series of panels, menus and windows. There is also an extensive context-sensitive help system.</p>
         </sec>
         <sec>
            <st>
               <p>Internal database for gene, pathway, and disease annotation</p>
            </st>
            <p>WPS includes an internal database that integrates gene annotation information from both the mouse and human genomes. The annotation covers GenBank accession numbers (GenBank IDs), Unigene IDs, LocusLink IDs (now Entrez Gene), Gene Symbols, Aliases <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>, SwissProt IDs (Protein IDs) <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>, and disease information from both the Genetic Association Database <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> and a partial MedGene Database <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. It also carries pathway/term information including KEGG <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B22">22</abbr><abbr bid="B35">35</abbr></abbrgrp>, Biocarta <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>, CGAP <abbrgrp><abbr bid="B37">37</abbr></abbrgrp> and Gene Ontology information <abbrgrp><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp> for the purpose of gene-term association network generation and Fisher's exact test.</p>
         </sec>
         <sec>
            <st>
               <p>Designed Files in WPS</p>
            </st>
            <p>The overall program layout is described in Figure <figr fid="F1">1</figr>. Three types of files with different formats have been developed to display the HTP data. These three file types lay out the pathways being visualized, the interconnections among those pathways, and the data filtering and color criteria for the HTP data, respectively. The first file type is the PathwayScope File (PSCP), which is a graphical presentation of metabolic pathways and gene groupings. PSCP files contain an identifier (gene tag) for each gene in a group of related genes or in a pathway, and each gene tag is colorable (see PSCP file examples in Figures <figr fid="F4">4</figr> and <figr fid="F5">5</figr>). PSCP files can be customized and created by users or created dynamically from the internal database. For Biocarta pathways, the pathway graphs can also be visualized in a separate internet browser, and the data of selected genes in CRI files (see below) can be highlighted in the graphs as well (see <supplr sid="S1">Additional file 1</supplr> for a screenshot).</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Overview of WholePathwayScope (WPS): schematic work flow for WPS basic data and file processing</p>
               </caption>
               <text>
                  <p>Overview of WholePathwayScope (WPS): schematic work flow for WPS basic data and file processing. WPS imports the data from the microarray file to the criteria (CRI) file, from which gene lists can be obtained by pattern extraction and then subjected to statistical evaluation by Fisher's exact test or to generate a GTAN. The GenBank IDs of genes in microarray datasets and pathway files (PSCP and WSCP files) are mapped to BaseGenBankIDs. The user then sets the criteria by which the gene tags and pathway tags in the PSCP and WSCP files are to be colored. PathwayScope Files (PSCP) including the GTAN (gene-term association network) files and WholeScope Files (WSCP) are either provided or created by the user. WPS integrates the data from the CRI files with the PSCP and WSCP files so that genes and pathways are flagged according to the specifications of the user.</p>
               </text>
               <graphic file="1471-2105-7-30-1"/>
            </fig>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Example of a CRI file created for analysis of microarray data from the <it>G</it>5<it>G</it>8<sup><it>Tg </it></sup>mice, shown in the Data Conversion Window</p>
               </caption>
               <text>
                  <p>Example of a CRI file created for analysis of microarray data from the <it>G</it>5<it>G</it>8<sup><it>Tg </it></sup>mice, shown in the Data Conversion Window. The pathway level criteria (PTW CRI Name) are based on the criteria selected at gene level (Gene CRI Name).</p>
               </text>
               <graphic file="1471-2105-7-30-2"/>
            </fig>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>An example of WSCP files in which multiple microarray datasets were analyzed simultaneously</p>
               </caption>
               <text>
                  <p>An example of WSCP files in which multiple microarray datasets were analyzed simultaneously. A WSCP file that included a collection of metabolic pathways and gene families with relevance to lipid and glucose metabolism was analyzed using data from the five different microarray datasets (see <supplr sid="S11">Additional file 11</supplr> for description of material and data preparation). Pooled hepatic mRNA isolated from two sets of <it>G</it>5<it>G</it>8<sup><it>Tg </it></sup>mice (G5G8<sup>Tg1</sup>Liver, G5G8<sup>Tg2</sup>Liver) and three sets of <it>G</it>5<it>G</it>8 knockout mice (G5G8<sup>-/-1</sup>Liver, G5G8<sup>-/-2</sup>Liver, G5G8<sup>-/-3</sup>Liver) were hybridized to Affymetrix chips. Each pathway tag is divided into boxes that are tandemly arrayed in the same order as shown at the top of the figure. Once the data is loaded from each of the experiments, the boxes are colored according to the criteria set by the user. The numbers displayed next to the pathway tags (or divided boxes) are numbers of genes matched with pathway criteria-based gene criteria. (Note: if a single CRI is loaded, the whole pathway tag will be colored by this CRI).</p>
               </text>
               <graphic file="1471-2105-7-30-3"/>
            </fig>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>An example of PSCP files in which multiple microarray datasets were analyzed simultaneously</p>
               </caption>
               <text>
                  <p>An example of PSCP files in which multiple microarray datasets were analyzed simultaneously. When the pathway tag labeled as 'Cholesterol Synthesis' in Fig. 3 is clicked, the PSCP file is accessed. The numbers displayed proximal to the gene tags (or divided boxes) are the fold changes of the specific genes.</p>
               </text>
               <graphic file="1471-2105-7-30-4"/>
            </fig>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>An example of PSCP files in which multiple microarray datasets were analyzed simultaneously</p>
               </caption>
               <text>
                  <p>An example of PSCP files in which multiple microarray datasets were analyzed simultaneously. The PSCP file for the "Cholesterol Synthesis" pathway was analyzed using data from 11 different microarray datasets or CRI files representing a time course experiment (see <supplr sid="S11">Additional file 11</supplr> for description of material and data preparation). Pooled hepatic mRNA was isolated from female wild type mice sacrificed at the indicated time points during fetal and post-natal development. The time point at 9 days before birth was used as the reference level of mRNA. "Day-5" and "Day-3" indicate 5 days and 3 days prior to birth, respectively.</p>
               </text>
               <graphic file="1471-2105-7-30-5"/>
            </fig>
            <suppl id="S1">
               <title>
                  <p>Additional File 1</p>
               </title>
               <text>
                  <p>A Microsoft PowerPoint file including a few slides of screenshots to describe the features for displaying a Biocarta pathway graph and highlighting selected genes to display their data in the graph. <b>Slide 1: </b>The window for creating a PSCP file for a Biocarta pathway from the internal database. <b>Slide 2: </b>The PSCP file including all the genes in the created Biocarta pathway "FXR and LXR Regulation of Cholesterol Metabolism". <b>Slide 3: </b>The created PSCP file colored with loaded CRI files (the time-course data used in Fig. <figr fid="F5">5</figr>) using a gradient coloring scheme. <b>Slide 4: </b>WPS can display the corresponding Biocarta pathway diagram in a separate internet browser and show the data with designated arrows (red arrows) for the selected genes highlighted in the created PSCP file (slide 3). Clickable buttons labeled with the names of the loaded datasets allow the data from the corresponding CRI file to be displayed for the selected genes.</p>
               </text>
               <file name="1471-2105-7-30-S1.ppt">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>The second file type is the WholeScope File (WSCP file), which is composed of a series of pathway tags. Each pathway tag in the WSCP file either is linked to a PSCP file saved on the user's desktop or represents a term (a pathway or GO term in the internal database). Pathway tags are colorable in the same way as gene tags. Clicking on a pathway tag in the WSCP file opens the linked PSCP file or, if the tag represents a database-associated term, dynamically creates a new PSCP file. Global changes in expression levels of genes in each pathway or term can be indicated by setting criteria in CRI files (see below) to color code the pathway tags (see WSCP file example in Figure <figr fid="F3">3</figr>).</p>
            <p>The third file type is the criteria (CRI) file, which is used to enter the user-defined color criteria and HTP data (see <supplr sid="S20">Additional files 35&#8211;50</supplr> for examples of raw data files used in this manuscript). Each CRI file is a Microsoft Access file that contains an HTP (e.g. microarray) dataset, the mapped gene identifiers (BaseGenBankID) for each microarray element, and the user-defined color criteria for the PSCP and WSCP files (see <supplr sid="S19">Additional files 19&#8211;34</supplr> for examples of CRI files used in this manuscript). HTP datasets are converted to CRI files in the program from Excel files (Microsoft, Inc) containing the HTP data through a Data Conversion Window (Fig. <figr fid="F2">2</figr>). WPS can be used for any high-throughput data as long as it is formatted as spreadsheets in Excel files and contains one of the three types of standard gene or protein identifiers: GenBank Accessions, Unigene IDs, or SwissProt IDs (see <supplr sid="S2">Additional file 2</supplr> or our program demo web page <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> for the file format and procedure for conversion into CRI files). Genes without available standard IDs can also be included in CRI files with user-assigned identifiers and used for analyses such as Pattern Extraction (see below). The HTP data in an Excel file can be an individual dataset, or multiple datasets combined in a single file (i.e. in Stanford format <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>). In the latter case, the user can select an appropriate data column to build color criteria specifically for one dataset in the whole file as an individual CRI file, or multiple data columns to build color criteria for multiple datasets within a single CRI file (see <supplr sid="S3">Additional file 3</supplr> for more details on the three file types in WPS).</p>
            <suppl id="S2">
               <title>
                  <p>Additional File 2</p>
               </title>
               <text>
                  <p>A Microsoft PowerPoint file including a slide of a screenshot for a microarray raw dataset in a worksheet of an Excel file to graphically illustrate the format and 3 requirements of a data file to be converted into a CRI file in WPS.</p>
               </text>
               <file name="1471-2105-7-30-S2.ppt">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S3">
               <title>
                  <p>Additional File 3</p>
               </title>
               <text>
                  <p>A Microsoft Word file including a detailed description of the three types of files in WPS.</p>
               </text>
               <file name="1471-2105-7-30-S3.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Data analysis using WPS</p>
            </st>
            <p>Once a series of PSCP, WSCP and CRI files have been loaded into the program, the user has the option of proceeding along several analysis courses. Some of the features available for this analysis are described briefly below. The Results section of this manuscript illustrates the program with real data examples and describes some scenarios for applying it to data analysis. In addition, a set of tutorial movies and illustration image files covering many major features and the general usage of the program may be obtained from our program demo web page <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> (see additional files <supplr sid="S15">15</supplr>, <supplr sid="S16">16</supplr>, <supplr sid="S17">17</supplr>, <supplr sid="S18">18</supplr> for some examples of demo movies).</p>
         </sec>
         <sec>
            <st>
               <p>Pattern extraction of a list of genes using color cue templates for biological themes</p>
            </st>
            <p>In CRI files, users can define criteria to color code specific categories of gene behavior in datasets (e.g. red for genes with at least a 2-fold change, or green for genes flagged as down-regulated). Such criteria can then be used to extract a list of genes matching a specific pattern of criteria across one or more CRI dataset files.</p>
            <p>Pattern extraction can work in two ways: global pattern extraction across datasets, or local pattern extraction scoped to a pathway (PSCP) file. The extracted gene list can be immediately copied and pasted to other utility windows for further analysis (see Fig. <figr fid="F6">6</figr>, <figr fid="F7">7</figr>, <figr fid="F8">8</figr> and <supplr sid="S4">Additional file 4</supplr> for illustrative examples).</p>
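            <p>The idea of matching genes against a color template can be sketched as follows. This Python code is an illustration only and does not reproduce the WPS implementation. It assumes that each dataset is a mapping from gene identifier to a signed fold change, that the color thresholds are the hypothetical values shown, and that a template entry is either a set of accepted colors (allowing several colors for one dataset) or None when that dataset is to be ignored. The names color_for and extract_pattern are hypothetical.</p>
            <p>
```python
def color_for(fold_change, up=2.0, down=-2.0):
    """Map a signed fold change to a color label (thresholds are assumptions)."""
    if fold_change >= up:
        return "red"
    if down >= fold_change:
        return "green"
    return "black"


def extract_pattern(datasets, template):
    """Return the genes whose colors match `template` across `datasets`.

    datasets: list of dicts mapping gene ID to fold change, one per
              CRI-like dataset.
    template: list of the same length; each entry is a set of accepted
              colors, or None to ignore that dataset.
    """
    if len(datasets) != len(template):
        raise ValueError("template needs one entry per dataset")
    # Keep only the datasets that are not ignored.
    used = [(data, accepted)
            for data, accepted in zip(datasets, template)
            if accepted is not None]
    if not used:
        return []
    # Only genes present in every non-ignored dataset can match.
    candidates = set(used[0][0])
    for data, _ in used[1:]:
        candidates.intersection_update(data)
    return sorted(
        gene for gene in candidates
        if all(color_for(data[gene]) in accepted for data, accepted in used)
    )
```
            </p>
            <p>Restricting the candidate genes to those contained in one pathway file before calling extract_pattern would correspond to the local, pathway-scoped form of extraction, under the same assumptions.</p>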
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>The &lt;Extract Patterned Gene List from CRI Files> window for pattern extraction</p>
               </caption>
               <text>
                  <p>The &lt;Extract Patterned Gene List from CRI Files> window for pattern extraction. CRI files from time course experiments were entered into this window. The template pattern was set as shown in the middle of the window. A total of 11 color blocks represent the 11 CRI files of the time course experiments. The small check box under each color block determines whether the corresponding color block or CRI template is ignored: if the check box is checked, the block is ignored. Multiple color templates were set for the No. 3 (Day 1) and No. 7 (Day 18) CRI files, as indicated by vertically discrete boxes (green and black boxes). Extracted genes were entered into the table for further processing.</p>
               </text>
               <graphic file="1471-2105-7-30-6"/>
            </fig>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>The &lt;Fisher's exact test for Given Lists> window</p>
               </caption>
               <text>
                  <p>The &lt;Fisher's exact test for Given Lists> window. The gene list derived from pattern extraction in Fig. 6 was copied and pasted into this window. The gene list from the Day 1 CRI file was chosen as the total population list. GO: Biological Process was used as the system for Fisher's exact test. The computation results are ranked based on the p-values of each category (i.e. GO term) of the system. The smaller the p-value, the higher the category ranks.</p>
               </text>
               <graphic file="1471-2105-7-30-7"/>
            </fig>
            <fig id="F8">
               <title>
                  <p>Figure 8</p>
               </title>
               <caption>
                  <p>The window for generation of gene-term association network</p>
               </caption>
               <text>
                  <p>The window for generation of gene-term association network. The gene list derived from pattern extraction in Fig. 6 was copied and pasted into this window. GO:Biological Process, Biocarta, and KEGG pathways were searched with the gene list and results were entered into the table in a gene-term pairwise format.</p>
               </text>
               <graphic file="1471-2105-7-30-8"/>
            </fig>
            <suppl id="S4">
               <title>
                  <p>Additional File 4</p>
               </title>
               <text>
                  <p>A Microsoft PowerPoint file including a few slides of screenshots to describe the feature for pattern extraction of genes from a colored PSCP file. <b>Slide 1: </b>A colored PSCP file (previously loaded with CRI files) subjected to pattern extraction. <b>Slide 2: </b>The pattern extraction window for extraction of genes from the colored PSCP file in slide 1 that match the defined color pattern in the color template panel. <b>Slide 3: </b>The created PSCP file including the extracted genes in slide 2, colored with the same set of CRI files to verify the pattern of the extracted genes.</p>
               </text>
               <file name="1471-2105-7-30-S4.ppt">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Generation and manipulation of Gene-Term Association Network (GTAN) to explore gene-pathway or gene-term relations</p>
            </st>
            <p>Using an input or filtered gene list, such as a list of genes derived from clustering analysis in other programs or from pattern extraction in WPS, the associated pathways or GO terms can be identified from the internal database or user-defined PSCP files. The results are listed in the results table in a gene-term pairwise format (Fig. <figr fid="F8">8</figr>). The pairwise relationships between genes and their associated pathways or GO terms in the results table can then be used to generate a gene-term association network (GTAN) within a PSCP file, which is displayed in a graphical view (Fig. <figr fid="F9">9</figr>). The network is rendered with Scalable Vector Graphics (SVG) technology <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>, an XML-based standard for describing two-dimensional graphics. The gene-term association network can be manipulated and filtered in many ways to suit different analysis needs (see the concrete example illustrated in Fig. <figr fid="F9">9</figr>, <figr fid="F11">11</figr>, <figr fid="F12">12</figr>, <figr fid="F14">14</figr> and see <supplr sid="S5">Additional file 5</supplr> for a more detailed description of this feature). In addition, genes associated with disease terms, which were derived from the Genetic Association Database and MedGene Database and are included in the internal database described above, can be highlighted and selected in the network for further analysis and network manipulation (see <supplr sid="S6">Additional file 6</supplr> for a screenshot).</p>
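            <p>As a minimal sketch of the data structure behind such a network, and not the WPS implementation, the gene-term pairs from the results table can be held as a bipartite graph with two adjacency maps. The function names and the toy gene-term pairs below are hypothetical, and the SVG rendering step is omitted.</p>
            <p>
```python
from collections import defaultdict


def build_gtan(pairs):
    """Build a bipartite gene-term network from (gene, term) pairs."""
    gene_to_terms = defaultdict(set)
    term_to_genes = defaultdict(set)
    for gene, term in pairs:
        gene_to_terms[gene].add(term)
        term_to_genes[term].add(gene)
    return gene_to_terms, term_to_genes


def gtan_summary(gene_to_terms, term_to_genes):
    """Count the genes, terms and gene-term associations in the network."""
    n_assoc = sum(len(terms) for terms in gene_to_terms.values())
    return {
        "genes": len(gene_to_terms),
        "terms": len(term_to_genes),
        "associations": n_assoc,
    }


if __name__ == "__main__":
    # Toy pairs for illustration only; not taken from the WPS database.
    toy_pairs = [
        ("Ldlr", "endocytosis"),
        ("Rab4a", "endocytosis"),
        ("Fdps", "cholesterol biosynthesis"),
        ("Idi1", "cholesterol biosynthesis"),
        ("Ldlr", "cholesterol metabolism"),
    ]
    genes, terms = build_gtan(toy_pairs)
    print(gtan_summary(genes, terms))
```
            </p>
            <p>Keeping both adjacency maps makes later operations, such as retrieving the neighbors of a highlighted gene or counting the genes attached to a term, simple dictionary lookups.</p>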
            <fig id="F9">
               <title>
                  <p>Figure 9</p>
               </title>
               <caption>
                  <p>Graphical display of a gene-term association network (GTAN) within WPS</p>
               </caption>
               <text>
                  <p>Graphical display of a gene-term association network (GTAN) within WPS. A GTAN was created from the search results in Fig. 8. The numbers of genes, associations, and terms in this GTAN are indicated. All the genes in the network are highlighted in red. The remaining tags in the network are pathway tags representing pathways/terms. The lines linking genes and their associated terms indicate the associations. (Note: This diagram shows the overall layout of the network. When running WPS, the legend of a gene or pathway tag is displayed when the mouse is placed over the tag, even though the legends are not visible in this diagram.)</p>
               </text>
               <graphic file="1471-2105-7-30-9"/>
            </fig>
            <fig id="F10">
               <title>
                  <p>Figure 10</p>
               </title>
               <caption>
                  <p>Window for GTAN filtering and manipulation</p>
               </caption>
               <text>
                  <p>Window for GTAN filtering and manipulation. Many options are available, including filtering by Fisher's exact test result, filtering by association degree, retrieving neighbors of highlighted nodes, merging into a gene-gene or term-term network, and highlighting disease-associated genes.</p>
               </text>
               <graphic file="1471-2105-7-30-10"/>
            </fig>
            <fig id="F11">
               <title>
                  <p>Figure 11</p>
               </title>
               <caption>
                  <p>Manipulation and filtering of GTAN to gain insights of overall relationships</p>
               </caption>
               <text>
                  <p>Manipulation and filtering of GTAN to gain insights of overall relationships. The GTAN from Fig. 9 was filtered using the window from Fig. 10, with the Fisher's exact test result shown in Fig. 7 as reference and a cutoff of p-value &lt;= 0.05. Genes are highlighted in red. All the terms shown in the network have p-values &lt;= 0.05; some are labeled individually and others are labeled with summarized descriptions because they are closely related.</p>
               </text>
               <graphic file="1471-2105-7-30-11"/>
            </fig>
            <fig id="F12">
               <title>
                  <p>Figure 12</p>
               </title>
               <caption>
                  <p>Manipulation and filtering of GTAN based on association degree</p>
               </caption>
               <text>
                  <p>Manipulation and filtering of GTAN based on association degree. The GTAN from Fig. 9 was filtered to remove terms associated with a minimal number of genes. Genes are highlighted in red. The remaining terms tend to be shared by multiple genes and are labeled with summarized descriptions for simplicity.</p>
               </text>
               <graphic file="1471-2105-7-30-12"/>
            </fig>
            <fig id="F13">
               <title>
                  <p>Figure 13</p>
               </title>
               <caption>
                  <p>Manipulation and filtering of GTAN to gain insights of gene-gene relationships</p>
               </caption>
               <text>
                  <p>Manipulation and filtering of GTAN to gain insights of gene-gene relationships. The GTAN from Fig. 12 was merged into a gene-gene network through shared terms: if two genes shared a common term, a line was drawn between the two genes and the term and its original association lines were removed; the graph was then rebuilt in SVG (see <supplr sid="S14">Additional file 14</supplr>). Before merging, some generic GO terms (those high in the GO hierarchy, e.g. physiological process) were eliminated for simplicity. Genes were selected by highlighting them in red for further manipulation (see <supplr sid="S14">Additional file 14</supplr>). The resulting PSCP file containing the merged network was then colored with the CRI files from the time course experiments (the same CRI files as in Fig. 5).</p>
               </text>
               <graphic file="1471-2105-7-30-13"/>
            </fig>
            <fig id="F14">
               <title>
                  <p>Figure 14</p>
               </title>
               <caption>
                  <p>Manipulation and filtering of GTAN to gain insights of gene-term relationships</p>
               </caption>
               <text>
                  <p>Manipulation and filtering of GTAN to gain insights of gene-term relationships. The genes selected in Fig. 13 (Egln3, Hes6, Ldlr, Rab4a, Dhcr7, Fdps, and Idi1) were highlighted, and their neighbors (terms) were then retrieved to rebuild the graph in SVG. Terms are labeled with summarized descriptions because they are closely related.</p>
               </text>
               <graphic file="1471-2105-7-30-14"/>
            </fig>
            <suppl id="S5">
               <title>
                  <p>Additional File 5</p>
               </title>
               <text>
                  <p>A Microsoft Word file including description of the feature for manipulation and filtering of GTANs.</p>
               </text>
               <file name="1471-2105-7-30-S5.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S6">
               <title>
                  <p>Additional File 6</p>
               </title>
               <text>
                  <p>A Microsoft PowerPoint file including a slide with a screenshot of the window for searching the network for specific genes or terms, or for disease-associated genes. A disease selected from the database is used to search for and highlight the associated genes in the current GTAN/PSCP file for further analysis.</p>
               </text>
               <file name="1471-2105-7-30-S6.ppt">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Fisher's exact test for biological significance of gene lists and pathway-level pattern enrichment of high-throughput data</p>
            </st>
            <p>Fisher's exact test is performed on 2 &#215; 2 contingency tables (whether a gene is in the given list or not vs whether the gene is associated with a pathway/term or not; see <supplr sid="S7">Additional file 7</supplr> for an example of a 2 &#215; 2 contingency table). Similar to EASE <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>, a Fisher's exact test p-value is computed for each term in a chosen system, and the terms are then ranked from the smallest to the largest p-value to estimate the statistical significance and enrichment of global functional categories (GO terms, pathways etc.) within the system for a list of genes that is of interest to the user or that matches a pattern. The biological themes of the gene list can thus be rapidly retrieved from the GO system and the Biocarta and KEGG pathway collections as the top-ranked terms or pathways based on the Fisher's exact test p-values (see Figure <figr fid="F7">7</figr> and <supplr sid="S8">Additional file 8</supplr> for a concrete example).</p>
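            <p>The per-term computation and ranking described above can be sketched as follows. This is an illustration under stated assumptions rather than the WPS code: it assumes the one-sided (over-representation) form of Fisher's exact test, since the text does not state which tail is used, and the function names and arguments are hypothetical.</p>
            <p>
```python
from math import comb


def fisher_enrichment_p(list_hits, list_size, pop_hits, pop_size):
    """One-sided Fisher's exact test p-value for over-representation.

    2 x 2 table (rows: gene in list / not in list;
    columns: associated with the term / not associated):
        a = list_hits              b = list_size - list_hits
        c = pop_hits - list_hits   d = all remaining population genes
    """
    max_hits = min(list_size, pop_hits)
    total = comb(pop_size, list_size)
    # Sum hypergeometric probabilities of tables with at least list_hits hits.
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, list_size - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / total


def rank_terms(gene_list, population, term_to_genes):
    """Return (p_value, term, hits) tuples sorted from smallest p-value up."""
    population = set(population)
    genes = set(gene_list).intersection(population)
    rows = []
    for term, members in term_to_genes.items():
        members_in_pop = set(members).intersection(population)
        hits = len(genes.intersection(members_in_pop))
        p = fisher_enrichment_p(
            hits, len(genes), len(members_in_pop), len(population)
        )
        rows.append((p, term, hits))
    return sorted(rows)
```
            </p>
            <p>Here the population corresponds to the total population list chosen by the user (for example, all genes in a CRI file), and each term's gene set is restricted to that population before the table is built.</p>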
            <suppl id="S7">
               <title>
                  <p>Additional File 7</p>
               </title>
               <text>
                  <p>A Microsoft PowerPoint file including a slide for illustration of a 2 &#215; 2 contingency table used as basis for Fisher's exact test.</p>
               </text>
               <file name="1471-2105-7-30-S7.ppt">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S8">
               <title>
                  <p>Additional File 8</p>
               </title>
               <text>
                  <p>A Microsoft Excel file including an example result of Fisher's exact test exported from WPS in Figure <figr fid="F7">7</figr>.</p>
               </text>
               <file name="1471-2105-7-30-S8.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>In contrast to the global functional statistical estimation for a gene list, the statistical significance and enrichment of genes that meet user-defined criteria within a PSCP file colored by CRI file(s) can also be estimated by Fisher's exact test for the corresponding CRI file(s) (see <supplr sid="S9">Additional file 9</supplr> for a screenshot illustration of this feature).</p>
            <suppl id="S9">
               <title>
                  <p>Additional File 9</p>
               </title>
               <text>
                  <p>A Microsoft PowerPoint file containing screenshots that describe the pathway- or PSCP-scoped "local Fisher's exact test" of user-defined pattern enrichment among genes colored with CRI file(s) in a PSCP file being analyzed. <b>Slide 1: </b>A colored PSCP file (previously loaded with CRI files) subjected to the "local Fisher's exact test". <b>Slide 2: </b>The "local Fisher's exact test" window for statistically measuring the enrichment of genes that meet user-defined criteria; in this example, the degree of enrichment of differentially expressed genes (red and green colors in the color template panel) for each dataset within this pathway.</p>
               </text>
               <file name="1471-2105-7-30-S9.ppt">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Other utilities: information search and dataset manipulation</p>
            </st>
            <p>Within the information search window, one can type in a keyword (e.g. Gene Name, GenBankID etc.) to search the internal database for relevant information, including annotation information and associated disease information. Two dataset manipulation utilities make it convenient to adjust the size of data files, so that one or more subsets of a dataset, or a combination of multiple datasets, can be used for further analysis: 1. sorting a dataset into pathway/term-scoped "sub-datasets" based on PSCP files or on pathways/terms in the internal database (see <supplr sid="S10">Additional file 10</supplr> for a screenshot); 2. merging data files.</p>
            <suppl id="S10">
               <title>
                  <p>Additional File 10</p>
               </title>
               <text>
                  <p>A Microsoft PowerPoint file including a slide for screenshot of the window from WPS for sorting a dataset to pathway/term scoped sub-datasets for further processing.</p>
               </text>
               <file name="1471-2105-7-30-S10.ppt">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <p>Some of the features of WPS are illustrated using experimental data in the following section. Although only microarray data is utilized, any source of HTP data would be equally suited to the analysis.</p>
         <sec>
            <st>
               <p>Comparison of multiple datasets within multiple pathways</p>
            </st>
            <p>WholePathwayScope displays HTP data within a biological context. Figure <figr fid="F1">1</figr> provides an overview of the basic work flow of the WPS program for data and file processing. Microarray data or other HTP data are entered into criteria (CRI) files in which the parameters for analysis are set. The data are then loaded into the PSCP/WSCP files either from the provided file collections or from the internal database, and those genes and pathways that meet user-defined criteria are flagged. In addition, gene lists can be extracted from CRI files for further Fisher's exact test analysis or creation of GTAN. Figure <figr fid="F2">2</figr> provides an example of the data conversion window, in which gene and pathway criteria are entered. For each gene, GenBank accession numbers that correspond to the gene are collated and given a single BaseGenbank ID number.</p>
            <p>By way of example, the WPS program was used to compare gene expression profiles between wild-type mice and two strains of genetically-modified mice that either express high levels of <it>ABCG5 </it>and <it>ABCG8 </it>(<it>G5G8 </it><sup><it>Tg </it></sup>mice) or no <it>ABCG5 </it>or <it>ABCG8 </it>(<it>G5G8 </it><sup>-/- </sup>mice) <abbrgrp><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr></abbrgrp> (also see <supplr sid="S11">Additional file 11</supplr> for a description of the materials and data preparation). <it>ABCG5 </it>and <it>ABCG8 </it>encode ABC half transporters that heterodimerize to limit the intestinal absorption of dietary sterols and to promote the secretion of sterols from the liver into bile <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>.</p>
            <suppl id="S11">
               <title>
                  <p>Additional File 11</p>
               </title>
               <text>
                  <p>A Microsoft Word file describing the materials and methods for preparation of microarray data used for describing the program features.</p>
               </text>
               <file name="1471-2105-7-30-S11.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>To compare gene expression patterns in the <it>G5G8 </it><sup><it>Tg </it></sup>and <it>G5G8 </it><sup>-/- </sup>mice and assess the reproducibility of the microarray results, microarray datasets from five expression array experiments (two from <it>G5G8 </it><sup><it>Tg </it></sup>mice and three from <it>G5G8 </it><sup>-/- </sup>mice) were analyzed simultaneously in a WSCP file, which includes a subset of the biochemical pathways and gene families involved in lipid metabolism (Fig. <figr fid="F3">3</figr>). If two or more genes in a pathway were significantly up-regulated or down-regulated (designated "UpHighCertainty" or "DownHighCertainty" in Table <tblr tid="T1">1</tblr>, respectively) for a dataset, the corresponding divided box of the pathway tag was colored red or green, respectively. Comparison of the data from the two experiments from the <it>G5G8 </it><sup><it>Tg </it></sup>mice demonstrated consistent results for some pathways (e.g. cholesterol synthesis and bile acid synthesis) but not for other pathways (e.g. glycogen synthesis).</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>An example of gene criteria for the microarray dataset for <it>G5G8 </it><sup><it>Tg </it></sup>mice vs wild-type mice.</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="center">
                        <p>
                           <b>Criteria Priority</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Criteria Name</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Criteria Details</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Color</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <b>1</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>UpHighCertainty</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>([TGvsWT_A_Change] = 'I' OR [TGvsWT_A_Change] = 'MI') AND ([WT_A_Detection] &lt;> 'A' OR [TGvsWT_A_Detection] &lt;> 'A')</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>red</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <b>2</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>DownHighCertainty</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>([TGvsWT_A_Change] = 'D' OR [TGvsWT_A_Change] = 'MD') AND ([WT_A_Detection] &lt;> 'A' OR [TGvsWT_A_Detection] &lt;> 'A')</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>green</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <b>3</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>UpLowCertainty</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>[foldchange] >= 2</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>orange</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <b>4</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>DownLowCertainty</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>[foldchange] &lt;= 0.5</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>light blue</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <b>5</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>No Criteria Met</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>Program built-in criteria (all of the above criteria not met)</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>gray</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <b>6</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Not Found</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>Gene does not exist in microarray dataset</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>white</b>
                        </p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>[TGvsWT_A_Change]: Change call of the Affymetrix array for genes in the transgenic array compared with the wild-type array (I: Increase; D: Decrease; MI: Marginal Increase; MD: Marginal Decrease).</p>
                  <p>[WT_A_Detection]: Detection Call of Affymetrix array for Wild Type array (A: absent, P: present).</p>
                  <p>[TGvsWT_A_Detection]: Detection Call of Affymetrix array for G5G8Tg array (A: absent; P: present).</p>
               </tblfn>
            </tbl>
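            <p>The way prioritized criteria such as those in Table 1 translate into colors can be sketched as follows. This is a hedged illustration, not the WPS implementation: the criteria are hard-coded from Table 1, the first criterion met in priority order is assumed to determine the color, and the function names and the dictionary-based record format are hypothetical. The pathway-level rule follows the description above (two or more high-certainty genes per dataset).</p>
            <p>
```python
def classify_gene(record):
    """Return (criteria name, color) for one gene in one dataset.

    record is a dict keyed by the column names used in Table 1,
    or None when the gene is absent from the dataset.
    """
    if record is None:
        return ("Not Found", "white")
    change = record.get("TGvsWT_A_Change")
    detected = (
        record.get("WT_A_Detection") != "A"
        or record.get("TGvsWT_A_Detection") != "A"
    )
    fold = record.get("foldchange")
    # Criteria listed in priority order; the first one met wins (assumption).
    criteria = [
        ("UpHighCertainty", "red", change in ("I", "MI") and detected),
        ("DownHighCertainty", "green", change in ("D", "MD") and detected),
        ("UpLowCertainty", "orange", fold is not None and fold >= 2),
        ("DownLowCertainty", "light blue", fold is not None and 0.5 >= fold),
    ]
    for name, color, met in criteria:
        if met:
            return (name, color)
    return ("No Criteria Met", "gray")


def pathway_box_colors(records, min_genes=2):
    """Return the high-certainty colors reached by at least min_genes genes."""
    counts = {"red": 0, "green": 0}
    for record in records:
        _, color = classify_gene(record)
        if color in counts:
            counts[color] += 1
    return [color for color, n in counts.items() if n >= min_genes]
```
            </p>
            <p>The pathway-level function returns a list because the text does not say how a pathway box is colored when both the up- and down-regulation thresholds are met for the same dataset.</p>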
            <p>When the WSCP file window is displayed in the program, it is interactive. Each pathway tag in the WSCP file links to a PSCP file or a term in the internal database. For example, clicking on one of the divided boxes of the pathway tag "Cholesterol Synthesis" (Fig. <figr fid="F3">3</figr>) will open a new window with the details of the pathway, including the expression levels of the individual genes (Fig. <figr fid="F4">4</figr>). Many genes in the cholesterol biosynthetic pathway were colored red, indicative of up-regulation, in both datasets from the <it>G5G8 </it><sup><it>Tg </it></sup>mice (Fig. <figr fid="F4">4</figr>). These genes were expressed at significantly lower levels (colored green) in samples from the <it>G5G8 </it><sup>-/- </sup>mice, confirming that differences in expression levels of <it>ABCG5 </it>and <it>ABCG8 </it>have a significant impact on cholesterol biosynthesis in the liver. When cholesterol synthesis was measured in these two genetically-modified strains of mice, it was found to be increased in the transgenic animals and decreased in the knockouts, verifying the biochemical changes seen in these experiments <abbrgrp><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr></abbrgrp>.</p>
            <p>To assess the ontogeny of expression of genes in the cholesterol biosynthetic pathway, we analyzed microarray datasets from livers of wild-type mice or embryos sacrificed at different time points during development (Fig. <figr fid="F5">5</figr>). In utero, most cholesterol for the developing mouse is derived from endogenous synthesis, as reflected by the increased expression of many genes in the biosynthetic pathway at day -5 and day -3. Most of these mRNAs decreased within 24 h of birth, which correlates with the initiation of nursing. Many hepatic mRNAs in the group were increased between postnatal days 18 and 21, when the pups transition from consuming a high cholesterol diet (milk) to a low cholesterol chow diet (0.02% cholesterol by weight). During this time period, endogenous cholesterol synthesis compensates for the reduced dietary intake of cholesterol associated with weaning.</p>
         </sec>
         <sec>
            <st>
               <p>Pattern extraction and statistical evaluation of gene lists for biological themes</p>
            </st>
            <p>One strategy to analyze microarray or other HTP data is to look for genes with user-defined expression patterns across one or more datasets that carry biological implications and themes. The expression level-switch phenomenon at birth and weaning across the time course experiment within the cholesterol synthesis pathway described earlier (Fig. <figr fid="F5">5</figr>) prompted us to further investigate the underlying mechanisms and the genes and biological processes relevant to cholesterol metabolism. We used WPS to perform pattern extraction with a defined color criteria pattern that reflects the expression level switch around the birth and weaning time points in the time course experiments (Fig. <figr fid="F6">6</figr>). The pattern-matching panel and color template afford the user great flexibility in determining which dataset(s) to include and which color(s) to accept (Fig. <figr fid="F6">6</figr>). The resulting gene list, containing 24 unique genes, was copied and pasted into the Fisher's exact test window to evaluate the statistical enrichment of pathways or GO terms within this list (Fig. <figr fid="F7">7</figr>; also see <supplr sid="S8">Additional file 8</supplr> for the complete result list). As expected, the Fisher's exact test clearly indicated enrichment of sterol/lipid metabolism and synthesis GO terms in the resulting gene list (Fig. <figr fid="F7">7</figr>).</p>
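            <p>Pattern extraction of this kind can be pictured as matching each gene's per-dataset colors against a template of accepted colors. The sketch below is an assumption-laden illustration, not the WPS code: the function name, the dictionary layout, the dataset labels and the toy colors are all hypothetical.</p>
            <p>
```python
def extract_pattern(gene_colors, template):
    """Return the genes whose colors match the template.

    gene_colors: gene -> {dataset: color assigned by the criteria}
    template:    dataset -> set of accepted colors; datasets left out of
                 the template are ignored when matching.
    """
    matched = []
    for gene, colors in gene_colors.items():
        if all(colors.get(ds) in accepted for ds, accepted in template.items()):
            matched.append(gene)
    return sorted(matched)


if __name__ == "__main__":
    # Toy colors for illustration only; not the published results.
    toy_colors = {
        "Fdps": {"day -3": "red", "day 1": "green", "day 21": "red"},
        "Egln3": {"day -3": "gray", "day 1": "green", "day 21": "red"},
    }
    toy_template = {
        "day -3": {"red", "orange"},
        "day 1": {"green", "light blue"},
    }
    print(extract_pattern(toy_colors, toy_template))
```
            </p>
            <p>Accepting a set of colors per dataset, rather than a single color, mirrors the flexibility of the color template described above.</p>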
         </sec>
         <sec>
            <st>
               <p>Gene-Term Association Network (GTAN) for gene-specific functional subnetwork domains or function-oriented gene clusters</p>
            </st>
            <p>To further study the underlying relationships between genes and the involved/enriched pathways or GO terms, we used WPS to search Biocarta, KEGG and GO/Biological Processes terms for pathways involving the genes in the extracted list from Fig. <figr fid="F6">6</figr>, and then dynamically generated a gene-term association network (GTAN) (Fig. <figr fid="F8">8</figr>, <figr fid="F9">9</figr>). We thereby visualized gene-term associations, as well as gene-gene and term-term relationships, graphically. Within a typical GTAN, each gene is linked by a line to every term with which it is associated. In this interactive window, the line between a gene and its associated term is maintained even if the gene or the term is moved to a different location. As shown in Fig. <figr fid="F9">9</figr>, of the 24 genes in the extracted gene list, only 14 have annotated pathway/GO term associations in Biocarta, KEGG and GO/Biological Processes, totalling 90 terms and 206 gene-term associations, which are included in the network (see <supplr sid="S12">Additional file 12</supplr> for the complete pair-wise gene-term relations in this GTAN). All the genes, highlighted in red (Fig. <figr fid="F9">9</figr>), are presented as a network along with their associated pathways/terms. At this level of a GTAN, the legends of genes and associated terms are not visible (Fig. <figr fid="F9">9</figr>). One can explore the network by moving the mouse over a gene or term, whose legend is then displayed near the mouse pointer. Alternatively, one can "zoom in" on a selected area of the network to take a closer look at the subnetwork. There is also a specialized window available for users to manipulate, filter, and explore the network (Fig. <figr fid="F10">10</figr>).</p>
            <suppl id="S12">
               <title>
                  <p>Additional File 12</p>
               </title>
               <text>
                  <p>A Microsoft Excel file including the complete pair-wise gene-term relations in the GTAN in Fig. <figr fid="F9">9</figr>.</p>
               </text>
               <file name="1471-2105-7-30-S12.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>To investigate the biological themes from the resulting GTAN, we used a variety of manipulation methods available in a utility window to filter and simplify the network within WPS (Fig. <figr fid="F10">10</figr>). In combination with the Fisher's exact test result (Fig. <figr fid="F7">7</figr>), we extracted a subnetwork of genes with associated GO terms that have Fisher's exact test p-values no larger than 0.05 (Fig. <figr fid="F10">10</figr>, <figr fid="F11">11</figr>). Interestingly, although the majority of the enriched terms are involved in sterol/lipid synthesis and metabolism, endocytosis and vesicle-mediated transport GO terms, which are enriched in the extracted gene list, link <it>Ldlr </it>and <it>Rab4a </it>together in the subnetwork (Fig. <figr fid="F11">11</figr>). We also retrieved terms with minimal associations with genes, which tend to be unique terms/pathways describing their involved genes (see <supplr sid="S13">Additional file 13</supplr>). Notably, <it>Rab4a </it>is functionally involved in signaling and cell communication (see <supplr sid="S13">Additional file 13</supplr>). Thus, it appears that <it>Rab4a </it>could be the critical signaling component that triggers and mediates the sterol/lipid synthesis machinery in the body through <it>Ldlr</it>, probably by the mechanism of endocytosis (Fig. <figr fid="F11">11</figr>, and see <supplr sid="S13">Additional file 13</supplr>).</p>
            <suppl id="S13">
               <title>
                  <p>Additional File 13</p>
               </title>
               <text>
                  <p>A TIFF image file illustrating a GTAN filtered from the GTAN of Fig. <figr fid="F9">9</figr> to retain terms with minimal associations with genes, which tend to be unique or specific to their associated genes. Genes are highlighted in red.</p>
               </text>
               <file name="1471-2105-7-30-S13.tiff">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Shared term extraction and disease gene annotation</p>
            </st>
            <p>We then filtered out genes with minimal association of terms and looked for shared terms within the network (Fig. <figr fid="F12">12</figr>). As evident in Fig. <figr fid="F12">12</figr>, sterol/lipid synthesis and metabolism are the major shared terms in the network, which is consistent with the network filtered by the Fisher's exact test result (Fig. <figr fid="F11">11</figr>). The observation that <it>Hmgcs1</it>, <it>Hes6 </it>and <it>srebf1 </it>(SREBP) heavily shared transcription-related terms (Fig. <figr fid="F12">12</figr>) is consistent with previous findings that <it>Hmgcs1 </it>and <it>srebf1 </it>are involved in the regulation of sterol metabolism and that <it>Hes6 </it>has been implicated in transcriptional regulation <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>, suggesting that they may collaborate in metabolism-related transcriptional regulation. <it>Egln3 </it>shared only high-level or more generic GO terms (e.g. cellular biological process) with other genes (Fig. <figr fid="F12">12</figr>), consistent with its unique functional involvement in cell death among the genes in the list (see <supplr sid="S13">Additional file 13</supplr>). Furthermore, <it>Rab4a </it>and <it>Ldlr </it>shared many pathways/terms, suggesting that these two genes may be functionally coupled.</p>
            <p>Interestingly, when we used disease-association highlight feature in WPS (see <supplr sid="S6">Additional file 6</supplr> for screenshot), we found that <it>Rab4a </it>, <it>Ldlr </it>, and <it>srebf1 </it>are all more or less associated with obesity annotated in Genetic Association Database <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> and MedGene databases <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> (data not shown).</p>
         </sec>
         <sec>
            <st>
               <p>Gene-gene networks and neighbor identification</p>
            </st>
            <p>To study how genes are directly related to each other within the network, we merged the gene-term association network in Figure <figr fid="F12">12</figr> into a gene-gene network through their shared terms, after eliminating some generic terms for clarity. The gene-gene network formed a domain-like architecture (see <supplr sid="S14">Additional file 14</supplr>, Fig. <figr fid="F13">13</figr>). The genes that are heavily involved in sterol/lipid synthesis and metabolism formed a clustered domain with many links among them. Genes whose functional trends differ from those of the others, such as <it>Egln3 </it>, <it>Hes6 </it>, and <it>Rab4a </it>, tended to be separated from the "crowded" region.</p>
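            <p>The merging step can be illustrated with a minimal sketch, assuming the GTAN is held as a simple mapping from each gene to its set of terms. The function name <it>project_to_gene_gene</it>, the gene labels and the terms below are hypothetical and are not taken from WPS or from the study data; the sketch only shows the general idea of linking two genes when they share at least one non-generic term.</p>

```python
from collections import defaultdict
from itertools import combinations


def project_to_gene_gene(gene_terms, generic_terms=frozenset()):
    """Merge a gene-term network into a gene-gene network via shared terms.

    gene_terms: dict mapping each gene to a set of associated terms.
    generic_terms: terms to drop before merging (assumed to be user-chosen).
    Returns a dict mapping each linked gene pair to the terms they share.
    """
    # Invert the mapping: term -> genes annotated with that term.
    term_genes = defaultdict(set)
    for gene, terms in gene_terms.items():
        for term in terms:
            if term not in generic_terms:
                term_genes[term].add(gene)

    # Every pair of genes under the same term becomes an edge.
    edges = defaultdict(set)
    for term, genes in term_genes.items():
        for pair in combinations(sorted(genes), 2):
            edges[pair].add(term)
    return dict(edges)


# Toy network for illustration only; not data from the study.
gtan = {
    "geneA": {"sterol metabolism", "lipid biosynthesis", "biological process"},
    "geneB": {"sterol metabolism", "biological process"},
    "geneC": {"cell death", "biological process"},
}
edges = project_to_gene_gene(gtan, generic_terms={"biological process"})
for (gene_1, gene_2), shared in sorted(edges.items()):
    print(gene_1, gene_2, sorted(shared))
```

            <p>In this toy case, removing the generic term leaves geneC without links, which mirrors how genes with distinct functions separate from a densely connected domain.</p>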
            <suppl id="S14">
               <title>
                  <p>Additional File 14</p>
               </title>
               <text>
                  <p>A TIFF image illustrating a gene-gene network derived from the GTAN of Fig. <figr fid="F12">12</figr> by merging through shared terms. Some genes are highlighted in red for further manipulation.</p>
               </text>
               <file name="1471-2105-7-30-S14.tiff">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S15">
               <title>
                  <p>Additional File 15</p>
               </title>
               <text>
                  <p>A Shockwave Flash file showing a movie clip that demonstrates how to convert a dataset file to a CRI file for use in WPS. (Note: the movie files can be viewed directly in an internet browser with the Flash Animation plug-in.)</p>
               </text>
               <file name="1471-2105-7-30-S15.swf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S16">
               <title>
                  <p>Additional File 16</p>
               </title>
               <text>
                  <p>A Shockwave Flash file showing a movie clip that demonstrates how to load CRI file(s) to color a PSCP or a WSCP file.</p>
               </text>
               <file name="1471-2105-7-30-S16.swf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S17">
               <title>
                  <p>Additional File 17</p>
               </title>
               <text>
                  <p>A Shockwave Flash file to show a movie clip as a program demo for how to create a PSCP file or WSCP file from the internal database.</p>
               </text>
               <file name="1471-2105-7-30-S17.swf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S18">
               <title>
                  <p>Additional File 18</p>
               </title>
               <text>
                  <p>A Shockwave Flash file to show a movie clip as a program demo for how to do pattern extraction from selected CRI file(s), how to do the global Fisher's exact test for a given list (e.g. a gene list from pattern extraction), and how to create a GTAN from a given list.</p>
               </text>
               <file name="1471-2105-7-30-S18.swf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S19">
               <title>
                  <p>Additional Files 19&#8211;34</p>
               </title>
               <text>
                  <p>zip files that include each of 16 microarray CRI files used in the application examples in the manuscript (Day-5.zip, Day-3.zip, Day1.zip, Day5.zip, Day10.zip, Day14.zip, Day18.zip, Day21.zip, Day30.zip, Day60.zip, Day90.zip, G5G8KO1.zip, G5G8KO2.zip, G5G8KO3.zip, G5G8Tg1.zip, G5G8Tg2.zip) (unzip them using WinZip program or other appropriate programs before use).</p>
               </text>
               <file name="1471-2105-7-30-S19.zip">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S20">
               <title>
                  <p>Additional Files 35&#8211;50</p>
               </title>
               <text>
                  <p>zip files that include each of 16 microarray raw files (Excel files) used in the application examples in the manuscript (rDay-5.zip, rDay-3.zip, rDay1.zip, rDay5.zip, rDay10.zip, rDay14.zip, rDay18.zip, rDay21.zip, rDay30.zip, rDay60.zip, rDay90.zip, rG5G8KO1.zip, rG5G8KO2.zip, rG5G8KO3.zip, rG5G8Tg1.zip, rG5G8Tg2.zip) (unzip them using WinZip program or other appropriate programs before use).</p>
               </text>
               <file name="1471-2105-7-30-S20.zip">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>To learn more about the relationships among these "distinct genes", we highlighted them in red with WPS (see <supplr sid="S14">Additional file 14</supplr>, Fig. <figr fid="F13">13</figr>) and then retrieved their immediate neighbors from the original GTAN to create a subnetwork (Fig. <figr fid="F14">14</figr>). Since <it>Ldlr </it>is the only gene from the "Sterol metabolism clustered domain" that connects to <it>Egln3 </it>, we also included it in the further analysis. Notably, within the newly created subnetwork, all the selected genes are linked, although the distinct domains or clusters for each gene are clearly separated (Fig. <figr fid="F14">14</figr>). These domains may represent the major functional aspects underlying the "expression level-switch" phenomenon and are networked together as a dynamic biological theme. Interestingly, these genes were originally selected because of their similar expression pattern across the "cholesterol metabolism" switch points (Fig. <figr fid="F6">6</figr>, Fig. <figr fid="F13">13</figr>). In particular, <it>Rab4a </it>, <it>Egln3 </it>and <it>Hes6 </it>have previously been implicated in signaling and endocytosis <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>, cell death <abbrgrp><abbr bid="B50">50</abbr></abbrgrp> and transcription regulation <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>, respectively. Whether and how they play roles in cholesterol metabolism-related signaling, cell death and transcription regulation remains to be investigated.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>WPS has unique features not found in a single similar application</p>
            </st>
            <p>We describe a new software program, WPS, that facilitates and enhances the analysis of HTP data. Unique features of WPS include the ability to display HTP data from multiple experiments simultaneously within the context of known biological pathways; to visualize and analyze gene-pathway/term and gene-gene relationships, and their biological implications, within generated gene-term association networks; to extract a gene list that may reflect certain biological themes by means of a user-defined pattern template with color cues; and to estimate statistically the enrichment of biological pathways or GO terms within a list of interest or a PSCP file under analysis (see Table <tblr tid="T2">2</tblr> for a comparison with several free and commercial pathway analysis tools).</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Comparison of major features of WPS with other pathway analysis tools.</p>
               </caption>
               <tblbdy cols="9">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>WPS</p>
                     </c>
                     <c ca="center">
                        <p>GenMAPP</p>
                     </c>
                     <c ca="center">
                        <p>Pathway Processor</p>
                     </c>
                     <c ca="center">
                        <p>Cytoscape</p>
                     </c>
                     <c ca="center">
                        <p>PathwayAssist</p>
                     </c>
                     <c ca="center">
                        <p>Ingenuity</p>
                     </c>
                     <c ca="center">
                        <p>MetaCore</p>
                     </c>
                     <c ca="center">
                        <p>PathArt</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="9">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Customized Pathways or Gene Groups</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>?</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Public Canonical Pathway Collection (e.g. KEGG, BioCarta)</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Color Code Genes with Datasets</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Color Code Pathways (Pathway Tags) with Datasets</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Concurrent Visualization of Multiple Datasets (in Colored Pathways/Network)</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Visualization of Multiple Datasets One at a Time (in Colored Pathways/Network)</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Summarized View of Multiple Pathways (i.e. Pathway Tags)</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Pattern Extraction from Data Files</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Pattern Extraction from Colored Pathways</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Global Enrichment Test (e.g. Fisher's exact test) for a Given List</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Pathway-Scoped Fisher's exact test for Patterned Genes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Network View (Gene-Gene Association Network*)</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Network View (Gene-Term Association Network*)</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Term-based and Fisher's exact test result-based network filtering**</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Sort Whole Dataset into Pathway Scoped Sub-datasets***</p>
                     </c>
                     <c ca="center">
                        <p>Yes</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                     <c ca="center">
                        <p>No</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Types of Products/Availability (Free to Academic: A; Commercial: C)</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>C</p>
                     </c>
                     <c ca="center">
                        <p>C</p>
                     </c>
                     <c ca="center">
                        <p>C</p>
                     </c>
                     <c ca="center">
                        <p>C</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>WPS has unique features not found in a single similar application. Seven pathway analysis tools (first row of the table; three are free for academic use and the other four are commercial products) were selected for comparison of major features (first column of the table) with WPS. The seven tools compared are: GenMAPP [27]; Pathway Processor [29]; Cytoscape [28]; PathwayAssist [23]; Ingenuity: Ingenuity Pathway Analysis tool [25]; MetaCore [26]; PathArt [24].</p>
                  <p>Yes: has the indicated feature; No: does not have the indicated feature at the current time;</p>
                  <p>?: Not clear.</p>
                  <p>*: Gene-Term Association Network (GTAN); gene-to-term relation binary network in WPS (can be converted to gene-gene association network too); Gene-Gene Association Network: gene-to-gene relation binary network.</p>
                  <p>**: Filter a GTAN into a more specific subnetwork using terms selected and highlighted by the user, or using ranked terms from the Fisher's exact test result (see Fig. 10, 11, 13, 14).</p>
                  <p>***: Sort a whole dataset into pathway scoped sub-datasets for further analysis: e.g. batch computation of the sub-datasets for pathway-scoped correlation study.</p>
               </tblfn>
            </tbl>
            <p>WPS also interfaces easily with clustering programs by accepting the gene lists produced by clustering analysis. This aids in identifying interactions among biological pathways and in relating the expression profiles of genes of unknown function to those of established pathways. The program accepts data from any microarray platform (including oligonucleotide arrays and cDNA arrays), and accommodates data generated by SAGE (serial analysis of gene expression) <abbrgrp><abbr bid="B51">51</abbr></abbrgrp> as well as proteomics data.</p>
            <p>In summary, WPS was developed to provide the following important features, which were not previously available within a single application (Table <tblr tid="T2">2</tblr>):</p>
         </sec>
         <sec>
            <st>
               <p>Analyze multiple datasets simultaneously</p>
            </st>
            <p>First, many current programs display data from only a single HTP experiment at a time. This limitation hampers direct visual comparison of results from different HTP experiments. When the number of datasets is large, as in a typical time course experiment, it becomes much harder for investigators to remember what is happening at each time point. To our knowledge, the simultaneous display of multiple datasets as implemented in WPS is absent from most, if not all, free and commercial pathway-based HTP data analysis tools. Even a very large number of datasets, if pre-processed and combined into the same or similar categories, can still be displayed and visualized effectively in the program. In our example, shown in Figure <figr fid="F5">5</figr>, the ability of WPS to display and analyze the datasets of a time course experiment simultaneously in the cholesterol synthesis pathway is very helpful for pattern recognition, especially among the genes in this pathway at the time points of birth and weaning (most genes tended to be down-regulated after birth and up-regulated after weaning). Such a pattern is more difficult to discover within a pathway without the simultaneous coloring feature of the program. The pattern recognition enabled by this feature can in turn lead to the extraction of a specific color pattern in WPS, which might have biological implications (Fig. <figr fid="F6">6</figr>, discussed below).</p>
         </sec>
         <sec>
            <st>
               <p>Analyze multiple pathways simultaneously and generate GTANs to explore gene-term relations in an intuitive graphical manner</p>
            </st>
            <p>The second feature that distinguishes WPS from current microarray programs is the ability to display multiple pathways simultaneously, either in their entirety or in summary form. A given collection of pathway files can be grouped into a single WSCP file by means of pathway tags, and changes in the behavior of genes in each pathway can be flagged according to criteria specified by the user. Thus, the pathway(s) that are significantly affected by the experimental conditions are easily identified without having to visualize each pathway individually.</p>
            <p>Furthermore, the generation and manipulation of biological gene-term association networks (GTANs) greatly expand the capacity to study gene-term and gene-gene relationships in a genome-wide fashion and provide a new way to look at genes and the pathways or functional GO terms in which they are involved. WPS has the statistical capacity, specifically through Fisher's exact test, to identify over-represented biological themes (pathways/processes/GO terms) in a given list of genes. More importantly, filtering a GTAN based on the Fisher's exact test result gives rise to a subnetwork enriched in genes and terms/pathways with statistical significance. This helps to narrow down the "core" genes and their associated terms/pathways that have higher-priority biological relevance. An example of network filtering based on the Fisher's exact test result is shown in Fig. <figr fid="F11">11</figr>. Endocytosis and vesicle-mediated transport, which are enriched in the extracted gene list, co-exist and connect with the major enriched function, sterol/lipid synthesis and metabolism, in the filtered subnetwork. The layout of the filtered subnetwork brought these functions together, with visible relations among the genes of interest (Fig. <figr fid="F11">11</figr>). Some specialized tools with more sophisticated statistical methods have been described previously, including EASE <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B52">52</abbr></abbrgrp>, FatiGO <abbrgrp><abbr bid="B53">53</abbr></abbrgrp> and GoMiner <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>, which prioritize the biological themes embedded in a given gene list. However, none of these tools directly visualizes the gene-term and gene-gene relationships within the biological contexts represented by the gene list in an intuitive graphical manner, nor do they use the statistical enrichment of terms to capture the relations between key genes and their associated terms/pathways, as WPS does. In contrast, a GTAN created by WPS displays all possible associated pathways/terms for the genes of interest and allows dynamic layout of the network based on the current biological context, so that one or more context-dependent, specific functional terms can be assigned to a gene based on its role in that context. In addition, the terms a gene shares with other genes suggest its biological connection with these "neighbor" genes. We believe such visual cues derived from GTANs not only provide an overall picture of the current biological context, but also shed light on the function of a gene in the network, which might only be obvious when seen with its partners.</p>
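            <p>For readers unfamiliar with the enrichment statistic, the following is a minimal sketch of a one-sided Fisher's exact test for a single term, computed as the upper tail of the hypergeometric distribution. It is an illustration under stated assumptions and not the WPS implementation: the function name <it>enrichment_p_value</it>, its parameters and the counts in the example are hypothetical.</p>

```python
from math import comb


def enrichment_p_value(list_hits, list_size, term_size, population_size):
    """One-sided Fisher's exact test for over-representation of one term.

    list_hits: genes in the list annotated with the term
    list_size: total genes in the list
    term_size: genes in the population annotated with the term
    population_size: total genes in the reference population
    Returns the probability of seeing list_hits or more annotated genes
    when list_size genes are drawn at random without replacement.
    """
    total = comb(population_size, list_size)
    most_possible = min(list_size, term_size)
    tail = sum(
        comb(term_size, k) * comb(population_size - term_size, list_size - k)
        for k in range(list_hits, most_possible + 1)
    )
    return tail / total


# Hypothetical counts for illustration only; not values from the study.
p = enrichment_p_value(list_hits=5, list_size=5, term_size=5, population_size=20)
print(f"p = {p:.3g}")
```

            <p>Terms could then be ranked by this p-value, and a GTAN filtered to the genes attached to the top-ranked terms. A correction for testing many terms is deliberately left out of the sketch.</p>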
            <p>Currently, the prediction and creation of genome-wide pathways <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>, as well as the utilization and exploration of biological networks (genetic, regulatory, and biochemical) as a method for data analysis, are becoming a major trend in systems biology and computational biology <abbrgrp><abbr bid="B56">56</abbr><abbr bid="B57">57</abbr></abbrgrp>. Many new tools and algorithms are being developed in this direction <abbrgrp><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr><abbr bid="B61">61</abbr></abbrgrp>. Many use complicated algorithms such as Bayesian networks, Petri nets, probabilistic graphical models or newly defined rules to simulate regulatory networks and dynamic trajectories of genes. In addition, most of the commercial tools use gene-gene relationships as the major components of their networks, so that users may easily lose track of the role of genes. The GTAN approach in WPS is not only simple to use, but also unique and effective in that gene-term association relationships are the major components of the network, so that users can easily keep track of genes and the functional terms or pathways in which they are involved, and can predict gene-gene relationships through their shared terms.</p>
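            <p>The idea of predicting gene-gene relationships through shared terms can be illustrated with a short Python sketch. It is a hypothetical example rather than WPS code: the function name and the toy gene-term edges are assumptions. Each pair of genes is linked whenever the two genes share at least one term in the gene-term network.</p>
            <p><![CDATA[
```python
from itertools import combinations


def shared_term_links(edges):
    """Map each gene pair to the set of terms that the two genes share."""
    gene_terms = {}
    for gene, term in edges:
        gene_terms.setdefault(gene, set()).add(term)
    links = {}
    for a, b in combinations(sorted(gene_terms), 2):
        shared = gene_terms[a] & gene_terms[b]
        if shared:
            links[(a, b)] = shared
    return links


# Toy gene-term edges (hypothetical genes and annotations).
edges = [
    ("geneA", "sterol biosynthesis"),
    ("geneB", "sterol biosynthesis"),
    ("geneB", "lipid metabolism"),
    ("geneC", "lipid metabolism"),
    ("geneD", "endocytosis"),
]
links = shared_term_links(edges)
for pair, terms in links.items():
    print(pair, sorted(terms))
```
]]></p>
            <p>A gene with no shared term, such as the last gene in the toy data, stays unlinked, so the inferred gene-gene relations always remain traceable to named terms.</p>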
         </sec>
         <sec>
            <st>
               <p>Color cue template-based pattern extraction of gene lists for biological themes</p>
            </st>
            <p>A third unique feature of WPS is pattern extraction, which differs from other pattern or profile-based approaches (e.g. typical clustering and classification methods) and from some statistics-based methods (e.g. SAM <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>). Instead of relying solely on data values, the pattern extraction method in WPS takes advantage of the user-defined color criteria in CRI files that represent HTP datasets. Because users define the criteria in CRI files with logical expressions, and not just by the data values of genes in the datasets, genes with quite different data values can be assigned to the same class in terms of their behavior. For example, a gene with a fold change of 2 may be placed in the same class as a gene with a fold change of 8 if the user defines up-regulated genes as those changed "no less than 2 fold". If the user assigns red to this criterion, then red represents the class of up-regulated genes with a fold change of at least 2, whether the change is 2 fold or 8 fold, as long as the user is confident in the definition based on their own experience. One can also strengthen the definition by adding quality control factors such as a p-value to the logical expression, which is another advantage of CRI color criteria. This kind of definition thus removes the numerical difference but preserves the biological meaning embedded in the data values, since biological processes are more qualitative than quantitative in most cases.</p>
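            <p>A minimal Python sketch of this idea follows. It does not reproduce the CRI file syntax or the WPS code; the rule representation, the use of negative fold values for down-regulation, the 0.05 p-value cutoff, and the toy genes are all assumptions for illustration. Each measurement is mapped to a color by logical rules, and genes are then extracted when their colors across datasets match a color template.</p>
            <p><![CDATA[
```python
# Ordered (color, rule) pairs; the first rule that matches assigns the color.
# Assumption: down-regulation is encoded as a negative fold change.
CRITERIA = [
    ("red", lambda m: m["fold"] >= 2.0 and m["p"] <= 0.05),     # up-regulated
    ("green", lambda m: m["fold"] <= -2.0 and m["p"] <= 0.05),  # down-regulated
]


def color_of(measurement, criteria=CRITERIA, default="gray"):
    """Return the color class of one measurement under the given criteria."""
    for color, rule in criteria:
        if rule(measurement):
            return color
    return default


def extract_pattern(datasets, template):
    """Return the genes whose colors across datasets match the template."""
    matches = []
    for gene, measurements in datasets.items():
        colors = tuple(color_of(m) for m in measurements)
        if colors == template:
            matches.append(gene)
    return matches


# Hypothetical genes, each measured in two datasets.
datasets = {
    "geneA": [{"fold": 2.0, "p": 0.01}, {"fold": -3.0, "p": 0.02}],
    "geneB": [{"fold": 8.0, "p": 0.001}, {"fold": -2.5, "p": 0.03}],
    "geneC": [{"fold": 8.0, "p": 0.20}, {"fold": -4.0, "p": 0.01}],
}
print(extract_pattern(datasets, ("red", "green")))
```
]]></p>
            <p>Here the genes with fold changes of 2 and 8 fall into the same red class, while the third gene is excluded because its first measurement fails the p-value condition.</p>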
         </sec>
         <sec>
            <st>
               <p>Limitations and future direction</p>
            </st>
            <p>WPS facilitates comprehensive analysis and visualization of HTP data within the context of known biological pathways and gene-term association networks. The program will continue to be improved as the characterization of biological pathways and networks becomes increasingly comprehensive and challenging. The ultimate goal of WPS is to integrate all available information and databases, together with an individual user's data in different forms and formats, in the context of biological pathways and networks.</p>
            <p>The current version of WPS is a Windows-based program and serves as a proof of concept for pathway/gene analysis of HTP data. A future version will move to a three-tier architecture on a production-scale platform, in which WPS communicates through a middle layer, such as Java Servlets, with server resources that offer greater data storage and computational capacity. Its front-end interface will also evolve into a platform-independent client such as a web browser, depending on resource performance and other factors. Integration with other data sources and additional pathways will also be added, so that the scale of HTP data analysis can be greatly extended with the power of an expandable server.</p>
            <p>WPS provides Fisher's exact test for statistical evaluation, both globally across the whole system and locally within the PSCP files currently under analysis, whether these are derived from the internal database or customized by the user. It could be improved by the addition of more sophisticated statistical utilities, such as false discovery rate (FDR) estimation <abbrgrp><abbr bid="B62">62</abbr></abbrgrp> or other statistical enhancements (e.g. bootstrap methods), to analyze HTP data and determine the significance of functional enrichment in individual pathways or GO terms more rigorously. Computational requirements limit the full integration of these statistical methods, but even without them, our software contributes significantly to improving integrative data analysis.</p>
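            <p>To indicate what such an FDR utility might look like, the following Python sketch adjusts a list of enrichment p-values with the Benjamini-Hochberg step-up procedure. The choice of this particular procedure is an assumption made for illustration; it is not necessarily the method of the cited reference, and it is not part of WPS. The function name and the example p-values are hypothetical.</p>
            <p><![CDATA[
```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-term p-values from Fisher's exact test.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```
]]></p>
            <p>Terms would then be reported as enriched when their adjusted value falls below a chosen FDR level, rather than when the raw p-value does.</p>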
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>We have described WPS, a new pathway-based analysis tool that facilitates and enhances the analysis of HTP data in the context of biological pathways and networks. WPS combines many features that are not found together in any single existing application. Its color cue template-based pattern extraction resembles clustering analysis but groups genes in a more biologically relevant way. In addition, WPS uses Fisher's exact test to evaluate the statistical significance of identified genes. Finally, WPS uses pathways and association networks as biological contexts, together with a unique coloring scheme for multiple datasets and multiple pathways, to provide an intuitive way to visualize and analyze data from different sources. This is likely to be important for comparing HTP data from diverse sources such as microarray and proteomics experiments. The new pattern extraction method may provide another dimension for uncovering genes with quality-based, not just quantity-based, expression patterns, whose implications and themes are likely to be more closely related to ongoing biological processes. The new way of visualizing and analyzing the biological relations among genes, pathways, and terms in GTANs provides a new platform for integrated discovery. This tool represents a pathway-based platform for discovery integration to maximize analysis power.</p>
      </sec>
      <sec>
         <st>
            <p>Availability and requirements</p>
         </st>
         <p><b>Project name</b>: Pathway analysis tool WPS for high-throughput data</p>
         <p><b>Project home page</b>: <url>http://www.abcc.ncifcrf.gov/wps/wps_index.php</url><abbrgrp><abbr bid="B63">63</abbr></abbrgrp></p>
         <p><b>Operating system</b>: Microsoft Windows 2000 or XP</p>
         <p><b>Programming language</b>: Microsoft Visual Basic 6</p>
         <p><b>Other requirements</b>: Internal databases for different species and a collection of over 1900 PathwayScope Files (PSCP files for mouse) available on the web site; additional user-provided PSCP files and those from other sources will be made available as they are collected.</p>
         <p><b>License</b>: Free to academics; distributed through license agreement</p>
         <p><b>Any restrictions to use by non-academics</b>: commercial license needed</p>
      </sec>
      <sec>
         <st>
            <p>List of abbreviations used</p>
         </st>
         <p>WPS &#8211; WholePathwayScope</p>
         <p>PSCP &#8211; PathwayScope File</p>
         <p>WSCP &#8211; WholeScope File</p>
         <p>CRI &#8211; Criteria File</p>
         <p>GTAN &#8211; Gene-Term Association Network</p>
         <p>HTP &#8211; High Throughput</p>
         <p>G5G8 &#8211; <it>ABCG5 </it>and <it>ABCG8</it></p>
         <p>ABC &#8211; ATP-binding cassette</p>
         <p>Tg &#8211; Transgenic</p>
         <p>KO &#8211; Knockout</p>
         <p>KEGG &#8211; Kyoto Encyclopedia of Genes and Genomes</p>
         <p>CGAP &#8211; The Cancer Genome Anatomy Project</p>
         <p>SVG &#8211; Scalable Vector Graphics technology</p>
         <p>GO &#8211; Gene Ontology</p>
         <p>SAGE &#8211; Serial analysis of gene expression</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p><b>M Y </b>&#8211; Programmer Analyst III and Scientific Application Specialist from ABCC, responsible for design and implementation of WPS, a former member of UTSW.</p>
         <p><b>J D H </b>&#8211; Assistant Professor of UTSW, a collaborator who provided microarray data in the application examples.</p>
         <p><b>J C C </b>&#8211; Associate Professor of UTSW, a scientific partner of H H H, responsible for initial design of WPS in UTSW.</p>
         <p><b>H H H </b>&#8211; Director of McDermott Center and Investigator of HHMI, responsible for initial funding and design of WPS in UTSW.</p>
         <p><b>R M S </b>&#8211; Senior author, Supervisor of M Y in ABCC, responsible for design and improvement of WPS and funding for WPS in ABCC.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Linda Giang, Jigui Shan, Tammy Qiu, and Gary Smythers for technical assistance. We also thank Yanhui Hu and Joshua Labaer from Harvard Medical School for kindly providing a partial MedGene database, and Richard A. Lempicki and Wei Gao from NIAID, NIH for technical assistance and valuable discussion. We sincerely thank Carl Schaefer from the National Cancer Institute Center for Bioinformatics (NCICB) for providing CGAP BioCarta pathway data and information as well as other technical assistance. We especially thank Robert Guzman, Norma Anderson and Esther Nie from UT Southwestern Med Ctr. for excellent technical assistance. We also thank David W. Russell, Alexander Pertsemlidis and Jeff Schageman from UT Southwestern Med Ctr. for their helpful discussions. This work has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Contract No. N01-CO-12400. Initial funding came from The Howard Hughes Medical Institute and the National Institutes of Health (R01 HL72304 and NHLBI Program for Genomic Applications U01 HL66880).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Molecular classification of cancer: class discovery and class prediction by gene expression monitoring</p>
            </title>
            <aug>
               <au>
                  <snm>Golub</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Slonim</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Tamayo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Huard</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gaasenbeek</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Coller</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Loh</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Downing</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Caligiuri</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Bloomfield</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>286</volume>
            <fpage>531</fpage>
            <lpage>537</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.286.5439.531</pubid>
                  <pubid idtype="pmpid" link="fulltext">10521349</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Systematic variation in gene expression patterns in human cancer cell lines</p>
            </title>
            <aug>
               <au>
                  <snm>Ross</snm>
                  <fnm>DT</fnm>
               </au>
               <au>
                  <snm>Scherf</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Perou</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Rees</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Spellman</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Iyer</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Jeffrey</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Van de Rijn</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Waltham</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pergamenschikov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>JCF</fnm>
               </au>
               <au>
                  <snm>Lashkari</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Shalon</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>TG</fnm>
               </au>
               <au>
                  <snm>Weinstein</snm>
                  <fnm>JN</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>24</volume>
            <fpage>227</fpage>
            <lpage>235</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/73432</pubid>
                  <pubid idtype="pmpid" link="fulltext">10700174</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications</p>
            </title>
            <aug>
               <au>
                  <snm>Sorlie</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Perou</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Aas</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Geisler</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Johnsen</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hastie</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>van de Rijn</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jeffrey</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Thorsen</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Quist</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Matese</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lonning</snm>
                  <fnm>PE</fnm>
               </au>
               <au>
                  <snm>Borresen-Dale</snm>
                  <fnm>AL</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2001</pubdate>
            <volume>98</volume>
            <fpage>10869</fpage>
            <lpage>10874</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">58566</pubid>
                  <pubid idtype="pmpid" link="fulltext">11553815</pubid>
                  <pubid idtype="doi">10.1073/pnas.191367098</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Gene expression profiling predicts clinical outcome of breast cancer</p>
            </title>
            <aug>
               <au>
                  <snm>Van't Veer</snm>
                  <fnm>LJ</fnm>
               </au>
               <au>
                  <snm>Dai</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>van de Vijver</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>YD</fnm>
               </au>
               <au>
                  <snm>Hart</snm>
                  <fnm>AAM</fnm>
               </au>
               <au>
                  <snm>Mao</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Peterse</snm>
                  <fnm>HL</fnm>
               </au>
               <au>
                  <snm>van der Kooy</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Marton</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Witteveen</snm>
                  <fnm>AT</fnm>
               </au>
               <au>
                  <snm>Schreiber</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>Kerkhoven</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Roberts</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Linsley</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Bernards</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Friend</snm>
                  <fnm>SH</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>415</volume>
            <fpage>530</fpage>
            <lpage>536</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/415530a</pubid>
                  <pubid idtype="pmpid" link="fulltext">11823860</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Drug target validation and identification of secondary drug target effects using DNA microarrays</p>
            </title>
            <aug>
               <au>
                  <snm>Marton</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>DeRisi</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Bennett</snm>
                  <fnm>HA</fnm>
               </au>
               <au>
                  <snm>Iyer</snm>
                  <fnm>VR</fnm>
               </au>
               <au>
                  <snm>Meyer</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Roberts</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Stoughton</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Burchard</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Slade</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Dai</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bassett</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Hartwell</snm>
                  <fnm>LH</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Friend</snm>
                  <fnm>SH</fnm>
               </au>
            </aug>
            <source>Nat Med</source>
            <pubdate>1998</pubdate>
            <volume>4</volume>
            <fpage>1293</fpage>
            <lpage>1301</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/3282</pubid>
                  <pubid idtype="pmpid" link="fulltext">9809554</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Pharmacogenetics and practice of medicine</p>
            </title>
            <aug>
               <au>
                  <snm>Roses</snm>
                  <fnm>AD</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>405</volume>
            <fpage>857</fpage>
            <lpage>865</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35015728</pubid>
                  <pubid idtype="pmpid" link="fulltext">10866212</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>A gene expression database for the molecular pharmacology of cancer</p>
            </title>
            <aug>
               <au>
                  <snm>Scherf</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Ross</snm>
                  <fnm>DT</fnm>
               </au>
               <au>
                  <snm>Waltham</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>LH</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>JK</fnm>
               </au>
               <au>
                  <snm>Tanabe</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Kohn</snm>
                  <fnm>KW</fnm>
               </au>
               <au>
                  <snm>Reinhold</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>TG</fnm>
               </au>
               <au>
                  <snm>Andrews</snm>
                  <fnm>DT</fnm>
               </au>
               <au>
                  <snm>Scudiero</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Sausville</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Pommier</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Weinstein</snm>
                  <fnm>JN</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>24</volume>
            <fpage>236</fpage>
            <lpage>244</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/73439</pubid>
                  <pubid idtype="pmpid" link="fulltext">10700175</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Molecular analysis of commensal host-microbial relationships in the intestine</p>
            </title>
            <aug>
               <au>
                  <snm>Hooper</snm>
                  <fnm>LV</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>MH</fnm>
               </au>
               <au>
                  <snm>Thelin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hansson</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Falk</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Gordon</snm>
                  <fnm>JI</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>291</volume>
            <fpage>881</fpage>
            <lpage>884</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.291.5505.881</pubid>
                  <pubid idtype="pmpid" link="fulltext">11157169</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>The plasticity of dendritic cell responses to pathogens and their components</p>
            </title>
            <aug>
               <au>
                  <snm>Huang</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Majewski</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Schulte</snm>
                  <fnm>LC</fnm>
               </au>
               <au>
                  <snm>Korn</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Hacohen</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>294</volume>
            <fpage>870</fpage>
            <lpage>875</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.294.5543.870</pubid>
                  <pubid idtype="pmpid" link="fulltext">11679675</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Integrated genomics and proteomics analyses of a systematically perturbed metabolic network</p>
            </title>
            <aug>
               <au>
                  <snm>Ideker</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Thorsson</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Ranish</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Christmas</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Buhler</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Eng</snm>
                  <fnm>JK</fnm>
               </au>
               <au>
                  <snm>Bumgarner</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Goodlett</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Aebersold</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hood</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>292</volume>
            <fpage>929</fpage>
            <lpage>934</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.292.5518.929</pubid>
                  <pubid idtype="pmpid" link="fulltext">11340206</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Specific ablation of Stat3 beta distorts the pattern of Stat3-responsive gene expression and impairs recovery from endotoxic shock</p>
            </title>
            <aug>
               <au>
                  <snm>Yoo</snm>
                  <fnm>J-Y</fnm>
               </au>
               <au>
                  <snm>Huso</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Nathans</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Desiderio</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2002</pubdate>
            <volume>108</volume>
            <fpage>331</fpage>
            <lpage>344</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0092-8674(02)00636-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">11853668</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Functional discovery via a compendium of expression profiles</p>
            </title>
            <aug>
               <au>
                  <snm>Hughes</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Marton</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Roberts</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Stoughton</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Armour</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Bennett</snm>
                  <fnm>HA</fnm>
               </au>
               <au>
                  <snm>Coffey</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Dai</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>YD</fnm>
               </au>
               <au>
                  <snm>Kidd</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>King</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Meyer</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Slade</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lum</snm>
                  <fnm>PY</fnm>
               </au>
               <au>
                  <snm>Stepaniants</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Shoemaker</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Gachotte</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Chakraburtty</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Simon</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bard</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Friend</snm>
                  <fnm>SH</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2000</pubdate>
            <volume>102</volume>
            <fpage>109</fpage>
            <lpage>126</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0092-8674(00)00015-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">10929718</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Interpreting patterns of gene expression with self-organizing maps: Methods and application to hematopoietic differentiation</p>
            </title>
            <aug>
               <au>
                  <snm>Tamayo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Slonim</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Kitareewan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Dmitrovsky</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Golub</snm>
                  <fnm>TR</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <fpage>2907</fpage>
            <lpage>2912</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">15868</pubid>
                  <pubid idtype="pmpid" link="fulltext">10077610</pubid>
                  <pubid idtype="doi">10.1073/pnas.96.6.2907</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Cluster analysis and display of genome-wide expression patterns</p>
            </title>
            <aug>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Spellman</snm>
                  <fnm>PT</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>14863</fpage>
            <lpage>14868</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">24541</pubid>
                  <pubid idtype="pmpid" link="fulltext">9843981</pubid>
                  <pubid idtype="doi">10.1073/pnas.95.25.14863</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>The Institute for Genomic Research (TIGR) software download page</p>
            </title>
            <url>http://www.tigr.org/software/tm4/</url>
         </bibl>
         <bibl id="B16">
            <title>
               <p>GeneSpring, a product from Silicon Genetics Inc</p>
            </title>
            <url>http://www.silicongenetics.com</url>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Principal component analysis for clustering gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Yeung</snm>
                  <fnm>KY</fnm>
               </au>
               <au>
                  <snm>Ruzzo</snm>
                  <fnm>WL</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <fpage>763</fpage>
            <lpage>774</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/17.9.763</pubid>
                  <pubid idtype="pmpid" link="fulltext">11590094</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Significance analysis of microarrays applied to ionizing radiation response</p>
            </title>
            <aug>
               <au>
                  <snm>Tusher</snm>
                  <fnm>VG</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Chu</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2001</pubdate>
            <volume>98</volume>
            <fpage>5116</fpage>
            <lpage>5121</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">33173</pubid>
                  <pubid idtype="pmpid" link="fulltext">11309499</pubid>
                  <pubid idtype="doi">10.1073/pnas.091062498</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>The KEGG databases at GenomeNet</p>
            </title>
            <aug>
               <au>
                  <snm>Kanehisa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Goto</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kawashima</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Nakaya</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>42</fpage>
            <lpage>46</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99091</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752249</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.42</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Integrated pathway-genome databases and their role in drug discovery</p>
            </title>
            <aug>
               <au>
                  <snm>Karp</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Krummenacker</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Paley</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wagg</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Trends Biotechnol</source>
            <pubdate>1999</pubdate>
            <volume>17</volume>
            <fpage>275</fpage>
            <lpage>281</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0167-7799(99)01316-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">10370234</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>The MetaCyc Database</p>
            </title>
            <aug>
               <au>
                  <snm>Karp</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Riley</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Paley</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Pellegrini-Toole</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>59</fpage>
            <lpage>61</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99148</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752254</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.59</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Genome-scale gene expression analysis and pathway reconstruction in KEGG</p>
            </title>
            <aug>
               <au>
                  <snm>Nakao</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bono</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kawashima</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kamiya</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sato</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Goto</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kanehisa</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Inform Ser Workshop</source>
            <pubdate>1999</pubdate>
            <volume>10</volume>
            <fpage>94</fpage>
            <lpage>103</lpage>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Pathway studio &#8211; the analysis and navigation of molecular networks</p>
            </title>
            <aug>
               <au>
                  <snm>Nikitin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Egorov</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Daraselia</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Mazo</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>1</fpage>
            <lpage>3</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1093/bioinformatics/btg290</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>PathArt, a product of Jubilant Biosys Ltd</p>
            </title>
            <url>http://www.jubilantbiosys.com</url>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Ingenuity Pathways Analysis tool, a product of Ingenuity Systems Inc</p>
            </title>
            <url>http://www.ingenuity.com</url>
         </bibl>
         <bibl id="B26">
            <title>
               <p>MetaCore, a product of GeneGO Inc</p>
            </title>
            <url>http://www.genego.com</url>
         </bibl>
         <bibl id="B27">
            <title>
               <p>GenMAPP, a new tool for viewing and analyzing microarray data on biological pathways</p>
            </title>
            <aug>
               <au>
                  <snm>Dahlquist</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Salomonis</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Vranizan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Lawlor</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Conklin</snm>
                  <fnm>BR</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2002</pubdate>
            <volume>31</volume>
            <fpage>19</fpage>
            <lpage>20</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng0502-19</pubid>
                  <pubid idtype="pmpid" link="fulltext">11984561</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Cytoscape: A software environment for integrated models of biomolecular interaction networks</p>
            </title>
            <aug>
               <au>
                  <snm>Shannon</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Markiel</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ozier</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Baliga</snm>
                  <fnm>NS</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Ramage</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Amin</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Schwikowski</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Ideker</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>2498</fpage>
            <lpage>2504</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">403769</pubid>
                  <pubid idtype="pmpid" link="fulltext">14597658</pubid>
                  <pubid idtype="doi">10.1101/gr.1239303</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Pathway Processor: a tool for integrating whole-genome expression results into metabolic networks</p>
            </title>
            <aug>
               <au>
                  <snm>Grosu</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Townsend</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Hartl</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Cavalieri</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>1121</fpage>
            <lpage>1126</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">186628</pubid>
                  <pubid idtype="pmpid" link="fulltext">12097350</pubid>
                  <pubid idtype="doi">10.1101/gr.226602</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Visualizing metabolic activity on a genome-wide scale</p>
            </title>
            <aug>
               <au>
                  <snm>Luyf</snm>
                  <fnm>ACM</fnm>
               </au>
               <au>
                  <snm>de Gast</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>van Kampen</snm>
                  <fnm>AHC</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>813</fpage>
            <lpage>818</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.6.813</pubid>
                  <pubid idtype="pmpid" link="fulltext">12075016</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>National Center for Biotechnology Information (NCBI) website</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov</url>
         </bibl>
         <bibl id="B32">
            <title>
               <p>UniProt/SwissProt Knowledgebase Home Page</p>
            </title>
            <url>http://us.expasy.org/sprot/</url>
         </bibl>
         <bibl id="B33">
            <title>
               <p>The genetic association database</p>
            </title>
            <aug>
               <au>
                  <snm>Becker</snm>
                  <fnm>KG</fnm>
               </au>
               <au>
                  <snm>Barnes</snm>
                  <fnm>KC</fnm>
               </au>
               <au>
                  <snm>Bright</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>SA</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2004</pubdate>
            <volume>36</volume>
            <fpage>431</fpage>
            <lpage>432</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng0504-431</pubid>
                  <pubid idtype="pmpid" link="fulltext">15118671</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>MedGene Database, a database from Institute of Proteomics, Harvard Medical School</p>
            </title>
            <url>http://hipseq.med.harvard.edu/MEDGENE/</url>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Kyoto Encyclopedia of Genes and Genomes (KEGG) home page</p>
            </title>
            <url>http://www.genome.ad.jp/kegg</url>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Biocarta Pathway Collections</p>
            </title>
            <url>http://www.biocarta.com/genes/allPathways.asp</url>
         </bibl>
         <bibl id="B37">
            <title>
               <p>CGAP, the Cancer Genome Anatomy Project home page</p>
            </title>
            <url>http://cgap.nci.nih.gov/</url>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Gene Ontology Consortium home page</p>
            </title>
            <url>http://www.geneontology.org</url>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Gene Ontology: tool for the unification of biology</p>
            </title>
            <aug>
               <au>
                  <snm>Ashburner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ball</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Blake</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Butler</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Cherry</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Dolinski</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dwight</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Eppig</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Harris</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Hill</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Issel-Tarver</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Kasarskis</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lewis</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Matese</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Richardson</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Ringwald</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Sherlock</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>25</volume>
            <fpage>25</fpage>
            <lpage>29</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/75556</pubid>
                  <pubid idtype="pmpid" link="fulltext">10802651</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Creating the Gene Ontology Resource: design and implementation</p>
            </title>
            <aug>
               <au>
                  <cnm>The Gene Ontology Consortium</cnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>1425</fpage>
            <lpage>1433</lpage>
         </bibl>
         <bibl id="B41">
            <title>
               <p>WPS web pages for illustration image files and demo movies</p>
            </title>
            <url>http://www.abcc.ncifcrf.gov/wps/wps_demo.php</url>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Stanford format description</p>
            </title>
            <url>http://www.tm4.org/stanford_file_description.pdf</url>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Scalable Vector Graphics (SVG) specification</p>
            </title>
            <note><url>http://www.w3.org/TR/SVG/</url> and <b>Graphviz open source graph visualization technology </b><url>http://www.graphviz.org</url></note>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Identifying biological themes within lists of genes with EASE</p>
            </title>
            <aug>
               <au>
                  <snm>Hosack</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Dennis Jr</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Sherman</snm>
                  <fnm>BT</fnm>
               </au>
               <au>
                  <snm>Lane</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Lempicki</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <fpage>R70.1</fpage>
            <lpage>R70.8</lpage>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Overexpression of ABCG5 and ABCG8 promotes biliary cholesterol secretion and reduces fractional absorption of dietary cholesterol</p>
            </title>
            <aug>
               <au>
                  <snm>Yu</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Li-Hawkins</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hammer</snm>
                  <fnm>RE</fnm>
               </au>
               <au>
                  <snm>Berge</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Horton</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Cohen</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Hobbs</snm>
                  <fnm>HH</fnm>
               </au>
            </aug>
            <source>J Clin Invest</source>
            <pubdate>2002</pubdate>
            <volume>110</volume>
            <fpage>671</fpage>
            <lpage>680</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">151111</pubid>
                  <pubid idtype="pmpid" link="fulltext">12208868</pubid>
                  <pubid idtype="doi">10.1172/JCI200216001</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Disruption of Abcg5 and Abcg8 in mice reveals their crucial role in biliary cholesterol secretion</p>
            </title>
            <aug>
               <au>
                  <snm>Yu</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Hammer</snm>
                  <fnm>RE</fnm>
               </au>
               <au>
                  <snm>Li-Hawkins</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Von Bergmann</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Lutjohann</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Cohen</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Hobbs</snm>
                  <fnm>HH</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>16237</fpage>
            <lpage>16242</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">138595</pubid>
                  <pubid idtype="pmpid" link="fulltext">12444248</pubid>
                  <pubid idtype="doi">10.1073/pnas.252582399</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Accumulation of dietary cholesterol in sitosterolemia caused by mutations in adjacent ABC transporters</p>
            </title>
            <aug>
               <au>
                  <snm>Berge</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Tian</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Graf</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Grishin</snm>
                  <fnm>NV</fnm>
               </au>
               <au>
                  <snm>Schultz</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kwiterovich</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Shan</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Barnes</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hobbs</snm>
                  <fnm>HH</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>290</volume>
            <fpage>1771</fpage>
            <lpage>1775</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.290.5497.1771</pubid>
                  <pubid idtype="pmpid" link="fulltext">11099417</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>The bHLH gene Hes6, an inhibitor of Hes1, promotes neuronal differentiation</p>
            </title>
            <aug>
               <au>
                  <snm>Bae</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bessho</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hojo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kageyama</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Development</source>
            <pubdate>2000</pubdate>
            <volume>127</volume>
            <fpage>2933</fpage>
            <lpage>2943</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10851137</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>A FYVE-finger-containing protein, Rabip4, is a Rab4 effector involved in early endosomal traffic</p>
            </title>
            <aug>
               <au>
                  <snm>Cormont</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mari</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Galmiche</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hofman</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Le Marchand-Brustel</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2001</pubdate>
            <volume>98</volume>
            <fpage>1637</fpage>
            <lpage>1642</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">29309</pubid>
                  <pubid idtype="pmpid" link="fulltext">11172003</pubid>
                  <pubid idtype="doi">10.1073/pnas.031586998</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Induction of SM-20 in PC12 cells leads to increased cytochrome c levels, accumulation of cytochrome c in the cytosol and caspase-dependent cell death</p>
            </title>
            <aug>
               <au>
                  <snm>Straub</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Lipscomb</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Yoshida</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Freeman</snm>
                  <fnm>RS</fnm>
               </au>
            </aug>
            <source>J Neurochem</source>
            <pubdate>2003</pubdate>
            <volume>85</volume>
            <fpage>318</fpage>
            <lpage>328</lpage>
            <xrefbib>
               <pubid idtype="pmpid">12675908</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Serial analysis of gene expression</p>
            </title>
            <aug>
               <au>
                  <snm>Velculescu</snm>
                  <fnm>VE</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Vogelstein</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kinzler</snm>
                  <fnm>KM</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1995</pubdate>
            <volume>270</volume>
            <fpage>484</fpage>
            <lpage>487</lpage>
            <xrefbib>
               <pubid idtype="pmpid">7570003</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>EASE download page</p>
            </title>
            <url>http://david.niaid.nih.gov/david/ease.htm</url>
         </bibl>
         <bibl id="B53">
            <title>
               <p>FatiGO, a data mining tool with Gene Ontology from the Spanish National Cancer Center</p>
            </title>
            <url>http://fatigo.bioinfo.cnio.es</url>
         </bibl>
         <bibl id="B54">
            <title>
               <p>GoMiner, a Gene Ontology-based tool for biological interpretation of "omic" data from the National Cancer Institute</p>
            </title>
            <url>http://discover.nci.nih.gov/gominer/index.jsp</url>
         </bibl>
         <bibl id="B55">
            <title>
               <p>Computational prediction of human metabolic pathways from the complete human genome</p>
            </title>
            <aug>
               <au>
                  <snm>Romero</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Wagg</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Kaiser</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Krummenacker</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Karp</snm>
                  <fnm>PD</fnm>
               </au>
            </aug>
            <source>Genome Biology</source>
            <pubdate>2004</pubdate>
            <volume>6</volume>
            <fpage>R2.1</fpage>
            <lpage>R2.17</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1186/gb-2004-6-1-r2</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Systematic interpretation of genetic interactions using protein networks</p>
            </title>
            <aug>
               <au>
                  <snm>Kelley</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ideker</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nat Biotech</source>
            <pubdate>2005</pubdate>
            <volume>23</volume>
            <fpage>561</fpage>
            <lpage>566</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1038/nbt1096</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Metabolic pathway analysis web service (Pathway Hunter Tool at CUBIC)</p>
            </title>
            <aug>
               <au>
                  <snm>Rahman</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Advani</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Schunk</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Schrader</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Schomburg</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>1189</fpage>
            <lpage>1193</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti116</pubid>
                  <pubid idtype="pmpid" link="fulltext">15572476</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Genetic Network Analyzer: qualitative simulation of genetic regulatory networks</p>
            </title>
            <aug>
               <au>
                  <snm>de Jong</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Geiselmann</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hernandez</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Page</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>336</fpage>
            <lpage>344</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btf851</pubid>
                  <pubid idtype="pmpid" link="fulltext">12584118</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>Topnet &#8211; An application for interactive analysis of expression data and biological networks</p>
            </title>
            <aug>
               <au>
                  <snm>Hanisch</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Sohler</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Zimmer</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <fpage>1470</fpage>
            <lpage>1471</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bth096</pubid>
                  <pubid idtype="pmpid" link="fulltext">14962941</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>The yeast cell-cycle network is robustly designed</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Long</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Ouyang</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Tang</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2004</pubdate>
            <volume>101</volume>
            <fpage>4781</fpage>
            <lpage>4786</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">387325</pubid>
                  <pubid idtype="pmpid" link="fulltext">15037758</pubid>
                  <pubid idtype="doi">10.1073/pnas.0305937101</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Segal</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Shapira</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Regev</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pe'er</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Koller</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2003</pubdate>
            <volume>34</volume>
            <fpage>166</fpage>
            <lpage>176</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12740579</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>Statistical significance for genomewide studies</p>
            </title>
            <aug>
               <au>
                  <snm>Storey</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <fpage>9440</fpage>
            <lpage>9445</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">170937</pubid>
                  <pubid idtype="pmpid" link="fulltext">12883005</pubid>
                  <pubid idtype="doi">10.1073/pnas.1530509100</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>Download website for WholePathwayScope (WPS)</p>
            </title>
            <url>http://www.abcc.ncifcrf.gov/wps/wps_index.php</url>
         </bibl>
      </refgrp>
   </bm>
</art>
