<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>1471-2105-7-S2-S23</ui>
	<ji>1471-2105</ji>
	<fm>
		<dochead>Proceedings</dochead>
		<bibl>
			<title>
				<p>GOFFA: Gene Ontology For Functional Analysis &#8211; A FDA Gene Ontology Tool for Analysis of Genomic and Proteomic Data</p>
			</title>
			<aug>
				<au id="A1">
					<snm>Sun</snm>
					<fnm>Hongmei</fnm>
					<insr iid="I1"/>
					<email>hongmei.sun@fda.hhs.gov</email>
				</au>
				<au id="A2">
					<snm>Fang</snm>
					<fnm>Hong</fnm>
					<insr iid="I1"/>
					<email>hong.fang@fda.hhs.gov</email>
				</au>
				<au id="A3">
					<snm>Chen</snm>
					<fnm>Tao</fnm>
					<insr iid="I2"/>
					<email>tao.chen@fda.hhs.gov</email>
				</au>
				<au id="A4">
					<snm>Perkins</snm>
					<fnm>Roger</fnm>
					<insr iid="I1"/>
					<email>roger.perkins@fda.hhs.gov</email>
				</au>
				<au id="A5" ca="yes">
					<snm>Tong</snm>
					<fnm>Weida</fnm>
					<insr iid="I2"/>
					<email>weida.tong@fda.hhs.gov</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Z-tech Corporation, 3900 NCTR Road, Jefferson, Arkansas, 72079 USA</p>
				</ins>
				<ins id="I2">
					<p>National Center for Toxicological Research, Food and Drug Administration, 3900 NCTR Road, Jefferson, Arkansas, 72079 USA</p>
				</ins>
			</insg>
			<source>BMC Bioinformatics</source>
			<supplement>
				<title>
					<p>Third Annual MCBIOS Conference. Bioinformatics: A Calculated Discovery</p>
				</title>
				<editor>Jonathan D Wren (Senior Editor), Stephen Winters-Hilt, Yuriy Gusev, Andrey Ptitsyn</editor>
				<note>Proceedings</note>
				<url>http://www.mcbios.org</url>
			</supplement>
			<conference>
				<title>
					<p>Third Annual MidSouth Computational Biology and Bioinformatics Society (MCBIOS) Conference. Bioinformatics: A Calculated Discovery</p>
				</title>
				<location>Baton Rouge, LA, USA</location>
				<date-range>2&#8211;4 March, 2006</date-range>
			</conference>
			<issn>1471-2105</issn>
			<pubdate>2006</pubdate>
			<volume>7</volume>
			<issue>Suppl 2</issue>
			<fpage>S23</fpage>
			<xrefbib>
				<pubidlist><pubid idtype="pmpid">17118145</pubid><pubid idtype="doi">10.1186/1471-2105-7-S2-S23</pubid>
				</pubidlist></xrefbib>
		</bibl>
		<history>
			<pub>
				<date>
					<day>26</day>
					<month>9</month>
					<year>2006</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2006</year>
			<collab>Sun et al; licensee BioMed Central Ltd.</collab>
			<note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
		</cpyrt>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p>Gene Ontology (GO) characterizes and categorizes the functions of genes and their products according to biological processes, molecular functions and cellular components, facilitating interpretation of data from high-throughput genomics and proteomics technologies. The most effective use of GO information is achieved when its rich and hierarchical complexity is retained and the information is distilled to the biological functions that are most germane to the phenomenon being investigated.</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st>
					<p>Here we present an FDA GO tool named Gene Ontology for Functional Analysis (GOFFA). GOFFA first ranks GO terms in order of prevalence for a list of selected genes or proteins, and then allows the user to interactively select GO terms according to their significance and specific biological complexity within the hierarchical structure. GOFFA provides five interactive functions (Tree View, Terms View, Genes View, GO Path and GO TreePrune) to analyze the GO data. Among the five functions, GO Path and GO TreePrune are unique. GO Path rank-orders GOFFA Tree Paths based on statistical analysis and displays the ranked paths together. GO TreePrune provides a visual display of a reduced GO term set based on the user's statistical cut-offs. Together, these displays provide an intuitive depiction of the most likely relevant biological functions.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusion</p>
					</st>
					<p>With GOFFA, the user can dynamically interact with the GO data to interpret gene expression results in the context of biological plausibility, which can lead to new discoveries or identify new hypotheses.</p>
				</sec>
				<sec>
					<st>
						<p>Availability</p>
					</st>
					<p>GOFFA is available through the ArrayTrack software <url>http://edkb.fda.gov/webstart/arraytrack/</url>.</p>
				</sec>
			</sec>
		</abs>
	</fm>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>DNA microarray technology is a key application in pharmaco- and toxicogenomics, a field identified in the U.S. Food and Drug Administration (FDA) Critical Path Initiative <url>http://www.fda.gov/oc/initiatives/criticalpath/</url> as a major opportunity for advancing medical product development and personalized medicine. It is expected that the review of microarray-based medical devices and microarray data will become an essential regulatory responsibility for the FDA. A single microarray experiment generates a large volume of data, and the <it>management</it>, <it>analysis </it>and <it>interpretation </it>of these data challenge both sponsors and regulatory reviewers. Recognizing that integrating these three essential components into a single application will help to realize the full value of this exciting technology, the FDA's National Center for Toxicological Research (NCTR/FDA) developed ArrayTrack <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>, a free FDA bioinformatics resource that provides an integrated solution to manage, analyze, and interpret microarray data, with extension to systems biology data. ArrayTrack has been utilized by the FDA for the review of genomic data submissions <url>http://www.fda.gov/cder/guidance/6400fnl.pdf</url>.</p>
			<p>The primary emphasis of ArrayTrack is the direct linking of analysis results with functional information for facilitating the interaction between the choice of analysis methods and the biological relevance of analysis results. By selecting one of the analysis methods, the ArrayTrack user can directly link analysis results with functional information such as biological pathways and gene ontology. GOFFA (<ul>G</ul>ene <ul>O</ul>ntology <ul>F</ul>or <ul>F</ul>unctional <ul>A</ul>nalysis) is the primary biological interpretation tool using Gene Ontology (GO) <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp> in ArrayTrack.</p>
			<p>GO contains complex and rich information, which poses a challenge in developing statistical and visualization tools that use and present the information effectively and efficiently. Many approaches have been investigated to facilitate interpretation of gene expression data using the GO resource <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>. Most freely available GO tools are documented on the GO website <url>http://www.geneontology.org/GO.tools.microarray</url>. These tools are useful for browsing and viewing the GO context when interpreting genomic and proteomic data. However, some do not provide text-annotated GO tree structures (e.g., GoSurfer1.1), some do not retain the fundamental GO hierarchical tree structure (e.g., GoStat, EASE, Onto-Express), some are microarray specific (e.g., Ontology Traverser), and some depend on a particular operating system (e.g., GoSurfer1.1). Khatri et al <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, Zeeberg et al <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> and Zhang et al <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> have made extensive comparisons of various GO-based tools.</p>
			<p>Statistical analysis and visualization capabilities are the most important components of any GO tool. Statistical analysis focuses on determining the significant or enriched GO terms. The hypergeometric distribution <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>, chi-square <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> and Fisher's exact test <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> are the three most commonly used enrichment methods. More recently, the Relative Enrichment Factor was introduced by Zeeberg et al <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. Reducing GO information to a comprehensible subset on the basis of statistics alone is unsatisfactory without the aid of visualization. Thus, visualization of the GO hierarchy is another important part of the functionality of a useful GO tool. It is highly desirable that complex queries can be made directly against the visually displayed tree, so that statistics and visualization are fully integrated for efficient mining of the GO data.</p>
			<p>Here we report the GOFFA software, which is designed to further the ability to use GO for interpreting microarray data. GOFFA provides the most commonly used statistical functions in an interactive and user-friendly environment. Two functions in particular, GO Path and GO TreePrune, were implemented in GOFFA. Unlike other statistical methods, which consider each GO term separately and ignore the hierarchical nature of GO in the enrichment analysis, GO Path identifies the significant terms based on the GO hierarchical tree path using Fisher's inverse chi-squared method <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>. GO TreePrune is an interactive tool that provides statistical means to adjust and reduce the complexity of GO hierarchical tree information in a node-like visualization. As an integrated component of ArrayTrack, GOFFA has been used in the FDA to interpret both genomic and proteomic data submitted by sponsors through the Voluntary Genomics Data Submission (VGDS) mechanism <url>http://www.fda.gov/cder/genomics/VGDS.htm</url>.</p>
		</sec>
		<sec>
			<st>
				<p>Methods</p>
			</st>
			<p>GOFFA's core programming is based on the client-server model. The client is written in Java and runs on platforms with the Java run-time environment 1.4 or higher. The server is ORACLE. GOFFA is an integrated component of ArrayTrack, but can also be operated as an independent tool. Figure <figr fid="F1">1</figr> shows the program's logical structure.</p>
			<fig id="F1">
				<title>
					<p>Figure 1</p>
				</title>
				<caption>
					<p>Schematic overview of GOFFA's data flow</p>
				</caption>
				<text>
					<p>Schematic overview of GOFFA's data flow. GO terms from the Gene Ontology project and gene identifiers from the Entrez Gene databases are combined and linked in the GOFFA database. Lists of genes or proteins from an experiment are analyzed by five functional modules, Tree View, Terms View, Genes View, GO Path and GO TreePrune.</p>
				</text>
				<graphic file="1471-2105-7-S2-S23-1"/>
			</fig>
			<sec>
				<st>
					<p>GOFFA Database</p>
				</st>
				<p>GOFFA uses an ORACLE database containing the GO project data together with gene identifiers for human, mouse and rat from the NCBI Entrez gene database <url>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene</url>. The database currently contains 16,389 mouse, 11,934 human and 11,599 rat genes. The genes from these three species can also be combined in analyses using the "cross-products" feature, where the same gene symbols from human, mouse and rat (regardless of case) are considered to share the same functional annotation in GO; in this case, the GOFFA database contains 26,564 unique gene symbols for cross species annotation.</p>
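				<p>The case-insensitive pooling of gene symbols behind the "cross-products" feature can be illustrated with a minimal Python sketch. The function name, the dictionary layout and the example symbols are hypothetical choices made for illustration; they are not taken from the GOFFA database or its code.</p>

```python
def merge_symbols_across_species(symbols_by_species):
    """Pool gene symbols from several species, ignoring letter case.

    Returns a dict mapping each upper-cased symbol to the set of species
    in which it occurs (a hypothetical layout, not GOFFA's schema).
    """
    merged = {}
    for species, symbols in symbols_by_species.items():
        for symbol in symbols:
            key = symbol.upper()  # "Inhba" and "INHBA" count as the same symbol
            merged.setdefault(key, set()).add(species)
    return merged


# Made-up example input, for illustration only.
example = {
    "human": ["TP53", "INHBA"],
    "mouse": ["Trp53", "Inhba"],
    "rat": ["Tp53", "Inhba"],
}
pooled = merge_symbols_across_species(example)
print(len(pooled))  # 3 unique symbols: TP53, TRP53 and INHBA
```

				<p>In this sketch, symbols that differ by more than case (such as TP53 and Trp53) remain separate entries, which is consistent with the rule stated above that only identical symbols are treated as sharing the same annotation.</p>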
			</sec>
			<sec>
				<st>
					<p>GOFFA Tree</p>
				</st>
				<p>Data from the GO website are downloaded in a structure called a directed acyclic graph (DAG), a name that denotes an unclosed structure where a particular child node associated with a GO term can have multiple parent nodes. GOFFA converts the DAG structure to a tree structure by constructing distinct paths from the highest parent node (least specific), successively down through progeny to the lowest (most specific) child node. In converting the data, GOFFA maintains the GO database's so-called true path rule by ensuring that a GO term applicable to a gene product at a child node also applies at all of its parent terms. Thus, during the conversion to a tree structure, the DAG structure for each GO term can become many separate traversals from highest parent to lowest child. Each such traversal in GOFFA is called a GOFFA Tree Path, and each node along a GOFFA Tree Path is assigned a unique identification called a GOFFA ID. Consequently, the same GO term occurring in different GOFFA Tree Paths has a distinct GOFFA ID in each path. Restructuring the GO information in the GOFFA Tree Path format not only markedly speeds up database queries but, most importantly, enables the development of two unique utilities, GO Path and GO TreePrune (more in Results).</p>
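				<p>The conversion from a DAG to distinct root-to-term paths can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not GOFFA's implementation: the child-to-parents dictionary, the function names and the sequential integer IDs are all hypothetical.</p>

```python
from itertools import count


def enumerate_tree_paths(parents, term):
    """Return every root-to-term path in a DAG.

    `parents` maps a term to the list of its parent terms; a term that is
    missing from the map (or has no parents) is treated as a root.
    """
    term_parents = parents.get(term, [])
    if not term_parents:
        return [[term]]
    paths = []
    for parent in term_parents:
        for path in enumerate_tree_paths(parents, parent):
            paths.append(path + [term])
    return paths


def assign_path_node_ids(paths):
    """Give each node on each path its own ID (assumed scheme: 1, 2, 3, ...),
    so that a term occurring on several paths receives several IDs."""
    next_id = count(1)
    return [[(next(next_id), term) for term in path] for path in paths]


# Toy DAG: term D has two parents, B and C, which share the root A.
toy_parents = {"D": ["B", "C"], "B": ["A"], "C": ["A"]}
paths = enumerate_tree_paths(toy_parents, "D")
labelled = assign_path_node_ids(paths)
print(paths)     # [['A', 'B', 'D'], ['A', 'C', 'D']]
print(labelled)  # D appears once per path, each time with a different ID
```

				<p>Because every path is stored explicitly, a query for the ancestors of a node becomes a simple lookup along one path, which is one plausible reason for the query speed-up mentioned above.</p>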
			</sec>
			<sec>
				<st>
					<p>Statistical Analysis</p>
				</st>
				<p>A fundamental step in analyzing DNA microarray data is to determine the differentially expressed genes (DEGs) for subsequent biological interpretation. GOFFA applies three statistical approaches to determine the significant GO terms for a given list of DEGs, two previously reported methods and one novel approach:</p>
				<p>&#8226; Fisher's Exact Test &#8211; A right-sided Fisher's Exact Test <url>http://www.matforsk.no/ola/fisher.htm</url> is used to estimate the statistical significance of GO term <it>i</it>. Four gene counts are needed to calculate the significance (i.e., the p-value): the number of inputted genes (<it>M</it>) <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, the number of those <it>M </it>genes that belong to GO term <it>i </it>(<it>m</it><sub><it>i</it></sub>), the number of reference genes (<it>N</it>) and the number of those <it>N </it>genes that belong to GO term <it>i </it>(<it>n</it><sub><it>i</it></sub>). The accuracy of the p-value depends largely on the choice of the reference gene set. GOFFA offers two options for determining <it>N</it>, depending on whether or not the genes are derived from a known microarray chip. For a known microarray chip, <it>N </it>is the total number of genes on the chip; in this case, a p-value less than 0.01 is normally indicative of a statistically significant finding. GOFFA provides information associated with most commercial array platforms, including most GeneChip platforms from Affymetrix, most one- and two-channel array platforms from Agilent, and numerous other array platforms such as GE Healthcare CodeLink, Illumina BeadArray and Applied Biosystems arrays. If the genes on the microarray chip are unknown, the total number of genes in the GOFFA database is assigned as <it>N</it>. In this case, <it>N </it>depends on the selected species: currently 16,389 genes for mouse, 11,934 for human, 11,599 for rat, and 26,564 if all three species are combined. Thus, the choice of <it>N </it>is an important factor in interpreting the p-value.</p>
				<p>&#8226; Relative Enrichment Factor &#8211; GOFFA also calculates the Relative Enrichment Factor (E) for assessing the significance of GO term <it>i </it>for a given list of DEGs <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. The E-value is calculated as:</p>
				<p>E = (<it>m</it><sub><it>i</it></sub>/<it>M</it>)/(<it>n</it><sub><it>i</it></sub>/<it>N</it>) &#160;&#160;&#160; (1)</p>
				<p>where <it>m</it><sub><it>i</it></sub>, <it>M</it>, <it>n</it><sub><it>i </it></sub>and <it>N </it>are defined as for Fisher's Exact Test in the preceding paragraph. E provides a direct measure of the prevalence of GO term <it>i </it>among the <it>M </it>significant genes compared to the prevalence of the same GO term <it>i </it>among the <it>N </it>total genes. Accordingly, E = 1.0 corresponds to GO term <it>i </it>occurring among the DEGs at the same prevalence as among the <it>N </it>total genes. E = 2.0 indicates that GO term <it>i </it>occurs among the DEGs at twice its prevalence among the <it>N </it>total genes, which points to a significant finding.</p>
				<p>&#8226; GOFFA Tree Path Ranking &#8211; Criteria based on Fisher's Exact Test and/or the Relative Enrichment Factor sometimes fail to sufficiently condense and clarify results for effective interpretation, especially for large lists of significant genes. This provided the motivation for developing this unique function in GOFFA. The method applies Fisher's inverse chi-squared method <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp> to sort GOFFA Tree Paths in accordance with their likely significance, and then renders an interactive graphic display for visualization and interpretation. Fisher's inverse chi-squared method uses the fact that, given a uniformly distributed p-value, -2log(<it>p</it>) has a chi-square distribution with two degrees of freedom, and hence the statistic</p>
				<p>
					<graphic file="1471-2105-7-S2-S23-i1.gif"/>
				</p>
				<p>follows a chi-square distribution with 2<it>K </it>degrees of freedom when the joint null is true. In our case, <it>p</it><sub><it>k </it></sub>is the Fisher's Exact Test probability value of GO term <it>k </it>and <it>K </it>is the total number of GO terms along the traversal of the GOFFA Tree Path from the upper level of the tree down to GO term <it>i</it>. Thus, <it>R</it><sub><it>i </it></sub>is a relative metric of the prevalence of a GOFFA Tree Path from the upper level to the level to which GO term <it>i </it>belongs, given that the <it>p</it><sub><it>k </it></sub>values are known for each GO term on the path. The greater the value of <it>R</it><sub><it>i</it></sub>, the less likely it is that the significance of a GOFFA Tree Path is a chance occurrence.</p>
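				<p>The three statistics described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than GOFFA's own code: the function names are hypothetical, the right-sided Fisher's Exact Test is computed as the upper tail of the hypergeometric distribution, and the path statistic is assumed to take the standard form of Fisher's inverse chi-squared method, that is, minus twice the sum of the natural logarithms of the p-values along the path.</p>

```python
from math import comb, log


def fisher_right_tail(m_i, M, n_i, N):
    """Right-sided Fisher's exact p-value: P(X >= m_i) when M genes are
    drawn from N reference genes, n_i of which belong to GO term i."""
    upper = min(M, n_i)
    tail = sum(comb(n_i, x) * comb(N - n_i, M - x) for x in range(m_i, upper + 1))
    return tail / comb(N, M)


def relative_enrichment(m_i, M, n_i, N):
    """Relative Enrichment Factor, E = (m_i / M) / (n_i / N), as in equation 1."""
    return (m_i / M) / (n_i / N)


def path_score(p_values):
    """Assumed form of the path statistic: -2 * sum(ln p_k) over the K terms
    on a path. Every p-value must be greater than zero."""
    return -2.0 * sum(log(p) for p in p_values)


# Made-up counts: 5 input genes, 3 of them in the term; 20 reference genes,
# 4 of them in the term.
p_value = fisher_right_tail(3, 5, 4, 20)
e_value = relative_enrichment(3, 5, 4, 20)
r_value = path_score([0.5, 0.5])
print(p_value, e_value, r_value)
```

				<p>With these made-up counts the term is three times more prevalent in the input list than in the reference set, and the exact tail probability is 496/15504. Exact integer arithmetic via <it>math.comb</it> avoids rounding problems for small examples; a production tool would more likely rely on a statistics library.</p>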
			</sec>
			<sec>
				<st>
					<p>Availability</p>
				</st>
				<p>GOFFA is available through ArrayTrack software <url>http://edkb.fda.gov/webstart/arraytrack/</url>.</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Results</p>
			</st>
			<sec>
				<st>
					<p>GOFFA Features</p>
				</st>
				<p>The GOFFA GUI, shown in Figure <figr fid="F2">2</figr>, has three panels with different functions that are designed for intuitive and interactive use. The left panel (labeled 1) is for queries, the center panel (labeled 2) for tabular and/or graphical displays of and for interaction with the GO information, and the right panel (labeled 3) lists the individual genes associated with the GO information presented in the center panel.</p>
				<fig id="F2">
					<title>
						<p>Figure 2</p>
					</title>
					<caption>
						<p>GOFFA interface and Tree Window &#8211; The GOFFA interface contains three panels: the left panel (labeled 1) is for queries, the center panel (labeled 2) for tabular and/or graphical displays of and for interaction with the GO information, and the right panel (labeled 3) lists the individual genes associated with the GO information presented in the center panel</p>
					</caption>
					<text>
						<p>GOFFA interface and Tree Window &#8211; The GOFFA interface contains three panels: the left panel (labeled 1) is for queries, the center panel (labeled 2) for tabular and/or graphical displays of and for interaction with the GO information, and the right panel (labeled 3) lists the individual genes associated with the GO information presented in the center panel. The displayed Tree Window in the center panel is the default view of GOFFA, which enables the hierarchical display of the GO terms in an outline-like tree format; p- and E-values as well as the number of genes are also displayed for each GO term. E-values &gt;1 are shown in green and those &lt;1 in red, denoting greater or lesser prevalence, respectively, of the GO term in the inputted gene list than in the overall experimental platform. The user can query the tree by GO term, gene name/symbol, p-value, E-value or any combination of these, using the functions below the view. The GO terms that match the query are highlighted in blue.</p>
					</text>
					<graphic file="1471-2105-7-S2-S23-2"/>
				</fig>
				<p>Queries are initiated in the left panel by pasting DEG IDs into the query window, one gene per line. The input gene IDs must correspond to the "Select data type" option chosen by the user. Currently, GOFFA supports four types of gene identifiers: (1) GenBank Accession number, (2) Unigene ID, (3) LocusLink ID (or Entrez Gene ID) and (4) Gene Name. In addition, GOFFA supports two protein identifiers, IPI ID (EBI International Protein Index database) and Swiss-Prot accession number, for proteomics data analysis. The GOFFA database currently contains GO annotation data for 105 microarray platforms that, with the "Select array type" option, are coupled with the GO analysis and available for display. Query results are displayed in the center panel in five interactive viewing windows, Tree, Terms, Genes, GO Path and GO TreePrune, that are activated with tabs at the top of the panel. These five windows provide the means for applying and iteratively re-applying statistical operators to the inputted (DEG) gene list, viewing statistical results, and viewing the results of GOFFA's Tree, GO Path, and GO TreePrune analysis. The data and results within both tables and plots are synchronized components, enabling mouse-click toggling between window views. For example, genes associated with GO terms that are selected through mouse clicks in the GOFFA Tree (panel 2, Figure <figr fid="F2">2</figr>) are displayed as a list in the right panel (panel 3, Figure <figr fid="F2">2</figr>).</p>
				<p>The user can toggle between the center panel's five windows (panel 2, Figure <figr fid="F2">2</figr>), providing another level of iterative interactivity. Each window either displays information differently, or displays different information related to the inputted genes:</p>
				<p>&#8226; Tree window &#8211; The Tree window is the default viewer that is launched after a search, and appears in the center panel. As shown in Figure <figr fid="F2">2</figr>, the Tree window displays GO terms in an outline-like hierarchical tree format (conventional view). The number of associated genes, the Relative Enrichment Factor (E-value), and the p-value from Fisher's Exact Test are displayed for each GO term at each GO hierarchical level. Since query results can form an extensive list, a flexible search capability is provided below the tree display. The user can search the tree by GO term, gene name/symbol, p-value, E-value, or a combination of these; search results are then highlighted in blue within the display for easy location, with the associated gene(s) listed in the right panel.</p>
				<p>&#8226; Terms and Genes windows &#8211; These two windows provide alternative, tabular presentations of the information contained in the Tree window (Figure <figr fid="F3">3</figr>). Whereas the Tree window combines the three categories of GO path information, both the Terms window (Figure <figr fid="F3">3a</figr>) and Genes window (Figure <figr fid="F3">3b</figr>) separately display Molecular Function, Biological Process and Cellular Component category information, as chosen with a tab, and present it in an Excel-like spreadsheet format. As indicated by their names, the Terms window aggregates information by individual GO term, whereas the Genes window aggregates information by individual gene. Both windows display results of statistical operators (p-value and E-value). The Terms window displays the number of significant genes associated with each GO term, as well as the average hierarchical level at which the gene appears in the GO term. Tables in both windows can be sorted in either ascending or descending order of any column, and can be cut and pasted or exported to external software for further analysis.</p>
				<fig id="F3">
					<title>
						<p>Figure 3</p>
					</title>
					<caption>
						<p>Terms and Genes Windows &#8211; The Terms Window (A) and Genes Window (B) summarize the findings associated with GO terms and genes respectively in the tabular format along with various statistical parameters (e.g., p- and E-values)</p>
					</caption>
					<text>
						<p>Terms and Genes Windows &#8211; The Terms Window (A) and Genes Window (B) summarize the findings associated with GO terms and genes respectively in the tabular format along with various statistical parameters (e.g., p- and E-values). Each View contains three tables corresponding to three categories of GO (molecular functions, biological processes and cellular components). The table can be sorted on any column by clicking on the column header. Sorting on multiple columns is also supported (press the Ctrl key while clicking on the second column header). Both copy/paste and export functions are available to transfer data to external software.</p>
					</text>
					<graphic file="1471-2105-7-S2-S23-3"/>
				</fig>
				<p>&#8226; GO Path window &#8211; The GOFFA GO Path plot (Figure <figr fid="F4">4</figr>) visually presents the GOFFA Tree Paths estimated as the most relevant by equation 2. The GOFFA algorithm first rank-orders all GOFFA Tree Paths using equation 2 values, and then plots the 10 paths with the highest values, with the X-axis corresponding to descending hierarchical tree level, and the Y-axis corresponding to the log p value at each hierarchical level (Figure <figr fid="F4">4</figr>). Double-clicking any GOFFA Tree Path in the graph or its color key located below the graph launches a Tree window view (Figure <figr fid="F2">2</figr>, panel 2) with the GO terms corresponding to the GOFFA Tree Path highlighted in blue for easy recognition. The GO Path visualization can be considered a condensed rendering of the most salient GO information associated with the DEG data.</p>
				<fig id="F4">
					<title>
						<p>Figure 4</p>
					</title>
					<caption>
						<p>GO Path &#8211; GO Path sorts, by descending statistical significance based on an inverse Chi-Squared test, the GOFFA Tree Paths (i.e., linked GO terms) and graphically displays them from high to low at each hierarchical level</p>
					</caption>
					<text>
						<p>GO Path &#8211; GO Path sorts, by descending statistical significance based on an inverse Chi-Squared test, the GOFFA Tree Paths (i.e., linked GO terms) and graphically displays them from high to low at each hierarchical level. GO Path plots the top ten paths with solid circles representing the GO terms on the path. The X-axis has the hierarchical level to which the GO term belongs and the Y-axis (log p) indicates the statistical significance of the term. A color key for the top 10 paths (as determined by equation 2) is located beneath the plot. Clicking either a circle in a path in the plot or its corresponding color key launches a Tree View (Figure 2) with the selected path highlighted in blue. Other features are also available from a popup menu obtained by right clicking the plot, including zoom in/out, export figure, etc.</p>
					</text>
					<graphic file="1471-2105-7-S2-S23-4"/>
				</fig>
				<p>&#8226; GO TreePrune &#8211; This visualization tool displays the GO terms in a node-like hierarchical tree structure, as shown in the example in Figure <figr fid="F5">5</figr>. Note that the plot is annotated with the p-value, E-value, and number of associated genes at each node of the tree. The number of genes associated with each node is also depicted in the pie chart as a fraction of the genes associated with the root node. The GO TreePrune plots can be very large and complex; as a result, GOFFA provides a tool for pruning the tree by assigning arbitrary and simultaneous cutoffs for p-value, E-value, and number of genes. Nodes that do not meet the cutoff values specified by the user are removed from the plot.</p>
				<fig id="F5">
					<title>
						<p>Figure 5</p>
					</title>
					<caption>
						<p>GO TreePrune &#8211; This node-like tree display allows the user to filter out nodes and thus reduce the complexity of a tree by specifying the p- and E-value as well as the user-defined number of genes in the end node</p>
					</caption>
					<text>
						<p>GO TreePrune &#8211; This node-like tree display allows the user to filter out nodes and thus reduce the complexity of a tree by specifying the p- and E-value as well as the user-defined number of genes in the end node. A GO term is represented by a sectored pie, where the red sector shows the percentage of the inputted genes associated with the term. The individual genes associated with each term are displayed in the right panel by single clicking the term. The annotation of a term can be turned on or off by double clicking the term. Each term is movable with mouse drag, which is convenient when working on a dense tree or with many annotations. The tree diagram can be zoomed and moved by holding down the right or left button of the mouse, respectively.</p>
					</text>
					<graphic file="1471-2105-7-S2-S23-5"/>
				</fig>
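				<p>The selection steps behind the GO Path and GO TreePrune windows can be sketched in Python as follows. The sketch makes several assumptions that are not confirmed by the description above: paths are ranked by minus twice the summed natural logarithms of their p-values, a term is kept only if it passes all three cutoffs at once, and pruning is shown as a flat filter rather than an operation on the tree itself. All names and all example values are hypothetical.</p>

```python
from math import log


def rank_paths(path_p_values, top=10):
    """Order paths by descending path statistic and keep the first `top`.

    `path_p_values` maps a path name to the p-values of the terms on it.
    """
    def score(name):
        return -2.0 * sum(log(p) for p in path_p_values[name])

    return sorted(path_p_values, key=score, reverse=True)[:top]


def prune_terms(terms, max_p, min_e, min_genes):
    """Keep only the terms that satisfy all three user cutoffs simultaneously."""
    return [
        t for t in terms
        if max_p >= t["p"] and t["e"] >= min_e and t["genes"] >= min_genes
    ]


# Made-up p-values for three paths.
example_paths = {
    "path_a": [0.01, 0.02],
    "path_b": [0.5, 0.4],
    "path_c": [0.001],
}
top_two = rank_paths(example_paths, top=2)

# Made-up term statistics, for illustration only.
example_terms = [
    {"name": "defense response", "p": 0.001, "e": 3.2, "genes": 12},
    {"name": "response to stress", "p": 0.04, "e": 1.5, "genes": 20},
    {"name": "induction of apoptosis", "p": 0.2, "e": 2.5, "genes": 6},
]
kept = prune_terms(example_terms, max_p=0.05, min_e=2.0, min_genes=5)
print(top_two)
print([t["name"] for t in kept])
```

				<p>In this toy input only one term passes all three cutoffs, which mirrors how tightening the cutoffs reduces a dense tree to a handful of nodes. How the remaining nodes are reconnected after pruning is not covered by this sketch.</p>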
			</sec>
			<sec>
				<st>
					<p>GOFFA Application</p>
				</st>
				<p>A dataset from a toxicogenomics study was used to demonstrate the utility of GOFFA. In this study, the renal toxicity and carcinogenicity associated with aristolochic acid (AA) treatment in rats were studied using DNA microarrays. AA is an active component of herbal drugs derived from some plants that have been used for medicinal purposes since ancient times <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. The compound is a nephrotoxin and carcinogen in humans and rodents. To investigate the effect of AA exposure on gene expression in rat kidney, a toxicogenomics study was conducted; the experimental protocol is described by Chen et al. in an accompanying paper in the same issue. Briefly, six-week-old Big Blue rats were treated with AA or control vehicle for 3 months. One day after the last treatment, the animals were sacrificed and the kidneys were removed for microarray analysis using the Applied Biosystems Rat Genome Survey Microarray. Both treated and control samples had six biological replicates (rats). The data normalization and analysis were conducted using ArrayTrack. The DEG list was determined based on p &lt; 0.01 and Fold Change &gt; 2. Since GOFFA is fully integrated with ArrayTrack, the DEGs from ArrayTrack were passed directly to GOFFA for functional analysis. Of 1176 identified genes, 417 genes had GO information for analysis <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. The GOFFA results are summarized in Figures <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, <figr fid="F5">5</figr>.</p>
				<p>The statistics based on a combination of Fisher's Exact Test (p &lt; 0.05) and Relative Enrichment Factor (E &gt; 2) identified 52 enriched GO terms in the GO biological process category. The majority of the terms are related to four functional categories: induction of apoptosis, defense response, response to stress, and amino acid metabolism. These four functional categories reflect the known biological and pharmacological responses of kidney to the AA treatment <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. Out of these four functional categories, GO Path ranked "defense response" as an important mechanism associated with the AA treatment (Figure <figr fid="F4">4</figr>), and similar results were obtained from GO TreePrune as well (Figure <figr fid="F5">5</figr>). This finding is consistent with the general understanding that defense response, which includes immune response, is a complex network response of a tissue to toxins and carcinogens (such as AA) for defending the body. Figure <figr fid="F2">2</figr> gives the GO Path results in the Tree window, where the majority of genes involved in the defense response are up-regulated to oppose damage by AA. For example, the <it>inhba </it>gene (first gene in the right panel) encodes a growth factor and showed a 4.1-fold increase in expression in kidney. This is a tumor-suppressor gene and it produces a protein that increases arrest in the G1 phase of tumor cells <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Therefore, its induction inhibits tumorigenesis in kidney treated with AA.</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion</p>
			</st>
			<p>A fundamental step in analyzing DNA microarray data is to determine the differentially expressed genes (DEGs) that are presumably relevant to the biological phenomena under study. However, in microarray experiments that use chips with thousands of genes, where a small subset of DEGs is determined for a disease or toxicity, the potential for both type 1 and type 2 errors can be large. Both types of error suggest the need for biologists to intervene in the data reduction and analysis process beyond the application of statistics. The GOFFA software was designed with the biologist in mind. The platform provides a means to analyze and scrutinize the complex data from genomics and proteomics experiments in the context of the existing knowledge of gene function as embodied by the GO database. It gives the biologist alternative ways to summarize data, statistically select the most relevant data, or examine in fine detail the biological phenomena associated with selected data.</p>
			<p>GOFFA is a client-server application written in Java for portability, with a GUI designed with the assistance of biologists for intuitive ease of use. The GUI is logically divided into three panels (Figure <figr fid="F2">2</figr>): queries (panel 1), analysis and results (panel 2), and gene lists (panel 3). The GO analysis, results tables, graphs, and visualization tools are accessed from the analysis and results panel (Figure <figr fid="F2">2</figr>, panel 2), which maintains data linkage so that selected data can easily be examined in different ways.</p>
			<p>GOFFA's efficiency and effectiveness for data interpretation result from treating GO data as a set of distinct hierarchical GOFFA Tree Paths. Application of statistical tests to the GOFFA Tree Paths enables two unique interpretive functions, GO Path and GO TreePrune. GO Path provides rank-ordered estimates of the statistically important GOFFA Tree Paths. GO TreePrune provides the ability to prune GO trees by removing GO terms according to their p- and E-values in conjunction with a user-defined number of genes that the terms contain. These two functions apply different statistical approaches to rank and/or narrow down the GO terms for further analysis and interpretation. When used together, the functions enable the biologist to reduce the complexity of the data to what is most relevant, select that information, and then drill down to examine it at a more refined level of detail.</p>
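			<p>The pruning idea behind GO TreePrune can be sketched as follows. This is an illustrative Python sketch, not GOFFA's code: the <it>Term</it> class, the <it>prune</it> function, the default cutoffs, and the rule that a failing term is retained when one of its descendants passes (so that the path to the passing term stays visible) are all assumptions.</p>
			<p>
```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """Hypothetical GO tree node carrying the statistics used for pruning."""
    name: str
    p: float          # p-value of the term
    e: float          # enrichment factor (E-value) of the term
    genes: int        # number of input genes annotated to the term
    children: list = field(default_factory=list)


def prune(term, p_cut=0.05, e_cut=2.0, min_genes=3):
    """Return a pruned copy of the subtree rooted at term, or None.

    A term passes when its p-value is below p_cut, its E-value is above
    e_cut, and it contains at least min_genes genes. Assumption: a failing
    term is still kept if any of its descendants passes.
    """
    pruned_children = (prune(c, p_cut, e_cut, min_genes) for c in term.children)
    kept = [c for c in pruned_children if c is not None]
    passes = p_cut > term.p and term.e > e_cut and term.genes >= min_genes
    if passes or kept:
        return Term(term.name, term.p, term.e, term.genes, kept)
    return None


if __name__ == "__main__":
    # Hypothetical statistics, chosen only to exercise the pruning rule.
    root = Term("biological_process", 1.0, 1.0, 400, [
        Term("defense response", 0.001, 3.0, 20, [
            Term("immune response", 0.2, 1.1, 2),
        ]),
        Term("transport", 0.5, 0.9, 50),
    ])
    pruned = prune(root)
    print([child.name for child in pruned.children])
```
			</p>
			<p>Returning a pruned copy rather than editing the tree in place leaves the full tree available, so different cutoffs can be tried on the same data.</p>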
			<p>The statistical estimators used in GOFFA (as well as in other similar GO tools) should be interpreted as heuristic metrics of the potential biological significance of GO terms, rather than as formal inferences of biological relevance. They are most reliable for problem solving when all genes from an experiment are known, since the prevalent GO terms in the DEGs are compared with the prevalent GO terms in the set of reference genes. For example, the absolute p-value from Fisher's Exact Test has little value unless the total set of genes on the chip is used as the set of reference genes. The same applies to the E-value. GOFFA currently provides gene lists for over 100 commercial array types (e.g., most GeneChip and Agilent arrays), for which the GO terms are pre-mapped and stored in the database for quick retrieval and analysis. With this information, GOFFA's statistical estimators can provide a more meaningful significance assessment for interpretation of the GO results. If the input gene list is not associated with an array type, the total number of genes in the GOFFA database is used for the statistical estimates; while this will, for example, unrealistically skew p-values, the p-values across the GO terms will still retain meaning in a relative sense.</p>
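			<p>The dependence on the reference set can be seen in a small Python sketch. All counts are hypothetical (apart from the 417 annotated DEGs mentioned in the application example), and the one-sided Fisher's Exact Test and the E definition used here are assumptions rather than GOFFA's documented formulas.</p>
			<p>
```python
from math import comb


def right_tail_p(k, n, K, N):
    """P(X >= k) under the hypergeometric null (one-sided Fisher's exact test)."""
    hits = range(k, min(n, K) + 1)
    return sum(comb(K, i) * comb(N - K, n - i) for i in hits) / comb(N, n)


def enrichment(k, n, K, N):
    """Assumed E definition: DEG fraction in the term over reference fraction."""
    return (k / n) / (K / N)


# Same DEG counts for one GO term, evaluated against two reference sets.
k, n = 12, 417                 # DEGs in the term, DEGs with GO information
chip_K, chip_N = 150, 20000    # hypothetical: term size and total on the chip
db_K, db_N = 300, 60000        # hypothetical: term size and total in a database

p_chip = right_tail_p(k, n, chip_K, chip_N)
p_db = right_tail_p(k, n, db_K, db_N)
e_chip = enrichment(k, n, chip_K, chip_N)
e_db = enrichment(k, n, db_K, db_N)

if __name__ == "__main__":
    print("chip reference:    ", p_chip, e_chip)
    print("database reference:", p_db, e_db)
```
			</p>
			<p>With these hypothetical numbers the term is rarer in the larger reference set, so the same DEG counts give a smaller p-value and a larger E. This is the kind of skew described above, while the ranking of terms within one analysis can still be informative.</p>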
			<p>While GOFFA itself is a powerful analysis tool, its full utility derives from its integration as a module of the ArrayTrack software. ArrayTrack is a comprehensive software platform for microarray data management, analysis and interpretation <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. The integration of GOFFA with ArrayTrack enables microarray data to be easily processed in the ArrayTrack environment and the resultant DEG list to be immediately interpreted with GOFFA. Importantly, ArrayTrack has been interfaced with various commercial pathway software packages, providing an additional means to investigate the validity of GOFFA findings with respect to relevant gene ontologies.</p>
		</sec>
		<sec>
			<st>
				<p>Conclusion</p>
			</st>
			<p>A common characteristic of high-throughput omics technologies, such as DNA microarrays, is the generation of huge datasets that make it possible to examine differential expression between corresponding genes in treatment and control groups. GOFFA enhances the capability to interpret data generated from these technologies. GOFFA applies statistical analysis in conjunction with intuitive visual display to present GO terms, trees and paths in a manner that facilitates biological interpretation. Two unique tools are available in GOFFA, GO Path and GO TreePrune, both enabling fast and interactive interrogation of significant gene and protein lists through statistical assessment and visual inspection. GOFFA is a module of ArrayTrack, the FDA's microarray data management, analysis and interpretation software.</p>
		</sec>
		<sec>
			<st>
				<p>Authors' contributions</p>
			</st>
			<p>HS developed GOFFA and wrote the first draft of the manuscript. WT conceived the concept of the GO Path function and finalized the manuscript. HF was involved in the GOFFA interface design and testing and contributed significantly to finishing the first draft of the manuscript. TC helped prepare the section on the real-world application of GOFFA. All authors were involved in the design of the GOFFA functions and user interface. All authors participated in the preparation of the manuscript and approved its final form.</p>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>The authors express gratitude to Steve Harris and Xiaoxi Cao, the developers of ArrayTrack, for advising on many aspects of the software and database programming, and in particular for assistance with interfacing GOFFA and ArrayTrack.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>Development of public toxicogenomics software for microarray data management and analysis</p>
				</title>
				<aug>
					<au>
						<snm>Tong</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Harris</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Cao</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Fang</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Shi</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Sun</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Fuscoe</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Harris</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Hong</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Xie</snm>
						<fnm>Q</fnm>
					</au>
					<etal/>
				</aug>
				<source>Mutat Res</source>
				<pubdate>2004</pubdate>
				<volume>549</volume>
				<issue>1&#8211;2</issue>
				<fpage>241</fpage>
				<lpage>253</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">15120974</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>ArrayTrack &#8211; supporting toxicogenomic research at the U.S. Food and Drug Administration National Center for Toxicological Research</p>
				</title>
				<aug>
					<au>
						<snm>Tong</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Cao</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Harris</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Sun</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Fang</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Fuscoe</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Harris</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Hong</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Xie</snm>
						<fnm>Q</fnm>
					</au>
					<au>
						<snm>Perkins</snm>
						<fnm>R</fnm>
					</au>
					<etal/>
				</aug>
				<source>Environ Health Perspect</source>
				<pubdate>2003</pubdate>
				<volume>111</volume>
				<issue>15</issue>
				<fpage>1819</fpage>
				<lpage>1826</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1241745</pubid>
						<pubid idtype="pmpid">14630514</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Gene ontology: tool for the unification of biology. The Gene Ontology Consortium</p>
				</title>
				<aug>
					<au>
						<snm>Ashburner</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Ball</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Blake</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Botstein</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Butler</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Cherry</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Davis</snm>
						<fnm>AP</fnm>
					</au>
					<au>
						<snm>Dolinski</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Dwight</snm>
						<fnm>SS</fnm>
					</au>
					<au>
						<snm>Eppig</snm>
						<fnm>JT</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2000</pubdate>
				<volume>25</volume>
				<issue>1</issue>
				<fpage>25</fpage>
				<lpage>29</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/75556</pubid>
						<pubid idtype="pmpid" link="fulltext">10802651</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Gene Ontology: looking backwards and forwards</p>
				</title>
				<aug>
					<au>
						<snm>Lewis</snm>
						<fnm>SE</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2005</pubdate>
				<volume>6</volume>
				<issue>1</issue>
				<fpage>103</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">549054</pubid>
						<pubid idtype="pmpid" link="fulltext">15642104</pubid>
						<pubid idtype="doi">10.1186/gb-2004-6-1-103</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>GoMiner: a resource for biological interpretation of genomic and proteomic data</p>
				</title>
				<aug>
					<au>
						<snm>Zeeberg</snm>
						<fnm>BR</fnm>
					</au>
					<au>
						<snm>Feng</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Fojo</snm>
						<fnm>AT</fnm>
					</au>
					<au>
						<snm>Sunshine</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Narasimhan</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Kane</snm>
						<fnm>DW</fnm>
					</au>
					<au>
						<snm>Reinhold</snm>
						<fnm>WC</fnm>
					</au>
					<au>
						<snm>Lababidi</snm>
						<fnm>S</fnm>
					</au>
					<etal/>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2003</pubdate>
				<volume>4</volume>
				<issue>4</issue>
				<fpage>R28</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">154579</pubid>
						<pubid idtype="pmpid" link="fulltext">12702209</pubid>
						<pubid idtype="doi">10.1186/gb-2003-4-4-r28</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>GObar: a gene ontology based analysis and visualization tool for gene sets</p>
				</title>
				<aug>
					<au>
						<snm>Lee</snm>
						<fnm>JS</fnm>
					</au>
					<au>
						<snm>Katari</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Sachidanandam</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>BMC Bioinformatics</source>
				<pubdate>2005</pubdate>
				<volume>6</volume>
				<fpage>189</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1190157</pubid>
						<pubid idtype="pmpid" link="fulltext">16042800</pubid>
						<pubid idtype="doi">10.1186/1471-2105-6-189</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>GoSurfer: a graphical interactive tool for comparative analysis of large gene sets in Gene Ontology space</p>
				</title>
				<aug>
					<au>
						<snm>Zhong</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Storch</snm>
						<fnm>KF</fnm>
					</au>
					<au>
						<snm>Lipan</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Kao</snm>
						<fnm>MC</fnm>
					</au>
					<au>
						<snm>Weitz</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Wong</snm>
						<fnm>WH</fnm>
					</au>
				</aug>
				<source>Appl Bioinformatics</source>
				<pubdate>2004</pubdate>
				<volume>3</volume>
				<issue>4</issue>
				<fpage>261</fpage>
				<lpage>264</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.2165/00822942-200403040-00009</pubid>
						<pubid idtype="pmpid">15702958</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B8">
				<title>
					<p>DynGO: a tool for visualizing and mining of Gene Ontology and its associations</p>
				</title>
				<aug>
					<au>
						<snm>Liu</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Hu</snm>
						<fnm>ZZ</fnm>
					</au>
					<au>
						<snm>Wu</snm>
						<fnm>CH</fnm>
					</au>
				</aug>
				<source>BMC Bioinformatics</source>
				<pubdate>2005</pubdate>
				<volume>6</volume>
				<fpage>201</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1199584</pubid>
						<pubid idtype="pmpid" link="fulltext">16091147</pubid>
						<pubid idtype="doi">10.1186/1471-2105-6-201</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>Comparative analysis of gene sets in the Gene Ontology space under the multiple hypothesis testing framework</p>
				</title>
				<aug>
					<au>
						<snm>Zhong</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Tian</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Storch</snm>
						<fnm>KF</fnm>
					</au>
					<au>
						<snm>Wong</snm>
						<fnm>WH</fnm>
					</au>
				</aug>
				<source>Proc IEEE Comput Syst Bioinform Conf</source>
				<pubdate>2004</pubdate>
				<fpage>425</fpage>
				<lpage>435</lpage>
				<xrefbib>
					<pubid idtype="pmpid">16448035</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks</p>
				</title>
				<aug>
					<au>
						<snm>Maere</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Heymans</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Kuiper</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2005</pubdate>
				<volume>21</volume>
				<issue>16</issue>
				<fpage>3448</fpage>
				<lpage>3449</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/bti551</pubid>
						<pubid idtype="pmpid" link="fulltext">15972284</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>A graph-theoretic modeling on GO space for biological interpretation of gene clusters</p>
				</title>
				<aug>
					<au>
						<snm>Lee</snm>
						<fnm>SG</fnm>
					</au>
					<au>
						<snm>Hur</snm>
						<fnm>JU</fnm>
					</au>
					<au>
						<snm>Kim</snm>
						<fnm>YS</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2004</pubdate>
				<volume>20</volume>
				<issue>3</issue>
				<fpage>381</fpage>
				<lpage>388</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/btg420</pubid>
						<pubid idtype="pmpid" link="fulltext">14960465</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>GOTree Machine (GOTM): a web-based platform for interpreting sets of interesting genes using Gene Ontology hierarchies</p>
				</title>
				<aug>
					<au>
						<snm>Zhang</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Schmoyer</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Kirov</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Snoddy</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>BMC Bioinformatics</source>
				<pubdate>2004</pubdate>
				<volume>5</volume>
				<fpage>16</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">373441</pubid>
						<pubid idtype="pmpid" link="fulltext">14975175</pubid>
						<pubid idtype="doi">10.1186/1471-2105-5-16</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Onto-Tools: an ensemble of web-accessible, ontology-based tools for the functional design and interpretation of high-throughput gene expression experiments</p>
				</title>
				<aug>
					<au>
						<snm>Khatri</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Bhavsar</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Bawa</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Draghici</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2004</pubdate>
				<volume>32</volume>
				<issue>Web Server</issue>
				<fpage>W449</fpage>
				<lpage>456</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">441547</pubid>
						<pubid idtype="pmpid" link="fulltext">15215428</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>Recent additions and improvements to the Onto-Tools</p>
				</title>
				<aug>
					<au>
						<snm>Khatri</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Sellamuthu</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Malhotra</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Amin</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Done</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Draghici</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2005</pubdate>
				<volume>33</volume>
				<issue>Web Server</issue>
				<fpage>W762</fpage>
				<lpage>765</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1160233</pubid>
						<pubid idtype="pmpid" link="fulltext">15980579</pubid>
						<pubid idtype="doi">10.1093/nar/gki472</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>GO::TermFinder &#8211; open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes</p>
				</title>
				<aug>
					<au>
						<snm>Boyle</snm>
						<fnm>EI</fnm>
					</au>
					<au>
						<snm>Weng</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Gollub</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Jin</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Botstein</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Cherry</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Sherlock</snm>
						<fnm>G</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2004</pubdate>
				<volume>20</volume>
				<issue>18</issue>
				<fpage>3710</fpage>
				<lpage>3715</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/bth456</pubid>
						<pubid idtype="pmpid" link="fulltext">15297299</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>GOstat: find statistically overrepresented Gene Ontologies within a group of genes</p>
				</title>
				<aug>
					<au>
						<snm>Beissbarth</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Speed</snm>
						<fnm>TP</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2004</pubdate>
				<volume>20</volume>
				<issue>9</issue>
				<fpage>1464</fpage>
				<lpage>1465</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/bth088</pubid>
						<pubid idtype="pmpid" link="fulltext">14962934</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>Integration of the Gene Ontology into an object-oriented architecture</p>
				</title>
				<aug>
					<au>
						<snm>Shegogue</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Zheng</snm>
						<fnm>WJ</fnm>
					</au>
				</aug>
				<source>BMC Bioinformatics</source>
				<pubdate>2005</pubdate>
				<volume>6</volume>
				<issue>1</issue>
				<fpage>113</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1156866</pubid>
						<pubid idtype="pmpid" link="fulltext">15885145</pubid>
						<pubid idtype="doi">10.1186/1471-2105-6-113</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B18">
				<title>
					<p>High-Throughput GoMiner, an 'industrial-strength' integrative gene ontology tool for interpretation of multiple-microarray experiments, with application to studies of Common Variable Immune Deficiency (CVID)</p>
				</title>
				<aug>
					<au>
						<snm>Zeeberg</snm>
						<fnm>BR</fnm>
					</au>
					<au>
						<snm>Qin</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Narasimhan</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Sunshine</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Cao</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Kane</snm>
						<fnm>DW</fnm>
					</au>
					<au>
						<snm>Reimers</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Stephens</snm>
						<fnm>RM</fnm>
					</au>
					<au>
						<snm>Bryant</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Burt</snm>
						<fnm>SK</fnm>
					</au>
					<etal/>
				</aug>
				<source>BMC Bioinformatics</source>
				<pubdate>2005</pubdate>
				<volume>6</volume>
				<fpage>168</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1190154</pubid>
						<pubid idtype="pmpid" link="fulltext">15998470</pubid>
						<pubid idtype="doi">10.1186/1471-2105-6-168</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>Ontological analysis of gene expression data: current tools, limitations, and open problems</p>
				</title>
				<aug>
					<au>
						<snm>Khatri</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Draghici</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2005</pubdate>
				<volume>21</volume>
				<issue>18</issue>
				<fpage>3587</fpage>
				<lpage>3595</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/bti565</pubid>
						<pubid idtype="pmpid" link="fulltext">15994189</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B20">
				<title>
					<p>Evidence for large domains of similarly expressed genes in the Drosophila genome</p>
				</title>
				<aug>
					<au>
						<snm>Spellman</snm>
						<fnm>PT</fnm>
					</au>
					<au>
						<snm>Rubin</snm>
						<fnm>GM</fnm>
					</au>
				</aug>
				<source>J Biol</source>
				<pubdate>2002</pubdate>
				<volume>1</volume>
				<issue>1</issue>
				<fpage>5</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">117248</pubid>
						<pubid idtype="pmpid" link="fulltext">12144710</pubid>
						<pubid idtype="doi">10.1186/1475-4924-1-5</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>Transcriptional regulation and function during the human cell cycle</p>
				</title>
				<aug>
					<au>
						<snm>Cho</snm>
						<fnm>RJ</fnm>
					</au>
					<au>
						<snm>Huang</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Campbell</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Dong</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Steinmetz</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Sapinoso</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Hampton</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Elledge</snm>
						<fnm>SJ</fnm>
					</au>
					<au>
						<snm>Davis</snm>
						<fnm>RW</fnm>
					</au>
					<au>
						<snm>Lockhart</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2001</pubdate>
				<volume>27</volume>
				<issue>1</issue>
				<fpage>48</fpage>
				<lpage>54</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/83751</pubid>
						<pubid idtype="pmpid" link="fulltext">11137997</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>Biostatistics: A methodology for health sciences</p>
				</title>
				<aug>
					<au>
						<snm>Fisher</snm>
						<fnm>LD</fnm>
					</au>
					<au>
						<snm>van Belle</snm>
						<fnm>G</fnm>
					</au>
				</aug>
				<publisher>New York: John Wiley and Sons;</publisher>
				<pubdate>1993</pubdate>
			</bibl>
			<bibl id="B23">
				<title>
					<p>Statistical Methods For Research Workers</p>
				</title>
				<aug>
					<au>
						<snm>Fisher</snm>
						<fnm>RA</fnm>
					</au>
				</aug>
				<publisher>London: Oliver and Boyd;</publisher>
				<pubdate>1932</pubdate>
			</bibl>
			<bibl id="B24">
				<title>
					<p>Statistical Methods for Meta-Analysis</p>
				</title>
				<aug>
					<au>
						<snm>Hedges</snm>
						<fnm>LV</fnm>
					</au>
					<au>
						<snm>Olkin</snm>
						<fnm>I</fnm>
					</au>
				</aug>
				<publisher>Academic Press;</publisher>
				<pubdate>1985</pubdate>
			</bibl>
			<bibl id="B25">
				<title>
					<p>Calculation is based on only those genes that are identifiable in the GOFFA database</p>
				</title>
				<aug>
					<au>
						<cnm>Note</cnm>
					</au>
				</aug>
			</bibl>
			<bibl id="B26">
				<title>
					<p>Aristolochic acid as a probable human cancer hazard in herbal remedies: a review</p>
				</title>
				<aug>
					<au>
						<snm>Arlt</snm>
						<fnm>VM</fnm>
					</au>
					<au>
						<snm>Stiborova</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Schmeiser</snm>
						<fnm>HH</fnm>
					</au>
				</aug>
				<source>Mutagenesis</source>
				<pubdate>2002</pubdate>
				<volume>17</volume>
				<issue>4</issue>
				<fpage>265</fpage>
				<lpage>277</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/mutage/17.4.265</pubid>
						<pubid idtype="pmpid" link="fulltext">12110620</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>The role of activin A in regulation of hemopoiesis</p>
				</title>
				<aug>
					<au>
						<snm>Shav-Tal</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Zipori</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Stem Cells</source>
				<pubdate>2002</pubdate>
				<volume>20</volume>
				<issue>6</issue>
				<fpage>493</fpage>
				<lpage>500</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1634/stemcells.20-6-493</pubid>
						<pubid idtype="pmpid" link="fulltext">12456957</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
		</refgrp>
	</bm>
</art>

