<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2007-8-1-r3</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Software</dochead>
      <bibl>
         <title>
            <p>GENECODIS: a web-based tool for finding significant concurrent annotations in gene lists</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Carmona-Saez</snm>
               <fnm>Pedro</fnm>
               <insr iid="I1"/>
               <email>pcarmona@cnb.uam.es</email>
            </au>
            <au id="A2">
               <snm>Chagoyen</snm>
               <fnm>Monica</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>monica@cnb.uam.es</email>
            </au>
            <au id="A3">
               <snm>Tirado</snm>
               <fnm>Francisco</fnm>
               <insr iid="I2"/>
               <email>ptirado@dacya.ucm.es</email>
            </au>
            <au id="A4">
               <snm>Carazo</snm>
               <mi>M</mi>
               <fnm>Jose</fnm>
               <insr iid="I1"/>
               <email>carazo@cnb.uam.es</email>
            </au>
            <au id="A5" ca="yes">
               <snm>Pascual-Montano</snm>
               <fnm>Alberto</fnm>
               <insr iid="I2"/>
               <email>pascual@fis.ucm.es</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>BioComputing Unit, National Center of Biotechnology (CNB-CSIC), C/Darwin 3, Campus Universidad Aut&#243;noma de Madrid, 28049 Madrid, Spain</p>
            </ins>
            <ins id="I2">
               <p>Computer Architecture Department, Facultad de Ciencias F&#237;sicas, Universidad Complutense de Madrid, C/Avenida Complutense S/N, 28040 Madrid, Spain</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2007</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>R3</fpage>
         <url>http://genomebiology.com/2007/8/1/R3</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17204154</pubid>
               <pubid idtype="doi">10.1186/gb-2007-8-1-r3</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>3</day>
               <month>7</month>
               <year>2006</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>29</day>
               <month>9</month>
               <year>2006</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>4</day>
               <month>1</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>4</day>
               <month>1</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Carmona-Saez et al.; licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <shorttitle>
         <p>Finding concurrent annotations in gene lists</p>
      </shorttitle>
      <shortabs>
         <p>GENECODIS, a web-based tool for finding annotations that frequently co-occur in a set of genes and ranking them by their statistical significance, is presented.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>We present GENECODIS, a web-based tool that integrates different sources of information to search for annotations that frequently co-occur in a set of genes and rank them by statistical significance. The analysis of concurrent annotations provides significant information for the biologic interpretation of high-throughput experiments and may outperform the results of standard methods for the functional analysis of gene lists. GENECODIS is publicly available at <url>http://genecodis.dacya.ucm.es/</url>.</p>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010002">Bioinformatics</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Rationale</p>
         </st>
         <p>High-throughput experimental techniques such as DNA microarrays or proteomics allow researchers to study biologic systems from a global perspective. In many cases, the net result of these experiments is a large list of genes or proteins that are potentially interesting for the analyzed system, for example genes that are differentially expressed between normal and pathologic tissues. A logical further step in the analysis workflow is to translate such lists of significant genes into functional descriptors that help researchers to elucidate the biologic meaning of their experimental results.</p>
         <p>Since Khatri and coworkers introduced Onto-Express <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, several methods have been proposed within this context, aimed at interpreting and extracting biologic knowledge from large lists of genes or proteins. Most of these applications find biologic annotations that are significantly enriched in a list of genes with respect to a reference set, usually the whole genome or those genes used in a microarray. Using a specific source of information, for example Gene Ontology (GO) <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, those tools first find all of the GO terms associated with the set of analyzed genes. The number of appearances of each term is then determined in the input and reference lists, and a statistical test - usually the hypergeometric, &#967;<sup>2</sup>, binomial, or Fisher's exact test - is used to compute <it>p </it>values, which are subsequently adjusted for multiple testing. The result of this analysis is a list of single biological annotations from a given ontology (for instance, GO terms) with their corresponding <it>p </it>values. Those terms with <it>p </it>values indicating statistical significance are representative of the analyzed list of genes and can provide information about the underlying biologic processes. Good reviews of such methods are available elsewhere <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>.</p>
         <p>Most of the currently available tools, however, are designed to evaluate single annotations, which means that they provide a list of annotations with their corresponding <it>p </it>values without taking into account the potential relationships among them. Finding relationships among annotations based on co-occurrence patterns can extend our understanding of the biologic events associated with a given experimental system. For example, a set of differentially expressed genes may be associated with the activation of biologic processes that are restricted to certain cellular organelles. Retrieving such associations provides meaningful and additional information for the interpretation of the experimental results.</p>
         <p>In addition, the analysis of single annotations may show limitations in some cases. A simple motivating example of such limitations can be explained by using a hypothetical case of GO terms. There are categories such as 'signal transduction' that, although related to concrete aspects of the cell physiology, are associated with genes that are involved in disparate biologic processes, and therefore they may be annotated together with other terms such as 'cell proliferation' or 'apoptosis'. In this scenario, in a list of genes annotated as 'signal transduction' and 'cell proliferation', we may find that neither of these terms is significant because a large number of genes in the genome belonging to each one of these categories are not included in the analyzed set. By contrast, the co-occurrence of both categories might be significant if most of the genes simultaneously annotated with both terms are included in the list. This co-occurrence information reveals that a significant proportion of genes in the set are involved in specific signaling pathways related to cell proliferation. Therefore, relevant associations might be underestimated if only single annotations are taken into account.</p>
         <p>These observations prompted us to develop GENECODIS, a web-based tool for finding sets of biological annotations that frequently appear together and are significant in a set of genes. It allows the integrated analysis of annotations from different sources (for example, KEGG pathways, Swiss-Prot keywords, GO, and InterPro motifs) and generates statistical rank scores for single annotations and their combinations. We believe that GENECODIS is an important extension of existing tools for the functional analysis of gene lists. GENECODIS is publicly available from the application website <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>.</p>
      </sec>
      <sec>
         <st>
            <p>The GENECODIS algorithm</p>
         </st>
         <p>The application that we propose is simple in its concept; it takes a list of genes as input and determines biological annotations or combinations of annotations that are over-represented with respect to a reference list. The novelty of this tool lies in the fact that, before computing the statistical test, it incorporates a new functionality to extract all combinations of annotations that appear in at least <it>x </it>genes, with <it>x </it>being a user-defined threshold (Figure <figr fid="F1">1</figr> shows an overview of the methodology).</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Overview of the methodology</p>
            </caption>
            <text>
               <p>Overview of the methodology. <b>(a) </b>Annotations from several sources are assigned to genes in the input list. <b>(b) </b>The <it>apriori </it>algorithm is applied to find sets of annotations that frequently co-occur in the input list. <b>(c) </b>The statistical significance of each annotation or set of concurrent annotations is calculated based on its frequency in the input and reference sets. The figure illustrates an example in which a list of yeast genes is annotated with Gene Ontology (GO) terms for 'cellular component' and KEGG pathways. In the output table only the annotations that co-occur in more than five genes are shown.</p>
            </text>
            <graphic file="gb-2007-8-1-r3-1"/>
         </fig>
         <sec>
            <st>
               <p>Finding sets of terms that frequently appear together in a list of genes</p>
            </st>
            <p>To extract combinations of gene annotations, GENECODIS uses a modification of the methodology reported by Carmona-Saez and coworkers <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, which implements the <it>apriori </it>algorithm to extract associations among gene annotations and expression patterns.</p>
            <p>The <it>apriori </it>algorithm was originally introduced by Agrawal and coworkers <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> and has been extensively used to extract association rules from transaction databases. This algorithm generates sets of elements that frequently co-occur in a database of transactions. Briefly, the procedure starts by determining the set of all single annotations ('itemsets' of size 1) that appear in at least <it>x </it>genes from the list of interest (<it>x </it>is also known as the support threshold), which establishes the frequent <it>k </it>itemsets for <it>k </it>= 1. In the second iteration (<it>k </it>= 2), the set of frequent annotations found in the previous step is used to produce the new set of candidates of size 2 (2-itemsets), and the database is scanned again to explore each gene and count the frequency of each pair of annotations. If a set of annotations does not satisfy the minimum support constraint - that is, it does not occur in at least <it>x </it>genes - then it is not considered further when generating larger itemsets. The procedure continues until no more combinations are possible. At the end of this search, all sets of annotations that co-occur in at least <it>x </it>genes are obtained (Additional data file 1).</p>
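            <p>The level-wise search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GENECODIS implementation: the function name frequent_annotation_sets, the dictionary layout, and the toy gene identifiers g1 to g5 are hypothetical. For brevity, the support of a candidate is obtained by intersecting the gene sets of its single annotations rather than by rescanning the gene list at each iteration; both approaches give the same counts.</p>
            <p>
```python
from itertools import combinations


def frequent_annotation_sets(gene_annotations, min_support=3):
    """Map each annotation set shared by at least min_support genes
    to the set of genes that carry all of its annotations."""
    # k = 1: collect the genes that carry each single annotation.
    support = {}
    for gene, terms in gene_annotations.items():
        for term in terms:
            support.setdefault(frozenset([term]), set()).add(gene)
    current = {s: g for s, g in support.items() if len(g) >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        next_level = {}
        for cand in candidates:
            # Prune: every (k-1)-subset of a candidate must itself be frequent.
            if not all(frozenset(sub) in current for sub in combinations(cand, k - 1)):
                continue
            # Genes annotated simultaneously with every term in the candidate.
            genes = set.intersection(*(frequent[frozenset([t])] for t in cand))
            if len(genes) >= min_support:
                next_level[cand] = genes
        frequent.update(next_level)
        current = next_level
        k += 1
    return frequent


# Hypothetical toy input: gene identifier -> set of annotations.
toy = {
    "g1": {"peroxisome", "lipid metabolism"},
    "g2": {"peroxisome", "lipid metabolism"},
    "g3": {"peroxisome", "lipid metabolism", "mitochondrion"},
    "g4": {"mitochondrion"},
    "g5": {"peroxisome"},
}
result = frequent_annotation_sets(toy, min_support=3)
for itemset, genes in result.items():
    print(sorted(itemset), len(genes))
```
            </p>
            <p>With a support threshold of 3, the toy list yields 'peroxisome', 'lipid metabolism', and the pair formed by both. The term 'mitochondrion' is carried by only two genes, so it is discarded in the first iteration and no larger set containing it is ever generated.</p>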
            <p>In our previous work <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> we used the <it>apriori </it>algorithm to extract association rules among gene annotations and expression patterns. However, in this work we use it as the initial step in the methodology included in GENECODIS, namely the extraction of sets of annotations that frequently co-occur in a gene list.</p>
            <p>It is important to note that increasing the number of different items (sources of annotation in this case) while decreasing the minimum support value can significantly multiply the number of concurrences and thus the computation time. Additional data file 1 contains a complete study of execution time and size of the itemsets for different support values in real datasets. Very extreme scenarios, such as extracting all possible combinations of terms that appear in at least one gene (support value of 1), are in many cases computationally infeasible. For this reason we have provided the application with a minimum support value of 3, which is a reasonable threshold to extract significant biological information from gene lists.</p>
         </sec>
         <sec>
            <st>
               <p>Statistical analysis</p>
            </st>
            <p>Once all combinations of annotations that appear in at least <it>x </it>genes have been extracted, the method counts the occurrence of each set of annotations in the list of genes and in a reference list. Note that for each set of concurrent annotations its frequency is calculated as the number of genes that are simultaneously co-annotated with those terms. By default, GENECODIS uses as a reference set all genes from the corresponding genome at the NCBI Entrez Gene database <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, but users can upload their own reference set (for example, genes in a chip). Then, a statistical test is applied to identify categories, and their combinations, that are significantly enriched in the list of genes. Two statistical tests are implemented in GENECODIS: the hypergeometric distribution and the &#967;<sup>2 </sup>test of independence. For a detailed description of these methods in the context of the ontological analysis of gene lists, see the work of Draghici and coworkers <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> and the online help for the program.</p>
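            <p>For the hypergeometric option, the <it>p </it>value of an annotation set can be read as the probability of finding at least the observed number of co-annotated genes in a list of the same size drawn at random, without replacement, from the reference set. The following is a minimal sketch, assuming a one-sided test for over-representation; the function and parameter names are illustrative and are not taken from GENECODIS. It relies on math.comb, which is available in Python 3.8 and later.</p>
            <p>
```python
from math import comb


def hypergeometric_p_value(list_hits, list_size, reference_hits, reference_size):
    """Upper-tail hypergeometric probability P(X >= list_hits).

    list_hits      -- genes in the input list carrying the annotation set
    list_size      -- number of genes in the input list
    reference_hits -- genes in the reference set carrying the annotation set
    reference_size -- number of genes in the reference set
    """
    upper = min(list_size, reference_hits)
    total = comb(reference_size, list_size)
    tail = sum(
        comb(reference_hits, i) * comb(reference_size - reference_hits, list_size - i)
        for i in range(list_hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 2 of 2 listed genes carry a set found in 2 of 4 reference genes.
print(hypergeometric_p_value(2, 2, 2, 4))
```
            </p>
            <p>For a set of concurrent annotations, the two hit counts are the numbers of genes simultaneously co-annotated with all terms of the set in the input list and in the reference set, respectively.</p>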
            <p>The <it>p </it>values can then be adjusted for multiple tests using a simulation-based correction approach <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp> or the false discovery rate method proposed by Benjamini and Hochberg <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. For the simulation-based correction, a gene list of the same size as the input list is generated by randomly selecting genes from those used as reference. The frequent itemsets are then extracted (as described above) from this random list and their corresponding <it>p </it>values are calculated. This process is repeated 10,000 times and the corrected <it>p </it>value for each <it>k </it>itemset is calculated as the fraction of simulations having any <it>k </it>itemset with a <it>p </it>value as good as or better than the <it>p </it>value for that <it>k </it>itemset.</p>
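            <p>Both corrections can be sketched as follows. This is an illustrative outline under stated assumptions rather than the code used by GENECODIS: the function names are hypothetical, and best_p_in_list is a caller-supplied placeholder that stands for the whole procedure of extracting the frequent <it>k </it>itemsets from a gene list and returning the smallest of their <it>p </it>values.</p>
            <p>
```python
import random


def simulation_corrected_p(observed_p, reference_genes, list_size,
                           best_p_in_list, n_simulations=10000, seed=0):
    """Fraction of random gene lists whose best p value is as good as
    or better than observed_p.

    best_p_in_list is a hypothetical callable: it receives a list of genes
    and returns the smallest p value among its itemsets of the relevant size.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_simulations):
        # Random list of the same size as the input list, drawn from the reference.
        random_list = rng.sample(reference_genes, list_size)
        if observed_p >= best_p_in_list(random_list):
            hits += 1
    return hits / n_simulations


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    # Visit the p values from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```
            </p>
            <p>The simulation-based correction repeats the complete itemset search for every random list, so its cost grows with the number of simulations, whereas the Benjamini and Hochberg adjustment only needs the list of observed <it>p </it>values.</p>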
            <p>Therefore, the result of the analysis performed by GENECODIS consists of a list of annotations or combinations of annotations with their corresponding <it>p </it>values. Annotations exhibiting <it>p </it>values below a certain threshold can be considered significantly associated with the list of genes under study and can be used to discern the biologic mechanisms relevant to the experimental system.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Implementation</p>
         </st>
         <p>GENECODIS is a web-based tool that is freely accessible from the application website <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. It uses the Entrez Gene database <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> as the backbone data structure to link the functional annotations imported from GO together with the correspondences among gene identifiers (IDs). It allows users to upload gene lists using different IDs, including, for example, Gene Symbols, Entrez Gene, or Unigene IDs (more information about the identifiers supported for each organism can be found on the application website). If duplicated IDs are used in the input list, then they are treated as unique entries.</p>
         <p>For each organism GENECODIS provides analysis of different annotations, including the three GO categories (biological process, cellular component, and molecular function), KEGG pathways, InterPro motifs, and Swiss-Prot keywords. GO annotations for each gene are imported from the NCBI Entrez Gene database. GENECODIS allows users to select different levels of the GO hierarchy as well as GO Slim terms <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. Information about metabolic pathways is imported from the KEGG database <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, whereas Swiss-Prot keywords and InterPro motifs are imported from the Swiss-Prot database.</p>
         <p>Regarding the supported organisms, GENECODIS currently works with <it>Arabidopsis thaliana</it>, <it>Bos taurus</it>, <it>Caenorhabditis elegans</it>, <it>Danio rerio</it>, <it>Drosophila melanogaster</it>, <it>Gallus gallus</it>, <it>Homo sapiens</it>, <it>Mus musculus</it>, <it>Rattus norvegicus</it>, <it>Saccharomyces cerevisiae</it>, and <it>Schizosaccharomyces pombe</it>. More organisms and annotations will be systematically added in future versions of the application.</p>
         <p>One relative limitation derived from the in-depth search performed is the increase in the computational cost and time as more annotation categories are analyzed. To tackle this limitation GENECODIS uses an efficient technique to extract frequent itemsets <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. Additionally, GENECODIS runs on a 16-processor cluster, which guarantees the simultaneous use of the tool by multiple users.</p>
      </sec>
      <sec>
         <st>
            <p>GENECODIS at work</p>
         </st>
         <p>We provide two examples showing the analysis performed by GENECODIS and how the results obtained as combinations of several biological annotations provide additional information that may be useful in the interpretation of high-throughput experimental data.</p>
         <sec>
            <st>
               <p>Yeast data</p>
            </st>
            <p>To illustrate GENECODIS, we show the results obtained using data generated by Smith and coworkers <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. They used oligonucleotide-based whole genome microarrays to measure gene expression levels in yeast during growth in oleate (peroxisome induction) and growth in glucose (peroxisome repression conditions). Using different clustering algorithms they identified 224 yeast genes whose expression patterns were similar to well known peroxisomal genes.</p>
            <p>The list of these 224 genes was re-analyzed using GENECODIS, selecting biological process (BP) and cellular component (CC) GO Slim annotations. The simultaneous analysis of both categories provided a global picture of the biological processes associated with the experimental system linked to cellular localization information (Figure <figr fid="F2">2</figr>). As expected, the most significant category associated with this gene list was 'peroxisome' (CC). Other single categories that were highly representative were 'generation of precursor metabolites and energy' (BP), 'carbohydrate metabolism' (BP), and 'lipid metabolism' (BP), which is consistent with the observation that the shift to growth in the presence of oleate activates genes encoding enzymes that are involved in fatty acid degradation, allowing efficient use of the new carbon source <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Screenshot depicting results of the analysis of yeast genes</p>
               </caption>
               <text>
                  <p>Screenshot depicting results of the analysis of yeast genes. The 'Annotation/s' column represents the Gene Ontology codes of annotations found in the list. The '# list' and '# reference' columns represent the number of genes in the input list and reference list for a given annotation, respectively. The 'Genes' column represents the set of genes in the input list showing a given annotation. The 'Description/s' column represents the textual description of annotations. CC refers to 'cellular component' and BP to 'biological process' categories. Only annotations with corrected <it>P </it>values &#8804; 0.05 are shown. <it>P </it>values were calculated using the hypergeometric distribution and were corrected using the simulation-based approach.</p>
               </text>
               <graphic file="gb-2007-8-1-r3-2"/>
            </fig>
            <p>In addition to these single-category significant annotations, GENECODIS revealed a new set of associations with a strong biologic meaning. For example, taking a closer look at the second and third categories with the lowest <it>p </it>values, we can see that a significant set of genes were co-annotated with 'peroxisome' (CC) and 'lipid metabolism' (BP), and 'peroxisome' (CC) and 'organelle organization and biosynthesis' (BP), respectively. These findings allow us to easily identify the set of peroxisomal genes that are specifically involved in each one of these two different biological processes. Among the genes co-annotated as 'peroxisome' (CC) and 'lipid metabolism' (BP) are the genes involved in the fatty acid &#946;-oxidation pathway, such as <it>POX1</it>, <it>FAA2</it>, <it>ECI1</it>, <it>FOX2</it>, <it>POT1</it>, and <it>DCI1</it>. Among genes co-annotated as 'peroxisome' (CC) and 'organelle organization and biosynthesis' (BP) are the PEX genes, which are involved in peroxisome assembly <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> and are required for the increase in the number of these organelles during growth on oleate <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>.</p>
            <p>Another interesting set of annotations that show the usefulness of the application are those categories related to mitochondrial genes. Forty-eight out of 887 yeast genes annotated as 'mitochondrion' (CC) were present in the list, and therefore this annotation exhibited a <it>p </it>value of 0.0248 (simulation corrected <it>p </it>value = 0.2; Additional data file 2). Consequently, based on the statistical test, this annotation is not considered significant. Nevertheless, GENECODIS was able to identify a set of significant co-annotations related to mitochondrial genes. For example, 6 out of 21 yeast genes that were simultaneously annotated with 'mitochondrion' (CC) and 'lipid metabolism' (BP) were present in the list, and this co-annotation exhibited a <it>p </it>value of 0.000162 (simulation corrected <it>p </it>value = 0.0086). Among these genes was, for example, the <it>CRC1 </it>gene, which is a mitochondrial inner membrane carnitine transporter that is required for carnitine-dependent transport of acetyl-coenzyme A from peroxisomes to mitochondria. In the same way, the co-annotation of 'mitochondrial membrane' (CC) and 'generation of precursor metabolites and energy' (BP) related to a subset of genes that are components of the mitochondrial respiratory chain was found to be significant, with a simulation corrected <it>p </it>value of 0.004.</p>
            <p>Although fatty acid &#946;-oxidation in <it>Saccharomyces cerevisiae </it>is restricted to peroxisomes, the association of mitochondrion-related categories with this set of genes is highly consistent with the important role of these organelles in the metabolism of &#946;-oxidation products. Acetyl-coenzyme A, the final product of the fatty acid &#946;-oxidation pathway in peroxisomes, is transported to the mitochondria for the final oxidation to CO<sub>2 </sub>and H<sub>2</sub>O <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. In this way, peroxisomal fatty acid &#946;-oxidation demands a functional mitochondrial electron transport chain for energy production, and both functional peroxisomes and mitochondria are required for growth in the presence of oleate <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Human data</p>
            </st>
            <p>To provide a second example of the functionality of GENECODIS, we analyzed a set of 85 human genes expressed in testis reported by Su and coworkers <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. This dataset was also used by Zhang and colleagues <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> to illustrate the performance of the GOTree Machine (GOTM) software, and therefore it represents a good test case for our method. Zhang and colleagues, using GOTM, reported four main groups of GO biological process annotations related to the testis gene cluster: categories related to cell proliferation, cell cycle, mitosis, and meiosis; categories related to testis specific development; categories related to protein phosphorylation; and categories related to glycerolipid metabolism.</p>
            <p>We used our tool to analyze this set of genes using the GO biological process categories and InterPro motifs that appear in at least three genes. The most significant concurrences are shown in Figure <figr fid="F3">3</figr>. Similar results to those reported by Zhang and coworkers <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> were obtained by GENECODIS, except for the case of categories related to glycerolipid metabolism, which were not extracted because they were present in only two genes. In addition, GENECODIS was able to provide new information for the functional interpretation of this set of genes. For example, the fifth association revealed that a significant set of genes in the analyzed list were co-annotated with 'protein amino acid phosphorylation' and 'cell cycle' GO biological process categories and contained protein kinase motifs. The importance of this observation is the explicit connection between 'protein amino acid phosphorylation' and 'cell cycle' categories.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Screenshot depicting results of the analysis of human genes</p>
               </caption>
               <text>
                  <p>Screenshot depicting results of the analysis of human genes. GENECODIS results from the analysis of Gene Ontology BP ('biological process') and InterPro motifs in the human gene set. Only annotations with corrected <it>P </it>values &#8804; 0.05 are shown.</p>
               </text>
               <graphic file="gb-2007-8-1-r3-3"/>
            </fig>
            <p>In order to explain the 'protein phosphorylation' category in the context of the phenotypic feature of the gene cluster, Zhang and colleagues <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> remarked that, 'spermatozoa undergo a series of changes before and during egg binding to acquire the ability to fuse with the oocyte. These priming events are regulated by the activation of compartmentalized intracellular signaling pathways, which control the phosphorylation status of sperm proteins.'</p>
            <p>Results provided by GENECODIS complement this finding and point out that, in this particular case, the 'protein phosphorylation' category is mainly related to proteins that are involved in the cell cycle. Indeed, activation and inhibition of many key regulators of the cell cycle are carried out by phosphorylation/dephosphorylation events.</p>
            <p>This finding can be confirmed by examining the genes that were co-annotated with both categories: <it>CDC2 </it>(Entrez Gene ID: 983), aurora kinase A (Entrez Gene ID: 6790), <it>NEK2 </it>(Entrez Gene ID: 4751), <it>BUB1 </it>(Entrez Gene ID: 699), and <it>BUB1B </it>(Entrez Gene ID: 701). All of these have been associated with testis tissues and cell proliferation events in previous studies. For example, the <it>NEK2 </it>gene is predominantly expressed in spermatocytes and appears to be associated with meiotic chromosomes in these cells <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>; expression of the gene <it>BUB1B </it>in testis decreases with advancing age, and it may play a role in regulating infertility <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>.</p>
            <p>These two examples illustrate the type of information provided by GENECODIS, which can be useful in helping researchers to interpret large lists of genes generated by high-throughput experimental techniques.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>High-throughput experimental techniques, such as DNA microarrays, have opened new ways to study biological systems from a global perspective. In many cases, these techniques generate huge amounts of data in the form of large gene or protein lists that share a common property, for example genes that are differentially expressed between pathologic and normal tissues. These data can provide a basis for the characterization of unknown genes, and at the same time they are also the basis for elucidating the biological processes associated with the experimental system. Methods based on the ontological analysis of such lists of genes have proved to be very useful tools for the analysis and interpretation of the underlying biological mechanisms.</p>
         <p>However, most of the current applications for functional profiling essentially use the same general approach and generate statistical scores for single annotations. They mainly differ in aspects such as the statistical test used, supported annotations and organisms, the gene identifiers that they are able to manage, and visualization capabilities. Indeed, a relevant conclusion of a review of such tools recently reported by Khatri and Draghici <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> was that it would be more beneficial if future applications expanded the current approach rather than providing endless variations of the same idea.</p>
         <p>GENECODIS was designed to expand the biological enrichment of annotations by adding the possibility of extracting not only single enriched categories, but also significant combinations of them. To the best of our knowledge there is no other tool available in the field that integrates information from different sources in a flexible way for concurrent enrichment studies. A comparison of GENECODIS with related tools <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr></abbrgrp> and an example with test data <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> are provided in Additional data file 3. We hope that this tool will help by complementing available analysis tools for the large genome research community.</p>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>The following additional data are available with the online version of this article. Additional data file <supplr sid="S1">1</supplr> contains an illustrative example of the GENECODIS algorithm in operation. Additional data file <supplr sid="S2">2</supplr> contains the results obtained by GENECODIS in the analysis of the yeast and human gene sets. Additional data file <supplr sid="S3">3</supplr> describes a comparative analysis of the results obtained with GENECODIS and other related tools.</p>
         <suppl id="S1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>An illustrative example of the GENECODIS algorithm in operation</p>
            </caption>
            <text>
               <p>A file containing an illustrative example of the GENECODIS algorithm in operation.</p>
            </text>
            <file name="gb-2007-8-1-r3-S1.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S2">
            <title>
               <p>Additional data file 2</p>
            </title>
            <caption>
               <p>Results obtained by GENECODIS in the analysis of yeast and human gene sets</p>
            </caption>
            <text>
               <p>A file containing the results obtained by GENECODIS in the analysis of the yeast and human gene sets.</p>
            </text>
            <file name="gb-2007-8-1-r3-S2.xls">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S3">
            <title>
               <p>Additional data file 3</p>
            </title>
            <caption>
               <p>Description of a comparative analysis of results provided by GENECODIS and other related tools</p>
            </caption>
            <text>
               <p>A compressed file containing a description of a comparative analysis of the results provided by GENECODIS and other related tools.</p>
            </text>
            <file name="gb-2007-8-1-r3-S3.zip">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work was partially funded by Spanish grants CICYT BFU2004-00217/BMC, GEN2003-20235-c05-05, CYTED-505PI0058, TIN2005-5619, PR27/05-13964-BSCH and S-GEN-0166-2006, and by a collaborative grant between the Spanish CSIC and the Canadian NRC (CSIC-050402040003). PCS is the recipient of a grant from Comunidad Autonoma de Madrid (CAM). APM acknowledges the support of the Spanish Ram&#243;n y Cajal program. We thank Enrique de la Torre and Cesar Vicente for their technical support.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Profiling gene expression using onto-express.</p>
            </title>
            <aug>
               <au>
                  <snm>Khatri</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Draghici</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ostermeier</snm>
                  <fnm>GC</fnm>
               </au>
               <au>
                  <snm>Krawetz</snm>
                  <fnm>SA</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>2002</pubdate>
            <volume>79</volume>
            <fpage>266</fpage>
            <lpage>270</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11829497</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.</p>
            </title>
            <aug>
               <au>
                  <snm>Ashburner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ball</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Blake</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Butler</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Cherry</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Dolinski</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dwight</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Eppig</snm>
                  <fnm>JT</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>25</volume>
            <fpage>25</fpage>
            <lpage>29</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10802651</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Ontological analysis of gene expression data: current tools, limitations, and open problems.</p>
            </title>
            <aug>
               <au>
                  <snm>Khatri</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Draghici</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>3587</fpage>
            <lpage>3595</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15994189</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <aug>
               <au>
                  <snm>Draghici</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Data Analysis Tools for DNA Microarrays</source>
            <publisher>Boca Raton, FL: Chapman &amp; Hall/CRC Press</publisher>
            <pubdate>2003</pubdate>
         </bibl>
         <bibl id="B5">
            <title>
               <p>GENECODIS</p>
            </title>
            <url>http://genecodis.dacya.ucm.es/</url>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Integrated analysis of gene expression by Association Rules Discovery.</p>
            </title>
            <aug>
               <au>
                  <snm>Carmona-Saez</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Chagoyen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rodriguez</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Trelles</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Carazo</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Pascual-Montano</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>54</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1386712</pubid>
                  <pubid idtype="pmpid" link="fulltext">16464256</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Mining association rules between sets of items in large databases.</p>
            </title>
            <aug>
               <au>
                  <snm>Agrawal</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Imielinski</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Swami</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proceedings of the ACM SIGMOD International Conference on Management of Data: 26-28 May 1993; Washington, DC</source>
            <publisher>New York, NY: ACM Press</publisher>
            <editor>Buneman P, Jajodia S</editor>
            <pubdate>1993</pubdate>
            <fpage>207</fpage>
            <lpage>216</lpage>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Entrez Gene: gene-centered information at NCBI.</p>
            </title>
            <aug>
               <au>
                  <snm>Maglott</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ostell</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Pruitt</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Tatusova</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>D54</fpage>
            <lpage>D58</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">539985</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608257</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Global functional profiling of gene expression.</p>
            </title>
            <aug>
               <au>
                  <snm>Draghici</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Khatri</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Martins</snm>
                  <fnm>RP</fnm>
               </au>
               <au>
                  <snm>Ostermeier</snm>
                  <fnm>GC</fnm>
               </au>
               <au>
                  <snm>Krawetz</snm>
                  <fnm>SA</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>2003</pubdate>
            <volume>81</volume>
            <fpage>98</fpage>
            <lpage>104</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12620386</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Characterizing gene sets with FuncAssociate.</p>
            </title>
            <aug>
               <au>
                  <snm>Berriz</snm>
                  <fnm>GF</fnm>
               </au>
               <au>
                  <snm>King</snm>
                  <fnm>OD</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Sander</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Roth</snm>
                  <fnm>FP</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>2502</fpage>
            <lpage>2504</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">14668247</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>GO::TermFinder - open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Boyle</snm>
                  <fnm>EI</fnm>
               </au>
               <au>
                  <snm>Weng</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gollub</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Jin</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Cherry</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Sherlock</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <fpage>3710</fpage>
            <lpage>3715</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15297299</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Controlling the false discovery rate: a practical and powerful approach to multiple testing.</p>
            </title>
            <aug>
               <au>
                  <snm>Benjamini</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hochberg</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>J R Stat Soc Ser B</source>
            <pubdate>1995</pubdate>
            <volume>57</volume>
            <fpage>289</fpage>
            <lpage>300</lpage>
         </bibl>
         <bibl id="B13">
            <title>
               <p>GO Slim</p>
            </title>
            <url>http://www.geneontology.org/GO.slims.shtml</url>
         </bibl>
         <bibl id="B14">
            <title>
               <p>The KEGG resource for deciphering the genome.</p>
            </title>
            <aug>
               <au>
                  <snm>Kanehisa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Goto</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kawashima</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Okuno</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hattori</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>D277</fpage>
            <lpage>D280</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308797</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681412</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Transcriptome profiling to identify genes involved in peroxisome assembly and function.</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Marelli</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Christmas</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Vizeacoumar</snm>
                  <fnm>FJ</fnm>
               </au>
               <au>
                  <snm>Dilworth</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Ideker</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Galitski</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Dimitrov</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Rachubinski</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Aitchison</snm>
                  <fnm>JD</fnm>
               </au>
            </aug>
            <source>J Cell Biol</source>
            <pubdate>2002</pubdate>
            <volume>158</volume>
            <fpage>259</fpage>
            <lpage>271</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12135984</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Dissection of transient oxidative stress response in <it>Saccharomyces cerevisiae </it>by using DNA microarrays.</p>
            </title>
            <aug>
               <au>
                  <snm>Koerkamp</snm>
                  <fnm>MG</fnm>
               </au>
               <au>
                  <snm>Rep</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bussemaker</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Hardy</snm>
                  <fnm>GP</fnm>
               </au>
               <au>
                  <snm>Mul</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Piekarska</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Szigyarto</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>De Mattos</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Tabak</snm>
                  <fnm>HF</fnm>
               </au>
            </aug>
            <source>Mol Biol Cell</source>
            <pubdate>2002</pubdate>
            <volume>13</volume>
            <fpage>2783</fpage>
            <lpage>2794</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">117942</pubid>
                  <pubid idtype="pmpid" link="fulltext">12181346</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Fatty acid metabolism in <it>Saccharomyces cerevisiae</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>van Roermund</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Waterham</snm>
                  <fnm>HR</fnm>
               </au>
               <au>
                  <snm>Ijlst</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Wanders</snm>
                  <fnm>RJ</fnm>
               </au>
            </aug>
            <source>Cell Mol Life Sci</source>
            <pubdate>2003</pubdate>
            <volume>60</volume>
            <fpage>1838</fpage>
            <lpage>1851</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">14523547</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Large-scale analysis of the human and mouse transcriptomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Su</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Cooke</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Ching</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Hakak</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Walker</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Wiltshire</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Orth</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Vega</snm>
                  <fnm>RG</fnm>
               </au>
               <au>
                  <snm>Sapinoso</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Moqrich</snm>
                  <fnm>A</fnm>
               </au>
               <etal/>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>4465</fpage>
            <lpage>4470</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">123671</pubid>
                  <pubid idtype="pmpid" link="fulltext">11904358</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>GOTree Machine (GOTM): a web-based platform for interpreting sets of interesting genes using Gene Ontology hierarchies.</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Schmoyer</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kirov</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Snoddy</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>16</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">373441</pubid>
                  <pubid idtype="pmpid" link="fulltext">14975175</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Tcp10 promoter-directed expression of the Nek2 gene in mouse meiotic spermatocytes.</p>
            </title>
            <aug>
               <au>
                  <snm>Rhee</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Wolgemuth</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Mol Cells</source>
            <pubdate>2002</pubdate>
            <volume>13</volume>
            <fpage>85</fpage>
            <lpage>90</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11911479</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>BubR1 insufficiency causes early onset of aging-associated phenotypes and infertility in mice.</p>
            </title>
            <aug>
               <au>
                  <snm>Baker</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Jeganathan</snm>
                  <fnm>KB</fnm>
               </au>
               <au>
                  <snm>Cameron</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Juneja</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kopecka</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kumar</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Jenkins</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>de Groen</snm>
                  <fnm>PC</fnm>
               </au>
               <au>
                  <snm>Roche</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>van Deursen</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2004</pubdate>
            <volume>36</volume>
            <fpage>744</fpage>
            <lpage>749</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15208629</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>BABELOMICS: a suite of web tools for functional annotation and analysis of groups of genes in high-throughput experiments.</p>
            </title>
            <aug>
               <au>
                  <snm>Al-Shahrour</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Minguez</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Vaquerizas</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Conde</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Dopazo</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>W460</fpage>
            <lpage>W464</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1160217</pubid>
                  <pubid idtype="pmpid" link="fulltext">15980512</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>GeneMerge: post-genomic analysis, data mining, and hypothesis testing.</p>
            </title>
            <aug>
               <au>
                  <snm>Castillo-Davis</snm>
                  <fnm>CI</fnm>
               </au>
               <au>
                  <snm>Hartl</snm>
                  <fnm>DL</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>891</fpage>
            <lpage>892</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12724301</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>DAVID: Database for Annotation, Visualization, and Integrated Discovery.</p>
            </title>
            <aug>
               <au>
                  <snm>Dennis</snm>
                  <fnm>G</fnm>
                  <suf>Jr</suf>
               </au>
               <au>
                  <snm>Sherman</snm>
                  <fnm>BT</fnm>
               </au>
               <au>
                  <snm>Hosack</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gao</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lane</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Lempicki</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <fpage>P3</fpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12734009</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>WebGestalt: an integrated system for exploring gene sets in various biological contexts.</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kirov</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Snoddy</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>W741</fpage>
            <lpage>W748</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1160236</pubid>
                  <pubid idtype="pmpid" link="fulltext">15980575</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>BayGO: Bayesian analysis of ontology term enrichment in microarray data.</p>
            </title>
            <aug>
               <au>
                  <snm>Vencio</snm>
                  <fnm>RZ</fnm>
               </au>
               <au>
                  <snm>Koide</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Gomes</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Pereira</snm>
                  <fnm>CA</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>86</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1440873</pubid>
                  <pubid idtype="pmpid" link="fulltext">16504085</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
