<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-6-161</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Software</dochead>
      <bibl>
         <title>
            <p><it>FACT </it>&#8211; a framework for the functional interpretation of high-throughput experiments</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Kokocinski</snm>
               <fnm>Felix</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>F.Kokocinski@factweb.de</email>
            </au>
            <au id="A2">
               <snm>Delhomme</snm>
               <fnm>Nicolas</fnm>
               <insr iid="I1"/>
               <email>N.Delhomme@factweb.de</email>
            </au>
            <au id="A3">
               <snm>Wrobel</snm>
               <fnm>Gunnar</fnm>
               <insr iid="I1"/>
               <email>G.Wrobel@factweb.de</email>
            </au>
            <au id="A4">
               <snm>Hummerich</snm>
               <fnm>Lars</fnm>
               <insr iid="I1"/>
               <email>L.Hummerich@dkfz.de</email>
            </au>
            <au id="A5">
               <snm>Toedt</snm>
               <fnm>Grischa</fnm>
               <insr iid="I1"/>
               <email>G.Toedt@dkfz.de</email>
            </au>
            <au id="A6">
               <snm>Lichter</snm>
               <fnm>Peter</fnm>
               <insr iid="I1"/>
               <email>M.MacLeod@dkfz.de</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Molecular Genetics, Deutsches Krebsforschungszentrum, 69115 Heidelberg, Germany</p>
            </ins>
            <ins id="I2">
               <p>Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1HH, UK</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2005</pubdate>
         <volume>6</volume>
         <issue>1</issue>
         <fpage>161</fpage>
         <url>http://www.biomedcentral.com/1471-2105/6/161</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">15985174</pubid>
               <pubid idtype="doi">10.1186/1471-2105-6-161</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>14</day>
               <month>3</month>
               <year>2005</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>28</day>
               <month>6</month>
               <year>2005</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>28</day>
               <month>6</month>
               <year>2005</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2005</year>
         <collab>Kokocinski et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Interpreting the results of high-throughput experiments, such as those obtained from DNA microarrays, is often a time-consuming task because of the large number of data points that need to be analyzed in parallel. Which of the possible approaches to functional analysis will be the most informative is usually unknown beforehand and a matter of extensive testing.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>To address this problem, we have developed the <it>Flexible Annotation and Correlation Tool </it>(FACT). FACT allows for detection of important patterns in large data sets by simplifying the integration of heterogeneous data sources and the subsequent application of different algorithms for statistical evaluation or visualization of the annotated data. The system is constantly extended to include additional annotation data and comparison methods.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>FACT serves as a highly flexible framework for the explorative analysis of large genomic and proteomic result sets. The program can be used online; open source code and supplementary information are available at <url>http://www.factweb.de</url>.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>A variety of algorithms and programs have been introduced to accomplish the processing of raw data as well as the statistical analysis of data from high-throughput experiments. Besides the mathematical complexity that needs to be handled, however, there is also a biological complexity inherent to the data sets. Current means to analyze large-scale data sets usually target very specific questions and often fail to provide solutions that can be adapted to different types of data. Nevertheless, common and generalized questions for the interpretation of such data can be established as follows: i) What information is known about the analyzed features (e.g. clones or genes)? ii) Are there correlations between the experimental outcomes and the additional information (shared pathways, etc.)? iii) Is the outcome comparable with results of other experiments (genomic or gene expression data sets, publications, etc.)?</p>
         <p>The program <it>Flexible Annotation and Correlation Tool </it>(<it>FACT</it>) was developed to address these questions by integrating data sources, tools and algorithms in a single open framework. First, FACT allows merging information from various data sources into one comprehensive annotation for an experimental data set. It then provides functional analysis tools to inspect and correlate this heterogeneous information. The functionality of FACT can be extended through the inclusion of new data sources, algorithms and programs by defining additional modules from a prototype. This flexibility is achieved by a strong level of abstraction from the actual data, by the design of the underlying database and by the modular architecture of the software itself. The task of identifying relevant biological interconnections reflected by the experimental results (e.g. participation of the analyzed genes in shared pathways) is what we target with the software introduced here.</p>
      </sec>
      <sec>
         <st>
            <p>Implementation</p>
         </st>
         <sec>
            <st>
               <p>Integration of data sources</p>
            </st>
            <p>The integration of bio-molecular data from diverse sources such as public databases or clinical parameters (<it>annotation</it>) is a key challenge in the process of the analysis of high-throughput experiments. While the interpretation of the outcome of a standard experiment used to depend on the knowledge of one human expert from the field, today's screening tools produce data quantities not manageable by human inspection. After receiving a list of differentially regulated genes from a microarray gene expression experiment comprising several hundreds or thousands of entries, it is not efficient to start the interpretation of these results by manually searching through publications. As a first step, broad biological themes should be identified and followed into a more detailed inspection.</p>
            <p>With network technologies, the availability of data sources is no longer the limiting factor, but the obstacles to integrating them manually are numerous. Data are often made available in different formats (HTML pages, flat files, direct database access) and in very heterogeneous layouts. In addition, the nomenclature (e.g. gene names) and the relationships between different systems are often inconsistent and require many manual selection and modification steps. At the same time, as much knowledge as possible about the analyzed data features should be integrated, since interesting unknown pathways and interconnections might be hidden behind biological complexity.</p>
            <p>As the first aspect, FACT accomplishes the task of integrating heterogeneous information sources by abstraction from the specific data types to one basic concept (figure <figr fid="F1">1</figr>, lower part). The smallest entities are <it>data features</it>, which are single items of information, either a name/value pair or an additional textual description thereof. Examples are the ids of clones with their relative expression as measured on a cDNA microarray. Data features are grouped into <it>data sets</it>, combining data features relating to the same experiment or a group of annotation terms for a certain set. In our example the clone/ratio pairs measured in one hybridization would be stored as one data set. The data sets originate from specific <it>data sources</it>, defining distinct types of experiments or annotation sources. One data source would be "cDNA microarray measurements with textual clone ids and numerical results". This architecture follows the idea that the primary data must be represented at a sufficient level of abstraction to make the data independent of the source technology <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. The concept applies to experimental as well as to annotation data. Meta-data about the different data sources are stored in dedicated tables of the underlying database, describing the source with date and type of data.</p>
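            <p>As an illustration only, the abstraction into data sources, data sets and data features can be sketched in Python (FACT itself is written in Perl). All class names, field names and example values below are hypothetical and do not reflect the actual FACT API or database schema.</p>

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataSource:
    """A type of experiment or annotation source (hypothetical fields)."""
    name: str
    term_type: str   # e.g. "clone id"
    value_type: str  # e.g. "numeric ratio"


@dataclass
class DataFeature:
    """Single item of information: a name/value pair plus optional text."""
    name: str
    value: Optional[float] = None
    description: str = ""


@dataclass
class DataSet:
    """Features belonging to one experiment or one group of annotation terms."""
    label: str
    source: DataSource
    features: List[DataFeature] = field(default_factory=list)

    def add(self, name, value=None, description=""):
        self.features.append(DataFeature(name, value, description))

    def names(self):
        return [f.name for f in self.features]


# One hybridization stored as one data set (made-up clone ids and ratios).
cdna = DataSource("cDNA microarray", "clone id", "numeric ratio")
hyb = DataSet("hybridization 1", cdna)
hyb.add("clone_0001", 1.8)
hyb.add("clone_0002", -0.7)
print(hyb.names())
```

            <p>Because experimental and annotation data share the same three entities in this sketch, any function written against a data set could in principle operate on either kind of data.</p>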
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Layered architecture of the FACT framework</p>
               </caption>
               <text>
                  <p><b>Layered architecture of the FACT framework</b>. The database reflects the abstraction of any experimental or annotation data to <it>DataSets </it>with <it>DataFeatures</it>, originating from a specific <it>DataSource </it>for which <it>DataTypes </it>and <it>Parameters </it>have been defined. The core library (API) supplies all functionality for accessing the database and for the operation of diverse modules, which are adaptors for specific <it>DataSources </it>or functions. The web interface or other applications are using FACT API functions.</p>
               </text>
               <graphic file="1471-2105-6-161-1"/>
            </fig>
            <p>Differences in nomenclature and the problem of relating one type of experiment to another, the second obstacle to data integration, have been addressed for gene- and protein-centered research by the development of the GeneOntology system (GO) <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. This hierarchical framework, a directed acyclic graph of annotations for gene attributes, has become a <it>de facto </it>standard, which can be employed in the functional analysis of experiments. Similar projects have been initiated, for example, to organize the classification of molecular interactions in pathways and molecular complexes (Genome Knowledgebase / Reactome <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>). Using these resources, FACT is able to compare experimental results from technically distant applications.</p>
            <p>However, most experiments differ in focus and design and usually no standard solution for their interpretation can be applied. The third aspect for an integrating approach therefore is high flexibility concerning the application of diverse analysis methods that have already been developed for the interpretation of results or might be used in future.</p>
         </sec>
         <sec>
            <st>
               <p>Architecture</p>
            </st>
            <p>FACT consists of a MySQL database, a core library (as an <it>Application Programming Interface</it>, API) and various modules written in the language Perl. We also created an interface for the web-based usage of all functions of FACT <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. The database reflects the transformation of heterogeneous data into a generalized format, storing information in <it>DataSet </it>and <it>DataFeature </it>tables (figure <figr fid="F2">2</figr>). The core software framework supplies the basic functionality for data access in an object-oriented fashion and for adding and operating modules. These modules are <it>adaptors </it>and can be classified into three categories (figure <figr fid="F3">3</figr>):</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>FACT database schema</p>
               </caption>
               <text>
                  <p><b>FACT database schema</b>. The database schema reflects the generalized handling of heterogeneous data. At the definition layer the data sources are defined as experimental, annotational or analysis sources. Also the types of data that they use are specified here. These types are linked to the individual sources which are defined in the data source layer. Parameters that the functions handling the sources can take are stored as well. The actual data &#8211; experimental as well as annotational &#8211; are saved as data set and data features in the data set layer.</p>
               </text>
               <graphic file="1471-2105-6-161-2"/>
            </fig>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Outline of the flow of information</p>
               </caption>
               <text>
                  <p><b>Outline of the flow of information</b>. Modular DataSource-Adapters accomplish data access and data transformation from heterogeneous sources, making FACT a flexible framework.</p>
               </text>
               <graphic file="1471-2105-6-161-3"/>
            </fig>
            <sec>
               <st>
                  <p>I. Data loading</p>
               </st>
               <p>Different types of experimental data can be loaded using dedicated parser functions. This can be a simple function to read tab-delimited data defined e.g. as a gene list with associated expression values. It can also be a more complex solution to handle case descriptions from comparative genomic hybridizations (CGH), a method that is employed to monitor copy number changes of all regions of a genome simultaneously on chromosome spreads. The individual modules read the specific file format, perform transformations to the generalized format (i.e. convert the data into data features) and use core library functions to store the data.</p>
            </sec>
            <sec>
               <st>
                  <p>II. Annotation</p>
               </st>
               <p>Varying data sources can be utilized for the annotation of experimental data sets by different data-access functions (e.g. GO terms for gene names). Modules achieve this, for instance, through access to an online database or to a local copy of such a database. Data of interest are then gathered and stored.</p>
            </sec>
            <sec>
               <st>
                  <p>III. Analysis</p>
               </st>
               <p>Different functions can be used to inspect the annotated information and highlight underlying patterns (e.g. overrepresented GO-terms). The modules typically produce textual and graphical output to draw the researcher's attention to the most promising features of the data.</p>
               <p>Currently available functions of FACT are described below. Further flexibility is achieved by the concept of experimental and annotational data being reduced to the basic model of one <it>data set </it>with several <it>data features</it>, as described above.</p>
               <p>All these modules use the FACT API. It offers a defined interface for the effortless extension to new sources and functions. Prototype modules for each category implementing this API are supplied. For the integration of annotation sources, available data can either be transferred to the local system (data warehousing) or linked to the original source (database federation); the FACT system allows both options to be used by the module functions. Currently remote databases are accessed by the <it>EnsEMBL</it>, <it>BBID </it>and <it>Reactome </it>modules; the <it>CpG </it>and <it>CGAP </it>functions use locally stored information (see below). The update of the local data is accomplished semi-automatically by invoking the respective <it>update </it>function in the separate modules.</p>
               <p>Finally, as there is an active development of software for the annotation and analysis of gene expression data in the language R (Bioconductor project <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>), and the handling of large data matrices is accomplished faster in R, we used the Perl/R interface <it>RSPerl </it><abbrgrp><abbr bid="B6">6</abbr></abbrgrp> in different modules to encapsulate analysis functions written in R. Other modules employ the functionality from the <it>BioPerl </it><abbrgrp><abbr bid="B7">7</abbr></abbrgrp> and <it>Ensembl </it>Perl API <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Available functionality</p>
            </st>
            <p>A variety of modules for the handling of different data sources (table <tblr tid="T1">1</tblr>) as well as for the application of basic data analysis and display functions (table <tblr tid="T2">2</tblr>) were developed. Most of the current functionality is focused on handling human and murine gene annotation information. Additional functionality can be added in a "plug-and-play" fashion, since new modules can be loaded dynamically into FACT.</p>
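            <p>The "plug-and-play" idea can be illustrated with a minimal Python sketch of a module registry. This is not the FACT API (which is implemented in Perl); the registry, the decorator and the function names are assumptions made for illustration. The example module mirrors the Simple Count method of table 2.</p>

```python
# Hypothetical registry with one slot per module category named in the text.
MODULES = {"loading": {}, "annotation": {}, "analysis": {}}


def register(category, name):
    """Decorator that adds a function to the registry under a category."""
    def wrap(func):
        MODULES[category][name] = func
        return func
    return wrap


@register("analysis", "simple_count")
def simple_count(terms):
    """Count occurrences of annotation terms."""
    counts = {}
    for term in terms:
        counts[term] = counts.get(term, 0) + 1
    return counts


def run(category, name, *args):
    """Look up a registered module by category and name, then call it."""
    return MODULES[category][name](*args)


print(run("analysis", "simple_count", ["apoptosis", "cell cycle", "apoptosis"]))
```

            <p>In such a design a new module only needs to register itself; the calling code does not change when modules are added.</p>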
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Data Types and Sources accessible by current annotation modules.</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Data source, access method</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Data provider, data location</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Type of annotation used by FACT</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Ensembl</it>, Perl API access to local or remote database</p>
                     </c>
                     <c ca="left">
                        <p>European Bioinformatics Institute and Wellcome Trust Sanger Institute (GB) [8], <url>http://www.ensembl.org</url></p>
                     </c>
                     <c ca="left">
                        <p>Ensembl ID, Gene Symbol, Gene Name, Chromosomal Location, Homologous Genes, Interpro Domains, RefSeq Accession Number, Affymetrix ID</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>euGenes</it>, local database</p>
                     </c>
                     <c ca="left">
                        <p>University of Indiana (USA) [10], <url>ftp://iubio.bio.indiana.edu/eugenes/</url></p>
                     </c>
                     <c ca="left">
                        <p>euGene ID, Gene Symbol, Gene Name, GDB ID, OMIM ID, Genomic Localization, GeneOntology Terms, Protein Accession Numbers</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Image Consortium</it>, local database</p>
                     </c>
                     <c ca="left">
                        <p>Lawrence Livermore National Laboratory (USA) [28], <url>ftp://image.llnl.gov/image/imagene/</url></p>
                     </c>
                     <c ca="left">
                        <p>Clone Image ID</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Biological Biochemical Image Database</it>, HTTP parser and HTTP request</p>
                     </c>
                     <c ca="left">
                        <p>National Institute of Aging, NIH (USA) [11], <url>http://bbid.grc.nia.nih.gov/cgi-bin/pathwaysearch.pl</url></p>
                     </c>
                     <c ca="left">
                        <p>Pathway Name and Image-link</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>GeneOntology</it>, local database</p>
                     </c>
                     <c ca="left">
                        <p>GeneOntology Consortium [2], <url>http://www.geneontology.org/GO.current.annotations.shtml</url></p>
                     </c>
                     <c ca="left">
                        <p>ID and Name of GO-Term (Biological Process, Molecular Function, Cellular Localization)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Cancer Genome Anatomy Project</it>, local database</p>
                     </c>
                     <c ca="left">
                        <p>National Cancer Institute, NIH (USA) [29], <url>ftp://ftp1.nci.nih.gov/pub/CGAP</url></p>
                     </c>
                     <c ca="left">
                        <p>Biocarta name, Biocarta short name, KEGG Pathway Name, KEGG Pathway ID, PFAM ID</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>LocusLink </it>/ <it>EntrezGene</it>, local database</p>
                     </c>
                     <c ca="left">
                        <p>NCBI/NIH (USA) [30], <url>ftp://ftp.ncbi.nih.gov/refseq/LocusLink</url> / <url>ftp://ftp.ncbi.nih.gov/gene</url></p>
                     </c>
                     <c ca="left">
                        <p>A. LocusLink ID, Gene Symbol, Gene Name, Genomic Localization, GeneOntology Terms, OMIM ID B. Key references (PubMed links)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Mouse Genome Database</it>, local database</p>
                     </c>
                     <c ca="left">
                        <p>Jackson Laboratory (USA) [31], <url>ftp://ftp.informatics.jax.org</url></p>
                     </c>
                     <c ca="left">
                        <p>MGI ID / Gene Symbol</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Internal <it>CloneBase</it>, local database</p>
                     </c>
                     <c ca="left">
                        <p>Deutsches Krebsforschungszentrum, Div. Molecular Genetics (D)</p>
                     </c>
                     <c ca="left">
                        <p>General Information on available Clones</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>CpG</it>, local database</p>
                     </c>
                     <c ca="left">
                        <p>University of California Santa Cruz (USA), <url>ftp://hgdownload.cse.ucsc.edu/goldenPath/currentGenomes/Homo_sapiens/database/cpgIsland.txt.gz</url></p>
                     </c>
                     <c ca="left">
                        <p>Calculated relative CpG content of genomic region</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>STRING</it>, local database</p>
                     </c>
                     <c ca="left">
                        <p>EMBL (D) [12], <url>http://string.embl.de</url> (medium or better confidence)</p>
                     </c>
                     <c ca="left">
                        <p>Protein interaction data (computed and imported from other databases)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Affymetrix CEL </it>files</p>
                     </c>
                     <c ca="left">
                        <p>Affymetrix Inc. / FACT, <url>http://www.affymetrix.com</url></p>
                     </c>
                     <c ca="left">
                        <p>Use of Affymetrix probe IDs</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p><it>Reactome</it>, local database and HTTP request</p>
                     </c>
                     <c ca="left">
                        <p>European Bioinformatics Institute (GB) [3], <url>http://www.reactome.org/download</url></p>
                     </c>
                     <c ca="left">
                        <p>Pathway information</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Current data analysis and display modules</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Method Name</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Reference</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Method Description</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Simple Count</p>
                     </c>
                     <c ca="left">
                        <p>FACT</p>
                     </c>
                     <c ca="left">
                        <p>Count and display of occurrences of annotation terms</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>GO-Term Comparison</p>
                     </c>
                     <c ca="left">
                        <p>In part from <it>GO::TermFinder </it>[15]</p>
                     </c>
                     <c ca="left">
                        <p>Detection of significantly overrepresented GO terms in Gene List, based upon hypergeometric tail probability</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MedLiner</p>
                     </c>
                     <c ca="left">
                        <p><it>Bio::Biblio </it>(M. Senger, EBI)</p>
                     </c>
                     <c ca="left">
                        <p>List Publications with co-occurrences of terms</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CGH database</p>
                     </c>
                     <c ca="left">
                        <p>Deutsches Krebsforschungszentrum, Div. Molecular Genetics (D)</p>
                     </c>
                     <c ca="left">
                        <p>Compare CGH results to archived data</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>goCluster</p>
                     </c>
                     <c ca="left">
                        <p><it>goCluster</it>, G. Wrobel, available at <url>http://www.bioconductor.org</url></p>
                     </c>
                     <c ca="left">
                        <p>Detection of significantly overrepresented GO terms (based upon Fisher's exact test) in Clusters built with k-means algorithm</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Hypergeometric Tail</p>
                     </c>
                     <c ca="left">
                        <p>In part from <it>GeneMerge </it>[14]</p>
                     </c>
                     <c ca="left">
                        <p>Detection of significantly overrepresented terms of any kind, based upon hypergeometric tail probability</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CGH &#8211; Expression Comparison</p>
                     </c>
                     <c ca="left">
                        <p>FACT</p>
                     </c>
                     <c ca="left">
                        <p>Detect correlation between genomic and expression data sets, based on two-sided t-tests</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Chromosomal Plot</p>
                     </c>
                     <c ca="left">
                        <p>FACT</p>
                     </c>
                     <c ca="left">
                        <p>Display values or occurrences in genomic context</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
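            <p>Several methods in table 2 rely on a hypergeometric tail probability. As a generic illustration, the probability of observing at least k annotated genes in a list of n genes, drawn from N genes of which K carry the term, can be computed as follows. This is a textbook formulation in Python, not the code of <it>GO::TermFinder </it>or <it>GeneMerge</it>, which may differ in detail (for example in how they correct for multiple testing); the function and parameter names are our own.</p>

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """Probability of at least k term-carrying genes among n drawn from N.

    K is the number of genes in the population of N that carry the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Made-up numbers: 20 genes in total, 5 carry the term,
# a list of 4 genes contains at least 2 of them.
print(hypergeom_tail(2, 4, 5, 20))
```

            <p>A small tail probability indicates that the term occurs in the list more often than expected by chance.</p>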
            <p>To read in experimental results, a simple list (with terms or term-value pairs) or table can be used, or more specialized parser functions can be employed to read and decipher the notations of different types of results. One parser (<it>2_Colums</it>) reads tab-, or semicolon-separated lists and stores the data as terms of the data type that is passed as a parameter (e.g. gene symbol or clone id) and the respective value. Another function (<it>LongList</it>) expects terms only (list of genes that are to be annotated) or Affymetrix probe ids (<it>AffyCelFile</it>). The parser for CGH results translates ISCN notations of cytogenetic alterations <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> into the distinct chromosomal bands that are affected while reading the data file. The bands are stored with the alteration -1 (loss of genomic material), +1 (gain), or +2 (high level amplification).</p>
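            <p>The behaviour of a two-column parser of this kind can be sketched in Python as follows. The sketch is not the FACT implementation: the function name, the dictionary layout and the handling of comment lines and missing values are assumptions made for illustration.</p>

```python
import re


def parse_two_columns(text, term_type):
    """Read tab- or semicolon-separated term/value lines into feature dicts."""
    features = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments (an assumption)
        parts = re.split(r"[\t;]", line, maxsplit=1)
        term = parts[0].strip()
        has_value = len(parts) == 2 and parts[1].strip() != ""
        value = float(parts[1]) if has_value else None
        features.append({"type": term_type, "term": term, "value": value})
    return features


# Made-up input mixing both separators and a term without a value.
example = "TP53\t2.1\nMYC;-1.4\nEGFR\n"
for feature in parse_two_columns(example, "gene symbol"):
    print(feature)
```

            <p>Passing the data type as a parameter, as the text describes, lets the same parser serve gene symbols, clone ids or any other kind of term.</p>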
            <p>Data sources that can then be used for annotating these experimental results include, among others, the <it>EnsEMBL </it>databases <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, with functions providing numerous gene annotations for the human and murine genomes (gene name, accession numbers, genomic localization, GO terms and other ids). The annotation data is fetched using the EnsEMBL API or direct SQL queries. Chromosomal locations expressed as cytogenetic bands can be translated into megabase-pair positions, which permits the direct comparison of results from genomic and expression experiments. The most common identifiers (IMAGE IDs, DDBJ/EMBL/GenBank accession numbers, international clone names, MGD (Mouse-Genome Database) IDs or Probe-IDs as used on expression microarrays in the Affymetrix system) are recognized and used by the different annotation modules. Additionally, homologous genes, sequence features, InterPro protein domains, CpG content and <it>PubMed </it>references can be acquired. The <it>euGenes </it>database <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> module delivers an additional set of broad annotations (Gene Symbol and Name, GDB ID, OMIM ID, Genomic Localization, GO terms, etc.). The BBID module searches for representations of affected pathways in the <it>Biological Biochemical Image Database </it><abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. We store links to the images, which sometimes clarify interactions better than a textual description alone. Additionally, data from STRING <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> and Reactome <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> are used to point out protein interactions and involvement in molecular complexes.</p>
            <p>Annotation steps can be concatenated, so that more general terms (e.g. gene symbols) can be derived from specific ones (e.g. Affymetrix Probe-IDs). The broad annotation facilitated by FACT is crucial for researchers to acquire a complete picture of their data. The user can export the combined lists of annotated data in HTML, XML or text format. If desired, the system can send them by email.</p>
            <p>Modules to correlate these annotated datasets with each other have been developed for FACT, incorporating existing algorithms or presenting new approaches (table <tblr tid="T2">2</tblr>). For example, a counting procedure (<it>SimpleCount</it>) reports the number of occurrences of each annotation term. The module is independent of the type of data; one or more data sets can be added up, and a threshold for reporting can be defined. The results are displayed graphically in a chart and in table format, directing the researcher's attention to distinct characteristics (figure <figr fid="F4">4</figr>). A more specific approach to interpreting lists of genes lies in the explicit usage of GO terms. As originally introduced by Khatri <it>et al</it>. in the program <it>OntoExpress </it><abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, there are different methods and implementations to search for those parts of the GO tree that appear more often in the analyzed gene list than expected by chance alone. For FACT we used an implementation from the <it>GeneMerge </it>program <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> and the GO::TermFinder Perl module <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Occurrences of terms in a gene list are normalized against a background, which might contain the annotations of all terms spotted on a chip or the entire genome, by using the hypergeometric tail probability (with Bonferroni correction if desired, example: figure <figr fid="F5">5</figr>). In our implementation, the function can be applied to GO data as well as to any other kind of annotation to detect overrepresented terms. Alternatively, Fisher's exact test can be used for this purpose. We included part of this method, taken from the <it>EASE </it>program <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, in a FACT function. 
We also developed a combined algorithm, which calculates a K-Means similarity matrix based on the experimental values and afterwards reports significant terms within the identified groups using Fisher's exact test (<it>goCluster</it>, G. Wrobel, available at <url>http://www.bioconductor.org</url>). Another analysis module dedicated to the comparison of genomic and expression data runs pairs of two-sided t-tests on the groups of over- and under-expressed genes against the groups of enhanced/amplified and deleted genomic regions. This allows a correlation between an altered genomic locus and the corresponding expression pattern to be detected. FACT can also represent experimental values or frequency counts in the genomic context, which can be helpful for the identification and representation of localization effects in the genome. This is achieved by reading in the genomic locations of data sets and generating bar chart images on top of prepared chromosome ideograms (figure <figr fid="F6">6</figr>). The MedLiner module presents a simple literature mining tool which uses the Bio::Biblio Perl functions to find citations that are shared by two or more genes (figure <figr fid="F7">7</figr>). Example output files are also available at <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>.</p>
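            <p>The hypergeometric tail probability with optional Bonferroni correction, as mentioned above, can be sketched as follows. The sketch is written in Python and is not the GeneMerge, GO::TermFinder or FACT code; the function names and all numbers in the usage example are hypothetical. It assumes a background of N terms of which K carry a given annotation, and a list of n terms of which k carry it, and it sums the probabilities of observing k or more annotated terms in the list.</p>

```python
from math import comb

def hypergeom_tail(k, n, K, N):
    """Probability of at least k annotated terms in a list of n terms
    drawn without replacement from N background terms, K of them annotated."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

def bonferroni(p_value, n_tests):
    """Bonferroni correction: multiply by the number of tested
    annotation terms and cap the result at 1."""
    return min(1.0, p_value * n_tests)

# Hypothetical numbers: 15000 background clones, 40 of them carrying the
# annotation; a list of 200 regulated clones, 6 of them carrying it;
# 25 annotation terms tested in total.
p = hypergeom_tail(6, 200, 40, 15000)
print("raw p-value:      ", p)
print("Bonferroni-adjusted:", bonferroni(p, 25))
```

            <p>Because the calculation only needs the four counts, the same function applies to any annotation type (GO terms, chromosomal bands, protein domains), which matches the statement above that the test is not restricted to GO data.</p>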
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Application of the FACT system for the functional analysis of microarray data of the development of non-melanoma skin cancer (<it>SimpleCount </it>function)</p>
               </caption>
               <text>
                  <p><b>Application of the FACT system for the functional analysis of microarray data of the development of non-melanoma skin cancer (<it>SimpleCount </it>function)</b>. Occurrences of annotation terms are counted and displayed to draw the researcher's attention to potentially characteristic features of the data set. In this case the genomic bands at 1q21 seem to play an important role in the experiment.</p>
               </text>
               <graphic file="1471-2105-6-161-4"/>
            </fig>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Application of the FACT system in non-melanoma skin cancer research (<it>GoTerm </it>function)</p>
               </caption>
               <text>
                  <p><b>Application of the FACT system in non-melanoma skin cancer research (<it>GoTerm </it>function)</b>. Overrepresented terms from a Gene Ontology annotation are displayed in a chart. The usage of the GO system is the most common approach for the functional interpretation of gene lists.</p>
               </text>
               <graphic file="1471-2105-6-161-5"/>
            </fig>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Application of the FACT system in non-melanoma skin cancer research (<it>ChromPlot </it>function)</p>
               </caption>
               <text>
                  <p><b>Application of the FACT system in non-melanoma skin cancer research (<it>ChromPlot </it>function)</b>. Visual representation of the genomic distribution of analyzed features highlights the involvement of the genomic band 1q21 (first 5 chromosomes shown). In this case the localizations of human homologous genes corresponding to murine clones over- and under-expressed in squamous cell carcinoma are displayed.</p>
               </text>
               <graphic file="1471-2105-6-161-6"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Application of the program</p>
            </st>
            <p>The <it>Flexible Annotation and Correlation Tool </it>has proven especially helpful with the interpretation of results from genomic and expression microarray experiments, but most functions can be applied to a broad variety of experimental data.</p>
            <p>We demonstrate the benefits of FACT for the functional interpretation of the results from a comprehensive analysis of gene expression patterns in the development of non-melanoma skin cancer conducted within our group [L. Hummerich et al.: <it>Identification of novel tumor-associated genes in the process of squamous cell cancer development</it>; submitted]. Using two different sets of microarrays with 15,000 and 20,000 cDNA fragments respectively, the chemically induced multi-step development of squamous cell carcinoma was monitored. We used the dorsal skin of mice as a well-studied system for the development of epithelial cancer <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> with the carcinogen 7,12-dimethylbenz-[a]-anthracene and the tumor promoter 12-O-tetradecanoylphorbol-13-acetate as inducing agents. Expression values were measured at different time-points of tumor formation. Genes with differential expression patterns are expected to play a role in human epidermal tumor development as well.</p>
            <p>Preprocessed results were loaded into the FACT system as text files containing the murine clone identifier and expression values. Using mainly information from the <it>Ensembl </it>database, FACT annotation functions acquired corresponding gene names, genomic localizations, functional information from Gene Ontology, homologous human gene names and the genomic localization of those genes (supplement S1, complete set of annotated data). We applied different FACT analysis functions to explore the expression data (for example the <it>TermFinder </it>function to search for enriched functional groups of genes, figure <figr fid="F5">5</figr>). The function <it>SimpleCount </it>was used to search for enriched functional categories of annotations. Using the information on homologous genes, we were able to identify the human chromosomal band 1q21 as a region of accumulated differentially expressed genes in murine skin cancer formation (figure <figr fid="F4">4</figr>). The visual representation of the loci using the <it>ChromPlot </it>function highlights these findings (figure <figr fid="F6">6</figr>). With the <it>MedLiner </it>module we were able to confirm that several regulated genes were collectively mentioned in previous publications (figure <figr fid="F7">7</figr>). A number of genes (S100A3, S100A6, S100A8, S100A9) which are part of the S100 family of calcium-binding proteins are involved in the regulation of AP-1 and NF&#954;B-dependent transcription. Using FACT we were able to focus our analysis and to gain an understanding of the relevance of the results from the microarray study. With the application, it was possible to find human homologues of the murine tumor-associated genes and to confirm the involvement of the S100 protein family in human epidermal malignancies.</p>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Application of the FACT system in non-melanoma skin cancer research (<it>MedLiner </it>function)</p>
               </caption>
               <text>
                  <p><b>Application of the FACT system in non-melanoma skin cancer research (<it>MedLiner </it>function)</b>. FACT's simple automated literature screen function displays publications mentioning groups of genes identified in the study (top 5 hits shown).</p>
               </text>
               <graphic file="1471-2105-6-161-7"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Comparison to and inclusion of other tools</p>
            </st>
            <p>There are several large-scale database projects that incorporate an immense spectrum of information about genes and gene products (<it>Ensembl</it>, <it>euGenes</it>, <it>LocusLink/EntrezGene</it>, etc.). The <it>Ensembl </it>project, for example, also allows users to display their own selected data sources in the context of the full genome annotation through the <it>Distributed Annotation System </it><abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, and it offers the possibility of mining the data of several genomes using the <it>EnsMart </it>software <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. FACT uses <it>Ensembl</it>, <it>LocusLink/EntrezGene </it>and <it>euGenes </it>data and complements them with other annotation resources; it allows the user to apply different analysis functions to the combined data.</p>
            <p>Recently, a variety of computational tools have been introduced to aid in the interpretation of results, some of which are of interest concerning FACT. The majority of the programs use GO annotations to gain an interpretation of gene expression data. <it>OntoExpress </it><abbrgrp><abbr bid="B13">13</abbr></abbrgrp> was introduced in 2002 and offers the options to use hypergeometric, chi-square, binomial and Fisher's exact tests to score annotation terms derived from gene lists. It also allows the application of different methods (False Discovery Rate (FDR), Bonferroni, Holm, Sidak) for the multiple experiment correction <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. The program can also include KEGG pathway information and chromosomal localization and is now part of the Onto-Tools collection to offer further functionality <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. <it>EASE </it><abbrgrp><abbr bid="B16">16</abbr></abbrgrp><it>/ DAVID </it><abbrgrp><abbr bid="B22">22</abbr></abbrgrp> offer a broad variety of annotation options in their latest version including all major database identifiers, protein domain and pathway information. Fisher's exact test is used for the detection of enriched terms. <it>GoMiner </it><abbrgrp><abbr bid="B23">23</abbr></abbrgrp> and numerous other tools listed at the GO website <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> can be used for the annotation of gene lists with GO terms. <it>GeneMerge </it><abbrgrp><abbr bid="B14">14</abbr></abbrgrp> uses the hypergeometric tail probability with Bonferroni correction to test GO terms, genomic localizations and KEGG pathway information and is used in parts within the FACT system. FACT uses the available Perl code of <it>GO::TermFinder </it><abbrgrp><abbr bid="B15">15</abbr></abbrgrp> for the GO annotation and the detection of significantly overrepresented terms using the FDR. It also includes a function combining K-means clustering and Fisher's exact test. 
<it>GEPAS </it><abbrgrp><abbr bid="B25">25</abbr></abbrgrp> and <it>GECKO </it><abbrgrp><abbr bid="B26">26</abbr></abbrgrp> are two recently introduced large software packages that include functional analysis and visualization steps. In contrast to FACT, they are focused on the initial statistical evaluation and on the analysis of gene expression microarray data. GFINDer <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> is a system that offers annotations on GO, pathway information, protein domains and genetic disorders. It analyses gene lists with count functions and appropriate statistical tests (hypergeometric, binomial, Fisher's exact, Z or Poisson test).</p>
            <p>This list of available tools is far from complete and not all aspects are covered. The FACT system was developed with the aim of including and extending the functionality of tools like these. To our knowledge, the individual programs do not offer the same degree of flexibility and openness to different data sources and analysis methods. New functions can be added to FACT by simply uploading the respective module. The system is designed as an open framework for explorative analysis using a variety of methods on annotation data. It is not restricted to or focused on Gene Ontology-based interpretation or the analysis of gene expression data alone and should facilitate the development and application of new analysis approaches. The system is constantly being extended to include additional aspects. By submitting it as an open-source project, we want to encourage other researchers to participate in this development.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>To gain a more complete picture of results obtained from high-throughput experiments such as DNA microarrays, automated procedures are required for annotation and analysis. At the same time, it is usually not known beforehand, and is rather a matter of testing, which of the possible approaches to functional analysis will be the most informative or appropriate. The <it>Flexible Annotation and Correlation Tool </it>offers the flexibility to integrate and compare annotation data and different algorithms in one environment by using a unified data basis. Data sets of different nature and format can be incorporated, diverse analytical algorithms can be applied, and users can add their own data integration and analysis functions. As a flexible framework for the explorative meta-analysis of genomic, proteomic or other experiments, FACT can help with the task of analyzing biological complexity, allowing researchers to bridge gaps between different kinds of experiments and to acquire a more complete interpretation of large-scale experiments.</p>
      </sec>
      <sec>
         <st>
            <p>Availability and requirements</p>
         </st>
         <p>- <b>Project name: </b>Flexible Annotation and Correlation Tool (FACT).</p>
         <p>- <b>Project home page: </b><url>http://www.factweb.de</url></p>
         <p>- <b>Operating system: </b>tested on Linux SUSE 9.1</p>
         <p>- <b>Programming language: </b>Perl (5.8.1)</p>
         <p>- <b>Other requirements: </b>MySQL database (4.0.15); for specific modules: R (1.8.0 with RSPerl) and Bioconductor (1.4.0); for full installation: Apache web-server (apache2-prefork-2.0.48); additional Perl modules: BioPerl (1.2.1), Ensembl (currently 28). Please refer to the website for a full listing.</p>
         <p>- <b>License: </b>Open Source GNU GPL (see licence document)</p>
         <p>- <b>Any restrictions to use by non-academics: </b>written licence needed</p>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>API &#8211; application programming interface, CGH &#8211; comparative genomic hybridization, FACT &#8211; Flexible annotation and correlation tool, GO &#8211; Gene Ontology, ISCN &#8211; international system for human cytogenetic nomenclature</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>FK designed and implemented the FACT system and the web-interface, ND re-designed and extended it, GW helped with the initial design and supplied R-modules, LH carried out the microarray experiments and supplied additional ideas, GT conducted the integration into other analysis systems, PL supervised the FACT project. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We are grateful for the contributions of Regina Mueller, Jochen Hess and Peter Angel within the non-melanoma skin cancer project and the helpful comments from Anja Kolb-Kokocinski and Imre Vastrik. The FACT project was supported by grants from the German Ministry for Education and Research (NGFN 01 GR 0101, NGFN 01GR 0417 and NGFN 01GR 0418).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>GeneX: An Open Source gene expression database and integrated tool set</p>
            </title>
            <aug>
               <au>
                  <snm>Mangalam</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Stewart</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schlauch</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Waugh</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Farmer</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Colello</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Weller</snm>
                  <fnm>JW</fnm>
               </au>
            </aug>
            <source>IBM Systems J</source>
            <pubdate>2001</pubdate>
            <volume>40</volume>
            <issue>2</issue>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Creating the gene ontology resource: design and implementation</p>
            </title>
            <aug>
               <au>
                  <cnm>GeneOntology Consortium</cnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <fpage>1425</fpage>
            <lpage>33</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">311077</pubid>
                  <pubid idtype="pmpid" link="fulltext">11483584</pubid>
                  <pubid idtype="doi">10.1101/gr.180801</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Reactome: a knowledgebase of biological pathways</p>
            </title>
            <aug>
               <au>
                  <snm>Joshi-Tope</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gillespie</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Vastrik</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>D'Eustachio</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Schmidt</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>de Bono</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Jassal</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Gopinath</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Matthews</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Lewis</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Stein</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>D428</fpage>
            <lpage>32</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540026</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608231</pubid>
                  <pubid idtype="doi">10.1093/nar/gki072</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>FACT website</p>
            </title>
            <url>http://www.factweb.de</url>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Bioconductor: open software development for computational biology and bioinformatics</p>
            </title>
            <aug>
               <au>
                  <snm>Gentleman</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Carey</snm>
                  <fnm>VJ</fnm>
               </au>
               <au>
                  <snm>Bates</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Bolstad</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Dettling</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dudoit</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ellis</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Gautier</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Ge</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Gentry</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hornik</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Hothorn</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Huber</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Iacus</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Irizarry</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Leisch</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Maechler</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rossini</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Sawitzki</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Smyth</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Tierney</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>JY</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>10</issue>
            <fpage>R80</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">545600</pubid>
                  <pubid idtype="pmpid" link="fulltext">15461798</pubid>
                  <pubid idtype="doi">10.1186/gb-2004-5-10-r80</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>RSPerl</p>
            </title>
            <url>http://www.omegahat.org</url>
         </bibl>
         <bibl id="B7">
            <title>
               <p>The Bioperl toolkit: Perl modules for the life sciences</p>
            </title>
            <aug>
               <au>
                  <snm>Stajich</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Block</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Boulez</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Brenner</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Chervitz</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Dagdigian</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Fuellen</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>JGR</fnm>
               </au>
               <au>
                  <snm>Korf</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Lapp</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lehvaslaiho</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Matsalla</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Mungall</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Osborne</snm>
                  <fnm>BI</fnm>
               </au>
               <au>
                  <snm>Pocock</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Schattner</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Senger</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Stein</snm>
                  <fnm>LD</fnm>
               </au>
               <au>
                  <snm>Stupka</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Wilkinson</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <issue>10</issue>
            <fpage>1611</fpage>
            <lpage>1618</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">187536</pubid>
                  <pubid idtype="pmpid" link="fulltext">12368254</pubid>
                  <pubid idtype="doi">10.1101/gr.361602</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>The Ensembl genome database project</p>
            </title>
            <aug>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Barker</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Cameron</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Cox</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Cuff</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Curwen</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Down</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Eyras</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hammond</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Huminiecki</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Kasprzyk</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lehvaslaiho</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lijnzaad</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Melsopp</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Mongin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Pettett</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Pocock</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Potter</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Rust</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Schmidt</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Searle</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Slater</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Spooner</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Stabenau</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Stalker</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Stupka</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Ureta-Vidal</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Vastrik</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Clamp</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <issue>1</issue>
            <fpage>38</fpage>
            <lpage>41</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99161</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752248</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.38</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>An International System for Human Cytogenetic Nomenclature (1985)</p>
            </title>
            <aug>
               <au>
                  <cnm>Standing Committee on Human Cytogenetic Nomenclature</cnm>
               </au>
            </aug>
            <source>Birth Defects Orig Artic Ser</source>
            <pubdate>1985</pubdate>
            <volume>21</volume>
            <issue>1</issue>
            <fpage>1</fpage>
            <lpage>117</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">4041569</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>euGenes: a eukaryote genome information system</p>
            </title>
            <aug>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>DG</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>145</fpage>
            <lpage>148</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99146</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752277</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.145</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>BBID: the biological biochemical image database</p>
            </title>
            <aug>
               <au>
                  <snm>Becker</snm>
                  <fnm>KG</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Muller</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Engel</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <issue>8</issue>
            <fpage>745</fpage>
            <lpage>746</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/16.8.745</pubid>
                  <pubid idtype="pmpid" link="fulltext">11099263</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>STRING: a database of predicted functional associations between proteins</p>
            </title>
            <aug>
               <au>
                  <snm>von Mering</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Huynen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jaeggi</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Schmidt</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Snel</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <issue>1</issue>
            <fpage>258</fpage>
            <lpage>261</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">165481</pubid>
                  <pubid idtype="pmpid" link="fulltext">12519996</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg034</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Profiling gene expression using Onto-express</p>
            </title>
            <aug>
               <au>
                  <snm>Khatri</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Draghici</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ostermeier</snm>
                  <fnm>GC</fnm>
               </au>
               <au>
                  <snm>Krawetz</snm>
                  <fnm>SA</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>2002</pubdate>
            <volume>79</volume>
            <issue>2</issue>
            <fpage>266</fpage>
            <lpage>270</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/geno.2002.6698</pubid>
                  <pubid idtype="pmpid" link="fulltext">11829497</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>GeneMerge &#8211; post-genomic analysis, data mining, and hypothesis testing</p>
            </title>
            <aug>
               <au>
                  <snm>Castillo-Davis</snm>
                  <fnm>CI</fnm>
               </au>
               <au>
                  <snm>Hartl</snm>
                  <fnm>DL</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>7</issue>
            <fpage>891</fpage>
            <lpage>892</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg114</pubid>
                  <pubid idtype="pmpid" link="fulltext">12724301</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>GO::TermFinder &#8211; open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes</p>
            </title>
            <aug>
               <au>
                  <snm>Boyle</snm>
                  <fnm>EI</fnm>
               </au>
               <au>
                  <snm>Weng</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gollub</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Jin</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Cherry</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Sherlock</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <issue>18</issue>
            <fpage>3710</fpage>
            <lpage>3715</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15297299</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Identifying biological themes within lists of genes with EASE</p>
            </title>
            <aug>
               <au>
                  <snm>Hosack</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Dennis</snm>
                  <fnm>G</fnm>
                  <suf>Jr</suf>
               </au>
               <au>
                  <snm>Sherman</snm>
                  <fnm>BT</fnm>
               </au>
               <au>
                  <snm>Lane</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Lempicki</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <issue>10</issue>
            <fpage>R70</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">328459</pubid>
                  <pubid idtype="pmpid" link="fulltext">14519205</pubid>
                  <pubid idtype="doi">10.1186/gb-2003-4-10-r70</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Global functional profiling of gene expression</p>
            </title>
            <aug>
               <au>
                  <snm>Draghici</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Khatri</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Martins</snm>
                  <fnm>RP</fnm>
               </au>
               <au>
                  <snm>Ostermeier</snm>
                  <fnm>GC</fnm>
               </au>
               <au>
                  <snm>Krawetz</snm>
                  <fnm>SA</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>2003</pubdate>
            <volume>81</volume>
            <issue>2</issue>
            <fpage>98</fpage>
            <lpage>104</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0888-7543(02)00021-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">12620386</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Multistage carcinogenesis in mouse skin</p>
            </title>
            <aug>
               <au>
                  <snm>DiGiovanni</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Pharmacol Ther</source>
            <pubdate>1992</pubdate>
            <volume>54</volume>
            <fpage>63</fpage>
            <lpage>126</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0163-7258(92)90051-Z</pubid>
                  <pubid idtype="pmpid">1528955</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>The distributed annotation system</p>
            </title>
            <aug>
               <au>
                  <snm>Dowell</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Jokerst</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Day</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Eddy</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Stein</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>2</volume>
            <issue>1</issue>
            <fpage>7</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">58584</pubid>
                  <pubid idtype="pmpid" link="fulltext">11667947</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-2-7</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>EnsMart: a generic system for fast and flexible access to biological data</p>
            </title>
            <aug>
               <au>
                  <snm>Kasprzyk</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Keefe</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Smedley</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>London</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Spooner</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Melsopp</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hammond</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rocca-Serra</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Cox</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <fpage>160</fpage>
            <lpage>169</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">314293</pubid>
                  <pubid idtype="pmpid" link="fulltext">14707178</pubid>
                  <pubid idtype="doi">10.1101/gr.1645104</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Onto-Tools: an ensemble of web-accessible, ontology-based tools for the functional design and interpretation of high-throughput gene expression experiments</p>
            </title>
            <aug>
               <au>
                  <snm>Khatri</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bhavsar</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bawa</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Draghici</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>W449</fpage>
            <lpage>W456</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">441547</pubid>
                  <pubid idtype="pmpid" link="fulltext">15215428</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh086</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>DAVID: Database for Annotation, Visualization, and Integrated Discovery</p>
            </title>
            <aug>
               <au>
                  <snm>Dennis</snm>
                  <fnm>G</fnm>
                  <suf>Jr</suf>
               </au>
               <au>
                  <snm>Sherman</snm>
                  <fnm>BT</fnm>
               </au>
               <au>
                  <snm>Hosack</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gao</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lane</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Lempicki</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <issue>5</issue>
            <fpage>P3</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2003-4-5-p3</pubid>
                  <pubid idtype="pmpid" link="fulltext">12734009</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>GoMiner: A Resource for Biological Interpretation of Genomic and Proteomic Data</p>
            </title>
            <aug>
               <au>
                  <snm>Zeeberg</snm>
                  <fnm>BR</fnm>
               </au>
               <au>
                  <snm>Feng</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Fojo</snm>
                  <fnm>AT</fnm>
               </au>
               <au>
                  <snm>Sunshine</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Narasimhan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kane</snm>
                  <fnm>DW</fnm>
               </au>
               <au>
                  <snm>Reinhold</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Lababidi</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bussey</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Riss</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Barrett</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Weinstein</snm>
                  <fnm>JN</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <issue>4</issue>
            <fpage>R28</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">154579</pubid>
                  <pubid idtype="pmpid" link="fulltext">12702209</pubid>
                  <pubid idtype="doi">10.1186/gb-2003-4-4-r28</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Gene Ontology website</p>
            </title>
            <url>http://www.geneontology.org</url>
         </bibl>
         <bibl id="B25">
            <title>
               <p>New challenges in gene expression data analysis and the extended GEPAS</p>
            </title>
            <aug>
               <au>
                  <snm>Herrero</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Vaquerizas</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Al-Shahrour</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Conde</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Mateos</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Diaz-Uriarte</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Dopazo</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <issue>Web Server</issue>
            <fpage>W485</fpage>
            <lpage>W491</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">441559</pubid>
                  <pubid idtype="pmpid" link="fulltext">15215434</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>GECKO: a complete large-scale gene expression analysis platform</p>
            </title>
            <aug>
               <au>
                  <snm>Theilhaber</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ulyanov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Malanthara</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Cole</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Nahf</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Heuer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Brockel</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Bushnell</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>1</issue>
            <fpage>195</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">539353</pubid>
                  <pubid idtype="pmpid" link="fulltext">15588317</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-5-195</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>GFINDer: Genome Function INtegrated Discoverer through dynamic annotation, statistical analysis, and mining</p>
            </title>
            <aug>
               <au>
                  <snm>Masseroli</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Martucci</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Pinciroli</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>W293</fpage>
            <lpage>W300</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">441570</pubid>
                  <pubid idtype="pmpid" link="fulltext">15215397</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh108</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>The I.M.A.G.E. Consortium: an integrated molecular analysis of genomes and their expression</p>
            </title>
            <aug>
               <au>
                  <snm>Lennon</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Auffray</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Polymeropoulos</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Soares</snm>
                  <fnm>MB</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>1996</pubdate>
            <volume>33</volume>
            <issue>1</issue>
            <fpage>151</fpage>
            <lpage>152</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/geno.1996.0177</pubid>
                  <pubid idtype="pmpid" link="fulltext">8617505</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>The Cancer Genome Anatomy Project: building an annotated gene index</p>
            </title>
            <aug>
               <au>
                  <snm>Strausberg</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Buetow</snm>
                  <fnm>KH</fnm>
               </au>
               <au>
                  <snm>Emmert-Buck</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Klausner</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>103</fpage>
            <lpage>106</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(99)01937-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">10689348</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Introducing RefSeq and LocusLink: curated human genome resources at the NCBI</p>
            </title>
            <aug>
               <au>
                  <snm>Pruitt</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Katz</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Sicotte</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Maglott</snm>
                  <fnm>DR</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <issue>1</issue>
            <fpage>44</fpage>
            <lpage>47</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(99)01882-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">10637631</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>MGD: The Mouse Genome Database</p>
            </title>
            <aug>
               <au>
                  <snm>Blake</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Richardson</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Bult</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Kadin</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Eppig</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <cnm>the members of the Mouse Genome Database Group</cnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>193</fpage>
            <lpage>195</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">165494</pubid>
                  <pubid idtype="pmpid" link="fulltext">12519980</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg047</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
