<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-7-374</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Methodology article</dochead>
      <bibl>
         <title>
            <p>Genome comparison using Gene Ontology (GO) with statistical testing</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Cai</snm>
               <fnm>Zhaotao</fnm>
               <insr iid="I1"/>
               <email>cait@mail.cbi.pku.edu.cn</email>
            </au>
            <au id="A2">
               <snm>Mao</snm>
               <fnm>Xizeng</fnm>
               <insr iid="I1"/>
               <email>maoxz@mail.cbi.pku.edu.cn</email>
            </au>
            <au id="A3">
               <snm>Li</snm>
               <fnm>Songgang</fnm>
               <insr iid="I1"/>
               <email>lsg@pku.edu.cn</email>
            </au>
            <au id="A4" ca="yes">
               <snm>Wei</snm>
               <fnm>Liping</fnm>
               <insr iid="I1"/>
               <email>weilp@mail.cbi.pku.edu.cn</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Center for Bioinformatics, National Laboratory of Protein Engineering and Plant Genetic Engineering, College of Life Sciences, Peking University, Beijing 100871, P.R. China</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2006</pubdate>
         <volume>7</volume>
         <issue>1</issue>
         <fpage>374</fpage>
         <url>http://www.biomedcentral.com/1471-2105/7/374</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">16901353</pubid>
               <pubid idtype="doi">10.1186/1471-2105-7-374</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>24</day>
               <month>2</month>
               <year>2006</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>11</day>
               <month>8</month>
               <year>2006</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>11</day>
               <month>8</month>
               <year>2006</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2006</year>
         <collab>Cai et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Automated comparison of the complete sets of genes encoded in two genomes can provide insight into the genetic basis of differences in biological traits between species. Gene Ontology (GO) is used as a common vocabulary to annotate genes for comparison. Current approaches calculate unweighted or weighted fold differences between two species at the high-level GO functional categories. However, to ensure the reliability of the differences detected, it is important to evaluate their statistical significance. It is also useful to search for differences at all levels of GO.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We propose a statistical approach to find reliable differences between the complete sets of genes encoded in two genomes at all levels of GO. The genes are first assigned GO terms from BLAST searches against genes with known GO assignments, and for each GO term the abundance of genes in the two genomes is compared using a chi-squared test followed by false discovery rate (FDR) correction. We applied this method to find statistically significant differences between two cyanobacteria, <it>Synechocystis </it>sp. PCC6803 and <it>Anabaena </it>sp. PCC7120. We then studied how the set of identified differences varies when different BLAST cutoffs are used. We also studied how the results vary when only subsets of the genes are used, in the comparison of human <it>vs</it>. mouse and that of <it>Saccharomyces cerevisiae </it><it>vs</it>. <it>Schizosaccharomyces pombe</it>.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>There is a surprising lack of statistical approaches for comparing complete genomes at all levels of GO. With the rapid increase in the number of sequenced genomes, we hope that the approach we proposed and tested can make a valuable contribution to comparative genomics.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Comparison of two completely sequenced genomes sheds light on the genetic basis of differences in biological traits between species. Of particular interest is the comparison of the complete sets of genes and gene products encoded in two genomes. Manual comparison is important but time-consuming and labor-intensive at the whole-genome scale, and thus must be aided by automated approaches.</p>
         <p>Unambiguous automated comparison requires that both genomes be annotated with the same structured, controlled vocabulary. Currently, the most common choice for such a vocabulary is gene ontology (GO) <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. The November 15, 2005 version of GO contained 19,025 terms in three hierarchical structures&#8212;as Directed Acyclic Graphs (DAGs)&#8212;termed Biological Processes, Cellular Components, and Molecular Functions. Every branch in the graph represents a biological concept progressing from general to specialized with increasing graph depth. The depth of the branches in the graphs varies, with levels ranging from 2 to 15.</p>
         <p>The GO web site currently lists 31 genomes that have been annotated with GO <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. The annotations that are of the highest quality and updated most frequently are usually produced by researchers who sequence and study a particular species; these annotations are primarily stored in species-specific databases such as SGD <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> for <it>Saccharomyces cerevisiae</it>, FlyBase <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> for <it>Drosophila melanogaster</it>, WormBase <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> for <it>Caenorhabditis elegans</it>, MGI <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> for <it>Mus musculus</it>, and TAIR <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> for <it>Arabidopsis thaliana</it>. Since these species-specific databases are located at different sites on the web, there is a need for integrated, searchable databases that contain annotations for multiple species. The GO Consortium has developed such a resource, called AMIGO <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, that allows users to search and browse GO annotations integrated from many species-specific databases. Additionally, the European Bioinformatics Institute (EBI) has developed the Gene Ontology Annotation (GOA) database <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> that provides GO annotations for non-redundant proteins from many species in UniProt <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>. We compare these two resources in the Methods section. In addition to sequences annotated with GO, 15,754 functional domains in the InterPro domain database <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> have been linked to 2,627 GO terms <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
         <p>Two main types of methods have been developed to automatically annotate new gene products with GO terms using the above-mentioned resources: sequence similarity-based methods such as GOFigure <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, GOblet <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, OntoBlast <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, GOtcha <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, and Blast2GO <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, and sequence domain-based methods such as InterProScan <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> and GOTrees <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. For genome-scale GO annotation, similarity-based methods, in particular BLAST-based ones, have been the preferred choice <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>. BLAST is significantly faster than InterProScan and can annotate many more GO terms than InterProScan can. A recent evaluation showed that assigning the GO terms of the top BLAST hit gave satisfactory results when compared with several more complex methods <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. Thus we chose the BLAST approach in our work.</p>
         <p>After the sets of genes encoded in the two genomes are annotated with GO, they can then be compared. The goal is to find functional categories that differ between the two genomes, which may explain differences in biological traits or suggest interesting families for further detailed investigation. The most common practice is to use tools such as GOslim <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp> to tally the number of genes that fall within each functional category at the first level under Biological Processes, Cellular Components, and Molecular Functions, and then to compare between the two genomes. Because the two genomes usually differ in size, the absolute numbers of genes in each functional category need to be weighted before they are compared; they are often divided by the total number of genes in the respective genomes <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. The results of the unweighted and weighted comparisons are usually presented as bar charts or fold changes.</p>
         <p>The unweighted and weighted GO-based genome comparisons, although useful, have two drawbacks. First, focusing only on the high-level functional categories may miss differences that are detectable only at more refined levels. Second, bar charts or fold changes alone are not sufficient to separate true functional differences from those occurring by chance; thus, statistical testing of significance is necessary. Lessons can be learned from another, more extensively researched application of GO&#8212;the detection of significantly enriched GO categories in a set of co-expressed or differentially expressed genes in microarray experiments. Several tools have been developed to search complete GO trees (rather than just the high levels) and apply statistical testing of significance (e.g., Onto-Express <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>; FatiGO <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>; for an evaluation of these tools, see ref. <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>).</p>
         <p>Contrary to the situation in microarray analysis, there is a surprising lack of statistical approaches for GO-based comparison of two genomes. Here we propose such a statistical approach to find reliable differences between the complete sets of genes encoded in two genomes at all levels of GO. For each GO term the abundance of genes in the two genomes is compared using a chi-squared test followed by false discovery rate (FDR) correction. Furthermore, to analyze the reliability of the differences detected, we studied two important issues. First, when new sequences are assigned GO terms by similarity (as determined by BLAST) to other sequences having known GO assignments, the choice of BLAST cutoff may affect the results. We therefore analyzed the effects of employing a wide range of BLAST cutoffs. Second, we studied how the results vary when only subsets of the genes are used. To our knowledge, our work is the first to address all the aforementioned issues.</p>
         <p>We used this statistical approach to compare two cyanobacterial genomes, <it>Synechocystis </it>sp. PCC6803 and <it>Anabaena </it>sp. PCC7120. Cyanobacteria (also called blue-green bacteria, blue-green algae, cyanophyceae, or cyanophytes) are important model organisms for the study of photosynthesis, nitrogen fixation, evolution of plant plastids, and survival in diverse environments <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr></abbrgrp>. Two of the most widely studied cyanobacterial species are <it>Synechocystis </it>sp. PCC6803 and <it>Anabaena </it>sp. PCC7120. PCC6803 is a freshwater unicellular cyanobacterium incapable of nitrogen fixation <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>; PCC7120 is a filamentous, heterocyst-forming cyanobacterium that has long been used to study the genetics and physiology of cellular differentiation, pattern formation, and nitrogen fixation <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. These interesting biological differences, as well as the appropriate evolutionary distance between PCC6803 and PCC7120, make them a popular pair of species to compare and contrast <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B44">44</abbr><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr><abbr bid="B49">49</abbr><abbr bid="B50">50</abbr></abbrgrp>. We compared the PCC6803 and PCC7120 genomes using our statistical method and evaluated the detected statistically significant differences against known biological differences. To analyze how results change when only subsets of the genes are used, a larger set of statistically significant differences is desirable, so we used the comparison of human <it>vs</it>. mouse and that of <it>Saccharomyces cerevisiae vs</it>. <it>Schizosaccharomyces pombe </it>genomes.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Whole-genome GO annotation</p>
            </st>
            <p>To annotate a new sequence, we used BLAST to compare it against a database of sequences with known GO annotations. Such a database should contain as many annotated sequences as possible from as many species as possible. AMIGO and GOA are the two primary choices for such a database. We compared AMIGO and GOA, as shown in Table <tblr tid="T1">1</tblr>. Each database has its own merits. AMIGO has been integrated to a greater extent with other databases and provides a better browsing function on the web, whereas GOA contains more sequences. For our purpose, it was attractive to have a larger collection of sequences for comparisons using BLAST, and thus we chose the GOA database. We set the default BLAST E-value cutoff to 1E-20. With this method, a gene is assigned the GO terms of its top BLAST hit in GOA; it is also linked to all parent GO terms by propagating the assignment up the DAG structures. Finally, the number of genes assigned to each GO term is tallied, representing the abundance of genes in each GO function within the genome.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Comparison of AMIGO and GOA</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c ca="left">
                        <p>GO Annotation Database</p>
                     </c>
                     <c ca="center">
                        <p>AMIGO<sup>1</sup></p>
                     </c>
                     <c ca="center">
                        <p>GOA<sup>2</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Curator</p>
                     </c>
                     <c ca="center">
                        <p>GO Consortium</p>
                     </c>
                     <c ca="center">
                        <p>European Bioinformatics Institute (EBI)</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>URL</p>
                     </c>
                     <c ca="center">
                        <p>
                           <url>http://www.godatabase.org/</url>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <url>http://www.ebi.ac.uk/GOA/</url>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total number of species</p>
                     </c>
                     <c ca="center">
                        <p>129,722</p>
                     </c>
                     <c ca="center">
                        <p>96,203</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total number of associations</p>
                     </c>
                     <c ca="center">
                        <p>7,745,168</p>
                     </c>
                     <c ca="center">
                        <p>7,600,805</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total number of non-redundant sequences</p>
                     </c>
                     <c ca="center">
                        <p>219,341</p>
                     </c>
                     <c ca="center">
                        <p>1,605,096</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total number of GO terms</p>
                     </c>
                     <c ca="center">
                        <p>10,916</p>
                     </c>
                     <c ca="center">
                        <p>9,258</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total number of other databases integrated</p>
                     </c>
                     <c ca="center">
                        <p>143</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>1</sup>AMIGO monthly release (November 1, 2005) downloaded from <url>http://archive.godatabase.org/full/2005-11-01/go_200511-seqdb-data.gz</url></p>
                  <p><sup>2</sup>GOA version 33.0 (October 25, 2005) downloaded from <url>ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/UNIPROT/gene_association.goa_uniprot.gz</url>.</p>
               </tblfn>
            </tbl>
            <p>We were able to annotate 2,224 genes in the PCC6803 genome to 1,933 GO terms, and 3,348 genes in the PCC7120 genome to 1,947 GO terms.</p>
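            <p>The propagation and tallying steps described above can be sketched as follows. This is a minimal illustration, not the code distributed with the paper: the function names, the dictionary-based DAG representation, and the input format (gene identifier mapped to the GO terms of its top BLAST hit) are all assumptions made for the example.</p>
```python
from collections import Counter


def ancestors(term, parents):
    """Return the term itself plus every ancestor reachable via parent links.

    parents is an assumed representation of the GO DAG:
    a dict mapping each term to the set of its direct parents.
    """
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


def tally_genes_per_term(top_hit_terms, parents):
    """Count genes per GO term after propagating assignments to all ancestors.

    top_hit_terms maps a gene id to the GO terms copied from its top BLAST hit.
    Each gene is counted at most once per term, even when several of its
    assigned terms share an ancestor.
    """
    counts = Counter()
    for gene, terms in top_hit_terms.items():
        propagated = set()
        for term in terms:
            propagated |= ancestors(term, parents)
        counts.update(propagated)
    return counts
```
            <p>Collecting the propagated terms in a set before counting keeps a gene from being counted twice at a shared ancestor, which matters because the GO graphs are DAGs rather than trees.</p>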
         </sec>
         <sec>
            <st>
               <p>Testing the statistical significance of detected differences between genomes</p>
            </st>
            <p>For each GO category, we used the chi-squared test to determine whether the numbers of genes from the two genomes were statistically significantly different <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. Since the total number of GO categories is large, a large number of tests is required. We adopted the widely used FDR correction (<it>q</it>-value cutoff = 0.01) to control the expected proportion of false positives among the categories reported as significant <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. We chose rather strict criteria to ensure reliability of the results; other users can set them differently.</p>
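            <p>A minimal sketch of this per-term test is given below, under stated assumptions. The 2x2 table layout (genes annotated to the term versus all other annotated genes, for each genome), the absence of a continuity correction, and the use of the Benjamini-Hochberg adjustment as the FDR procedure are choices made for illustration; the text above does not specify them, and the function names are hypothetical.</p>
```python
import math


def chi2_pvalue_2x2(in_term_1, total_1, in_term_2, total_2):
    """Pearson chi-squared p-value with 1 degree of freedom, no continuity correction.

    Assumed table: rows are the two genomes, columns are genes annotated
    to the GO term versus all other annotated genes.
    """
    rest_1 = total_1 - in_term_1
    rest_2 = total_2 - in_term_2
    denom = (in_term_1 + in_term_2) * (rest_1 + rest_2) * total_1 * total_2
    if denom == 0:
        return 1.0  # degenerate table: no evidence of a difference
    grand_total = total_1 + total_2
    chi2 = grand_total * (in_term_1 * rest_2 - in_term_2 * rest_1) ** 2 / denom
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running = 1.0
    for position, index in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        running = min(running, pvalues[index] * m / rank)
        adjusted[index] = running
    return adjusted


def significant_terms(counts_1, total_1, counts_2, total_2, q_cutoff=0.01):
    """Return {term: adjusted p-value} for terms at or below q_cutoff."""
    terms = sorted(set(counts_1) | set(counts_2))
    pvalues = [
        chi2_pvalue_2x2(counts_1.get(t, 0), total_1, counts_2.get(t, 0), total_2)
        for t in terms
    ]
    qvalues = benjamini_hochberg(pvalues)
    return {t: q for t, q in zip(terms, qvalues) if q_cutoff >= q}
```
            <p>The adjustment is applied across all tested terms at once, so the set of significant terms depends on how many GO categories enter the comparison as well as on the individual counts.</p>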
            <p>We found seven terms in the GO Biological Process category that were statistically significantly different between the two genomes: "transition metal ion transport" (GO:0000041, <it>q</it>-value 6.1E-6), "di-, trivalent inorganic cation transport" (GO:0015674, <it>q</it>-value 6.1E-6), "cobalt ion transport" (GO:0006824, <it>q</it>-value 7.3E-5), "metal ion transport" (GO:0030001, <it>q</it>-value 0.00056), "protein amino acid phosphorylation" (GO:0006468, <it>q</it>-value 0.0021), "cellular biosynthesis" (GO:0044249, <it>q</it>-value 0.0022) and "nitrogen fixation" (GO:0009399, <it>q</it>-value 0.0094). These differences are shown in Figure <figr fid="F1">1</figr> and discussed below. (The differences detected in the Molecular Function and Cellular Component categories are available in <supplr sid="S1">Additional file 1</supplr>.)</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Comparison of PCC6803 and PCC7120 using our statistical approach</p>
               </caption>
               <text>
                  <p><b>Comparison of PCC6803 and PCC7120 using our statistical approach</b>. Comparison of PCC6803 and PCC7120 in the Biological Process category of GO, using the chi-squared test followed by FDR correction, with the <it>q</it>-value cutoff set to 0.01. The colors denote levels of statistical significance of differences between genomes; non-significant parent nodes of significant child nodes are shown in tan. (Results for the Molecular Function and Cellular Component categories are available in <supplr sid="S1">Additional file 1</supplr>.)</p>
               </text>
               <graphic file="1471-2105-7-374-1"/>
            </fig>
            <suppl id="S1">
               <title>
                  <p>Additional File 1</p>
               </title>
               <text>
                  <p><b>Supplementary materials and related programs</b>. The compressed file contains supplementary materials and related programs for the paper, including the source code and documentation, the genome comparison results for PCC6803_PCC7120, Cerevisiae_Pombe and Human_Mouse, the figures showing the effect of using different subsets of the input genes, and the statistical analysis of BLAST HSP (High scoring Segment Pair) length. Please unzip the file and read "index.htm" for details. The same information is available on the website (<url>http://www.cbi.pku.edu.cn/cbird/GO/</url>).</p>
               </text>
               <file name="1471-2105-7-374-S1.zip">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>The PCC7120 genome contains significantly more genes in "cobalt ion transport" (GO:0006824) compared with PCC6803, likely a consequence of the multicellular nature of PCC7120. Close inspection showed that the statistically significant difference in parent nodes "transition metal ion transport" (GO:0000041), "di-, trivalent inorganic cation transport" (GO:0015674), and "metal ion transport" (GO:0030001) is a consequence of the difference in the subfamily "cobalt ion transport" (GO:0006824) rather than a cumulative effect of any other subfamilies. PCC7120 contains significantly more genes than PCC6803 in "protein amino acid phosphorylation" (GO:0006468). These genes are responsible for critical protein kinase functions in the multicellular PCC7120 <abbrgrp><abbr bid="B53">53</abbr><abbr bid="B54">54</abbr><abbr bid="B55">55</abbr></abbrgrp>. The significantly greater number of genes in "nitrogen fixation" (GO:0009399) in PCC7120 is consistent with its ability to fix nitrogen, a function the simpler organism PCC6803 does not have. The "cellular biosynthesis" (GO:0044249) family differs from those above in that it is significantly more abundant in PCC6803 than in PCC7120. This result may be a consequence of PCC6803's rapid growth capability.</p>
            <p>We compared the two genomes with regard to the GO molecular function category and obtained similar results. We then compared them with regard to the GO cellular component category and found three statistically significant differences: "cytoplasm" (GO:0005737), "integral to membrane" (GO:0016021), and "intrinsic to membrane" (GO:0031224), all of which are more abundant in PCC6803 than in PCC7120.</p>
            <p>We compared our results with those from the traditional GO-slim-based, weighted comparison. As shown in Figure <figr fid="F2">2</figr>, the fold difference in the GO-slim-based comparison ranged from 0.7 to 1.5. The fold difference gave only a rough indication of how much PCC6803 and PCC7120 differ in each high-level functional category. In addition, the GO-slim-based approach compares two genomes only at the high level, whereas our approach compares them at every level and every node. Many important functional differences between two genomes may be detectable only at a finer level. For instance, the GO-slim-based approach found little difference between the two cyanobacteria for the GO-slim term "metabolism, GO:0008152" in the Biological Process category (fold difference 1.03), whereas our approach found that the two species differ significantly in the sub-term "nitrogen fixation, GO:0009399", one of the most important known functional differences.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>GOslim-based weighted comparison of PCC6803 and PCC7120</p>
               </caption>
               <text>
                  <p><b>GOslim-based weighted comparison of PCC6803 and PCC7120</b>. The bars show the fold difference between PCC6803 and PCC7120 in each GOslim functional category, calculated as the weighted number of genes belonging to the functional category in PCC6803 divided by that in PCC7120. If there is no difference, the fold difference is equal to 1.</p>
               </text>
               <graphic file="1471-2105-7-374-2"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Effect of different BLAST cutoffs</p>
            </st>
            <p>We varied the BLAST E-value cutoff to study its effect on the number of statistically significant terms detected as well as the number of common terms between adjacent cutoffs. As shown in Figure <figr fid="F3">3</figr>, when the E-value cutoff is high (i.e., less strict, on the left end of the plot), the result is sensitive to the change in cutoff. The results stabilize around cutoff values of 1E-20 to 1E-40. We chose a default cutoff of 1E-20, which coincides with that chosen by GOblet <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>.</p>
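            <p>The two quantities tracked in this sweep (the number of significant terms at each cutoff and the number shared with the neighboring cutoff) can be computed as in the sketch below. The function name and the input format, a list of (cutoff, set of significant terms) pairs ordered as on the x-axis of the plot, are assumptions for illustration; producing the significant-term set for each cutoff is left to the annotation and testing steps described earlier.</p>
```python
def overlap_with_right_neighbor(results):
    """Summarize a BLAST cutoff sweep.

    results is a list of (cutoff, significant_terms) pairs ordered as on
    the x-axis of the plot. Each output row is
    (cutoff, number_significant, number_shared_with_right_neighbor).
    The last cutoff has no right neighbor, so its shared count is None.
    """
    rows = []
    for position, (cutoff, terms) in enumerate(results):
        terms = set(terms)
        if position + 1 == len(results):
            shared = None
        else:
            right_terms = set(results[position + 1][1])
            shared = len(terms.intersection(right_terms))
        rows.append((cutoff, len(terms), shared))
    return rows
```
            <p>A region of the sweep can be read as stable when the shared count stays close to the number of significant terms across adjacent cutoffs.</p>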
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Effect of different BLAST cutoffs on GO results</p>
               </caption>
               <text>
                  <p><b>Effect of different BLAST cutoffs on GO results</b>. This figure illustrates how much the result changes when the BLAST cutoff is changed. Circles show the number of significant GO terms ("Sigcount") at each cutoff. The symbol '&#215; ' indicates the number of significant GO terms common between a given cutoff and its nearest right neighbor. The BLAST cutoff values range from 1E-100 to 10 (1E-100, 1E-90, 1E-80, 1E-70, 1E-60, 1E-50, 1E-40, 1E-30, 1E-20, 1E-10, 1E-5, 0.001, 0.01, 1, 10). The results stabilize around cutoff values of 1E-20 to 1E-40. We chose a default cutoff of 1E-20.</p>
               </text>
               <graphic file="1471-2105-7-374-3"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Effect of partial data</p>
            </st>
            <p>Using the GO-based comparison method, we compared the human and mouse genomes and found 458 statistically significantly different GO terms. We randomly sampled 90% of each of the input gene sets 1,000 times and compared the statistically significantly different GO terms from each sampling with those from the whole data. As shown in Figure <figr fid="F4">4</figr> (hatched bars, "Common GO terms"), most of the GO terms occurred in the majority of the samplings; 298 of the 458 GO terms occurred in all 1,000 samplings, whereas at the lower extreme three GO terms occurred only 169 times. This analysis offers an additional measure of reliability of the significant terms detected: the more often a term occurs in the samples, the more reliable it may be. We plotted the distribution of the "unique GO terms" (significant terms detected in one or more of the samples but not in the whole data set) and found that they occurred in as few as one and as many as 247 samples (Figure <figr fid="F4">4</figr>, open bars). As shown in Figure <figr fid="F4">4</figr>, the histogram distributions of the common and unique GO terms overlap slightly. We also sampled 60%, 70%, and 80% of the input genes and observed similar patterns (see the supplementary figures in <supplr sid="S1">Additional file 1</supplr>). We performed the same analysis on the comparison of the two yeast genomes, <it>Saccharomyces cerevisiae </it>and <it>Schizosaccharomyces pombe</it>, and again observed similar patterns (Figure <figr fid="F5">5</figr> and supplementary figures in <supplr sid="S1">Additional file 1</supplr>).</p>
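            <p>The sampling analysis can be outlined as below. This is a sketch under assumptions: the function names are hypothetical, the gene sets are passed as lists, and <it>find_significant</it> stands for any caller-supplied function that takes two gene lists and returns the significant GO terms for that pair (for example, annotation followed by the chi-squared test and FDR correction).</p>
```python
import random
from collections import Counter


def sampling_frequencies(genes_1, genes_2, find_significant,
                         fraction=0.9, repeats=1000, seed=0):
    """Count, for each GO term, the number of samplings in which it is significant.

    genes_1 and genes_2 are lists of gene identifiers. In every repeat a
    random subset of each list (without replacement) is passed to the
    caller-supplied find_significant(subset_1, subset_2).
    """
    rng = random.Random(seed)  # fixed seed so a run can be reproduced
    size_1 = int(round(fraction * len(genes_1)))
    size_2 = int(round(fraction * len(genes_2)))
    frequency = Counter()
    for _ in range(repeats):
        subset_1 = rng.sample(genes_1, size_1)
        subset_2 = rng.sample(genes_2, size_2)
        frequency.update(set(find_significant(subset_1, subset_2)))
    return frequency


def split_common_unique(frequency, whole_data_terms):
    """Split sampled terms into those also significant in the whole data
    ("common") and those significant only in samplings ("unique")."""
    common = {t: n for t, n in frequency.items() if t in whole_data_terms}
    unique = {t: n for t, n in frequency.items() if t not in whole_data_terms}
    return common, unique
```
            <p>The per-term counts returned here correspond to the x-axis of the histograms, so binning them (for example in groups of 50) gives the distributions of common and unique terms.</p>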
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Histogram of sampling analysis results of the comparison between human and mouse genomes</p>
               </caption>
               <text>
                  <p><b>Histogram of sampling analysis results of the comparison between human and mouse genomes</b>. The x-axis shows the number of samplings containing a significant GO term, grouped in bins of 50. The y-axis shows the number of terms. The "Common GO terms" are those that occur both in the results from the complete data sets and in at least one sampling. The "Unique GO terms" are those that occur in the results from one or more samplings, but not in the results from the whole data set. For example, the right-most bar shows that 346 "Common GO terms" occurred in the results from 950 or more samples; the left-most bar shows that 109 "Unique GO terms" occurred in the results from fewer than 50 samples.</p>
               </text>
               <graphic file="1471-2105-7-374-4"/>
            </fig>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Histogram of sampling analysis results of the comparison between <it>Saccharomyces cerevisiae </it>and <it>Schizosaccharomyces pombe </it>genomes</p>
               </caption>
               <text>
                  <p><b>Histogram of sampling analysis results of the comparison between <it>Saccharomyces cerevisiae </it>and <it>Schizosaccharomyces pombe </it>genomes</b>.</p>
               </text>
               <graphic file="1471-2105-7-374-5"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>BLAST and InterProScan are the two most widely used automated GO annotation methods. BLAST is the preferred choice for genome-scale annotation because it runs much faster and, perhaps more importantly, can assign many more GO terms than InterProScan can. We also used InterProScan to annotate and compare PCC6803 and PCC7120, and found that it missed some important differences, including "nitrogen fixation, GO:0009399". However, BLAST has its own limitations. Accurate functional assignment is difficult in cases where the match is less well defined due to lower sequence similarity <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>. In future research we will investigate how to combine results from BLAST and InterProScan to improve annotation quality, and how to use grid computing to reduce computation time.</p>
         <p>We used a BLAST E-value cutoff as the criterion for assigning GO terms. Local sequence alignment programs such as BLAST may prefer short strong matches to long weak matches, which can lead to inaccurate GO assignment. The strict E-value cutoff we chose in our analysis ensured the relatively high quality of the results. It has been reported that a match between two sequences is most likely reliable if the alignment is at least 70 residues in length with at least 40% sequence identity <abbrgrp><abbr bid="B58">58</abbr></abbrgrp>. We investigated the quality of the high-scoring segment pairs (HSPs) in our BLAST results (details are provided in <supplr sid="S1">Additional file 1</supplr>). With an E-value cutoff of 1e-20, the minimum HSP length was 64 and the minimum sequence identity was 68%. Thus the assignments in our results were reliable. False negatives may occur with such a strict cutoff; in our analysis we preferred accuracy to coverage. Others can use different criteria depending on their individual goals. The statistical testing we propose in this paper is independent of the GO assignment method. We suggest repeating the comparison with different E-value cutoffs and different subsets of the input gene sets, and comparing the results, to identify the most reliable differences between two genomes.</p>
         <p>In any GO analysis, the quality of the original GO annotation is critical. The GO annotation data are continuously expanded; however, the present data are incomplete and noisy <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>, and the annotation quality is uneven, with a mix of literature-supported annotations and those inferred automatically. We did not modify the GO annotation data for our present study, but further research will consider the quality of the original GO annotations when assessing the reliability of the results. One limitation of our approach is that it compares only the number of genes in each functional category; it cannot capture differences in the level of gene expression. An inherent limitation of GO itself is that it does not map directly to pathways. As a result, GO-based comparison cannot detect differences at the pathway level. We have recently used the KEGG Orthology (KO) as an alternative controlled vocabulary in a KO-Based Annotation System (KOBAS) and demonstrated that KOBAS is effective in automated annotation and pathway identification <abbrgrp><abbr bid="B60">60</abbr></abbrgrp>. In future research we will investigate KO-based comparison to compare two genomes at the pathway level.</p>
         <p>Our goal is to achieve higher confidence in the differences detected between two genomes. Towards this end, we applied rigorous statistical testing followed by FDR correction instead of simply relying on fold changes. We also tested a wide range of BLAST cutoff values and different subsets of the input genes to provide additional measures of confidence in the results. If results beyond those having the highest confidence are required, then the cutoff values can be relaxed. The advantage of the statistical approach presented here is that, no matter what cutoff values are chosen, the resulting <it>p</it>-values, <it>q</it>-values, and sampling analysis can be used to assess the confidence in the results.</p>
         <p>There are other procedures available to correct false positive rates resulting from multiple testing, including the Bonferroni correction, Sidak stepwise correction, Holm stepwise correction, Hochberg's stepwise correction, and others <abbrgrp><abbr bid="B61">61</abbr><abbr bid="B62">62</abbr></abbrgrp>. We chose the FDR correction because of its overall high quality and computational speed <abbrgrp><abbr bid="B63">63</abbr><abbr bid="B64">64</abbr></abbrgrp>. It is also the most common procedure used in GO-related and microarray analyses <abbrgrp><abbr bid="B62">62</abbr><abbr bid="B65">65</abbr><abbr bid="B66">66</abbr></abbrgrp>.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>In contrast to the situation in microarray analysis, there is a surprising lack of statistical approaches used in GO-based comparison of two complete genomes. Our work is the first to propose and test a statistical approach to comparing the complete sets of genes in two whole genomes at all levels of GO, and to study the effect of varying BLAST cutoffs and of using subsets of the input gene sets. We believe that such an approach can provide a measure of confidence in the identified differences and help ensure the reliability of the results.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>Supplementary materials and related programs for the paper are provided on-line [See <supplr sid="S1">Additional file 1</supplr>].</p>
         <sec>
            <st>
               <p>Whole-genome GO annotation</p>
            </st>
            <p>We set the default BLAST E-value cutoff to 1E-20; the effect of the cutoff on the final results is examined in the third part of the Results. We parsed the BLAST result to obtain the GOA ID for the top hit and used the ID to query the GOA association database to retrieve the corresponding GO annotation and assign it to the query sequence. The result was written to a file in the format specified by the GO Consortium <abbrgrp><abbr bid="B67">67</abbr></abbrgrp>.</p>
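            <p>The top-hit assignment step can be illustrated with a minimal Python sketch. It is not the authors' program: the <it>Hit</it> record, the <it>assign_go</it> function, the in-memory <it>goa</it> lookup table and the identifiers "P1" to "P3" are hypothetical, and the top hit is assumed here to be the hit with the smallest E-value for each query.</p>
            <p><![CDATA[
```python
from typing import Dict, List, NamedTuple, Set


class Hit(NamedTuple):
    query: str     # query gene identifier
    subject: str   # identifier of the matched database entry (hypothetical)
    evalue: float  # BLAST E-value of the match


def assign_go(hits: List[Hit], goa: Dict[str, Set[str]],
              cutoff: float = 1e-20) -> Dict[str, Set[str]]:
    """Transfer the GO terms of each query's best hit if it passes the cutoff."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.query)
        # Assumption: the "top hit" is the one with the smallest E-value.
        if current is None or hit.evalue < current.evalue:
            best[hit.query] = hit

    annotation: Dict[str, Set[str]] = {}
    for query, hit in best.items():
        # Queries whose best hit fails the cutoff, or has no GO annotation,
        # are left unannotated.
        if hit.evalue <= cutoff and hit.subject in goa:
            annotation[query] = set(goa[hit.subject])
    return annotation


# Toy input: geneA has two hits, geneB only a weak one.
hits = [
    Hit("geneA", "P1", 1e-50),
    Hit("geneA", "P2", 1e-30),
    Hit("geneB", "P3", 1e-5),
]
goa = {"P1": {"GO:0009399"}, "P2": {"GO:0008150"}, "P3": {"GO:0003674"}}
result = assign_go(hits, goa)
```
]]></p>
            <p>In this toy input only geneA receives an annotation, because the single hit for geneB does not pass the 1E-20 cutoff.</p>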
            <p>We parsed the Gene Ontology DAGs and stored the GO terms and their hierarchical relationships in a local data structure. The genes in a genome are linked to GO terms using the aforementioned approach; they are also linked to all parent GO terms by propagating the annotations up the DAG. If a gene has been assigned more than one GO term and these terms share a common parent GO term, the gene is counted only once in the parent GO term. Finally, the numbers of genes assigned to each GO term in the DAGs are tallied, representing the abundance of genes in each GO function within the genome.</p>
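            <p>A minimal Python sketch of the propagation and counting step follows. The function names, the child-to-parents dictionary and the toy term labels are assumptions made for illustration; the authors' own data structure is not described in detail.</p>
            <p><![CDATA[
```python
from collections import Counter
from typing import Dict, Set


def ancestors(term: str, parents: Dict[str, Set[str]],
              cache: Dict[str, Set[str]]) -> Set[str]:
    """Return every ancestor of a term in the DAG (memoised)."""
    if term not in cache:
        found: Set[str] = set()
        for parent in parents.get(term, ()):
            found.add(parent)
            found |= ancestors(parent, parents, cache)
        cache[term] = found
    return cache[term]


def count_genes_per_term(gene_terms: Dict[str, Set[str]],
                         parents: Dict[str, Set[str]]) -> Counter:
    """Count genes per term after propagating annotations to all ancestors."""
    cache: Dict[str, Set[str]] = {}
    counts: Counter = Counter()
    for terms in gene_terms.values():
        expanded = set(terms)
        for term in terms:
            expanded |= ancestors(term, parents, cache)
        # 'expanded' is a set, so a gene reaching the same parent through
        # several child terms is counted only once for that parent.
        counts.update(expanded)
    return counts


# Toy DAG (hypothetical labels): C has two parents, A and B; both lead to root.
parents = {"A": {"root"}, "B": {"root"}, "C": {"A", "B"}}
gene_terms = {"gene1": {"A", "B"}, "gene2": {"C"}}
counts = count_genes_per_term(gene_terms, parents)
```
]]></p>
            <p>Here gene1 is annotated with both A and B, yet it contributes only one count to their shared parent, so the root term is tallied twice in total (once per gene).</p>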
            <p>The complete sets of known and predicted genes in the PCC6803 and PCC7120 genomes were downloaded from Cyanobase <abbrgrp><abbr bid="B68">68</abbr></abbrgrp>. The PCC6803 genome contains 3,573,470 bp with 3,167 predicted ORFs; the PCC7120 genome contains 6,413,771 bp with 5,362 predicted ORFs.</p>
         </sec>
         <sec>
            <st>
               <p>Testing the statistical significance of detected differences between genomes</p>
            </st>
            <p>The goal is to identify all GO terms for which two genomes (A and B) are statistically significantly different. Define:</p>
            <p>N = the total number of annotated genes in Genome A</p>
            <p>n = the total number of annotated genes in Genome B</p>
            <p>X = the number of genes in Genome A that are assigned the GO term currently under consideration</p>
            <p>x = the number of genes in Genome B that are assigned the GO term currently under consideration</p>
            <p>We used the chi-squared test to assess whether the ratios, <graphic file="1471-2105-7-374-i1.gif"/> and <graphic file="1471-2105-7-374-i2.gif"/>, come from the same distribution, that is, to decide between:</p>
            <p>H<sub>0</sub>: <b><it>p</it><sub>0 </sub></b>= <b><it>p</it><sub>1 </sub></b>or</p>
            <p>H<sub>1</sub>: <b><it>p</it><sub>0 </sub></b>&#8800; <b><it>p</it><sub>1</sub></b></p>
            <p>The <it>p-</it>value is calculated as the upper tail probability of the chi-squared distribution with one degree of freedom using the CPAN Statistics::Distributions module <abbrgrp><abbr bid="B69">69</abbr></abbrgrp>.</p>
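            <p>For illustration, the test for a single GO term can be sketched in Python using the symbols N, n, X and x defined above. This is a sketch under stated assumptions rather than the authors' Perl implementation: it uses Pearson's statistic on the 2x2 table without a continuity correction (the text does not say whether one was applied), and it obtains the one-degree-of-freedom upper tail probability from the complementary error function. The function name is hypothetical.</p>
            <p><![CDATA[
```python
import math
from typing import Tuple


def chi2_two_genomes(X: int, N: int, x: int, n: int) -> Tuple[float, float]:
    """Pearson chi-squared test (1 df) comparing X/N with x/n.

    The 2x2 table is [[X, N - X], [x, n - x]].
    Assumption: no continuity (Yates) correction is applied.
    """
    if N <= 0 or n <= 0:
        raise ValueError("both genomes need at least one annotated gene")
    a, b, c, d = X, N - X, x, n - x
    with_term = a + c       # genes carrying the term in either genome
    without_term = b + d    # genes not carrying the term
    if with_term == 0 or without_term == 0:
        return 0.0, 1.0     # degenerate table: no evidence of a difference
    stat = (N + n) * (a * d - b * c) ** 2 / (N * n * with_term * without_term)
    # For one degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy counts: 30 of 100 genes versus 10 of 100 genes carry the term.
stat, p_value = chi2_two_genomes(30, 100, 10, 100)
```
]]></p>
            <p>With these toy counts the statistic is 12.5. When the two ratios are identical, the statistic is 0 and the <it>p</it>-value is 1.</p>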
            <p>Because the number of tests performed equals the number of GO terms, which may be in the thousands, correction for multiple hypothesis testing is needed to control the overall Type I error rate. We used the commonly applied FDR correction. For every test result, the FDR correction calculates a <it>q-</it>value, which measures the minimum FDR incurred when calling that result significant. A <it>q-</it>value cutoff, &#945; (alpha), guarantees that the expected proportion of false positives is at most &#945; (alpha) among the set of significant features produced <abbrgrp><abbr bid="B52">52</abbr><abbr bid="B66">66</abbr></abbrgrp>. The default for &#945; (alpha) was set to 0.01 in our study. The conservative FDR correction was implemented according to the GenTS package <abbrgrp><abbr bid="B70">70</abbr></abbrgrp>.</p>
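            <p>To show how <it>q</it>-values relate to the <it>p</it>-values, the following Python sketch applies the standard Benjamini-Hochberg step-up procedure. This is a substituted, widely used technique chosen for illustration; it is not claimed to reproduce the GenTS implementation referred to above, and the function name is hypothetical.</p>
            <p><![CDATA[
```python
from typing import List


def bh_qvalues(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each q-value
    # is the minimum of p * m / rank over all equal or larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


alpha = 0.01  # default q-value cutoff stated in the text
pvalues = [0.01, 0.04, 0.03, 0.005]
qvalues = bh_qvalues(pvalues)
significant = [i for i, q in enumerate(qvalues) if q <= alpha]
```
]]></p>
            <p>For these four toy <it>p</it>-values the smallest <it>q</it>-value is 0.02, so no test passes the 0.01 cutoff even though one raw <it>p</it>-value is 0.005.</p>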
            <p>The statistically significantly different GO terms detected between two genomes are stored in text format, sorted by increasing <it>q</it>-value. We also modified the GO TermFinder package <abbrgrp><abbr bid="B71">71</abbr></abbrgrp> to show the results graphically, with different colors showing different levels of significance.</p>
            <p>All related programs are provided in <supplr sid="S1">Additional file 1</supplr>.</p>
         </sec>
         <sec>
            <st>
               <p>Effect of different BLAST cutoffs</p>
            </st>
            <p>We studied how the BLAST cutoff value affects the results of the comparison between the PCC6803 and PCC7120 genomes. We tested a wide range of BLAST E-value cutoffs, from 1E-100 to 10, and recorded the number of statistically significantly different GO terms between the two cyanobacterial genomes at each cutoff. We then recorded the number of statistically significantly different GO terms shared between adjacent cutoffs to show how much the result changes when the cutoff is varied.</p>
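            <p>The adjacent-cutoff comparison amounts to a set intersection, as in the following Python sketch. The function name, the dictionary layout and the placeholder term labels are assumptions for illustration and are not results from this study.</p>
            <p><![CDATA[
```python
from typing import Dict, List, Set, Tuple


def shared_between_adjacent_cutoffs(
        significant_by_cutoff: Dict[float, Set[str]]
) -> List[Tuple[float, float, int]]:
    """For each pair of neighbouring cutoffs, count the shared significant terms."""
    cutoffs = sorted(significant_by_cutoff)  # strictest (smallest) first
    rows = []
    for stricter, looser in zip(cutoffs, cutoffs[1:]):
        shared = significant_by_cutoff[stricter] & significant_by_cutoff[looser]
        rows.append((stricter, looser, len(shared)))
    return rows


# Placeholder term labels, not real results.
significant_by_cutoff = {
    1e-100: {"term_a", "term_b"},
    1e-20: {"term_b", "term_c", "term_d"},
    10.0: {"term_c", "term_d"},
}
rows = shared_between_adjacent_cutoffs(significant_by_cutoff)
```
]]></p>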
         </sec>
         <sec>
            <st>
               <p>Effect of partial data</p>
            </st>
            <p>We performed random sampling to study how the results are affected when only part of the data is used. For each sampling fraction (90%, 80%, 70%, and 60%), we repeatedly selected that fraction of the annotated genes in each genome at random and recomputed the statistically significantly different GO terms. We then compared the result of each sampling with that for the complete data sets and counted the numbers of common and unique GO terms. Because comparison of the two cyanobacteria resulted in too few significant GO terms to make this analysis meaningful, we analyzed the comparison of human <it>vs</it>. mouse and <it>Saccharomyces cerevisiae vs. Schizosaccharomyces pombe</it>. The GO annotations for these four genomes were retrieved from the Gene Ontology Consortium web site.</p>
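            <p>The sampling procedure can be sketched in Python as follows. The names <it>sampling_analysis</it> and <it>toy_detector</it> are hypothetical. The detector is a deliberately simple stand-in (a fixed threshold on the difference in proportions) used only to make the sketch runnable; in the actual analysis this role is played by the chi-squared test with FDR correction.</p>
            <p><![CDATA[
```python
import random
from collections import Counter
from typing import Callable, List, Set, Tuple

Gene = Tuple[str, str]  # (gene identifier, term label): a toy representation


def sampling_analysis(genes_a: List[Gene], genes_b: List[Gene],
                      find_significant: Callable[[List[Gene], List[Gene]], Set[str]],
                      fraction: float = 0.9, rounds: int = 1000, seed: int = 0):
    """Count how often each term is significant across random subsamples."""
    rng = random.Random(seed)
    full = find_significant(genes_a, genes_b)  # result on the complete data
    common: Counter = Counter()  # also significant in the complete data
    unique: Counter = Counter()  # significant only in subsamples
    for _ in range(rounds):
        sub_a = rng.sample(genes_a, round(len(genes_a) * fraction))
        sub_b = rng.sample(genes_b, round(len(genes_b) * fraction))
        for term in find_significant(sub_a, sub_b):
            if term in full:
                common[term] += 1
            else:
                unique[term] += 1
    return full, common, unique


def toy_detector(sample_a: List[Gene], sample_b: List[Gene],
                 threshold: float = 0.2) -> Set[str]:
    """Stand-in detector: flag terms whose proportions differ by > threshold."""
    def proportions(sample):
        tally = Counter(term for _, term in sample)
        return {term: count / len(sample) for term, count in tally.items()}
    pa, pb = proportions(sample_a), proportions(sample_b)
    return {t for t in set(pa) | set(pb)
            if abs(pa.get(t, 0.0) - pb.get(t, 0.0)) > threshold}


# Toy genomes: term T1 is carried by 60% of genes in A but 20% in B.
genes_a = [("a%d" % i, "T1" if i < 60 else "T2") for i in range(100)]
genes_b = [("b%d" % i, "T1" if i < 20 else "T2") for i in range(100)]
full, common, unique = sampling_analysis(genes_a, genes_b, toy_detector,
                                         fraction=0.9, rounds=50)
```
]]></p>
            <p>The two counters correspond to the "Common GO terms" and "Unique GO terms" described above. Binning their values would give histograms of the kind shown in Figures 4 and 5.</p>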
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>ZC and LW conceived of the study; ZC carried out most of the implementation and analysis; XM and LW participated in the analysis; SL participated in the design of statistical tests. All authors participated in preparation of the manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work was supported by the China Ministry of Science and Technology "863" grants. We thank Drs. Arthur Grossman and Devaki Bhaya for insightful discussions. We thank the two anonymous reviewers for helpful suggestions.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Gene ontology: tool for the unification of biology. The Gene Ontology Consortium</p>
            </title>
            <aug>
               <au>
                  <snm>Ashburner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ball</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Blake</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Butler</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Cherry</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Dolinski</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dwight</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Eppig</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Harris</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Hill</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Issel-Tarver</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Kasarskis</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lewis</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Matese</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Richardson</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Ringwald</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Sherlock</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2000</pubdate>
            <volume>25</volume>
            <fpage>25</fpage>
            <lpage>29</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/75556</pubid>
                  <pubid idtype="pmpid" link="fulltext">10802651</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Current annotated genomes in GO web site</p>
            </title>
            <url>http://www.geneontology.org/GO.current.annotations.shtml</url>
         </bibl>
         <bibl id="B3">
            <title>
               <p>SGD</p>
            </title>
            <url>http://www.yeastgenome.org/</url>
         </bibl>
         <bibl id="B4">
            <title>
               <p>FlyBase</p>
            </title>
            <url>http://www.fruitfly.org/</url>
         </bibl>
         <bibl id="B5">
            <title>
               <p>WormBase</p>
            </title>
            <url>http://www.wormbase.org/</url>
         </bibl>
         <bibl id="B6">
            <title>
               <p>MGI</p>
            </title>
            <url>http://www.informatics.jax.org/</url>
         </bibl>
         <bibl id="B7">
            <title>
               <p>TAIR</p>
            </title>
            <url>http://www.arabidopsis.org/</url>
         </bibl>
         <bibl id="B8">
            <title>
               <p>AMIGO</p>
            </title>
            <url>http://www.godatabase.org</url>
         </bibl>
         <bibl id="B9">
            <title>
               <p>GOA</p>
            </title>
            <url>ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/</url>
         </bibl>
         <bibl id="B10">
            <title>
               <p>UniProt: the Universal Protein knowledgebase</p>
            </title>
            <aug>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Bairoch</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>CH</fnm>
               </au>
               <au>
                  <snm>Barker</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Boeckmann</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Ferro</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gasteiger</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Magrane</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Natale</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>O'Donovan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Redaschi</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Yeh</snm>
                  <fnm>LS</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32 Database issue</volume>
            <fpage>D115</fpage>
            <lpage>9</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1093/nar/gkh131</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology</p>
            </title>
            <aug>
               <au>
                  <snm>Camon</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Magrane</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Barrell</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Dimmer</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Maslen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Binns</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Harte</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32 Database issue</volume>
            <fpage>D262</fpage>
            <lpage>6</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1093/nar/gkh021</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>InterPro</p>
            </title>
            <url>http://www.ebi.ac.uk/interpro</url>
         </bibl>
         <bibl id="B13">
            <title>
               <p>The InterPro Database, 2003 brings increased coverage and new features</p>
            </title>
            <aug>
               <au>
                  <snm>Mulder</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Attwood</snm>
                  <fnm>TK</fnm>
               </au>
               <au>
                  <snm>Bairoch</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Barrell</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Bateman</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Binns</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Biswas</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bradley</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bucher</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Copley</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Courcelle</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Das</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Falquet</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Fleischmann</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Griffiths-Jones</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Haft</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Harte</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Hulo</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Kahn</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kanapin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Krestyaninova</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Letunic</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Lonsdale</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Silventoinen</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Orchard</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Pagni</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Peyruc</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ponting</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Selengut</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Servant</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Sigrist</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Vaughan</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Zdobnov</snm>
                  <fnm>EM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>315</fpage>
            <lpage>318</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">165493</pubid>
                  <pubid idtype="pmpid" link="fulltext">12520011</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg046</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>GoFigure: automated Gene Ontology annotation</p>
            </title>
            <aug>
               <au>
                  <snm>Khan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Situ</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Decker</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Schmidt</snm>
                  <fnm>CJ</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>2484</fpage>
            <lpage>2485</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg338</pubid>
                  <pubid idtype="pmpid" link="fulltext">14668239</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>GOblet: a platform for Gene Ontology annotation of anonymous sequence data</p>
            </title>
            <aug>
               <au>
                  <snm>Groth</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lehrach</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hennig</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>W313</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">441544</pubid>
                  <pubid idtype="pmpid" link="fulltext">15215401</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>OntoBlast function: From sequence similarities directly to potential functional annotations by ontology terms</p>
            </title>
            <aug>
               <au>
                  <snm>Zehetner</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>3799</fpage>
            <lpage>3803</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">168962</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824422</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg555</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>GOtcha: a new method for prediction of protein function assessed by the annotation of seven genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Martin</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Berriman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Barton</snm>
                  <fnm>GJ</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>178</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">535938</pubid>
                  <pubid idtype="pmpid" link="fulltext">15550167</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-5-178</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research</p>
            </title>
            <aug>
               <au>
                  <snm>Conesa</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gotz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Garcia-Gomez</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Terol</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Talon</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Robles</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>3674</fpage>
            <lpage>3676</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti610</pubid>
                  <pubid idtype="pmpid" link="fulltext">16081474</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>InterProScan--an integration platform for the signature-recognition methods in InterPro</p>
            </title>
            <aug>
               <au>
                  <snm>Zdobnov</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <fpage>847</fpage>
            <lpage>848</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/17.9.847</pubid>
                  <pubid idtype="pmpid" link="fulltext">11590104</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Gotrees: predicting go associations from protein domain composition using decision trees</p>
            </title>
            <aug>
               <au>
                  <snm>Hayete</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bienkowska</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Pac Symp Biocomput</source>
            <pubdate>2005</pubdate>
            <fpage>127</fpage>
            <lpage>138</lpage>
            <xrefbib>
               <pubid idtype="pmpid">15759620</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>The sequence and analysis of Trypanosoma brucei chromosome II</p>
            </title>
            <aug>
               <au>
                  <snm>El-Sayed</snm>
                  <fnm>NM</fnm>
               </au>
               <au>
                  <snm>Ghedin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Song</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>MacLeod</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bringaud</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Larkin</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wanless</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hou</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tweedie</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Biteau</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Khalak</snm>
                  <fnm>HG</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Mason</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hannick</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Caler</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Blandin</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Bartholomeu</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Simpson</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Kaul</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Zhao</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Pai</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Van Aken</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Utterback</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Haas</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Koo</snm>
                  <fnm>HL</fnm>
               </au>
               <au>
                  <snm>Umayam</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Suh</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Gerrard</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Leech</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Qi</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Feldblyum</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Salzberg</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tait</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Turner</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Ullu</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Melville</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Adams</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Fraser</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Donelson</snm>
                  <fnm>JE</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>4856</fpage>
            <lpage>4863</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">169936</pubid>
                  <pubid idtype="pmpid" link="fulltext">12907728</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg673</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>The complete genome sequence of the Arabidopsis and tomato pathogen Pseudomonas syringae pv. tomato DC3000</p>
            </title>
            <aug>
               <au>
                  <snm>Buell</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Joardar</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Lindeberg</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Selengut</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Paulsen</snm>
                  <fnm>IT</fnm>
               </au>
               <au>
                  <snm>Gwinn</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Dodson</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Deboy</snm>
                  <fnm>RT</fnm>
               </au>
               <au>
                  <snm>Durkin</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Kolonay</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Madupu</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Daugherty</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Brinkac</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Beanan</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Haft</snm>
                  <fnm>DH</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Davidsen</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Zafar</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Yuan</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Khouri</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Fedorova</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Tran</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Russell</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Berry</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Utterback</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Van Aken</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Feldblyum</snm>
                  <fnm>TV</fnm>
               </au>
               <au>
                  <snm>D'Ascenzo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Deng</snm>
                  <fnm>WL</fnm>
               </au>
               <au>
                  <snm>Ramos</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Alfano</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Cartinhour</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chatterjee</snm>
                  <fnm>AK</fnm>
               </au>
               <au>
                  <snm>Delaney</snm>
                  <fnm>TP</fnm>
               </au>
               <au>
                  <snm>Lazarowitz</snm>
                  <fnm>SG</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>GB</fnm>
               </au>
               <au>
                  <snm>Schneider</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Tang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Bender</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Fraser</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Collmer</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <fpage>10181</fpage>
            <lpage>10186</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">193536</pubid>
                  <pubid idtype="pmpid" link="fulltext">12928499</pubid>
                  <pubid idtype="doi">10.1073/pnas.1731982100</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies</p>
            </title>
            <aug>
               <au>
                  <snm>Haas</snm>
                  <fnm>BJ</fnm>
               </au>
               <au>
                  <snm>Delcher</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Mount</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Wortman</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>RKJ</fnm>
               </au>
               <au>
                  <snm>Hannick</snm>
                  <fnm>LI</fnm>
               </au>
               <au>
                  <snm>Maiti</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ronning</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Rusch</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Town</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Salzberg</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>5654</fpage>
            <lpage>5666</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">206470</pubid>
                  <pubid idtype="pmpid" link="fulltext">14500829</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg770</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Annotation of the Arabidopsis genome</p>
            </title>
            <aug>
               <au>
                  <snm>Wortman</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Haas</snm>
                  <fnm>BJ</fnm>
               </au>
               <au>
                  <snm>Hannick</snm>
                  <fnm>LI</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>RKJ</fnm>
               </au>
               <au>
                  <snm>Maiti</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ronning</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ayele</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Whitelaw</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>OR</fnm>
               </au>
               <au>
                  <snm>Town</snm>
                  <fnm>CD</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>2003</pubdate>
            <volume>132</volume>
            <fpage>461</fpage>
            <lpage>468</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">166989</pubid>
                  <pubid idtype="pmpid" link="fulltext">12805579</pubid>
                  <pubid idtype="doi">10.1104/pp.103.022251</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Automated methods of predicting the function of biological sequences using GO and BLAST</p>
            </title>
            <aug>
               <au>
                  <snm>Jones</snm>
                  <fnm>CE</fnm>
               </au>
               <au>
                  <snm>Baumann</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>AL</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <fpage>272</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1298289</pubid>
                  <pubid idtype="pmpid" link="fulltext">16288652</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-6-272</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Initial sequencing and comparative analysis of the mouse genome</p>
            </title>
            <aug>
               <au>
                  <snm>Waterston</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Lindblad-Toh</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rogers</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Abril</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Agarwal</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Agarwala</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ainscough</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Alexandersson</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>An</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Antonarakis</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Attwood</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Baertsch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Bailey</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Barlow</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Beck</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Berry</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Birren</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bloom</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Botcherby</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bray</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Brent</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Bult</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Burton</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Butler</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Campbell</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Carninci</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Cawley</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chiaromonte</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Chinwalla</snm>
                  <fnm>AT</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Clamp</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Clee</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Collins</snm>
                  <fnm>FS</fnm>
               </au>
               <au>
                  <snm>Cook</snm>
                  <fnm>LL</fnm>
               </au>
               <au>
                  <snm>Copley</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Coulson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Couronne</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Cuff</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Curwen</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Cutts</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Daly</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>David</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Davies</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Delehaunty</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Deri</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dermitzakis</snm>
                  <fnm>ET</fnm>
               </au>
               <au>
                  <snm>Dewey</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Dickens</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Diekhans</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dodge</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Dunn</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Eddy</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Elnitski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Emes</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Eswara</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Eyras</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Felsenfeld</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Fewell</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Flicek</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Foley</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Frankel</snm>
                  <fnm>WN</fnm>
               </au>
               <au>
                  <snm>Fulton</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Fulton</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Furey</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Gage</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Gibbs</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Glusman</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gnerre</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Goldman</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Goodstadt</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Grafham</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Graves</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>Gregory</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Guigo</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Guyer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hayashizaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hillier</snm>
                  <fnm>LW</fnm>
               </au>
               <au>
                  <snm>Hinrichs</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hlavina</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Holzer</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hsu</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Hua</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hunt</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jackson</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Jaffe</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>LS</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Joy</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kamal</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Karlsson</snm>
                  <fnm>EK</fnm>
               </au>
               <au>
                  <snm>Karolchik</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kasprzyk</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kawai</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Keibler</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Kells</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Kirby</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kolbe</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Korf</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Kucherlapati</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Kulbokas</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Kulp</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Landers</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Leger</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Leonard</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Letunic</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lloyd</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lucas</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ma</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Maglott</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Mardis</snm>
                  <fnm>ER</fnm>
               </au>
               <au>
                  <snm>Matthews</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Mauceli</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Mayer</snm>
                  <fnm>JH</fnm>
               </au>
               <au>
                  <snm>McCarthy</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>McCombie</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>McLaren</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>McLay</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>McPherson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Meldrim</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Meredith</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Miner</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Mongin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Montgomery</snm>
                  <fnm>KT</fnm>
               </au>
               <au>
                  <snm>Morgan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mott</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Mullikin</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Muzny</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Nash</snm>
                  <fnm>WE</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>JO</fnm>
               </au>
               <au>
                  <snm>Nhan</snm>
                  <fnm>MN</fnm>
               </au>
               <au>
                  <snm>Nicol</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ning</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Nusbaum</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>O'Connor</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Okazaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Oliver</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Overton-Larty</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Parra</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Pepin</snm>
                  <fnm>KH</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Pevzner</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Plumb</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Pohl</snm>
                  <fnm>CS</fnm>
               </au>
               <au>
                  <snm>Poliakov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ponce</snm>
                  <fnm>TC</fnm>
               </au>
               <au>
                  <snm>Ponting</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Potter</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Quail</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Reymond</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Roe</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Roskin</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Rust</snm>
                  <fnm>AG</fnm>
               </au>
               <au>
                  <snm>Santos</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sapojnikov</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Schultz</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Schultz</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Scott</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Seaman</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Searle</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sharpe</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sheridan</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Shownkeen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sims</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Singer</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Slater</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Smit</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Spencer</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Stabenau</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Stange-Thomann</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sugnet</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Suyama</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tesler</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Torrents</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Trevaskis</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Tromp</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ucla</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ureta-Vidal</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Vinson</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Von Niederhausern</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Wade</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Wall</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Weber</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Weiss</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Wendl</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>West</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Wetterstrand</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Whelan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wierzbowski</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Willey</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wilson</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Winter</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Worley</snm>
                  <fnm>KC</fnm>
               </au>
               <au>
                  <snm>Wyman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Zdobnov</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Zody</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>420</volume>
            <fpage>520</fpage>
            <lpage>562</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01262</pubid>
                  <pubid idtype="pmpid" link="fulltext">12466850</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>GOslim</p>
            </title>
            <url>http://geneontology.org/GO.slims.shtml</url>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Analysis and functional classification of transcripts from the nematode Meloidogyne incognita</p>
            </title>
            <aug>
               <au>
                  <snm>McCarter</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Mitreva</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dante</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Wylie</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Rao</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Pape</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Bowers</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Theising</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Murphy</snm>
                  <fnm>CV</fnm>
               </au>
               <au>
                  <snm>Kloek</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Chiapelli</snm>
                  <fnm>BJ</fnm>
               </au>
               <au>
                  <snm>Clifton</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Bird</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Waterston</snm>
                  <fnm>RH</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <fpage>R26</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">154577</pubid>
                  <pubid idtype="pmpid" link="fulltext">12702207</pubid>
                  <pubid idtype="doi">10.1186/gb-2003-4-4-r26</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Comparative genomics of gene expression in the parasitic and free-living nematodes Strongyloides stercoralis and Caenorhabditis elegans</p>
            </title>
            <aug>
               <au>
                  <snm>Mitreva</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>McCarter</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dante</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Wylie</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Chiapelli</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Pape</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Clifton</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Nutman</snm>
                  <fnm>TB</fnm>
               </au>
               <au>
                  <snm>Waterston</snm>
                  <fnm>RH</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <fpage>209</fpage>
            <lpage>220</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">327096</pubid>
                  <pubid idtype="pmpid" link="fulltext">14762059</pubid>
                  <pubid idtype="doi">10.1101/gr.1524804</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Stein</snm>
                  <fnm>LD</fnm>
               </au>
               <au>
                  <snm>Bao</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Blasiar</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Blumenthal</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Brent</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Chinwalla</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Clarke</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Clee</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Coghlan</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Coulson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>D'Eustachio</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Fitch</snm>
                  <fnm>DH</fnm>
               </au>
               <au>
                  <snm>Fulton</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Fulton</snm>
                  <fnm>RE</fnm>
               </au>
               <au>
                  <snm>Griffiths-Jones</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Harris</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Hillier</snm>
                  <fnm>LW</fnm>
               </au>
               <au>
                  <snm>Kamath</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kuwabara</snm>
                  <fnm>PE</fnm>
               </au>
               <au>
                  <snm>Mardis</snm>
                  <fnm>ER</fnm>
               </au>
               <au>
                  <snm>Marra</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Miner</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Minx</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Mullikin</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Plumb</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Rogers</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schein</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Sohrmann</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Spieth</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Stajich</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Wei</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Willey</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Wilson</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Waterston</snm>
                  <fnm>RH</fnm>
               </au>
            </aug>
            <source>PLoS Biol</source>
            <pubdate>2003</pubdate>
            <volume>1</volume>
            <fpage>E45</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">261899</pubid>
                  <pubid idtype="pmpid" link="fulltext">14624247</pubid>
                  <pubid idtype="doi">10.1371/journal.pbio.0000045</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Profiling gene expression using Onto-Express</p>
            </title>
            <aug>
               <au>
                  <snm>Khatri</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Draghici</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ostermeier</snm>
                  <fnm>GC</fnm>
               </au>
               <au>
                  <snm>Krawetz</snm>
                  <fnm>SA</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>2002</pubdate>
            <volume>79</volume>
            <fpage>266</fpage>
            <lpage>270</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/geno.2002.6698</pubid>
                  <pubid idtype="pmpid" link="fulltext">11829497</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes</p>
            </title>
            <aug>
               <au>
                  <snm>Al-Shahrour</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Diaz-Uriarte</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Dopazo</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <fpage>578</fpage>
            <lpage>580</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg455</pubid>
                  <pubid idtype="pmpid" link="fulltext">14990455</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Ontological analysis of gene expression data: current tools, limitations, and open problems</p>
            </title>
            <aug>
               <au>
                  <snm>Khatri</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Draghici</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>3587</fpage>
            <lpage>3595</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti565</pubid>
                  <pubid idtype="pmpid" link="fulltext">15994189</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>The evolutionary development of the protein complement of photosystem 2</p>
            </title>
            <aug>
               <au>
                  <snm>Raymond</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Blankenship</snm>
                  <fnm>RE</fnm>
               </au>
            </aug>
            <source>Biochim Biophys Acta</source>
            <pubdate>2004</pubdate>
            <volume>1655</volume>
            <fpage>133</fpage>
            <lpage>139</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.bbabio.2003.10.015</pubid>
                  <pubid idtype="pmpid" link="fulltext">15100025</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Genetic tools for cyanobacteria</p>
            </title>
            <aug>
               <au>
                  <snm>Koksharova</snm>
                  <fnm>OA</fnm>
               </au>
               <au>
                  <snm>Wolk</snm>
                  <fnm>CP</fnm>
               </au>
            </aug>
            <source>Appl Microbiol Biotechnol</source>
            <pubdate>2002</pubdate>
            <volume>58</volume>
            <fpage>123</fpage>
            <lpage>137</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00253-001-0864-9</pubid>
                  <pubid idtype="pmpid">11876404</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Cyanobacteria-eukaryotic plant symbioses</p>
            </title>
            <aug>
               <au>
                  <snm>Stewart</snm>
                  <fnm>WD</fnm>
               </au>
               <au>
                  <snm>Rowell</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Rai</snm>
                  <fnm>AN</fnm>
               </au>
            </aug>
            <source>Ann Microbiol (Paris)</source>
            <pubdate>1983</pubdate>
            <volume>134B</volume>
            <fpage>205</fpage>
            <lpage>228</lpage>
            <xrefbib>
               <pubid idtype="pmpid">6139055</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Nitrogen fixation and photosynthetic oxygen evolution in cyanobacteria</p>
            </title>
            <aug>
               <au>
                  <snm>Berman-Frank</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Lundgren</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Falkowski</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Res Microbiol</source>
            <pubdate>2003</pubdate>
            <volume>154</volume>
            <fpage>157</fpage>
            <lpage>164</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0923-2508(03)00029-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">12706503</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Endosymbiosis and evolution of the plant cell</p>
            </title>
            <aug>
               <au>
                  <snm>McFadden</snm>
                  <fnm>GI</fnm>
               </au>
            </aug>
            <source>Curr Opin Plant Biol</source>
            <pubdate>1999</pubdate>
            <volume>2</volume>
            <fpage>513</fpage>
            <lpage>519</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1369-5266(99)00025-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">10607659</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Cyanobacterial-bacterial mat consortia: examining the functional unit of microbial survival and growth in extreme environments</p>
            </title>
            <aug>
               <au>
                  <snm>Paerl</snm>
                  <fnm>HW</fnm>
               </au>
               <au>
                  <snm>Pinckney</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Steppe</snm>
                  <fnm>TF</fnm>
               </au>
            </aug>
            <source>Environ Microbiol</source>
            <pubdate>2000</pubdate>
            <volume>2</volume>
            <fpage>11</fpage>
            <lpage>26</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1462-2920.2000.00071.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">11243256</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Photosynthetic microbes in freezing deserts</p>
            </title>
            <aug>
               <au>
                  <snm>Thomas</snm>
                  <fnm>DN</fnm>
               </au>
            </aug>
            <source>Trends Microbiol</source>
            <pubdate>2005</pubdate>
            <volume>13</volume>
            <fpage>87</fpage>
            <lpage>88</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tim.2004.11.002</pubid>
                  <pubid idtype="pmpid" link="fulltext">15737723</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>The Molecular Biology of Cyanobacteria</p>
            </title>
            <aug>
               <au>
                  <snm>Bryant</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <publisher>Netherlands, Kluwer Academic Publishers</publisher>
            <pubdate>1994</pubdate>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions</p>
            </title>
            <aug>
               <au>
                  <snm>Kaneko</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sato</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kotani</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Tanaka</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Asamizu</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Miyajima</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Hirosawa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sugiura</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sasamoto</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kimura</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hosouchi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Matsuno</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Muraki</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nakazaki</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Naruo</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Okumura</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shimpo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Takeuchi</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wada</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Watanabe</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Yamada</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yasuda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tabata</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>DNA Res</source>
            <pubdate>1996</pubdate>
            <volume>3</volume>
            <fpage>109</fpage>
            <lpage>136</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/dnares/3.3.109</pubid>
                  <pubid idtype="pmpid" link="fulltext">8905231</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Complete genomic sequence of the filamentous nitrogen-fixing cyanobacterium Anabaena sp. strain PCC 7120</p>
            </title>
            <aug>
               <au>
                  <snm>Kaneko</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Wolk</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Kuritz</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sasamoto</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Watanabe</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Iriguchi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ishikawa</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kawashima</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kimura</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kishida</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kohara</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Matsumoto</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Matsuno</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Muraki</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nakazaki</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Shimpo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sugimoto</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Takazawa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yamada</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yasuda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tabata</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>DNA Res</source>
            <pubdate>2001</pubdate>
            <volume>8</volume>
            <fpage>205</fpage>
            <lpage>213; 227-253</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/dnares/8.5.205</pubid>
                  <pubid idtype="pmpid" link="fulltext">11759840</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Survey, analysis and genetic organization of genes encoding eukaryotic-like signaling proteins on a cyanobacterial genome</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Gonzalez</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Phalip</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1998</pubdate>
            <volume>26</volume>
            <fpage>3619</fpage>
            <lpage>3625</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">147778</pubid>
                  <pubid idtype="pmpid" link="fulltext">9685474</pubid>
                  <pubid idtype="doi">10.1093/nar/26.16.3619</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>From genome to enzyme: analysis of key glycolytic and oxidative pentose-phosphate pathway enzymes in the cyanobacterium Synechocystis sp. PCC 6803</p>
            </title>
            <aug>
               <au>
                  <snm>Knowles</snm>
                  <fnm>VL</fnm>
               </au>
               <au>
                  <snm>Plaxton</snm>
                  <fnm>WC</fnm>
               </au>
            </aug>
            <source>Plant Cell Physiol</source>
            <pubdate>2003</pubdate>
            <volume>44</volume>
            <fpage>758</fpage>
            <lpage>763</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/pcp/pcg086</pubid>
                  <pubid idtype="pmpid" link="fulltext">12881504</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Lessons from sequencing of the genome of a unicellular cyanobacterium, Synechocystis sp. PCC6803</p>
            </title>
            <aug>
               <au>
                  <snm>Kotani</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Tabata</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Annu Rev Plant Physiol Plant Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>49</volume>
            <fpage>151</fpage>
            <lpage>171</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.arplant.49.1.151</pubid>
                  <pubid idtype="pmpid" link="fulltext">15012231</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Hydrogenases and hydrogen metabolism of cyanobacteria</p>
            </title>
            <aug>
               <au>
                  <snm>Tamagnini</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Axelsson</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lindberg</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Oxelfelt</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Wunschiers</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lindblad</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Microbiol Mol Biol Rev</source>
            <pubdate>2002</pubdate>
            <volume>66</volume>
            <fpage>1</fpage>
            <lpage>20</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">120778</pubid>
                  <pubid idtype="pmpid" link="fulltext">11875125</pubid>
                  <pubid idtype="doi">10.1128/MMBR.66.1.1-20.2002</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Analysis of the hli gene family in marine and freshwater cyanobacteria</p>
            </title>
            <aug>
               <au>
                  <snm>Bhaya</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Dufresne</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Vaulot</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Grossman</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>FEMS Microbiol Lett</source>
            <pubdate>2002</pubdate>
            <volume>215</volume>
            <fpage>209</fpage>
            <lpage>219</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1574-6968.2002.tb11393.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">12399037</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Comparative genomics analysis of NtcA regulons in cyanobacteria: regulation of nitrogen assimilation and its coupling to photosynthesis</p>
            </title>
            <aug>
               <au>
                  <snm>Su</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Olman</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Mao</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>5156</fpage>
            <lpage>5171</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1214546</pubid>
                  <pubid idtype="pmpid" link="fulltext">16157864</pubid>
                  <pubid idtype="doi">10.1093/nar/gki817</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Cyanobacterial signature genes</p>
            </title>
            <aug>
               <au>
                  <snm>Martin</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Siefert</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Yerrapragada</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>McNeill</snm>
                  <fnm>TZ</fnm>
               </au>
               <au>
                  <snm>Moreno</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Weinstock</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Widger</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Fox</snm>
                  <fnm>GE</fnm>
               </au>
            </aug>
            <source>Photosynth Res</source>
            <pubdate>2003</pubdate>
            <volume>75</volume>
            <fpage>211</fpage>
            <lpage>221</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1023/A:1023990402346</pubid>
                  <pubid idtype="pmpid">16228602</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>POWER_SAGE: comparing statistical tests for SAGE experiments</p>
            </title>
            <aug>
               <au>
                  <snm>Man</snm>
                  <fnm>MZ</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>953</fpage>
            <lpage>959</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/16.11.953</pubid>
                  <pubid idtype="pmpid" link="fulltext">11159306</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Controlling the false discovery rate: a practical and powerful approach to multiple testing</p>
            </title>
            <aug>
               <au>
                  <snm>Benjamini</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hochberg</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>J Roy Statist Soc B</source>
            <pubdate>1995</pubdate>
            <volume>57</volume>
            <fpage>289</fpage>
            <lpage>300</lpage>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Histidine kinases and response regulator proteins in two-component signaling systems</p>
            </title>
            <aug>
               <au>
                  <snm>West</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Stock</snm>
                  <fnm>AM</fnm>
               </au>
            </aug>
            <source>Trends Biochem Sci</source>
            <pubdate>2001</pubdate>
            <volume>26</volume>
            <fpage>369</fpage>
            <lpage>376</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0968-0004(01)01852-7</pubid>
                  <pubid idtype="pmpid" link="fulltext">11406410</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>The molecular puzzle of two-component signaling cascades</p>
            </title>
            <aug>
               <au>
                  <snm>Foussard</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Cabantous</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Pedelacq</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Guillet</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Tranier</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mourey</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Birck</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Samama</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Microbes Infect</source>
            <pubdate>2001</pubdate>
            <volume>3</volume>
            <fpage>417</fpage>
            <lpage>424</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1286-4579(01)01390-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">11369279</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>Histidine protein kinases: key signal transducers outside the animal kingdom</p>
            </title>
            <aug>
               <au>
                  <snm>Wolanin</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Thomason</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Stock</snm>
                  <fnm>JB</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>REVIEWS3013</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">244915</pubid>
                  <pubid idtype="pmpid" link="fulltext">12372152</pubid>
                  <pubid idtype="doi">10.1186/gb-2002-3-10-reviews3013</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Automated Gene Ontology annotation for anonymous sequence data</p>
            </title>
            <aug>
               <au>
                  <snm>Hennig</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Groth</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lehrach</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>3712</fpage>
            <lpage>3715</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">168988</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824400</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg582</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Can sequence determine function?</p>
            </title>
            <aug>
               <au>
                  <snm>Gerlt</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Babbitt</snm>
                  <fnm>PC</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2000</pubdate>
            <volume>1</volume>
            <fpage>REVIEWS0005</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">138884</pubid>
                  <pubid idtype="pmpid" link="fulltext">11178260</pubid>
                  <pubid idtype="doi">10.1186/gb-2000-1-5-reviews0005</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships</p>
            </title>
            <aug>
               <au>
                  <snm>Brenner</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Chothia</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>6073</fpage>
            <lpage>6078</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">27587</pubid>
                  <pubid idtype="pmpid" link="fulltext">9600919</pubid>
                  <pubid idtype="doi">10.1073/pnas.95.11.6073</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>A procedure for assessing GO annotation consistency</p>
            </title>
            <aug>
               <au>
                  <snm>Dolan</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Ni</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Camon</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Blake</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21 Suppl 1</volume>
            <fpage>i136</fpage>
            <lpage>i143</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti1019</pubid>
                  <pubid idtype="pmpid" link="fulltext">15961450</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>Automated genome annotation and pathway identification using the KEGG Orthology (KO) as a controlled vocabulary</p>
            </title>
            <aug>
               <au>
                  <snm>Mao</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Cai</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Olyarchuk</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Wei</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>3787</fpage>
            <lpage>3793</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti430</pubid>
                  <pubid idtype="pmpid" link="fulltext">15817693</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>GO::TermFinder--open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes</p>
            </title>
            <aug>
               <au>
                  <snm>Boyle</snm>
                  <fnm>EI</fnm>
               </au>
               <au>
                  <snm>Weng</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gollub</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Jin</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Cherry</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Sherlock</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <fpage>3710</fpage>
            <lpage>3715</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bth456</pubid>
                  <pubid idtype="pmpid" link="fulltext">15297299</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>Controlling the familywise error rate in functional neuroimaging: a comparative review</p>
            </title>
            <aug>
               <au>
                  <snm>Nichols</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hayasaka</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Stat Methods Med Res</source>
            <pubdate>2003</pubdate>
            <volume>12</volume>
            <fpage>419</fpage>
            <lpage>446</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1191/0962280203sm341ra</pubid>
                  <pubid idtype="pmpid" link="fulltext">14599004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>Identifying differentially expressed genes using false discovery rate controlling procedures</p>
            </title>
            <aug>
               <au>
                  <snm>Reiner</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Yekutieli</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Benjamini</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>368</fpage>
            <lpage>375</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btf877</pubid>
                  <pubid idtype="pmpid" link="fulltext">12584122</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B64">
            <title>
               <p>Comparison of false discovery rate methods in identifying genes with differential expression</p>
            </title>
            <aug>
               <au>
                  <snm>Qian</snm>
                  <fnm>HR</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>2005</pubdate>
            <volume>86</volume>
            <fpage>495</fpage>
            <lpage>503</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.ygeno.2005.06.007</pubid>
                  <pubid idtype="pmpid" link="fulltext">16054333</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B65">
            <title>
               <p>From patterns to pathways: gene expression data analysis comes of age</p>
            </title>
            <aug>
               <au>
                  <snm>Slonim</snm>
                  <fnm>DK</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2002</pubdate>
            <volume>32 Suppl</volume>
            <fpage>502</fpage>
            <lpage>508</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1033</pubid>
                  <pubid idtype="pmpid" link="fulltext">12454645</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>Statistical significance for genomewide studies</p>
            </title>
            <aug>
               <au>
                  <snm>Storey</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <fpage>9440</fpage>
            <lpage>9445</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">170937</pubid>
                  <pubid idtype="pmpid" link="fulltext">12883005</pubid>
                  <pubid idtype="doi">10.1073/pnas.1530509100</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B67">
            <title>
               <p>GO annotation file format</p>
            </title>
            <url>http://www.geneontology.org/GO.annotation.html#file</url>
         </bibl>
         <bibl id="B68">
            <title>
               <p>Cyanobase</p>
            </title>
            <url>http://www.kazusa.or.jp/cyanobase/</url>
         </bibl>
         <bibl id="B69">
            <title>
               <p>Statistics::Distributions modules</p>
            </title>
            <url>http://search.cpan.org/~mikek/Statistics-Distributions-1.02/Distributions.pm</url>
         </bibl>
         <bibl id="B70">
            <title>
               <p>GeneTS</p>
            </title>
            <url>http://www.strimmerlab.org/software/genets/</url>
         </bibl>
         <bibl id="B71">
            <title>
               <p>GO TermFinder package</p>
            </title>
            <url>http://search.cpan.org/~sherlock/GO-TermFinder-0.64/</url>
         </bibl>
      </refgrp>
   </bm>
</art>
