<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-7-57</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Using pyrosequencing to shed light on deep mine microbial ecology</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Edwards</snm>
               <mi>A</mi>
               <fnm>Robert</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <insr iid="I3"/>
               <insr iid="I4"/>
               <email>redwards@salmonella.org</email>
            </au>
            <au id="A2">
               <snm>Rodriguez-Brito</snm>
               <fnm>Beltran</fnm>
               <insr iid="I1"/>
               <insr iid="I3"/>
               <email>beltran.rodriguezbrito@gmail.com</email>
            </au>
            <au id="A3">
               <snm>Wegley</snm>
               <fnm>Linda</fnm>
               <insr iid="I1"/>
               <email>lwegley@gmail.com</email>
            </au>
            <au id="A4">
               <snm>Haynes</snm>
               <fnm>Matthew</fnm>
               <insr iid="I1"/>
               <email>mhaynes@projects.sdsu.edu</email>
            </au>
            <au id="A5">
               <snm>Breitbart</snm>
               <fnm>Mya</fnm>
               <insr iid="I1"/>
               <email>mya@marine.usf.edu</email>
            </au>
            <au id="A6">
               <snm>Peterson</snm>
               <mi>M</mi>
               <fnm>Dean</fnm>
               <insr iid="I5"/>
               <email>dpeters1@nrri.umn.edu</email>
            </au>
            <au id="A7">
               <snm>Saar</snm>
               <mi>O</mi>
               <fnm>Martin</fnm>
               <insr iid="I6"/>
               <email>saar@tc.umn.edu</email>
            </au>
            <au id="A8">
               <snm>Alexander</snm>
               <fnm>Scott</fnm>
               <insr iid="I6"/>
               <email>alexa017@umn.edu</email>
            </au>
            <au id="A9">
               <snm>Alexander</snm>
               <fnm>E Calvin</fnm>
               <suf>Jr</suf>
               <insr iid="I6"/>
               <email>alexa001@umn.edu</email>
            </au>
            <au id="A10">
               <snm>Rohwer</snm>
               <fnm>Forest</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>frohwer@sunstroke.sdsu.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Biology, San Diego State University, San Diego, USA.</p>
            </ins>
            <ins id="I2">
               <p>Center for Microbial Sciences, San Diego State University, San Diego, USA.</p>
            </ins>
            <ins id="I3">
               <p>Computational Science Research Center, San Diego State University, San Diego, USA.</p>
            </ins>
            <ins id="I4">
               <p>Fellowship for Interpretation of Genomes, Burr Ridge, USA.</p>
            </ins>
            <ins id="I5">
               <p>Natural Resources Research Institute, Department of Geological Sciences, University of Minnesota, Duluth, USA.</p>
            </ins>
            <ins id="I6">
               <p>Department of Geology and Geophysics, University of Minnesota, Minneapolis, USA.</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2006</pubdate>
         <volume>7</volume>
         <issue>1</issue>
         <fpage>57</fpage>
         <url>http://www.biomedcentral.com/1471-2164/7/57</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">16549033</pubid>
               <pubid idtype="doi">10.1186/1471-2164-7-57</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>04</day>
               <month>11</month>
               <year>2005</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>20</day>
               <month>3</month>
               <year>2006</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>20</day>
               <month>3</month>
               <year>2006</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2006</year>
         <collab>Edwards et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Contrasting biological, chemical and hydrogeological analyses highlights the fundamental processes that shape different environments. Generating and interpreting the biological sequence data was a costly and time-consuming process in defining an environment. Here we have used pyrosequencing, a rapid and relatively inexpensive sequencing technology, to generate environmental genome sequences from two sites in the Soudan Mine, Minnesota, USA. These sites were adjacent to each other, but differed significantly in chemistry and hydrogeology.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Comparisons of the microbes and the subsystems identified in the two samples highlighted important differences in metabolic potential between the two environments. The microbes were performing distinct biochemistry on the available substrates, and subsystems such as carbon utilization, iron acquisition mechanisms, nitrogen assimilation, and respiratory pathways separated the two communities. Although much of the observed microbial metabolism could be explained by the geochemical conditions from which the samples were isolated, the reason for the presence of many pathways in these environments remains to be determined. Despite being physically close, these two communities were markedly different from each other. In addition, the communities were completely different from other microbial communities sequenced to date.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>We anticipate that pyrosequencing will be widely used to sequence environmental samples because of the speed, cost, and technical advantages. Furthermore, subsystem comparisons rapidly identify the important metabolisms employed by the microbes in different environments.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Banded iron formations started appearing ~3,700 million years ago when localized sea floor cyanobacterial photosynthesis raised oxygen concentrations high enough that dissolved iron precipitated. That iron powered the industrial revolution. The Soudan Iron Mine in Minnesota, USA was active from 1884 to 1962, and during this period 17.9 million tons of iron ore, primarily hematite, were removed. Nowadays the mine is used as a state park and as a facility for high-energy physics experiments.</p>
         <p>Metagenomics is a term used to describe "the functional and sequence-based analysis of the collective microbial genomes contained in an environmental sample"<abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. Random shotgun sequencing of DNA from natural communities has been used to characterize seawater, sediment, and fecal viral communities <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp>, as well as the microbial communities in soil, whale falls, seawater and the Iron Mountain acid mine drainage (AMD) <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>. Comparative metagenomics, which was introduced recently<abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, identifies the sets of genes that distinguish environmental samples. For example, samples from the surface of the ocean contain significantly more photosynthetic genes than soil or other samples<abbrgrp><abbr bid="B6">6</abbr><abbr bid="B8">8</abbr><abbr bid="B10">10</abbr></abbrgrp>. We have used comparative metagenomics to characterize the metabolic potential of different environments, and to identify the genes, pathways, and subsystems that are more common in a particular environment <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>.</p>
         <p>Most current sequencing is a modification of the classical Sanger method, where extending DNA fragments are stopped by the random incorporation of a fluorescently labeled ddNTP. The different-sized fragments are then separated using capillary gel electrophoresis and detected with a laser. Pyrosequencing is a fundamentally different methodology because only one dNTP is added into the reaction at a time <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr></abbrgrp>. If the added dNTP is complementary to the next base of the template, the DNA polymerase incorporates it and releases pyrophosphate. ATP sulfurylase uses the pyrophosphate to produce ATP in the presence of adenosine 5' phosphosulfate (APS). A charge-coupled device (CCD) measures the light produced when the ATP is used by luciferase to convert luciferin to oxyluciferin. 454 Life Sciences has scaled this process up to be massively parallel, determining the composition of more than 300,000 sequences at once, for approximately the same price as 96 to 192 sequencing reactions performed using traditional chemistries<abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. In addition to the massive parallelization, the 454 technology does not require cloning of the environmental samples, thus eliminating many of the problems that are associated with this step of metagenomics<abbrgrp><abbr bid="B2">2</abbr></abbrgrp>.</p>
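         <p>The flow-by-flow logic described above can be illustrated with a minimal Python sketch. It assumes an idealized, error-free reaction in which the light signal equals the number of bases incorporated during a flow (so a run of identical bases gives a proportionally larger signal) and a fixed cyclic flow order of T, A, C, G. Both are simplifying assumptions rather than details given in this report, and the function names are hypothetical.</p>
         <p>
```python
def simulate_flows(read, flow_order="TACG"):
    """Return (nucleotide, signal) pairs for an idealized pyrosequencing run.

    `read` is the strand being synthesized. In each flow one nucleotide is
    offered; the signal is the number of consecutive bases it extends.
    """
    unknown = set(read) - set(flow_order)
    if unknown:
        raise ValueError("bases not covered by the flow order: %s" % sorted(unknown))
    signals = []
    pos = 0
    while pos != len(read):
        for base in flow_order:
            run = 0
            # Count how many identical bases are incorporated in this flow.
            while pos + run != len(read) and read[pos + run] == base:
                run += 1
            signals.append((base, run))
            pos += run
            if pos == len(read):
                break
    return signals


def call_bases(signals):
    """Rebuild the sequence from the per-flow signals."""
    return "".join(base * run for base, run in signals)


flows = simulate_flows("TTAGC")
print(flows)
print(call_bases(flows))
```
         </p>
         <p>A flow with a signal of zero simply means that the offered nucleotide was not complementary at that position, which is why several flows may pass before the next base is called.</p>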
         <p>This report describes the first application of pyrosequencing to environmental samples. From these sequence data, we identify the 16S rDNA sequences present in each sample, and apply new annotation methods to the data using the SEED database<abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. This paper also describes a comprehensive statistical treatment of the genes identified in each sample using a novel methodology that exploits the differences between metagenome sequences. We demonstrate that distinct microbial communities inhabit proximate environments joined by a common watercourse, and that metagenomics can identify the metabolic potentials prevalent in each environment, such as their mechanisms of iron acquisition and respiration. The integration of pyrosequencing, subsystems analysis, comparative metagenomics, statistics, hydrogeology, and chemistry provides a comprehensive systems analysis of the Soudan Mine.</p>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <sec>
            <st>
               <p>Description of the environmental samples</p>
            </st>
            <p>The two environments sampled within the Soudan Mine are shown in Figure <figr fid="F1">1</figr>. Groundwater is not very abundant in the banded iron formations of the mine at our sampling depth of 714 m below the surface. However, small amounts of water emerge steadily from exploration boreholes that extend downward from the deepest level of the mine. The two sets of samples were collected from water trickling out of two such boreholes that are separated by about 100 meters (Figure <figr fid="F1">1</figr>). This water is a calcium-sodium-chloride solution about twice as salty as seawater. It is anoxic, with up to 150 ppm of dissolved ferrous iron and variable enrichments of several trace elements. At both locations, the water emerging from the boreholes produces cm-scale "Black" environments that appear to extend down into the borehole. The water flowing away from each borehole, on the floor of the mine tunnel, is exposed to the oxygenated mine atmosphere, and transitions to a sequence of "Red" environments within a few cm of the orifices. The oxidized environments are continuously fed by anoxic water flowing from the boreholes. The water in the borehole that yielded the Black sample, like the water at a number of similar sites found throughout the mine, has a pH of 6.70 and a redox potential of -142 mV. Some of the Black areas are associated with bubbling of gas. The Black sediment contained 5.8 &#215; 10<sup>5 </sup>microbes per ml. X-ray diffraction analyses of the minerals in this area show that chlorite-serpentine [(Mg,Al)<sub>6</sub>(Si,Al)<sub>4</sub>O<sub>10</sub>(OH)<sub>8</sub>], clinochlore, ferroan [(Mg,Fe)<sub>6</sub>(Si,Al)<sub>4</sub>O<sub>10</sub>(OH)<sub>8</sub>], quartz, and silinaite [LiNaSiO<sub>5</sub>&#183;HCl] are present in the Black sediments. Water slowly flows from the borehole into the stream running down the main mine tunnel. As the water comes in contact with oxygen in the passageway, the pH rapidly decreases to 4.37 and the redox potential increases to -8 mV. The Red sample contained 1.2 &#215; 10<sup>6 </sup>microbes per ml, and these sediments include goethite [FeO(OH)], followed by szaibelyite [(Mg,Mn)BO<sub>2</sub>(OH)], and sussexite [(Mn,Mg)BO<sub>2</sub>(OH)].</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Sampling from the Soudan Mine</p>
               </caption>
               <text>
                  <p><b>Sampling from the Soudan Mine</b>. The Soudan Mine is an Algoma-type Iron Formation rich in hematite. Panel A shows a cross-section of the mine looking East-North-East at 78.5&#176;. Panel B depicts a three dimensional view of the mine, including the cross-section shown in Panel A, and with the sampling sites shown for the "Red" and "Black" samples. Panel C shows the overall flow of water in the mine at level 27, located 714 meters below the surface (Panel D). Panels E and F show a close up of the two sampling sites.</p>
               </text>
               <graphic file="1471-2164-7-57-1"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>The first two pyrosequences of environmental samples</p>
            </st>
            <p>DNA was purified from the two samples, amplified using the GenomiPhi procedure (GE Healthcare, Piscataway, NJ), and then sequenced by 454 Life Sciences. A summary of the sequence characteristics determined using the pyrosequencing technique is shown in Table <tblr tid="T1">1</tblr>. The raw sequence reads and quality scores [see Additional files <supplr sid="S6">6</supplr> and <supplr sid="S7">7</supplr>] are provided in compressed format.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Summary of pyrosequence data from the Soudan Mine</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>Red Sample</it>
                           </b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>
                              <it>Black Sample</it>
                           </b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Number of Sequences</p>
                     </c>
                     <c ca="left">
                        <p>334,386</p>
                     </c>
                     <c ca="left">
                        <p>388,627</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total Length of Sequences</p>
                     </c>
                     <c ca="left">
                        <p>35,439,683 bp</p>
                     </c>
                     <c ca="left">
                        <p>38,502,057 bp</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Average Length of Sequences</p>
                     </c>
                     <c ca="left">
                        <p>106.0 bp</p>
                     </c>
                     <c ca="left">
                        <p>99.1 bp</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Average Quality Score<sup>1</sup></p>
                     </c>
                     <c ca="left">
                        <p>26.2</p>
                     </c>
                     <c ca="left">
                        <p>25.8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Skew<sup>2</sup></p>
                     </c>
                     <c ca="left">
                        <p>2.53</p>
                     </c>
                     <c ca="left">
                        <p>2.44</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>1 </sup>The quality score of each base was provided by 454 Life Sciences, and is analogous to the Phred score of Sanger sequencing methods [26]. The value cited here is the average of the mean quality score per sequence.</p>
                  <p><sup>2</sup>The skew was calculated by comparing the dinucleotide frequencies within each library. A similar analysis performed on Bacterial and Eukaryotic genomes sampled at random yielded entropies of 2.63 and 2.66 respectively.</p>
               </tblfn>
            </tbl>
            <suppl id="S6">
               <title>
                  <p>Additional File 6</p>
               </title>
               <text>
                  <p>A gzip-compressed archive of the fasta files (those ending .fa.gz) and quality scores (those ending .qual.gz) of sequences from the Red samples as supplied by 454, Inc.</p>
               </text>
               <file name="1471-2164-7-57-S6.zip">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S7">
               <title>
                  <p>Additional File 7</p>
               </title>
               <text>
                  <p>A gzip-compressed archive of the fasta files (those ending .fa.gz) and quality scores (those ending .qual.gz) of sequences from the Black samples as supplied by 454, Inc.</p>
               </text>
               <file name="1471-2164-7-57-S7.zip">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>The two samples produced more than 70 Mbp of sequence data from over 700,000 sequences, and there was no significant skew in the sequence data (as measured by dinucleotide frequency) when the data generated by pyrosequencing was compared to complete genome sequences.</p>
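            <p>Table 1 reports a dinucleotide-based skew statistic, but this section does not give its formula. One plausible reading, offered here only as an assumption, is the Shannon entropy of the 16 dinucleotide frequencies computed with natural logarithms. The Python sketch below implements that reading; the function name and the choice of logarithm base are hypothetical. On this scale a perfectly even dinucleotide distribution gives ln 16, approximately 2.77, and any bias lowers the value.</p>
            <p>
```python
import math
from collections import Counter


def dinucleotide_entropy(sequences):
    """Shannon entropy (natural log) of dinucleotide frequencies.

    Overlapping dinucleotides are counted within each sequence;
    pairs containing anything other than A, C, G, or T are skipped.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - 1):
            pair = seq[i:i + 2]
            if set(pair).issubset("ACGT"):
                counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Toy input only: every dinucleotide exactly once gives the maximum, ln 16.
even = [a + b for a in "ACGT" for b in "ACGT"]
print(dinucleotide_entropy(even))
```
            </p>
            <p>Applied to each read library and to randomly sampled genome fragments, a statistic of this kind would allow the comparison summarized above, although the authors' exact calculation may differ.</p>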
         </sec>
         <sec>
            <st>
               <p>16S rDNA analysis of the samples</p>
            </st>
            <p>The two sequence libraries were compared to the 16S rDNA database from the Ribosomal Database Project<abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. As shown in Figure <figr fid="F2">2</figr>, the Black sample was dominated by Actinomycetales such as <it>Brevibacterium </it>and <it>Corynebacterium </it>that volatilize sulfur via an organic intermediate and can also break down complex heterocyclic and polycyclic ring structures<abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>. In contrast, members of the Chromatiales, including <it>Chromatiaceae</it>, <it>Thiobacillus</it>, and <it>Halothiobacillus</it>, dominate the Red sample. These chemoautotrophic Bacteria often use the Calvin-Benson-Bassham cycle to fix CO<sub>2 </sub>through the oxidation of iron or sulfur, and consequently they would be expected to be present in samples from an iron-rich deposit. These two communities are fundamentally different both from each other and from the community identified in the Iron Mountain metagenome<abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. The community in the Red sample has a much higher species richness than that in the Black sample, and the differences between the Soudan and Iron Mountain mines reflect the iron composition (hematite versus pyrite), temperature, and pH of the various environments<abbrgrp><abbr bid="B7">7</abbr></abbrgrp>.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Composition of the 16S rDNA sequences from the two samples and comparison of 16S sequences from the 454 libraries and a traditional clone library</p>
               </caption>
               <text>
                  <p><b>Composition of the 16S rDNA sequences from the two samples and comparison of 16S sequences from the 454 libraries and a traditional clone library</b>. The percentage of all sequences from each library in each of the orders is shown for the 454-sequenced Black sample (solid black bars; n = 24), the 454-sequenced Red sample (solid red bars; n = 76), and the PCR-amplified clone library (hatched red bars; n = 91).</p>
               </text>
               <graphic file="1471-2164-7-57-2"/>
            </fig>
            <p>A 16S clone library was created from the Red sample to validate the 454 sequencing approach. Ninety-six clones were sequenced using traditional techniques, and compared to the 16S rDNA database from the Ribosomal Database Project <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. The congruity between the 16S genes sequenced in the 454 library and the 16S sequences from the clone library, as shown in Figure <figr fid="F2">2</figr>, is quite remarkable.</p>
            <p>We also used the 16S sequences to evaluate the randomness of the library. An analysis of 160 bacterial genome sequences in the SEED database <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B19">19</abbr></abbrgrp> with annotated 16S genes showed that about 1 in 10<sup>5 </sup>bases is from a 16S gene. Based on this estimate, as a rule of thumb the Soudan samples are expected to contain approximately 3,000 bases of 16S sequence in total, or approximately 30 sequences. Twenty-four sequences from the Black sample and seventy-six sequences from the Red sample had significant similarity to 16S rDNA (an E value less than 1 &#215; 10<sup>-5 </sup>and a match of 50 bp or more).</p>
         </sec>
         <sec>
            <st>
               <p>Metabolic potential from the metagenome library</p>
            </st>
            <p>Sequences from both libraries were compared to the SEED database, a curated database of microbial genomes <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B19">19</abbr></abbrgrp>. Annotation in the SEED proceeds primarily through the development of subsystems, a technique pioneered by the Fellowship for Interpretation of Genomes<abbrgrp><abbr bid="B15">15</abbr><abbr bid="B20">20</abbr></abbrgrp>. Subsystems are groups of genes that function together, such as the genes whose products are involved in a metabolic pathway, or the group of genes whose products make a cellular structure. A summary of the subsystem hits is shown in Figure <figr fid="F3">3</figr>, and all matches to subsystems are provided as supplemental data [see Additional files <supplr sid="S1">1</supplr>, <supplr sid="S2">2</supplr> and <supplr sid="S3">3</supplr>]. These subsystems show that pyrosequencing generates sequences that represent a large swathe of central metabolism in each of the environments. Common metabolic potential that is expected to be present in sulfur-utilizing chemoautotrophs is represented in the mine libraries, including the Calvin-Benson cycle, inorganic sulfur assimilation, and amino acid biosynthetic genes. The comparison of the subsystem similarities suggested the simple hypothesis that groups of genes (or subsystems) important to a particular environment will be enriched in that environment. To distinguish between ecologically important differences and differences caused by sampling error, a method was devised to identify those subsystems that are statistically significantly overrepresented in one sample when compared to another <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Subsystems in the Red and Black Samples</p>
               </caption>
               <text>
                  <p><b>Subsystems in the Red and Black Samples</b>. The occurrence of classes of subsystems is shown as a percent of all subsystems in each sample for the Red and Black samples. Notes and abbreviations: The subsystem class "Glu, Asp" also contains Gln and Asn. The subsystem class "Lys, Thr" also contains Met and Cys. CHO: Carbohydrates; sacch: saccharides; Extracell. Poly: Extracellular polysaccharides; Myco: Mycobacterial cell wall; Gm: Gram stain positive (+) or negative (-); Clust: clusters; RFN: Riboflavin; T: Transporters; Mot: Motility; N: Nitrogen; Resp: Respiration; e-: electron; S: Sulfur.</p>
               </text>
               <graphic file="1471-2164-7-57-3"/>
            </fig>
            <suppl id="S1">
               <title>
                  <p>Additional File 1</p>
               </title>
               <text>
                  <p><b>Table1SRed</b>. Two lists (one for the Red Sample and one for the Black Sample) describing all the similarities found in the data. The table has the following columns: "Classification I" and "Classification II" are hierarchical classifications of the subsystems. "Subsystem" is the name of the subsystem <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. "Functional Role" is the role of the protein in the subsystem to which the sequence from the Soudan Mine was similar. "Occurrence" is the number of times that a functional role is found in each sample. The text files have the data as tab separated items, and the file ending .xls has the same data in Microsoft Excel format.</p>
               </text>
               <file name="1471-2164-7-57-S1.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S2">
               <title>
                  <p>Additional File 2</p>
               </title>
               <text>
                  <p><b>Table1SBlack</b>. Two lists (one for the Red Sample and one for the Black Sample) describing all the similarities found in the data. The table has the following columns: "Classification I" and "Classification II" are hierarchical classifications of the subsystems. "Subsystem" is the name of the subsystem <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. "Functional Role" is the role of the protein in the subsystem to which the sequence from the Soudan Mine was similar. "Occurrence" is the number of times that a functional role is found in each sample. The text files have the data as tab separated items, and the file ending .xls has the same data in Microsoft Excel format.</p>
               </text>
               <file name="1471-2164-7-57-S2.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S3">
               <title>
                  <p>Additional File 3</p>
               </title>
               <text>
                  <p><b>Table1S</b>. Two lists (one for the Red Sample and one for the Black Sample) describing all the similarities found in the data. The table has the following columns: "Classification I" and "Classification II" are hierarchical classifications of the subsystems. "Subsystem" is the name of the subsystem <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. "Functional Role" is the role of the protein in the subsystem to which the sequence from the Soudan Mine was similar. "Occurrence" is the number of times that a functional role is found in each sample. The text files have the data as tab separated items, and the file ending .xls has the same data in Microsoft Excel format.</p>
               </text>
               <file name="1471-2164-7-57-S3.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Subsystems enriched in the Black or Red samples</p>
            </st>
            <p>Table <tblr tid="T2">2</tblr> shows subsystems that were determined to be statistically more common, with 95% confidence, in either the Red or Black samples from the Soudan mine. The subsystems that are overrepresented in a metagenome can yield significant insights into the microbial ecology of the environment. A few specific examples are detailed below.</p>
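            <p>The idea of separating real enrichment from sampling error can be sketched with a simple resampling test in Python. This is one plausible scheme written under stated assumptions, not necessarily the published method: the null distribution is built from pairs of subsamples drawn with replacement from the pooled hits, and a subsystem is called enriched when the mean difference between library-specific subsamples falls outside the central 95% of that null distribution. The function name, the toy data, and the small default sample size and repeat count (chosen so that the example runs quickly) are all hypothetical.</p>
            <p>
```python
import random
from collections import Counter


def enriched_subsystems(hits_a, hits_b, sample_size=500, repeats=200,
                        alpha=0.05, seed=0):
    """Label each subsystem "A", "B", or None (no significant difference).

    hits_a and hits_b are non-empty lists holding one subsystem name
    per matching read from library A and library B.
    """
    rng = random.Random(seed)
    pooled = hits_a + hits_b
    names = sorted(set(pooled))
    null = {name: [] for name in names}
    observed = {name: 0.0 for name in names}
    for _ in range(repeats):
        # Null: both subsamples come from the pooled hits.
        first = Counter(rng.choices(pooled, k=sample_size))
        second = Counter(rng.choices(pooled, k=sample_size))
        # Observed: one subsample from each library.
        sub_a = Counter(rng.choices(hits_a, k=sample_size))
        sub_b = Counter(rng.choices(hits_b, k=sample_size))
        for name in names:
            null[name].append(first[name] - second[name])
            observed[name] += (sub_a[name] - sub_b[name]) / repeats
    lower_idx = int(repeats * alpha / 2)
    upper_idx = repeats - 1 - lower_idx
    result = {}
    for name in names:
        ordered = sorted(null[name])
        if observed[name] > ordered[upper_idx]:
            result[name] = "A"
        elif ordered[lower_idx] > observed[name]:
            result[name] = "B"
        else:
            result[name] = None
    return result


# Hypothetical toy data, not values from the study.
red = ["iron acquisition"] * 300 + ["ribosome"] * 700
black = ["urea decomposition"] * 300 + ["ribosome"] * 700
print(enriched_subsystems(red, black))
```
            </p>
            <p>Drawing the same number of hits from each library in every repeat keeps libraries of different sizes comparable, which is the reason for subsampling rather than comparing raw counts.</p>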
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Subsystems statistically more likely to be present in either the Red or Black samples. These subsystems are more frequently found among sequences from either the Red or Black samples with a sample size of 5,000 proteins, 20,000 repeated samples, and <it>P </it>&lt; 0.05.</p>
               </caption>
               <tblbdy cols="2">
                  <r>
                     <c ca="center">
                        <p>
                           <b>
                              <it>Red Sample (Oxidized, pH 4.37, E<sub>h</sub> -8 mV)</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>
                              <it>Black Sample (Reduced, pH 6.70, E<sub>h</sub> -142 mV)</it>
                           </b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Amino Acids and Derivatives</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Arginine biosynthesis</p>
                     </c>
                     <c ca="left">
                        <p>Urea decomposition</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Tryptophan synthesis</p>
                     </c>
                     <c ca="left">
                        <p>Chorismate synthesis</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Asp-Glu-tRNA(Asn-Gln) transamidation</p>
                     </c>
                     <c ca="left">
                        <p>Branched-chain amino acid biosynthesis</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Histidine biosynthesis</p>
                     </c>
                     <c ca="left">
                        <p>Isoleucine degradation</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Leucine biosynthesis</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Leucine degradation and HMG-CoA metabolism</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Valine degradation</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Methionine salvage</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Carbohydrates</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Glyoxylate synthesis</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Cell Division and Cell Cycle</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cytoskeleton</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Cell Wall and Capsule</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>N-linked glycosylation in Bacteria</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Teichoic acid biosynthesis</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Cofactors, Vitamins, Prosthetic Groups, Pigments</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Folate biosynthesis</p>
                     </c>
                     <c ca="left">
                        <p>Coenzyme A biosynthesis in pathogens</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Methylglyoxal metabolism</p>
                     </c>
                     <c ca="left">
                        <p>Molybdopterin biosynthesis</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Pyruvate metabolism I: anaplerotic rx, PEP</p>
                     </c>
                     <c ca="left">
                        <p>Carotenoids</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ubiquinone biosynthesis</p>
                     </c>
                     <c ca="left">
                        <p>Polyisoprenoid biosynthesis</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ubiquinone menaquinone-cytochrome c reductase</p>
                     </c>
                     <c ca="left">
                        <p>NAD and NADP cofactor biosynthesis global</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Riboflavin metabolism</p>
                     </c>
                     <c ca="left">
                        <p>Coenzyme PQQ synthesis</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Pyrroloquinoline quinone biosynthesis</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Siderophore enterobactin biosynthesis</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Siderophore enterobactin biosynthesis and ferric enterobactin transport</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Thiamin biosynthesis</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>DNA metabolism</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>DNA repair, bacterial</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Fatty Acids and Lipids</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Fatty acid metabolism</p>
                     </c>
                     <c ca="left">
                        <p>Glycerolipid and glycerophospholipid metabolism</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Fatty acid oxidation pathway</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Membrane Transport</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ABC transporter maltose</p>
                     </c>
                     <c ca="left">
                        <p>ABC transporter ferrichrome</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>ABC transporter heme</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>CbiQO-type ABC transporter systems</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Sodium hydrogen antiporter</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Metabolism of aromatic compounds</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Phenylacetate pathway of aromatic compound degradation</p>
                     </c>
                     <c ca="left">
                        <p>Homogentisate pathway of aromatic compound degradation</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Motility and Chemotaxis</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Bacterial chemotaxis</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Flagellum</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Nitrogen Metabolism</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Denitrification</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Nucleosides and Nucleotides</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>De novo purine biosynthesis</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ribonucleotide reduction</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Protein Metabolism</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ribosome LSU bacterial</p>
                     </c>
                     <c ca="left">
                        <p>Phenylpropionate degradation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ribosome SSU bacterial</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Translation factors bacterial</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Universal GTPases</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Protein degradation</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Respiration</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>F0F1-type ATP synthase</p>
                     </c>
                     <c ca="left">
                        <p>NiFe hydrogenase maturation</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Terminal cytochrome C oxidases</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Hydrogenases</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Membrane-bound Ni, Fe-hydrogenase</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Na(+)-translocating NADH-quinone oxidoreductase and rnf-like group of electron transport complexes</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Respiratory complex I</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Respiratory dehydrogenases 1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>RNA metabolism</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Polyadenylation bacterial</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>RNA polymerase bacterial</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>tRNA aminoacylation</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Stress response</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Glutathione redox metabolism</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ppGpp biosynthesis</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Sulfur Metabolism</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Sulfate assimilation</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="2" ca="center">
                        <p>
                           <b>Virulence</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Resistance to fluoroquinolones</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>Several subsystems involved in iron uptake and utilization, such as siderophores and ABC transporters for ferrichrome and heme, are more common in the Black sample (Table <tblr tid="T2">2</tblr>). The overall concentration of iron at the two sites was similar (Table <tblr tid="T3">3</tblr>; Figure <figr fid="F5">5</figr>). However, the iron in the Black sample is present either as Fe<sup>2+</sup> dissolved in the water or as ferroan [(Mg,Fe)<sub>6</sub>(Si,Al)<sub>4</sub>O<sub>10</sub>(OH)<sub>8</sub>]. In either case, the ferrous iron cannot be assimilated biologically, and the microbes are forced to scavenge the limited ferric iron (Fe<sup>3+</sup>) available. In contrast, goethite [FeO(OH)] is present in the Red sample, where ferric iron is more readily available for biological utilization. The Black sample is also enriched for amino acid degradation pathways, and microbes may be assimilating nitrogen or carbon through these pathways. The source of the free amino acids is not currently apparent.</p>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Water chemistry from Soudan Mine. No significant differences were found for Ca, Mg, Na, K, Li, Al, Mn, Sr, Ba, Si, Cr, Co, Ni, Cu, Zn, As, Se, Rb, Cd, Cs, Pb, total alkalinity, lactate, acetate, formate, chlorate, oxalate, and trace elements.</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <b>
                              <it>Black</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>
                              <it>Red</it>
                           </b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Temp (&#176;C)</p>
                     </c>
                     <c ca="center">
                        <p>10.9</p>
                     </c>
                     <c ca="center">
                        <p>10.9</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>pH</p>
                     </c>
                     <c ca="center">
                        <p>6.70</p>
                     </c>
                     <c ca="center">
                        <p>4.37</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>redox (mV)</p>
                     </c>
                     <c ca="center">
                        <p>-142</p>
                     </c>
                     <c ca="center">
                        <p>-8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Fe (ppm)</p>
                     </c>
                     <c ca="center">
                        <p>161.5</p>
                     </c>
                     <c ca="center">
                        <p>146.3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total N (ppm)</p>
                     </c>
                     <c ca="center">
                        <p>1.510</p>
                     </c>
                     <c ca="center">
                        <p>1.280</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>&#8226; NH<sub>4</sub></p>
                     </c>
                     <c ca="center">
                        <p>1.22</p>
                     </c>
                     <c ca="center">
                        <p>0.91</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>&#8226; NO<sub>3</sub></p>
                     </c>
                     <c ca="center">
                        <p>0.29</p>
                     </c>
                     <c ca="center">
                        <p>0.36</p>
                     </c>
                  </r>
                  <r>
                     <c indent="1" ca="left">
                        <p>&#8226; NO<sub>2</sub></p>
                     </c>
                     <c ca="center">
                        <p>&lt;0.10</p>
                     </c>
                     <c ca="center">
                        <p>&lt;0.10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>SO<sub>4 </sub>(ppm)</p>
                     </c>
                     <c ca="center">
                        <p>27.4</p>
                     </c>
                     <c ca="center">
                        <p>29.4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>PO<sub>4 </sub>(ppm)</p>
                     </c>
                     <c ca="center">
                        <p>4.1</p>
                     </c>
                     <c ca="center">
                        <p>1.8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>B (ppm)</p>
                     </c>
                     <c ca="center">
                        <p>186</p>
                     </c>
                     <c ca="center">
                        <p>70</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mo (ppm)</p>
                     </c>
                     <c ca="center">
                        <p>2.59</p>
                     </c>
                     <c ca="center">
                        <p>0.68</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>W (ppm)</p>
                     </c>
                     <c ca="center">
                        <p>3.82</p>
                     </c>
                     <c ca="center">
                        <p>0.91</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Tl (ppm)</p>
                     </c>
                     <c ca="center">
                        <p>1.90</p>
                     </c>
                     <c ca="center">
                        <p>0.52</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>U (ppm)</p>
                     </c>
                     <c ca="center">
                        <p>1.01</p>
                     </c>
                     <c ca="center">
                        <p>0.20</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Cations and Anions found in the Soudan Mine</p>
               </caption>
               <text>
                  <p><b>Cations and Anions found in the Soudan Mine</b>. The pie chart shows the abundance of cations and anions found in the mine. The numbers in parentheses are the concentrations (in ppm) of each ion in the "Black" and "Red" samples, respectively. The minor ions are shown expanded in the rightmost pie.</p>
               </text>
               <graphic file="1471-2164-7-57-5"/>
            </fig>
            <p>The respiratory complexes and cytochrome c oxidases are more commonly found in the sample from the oxidized environment (the Red sample; Table <tblr tid="T2">2</tblr>). Respiration proceeds via multiple electron transfer steps (Figure <figr fid="F6">6</figr>). In an aerobic environment, electrons are passed from hydrogenases to quinones (e.g., ubiquinone, quinone, menaquinone, and plastoquinone) and then to cytochromes, resulting in the conversion of oxygen to water. In anaerobic environments, the electrons are shuttled through nitrate and nitrite reductases, reducing NO<sub>3</sub> first to NO<sub>2</sub> and then to N<sub>2</sub> gas. The Black sample is enriched for these denitrification genes, suggesting that the latter pathway predominates there, while the Red sample is enriched for components of the aerobic respiratory pathway. Moreover, the Black sample had a lower concentration of free nitrate than the Red sample (0.29 versus 0.36 ppm), presumably because nitrate is being used as an electron acceptor during respiration, although nitrite was below the level of detection in both samples (Table <tblr tid="T3">3</tblr>).</p>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Respiration in aerobic and anaerobic environments</p>
               </caption>
               <text>
                  <p><b>Respiration in aerobic and anaerobic environments</b>. Among other potential pathways in the Soudan mine, electrons are transferred from hydrogenases either to cytochromes and then to oxygen, producing water, in an oxidative environment, or through nitrate and nitrite reductases (denitrification) in anaerobic environments. Genes encoding the hydrogenases, respiratory complexes, and terminal cytochromes of the aerobic pathway were significantly more abundant in the Red (oxidized) sample, while genes encoding the hydrogenases and denitrification enzymes were more abundant in the Black (reduced) sample. After Vassieva, O. [25].</p>
               </text>
               <graphic file="1471-2164-7-57-6"/>
            </fig>
            <p>This analysis demonstrates that, by combining pyrosequencing, subsystems analysis, and comparative metagenomics, the microbiology of different environments can be correlated with the chemistry and hydrogeology of those environments to identify significant ecological differences between them.</p>
         </sec>
         <sec>
            <st>
               <p>Comparisons between Soudan and Iron Mountain communities</p>
            </st>
            <p>A previous study used Sanger sequencing to determine the metagenome of the Iron Mountain acid mine drainage (AMD) community <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. The environmental differences (such as the difference in temperature) account for the predominant differences between the microbial communities. The organismal differences are reflected in the individual biochemistries of the samples [see Additional files <supplr sid="S4">4</supplr> and <supplr sid="S5">5</supplr>]. For example, the AMD metagenome contains significantly more occurrences of Archaea-specific subsystems, such as those involved in protein biosynthesis, than the Soudan samples. The AMD sample is also enriched for CO<sub>2</sub> fixation and simple carbohydrate metabolism relative to either of the Soudan samples. There are also many currently unexplained differences between the subsystems found in these environments that must relate the biology of the organisms to the chemistry of the environment.</p>
            <suppl id="S4">
               <title>
                  <p>Additional File 4</p>
               </title>
               <text>
                  <p><b>Table2S</b>. The occurrence of subsystems in either the Red Sample or the Black Sample was compared to the subsystems found in the following metagenomes: AMD <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>, Farm <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, Whale (all three whale falls combined)<abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, the SEED non-redundant database <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, and the Sargasso Sea <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. For each pairwise comparison, the subsystems that are more likely to be found (<it>P</it> > 0.95) in either of the samples are shown, along with the sample in which the subsystem is more likely to be found. Subsystem names and classification are as found at <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. The text file has the data as tab-separated items, and the file ending .xls has the same data in Microsoft Excel format.</p>
               </text>
               <file name="1471-2164-7-57-S4.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S5">
               <title>
                  <p>Additional File 5</p>
               </title>
               <text>
                  <p><b>Table2S</b>. The occurrence of subsystems in either the Red Sample or the Black Sample was compared to the subsystems found in the following metagenomes: AMD <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>, Farm <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, Whale (all three whale falls combined)<abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, the SEED non-redundant database <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, and the Sargasso Sea <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. For each pairwise comparison, the subsystems that are more likely to be found (<it>P</it> > 0.95) in either of the samples are shown, along with the sample in which the subsystem is more likely to be found. Subsystem names and classification are as found at <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. The text file has the data as tab-separated items, and the file ending .xls has the same data in Microsoft Excel format.</p>
               </text>
               <file name="1471-2164-7-57-S5.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Comparisons between Soudan and other metagenome sequences</p>
            </st>
            <p>The SEED database used for these studies contained 351 subsystems. The vast majority (83%) of subsystems were present in one or more of the sequenced metagenomes, and over half (52%) of the subsystems were present in every metagenome. A comparison of the subsystem classification reveals trends between the metagenomes (Figure <figr fid="F4">4</figr>). For example, oxygenic photosynthesis is prevalent in samples that are naturally illuminated, such as the Sargasso Sea<abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. This analysis also suggested that phosphorus metabolism is more prevalent at the ocean surface than in terrestrial environments. Comparisons of the Minnesota Farm metagenome<abbrgrp><abbr bid="B6">6</abbr></abbrgrp> with the Soudan Mine metagenomes, also from Minnesota, showed important differences in the production and consumption of secondary metabolites, membrane transport, and fatty acid metabolism. The complete lists of subsystems that differ with statistical significance between the Red or Black sample and each of the previously published metagenomes are supplied as supplemental material [see Additional files <supplr sid="S4">4</supplr> and <supplr sid="S5">5</supplr>].</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Subsystems present in different metagenome sequences</p>
               </caption>
               <text>
                  <p><b>Subsystems present in different metagenome sequences</b>. The subsystems present in the Soudan samples, the Iron Mountain AMD sample, the Minnesota Farm and the Sargasso Sea are shown grouped by family. The red x corresponds to very low abundance or complete absence of that family of subsystems. The size of the circle represents the proportion of sequences seen within that family of subsystems.</p>
               </text>
               <graphic file="1471-2164-7-57-4"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>This is the first metagenome analysis performed using pyrosequencing, which is approximately 10 to 30 times cheaper than current Sanger sequencing. Pyrosequencing also eliminates the need for cloning, thus removing the potential both for aberrant recombinants in the surrogate host and for cloning-related artifacts such as counterselection against potentially toxic genes, for example those found on phages<abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. The main concerns with current pyrosequencing technology are the short length of sequence fragments (average of 105 bp in this study) and the requirement to use whole genome amplification to generate sufficient DNA for sequencing from environmental libraries. The former may make it difficult to accurately assemble genomes in the absence of a scaffold, while the latter may bias these analyses. Our preliminary unpublished data suggest that the whole genome amplification bias is minimal, and is preferentially towards the ends of linear pieces of DNA (Haynes, Rayhawk, Edwards, Rohwer; unpublished). Since these biases apply equally to both libraries, they will be negated in a comparative study that highlights differences between metagenomes. Nonetheless, the short fragments are sufficient to determine statistically significant differences between metagenomes that reflect the most likely biology occurring in each environment. The low cost and high yield of pyrosequencing, combined with statistical analyses of the abundance of subsystems in the samples, allow the rapid identification of key processes driving the metabolism of different environments.</p>
         <p>The systems approach of integrating biology, chemistry, and geology has yielded significant insights into the metabolism of two environments in the Soudan Mine. The oxidized sample is using aerobic respiratory pathways while the reduced sample is using anaerobic pathways. Nitrogen assimilation, iron acquisition, and sulfur metabolism are all differentiated between these two samples from close proximity within the same mine. However, many more significant differences between the samples remain unexplained by our current knowledge of bacterial physiology and metabolism. Explaining these differences will be a grand challenge for the future. By combining pyrosequencing, subsystems analysis, comparative metagenomics, and statistics, Occam has used his razor on metagenomics.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Sample collections, microbial enumeration, and DNA extraction</p>
            </st>
            <p>Samples were collected from several sites in the Soudan Mine. This analysis concerns the samples collected at two sites on Level 27 (714 m below the surface; Figure <figr fid="F1">1</figr>). Water and sediments were sampled from the two locations shown in Figure <figr fid="F1">1</figr>, giving the "Black" (reduced) sample and the "Red" (oxidized) sample. Microbes were concentrated from these samples by filtration with 0.22 &#956;m Sterivex units. Microbes were enumerated by staining the samples with SYBR-Gold (Invitrogen, Carlsbad, CA) and visualizing them with an epifluorescence microscope <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. DNA was extracted from the microbial sample using either the Ultra Clean Soil DNA Kit or Power Soil Kit (MolBio, Boulder, CO). The DNA was amplified with GenomiPhi (GE Healthcare, Piscataway, NJ) in an Eppendorf thermal cycler (Eppendorf, Westbury, NY) using multiple reactions containing 50&#8211;100 ng of the isolated DNA as template and the manufacturer's recommended protocols. After amplification, the resulting DNA was purified with silica columns (Qiagen, Valencia, CA) and concentrated by ethanol precipitation. The DNA was resuspended in water to a final concentration of 0.3 mg/ml. Approximately 10 &#956;g of each sample was sequenced using pyrosequencing technology (454 Life Sciences, Branford, CT).</p>
            <p>The Bacteria-specific primer 27F (5'-AGAGTTTGATCMTGGCTCAG-3') and the universal primer 1492R (5'-TACGGYTACCTTGTTACGACTT-3') <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> were used to amplify the 16S rDNA genes. PCR products were cloned into the pCR<sup>&#174;</sup>4-TOPO<sup>&#174;</sup> vector as recommended by the manufacturer (Invitrogen, Carlsbad, CA).</p>
         </sec>
         <sec>
            <st>
               <p>Water and mineral analyses</p>
            </st>
            <p>Water samples were collected by filtering the water through 0.2 &#956;m filters into clean bottles. Field measurements of pH, E<sub>h</sub>, temperature and conductivity were conducted <it>in situ</it>. The sediment samples were collected as slurries with a pipette resting on and in the sediments. Those slurries were transferred to clean centrifuge tubes, allowed to settle by gravity and then the fluid was decanted.</p>
            <p>Major anions in the water were determined by ion chromatography (Dionex ICS-2000, Sunnyvale, CA) and major and trace elements by ICP/MS (Thermo Electron PQ ExCell, Franklin, MA). The mineral identifications are based on XRD (Bruker-AX D500 X-ray Diffractometer, Germany) measurements. The X-ray peaks were relatively small, indicating that much of the sediment was apparently not well crystallized.</p>
         </sec>
         <sec>
            <st>
               <p>Sequence analysis</p>
            </st>
            <p>The unassembled sequences provided by 454 were compared to the SEED database using the BLASTX algorithm on the Teragrid cluster at Argonne National Laboratory<abbrgrp><abbr bid="B15">15</abbr><abbr bid="B23">23</abbr></abbrgrp>. All BLAST searches were performed using an expect value cutoff of 1 &#215; 10<sup>-5</sup>. At this cutoff approximately 3 of the observed hits would be expected to occur at random<abbrgrp><abbr bid="B23">23</abbr></abbrgrp>.</p>
            <p>The BLASTN algorithm was used to identify 16S genes by searching against release 9 of the RDP database <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B24">24</abbr></abbrgrp>. These BLAST searches were also performed using an expect value cutoff of 1 &#215; 10<sup>-5</sup> and a minimum sequence match length of 50 nt.</p>
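            <p>As an illustration only, the two cutoffs described above could be applied to BLAST hits as sketched below. The source does not state how the BLAST output was parsed, so the 12-column tab-separated hit layout, the function name filter_hits, and the inclusive treatment of both thresholds are assumptions made for this example.</p>

```python
def filter_hits(lines, max_evalue=1e-5, min_length=0):
    """Keep tabular BLAST hits that pass the E-value and length cutoffs.

    Assumes the common 12-column tab-separated layout in which column 4
    is the alignment length and column 11 is the expect value.
    """
    kept = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        align_length = int(fields[3])
        evalue = float(fields[10])
        # Both thresholds are treated as inclusive (an assumption).
        if max_evalue >= evalue and align_length >= min_length:
            kept.append((fields[0], fields[1], align_length, evalue))
    return kept
```

            <p>Under these assumptions, calling filter_hits with the default arguments would mirror the BLASTX cutoff, and passing min_length=50 would mirror the additional match-length requirement used for the 16S searches.</p>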
         </sec>
         <sec>
            <st>
               <p>Statistical analyses of metagenome datasets</p>
            </st>
            <p>The statistical analysis of subsystems present in each sample was performed essentially as described elsewhere <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. The presence or absence of subsystems between two data sets was determined using 20,000 replicates of samples of 5,000 subsystems each. The 95% confidence interval for the median was constructed using the 0.025 and 0.975 percentiles.</p>
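            <p>A minimal sketch of a resampling procedure of this kind is given below. It is an illustration under stated assumptions, not the implementation described in <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>: the function names, the nearest-rank percentile, and the use of the difference in counts between two data sets as the resampled statistic are all hypothetical choices made for this example.</p>

```python
import random


def percentile(sorted_values, q):
    """Return the q-th quantile (q between 0 and 1) by nearest rank."""
    index = round(q * (len(sorted_values) - 1))
    return sorted_values[index]


def bootstrap_difference(hits_a, hits_b, subsystem,
                         replicates=20000, sample_size=5000, seed=0):
    """Resample two lists of subsystem labels with replacement and
    summarize the difference in counts of `subsystem` (A minus B)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(replicates):
        # Draw sample_size labels from each data set, with replacement.
        count_a = rng.choices(hits_a, k=sample_size).count(subsystem)
        count_b = rng.choices(hits_b, k=sample_size).count(subsystem)
        diffs.append(count_a - count_b)
    diffs.sort()
    return {
        "median": percentile(diffs, 0.5),
        "low": percentile(diffs, 0.025),   # 0.025 percentile
        "high": percentile(diffs, 0.975),  # 0.975 percentile
    }
```

            <p>Under this reading, a subsystem might be flagged as differing between two samples when the interval from low to high excludes zero; that decision rule is an assumption of the sketch. The defaults follow the replicate and sample sizes given above, and smaller values keep a pure-Python run fast.</p>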
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>LW, MB, and FR collected the samples from the mine. DP, MS, SA, and CA performed chemical and hydrogeological analyses. MH extracted the DNA and processed samples for sequencing. BR-B and RE performed computational and statistical analysis. RE authored the manuscript, and all authors edited and commented on the paper.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The authors are grateful to Bill Miller, Director of the Soudan Facility, for arranging the sampling trips, ploughing through paperwork, and bringing these fascinating microbial communities to our attention. Thanks to Jim Essig for guiding us to Level 10, arranging our sampling protocols, and generally helping out. Tony Zavodnick, Paul Paulisich, and Jack Zorman provided invaluable guidance and explanations of mining at the Soudan Mine site, and we thank the whole Soudan mine and facility crews for making our sampling trip so enjoyable. In addition, we thank Robert Olson (Argonne National Labs) for assistance with the computational analysis. This work was supported by grant NSF DEB-BE 04-21955 from the NSF Biocomplexity program (to FR).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Metagenomics: genomic analysis of microbial communities</p>
            </title>
            <aug>
               <au>
                  <snm>Riesenfeld</snm>
                  <fnm>CS</fnm>
               </au>
               <au>
                  <snm>Schloss</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Handelsman</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Annu Rev Genet</source>
            <pubdate>2004</pubdate>
            <volume>38</volume>
            <fpage>525</fpage>
            <lpage>552</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.genet.38.072902.091216</pubid>
                  <pubid idtype="pmpid" link="fulltext">15568985</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Viral metagenomics</p>
            </title>
            <aug>
               <au>
                  <snm>Edwards</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Rohwer</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Nat Rev Microbiol</source>
            <pubdate>2005</pubdate>
            <volume>3</volume>
            <issue>6</issue>
            <fpage>504</fpage>
            <lpage>510</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrmicro1163</pubid>
                  <pubid idtype="pmpid" link="fulltext">15886693</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Diversity and population structure of a near-shore marine-sediment viral community</p>
            </title>
            <aug>
               <au>
                  <snm>Breitbart</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Felts</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kelley</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mahaffy</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Nulton</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Salamon</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Rohwer</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Proc R Soc Lond B Biol Sci</source>
            <pubdate>2004</pubdate>
            <volume>271</volume>
            <issue>1539</issue>
            <fpage>565</fpage>
            <lpage>574</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1098/rspb.2003.2628</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Metagenomic analyses of an uncultured viral community from human feces</p>
            </title>
            <aug>
               <au>
                  <snm>Breitbart</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hewson</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Felts</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Mahaffy</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Nulton</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Salamon</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Rohwer</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2003</pubdate>
            <volume>185</volume>
            <issue>20</issue>
            <fpage>6220</fpage>
            <lpage>6223</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">225035</pubid>
                  <pubid idtype="pmpid" link="fulltext">14526037</pubid>
                  <pubid idtype="doi">10.1128/JB.185.20.6220-6223.2003</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Genomic analysis of uncultured marine viral communities</p>
            </title>
            <aug>
               <au>
                  <snm>Breitbart</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Salamon</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Andresen</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Mahaffy</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Segall</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Mead</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Azam</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Rohwer</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <issue>22</issue>
            <fpage>14250</fpage>
            <lpage>14255</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">137870</pubid>
                  <pubid idtype="pmpid" link="fulltext">12384570</pubid>
                  <pubid idtype="doi">10.1073/pnas.202488399</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Comparative metagenomics of microbial communities</p>
            </title>
            <aug>
               <au>
                  <snm>Tringe</snm>
                  <fnm>SG</fnm>
               </au>
               <au>
                  <snm>von Mering</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kobayashi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Salamov</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Chang</snm>
                  <fnm>HW</fnm>
               </au>
               <au>
                  <snm>Podar</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Short</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Mathur</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Detter</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Hugenholtz</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2005</pubdate>
            <volume>308</volume>
            <issue>5721</issue>
            <fpage>554</fpage>
            <lpage>557</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1107851</pubid>
                  <pubid idtype="pmpid" link="fulltext">15845853</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Community structure and metabolism through reconstruction of microbial genomes from the environment</p>
            </title>
            <aug>
               <au>
                  <snm>Tyson</snm>
                  <fnm>GW</fnm>
               </au>
               <au>
                  <snm>Chapman</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hugenholtz</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Allen</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Ram</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Richardson</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Solovyev</snm>
                  <fnm>VV</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Rokhsar</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>Banfield</snm>
                  <fnm>JF</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2004</pubdate>
            <volume>428</volume>
            <issue>6978</issue>
            <fpage>37</fpage>
            <lpage>43</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature02340</pubid>
                  <pubid idtype="pmpid" link="fulltext">14961025</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Microbial community genomics in the ocean</p>
            </title>
            <aug>
               <au>
                  <snm>DeLong</snm>
                  <fnm>EF</fnm>
               </au>
            </aug>
            <source>Nat Rev Microbiol</source>
            <pubdate>2005</pubdate>
            <volume>3</volume>
            <issue>6</issue>
            <fpage>459</fpage>
            <lpage>469</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrmicro1158</pubid>
                  <pubid idtype="pmpid" link="fulltext">15886695</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Analysis of the virus population present in equine faeces indicates the presence of hundreds of uncharacterized virus genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Cann</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Fandrich</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Heaphy</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Virus Genes</source>
            <pubdate>2005</pubdate>
            <volume>30</volume>
            <issue>2</issue>
            <fpage>151</fpage>
            <lpage>156</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s11262-004-5624-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">15744573</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Environmental genome shotgun sequencing of the Sargasso Sea</p>
            </title>
            <aug>
               <au>
                  <snm>Venter</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Remington</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Heidelberg</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Halpern</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Rusch</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Paulsen</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Fouts</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Levy</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Knap</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Lomas</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Nealson</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hoffman</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Parsons</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Baden-Tillson</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Pfannkoch</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Rogers</snm>
                  <fnm>YH</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>HO</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>304</volume>
            <issue>5667</issue>
            <fpage>66</fpage>
            <lpage>74</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1093857</pubid>
                  <pubid idtype="pmpid" link="fulltext">15001713</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>An application of statistics to comparative metagenomics</p>
            </title>
            <aug>
               <au>
                  <snm>Rodriguez-Brito</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Rohwer</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Edwards</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <inpress/>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16549025</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Genome sequencing in microfabricated high-density picolitre reactors</p>
            </title>
            <aug>
               <au>
                  <snm>Margulies</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Egholm</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Altman</snm>
                  <fnm>WE</fnm>
               </au>
               <au>
                  <snm>Attiya</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bader</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Bemben</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Berka</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Braverman</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>YJ</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Dewell</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Du</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Fierro</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Gomes</snm>
                  <fnm>XV</fnm>
               </au>
               <au>
                  <snm>Godwin</snm>
                  <fnm>BC</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Helgesen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ho</snm>
                  <fnm>CH</fnm>
               </au>
               <au>
                  <snm>Irzyk</snm>
                  <fnm>GP</fnm>
               </au>
               <au>
                  <snm>Jando</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Alenquer</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Jarvie</snm>
                  <fnm>TP</fnm>
               </au>
               <au>
                  <snm>Jirage</snm>
                  <fnm>KB</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Knight</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Lanza</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Leamon</snm>
                  <fnm>JH</fnm>
               </au>
               <au>
                  <snm>Lefkowitz</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Lei</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lohman</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Lu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Makhijani</snm>
                  <fnm>VB</fnm>
               </au>
               <au>
                  <snm>McDade</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>McKenna</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Nickerson</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Nobile</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Plant</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Puc</snm>
                  <fnm>BP</fnm>
               </au>
               <au>
                  <snm>Ronan</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Roth</snm>
                  <fnm>GT</fnm>
               </au>
               <au>
                  <snm>Sarkis</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>Simons</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Simpson</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Srinivasan</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tartaro</snm>
                  <fnm>KR</fnm>
               </au>
               <au>
                  <snm>Tomasz</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Vogt</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Volkmer</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>SH</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Weiner</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Begley</snm>
                  <fnm>RF</fnm>
               </au>
               <au>
                  <snm>Rothberg</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2005</pubdate>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16056220</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Real-time DNA sequencing using detection of pyrophosphate release</p>
            </title>
            <aug>
               <au>
                  <snm>Ronaghi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Karamohamed</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Pettersson</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Uhlen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nyren</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Anal Biochem</source>
            <pubdate>1996</pubdate>
            <volume>242</volume>
            <issue>1</issue>
            <fpage>84</fpage>
            <lpage>89</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/abio.1996.0432</pubid>
                  <pubid idtype="pmpid" link="fulltext">8923969</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>A sequencing method based on real-time pyrophosphate</p>
            </title>
            <aug>
               <au>
                  <snm>Ronaghi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Uhlen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nyren</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1998</pubdate>
            <volume>281</volume>
            <issue>5375</issue>
            <fpage>363, 365</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.281.5375.363</pubid>
                  <pubid idtype="pmpid" link="fulltext">9705713</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Overbeek</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Begley</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Butler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Choudhuri</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chuang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Cohoon</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>de Cr&#233;cy-Lagard</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Diaz</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Disz</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Edwards</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Fonstein</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Frank</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Gerdes</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Glass</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Goesmann</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hanson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Iwata-Reuyl</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Jensen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Jamshidi</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Krause</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Kubal</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Larsen</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Linke</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>McHardy</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Meyer</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Neuweger</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Olsen</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Olson</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Osterman</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Portnoy</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Pusch</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Rodionov</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>R&#252;ckert</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Steiner</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Stevens</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Thiele</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Vassieva</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Ye</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Zagnitko</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Vonstein</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1251668</pubid>
                  <pubid idtype="pmpid" link="fulltext">16214803</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>The ribosomal database project (RDP-II): sequences and tools for high-throughput rRNA analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Cole</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Chai</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Farris</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Kulam</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>McGarrell</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Garrity</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Tiedje</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <issue>Database issue</issue>
            <fpage>D294</fpage>
            <lpage>D296</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">539992</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608200</pubid>
                  <pubid idtype="doi">10.1093/nar/gki038</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Characterization of soil bacteria that desulphurize organic sulphur compounds. 1. Classification and growth studies</p>
            </title>
            <aug>
               <au>
                  <snm>Klubek</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Schmidt</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Burnham</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Microbios</source>
            <pubdate>1996</pubdate>
            <volume>88</volume>
            <issue>357</issue>
            <fpage>223</fpage>
            <lpage>236</lpage>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Microbiological investigation on black crusts from open-air stone monuments of Bologna (Italy)</p>
            </title>
            <aug>
               <au>
                  <snm>Turtura</snm>
                  <fnm>GC</fnm>
               </au>
               <au>
                  <snm>Perfetto</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lorenzelli</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Microbiologica</source>
            <pubdate>2000</pubdate>
            <volume>23</volume>
            <issue>2</issue>
            <fpage>207</fpage>
            <lpage>228</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10872690</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>The SEED</p>
            </title>
            <url>http://theseed.uchicago.edu/FIG/index.cgi</url>
         </bibl>
         <bibl id="B20">
            <title>
               <p>The SEED: A peer-to-peer environment for genome annotation</p>
            </title>
            <aug>
               <au>
                  <snm>Overbeek</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Disz</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Stevens</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Commun ACM</source>
            <pubdate>2004</pubdate>
            <volume>47</volume>
            <issue>11</issue>
            <fpage>46</fpage>
            <lpage>51</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1145/1029496.1029525</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Use of SYBR Green I for rapid epifluorescence counts of marine viruses and bacteria</p>
            </title>
            <aug>
               <au>
                  <snm>Noble</snm>
                  <fnm>RT</fnm>
               </au>
               <au>
                  <snm>Fuhrman</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Aquat Microb Ecol</source>
            <pubdate>1998</pubdate>
            <volume>14</volume>
            <issue>2</issue>
            <fpage>113</fpage>
            <lpage>118</lpage>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Phylogenetic identification and in situ detection of individual microbial cells without cultivation</p>
            </title>
            <aug>
               <au>
                  <snm>Amann</snm>
                  <fnm>RI</fnm>
               </au>
               <au>
                  <snm>Ludwig</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Schleifer</snm>
                  <fnm>KH</fnm>
               </au>
            </aug>
            <source>Microbiol Rev</source>
            <pubdate>1995</pubdate>
            <volume>59</volume>
            <issue>1</issue>
            <fpage>143</fpage>
            <lpage>169</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">239358</pubid>
                  <pubid idtype="pmpid">7535888</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Basic local alignment search tool</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Gish</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1990</pubdate>
            <volume>215</volume>
            <issue>3</issue>
            <fpage>403</fpage>
            <lpage>410</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1990.9999</pubid>
                  <pubid idtype="pmpid" link="fulltext">2231712</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Ribosomal database project - II</p>
            </title>
            <url>http://rdp.cme.msu.edu/</url>
         </bibl>
         <bibl id="B25">
            <title>
               <p>FIG Subsystem Forum</p>
            </title>
            <url>http://www.subsys.info</url>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Base-calling of automated sequencer traces using phred. II. Error probabilities</p>
            </title>
            <aug>
               <au>
                  <snm>Ewing</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1998</pubdate>
            <volume>8</volume>
            <issue>3</issue>
            <fpage>186</fpage>
            <lpage>194</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9521922</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
