<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art><ui>1471-2229-10-55</ui><ji>1471-2229</ji><fm>
<dochead>Research article</dochead>
<bibl>
<title>
<p>A full-length enriched cDNA library and expressed sequence tag analysis of the parasitic weed, <it>Striga hermonthica</it>
</p>
</title>
<aug>
<au id="A1"><snm>Yoshida</snm><fnm>Satoko</fnm><insr iid="I1"/><email>satokoy@psc.riken.jp</email></au>
<au id="A2"><snm>Ishida</snm><mi>K</mi><fnm>Juliane</fnm><insr iid="I1"/><insr iid="I2"/><email>jkishida@psc.riken.jp</email></au>
<au id="A3"><snm>Kamal</snm><mi>M</mi><fnm>Nasrein</fnm><insr iid="I3"/><email>renokamal@yahoo.com</email></au>
<au id="A4"><snm>Ali</snm><mi>M</mi><fnm>Abdelbagi</fnm><insr iid="I3"/><email>abdmali@yahoo.com</email></au>
<au id="A5"><snm>Namba</snm><fnm>Shigetou</fnm><insr iid="I2"/><email>anamba@mail.ecc.u-tokyo.ac.jp</email></au>
<au ca="yes" id="A6"><snm>Shirasu</snm><fnm>Ken</fnm><insr iid="I1"/><email>ken.shirasu@psc.riken.jp</email></au>
</aug>
<insg>
<ins id="I1"><p>Plant Science Center, RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan</p></ins>
<ins id="I2"><p>Department of Agricultural and Environmental Biology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan</p></ins>
<ins id="I3"><p>Biotechnology Laboratory, Agricultural Research Corporation, Wad Medani 126, Sudan</p></ins>
</insg>
<source>BMC Plant Biology</source>
<issn>1471-2229</issn>
<pubdate>2010</pubdate>
<volume>10</volume>
<issue>1</issue>
<fpage>55</fpage>
<url>http://www.biomedcentral.com/1471-2229/10/55</url>
<xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2229-10-55</pubid><pubid idtype="pmpid">20353604</pubid></pubidlist></xrefbib>
</bibl>
<history><rec><date><day>2</day><month>12</month><year>2009</year></date></rec><acc><date><day>30</day><month>3</month><year>2010</year></date></acc><pub><date><day>30</day><month>3</month><year>2010</year></date></pub></history>
<cpyrt><year>2010</year><collab>Yoshida et al; licensee BioMed Central Ltd.</collab><note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
<abs>
<sec>
<st>
<p>Abstract</p>
</st>
<sec>
<st>
<p>Background</p>
</st>
<p>The obligate parasitic plant witchweed (<it>Striga hermonthica</it>) infects major cereal crops such as sorghum, maize, and millet, and is the most devastating weed pest in Africa. An understanding of the nature of its parasitism would contribute to the development of more sophisticated management methods. However, the molecular and genomic resources currently available for the study of <it>S. hermonthica </it>are limited.</p>
</sec>
<sec>
<st>
<p>Results</p>
</st>
<p>We constructed a full-length enriched cDNA library of <it>S. hermonthica</it>, sequenced 37,710 clones from the library, and obtained 67,814 expressed sequence tag (EST) sequences. The ESTs were assembled into 17,137 unigenes that included 10,319 contigs and 6,818 singletons. The <it>S. hermonthica </it>unigene dataset was subjected to a comparative analysis with other plant genomes or ESTs. Approximately 80% of the unigenes have homologs in other dicotyledonous plants including <it>Arabidopsis</it>, poplar, and grape. We found that 589 unigenes are conserved in the hemiparasitic <it>Triphysaria </it>species but not in other plant species. These are good candidates for genes specifically involved in plant parasitism. Furthermore, we found 1,445 putative simple sequence repeats (SSRs) in the <it>S. hermonthica </it>unigene dataset. We tested 64 pairs of PCR primers flanking the SSRs to develop genetic markers for the detection of polymorphisms. Most primer sets amplified polymorphic bands from individual plants collected at a single location, indicating high genetic diversity in <it>S. hermonthica</it>. We selected 10 primer pairs to analyze <it>S. hermonthica </it>harvested in the field from different host species and geographic locations. A clustering analysis suggests that genetic distances are not correlated with host specificity.</p>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>Our data provide the first extensive set of molecular resources for studying <it>S. hermonthica</it>, and include EST sequences, a comparative analysis with other plant genomes, and useful genetic markers. All the data are stored in a web-based database and are freely available. These resources will be useful for genome annotation, gene discovery, functional analysis, molecular breeding, epidemiological studies, and studies of plant evolution.</p>
</sec>
</sec>
</abs>
</fm><bdy>
<sec>
<st>
<p>Background</p>
</st>
<p>
<it>Striga hermonthica </it>is an obligate root parasite belonging to the family Orobanchaceae, and is a major constraint on crop production in sub-Saharan Africa. <it>S. hermonthica </it>infests economically important crops such as sorghum, maize, millet, and upland rice, and the yield losses caused by this species have been estimated to cost as much as US$ 7 billion annually <abbrgrp>
<abbr bid="B1">1</abbr>
</abbrgrp>. However, methods for controlling <it>S. hermonthica </it>are not well established. Despite its agricultural importance, the molecular mechanisms controlling the establishment of parasitism are poorly understood.</p>
<p>The <it>S. hermonthica </it>life cycle is unique and well adapted to its parasitic lifestyle. The seeds need to be exposed to germination stimulants exuded from the host roots, such as strigolactones and ethylene; otherwise they can remain dormant in the soil for several decades <abbrgrp>
<abbr bid="B2">2</abbr>
</abbrgrp>. The seeds are tiny and possess limited amounts of nutrients, and this restricts their growth without a host connection. When a potential host is recognized through the sensing of strigolactones or other germination stimulants, the seeds that are close to the host roots (within 5 mm) can germinate. The germinated seedlings form haustoria, which are round-shaped organs specialized in host attachment and penetration <abbrgrp>
<abbr bid="B3">3</abbr>
</abbrgrp>. The formation of haustoria also requires host-derived signal compounds. The haustoria penetrate the host roots and finally connect with the vasculature to rob the host plant of water and nutrients. This dramatic developmental transition from an autotrophic to a heterotrophic lifestyle occurs within several days.</p>
<p>Intensive efforts in the scientific community, mainly in the United States during the 1960s, led to the identification of some germination stimulants. This was followed by the development of a "suicidal germination" strategy to eradicate <it>Striga </it>weeds <abbrgrp>
<abbr bid="B4">4</abbr>
</abbrgrp>. In this strategy, a germination stimulant (in this case ethylene) is mixed into the soil to trigger germination in the absence of the hosts. This approach was used successfully to eradicate <it>Striga asiatica </it>infestations in North Carolina. Although suicidal germination was effective for controlling <it>S. asiatica</it>, this approach was not applicable to African farmers due to the high cost of the strategy and the much larger scale of infestation.</p>
<p>Whole genome sequencing is a valuable approach to understanding an organism. The genome sequences of growing numbers of model and crop plant species have been published in recent years, providing new insights in plant biology. The development of new generation sequencing technologies has dramatically accelerated the speed of large-scale sequencing. However, the <it>de novo </it>sequencing of the whole genome of a non-model plant is still a challenging and laborious task <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp>. Expressed sequence tags (ESTs) are a less expensive alternative for gaining information about the expressed genes of an organism <abbrgrp>
<abbr bid="B6">6</abbr>
</abbrgrp>. In particular, the ESTs from a full-length enriched cDNA library provide the complete sequences of functional proteins <abbrgrp>
<abbr bid="B7">7</abbr>
</abbrgrp>.</p>
<p>This study aims to provide genome-scale molecular resources for understanding the parasitic processes of the obligate parasite, <it>S. hermonthica</it>. We constructed a full-length enriched cDNA library from <it>S. hermonthica </it>and generated a large-scale EST dataset by reading the sequences of individual clones from both ends. The only other genus from the family Orobanchaceae with publicly available EST data is <it>Triphysaria </it>
<abbrgrp>
<abbr bid="B8">8</abbr>
</abbrgrp>. <it>Triphysaria </it>spp. are facultative hemiparasites, which are able to complete their life cycles without hosts. The comparison of our <it>S. hermonthica </it>EST dataset with those of <it>Triphysaria </it>and other non-parasitic plant species enabled us to identify potentially parasite-specific genes. Furthermore, our results provide the tools to analyze genetic diversity within <it>S. hermonthica</it>. We found 1,445 putative simple sequence repeats (SSRs) that could be useful as markers. We amplified the genomic regions flanking some of these SSRs from <it>S. hermonthica </it>individuals that were collected in different fields in Africa. The results revealed high sequence divergence in the <it>S. hermonthica </it>genomes. All the sequences and the annotation results are freely available on the internet <abbrgrp>
<abbr bid="B9">9</abbr>
</abbrgrp>.</p>
</sec>
<sec>
<st>
<p>Results and Discussion</p>
</st>
<sec>
<st>
<p>Genome size of <it>S. hermonthica</it>
</p>
</st>
<p>
<it>S. hermonthica </it>is likely to be a diploid species with a chromosome number of n = 19 <abbrgrp>
<abbr bid="B10">10</abbr>
</abbrgrp>. First, we estimated the genome size of <it>S. hermonthica </it>to gain information about its genome contents. Leaves of <it>S. hermonthica </it>plants parasitizing rice were harvested and the DNA contents were measured with a flow cytometer. <it>Arabidopsis thaliana</it>, whose genome size is 128 Mbp, was used as a control. Five individual plants were used for the measurements with two or more replicates for each plant. The genome size of <it>S. hermonthica </it>was estimated to be 1,801 Mbp (&#177; 321 Mbp) (Fig. <figr fid="F1">1</figr>), which is approximately 14 times that of <it>Arabidopsis</it>, 4 times those of rice and poplar, and 2 times that of sorghum.</p>
<fig id="F1"><title><p>Figure 1</p></title><caption><p>Genome size of <it>S. hermonthica </it>estimated by flow cytometry</p></caption><text>
   <p><b>Genome size of <it>S. hermonthica </it>estimated by flow cytometry</b>. The genome size of <it>S. hermonthica </it>(pink) was estimated by comparison with <it>Arabidopsis </it>(blue). n = 5.</p>
</text><graphic file="1471-2229-10-55-1" hint_layout="single"/></fig>
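<p>The flow-cytometric estimate above rests on a simple proportionality: the genome size of the sample is the reference genome size scaled by the ratio of the fluorescence peaks. The following is a minimal sketch of that calculation, not the procedure actually used; the function name and the peak positions are hypothetical, and only the 128 Mbp reference value and the 1,801 Mbp estimate come from the text.</p>

```python
# Reference genome size quoted in the text for Arabidopsis thaliana.
ARABIDOPSIS_GENOME_MBP = 128.0


def estimate_genome_size(sample_peak, reference_peak,
                         reference_size_mbp=ARABIDOPSIS_GENOME_MBP):
    """Scale the reference genome size by the ratio of the two fluorescence peaks."""
    if not reference_peak > 0:
        raise ValueError("reference_peak must be positive")
    return reference_size_mbp * sample_peak / reference_peak


# Hypothetical peak positions in arbitrary fluorescence units (not measured data).
example_size = estimate_genome_size(sample_peak=1400.0, reference_peak=100.0)
print(example_size)  # 1792.0

# Fold difference implied by the reported estimate of 1,801 Mbp.
fold_vs_arabidopsis = 1801.0 / ARABIDOPSIS_GENOME_MBP
print(round(fold_vs_arabidopsis, 1))  # 14.1
```

<p>In practice the calculation would be repeated for each plant and replicate, and the mean and spread reported, as in the estimate quoted above.</p>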
<sec>
<st>
<p>Full-length enriched cDNA library construction</p>
</st>
<p>To construct a full-length enriched cDNA library containing highly variable sequences, total RNA was extracted from various <it>S. hermonthica </it>tissues at various developmental stages (Table <tblr tid="T1">1</tblr>). A full-length enriched normalized cDNA library was constructed using a mixture of these RNAs as starting materials. To assess the quality of the resulting library, the inserts from 90 randomly picked clones were amplified by PCR with primers specific to the library vector, and the insert sizes were estimated by agarose-gel electrophoresis (Table <tblr tid="T2">2</tblr>). The average insert size was approximately 1.42 kb, which is similar to the average insert size of the RIKEN <it>Arabidopsis </it>Full-Length (RAFL) cDNA clones (estimated at 1,445 bp) <abbrgrp>
<abbr bid="B11">11</abbr>
<abbr bid="B12">12</abbr>
</abbrgrp>. This average insert size was similar to that of a poplar full-length cDNA library (<it>Populus nigra</it>, about 1.4 kb) <abbrgrp>
<abbr bid="B13">13</abbr>
</abbrgrp>, and slightly shorter than those from soybean and wheat (approximately 1.5 kb) <abbrgrp>
<abbr bid="B12">12</abbr>
<abbr bid="B14">14</abbr>
</abbrgrp>. The longest insert was estimated at more than 3 kb, suggesting that the library contains relatively long cDNAs.</p>
<tbl id="T1"><title><p>Table 1</p></title><caption><p>RNA samples used for the <it>S. hermonthica </it>full-length enriched cDNA library construction.</p></caption><tblbdy cols="2">
      <r>
         <c ca="left">
            <p>
               <b>Tissue</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Growth stage or treatment</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="2">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Seedlings</p>
         </c>
         <c ca="left">
            <p>At 3 d after strigol treatment</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Seedlings</p>
         </c>
         <c ca="left">
            <p>At 3 d after co-incubation with rice roots</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Leaves and stems</p>
         </c>
         <c ca="left">
            <p>From mature plants parasitizing rice</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Roots (secondary haustoria)</p>
         </c>
         <c ca="left">
            <p>From mature plants parasitizing rice in a rhizotron</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Flowers</p>
         </c>
         <c ca="left">
            <p>From mature plants parasitizing rice</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Axenically grown plants</p>
         </c>
         <c ca="left">
            <p>Grown axenically for 1 month</p>
         </c>
      </r>
   </tblbdy></tbl>
<tbl id="T2"><title><p>Table 2</p></title><caption><p>Distribution of insert lengths in the <it>S. hermonthica </it>full-length enriched cDNA library.</p></caption><tblbdy cols="3">
      <r>
         <c ca="left">
            <p>
               <b>Length (kb)</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Clone number</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Frequency (%)</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="3">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>&lt;0.5</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>0.5-1.0</p>
         </c>
         <c ca="center">
            <p>18</p>
         </c>
         <c ca="center">
            <p>20.0</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>1.0-1.5</p>
         </c>
         <c ca="center">
            <p>35</p>
         </c>
         <c ca="center">
            <p>38.9</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>1.5-2.0</p>
         </c>
         <c ca="center">
            <p>23</p>
         </c>
         <c ca="center">
            <p>25.6</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>2.0-2.5</p>
         </c>
         <c ca="center">
            <p>9</p>
         </c>
         <c ca="center">
            <p>10.0</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>2.5-3.0</p>
         </c>
         <c ca="center">
            <p>4</p>
         </c>
         <c ca="center">
            <p>4.4</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>&#8805;3.0</p>
         </c>
         <c ca="center">
            <p>1</p>
         </c>
         <c ca="center">
            <p>1.1</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Total</p>
         </c>
         <c ca="center">
            <p>90</p>
         </c>
         <c ca="center">
            <p>100</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>*Average insert length = 1.42 kb</p>
   </tblfn></tbl>
<p>To assess the proportion of the library containing full-length cDNA clones, we randomly picked 90 clones and sequenced them from both the 5' and 3' ends. These DNA sequences were analyzed against the <it>Arabidopsis </it>genome database using the blastx program. Of the 90 clones, 79 contained sequences similar to those of <it>Arabidopsis </it>genes (e_value &lt; e-10), while the insert sequences of the other 11 clones did not show any similarity. The 5'- and 3'-sequences of the 79 clones were aligned with the homologous <it>Arabidopsis </it>cDNAs. The 5'-sequences of 62 clones contained ATG start codons at similar positions to those in the corresponding <it>Arabidopsis </it>homologs, and 59 of these also possessed stop codons at the equivalent positions. Therefore, we estimated that approximately 75% of the clones in the <it>S. hermonthica </it>library encode full-length cDNAs. Among the 59 sequenced full-length clones, the average lengths of the 5'- and 3'-untranslated regions (UTRs) were 127 bp and 203 bp, respectively, and the longest 5'- and 3'-UTRs were 486 bp and 480 bp, respectively.</p>
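<p>The full-length estimate can be reproduced from the counts given above. The sketch below assumes that the 59 clones with stop codons at the equivalent positions are the clones judged full-length at both ends, and that the denominator is the 79 clones with an <it>Arabidopsis </it>homolog; the standard error is an illustrative addition based on a simple binomial model, not a value reported in this study.</p>

```python
import math

clones_with_arabidopsis_hit = 79  # clones that could be aligned to a homolog
clones_full_length = 59           # assumed: start and stop codons both in place

fraction_full_length = clones_full_length / clones_with_arabidopsis_hit
print(f"{fraction_full_length:.1%}")  # 74.7%

# Rough binomial standard error of the proportion (illustrative assumption).
standard_error = math.sqrt(
    fraction_full_length * (1 - fraction_full_length) / clones_with_arabidopsis_hit
)
print(f"{standard_error:.3f}")
```

<p>With a sample of this size the estimate carries an uncertainty of several percentage points, so "approximately 75%" is an appropriate level of precision.</p>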
</sec>
</sec>
<sec>
<st>
<p>EST sequencing and statistical analysis</p>
</st>
<p>Next, we sequenced both the 5'- and 3'-ends of 37,710 clones from the <it>S. hermonthica </it>full-length enriched cDNA library. The sequence chromatograms were analyzed using the EST2uni package <abbrgrp>
<abbr bid="B15">15</abbr>
</abbrgrp>, which is an automated analysis tool for the clean-up, clustering, and annotation of EST sequences. Among the 75,330 raw sequence reads, 67,814 were of good quality and were deposited in the DNA Databank of Japan [DDBJ: <ext-link ext-link-id="FS438984-FS506797" ext-link-type="ddbj">FS438984-FS506797</ext-link>]. The sequences were clustered into 17,137 non-redundant unigenes (10,319 contigs and 6,818 singletons) (Table <tblr tid="T3">3</tblr>). The average GC content among the unigene sequences is 44.5%. The lengths of the unigenes are distributed between 101 and 3,051 bp, and most of them (11,546 unigenes) have sequence lengths between 601 and 900 bp (Additional file <supplr sid="S1">1</supplr>), with an average of 775.3 bp. Most (84%) of the unigenes consist of fewer than 6 ESTs (Additional file <supplr sid="S1">1</supplr>), suggesting that the redundancy rate is relatively low in this normalized library.</p>
<tbl id="T3"><title><p>Table 3</p></title><caption><p>Summary of the <it>S. hermonthica </it>EST sequence analysis</p></caption><tblbdy cols="2">
      <r>
         <c ca="left">
            <p>
               <b>Group</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Records</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="2">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Number of independent clones</p>
         </c>
         <c ca="left">
            <p>37,710</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Number of raw sequences</p>
         </c>
         <c ca="left">
            <p>75,330</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Number of high quality sequences</p>
         </c>
         <c ca="left">
            <p>67,814</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Number of unigenes</p>
         </c>
         <c ca="left">
            <p>17,137</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>singletons</p>
         </c>
         <c ca="left">
            <p>6,818</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>contigs</p>
         </c>
         <c ca="left">
            <p>10,319</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Average unigene length</p>
         </c>
         <c ca="left">
            <p>775.3 bp</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Minimum unigene length</p>
         </c>
         <c ca="left">
            <p>101 bp</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Maximum unigene length</p>
         </c>
         <c ca="left">
            <p>3,051 bp</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Average number of ESTs per unigene</p>
         </c>
         <c ca="left">
            <p>2.9</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Maximum number of ESTs per contig</p>
         </c>
         <c ca="left">
            <p>106</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Number of superunigenes</p>
         </c>
         <c ca="left">
            <p>12,272</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>with more than one unigene</p>
         </c>
         <c ca="left">
            <p>2,203</p>
         </c>
      </r>
      <r>
         <c indent="1" ca="left">
            <p>with one unigene</p>
         </c>
         <c ca="left">
            <p>10,069</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Number of putative SNPs (pSNPs)</p>
         </c>
         <c ca="left">
            <p>9,299</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Number of putative SSRs (pSSRs)</p>
         </c>
         <c ca="left">
            <p>1,445</p>
         </c>
      </r>
   </tblbdy></tbl>
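<p>The headline counts in Table 3 can be cross-checked with a few lines of arithmetic. This is a minimal sketch using only the values from the table; the variable names are our own.</p>

```python
raw_reads = 75330           # number of raw sequences
high_quality_reads = 67814  # number of high quality sequences
contigs = 10319
singletons = 6818
unigenes = 17137

# Every unigene is either a contig or a singleton, so the counts must add up.
assert contigs + singletons == unigenes

# Fraction of raw reads retained after quality filtering.
retained_fraction = high_quality_reads / raw_reads
print(f"{retained_fraction:.1%}")  # 90.0%
```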
<suppl id="S1">
<title>
<p>Additional file 1</p>
</title>
<text>
<p>
<b>Distribution of unigene lengths and EST numbers per unigene</b>. (A) Distribution of unigene lengths in the entire <it>S. hermonthica </it>unigene dataset. (B) Distribution of EST numbers per unigene.</p>
</text>
<file name="1471-2229-10-55-S1.PDF">
   <p>Click here for file</p>
</file>
</suppl>
</sec>
<sec>
<st>
<p>Functional annotation of the unigene sequences</p>
</st>
<p>For the functional annotation of the 17,137 unigene sequences, we carried out a blastx analysis against the UniRef90 database <abbrgrp>
<abbr bid="B16">16</abbr>
<abbr bid="B17">17</abbr>
</abbrgrp>. About 79% of the <it>S. hermonthica </it>unigenes were annotated as homologs of known proteins. For further functional annotations of the structural domains, the Pfam database <abbrgrp>
<abbr bid="B18">18</abbr>
</abbrgrp> was searched using the HMMER program (ver. 2.3.2, <abbrgrp>
<abbr bid="B19">19</abbr>
<abbr bid="B20">20</abbr>
</abbrgrp>), and 31% (5,367) of the unigenes contained Pfam hits. Then the <it>S. hermonthica </it>unigenes were classified into Gene Ontology (GO) groups based on their similarities with the corresponding <it>Arabidopsis </it>genes (Fig. <figr fid="F2">2</figr>). In the classification of genes according to their cellular components, we found that 16% of the unigenes encode putative membrane proteins and 10% encode putative plastid proteins. In the classification of molecular functions, 12% were assigned to catalytic activity. These percentages are similar to those in <it>Arabidopsis </it>
<abbrgrp>
<abbr bid="B21">21</abbr>
</abbrgrp>, indicating that there was no functional bias among the predicted proteins encoded in the <it>S. hermonthica </it>library.</p>
<fig id="F2"><title><p>Figure 2</p></title><caption><p>Gene ontology analysis of <it>S. hermonthica </it>unigene-encoding products</p></caption><text>
   <p><b>Gene ontology analysis of <it>S. hermonthica </it>unigene-encoding products</b>. The <it>S. hermonthica </it>unigenes were classified according to their predicted biological functions (A), molecular functions (B), and cellular components (C). The numbers in each category were compared with those in <it>A. thaliana</it>.</p>
</text><graphic file="1471-2229-10-55-2" hint_layout="single"/></fig>
</sec>
<sec>
<st>
<p>Comparative analysis with other plant genes</p>
</st>
<p>The <it>S. hermonthica </it>unigenes were compared with genes in other plant genomes, including <it>A. thaliana</it>, poplar (<it>Populus trichocarpa</it>), grape (<it>Vitis vinifera</it>), soybean (<it>Glycine max</it>), rice (<it>Oryza sativa</it>), sorghum (<it>Sorghum bicolor</it>), a moss (<it>Physcomitrella patens</it>), and an alga (<it>Chlamydomonas reinhardtii</it>) <abbrgrp>
<abbr bid="B22">22</abbr>
<abbr bid="B23">23</abbr>
<abbr bid="B24">24</abbr>
<abbr bid="B25">25</abbr>
<abbr bid="B26">26</abbr>
</abbrgrp>. Seventy-seven to seventy-nine percent of the <it>S. hermonthica </it>unigenes showed similarities with genes from other dicotyledonous plants (<it>Arabidopsis</it>, grape, soybean, and poplar), as detected by blastx (e_value &lt; e-10). Approximately 75% of the unigenes have homologs in monocotyledonous plants (rice and sorghum), and approximately 65% and 38% showed blastx hits in the <it>P. patens </it>and <it>C. reinhardtii </it>databases, respectively. These lower percentages of blast hits are consistent with the greater evolutionary distances from those organisms.</p>
<p>We plotted the percentages of <it>S. hermonthica </it>unigenes against levels of amino acid sequence identity with homologs in the other plant genomes (Fig. <figr fid="F3">3</figr>). Larger percentages of <it>S. hermonthica </it>unigenes showed higher levels of identity with poplar and grape sequences than with sequences from the other plant species. The identity scores corresponding to half the population of <it>S. hermonthica </it>unigenes were 0.68 for grape and poplar, 0.65 for <it>Arabidopsis</it>, 0.62 for rice, and 0.56 for <it>P. patens</it>. These numbers roughly reflect the evolutionary distances between <it>S. hermonthica </it>and these species.</p>
<fig id="F3"><title><p>Figure 3</p></title><caption><p>Cumulative count curves of identity between <it>S. hermonthica </it>unigenes and those from other plant species</p></caption><text>
   <p><b>Cumulative count curves of identity between <it>S. hermonthica </it>unigenes and those from other plant species</b>. All the sequenced <it>S. hermonthica </it>unigenes were used in blastx or tblastx searches against the peptide databases of the indicated plant species. The curves represent the percentages of <it>S. hermonthica </it>unigenes that showed higher levels of identity than the values on the <it>x</it>-axis.</p>
</text><graphic file="1471-2229-10-55-3" hint_layout="single"/></fig>
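<p>The cumulative curves in Figure 3 report, for each identity value on the <it>x</it>-axis, the share of all unigenes whose best hit exceeds that value. A minimal sketch of this calculation follows; the function name and the toy identity values are hypothetical and are not taken from the dataset.</p>

```python
def cumulative_fraction_above(identities, thresholds, total_unigenes):
    """For each threshold, return the fraction of all unigenes whose
    best-hit identity is strictly greater than that threshold."""
    return [
        sum(1 for value in identities if value > threshold) / total_unigenes
        for threshold in thresholds
    ]


# Toy input: best-hit identities for four unigenes; a fifth unigene has no hit.
toy_identities = [0.9, 0.7, 0.5, 0.3]
toy_thresholds = [0.0, 0.4, 0.6, 0.8]
curve = cumulative_fraction_above(toy_identities, toy_thresholds, total_unigenes=5)
print(curve)  # [0.8, 0.6, 0.4, 0.2]
```

<p>Dividing by the total number of unigenes rather than by the number of hits means that unigenes without a homolog lower the whole curve, which is why the curves start below 100%. The identity at which a curve crosses 0.5 corresponds to the half-population scores quoted above.</p>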
<p>Large scale EST sequence datasets have previously been reported for <it>Triphysaria versicolor </it>
<abbrgrp>
<abbr bid="B8">8</abbr>
</abbrgrp> and <it>Triphysaria pusilla </it>
<abbrgrp>
<abbr bid="B27">27</abbr>
</abbrgrp>, which are hemiparasitic plants belonging to the Orobanchaceae. The assembled EST sequences are available at the plantGDB web site <abbrgrp>
<abbr bid="B28">28</abbr>
</abbrgrp>. Although the genus <it>Triphysaria </it>is closely related taxonomically to <it>S. hermonthica</it>, only 74% of the <it>S. hermonthica </it>unigenes showed similarity to <it>Triphysaria </it>sequences (including both <it>T. pusilla </it>and <it>T. versicolor</it>), when analyzed with the tblastx program (Table <tblr tid="T4">4</tblr>). This is lower than the percentages found for the other dicotyledonous plants, probably because the <it>Triphysaria </it>EST datasets are not saturated.</p>
<tbl id="T4"><title><p>Table 4</p></title><caption><p>Summary of blast search results using <it>S. hermonthica </it>unigenes.</p></caption><tblbdy cols="4">
      <r>
         <c ca="left">
            <p>
               <b>Species</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>DB version</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Number of hits</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>% Unigenes</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="4">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Populus trichocarpa</it>
            </p>
         </c>
         <c ca="left">
            <p>JGI ver1.1</p>
         </c>
         <c ca="center">
            <p>13,573</p>
         </c>
         <c ca="center">
            <p>79.2</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Glycine max</it>
            </p>
         </c>
         <c ca="left">
            <p>JGI ver1.1</p>
         </c>
         <c ca="center">
            <p>12,716</p>
         </c>
         <c ca="center">
            <p>79.0</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Vitis vinifera</it>
            </p>
         </c>
         <c ca="left">
            <p>ver1</p>
         </c>
         <c ca="center">
            <p>13,345</p>
         </c>
         <c ca="center">
            <p>77.9</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Arabidopsis thaliana</it>
            </p>
         </c>
         <c ca="left">
            <p>TAIR8</p>
         </c>
         <c ca="center">
            <p>13,255</p>
         </c>
         <c ca="center">
            <p>77.3</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Oryza sativa</it>
            </p>
         </c>
         <c ca="left">
            <p>TIGR ver6</p>
         </c>
         <c ca="center">
            <p>12,841</p>
         </c>
         <c ca="center">
            <p>74.9</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Sorghum bicolor</it>
            </p>
         </c>
         <c ca="left">
            <p>JGI ver1.1</p>
         </c>
         <c ca="center">
            <p>12,803</p>
         </c>
         <c ca="center">
            <p>74.7</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Triphysaria pusilla</it>
            </p>
         </c>
         <c ca="left">
            <p>EST</p>
         </c>
         <c ca="center">
            <p>12,716</p>
         </c>
         <c ca="center">
            <p>74.2</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Physcomitrella patens</it>
            </p>
         </c>
         <c ca="left">
            <p>JGI ver1.1</p>
         </c>
         <c ca="center">
            <p>11,140</p>
         </c>
         <c ca="center">
            <p>65.0</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Chlamydomonas reinhardtii</it>
            </p>
         </c>
         <c ca="left">
            <p>JGI ver1.1</p>
         </c>
         <c ca="center">
            <p>6,477</p>
         </c>
         <c ca="center">
            <p>37.8</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>No hit</p>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>2,389</p>
         </c>
         <c ca="center">
            <p>13.9</p>
         </c>
      </r>
   </tblbdy></tbl>
<p>The conservation of the genes between <it>S. hermonthica </it>and <it>Arabidopsis</it>, grape, poplar, or <it>Triphysaria </it>spp. is shown in a Venn diagram (Fig. <figr fid="F4">4</figr>). Among the 17,137 unigenes, 11,711 (68%) are conserved among all five groups. Only 19, 36, and 58 of the <it>S. hermonthica </it>unigenes are conserved specifically in <it>Arabidopsis</it>, grape, and poplar, respectively. Interestingly, we found that 662 (3.9%) of the <it>S. hermonthica </it>unigenes are conserved in <it>Triphysaria </it>spp. but not in <it>Arabidopsis</it>, grape, or poplar.</p>
<fig id="F4"><title><p>Figure 4</p></title><caption><p>Homologous gene groups between <it>S. hermonthica </it>and four other plant species</p></caption><text>
   <p><b>Homologous gene groups between <it>S. hermonthica </it>and four other plant species</b>. The numbers of <it>S. hermonthica </it>unigenes that have homologs in the indicated plant species are represented by a Venn diagram. A: <it>A. thaliana</it>, G: <it>V. vinifera</it>, P: <it>P. trichocarpa</it>, T: <it>T. pusilla </it>or <it>T. versicolor</it>, and S: <it>S. hermonthica</it>.</p>
</text><graphic file="1471-2229-10-55-4" hint_layout="single"/></fig>
<p>Of these 662 sequences, 73 show similarities to sequences in other databases such as rice, sorghum, soybean, <it>Physcomitrella</it>, UniRef90 or nr (the non-redundant peptide database from NCBI). We found no other homologs for the remaining 589 unigenes (Additional file <supplr sid="S2">2</supplr>). Since <it>T. pusilla </it>and <it>T. versicolor </it>are hemiparasitic plants, these 589 might include genes specific to parasitic plants. The ongoing project to sequence the genome of <it>Mimulus </it>spp. may help to narrow down the number of candidate genes that are involved in parasitism, because <it>Mimulus </it>spp. are non-parasitic members of the family Scrophulariaceae, which is taxonomically close to Orobanchaceae. The 2,389 unigenes (14%) that did not show significant hits with any known peptide sequences in the tested databases (including nr) are also listed in Additional file <supplr sid="S2">2</supplr>. These unigenes may include sequences that are specific to <it>Striga</it>.</p>
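<p>The partition behind such a Venn diagram can be sketched with ordinary set operations. The snippet below is a minimal illustration and not the pipeline used in this study: the unigene identifiers and the per-database hit sets are hypothetical placeholders, and only the closing arithmetic uses counts reported in the text.</p>

```python
# Toy sketch: which unigenes have homologs only in Triphysaria?
# All identifiers and hit sets below are hypothetical placeholders.
hits = {
    "Arabidopsis": {"u1", "u2", "u3"},
    "grape": {"u1", "u2", "u4"},
    "poplar": {"u1", "u2"},
    "Triphysaria": {"u1", "u5", "u6"},
}


def specific_to(target, hits_by_db):
    """Return unigenes with a hit in `target` and in none of the other sets."""
    others = set()
    for name, ids in hits_by_db.items():
        if name != target:
            others |= ids
    return hits_by_db[target] - others


def shared_by_all(hits_by_db):
    """Return unigenes that have a hit in every database."""
    return set.intersection(*hits_by_db.values())


triphysaria_only = specific_to("Triphysaria", hits)
core = shared_by_all(hits)
print(sorted(triphysaria_only), sorted(core))

# Arithmetic on the counts reported in the text.
total_unigenes = 17137
triphysaria_specific = 662
with_other_hits = 73
remaining = triphysaria_specific - with_other_hits
percent_specific = round(100 * triphysaria_specific / total_unigenes, 1)
print(remaining, percent_specific)  # 589 3.9
```

<p>Real hit sets would come from the similarity searches against each database; the set logic stays the same however many databases are compared.</p>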
<suppl id="S2">
<title>
<p>Additional file 2</p>
</title>
<text>
<p>
<b>Lists of <it>S. hermonthica </it>unigenes that are potentially specific to parasitic plants</b>. Sheet 1- The list of <it>S. hermonthica </it>unigenes that have homologs in <it>T. pusilla </it>or <it>T. versicolor </it>but not in other species databases. Sheet 2- The list of <it>S. hermonthica </it>unigenes that do not have homologs in other known sequences.</p>
</text>
<file name="1471-2229-10-55-S2.XLS">
   <p>Click here for file</p>
</file>
</suppl>
</sec>
<sec>
<st>
<p>Genetic diversity of the <it>S. hermonthica </it>sequences</p>
</st>
<p>
<it>S. hermonthica </it>is an obligate outcrossing plant with high levels of morphological and genetic variation <abbrgrp>
<abbr bid="B29">29</abbr>
</abbrgrp>. The EST2uni program detected 9,299 putative single nucleotide polymorphisms (SNPs) among the <it>S. hermonthica </it>unigenes. To exclude the misidentification of sequencing errors as SNPs, only polymorphisms confirmed by at least 2 independent sequences were counted, although there is still the possibility that those polymorphisms occurred during cDNA synthesis. The average frequency of SNPs in the unigene sequences is 0.67%, or approximately 1 SNP per 1.5 kbp. Although these SNPs will need to be confirmed, these data will be useful for developing EST-SNP markers for <it>S. hermonthica </it>
<abbrgrp>
<abbr bid="B30">30</abbr>
</abbrgrp>.</p>
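<p>The support filter described above can be sketched as follows. This is an illustration under stated assumptions, not the EST2uni implementation: it assumes that "confirmed by at least 2 independent sequences" means every reported allele must be carried by at least two reads, and the function name and the <it>min_support </it>parameter are hypothetical.</p>

```python
from collections import Counter

VALID_BASES = set("ACGT")


def putative_snp(column, min_support=2):
    """Return the alleles of a putative SNP, or None.

    `column` holds the bases that the aligned ESTs of one contig show at a
    single position. An allele is kept only if at least `min_support`
    reads carry it; a SNP is called when two or more alleles survive.
    Gaps and ambiguous bases are ignored.
    """
    counts = Counter(
        base.upper() for base in column if base.upper() in VALID_BASES
    )
    supported = sorted(b for b, n in counts.items() if n >= min_support)
    return supported if len(supported) >= 2 else None


print(putative_snp("AAAG"))    # None: the G is seen in a single read
print(putative_snp("AAGG-N"))  # ['A', 'G']
```

<p>A filter of this kind removes singleton sequencing errors, but it cannot exclude changes introduced during cDNA synthesis that are shared by several reads.</p>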
<p>We found 1,445 di-, tri- or tetra-nucleotide microsatellites (or SSRs) among the <it>S. hermonthica </it>unigenes. The most frequent of these are the tri-nucleotide repeats (Additional file <supplr sid="S3">3</supplr>), which is in agreement with previous studies of other plant species <abbrgrp>
<abbr bid="B31">31</abbr>
<abbr bid="B32">32</abbr>
<abbr bid="B33">33</abbr>
</abbrgrp>. The most frequent individual microsatellite repeat is AG (including CT, GA, and TC) (283, 19.6%) and the second most frequent is AC (including TG, CA, and GT) (218, 15.1%). The most frequent tri-nucleotide repeat is ATC (including TCA and CAT) (157, 11.0%) (Additional file <supplr sid="S4">4</supplr>).</p>
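<p>Grouping equivalent repeat units, as in the counts above, amounts to choosing one canonical form per motif class. The sketch below is an illustration only: the function names are hypothetical, and whether the reverse-complement strand is merged (as the AG and AC classes suggest) or only cyclic rotations are merged (as the ATC class suggests) is treated here as a switchable assumption.</p>

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rotations(motif):
    """All cyclic rotations of a repeat unit, e.g. ATC, TCA, CAT."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def canonical_motif(motif, both_strands=True):
    """Alphabetically smallest member of the motif's equivalence class."""
    motif = motif.upper()
    candidates = rotations(motif)
    if both_strands:
        reverse_complement = motif.translate(COMPLEMENT)[::-1]
        candidates += rotations(reverse_complement)
    return min(candidates)


# Hypothetical list of detected repeat units.
detected = ["AG", "GA", "CT", "TC", "AC", "TG", "TCA", "CAT"]
counts = Counter(canonical_motif(m) for m in detected)
print(counts["AG"], counts["AC"], counts["ATC"])  # 4 2 2
```

<p>Calling <it>canonical_motif </it>with <it>both_strands=False </it>restricts the grouping to rotations of the unit as read on one strand.</p>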
<suppl id="S3">
<title>
<p>Additional file 3</p>
</title>
<text>
<p>
<b>Distribution of SSR patterns detected in <it>S. hermonthica </it>ESTs.</b>
</p>
</text>
<file name="1471-2229-10-55-S3.PDF">
   <p>Click here for file</p>
</file>
</suppl>
<suppl id="S4">
<title>
<p>Additional file 4</p>
</title>
<text>
<p>
<b>Distribution of SSR motifs detected in <it>S. hermonthica </it>ESTs.</b>
</p>
</text>
<file name="1471-2229-10-55-S4.PDF">
   <p>Click here for file</p>
</file>
</suppl>
<p>The EST-SSR sequences are good candidates for genetic markers, which can be used for molecular diagnosis, for biotyping weeds, and for investigating the genetic diversity and population structures of <it>S. hermonthica</it>. To investigate whether the SSRs that we identified can be used as such markers, we designed primers using sequences flanking the putative SSRs and looked for polymorphisms by PCR. First, we pooled DNA samples extracted from the leaves of several plants in the same field and used the DNA pools as PCR templates. Of the 64 primer sets tested, 44 successfully amplified DNA bands. Of these 44, however, 26 (59%) produced smears or multiple bands that could not be counted, and only 18 (41%) amplified clear, separate bands (Additional file <supplr sid="S5">5</supplr>). The smeared bands may indicate heterozygosity and genetic diversity among the individual plants harvested from the same field. Therefore, we tested the individual plants for polymorphisms. Several markers that showed smear patterns from the pooled DNA templates actually amplified clear polymorphic bands from individual plants in the same population (Additional file <supplr sid="S6">6</supplr>). These data support the view that <it>S. hermonthica </it>is a highly adaptable weed that has maintained a high degree of genetic variation and plasticity to survive in various ecosystems <abbrgrp>
<abbr bid="B34">34</abbr>
</abbrgrp>.</p>
<suppl id="S5">
<title>
<p>Additional file 5</p>
</title>
<text>
<p>
<b>SSR information</b>. Sheet 1- The list of SSRs analyzed in this study, with SSR ID, primer sequences, and PCR results. The yellow colored lines indicate the markers used in this study.</p>
</text>
<file name="1471-2229-10-55-S5.XLS">
   <p>Click here for file</p>
</file>
</suppl>
<suppl id="S6">
<title>
<p>Additional file 6</p>
</title>
<text>
<p>
<b>Examples of PCR results from the amplification of SSR-containing regions in <it>S. hermonthica</it>
</b>. (A) Agarose gel images of PCR results using the indicated primer sets and pooled genomic DNAs from the populations listed in Fig. <figr fid="F5">5</figr>. The population numbers correspond to the numbers in Fig. <figr fid="F5">5A</figr>. (B) An agarose gel image showing PCR results using the SSR8 primer set and genomic DNAs extracted from individual plants from the population in Kenya.</p>
</text>
<file name="1471-2229-10-55-S6.PDF">
   <p>Click here for file</p>
</file>
</suppl>
</sec>
<sec>
<st>
<p>Genetic distances among <it>S. hermonthica </it>populations with different hosts</p>
</st>
<p>Although individual <it>S. hermonthica </it>plants possess highly diversified genomes, 18 of the primer sets we tested showed countable band patterns when using pooled DNA templates. Using those primer sets, we investigated the relationships between different <it>S. hermonthica </it>populations from 6 fields growing sorghum, maize, or pearl millet in various locations in Sudan or Kenya <abbrgrp>
<abbr bid="B35">35</abbr>
</abbrgrp>. Of the 18 primer sets, 10 showed clear polymorphisms for different <it>S. hermonthica </it>populations (Table <tblr tid="T5">5</tblr>, Additional file <supplr sid="S5">5</supplr>). The analysis of PCR products was carried out using MultiNa<sup>&#174; </sup>(Shimadzu, Japan), a microchip electrophoresis system that permits the separation of small fragments and that can detect 5 bp differences. The average polymorphism information content (PIC) was 0.463, which confirms that the SSR markers used in this study were highly informative. The lowest PIC value was 0.305 for SSR57, and the highest was 0.545 for SSR26 (Table <tblr tid="T5">5</tblr>). The analyzed loci included 3 di-, 3 tri-, and 4 tetra-nucleotide repeats. A total of 27 alleles were detected, with an average number of alleles per locus of 2.7. The genetic diversity among the six populations was revealed by the gene diversity values, which ranged from 0.375 to 0.625, with an average of 0.549. These results suggest a high level of diversity among the surveyed populations, as was expected for this obligate outcrossing plant <abbrgrp>
<abbr bid="B36">36</abbr>
<abbr bid="B37">37</abbr>
<abbr bid="B38">38</abbr>
</abbrgrp>.</p>
<tbl id="T5"><title><p>Table 5</p></title><caption><p>Genetic diversity among <it>S. hermonthica </it>populations collected from various locations and host plants.</p></caption><tblbdy cols="7">
      <r>
         <c ca="left">
            <p>
               <b>SSR ID</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Primer name</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Repeat unit</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>No of repeats</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>No of alleles</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Gene diversity</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>PIC</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="7">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ShSSR_ShContig8678_1</p>
         </c>
         <c ca="center">
            <p>SSR17</p>
         </c>
         <c ca="center">
            <p>AC</p>
         </c>
         <c ca="center">
            <p>18</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>0.611</p>
         </c>
         <c ca="center">
            <p>0.535</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ShSSR_ShContig6892_1</p>
         </c>
         <c ca="center">
            <p>SSR26</p>
         </c>
         <c ca="center">
            <p>AG</p>
         </c>
         <c ca="center">
            <p>15</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>0.625</p>
         </c>
         <c ca="center">
            <p>0.545</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ShSSR_ShSHAA-aai51d05.b1_c_s_1</p>
         </c>
         <c ca="center">
            <p>SSR33</p>
         </c>
         <c ca="center">
            <p>AG</p>
         </c>
         <c ca="center">
            <p>13</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>0.611</p>
         </c>
         <c ca="center">
            <p>0.535</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ShSSR_ShContig9253_1</p>
         </c>
         <c ca="center">
            <p>SSR43</p>
         </c>
         <c ca="center">
            <p>CCG</p>
         </c>
         <c ca="center">
            <p>10</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>0.486</p>
         </c>
         <c ca="center">
            <p>0.368</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ShSSR_ShContig5481_1</p>
         </c>
         <c ca="center">
            <p>SSR50</p>
         </c>
         <c ca="center">
            <p>AAG</p>
         </c>
         <c ca="center">
            <p>9</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>0.569</p>
         </c>
         <c ca="center">
            <p>0.477</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ShSSR_ShContig5198_1</p>
         </c>
         <c ca="center">
            <p>SSR53</p>
         </c>
         <c ca="center">
            <p>ACC</p>
         </c>
         <c ca="center">
            <p>8</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>0.486</p>
         </c>
         <c ca="center">
            <p>0.368</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ShSSR_ShContig5533_1</p>
         </c>
         <c ca="center">
            <p>SSR57</p>
         </c>
         <c ca="center">
            <p>AACT</p>
         </c>
         <c ca="center">
            <p>6</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>0.375</p>
         </c>
         <c ca="center">
            <p>0.305</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ShSSR_ShContig10128_1</p>
         </c>
         <c ca="center">
            <p>SSR58</p>
         </c>
         <c ca="center">
            <p>AAAC</p>
         </c>
         <c ca="center">
            <p>7</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>0.569</p>
         </c>
         <c ca="center">
            <p>0.505</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ShSSR_ShSHAA-aab89e01.b1_c_s_1</p>
         </c>
         <c ca="center">
            <p>SSR59</p>
         </c>
         <c ca="center">
            <p>AAAC</p>
         </c>
         <c ca="center">
            <p>6</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>0.542</p>
         </c>
         <c ca="center">
            <p>0.460</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ShSSR_ShContig9110_1</p>
         </c>
         <c ca="center">
            <p>SSR63</p>
         </c>
         <c ca="center">
            <p>AAAG</p>
         </c>
         <c ca="center">
            <p>5</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>0.611</p>
         </c>
         <c ca="center">
            <p>0.535</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>Average</p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>2.700</p>
         </c>
         <c ca="center">
            <p>0.549</p>
         </c>
         <c ca="center">
            <p>0.463</p>
         </c>
      </r>
   </tblbdy></tbl>
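<p>The summary row of Table 5 can be re-derived from the per-locus rows. The short check below simply transcribes the allele counts, gene diversity values, and PIC values from the table and recomputes the totals and means; the variable names are arbitrary.</p>

```python
# Values transcribed from Table 5: (primer, alleles, gene diversity, PIC).
TABLE5 = [
    ("SSR17", 3, 0.611, 0.535),
    ("SSR26", 3, 0.625, 0.545),
    ("SSR33", 3, 0.611, 0.535),
    ("SSR43", 2, 0.486, 0.368),
    ("SSR50", 3, 0.569, 0.477),
    ("SSR53", 2, 0.486, 0.368),
    ("SSR57", 2, 0.375, 0.305),
    ("SSR58", 3, 0.569, 0.505),
    ("SSR59", 3, 0.542, 0.460),
    ("SSR63", 3, 0.611, 0.535),
]

n_loci = len(TABLE5)
total_alleles = sum(row[1] for row in TABLE5)
mean_alleles = total_alleles / n_loci
mean_diversity = sum(row[2] for row in TABLE5) / n_loci
mean_pic = sum(row[3] for row in TABLE5) / n_loci
lowest_pic = min(TABLE5, key=lambda row: row[3])[0]
highest_pic = max(TABLE5, key=lambda row: row[3])[0]

print(n_loci, total_alleles, mean_alleles)  # 10 27 2.7
print(f"mean gene diversity {mean_diversity:.4f}, mean PIC {mean_pic:.4f}")
print(lowest_pic, highest_pic)  # SSR57 SSR26
```

<p>The recomputed means agree with the reported averages of 0.549 and 0.463 to within rounding.</p>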
<p>We also looked for correlations between host species and <it>S. hermonthica </it>biotypes, using the Unweighted Pair Group Method with Arithmetic mean (UPGMA) clustering analysis. The populations from El Obeid (host: sorghum), Dirweesh (host: sorghum), and Kenya (host: maize) clustered in one group, while the population from Elkaraiba (host: sorghum) was in a distant branch of the same group. Those from Tandalti (host: pearl millet) and Agadi (host: maize) formed another cluster (Fig. <figr fid="F5">5</figr>). Thus, we did not detect any correlations between genetic distance and host specificity in this study. This result is consistent with previous epidemiological reports <abbrgrp>
<abbr bid="B35">35</abbr>
<abbr bid="B38">38</abbr>
<abbr bid="B39">39</abbr>
<abbr bid="B40">40</abbr>
</abbrgrp>. In summary, our results suggest that the SSRs found in our study could be useful tools for further investigations of genetic diversity in <it>S. hermonthica</it>.</p>
<fig id="F5"><title><p>Figure 5</p></title><caption><p>Clustering analysis of <it>S. hermonthica</it> populations using SSR polymorphisms</p></caption><text>
   <p><b>Clustering analysis of <it>S. hermonthica</it> populations using SSR polymorphisms</b>. A. <it>S. hermonthica </it>populations used in this study. B. A UPGMA dendrogram constructed using polymorphisms at 10 SSR loci with a total of 27 alleles. Bootstrap values are indicated at supporting nodes when the values are greater than 50.</p>
</text><graphic file="1471-2229-10-55-5" hint_layout="single"/></fig>
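<p>For readers unfamiliar with UPGMA, the clustering rule can be sketched in a few lines. The code below is a generic, minimal implementation and not the PowerMarker procedure used for Fig. 5: the population labels and the distance matrix are hypothetical, no shared-allele distance is computed, and no bootstrap support is estimated.</p>

```python
def upgma(names, dist):
    """Cluster `names` by UPGMA; `dist[i][j]` is a symmetric distance matrix.

    Returns a nested tuple that describes the tree topology.
    """
    # Active clusters: id -> (subtree, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances keyed by (smaller id, larger id).
    d = {}
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            d[(i, j)] = dist[i][j]
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the closest pair of active clusters.
        a, b = min(d, key=d.get)
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # UPGMA rule: the distance to the merged cluster is the
        # size-weighted mean of the distances to its two parts.
        new_d = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (
                (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
            )
        d = {pair: v for pair, v in d.items()
             if a not in pair and b not in pair}
        d.update(new_d)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    tree, _size = next(iter(clusters.values()))
    return tree


# Hypothetical populations and distances, for illustration only.
names = ["pop1", "pop2", "pop3", "pop4"]
dist = [
    [0.0, 0.2, 0.6, 0.7],
    [0.2, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.3],
    [0.7, 0.8, 0.3, 0.0],
]
tree = upgma(names, dist)
print(tree)  # (('pop1', 'pop2'), ('pop3', 'pop4'))
```

<p>Because UPGMA averages over all members of a cluster, a population that is distant from every other one joins its group last, which is how a distant branch within a group arises in a dendrogram.</p>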
</sec>
<sec>
<st>
<p>Web-based database</p>
</st>
<p>The results of the sequencing and analysis of the <it>S. hermonthica </it>ESTs are freely available online from our web-based database <abbrgrp>
<abbr bid="B9">9</abbr>
</abbrgrp>. The web interface was based on the original EST2uni web site <abbrgrp>
<abbr bid="B15">15</abbr>
</abbrgrp>. The database contains features for complex query searches and a blast search. A page for each unigene consists of its sequence, contig images, results of blast similarity searches, lists of detected SSRs and SNPs, and GO categorizations. In addition, the homologs of each unigene are linked to outside databases such as The Arabidopsis Information Resource (TAIR) <abbrgrp>
<abbr bid="B41">41</abbr>
</abbrgrp>. This web-based database will be a powerful tool for the detailed analysis of <it>S. hermonthica </it>genes.</p>
</sec>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>This paper provides large-scale EST information about <it>S. hermonthica</it>, which can be used in studies of parasitic plants, plant-plant interactions, weed management, and plant evolution. Comparative analyses between <it>S. hermonthica </it>and other plant genomes should allow us to identify genes responsible for plant parasitism. These genes are of particular interest as potential targets for future pest management strategies against noxious parasitic weeds. Our analysis also highlights the intra-species genetic diversity of <it>S. hermonthica</it>. A more detailed analysis might contribute to future breeding programs to develop resistant crops, since genetic variation in the weed population could be the main factor allowing the quick breakdown of resistance. In summary, our study provides powerful analytical tools for the molecular analysis of the parasitic weed <it>S. hermonthica</it>. Our data will also contribute to the annotation of genes identified by the ongoing genome-scale sequencing of the parasitic genera from Orobanchaceae.</p>
</sec>
<sec>
<st>
<p>Methods</p>
</st>
<sec>
<st>
<p>Plant materials and growth conditions</p>
</st>
<p>
<it>S. hermonthica </it>seeds collected from a sorghum field in 1994 in Kenya were provided by Dr. A. G. Babiker (Univ. of Sudan, Khartoum, Sudan). Rice seeds (<it>Oryza sativa </it>L. subspecies <it>japonica</it>, cultivar Koshihikari) were originally obtained from the National Institute of Agricultural Sciences (NIAS, Tsukuba, Japan). <it>S. hermonthica </it>plants parasitizing rice were grown in rhizotrons as described previously <abbrgrp>
<abbr bid="B42">42</abbr>
</abbrgrp> or in soil (1:1 mixture of vermiculite: clay). For the axenic culture of <it>S. hermonthica</it>, seeds were sterilized with 20% bleach solution (approx. 6% NaOCl) for 5 min and washed thoroughly with sterile water. The sterile seeds were preconditioned on MS medium with 1% sucrose and 0.5% phytagel (Sigma) at 26&#176;C for 7 to 10 days in the dark and germination was stimulated by the exogenous application of 5 &#956;l 1 &#956;M Strigol per plate. Sterile <it>S. hermonthica </it>plants were grown on the same medium at 26&#176;C with a 16-h photoperiod, and the medium was renewed every 3 weeks.</p>
</sec>
<sec>
<st>
<p>Determination of nuclear DNA content</p>
</st>
<p>The nuclear DNA content was analyzed with a flow cytometer (Partec PA, Tokyo, Japan). Soil-grown <it>S. hermonthica </it>(host: rice) leaves were chopped with a razor blade into small pieces and analyzed according to the previously published method <abbrgrp>
<abbr bid="B43">43</abbr>
</abbrgrp>. Leaves of <it>Arabidopsis </it>(ecotype Col-0) were used as the control.</p>
</sec>
<sec>
<st>
<p>RNA extraction</p>
</st>
<p>The <it>S. hermonthica </it>tissues and developmental stages used for RNA extraction are listed in Table <tblr tid="T1">1</tblr>. <it>S. hermonthica </it>RNAs were extracted using a modified cetyl trimethylammonium bromide (CTAB) method. Briefly, plant tissues were ground under liquid nitrogen and suspended in 5 &#215; volumes of CTAB solution (2% CTAB, 2% polyvinylpyrrolidone (PVP), 25 mM ethylenediaminetetraacetic acid (EDTA), 2 M NaCl, 1% beta-mercaptoethanol, 100 mM Tris-HCl (pH 8.0)) and phenol:chloroform (5:1, pH 4.7, Sigma). The mixtures were shaken at 55&#176;C for 5 min. After 10 min centrifugation, the aqueous phase was extracted with an equal volume of phenol:chloroform, and subsequently with chloroform. The RNAs were precipitated by adding 0.25 volumes of 10 M LiCl. The RNA pellet was washed with 70% ethanol and then dissolved in nuclease-free water. Samples were subsequently purified using the PureLink RNA mini kit (Invitrogen) according to the manufacturer's instructions. To obtain mRNA for library construction, total RNAs from each tissue and developmental stage were mixed and purified using an mRNA purification kit (GE) according to the manufacturer's instructions. The quality and quantity of the total RNA and the mRNA were assessed by measurements of OD<sub>230</sub>, OD<sub>260</sub>, and OD<sub>280</sub>, followed by visual checking by electrophoresis.</p>
</sec>
<sec>
<st>
<p>Library construction and EST sequencing</p>
</st>
<p>The construction of the normalized, full-length enriched library was carried out by Evrogen (Russia). The cDNA normalization was conducted using a duplex-specific nuclease (DSN)-based method, and full-length cDNAs were enriched using the SMART&#8482; technology (Clontech). Each cDNA was inserted into the pAL17.3 vector. Sequencing of randomly picked clones was performed at the Genome Center at Washington University using the ABI3730 capillary sequencer.</p>
</sec>
<sec>
<st>
<p>Computational analysis</p>
</st>
<p>The EST sequences were automatically trimmed, clustered and annotated using the EST2uni analysis pipeline <abbrgrp>
<abbr bid="B15">15</abbr>
</abbrgrp>. Sequence assembly was performed using the CAP3 program with the default parameter settings <abbrgrp>
<abbr bid="B44">44</abbr>
</abbrgrp>. BLAST searches were performed with the NCBI BLAST program against the databases shown in Table <tblr tid="T4">4</tblr>. The <it>S. hermonthica </it>online database was constructed based on the EST2uni web program with slight modifications.</p>
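<p>The hit counts reported per database can be obtained by keeping the best hit of each unigene below an E-value threshold. The sketch below is not taken from the EST2uni pipeline: it assumes the widely used 12-column tab-delimited BLAST report, the identifiers and scores in the example are invented, and the threshold of 1e-5 is a placeholder rather than the value used in this study.</p>

```python
def best_hits(tabular_lines, max_evalue=1e-5):
    """Map each query to its best (lowest E-value) subject.

    Expects the 12-column tabular BLAST layout: query id, subject id,
    percent identity, alignment length, mismatches, gap opens, query
    start and end, subject start and end, E-value, bit score.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > max_evalue:
            continue  # not significant under the chosen threshold
        if query not in best or best[query][1] > evalue:
            best[query] = (subject, evalue)
    return best


# Invented example records, for illustration only.
example = [
    "ShContig1\tAT1G01010.1\t85.0\t200\t30\t0\t1\t200\t5\t204\t1e-50\t190",
    "ShContig1\tAT2G02020.1\t60.0\t100\t40\t0\t1\t100\t5\t104\t1e-08\t55",
    "ShContig2\tAT3G03030.1\t40.0\t50\t30\t0\t1\t50\t5\t54\t0.5\t25",
]
hits = best_hits(example)
print(hits)

# Share of unigenes with a hit, using the Arabidopsis row of Table 4.
percent_with_hit = round(100 * 13255 / 17137, 1)
print(percent_with_hit)  # 77.3
```

<p>Counting the keys of the returned dictionary for each database gives the "number of hits" column, and dividing by the 17,137 unigenes gives the percentage column.</p>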
</sec>
<sec>
<st>
<p>SSR markers and genetic diversity analysis</p>
</st>
<p>Genomic DNA was extracted from about 10 g of <it>S. hermonthica </it>seeds using the modified CTAB method described previously <abbrgrp>
<abbr bid="B35">35</abbr>
</abbrgrp>. Primers flanking the microsatellites were designed using the PRIMER 3 program <abbrgrp>
<abbr bid="B45">45</abbr>
</abbrgrp>. The PCRs were performed in 10 &#956;l volumes with one initial denaturation step of 1 min at 95&#176;C, followed by 40 cycles of 15 sec at 94&#176;C, 30 sec at 60&#176;C and 30 sec at 72&#176;C, and a final extension step of 5 min at 72&#176;C. The PCR products were analyzed either by 4% agarose gel electrophoresis (Additional file <supplr sid="S6">6</supplr>) or using the MCE-202 MultiNa Microchip Electrophoresis System for DNA/RNA analysis (Shimadzu, Japan) with the DNA-500 kit (Table <tblr tid="T5">5</tblr> and Fig. <figr fid="F5">5</figr>). The data were analyzed using the PowerMarker program version 3.25 <abbrgrp>
<abbr bid="B46">46</abbr>
</abbrgrp>, and the genetic diversity was estimated based on allelic numbers and the gene diversity value:</p>
<p>
<display-formula>
<graphic file="1471-2229-10-55-i1.gif"/>
</display-formula>
</p>
<p>where <it>n </it>is the number of populations sampled, <it>p</it>
<sub>
<it>lu </it>
</sub>is the frequency of <it>u</it>th allele at the <it>l</it>th locus, and <it>f </it>is the inbreeding coefficient (association between alleles) at the <it>l</it>th locus. The Polymorphism Information Content (PIC) was estimated as <inline-formula>
<graphic file="1471-2229-10-55-i2.gif"/>
</inline-formula>, where <it>p</it><sub>
<it>lv </it>
</sub>is the frequency of the <it>v</it>th allele at the <it>l</it>th locus. The phylogenetic UPGMA tree was generated based on a matrix of the frequencies and distances using the LogSharedAllele algorithm with the PowerMarker v.3.25 program. Bootstrap analysis was performed using the software package WINBOOT <abbrgrp>
<abbr bid="B47">47</abbr>
</abbrgrp>.</p>
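<p>The two statistics can be illustrated numerically. The sketch below assumes the standard forms of these estimators, which may differ in detail from the displayed formulas: gene diversity as one minus the sum of squared allele frequencies, optionally divided by 1 - (1 + <it>f</it>)/<it>n</it>, and PIC as that same quantity minus the sum over allele pairs of twice the product of their squared frequencies. The function names are hypothetical and the allele frequencies are an illustrative choice, not data from this study.</p>

```python
def gene_diversity(freqs, n=None, f=0.0):
    """Expected heterozygosity at one locus.

    With `n` given, the small-sample correction is applied:
    (1 - sum p^2) / (1 - (1 + f) / n). Without it, 1 - sum p^2.
    """
    h = 1.0 - sum(p * p for p in freqs)
    if n is None:
        return h
    return h / (1.0 - (1.0 + f) / n)


def pic(freqs):
    """Polymorphism information content for one locus."""
    h = 1.0 - sum(p * p for p in freqs)
    cross = 0.0
    for u in range(len(freqs) - 1):
        for v in range(u + 1, len(freqs)):
            cross += 2.0 * freqs[u] ** 2 * freqs[v] ** 2
    return h - cross


# Illustrative two-allele locus with frequencies 0.25 and 0.75.
freqs = [0.25, 0.75]
print(round(gene_diversity(freqs), 3))  # 0.375
print(round(pic(freqs), 3))             # 0.305
```

<p>For this two-allele example the uncorrected values, 0.375 and 0.305, coincide with the pair listed for SSR57 in Table 5, although this does not by itself establish the allele frequencies at that locus. PIC is never larger than the uncorrected gene diversity, because the subtracted pairwise term is non-negative.</p>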
</sec>
</sec>
<sec>
<st>
<p>Authors' contributions</p>
</st>
<p>SY carried out the data collection and bioinformatic analyses, and drafted the manuscript. JKI performed the SSR marker analyses. NMK and AMA collected <it>S. hermonthica </it>seeds and extracted genomic DNAs. SN participated in the design and coordination of the study. KS conceived of the study, contributed to designing the experiments, and drafted the manuscript. All authors read and approved the final manuscript.</p>
</sec>
</bdy><bm>
<ack>
<sec>
<st>
<p>Acknowledgements</p>
</st>
<p>We thank Dr K. Mochida for advice on bioinformatics, K. Akiyama and T. Sakurai for web-server maintenance, and Dr A. G. Babiker for providing the <it>S. hermonthica </it>seeds. This work was funded by grants from the Gatsby Charitable Foundation, the RIKEN president fund, and KAKENHI (19780040 and 21780044 to SY and 19678001 to KS). JKI is supported by the MEXT scholarship program.</p>
</sec>
</ack>
<refgrp><bibl id="B1"><title><p>Observations on the current status of <it>Orobanche </it>and <it>Striga</it>problems worldwide</p></title><aug><au><snm>Parker</snm><fnm>C</fnm></au></aug><source>Pest management science</source><pubdate>2009</pubdate><volume>65</volume><issue>5</issue><fpage>453</fpage><lpage>459</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/ps.1713</pubid><pubid idtype="pmpid" link="fulltext">19206075</pubid></pubidlist></xrefbib></bibl><bibl id="B2"><title><p>Rhizosphere communication of plants, parasitic plants and AM fungi</p></title><aug><au><snm>Bouwmeester</snm><fnm>HJ</fnm></au><au><snm>Roux</snm><fnm>C</fnm></au><au><snm>Lopez-Raez</snm><fnm>JA</fnm></au><au><snm>Becard</snm><fnm>G</fnm></au></aug><source>Trends in plant science</source><pubdate>2007</pubdate><volume>12</volume><issue>5</issue><fpage>224</fpage><lpage>230</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.tplants.2007.03.009</pubid><pubid idtype="pmpid" link="fulltext">17416544</pubid></pubidlist></xrefbib></bibl><bibl id="B3"><title><p>Host-plant recognition by parasitic <it>Scrophulariaceae</it></p></title><aug><au><snm>Yoder</snm><fnm>JI</fnm></au></aug><source>Current Opinion in Plant Biology</source><pubdate>2001</pubdate><volume>4</volume><issue>4</issue><fpage>359</fpage><lpage>365</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S1369-5266(00)00185-0</pubid><pubid idtype="pmpid" link="fulltext">11418347</pubid></pubidlist></xrefbib></bibl><bibl id="B4"><title><p>Plant resistance to parasitic plants: molecular approaches to an old foe</p></title><aug><au><snm>Rispail</snm><fnm>N</fnm></au><au><snm>Dita</snm><fnm>MA</fnm></au><au><snm>Gonzalez-Verdejo</snm><fnm>C</fnm></au><au><snm>Perez-de-Luque</snm><fnm>A</fnm></au><au><snm>Castillejo</snm><fnm>MA</fnm></au><au><snm>Prats</snm><fnm>E</fnm></au><au><snm>Roman</snm><fnm>B</fnm></au><au><snm>Jorrin</snm><fnm>J</fnm></au><au><snm>Rubiales</snm><fnm>D</fnm></au></aug><source>The New 
phytologist</source><pubdate>2007</pubdate><volume>173</volume><issue>4</issue><fpage>703</fpage><lpage>712</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1469-8137.2007.01980.x</pubid><pubid idtype="pmpid" link="fulltext">17286819</pubid></pubidlist></xrefbib></bibl><bibl id="B5"><title><p>Next-generation sequencing technologies and their implications for crop genetics and breeding</p></title><aug><au><snm>Varshney</snm><fnm>RK</fnm></au><au><snm>Nayak</snm><fnm>SN</fnm></au><au><snm>May</snm><fnm>GD</fnm></au><au><snm>Jackson</snm><fnm>SA</fnm></au></aug><source>Trends in Biotechnology</source><pubdate>2009</pubdate><volume>27</volume><issue>9</issue><fpage>522</fpage><lpage>530</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.tibtech.2009.05.006</pubid><pubid idtype="pmpid" link="fulltext">19679362</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>Expressed sequence tags: alternative or complement to whole genome sequences?</p></title><aug><au><snm>Rudd</snm><fnm>S</fnm></au></aug><source>Trends in plant science</source><pubdate>2003</pubdate><volume>8</volume><issue>7</issue><fpage>321</fpage><lpage>329</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S1360-1385(03)00131-6</pubid><pubid idtype="pmpid" link="fulltext">12878016</pubid></pubidlist></xrefbib></bibl><bibl id="B7"><title><p>Sequencing analysis of 20,000 full-length cDNA clones from cassava reveals lineage specific expansions in gene families related to stress response</p></title><aug><au><snm>Sakurai</snm><fnm>T</fnm></au><au><snm>Plata</snm><fnm>G</fnm></au><au><snm>Rodriguez-Zapata</snm><fnm>F</fnm></au><au><snm>Seki</snm><fnm>M</fnm></au><au><snm>Salcedo</snm><fnm>A</fnm></au><au><snm>Toyoda</snm><fnm>A</fnm></au><au><snm>Ishiwata</snm><fnm>A</fnm></au><au><snm>Tohme</snm><fnm>J</fnm></au><au><snm>Sakaki</snm><fnm>Y</fnm></au><au><snm>Shinozaki</snm><fnm>K</fnm></au><au><snm>Ishitani</snm><fnm>M</fnm></au></aug><source>BMC Plant 
Biol</source><pubdate>2007</pubdate><volume>7</volume><fpage>66</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2229-7-66</pubid><pubid idtype="pmcid">2245942</pubid><pubid idtype="pmpid">18096061</pubid></pubidlist></xrefbib></bibl><bibl id="B8"><title><p>Pscroph, a parasitic plant EST database enriched for parasite associated transcripts</p></title><aug><au><snm>Torres</snm><fnm>MJ</fnm></au><au><snm>Tomilov</snm><fnm>AA</fnm></au><au><snm>Tomilova</snm><fnm>N</fnm></au><au><snm>Reagan</snm><fnm>RL</fnm></au><au><snm>Yoder</snm><fnm>JI</fnm></au></aug><source>BMC Plant Biol</source><pubdate>2005</pubdate><volume>5</volume><fpage>24</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2229-5-24</pubid><pubid idtype="pmcid">1325228</pubid><pubid idtype="pmpid">16288663</pubid></pubidlist></xrefbib></bibl><bibl id="B9"><title><p>Striga hermonthica EST database</p></title><url>http://striga.psc.riken.jp</url></bibl><bibl id="B10"><title><p>Reproductive Ability of Hybrids of Striga aspera and Striga hermonthica</p></title><aug><au><snm>Aigbokhan</snm><fnm>EI</fnm></au><au><snm>Berner</snm><fnm>DK</fnm></au><au><snm>Musselman</snm><fnm>LJ</fnm></au></aug><source>Phytopathology</source><pubdate>1998</pubdate><volume>88</volume><issue>6</issue><fpage>563</fpage><lpage>567</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1094/PHYTO.1998.88.6.563</pubid><pubid idtype="pmpid" link="fulltext">18944910</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>Functional annotation of a full-length Arabidopsis cDNA 
collection</p></title><aug><au><snm>Seki</snm><fnm>M</fnm></au><au><snm>Narusaka</snm><fnm>M</fnm></au><au><snm>Kamiya</snm><fnm>A</fnm></au><au><snm>Ishida</snm><fnm>J</fnm></au><au><snm>Satou</snm><fnm>M</fnm></au><au><snm>Sakurai</snm><fnm>T</fnm></au><au><snm>Nakajima</snm><fnm>M</fnm></au><au><snm>Enju</snm><fnm>A</fnm></au><au><snm>Akiyama</snm><fnm>K</fnm></au><au><snm>Oono</snm><fnm>Y</fnm></au><au><snm>Muramatsu</snm><fnm>M</fnm></au><au><snm>Hayashizaki</snm><fnm>Y</fnm></au><au><snm>Kawai</snm><fnm>J</fnm></au><au><snm>Carninci</snm><fnm>P</fnm></au><au><snm>Itoh</snm><fnm>M</fnm></au><au><snm>Ishii</snm><fnm>Y</fnm></au><au><snm>Arakawa</snm><fnm>T</fnm></au><au><snm>Shibata</snm><fnm>K</fnm></au><au><snm>Shinagawa</snm><fnm>A</fnm></au><au><snm>Shinozaki</snm><fnm>K</fnm></au></aug><source>Science</source><pubdate>2002</pubdate><volume>296</volume><issue>5565</issue><fpage>141</fpage><lpage>145</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1071006</pubid><pubid idtype="pmpid" link="fulltext">11910074</pubid></pubidlist></xrefbib></bibl><bibl id="B12"><title><p>Sequencing and analysis of approximately 40,000 soybean cDNA clones from a full-length-enriched cDNA 
library</p></title><aug><au><snm>Umezawa</snm><fnm>T</fnm></au><au><snm>Sakurai</snm><fnm>T</fnm></au><au><snm>Totoki</snm><fnm>Y</fnm></au><au><snm>Toyoda</snm><fnm>A</fnm></au><au><snm>Seki</snm><fnm>M</fnm></au><au><snm>Ishiwata</snm><fnm>A</fnm></au><au><snm>Akiyama</snm><fnm>K</fnm></au><au><snm>Kurotani</snm><fnm>A</fnm></au><au><snm>Yoshida</snm><fnm>T</fnm></au><au><snm>Mochida</snm><fnm>K</fnm></au><au><snm>Kasuga</snm><fnm>M</fnm></au><au><snm>Todaka</snm><fnm>D</fnm></au><au><snm>Maruyama</snm><fnm>K</fnm></au><au><snm>Nakashima</snm><fnm>K</fnm></au><au><snm>Enju</snm><fnm>A</fnm></au><au><snm>Mizukado</snm><fnm>S</fnm></au><au><snm>Ahmed</snm><fnm>S</fnm></au><au><snm>Yoshiwara</snm><fnm>K</fnm></au><au><snm>Harada</snm><fnm>K</fnm></au><au><snm>Tsubokura</snm><fnm>Y</fnm></au><au><snm>Hayashi</snm><fnm>M</fnm></au><au><snm>Sato</snm><fnm>S</fnm></au><au><snm>Anai</snm><fnm>T</fnm></au><au><snm>Ishimoto</snm><fnm>M</fnm></au><au><snm>Funatsuki</snm><fnm>H</fnm></au><au><snm>Teraishi</snm><fnm>M</fnm></au><au><snm>Osaki</snm><fnm>M</fnm></au><au><snm>Shinano</snm><fnm>T</fnm></au><au><snm>Akashi</snm><fnm>R</fnm></au><au><snm>Sakaki</snm><fnm>Y</fnm></au><etal/></aug><source>DNA Res</source><pubdate>2008</pubdate><volume>15</volume><issue>6</issue><fpage>333</fpage><lpage>346</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/dnares/dsn024</pubid><pubid idtype="pmcid">2608845</pubid><pubid idtype="pmpid">18927222</pubid></pubidlist></xrefbib></bibl><bibl id="B13"><title><p>Functional annotation of 19,841 Populus nigra full-length enriched cDNA 
clones</p></title><aug><au><snm>Nanjo</snm><fnm>T</fnm></au><au><snm>Sakurai</snm><fnm>T</fnm></au><au><snm>Totoki</snm><fnm>Y</fnm></au><au><snm>Toyoda</snm><fnm>A</fnm></au><au><snm>Nishiguchi</snm><fnm>M</fnm></au><au><snm>Kado</snm><fnm>T</fnm></au><au><snm>Igasaki</snm><fnm>T</fnm></au><au><snm>Futamura</snm><fnm>N</fnm></au><au><snm>Seki</snm><fnm>M</fnm></au><au><snm>Sakaki</snm><fnm>Y</fnm></au><au><snm>Shinozaki</snm><fnm>K</fnm></au><au><snm>Shinohara</snm><fnm>K</fnm></au></aug><source>BMC genomics</source><pubdate>2007</pubdate><volume>8</volume><fpage>448</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-8-448</pubid><pubid idtype="pmcid">2222646</pubid><pubid idtype="pmpid">18053163</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>Construction of a full-length cDNA library from young spikelets of hexaploid wheat and its characterization by large-scale sequencing of expressed sequence tags</p></title><aug><au><snm>Ogihara</snm><fnm>Y</fnm></au><au><snm>Mochida</snm><fnm>K</fnm></au><au><snm>Kawaura</snm><fnm>K</fnm></au><au><snm>Murai</snm><fnm>K</fnm></au><au><snm>Seki</snm><fnm>M</fnm></au><au><snm>Kamiya</snm><fnm>A</fnm></au><au><snm>Shinozaki</snm><fnm>K</fnm></au><au><snm>Carninci</snm><fnm>P</fnm></au><au><snm>Hayashizaki</snm><fnm>Y</fnm></au><au><snm>Shin</snm><fnm>IT</fnm></au><au><snm>Kohara</snm><fnm>Y</fnm></au><au><snm>Yamazaki</snm><fnm>Y</fnm></au></aug><source>Genes &amp; genetic systems</source><pubdate>2004</pubdate><volume>79</volume><issue>4</issue><fpage>227</fpage><lpage>232</lpage></bibl><bibl id="B15"><title><p>EST2uni: an open, parallel tool for automated EST analysis and database creation, with a data mining web interface and microarray expression data 
integration</p></title><aug><au><snm>Forment</snm><fnm>J</fnm></au><au><snm>Gilabert</snm><fnm>F</fnm></au><au><snm>Robles</snm><fnm>A</fnm></au><au><snm>Conejero</snm><fnm>V</fnm></au><au><snm>Nuez</snm><fnm>F</fnm></au><au><snm>Blanca</snm><fnm>JM</fnm></au></aug><source>BMC bioinformatics</source><pubdate>2008</pubdate><volume>9</volume><fpage>5</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-9-5</pubid><pubid idtype="pmcid">2258287</pubid><pubid idtype="pmpid">18179701</pubid></pubidlist></xrefbib></bibl><bibl id="B16"><title><p>UniRef: comprehensive and non-redundant UniProt reference clusters</p></title><aug><au><snm>Suzek</snm><fnm>BE</fnm></au><au><snm>Huang</snm><fnm>H</fnm></au><au><snm>McGarvey</snm><fnm>P</fnm></au><au><snm>Mazumder</snm><fnm>R</fnm></au><au><snm>Wu</snm><fnm>CH</fnm></au></aug><source>Bioinformatics</source><pubdate>2007</pubdate><volume>23</volume><issue>10</issue><fpage>1282</fpage><lpage>1288</lpage><xrefbib><pubid idtype="doi">10.1093/bioinformatics/btm098</pubid></xrefbib></bibl><bibl id="B17"><title><p>UniRef</p></title><url>http://www.uniprot.org/help/uniref</url></bibl><bibl id="B18"><title><p>Pfam</p></title><url>http://pfam.sanger.ac.uk/</url></bibl><bibl id="B19"><title><p>hmmer</p></title><url>http://hmmer.janelia.org/</url></bibl><bibl id="B20"><title><p>Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins</p></title><aug><au><snm>Bateman</snm><fnm>A</fnm></au><au><snm>Birney</snm><fnm>E</fnm></au><au><snm>Durbin</snm><fnm>R</fnm></au><au><snm>Eddy</snm><fnm>SR</fnm></au><au><snm>Finn</snm><fnm>RD</fnm></au><au><snm>Sonnhammer</snm><fnm>EL</fnm></au></aug><source>Nucleic acids research</source><pubdate>1999</pubdate><volume>27</volume><issue>1</issue><fpage>260</fpage><lpage>262</lpage><xrefbib><pubid idtype="doi">10.1093/nar/27.1.260</pubid></xrefbib></bibl><bibl id="B21"><title><p>Functional annotation of the Arabidopsis genome using controlled 
vocabularies</p></title><aug><au><snm>Berardini</snm><fnm>TZ</fnm></au><au><snm>Mundodi</snm><fnm>S</fnm></au><au><snm>Reiser</snm><fnm>L</fnm></au><au><snm>Huala</snm><fnm>E</fnm></au><au><snm>Garcia-Hernandez</snm><fnm>M</fnm></au><au><snm>Zhang</snm><fnm>P</fnm></au><au><snm>Mueller</snm><fnm>LA</fnm></au><au><snm>Yoon</snm><fnm>J</fnm></au><au><snm>Doyle</snm><fnm>A</fnm></au><au><snm>Lander</snm><fnm>G</fnm></au><au><snm>Moseyko</snm><fnm>N</fnm></au><au><snm>Yoo</snm><fnm>D</fnm></au><au><snm>Xu</snm><fnm>I</fnm></au><au><snm>Zoeckler</snm><fnm>B</fnm></au><au><snm>Montoya</snm><fnm>M</fnm></au><au><snm>Miller</snm><fnm>N</fnm></au><au><snm>Weems</snm><fnm>D</fnm></au><au><snm>Rhee</snm><fnm>SY</fnm></au></aug><source>Plant physiology</source><pubdate>2004</pubdate><volume>135</volume><issue>2</issue><fpage>745</fpage><lpage>755</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1104/pp.104.040071</pubid><pubid idtype="pmcid">514112</pubid><pubid idtype="pmpid">15173566</pubid></pubidlist></xrefbib></bibl><bibl id="B22"><title><p>Analysis of the genome sequence of the flowering plant Arabidopsis thaliana</p></title><aug><au><cnm>Arabidopsis-Genome-Initiative</cnm></au></aug><source>Nature</source><pubdate>2000</pubdate><volume>408</volume><issue>6814</issue><fpage>796</fpage><lpage>815</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/35048692</pubid><pubid idtype="pmpid" link="fulltext">11130711</pubid></pubidlist></xrefbib></bibl><bibl id="B23"><title><p>The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm 
phyla</p></title><aug><au><snm>Jaillon</snm><fnm>O</fnm></au><au><snm>Aury</snm><fnm>JM</fnm></au><au><snm>Noel</snm><fnm>B</fnm></au><au><snm>Policriti</snm><fnm>A</fnm></au><au><snm>Clepet</snm><fnm>C</fnm></au><au><snm>Casagrande</snm><fnm>A</fnm></au><au><snm>Choisne</snm><fnm>N</fnm></au><au><snm>Aubourg</snm><fnm>S</fnm></au><au><snm>Vitulo</snm><fnm>N</fnm></au><au><snm>Jubin</snm><fnm>C</fnm></au><au><snm>Vezzi</snm><fnm>A</fnm></au><au><snm>Legeai</snm><fnm>F</fnm></au><au><snm>Hugueney</snm><fnm>P</fnm></au><au><snm>Dasilva</snm><fnm>C</fnm></au><au><snm>Horner</snm><fnm>D</fnm></au><au><snm>Mica</snm><fnm>E</fnm></au><au><snm>Jublot</snm><fnm>D</fnm></au><au><snm>Poulain</snm><fnm>J</fnm></au><au><snm>Bruyere</snm><fnm>C</fnm></au><au><snm>Billault</snm><fnm>A</fnm></au><au><snm>Segurens</snm><fnm>B</fnm></au><au><snm>Gouyvenoux</snm><fnm>M</fnm></au><au><snm>Ugarte</snm><fnm>E</fnm></au><au><snm>Cattonaro</snm><fnm>F</fnm></au><au><snm>Anthouard</snm><fnm>V</fnm></au><au><snm>Vico</snm><fnm>V</fnm></au><au><snm>Del Fabbro</snm><fnm>C</fnm></au><au><snm>Alaux</snm><fnm>M</fnm></au><au><snm>Di Gaspero</snm><fnm>G</fnm></au><au><snm>Dumas</snm><fnm>V</fnm></au><etal/></aug><source>Nature</source><pubdate>2007</pubdate><volume>449</volume><issue>7161</issue><fpage>463</fpage><lpage>467</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature06148</pubid><pubid idtype="pmpid" link="fulltext">17721507</pubid></pubidlist></xrefbib></bibl><bibl id="B24"><title><p>The Sorghum bicolor genome and the diversification of 
grasses</p></title><aug><au><snm>Paterson</snm><fnm>AH</fnm></au><au><snm>Bowers</snm><fnm>JE</fnm></au><au><snm>Bruggmann</snm><fnm>R</fnm></au><au><snm>Dubchak</snm><fnm>I</fnm></au><au><snm>Grimwood</snm><fnm>J</fnm></au><au><snm>Gundlach</snm><fnm>H</fnm></au><au><snm>Haberer</snm><fnm>G</fnm></au><au><snm>Hellsten</snm><fnm>U</fnm></au><au><snm>Mitros</snm><fnm>T</fnm></au><au><snm>Poliakov</snm><fnm>A</fnm></au><au><snm>Schmutz</snm><fnm>J</fnm></au><au><snm>Spannagl</snm><fnm>M</fnm></au><au><snm>Tang</snm><fnm>H</fnm></au><au><snm>Wang</snm><fnm>X</fnm></au><au><snm>Wicker</snm><fnm>T</fnm></au><au><snm>Bharti</snm><fnm>AK</fnm></au><au><snm>Chapman</snm><fnm>J</fnm></au><au><snm>Feltus</snm><fnm>FA</fnm></au><au><snm>Gowik</snm><fnm>U</fnm></au><au><snm>Grigoriev</snm><fnm>IV</fnm></au><au><snm>Lyons</snm><fnm>E</fnm></au><au><snm>Maher</snm><fnm>CA</fnm></au><au><snm>Martis</snm><fnm>M</fnm></au><au><snm>Narechania</snm><fnm>A</fnm></au><au><snm>Otillar</snm><fnm>RP</fnm></au><au><snm>Penning</snm><fnm>BW</fnm></au><au><snm>Salamov</snm><fnm>AA</fnm></au><au><snm>Wang</snm><fnm>Y</fnm></au><au><snm>Zhang</snm><fnm>L</fnm></au><au><snm>Carpita</snm><fnm>NC</fnm></au><etal/></aug><source>Nature</source><pubdate>2009</pubdate><volume>457</volume><issue>7229</issue><fpage>551</fpage><lpage>556</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature07723</pubid><pubid idtype="pmpid" link="fulltext">19189423</pubid></pubidlist></xrefbib></bibl><bibl id="B25"><title><p>The Physcomitrella genome reveals evolutionary insights into the conquest of land by 
plants</p></title><aug><au><snm>Rensing</snm><fnm>SA</fnm></au><au><snm>Lang</snm><fnm>D</fnm></au><au><snm>Zimmer</snm><fnm>AD</fnm></au><au><snm>Terry</snm><fnm>A</fnm></au><au><snm>Salamov</snm><fnm>A</fnm></au><au><snm>Shapiro</snm><fnm>H</fnm></au><au><snm>Nishiyama</snm><fnm>T</fnm></au><au><snm>Perroud</snm><fnm>PF</fnm></au><au><snm>Lindquist</snm><fnm>EA</fnm></au><au><snm>Kamisugi</snm><fnm>Y</fnm></au><au><snm>Tanahashi</snm><fnm>T</fnm></au><au><snm>Sakakibara</snm><fnm>K</fnm></au><au><snm>Fujita</snm><fnm>T</fnm></au><au><snm>Oishi</snm><fnm>K</fnm></au><au><snm>Shin</snm><fnm>IT</fnm></au><au><snm>Kuroki</snm><fnm>Y</fnm></au><au><snm>Toyoda</snm><fnm>A</fnm></au><au><snm>Suzuki</snm><fnm>Y</fnm></au><au><snm>Hashimoto</snm><fnm>S</fnm></au><au><snm>Yamaguchi</snm><fnm>K</fnm></au><au><snm>Sugano</snm><fnm>S</fnm></au><au><snm>Kohara</snm><fnm>Y</fnm></au><au><snm>Fujiyama</snm><fnm>A</fnm></au><au><snm>Anterola</snm><fnm>A</fnm></au><au><snm>Aoki</snm><fnm>S</fnm></au><au><snm>Ashton</snm><fnm>N</fnm></au><au><snm>Barbazuk</snm><fnm>WB</fnm></au><au><snm>Barker</snm><fnm>E</fnm></au><au><snm>Bennetzen</snm><fnm>JL</fnm></au><au><snm>Blankenship</snm><fnm>R</fnm></au><etal/></aug><source>Science</source><pubdate>2008</pubdate><volume>319</volume><issue>5859</issue><fpage>64</fpage><lpage>69</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1150646</pubid><pubid idtype="pmpid" link="fulltext">18079367</pubid></pubidlist></xrefbib></bibl><bibl id="B26"><title><p>The genome of black cottonwood, Populus trichocarpa (Torr. 
&amp; Gray)</p></title><aug><au><snm>Tuskan</snm><fnm>GA</fnm></au><au><snm>Difazio</snm><fnm>S</fnm></au><au><snm>Jansson</snm><fnm>S</fnm></au><au><snm>Bohlmann</snm><fnm>J</fnm></au><au><snm>Grigoriev</snm><fnm>I</fnm></au><au><snm>Hellsten</snm><fnm>U</fnm></au><au><snm>Putnam</snm><fnm>N</fnm></au><au><snm>Ralph</snm><fnm>S</fnm></au><au><snm>Rombauts</snm><fnm>S</fnm></au><au><snm>Salamov</snm><fnm>A</fnm></au><au><snm>Schein</snm><fnm>J</fnm></au><au><snm>Sterck</snm><fnm>L</fnm></au><au><snm>Aerts</snm><fnm>A</fnm></au><au><snm>Bhalerao</snm><fnm>RR</fnm></au><au><snm>Bhalerao</snm><fnm>RP</fnm></au><au><snm>Blaudez</snm><fnm>D</fnm></au><au><snm>Boerjan</snm><fnm>W</fnm></au><au><snm>Brun</snm><fnm>A</fnm></au><au><snm>Brunner</snm><fnm>A</fnm></au><au><snm>Busov</snm><fnm>V</fnm></au><au><snm>Campbell</snm><fnm>M</fnm></au><au><snm>Carlson</snm><fnm>J</fnm></au><au><snm>Chalot</snm><fnm>M</fnm></au><au><snm>Chapman</snm><fnm>J</fnm></au><au><snm>Chen</snm><fnm>GL</fnm></au><au><snm>Cooper</snm><fnm>D</fnm></au><au><snm>Coutinho</snm><fnm>PM</fnm></au><au><snm>Couturier</snm><fnm>J</fnm></au><au><snm>Covert</snm><fnm>S</fnm></au><au><snm>Cronk</snm><fnm>Q</fnm></au><etal/></aug><source>Science</source><pubdate>2006</pubdate><volume>313</volume><issue>5793</issue><fpage>1596</fpage><lpage>1604</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1128691</pubid><pubid idtype="pmpid" link="fulltext">16973872</pubid></pubidlist></xrefbib></bibl><bibl id="B27"><title><p>Triphysaria EST database</p></title><url>http://www.plantsciences.ucdavis.edu/yoder/lab/Sequence_index.html</url></bibl><bibl id="B28"><title><p>PlantGDB</p></title><url>http://www.plantGDB.org</url></bibl><bibl id="B29"><title><p>The genus Striga (Scrophulariaceae) in Africa</p></title><aug><au><snm>Mohamed</snm><fnm>KI</fnm></au><au><snm>Musselman</snm><fnm>LJ</fnm></au><au><snm>Riches</snm><fnm>CR</fnm></au></aug><source>Ann Mo Bot 
Gard</source><pubdate>2001</pubdate><volume>88</volume><issue>1</issue><fpage>60</fpage><lpage>103</lpage><xrefbib><pubid idtype="doi">10.2307/2666132</pubid></xrefbib></bibl><bibl id="B30"><title><p>A set of EST-SNPs for map saturation and cultivar identification in melon</p></title><aug><au><snm>Deleu</snm><fnm>W</fnm></au><au><snm>Esteras</snm><fnm>C</fnm></au><au><snm>Roig</snm><fnm>C</fnm></au><au><snm>Gonzalez-To</snm><fnm>M</fnm></au><au><snm>Fernandez-Silva</snm><fnm>I</fnm></au><au><snm>Gonzalez-Ibeas</snm><fnm>D</fnm></au><au><snm>Blanca</snm><fnm>J</fnm></au><au><snm>Aranda</snm><fnm>M</fnm></au><au><snm>Arus</snm><fnm>P</fnm></au><au><snm>Nuez</snm><fnm>F</fnm></au><au><snm>Monforte</snm><fnm>A</fnm></au><au><snm>Pico</snm><fnm>M</fnm></au><au><snm>Garcia-Mas</snm><fnm>J</fnm></au></aug><source>BMC Plant Biology</source><pubdate>2009</pubdate><volume>9</volume><issue>1</issue><fpage>90</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2229-9-90</pubid><pubid idtype="pmcid">2722630</pubid><pubid idtype="pmpid">19604363</pubid></pubidlist></xrefbib></bibl><bibl id="B31"><title><p>Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat</p></title><aug><au><snm>Kantety</snm><fnm>RV</fnm></au><au><snm>La Rota</snm><fnm>M</fnm></au><au><snm>Matthews</snm><fnm>DE</fnm></au><au><snm>Sorrells</snm><fnm>ME</fnm></au></aug><source>Plant Molecular Biology</source><pubdate>2002</pubdate><volume>48</volume><issue>5-6</issue><fpage>501</fpage><lpage>510</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1023/A:1014875206165</pubid><pubid idtype="pmpid" link="fulltext">11999831</pubid></pubidlist></xrefbib></bibl><bibl id="B32"><title><p>Utility of EST-derived SSR in cultivated peanut (Arachis hypogaea L.) 
and Arachis wild species</p></title><aug><au><snm>Liang</snm><fnm>X</fnm></au><au><snm>Chen</snm><fnm>X</fnm></au><au><snm>Hong</snm><fnm>Y</fnm></au><au><snm>Liu</snm><fnm>H</fnm></au><au><snm>Zhou</snm><fnm>G</fnm></au><au><snm>Li</snm><fnm>S</fnm></au><au><snm>Guo</snm><fnm>B</fnm></au></aug><source>BMC Plant Biol</source><pubdate>2009</pubdate><volume>9</volume><issue>1</issue><fpage>35</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2229-9-35</pubid><pubid idtype="pmcid">2678122</pubid><pubid idtype="pmpid">19309524</pubid></pubidlist></xrefbib></bibl><bibl id="B33"><title><p>A White Campion (Silene latifolia) floral expressed sequence tag (EST) library: annotation, EST-SSR characterization, transferability, and utility for comparative mapping</p></title><aug><au><snm>Moccia</snm><fnm>M</fnm></au><au><snm>Oger-Desfeux</snm><fnm>C</fnm></au><au><snm>Marais</snm><fnm>G</fnm></au><au><snm>Widmer</snm><fnm>A</fnm></au></aug><source>BMC genomics</source><pubdate>2009</pubdate><volume>10</volume><issue>1</issue><fpage>243</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-10-243</pubid><pubid idtype="pmcid">2689282</pubid><pubid idtype="pmpid">19467153</pubid></pubidlist></xrefbib></bibl><bibl id="B34"><title><p>Genetic diversity and origin of weedy rice (Oryza sativa f. 
spontanea) populations found in North-eastern China revealed by simple sequence repeat (SSR) markers</p></title><aug><au><snm>Cao</snm><fnm>Q</fnm></au><au><snm>Lu</snm><fnm>BR</fnm></au><au><snm>Xia</snm><fnm>H</fnm></au><au><snm>Rong</snm><fnm>J</fnm></au><au><snm>Sala</snm><fnm>F</fnm></au><au><snm>Spada</snm><fnm>A</fnm></au><au><snm>Grassi</snm><fnm>F</fnm></au></aug><source>Annals of botany</source><pubdate>2006</pubdate><volume>98</volume><issue>6</issue><fpage>1241</fpage><lpage>1252</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/aob/mcl210</pubid><pubid idtype="pmpid">17056615</pubid></pubidlist></xrefbib></bibl><bibl id="B35"><title><p>Molecular diversity of Striga hermonthica collected from different locations and host plant species</p></title><aug><au><snm>Abdelbagi</snm><fnm>MA</fnm></au><au><snm>Yasir</snm><fnm>SA</fnm></au><au><snm>Ahmed</snm><fnm>AE</fnm></au><au><snm>Dawoud</snm><fnm>AD</fnm></au><au><snm>Yabuta-Miyamoto</snm><fnm>S</fnm></au><au><snm>Sugimoto</snm><fnm>Y</fnm></au></aug><source>Sudan Journal of Agricultural Research</source><pubdate>2007</pubdate><volume>10</volume><fpage>121</fpage><lpage>126</lpage></bibl><bibl id="B36"><title><p>Genetic Diversity of Striga and Implications for Control and Modeling Future Distributions</p></title><aug><au><snm>Mohamed</snm><fnm>KI</fnm></au><au><snm>Bolin</snm><fnm>JF</fnm></au><au><snm>Musselman</snm><fnm>LJ</fnm></au><au><snm>Peterson</snm><fnm>AT</fnm></au></aug><source>Integrating new technologies for Striga control: Towards Ending the Witch-Hunt</source><publisher>World Scientific Publishing Company</publisher><editor>Ejeta G, Gressel J</editor><pubdate>2007</pubdate><fpage>71</fpage><lpage>84</lpage><xrefbib><pubid idtype="doi">full_text</pubid></xrefbib></bibl><bibl id="B37"><title><p>Molecular markers for the study of pathogen variability: implications for breeding resistance to Striga hermonthica</p></title><aug><au><snm>Koyama</snm><fnm>ML</fnm></au></aug><source>Application 
of molecular markers in plant breeding. Training manual for a seminar held at IITA, Ibadan, Nigeria, from 16-17 August 1999</source><publisher>Patancheru 502 324, Andhra Pradesh, India: International Crops Research Institute for the Semi-Arid Tropics (ICRISAT)</publisher><editor>Haussmann BIG, Geiger HH, Hess DE, Hash CT, Bramel-Cox P</editor><pubdate>2000</pubdate><fpage>133</fpage><lpage>152</lpage></bibl><bibl id="B38"><title><p>Population structure, genetic diversity and host specificity of the parasitic weed Striga hermonthica (Scrophulariaceae) in Sahel</p></title><aug><au><snm>Olivier</snm><fnm>A</fnm></au><au><snm>Glaszmann</snm><fnm>JC</fnm></au><au><snm>Lanaud</snm><fnm>C</fnm></au><au><snm>Leroux</snm><fnm>GD</fnm></au></aug><source>Plant Systematics and Evolution</source><pubdate>1998</pubdate><volume>209</volume><issue>1-2</issue><fpage>33</fpage><lpage>45</lpage><xrefbib><pubid idtype="doi">10.1007/BF00991522</pubid></xrefbib></bibl><bibl id="B39"><title><p>A Study of Genetic Diversity among Host-Specific Populations of the Witchweed Striga-Hermonthica (Scrophulariaceae) in Africa</p></title><aug><au><snm>Bharathalakshmi</snm><fnm></fnm></au><au><snm>Werth</snm><fnm>CR</fnm></au><au><snm>Musselman</snm><fnm>LJ</fnm></au></aug><source>Plant Systematics and Evolution</source><pubdate>1990</pubdate><volume>172</volume><issue>1-4</issue><fpage>1</fpage><lpage>12</lpage><xrefbib><pubid idtype="doi">10.1007/BF00937794</pubid></xrefbib></bibl><bibl id="B40"><title><p>Genetic diversity of Striga hermonthica and Striga asiatica populations in Kenya</p></title><aug><au><snm>Gethi</snm><fnm>JG</fnm></au><au><snm>Smith</snm><fnm>ME</fnm></au><au><snm>Mitchell</snm><fnm>SE</fnm></au><au><snm>Kresovich</snm><fnm>S</fnm></au></aug><source>Weed Research</source><pubdate>2005</pubdate><volume>45</volume><issue>1</issue><fpage>64</fpage><lpage>73</lpage><xrefbib><pubid idtype="doi">10.1111/j.1365-3180.2004.00432.x</pubid></xrefbib></bibl><bibl 
id="B41"><title><p>The Arabidopsis Information Resource (TAIR)</p></title><url>http://www.arabidopsis.org/</url></bibl><bibl id="B42"><title><p>Multiple layers of incompatibility to the parasitic witchweed, Striga hermonthica</p></title><aug><au><snm>Yoshida</snm><fnm>S</fnm></au><au><snm>Shirasu</snm><fnm>K</fnm></au></aug><source>The New phytologist</source><pubdate>2009</pubdate><volume>183</volume><issue>1</issue><fpage>180</fpage><lpage>189</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1469-8137.2009.02840.x</pubid><pubid idtype="pmpid" link="fulltext">19402875</pubid></pubidlist></xrefbib></bibl><bibl id="B43"><title><p>Increased level of polyploidy1, a conserved repressor of CYCLINA2 transcription, controls endoreduplication in Arabidopsis</p></title><aug><au><snm>Yoshizumi</snm><fnm>T</fnm></au><au><snm>Tsumoto</snm><fnm>Y</fnm></au><au><snm>Takiguchi</snm><fnm>T</fnm></au><au><snm>Nagata</snm><fnm>N</fnm></au><au><snm>Yamamoto</snm><fnm>YY</fnm></au><au><snm>Kawashima</snm><fnm>M</fnm></au><au><snm>Ichikawa</snm><fnm>T</fnm></au><au><snm>Nakazawa</snm><fnm>M</fnm></au><au><snm>Yamamoto</snm><fnm>N</fnm></au><au><snm>Matsui</snm><fnm>M</fnm></au></aug><source>The Plant cell</source><pubdate>2006</pubdate><volume>18</volume><issue>10</issue><fpage>2452</fpage><lpage>2468</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1105/tpc.106.043869</pubid><pubid idtype="pmcid">1626625</pubid><pubid idtype="pmpid">17012601</pubid></pubidlist></xrefbib></bibl><bibl id="B44"><title><p>CAP3: A DNA Sequence Assembly Program</p></title><aug><au><snm>Huang</snm><fnm>X</fnm></au><au><snm>Madan</snm><fnm>A</fnm></au></aug><source>Genome research</source><pubdate>1999</pubdate><volume>9</volume><issue>9</issue><fpage>868</fpage><lpage>877</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.9.9.868</pubid><pubid idtype="pmcid">310812</pubid><pubid idtype="pmpid">10508846</pubid></pubidlist></xrefbib></bibl><bibl 
id="B45"><title><p>Primer3</p></title><url>http://frodo.wi.mit.edu/primer3/</url></bibl><bibl id="B46"><title><p>PowerMarker: an integrated analysis environment for genetic marker analysis</p></title><aug><au><snm>Liu</snm><fnm>K</fnm></au><au><snm>Muse</snm><fnm>SV</fnm></au></aug><source>Bioinformatics (Oxford, England)</source><pubdate>2005</pubdate><volume>21</volume><issue>9</issue><fpage>2128</fpage><lpage>2129</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/bti282</pubid><pubid idtype="pmpid" link="fulltext">15705655</pubid></pubidlist></xrefbib></bibl><bibl id="B47"><title><p>WINBOOT: A program for performing bootstrap analysis of binary data to determine the confidence limits of UPGMA-based dendrograms</p></title><aug><au><snm>Yap</snm><fnm>IV</fnm></au><au><snm>Nelson</snm><fnm>R</fnm></au></aug><source>IRRI Discussion Paper Series</source><pubdate>1996</pubdate><volume>14</volume></bibl></refgrp>
</bm></art>
