<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-8-280</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Database</dochead>
      <bibl>
         <title>
            <p>PepBank - a database of peptides based on sequence text mining and public peptide data sources</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Shtatland</snm>
               <fnm>Timur</fnm>
               <insr iid="I1"/>
               <email>tshtatland@partners.org</email>
            </au>
            <au id="A2">
               <snm>Guettler</snm>
               <fnm>Daniel</fnm>
               <insr iid="I1"/>
               <email>dguettler@partners.org</email>
            </au>
            <au id="A3">
               <snm>Kossodo</snm>
               <fnm>Misha</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>dalten@gmail.com</email>
            </au>
            <au id="A4">
               <snm>Pivovarov</snm>
               <fnm>Misha</fnm>
               <insr iid="I1"/>
               <email>mpivovarov@partners.org</email>
            </au>
            <au id="A5">
               <snm>Weissleder</snm>
               <fnm>Ralph</fnm>
               <insr iid="I1"/>
               <email>weissleder@helix.mgh.harvard.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Center for Molecular Imaging Research, Massachusetts General Hospital, Harvard Medical School, Bldg. 149, 13<sup>th</sup> Street, Room 5406, Charlestown, MA 02129, USA</p>
            </ins>
            <ins id="I2">
               <p>Northern Essex Community College, 100 Elliott Street, Haverhill, MA 01830, USA</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2007</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>280</fpage>
         <url>http://www.biomedcentral.com/1471-2105/8/280</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17678535</pubid>
               <pubid idtype="doi">10.1186/1471-2105-8-280</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>16</day>
               <month>3</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>01</day>
               <month>8</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>01</day>
               <month>8</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Shtatland et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Peptides are important molecules with diverse biological functions and biomedical uses. To date, there does not exist a single, searchable archive for peptide sequences or associated biological data. Rather, peptide sequences still have to be mined from abstracts and full-length articles, and/or obtained from the fragmented public sources.</p>
            </sec>
            <sec>
               <st>
                  <p>Description</p>
               </st>
               <p>We have constructed a new database (PepBank), which at the time of writing contains a total of 19,792 individual peptide entries. The database has a web-based user interface with a simple, Google-like search function, advanced text search, and BLAST and Smith-Waterman search capabilities. The major source of peptide sequence data comes from text mining of MEDLINE abstracts. Another component of the database is the peptide sequence data from public sources (ASPD and UniProt). An additional, smaller part of the database is manually curated from sets of full text articles and text mining results. We show the utility of the database in different examples of affinity ligand discovery.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>We have created and maintain a database of peptide sequences. The database has biological and medical applications, for example, to predict the binding partners of biologically interesting peptides, to develop peptide-based therapeutic or diagnostic agents, or to predict molecular targets or binding specificities of peptides resulting from phage display selection. The database is freely available at <url>http://pepbank.mgh.harvard.edu/</url>, and the text mining source code (Peptide::Pubmed) is freely available from the same site as well as on CPAN (<url>http://www.cpan.org/</url>).</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Peptides have emerged as important affinity ligands for diagnostic and therapeutic medical uses as well as materials for a host of applications in biotechnology. While many excellent databases exist that provide protein sequence data <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>, protein interaction data <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>, and peptide data <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>, a substantial fraction of literature data remains untapped. Unfortunately, the wealth of peptide sequences in these sources is often difficult to access by modern methods of sequence similarity searching, because peptide sequences are not extracted in a suitable format. We therefore sought to address this issue by combining three approaches: automatically mining MEDLINE abstracts for peptide sequences, integrating the existing bioinformatics sources, and manually curating full text articles and MEDLINE text mining results. The data, available through a web-based interface for simple and more advanced text search and BLAST and Smith-Waterman sequence similarity search, proved useful in our own work. Examination of initial data yielded some surprises as well, providing an incentive for us to make further improvements to the database. We hope that the peptide database, the associated tools, and the text mining algorithm will be useful to the larger biomedical community.</p>
         <p>Peptides are defined by the International Union of Pure and Applied Chemistry and the International Union of Biochemistry and Molecular Biology (IUPAC-IUB) as compounds "produced by amide formation between a carboxyl group of one amino acid and an amino group of another" <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. In this paper, we use the term "peptides" as a common synonym for oligopeptides, which are defined as having "fewer than about 10&#8211;20 residues" <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. We thus currently use an IUPAC-IUB length cut-off of 20 amino acid residues or fewer. Many of the peptides used as pharmaceutical and diagnostic agents fall within this cut-off.</p>
         <p>Naturally occurring peptides function as hormones, transmitters, and modulators of numerous biological processes <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Both naturally occurring and synthetic peptides are used in therapeutic applications <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, for example somatostatin analogs in tumor radiotherapy <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp> and oxytocin to induce labor <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. Examples of diagnostic uses include membrane-translocating agents <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, receptor targeting agents <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, and enzyme substrates <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. Driven by the great interest in the diverse applications of peptides, the new field of peptidomics is rapidly emerging <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. The functions of peptides, including their interacting partners, are determined by their sequence and, as with longer proteins, can be predicted based on sequence similarity.</p>
         <p>Prior knowledge can be used to predict or shorten the list of possible binding partners of a given peptide of interest, provided a peptide shares significant sequence similarity with other peptides or proteins whose binding partners are known <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B23">23</abbr></abbrgrp>. One can also use a sequence similarity search to remove peptides with similarity to other peptides with known, undesirable properties such as non-specific binding <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> or toxicity. Computational predictions are relatively fast and inexpensive, but require a peptide sequence database with links to peptide data, for use with sequence similarity search methods such as basic local alignment search tool (BLAST) <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp> or Smith-Waterman search <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp>. The non-sequence (text) data in such a peptide database can be queried with text search tools for biological, therapeutic or diagnostic applications, for example to find peptides that are enzyme inhibitors and whose sequences are available.</p>
         <p>We searched through the existing bioinformatics sources, and found no single source that fully suited our needs. With the exception of the Receptor Ligand Contacts (RELIC) database and web-server <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> and Artificially Selected Proteins/Peptides Database (ASPD) <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, most large protein sequence and interaction databases that allow both sequence similarity and text annotation searches have two major drawbacks. First, most of their sequences are of biological origin, while many phage display <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp> or combinatorial screens yield non-biological sequence hits. There is no large repository of chemically generated unnatural sequences, similar to what PubChem <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> or ChemBank <abbrgrp><abbr bid="B31">31</abbr></abbrgrp> are for compounds. Second, there exists less data on short peptides than on longer proteins, and usually no facile way to restrict the search to short sequences only. This is important because performing an unrestricted sequence similarity search often results in a large proportion of false positives due to hits to proteins in which the peptide sequence is buried and not accessible for binding, or is in a conformation different from that in a shorter peptide. The same sequence may have different binding properties when displayed on a phage versus when presented as part of the native protein <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Sequence similarity based predictions are further hampered for conformationally constrained peptides, designed specifically to have properties different from the same sequence in linear form <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. 
The ASPD <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> and RELIC <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> databases do not have these drawbacks and are well curated, but are relatively small compared with the large amount of sequence data in the MEDLINE abstracts. For example, the ASPD database has 1,717 entries with sequences of 20 amino acids or fewer. RELIC (a server with many useful peptide sequence analysis tools) has 3,632 peptide sequences that result from phage display selections, but only 7 distinct targets to which they bind. Other peptide databases have different purposes and are more specialized by design, for example antimicrobial (the Antimicrobial Peptide Database (APD) <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, and others <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>), phosphorylation sites (Scansite <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>), or major histocompatibility complex related (SYFPEITHI <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>, EPIMHC <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>, and others <abbrgrp><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr></abbrgrp>).</p>
         <p>In order to create a database suitable for the identification of affinity ligands, we developed text mining methods to extract peptide sequences from MEDLINE abstracts and compiled them in a single, easily searchable database. While far from complete, the database is a useful publicly available source of peptide sequences and the associated data. Below we show how the database was constructed, how it functions, and how it can be used to identify targeting ligands.</p>
      </sec>
      <sec>
         <st>
            <p>Construction and content</p>
         </st>
         <sec>
            <st>
               <p>Database model and overview</p>
            </st>
            <p>The database model (Figure <figr fid="F1">1</figr>) was adopted from the Proteomics Standards Initiative Molecular Interactions (PSI MI) model for storage of biological interactions <abbrgrp><abbr bid="B43">43</abbr></abbrgrp> and was extended to facilitate secure access to curate entries. Each entry is associated with a "peptide sequence", an "interactor", an "experiment" and a "group". The group serves to assign user permissions for curating entries. Separate tables, which are not shown for clarity, define controlled vocabularies. These were adopted where possible from the existing ontologies. Organism vocabulary used for peptides, interactors and interactions was adopted from the National Center for Biotechnology Information (NCBI) Taxonomy <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. The detection method vocabulary, utilized for experiments, was adopted from PSI MI ontology using the descendants of the term MI:0001, "interaction detection method".</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Database core model</p>
               </caption>
               <text>
                  <p><b>Database core model</b>. MySQL tables are shown as rectangles. Mandatory attributes are in <b>bold</b>, optional are in <it>italics</it>. Relationships are shown as lines, with the arrows pointing from the primary to the foreign keys, and multiplicities as shown.</p>
               </text>
               <graphic file="1471-2105-8-280-1"/>
            </fig>
            <p>The application uses the open source Ruby on Rails framework <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> with a MySQL database <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> as the backend. The BLAST search <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp> was implemented using the NCBI binaries <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. The Smith-Waterman search was implemented using the SSEARCH program from the FASTA3 distribution <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B47">47</abbr></abbrgrp>. In addition to sequences, the databases for sequence similarity searches included motifs, with any variable positions replaced with X for simplicity (for example, motif 'P(P/S)GH(Y/F)K' was used as 'PXGHXK').</p>
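            <p>The motif simplification described above can be illustrated with a minimal Python sketch. It is not the PepBank code: the function name <it>collapse_motif</it> and the regular expression are assumptions made for illustration, and only the parenthesised form of a variable position is handled.</p>

```python
import re

# Matches a parenthesised set of alternatives such as "(P/S)" or "(Y/F)".
_VARIABLE_POSITION = re.compile(r"\([A-Z](?:/[A-Z])+\)")


def collapse_motif(motif: str) -> str:
    """Replace every variable position in a motif with the wildcard 'X'."""
    return _VARIABLE_POSITION.sub("X", motif)


# Example taken from the text.
print(collapse_motif("P(P/S)GH(Y/F)K"))  # PXGHXK
```

            <p>Collapsing each variable position to a single X keeps the motif the same length as any peptide it describes, so it can be indexed alongside ordinary sequences.</p>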
            <p>The database was constructed from the following sources (with the current number of entries in parentheses): text mining of MEDLINE abstracts (13,596 entries), manual curation of full text PDF articles (859), and other public sources: ASPD (1,717) and UniProt (3,620), as described in the sections below. The total number of entries is currently 19,792. A small fraction of the peptide sequences resulting from MEDLINE abstract text mining were manually curated: 1,773 entries were validated as correct peptide sequences, and 170 of those were more fully annotated with additional interaction data present in the abstract. The database continues to grow as new data are added to sources such as MEDLINE and UniProt.</p>
         </sec>
         <sec>
            <st>
               <p>MEDLINE abstract text mining</p>
            </st>
            <p>In order to identify abstracts with peptide sequences, the entire MEDLINE database with its 15 million records was downloaded from the National Library of Medicine (NLM) ftp site <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>. The text mining code was written in Perl, a language selected due to its text processing capabilities, and widely used in many important biomedical literature text mining applications <abbrgrp><abbr bid="B49">49</abbr><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr></abbrgrp>. Data were processed in 3 steps. First, each abstract was assigned a score based on how likely it was to contain a peptide sequence anywhere within the text. Second, each individual word was assigned a score based on how likely it was to contain a peptide sequence. For each word, a combined score was then computed based on both the word score and the abstract score. Thus, in total, we used three types of scores (abstract, word, and combined). Third, the sequences associated with the words were cleaned, and ambiguities resolved. After these tasks were completed, the words were ranked by the combined score and included in the peptide database based on empirically determined thresholds. Each unique sequence per abstract identified by text mining was assigned one database entry. Multiple occurrences of the same sequence in different forms, such as 'RGD' and 'Arg-Gly-Asp', were considered a single entry.</p>
            <p>Text mining was performed on a Fedora Core 5 Linux virtual machine running on an HP DL320 server with two 3 GHz Xeon processors, allocated 512 MB of RAM. The data resided on a file server connected via Gigabit Ethernet. Text mining of the entire MEDLINE (baseline distribution and updates) took 44 hours, with an additional 16 hours for pre-processing: downloading, uncompressing/compressing and parsing MEDLINE distribution files. The resulting database was 35 MB. Incremental weekly processing of MEDLINE updates took on average under 1 hour.</p>
            <sec>
               <st>
                  <p>Step 1. Classification of abstracts</p>
               </st>
               <p>MEDLINE entries that were duplicates, lacked abstracts, or were published before 1950 were removed. The older abstracts, which were published prior to the development of Edman degradation <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>, did not contain peptide sequences. Several pattern categories of interest were created, such as those related to peptides, phage display, proteases, and others. For each abstract, the total number of matches to patterns in each category was computed. For example, for the 'peptide' category this included the number of matches to 'peptid' or 'hormone', and, if at least one of these patterns was present, additionally included the number of matches to less specific patterns such as 'sequenc' or 'motif'. The title, abstract, medical subject heading terms and the chemical list were all scored. Some of the abstracts, especially those published before the mid-1990s, include peptide sequences which are related to protein digestion and sequencing. These sequences usually represent parts of longer proteins, rather than individual peptides, and were thus scored differently. Any matches to this 'digestion' category of patterns were counted. The abstract score was computed as the sum of the number of matches to categories 'peptide' and 'phage', minus the number of matches to the 'digestion' category. Additional terms were added to the abstract score for matches to more than one pattern category in the same abstract, for example the number of matches to patterns from the 'phage' category multiplied by the number of matches to the 'peptide' category. Phosphorylated peptides, such as those selected using the oriented phosphopeptide library technique <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>, were not scored any differently from other peptides, that is, they were neither included nor excluded specifically. There is a useful resource, Scansite, dedicated specifically to the phosphorylated peptides <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>, which can be used for this application. Texts with a large number or fraction of words in all caps tend to produce many false positives, so the abstract score was decreased for such abstracts. The abstract score was then transformed for convenience to the (0,1) interval using the function: <it>y = x/(1+x)</it>. An abstract score below 0 was set to 0. An abstract related to peptide sequences tended to have a score close to 1, and an unrelated one a score close to 0.</p>
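               <p>A minimal Python sketch of this abstract scoring step is given below. It is an illustration under stated assumptions rather than the Peptide::Pubmed implementation (which is written in Perl): the 'phage' and 'digestion' pattern lists are hypothetical, only one cross-category term is included, the penalty for all-caps text is omitted, and negative raw scores are set to 0 before the transformation is applied.</p>

```python
import re

# 'peptid', 'hormone', 'sequenc' and 'motif' come from the text;
# the 'phage' and 'digestion' pattern lists are illustrative guesses.
SPECIFIC_PEPTIDE = ["peptid", "hormone"]
GENERIC_PEPTIDE = ["sequenc", "motif"]
PHAGE = ["phage display"]
DIGESTION = ["digest", "tryptic"]


def count_matches(patterns, text):
    """Total number of case-insensitive occurrences of all patterns."""
    return sum(len(re.findall(p, text, flags=re.IGNORECASE)) for p in patterns)


def abstract_score(text):
    """Score how likely a text is to contain a peptide sequence."""
    peptide = count_matches(SPECIFIC_PEPTIDE, text)
    # Less specific patterns only count when a specific one is present.
    if peptide > 0:
        peptide += count_matches(GENERIC_PEPTIDE, text)
    phage = count_matches(PHAGE, text)
    digestion = count_matches(DIGESTION, text)

    raw = peptide + phage - digestion
    raw += phage * peptide      # extra term for co-occurring categories
    raw = max(raw, 0)           # assumption: clamp before transforming
    return raw / (1 + raw)      # y = x / (1 + x)


print(abstract_score("Phage display selection yielded a peptide motif."))
print(abstract_score("Tryptic digestion of the protein was performed."))
```

               <p>Clamping before the transformation matters in this sketch because <it>x/(1+x)</it> is not confined to the unit interval for negative <it>x</it>.</p>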
            </sec>
            <sec>
               <st>
                  <p>Step 2. Classification of words</p>
               </st>
               <p>Each abstract was split into words on whitespace. Each word was matched against a series of peptide sequence pattern categories, in order of decreasing specificities of patterns, until the first successful match. The pattern categories were: full names of amino acids (longest, most specific, such as 'valine' or 'valyl'), 3 letter symbols (such as 'Val') and 1 letter symbols (such as 'V', least specific). Because the recommendations of IUPAC-IUB for reporting peptide sequences <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> were not followed in a large number of abstracts, we had to use a complex classification method and added methods to clean sequences and resolve the ambiguities. Any word that matched a pattern of peptide sequence of at least two amino acids was assigned a score. The score was an empirically calculated measure used to distinguish peptide sequences from other terms, such as nucleic acid sequences, gene symbols, acronyms and all caps English words, which they sometimes closely resemble or are even identical to, when taken out of context.</p>
               <p>The above score was defined by several factors. The length/amino acid symbol factor was based on the length of the sequence in amino acids (higher score for longer sequence patterns, which were more specific) and on the type of amino acid symbols used (higher score for the more specific full names than for 1 letter symbols). The degenerate amino acid factor was based on the fraction and the total number of degenerate amino acids (lower score for degenerate amino acids such as 'X' or 'Xaa', which may represent, for example, the starting randomized phage display library rather than the selected peptide). Other factors reflected similarity to any of the following categories: Roman numerals, nucleic acid sequences, gene names and gene symbols, English words, scientific terms or abbreviations, or a combination of the above. The list of abbreviations was derived from the comprehensive ADAM database <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>. The list of gene names and symbols was derived from the Entrez Gene <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>, UniProt <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> and HUGO Gene Nomenclature Committee (HGNC) <abbrgrp><abbr bid="B56">56</abbr></abbrgrp> databases. An additional factor represented similarity of a given word to protein sequences relative to English words. It was computed for all words that matched a pattern of sequences in 1 letter amino acid symbols. The word was broken up into overlapping k-mers. For example, for k = 3, word 'EYHHYNK' was broken up into 'EYH', 'YHH', 'HHY', 'HYN', 'YNK'. The proportions of all possible k-mers were precomputed in the databases of known protein sequences (from UniProt) and non-sequences (here, English words from MEDLINE abstracts not related to peptides), designated P<sub>p</sub> and P<sub>n</sub>, respectively. We used the databases of protein sequences and non-sequences of 8 &#215; 10<sup>7</sup> k-mers each, with k = 3, replacing counts of 0 with 1 to avoid division by 0. The protein/English word similarity factor was defined as the product over all overlapping k-mers within the word of (P<sub>p</sub>/P<sub>n</sub>). For a word with all k-mers equally frequent among sequences and non-sequences, the factor was 1, while for a word such as 'EYHHYNK' in which on average the k-mers were more frequent in protein sequences than in English words, the factor was greater than 1.</p>
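               <p>The protein/English word similarity factor can be sketched in Python as follows. The function names and the tiny reference corpora are hypothetical and serve only to make the example runnable; the reference sets described in the text contained 8 x 10^7 k-mers each.</p>

```python
from collections import Counter


def kmers(word, k=3):
    """Overlapping k-mers of a word, in order of occurrence."""
    return [word[i:i + k] for i in range(len(word) - k + 1)]


def kmer_counts(corpus, k=3):
    """Count k-mers over a list of strings; returns (counts, total)."""
    counts = Counter(km for text in corpus for km in kmers(text, k))
    return counts, sum(counts.values())


def similarity_factor(word, protein, english, k=3):
    """Product over the k-mers of the word of P_p / P_n.

    Counts of 0 are replaced with 1 to avoid division by 0.
    """
    p_counts, p_total = protein
    n_counts, n_total = english
    factor = 1.0
    for km in kmers(word, k):
        p_p = max(p_counts[km], 1) / p_total
        p_n = max(n_counts[km], 1) / n_total
        factor *= p_p / p_n
    return factor


# Toy reference sets (illustrative only).
protein_ref = kmer_counts(["MEYHHYNKL", "EYHHYNK"])
english_ref = kmer_counts(["ABSTRACT", "PEPTIDE"])

print(kmers("EYHHYNK"))
print(similarity_factor("EYHHYNK", protein_ref, english_ref))
```

               <p>For long words the product can become very large or very small, so a practical implementation might sum logarithms of the ratios instead; that is a design suggestion, not something stated in the text.</p>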
               <p>The word score was transformed to the (0,1) interval in the same way as the abstract score. The word score thus depended only on the properties of the word itself, rather than on the context (the properties of the abstract). The combined word/abstract score was then computed for each word, and reflected the abstract score, the word score, and the maximum word score for all words in the abstract (included because sequences tend to occur together in abstracts). The combined word/abstract score s<sub>c</sub> was computed according to the formula</p>
               <p>
                  <display-formula>s<sub>c</sub> = s<sub>a</sub>(w<sub>1</sub>s<sub>w</sub> + w<sub>2</sub>s<sub>m</sub>), for s<sub>w</sub> > 0,</display-formula>
               </p>
               <p>
                  <display-formula>s<sub>c</sub> = 0, for s<sub>w</sub> = 0,</display-formula>
               </p>
               <p>where s<sub>a</sub> is the abstract score, s<sub>w</sub> is the word score of the current word, s<sub>m</sub> is the maximum word score for all words in the abstract, and w<sub>1</sub>, w<sub>2</sub> are the weights (w<sub>1</sub> > w<sub>2</sub>). The combined score varied in the (0,1) interval. Words that matched peptide sequence patterns in abstracts related to peptides tended to have a score close to 1, and close to 0 otherwise.</p>
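               <p>The combined score can be written as a short Python function. The weight values below are placeholders: the text states only that the first weight is larger than the second, and the choice of weights summing to 1 is an assumption that keeps the result within the unit interval.</p>

```python
def combined_score(s_a, s_w, s_m, w1=0.7, w2=0.3):
    """s_c = s_a * (w1 * s_w + w2 * s_m) for s_w > 0, else 0.

    w1 and w2 are hypothetical values; the text only gives w1 > w2.
    """
    if s_w == 0:
        return 0.0
    return s_a * (w1 * s_w + w2 * s_m)


def score_abstract_words(s_a, word_scores):
    """Combine one abstract score with the scores of all its words."""
    s_m = max(word_scores.values(), default=0.0)
    return {w: combined_score(s_a, s_w, s_m) for w, s_w in word_scores.items()}


# Hypothetical scores for three words found in one abstract.
print(score_abstract_words(0.9, {"RGD": 0.8, "GVTS": 0.5, "DNA": 0.0}))
```

               <p>In this example the lower-scoring word 'GVTS' is lifted by the presence of the higher-scoring 'RGD' in the same abstract, while a word with a word score of 0 stays at 0 regardless of context.</p>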
            </sec>
            <sec>
               <st>
                  <p>Step 3. Clean-up</p>
               </st>
               <p>Words that matched peptide sequence patterns were cleaned in a series of steps and converted to 1 letter amino acid symbols, as follows. The terminal marks and modifications, such as 'H(2)N-' or '-CO-Ph', were removed. Numbers representing amino acid positions were removed. Other modifications, such as phosphate in 'pY', were removed. Motifs such as '(L/I)' or 'L/I' were resolved. Amino acids that do not have a 1 letter IUPAC symbol were replaced with X. As a result, a large variety of different sequence formats were resolved, including 'N-acetyl-l-aspartyl-l-glutamyl-l-valyl-l-aspartyl-7-amino-4-methylcoumarin' to 'DEVD', 'Gly1-Val2-Thr3-Ser4' to 'GVTS', '(Arg-Glu(EDANS)-Ser-Gln)' to 'RESQ', 'TRDI-pY-ETD-pY-pY-RK' to 'TRDIYETDYYRK', and others.</p>
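               <p>A simplified Python sketch of part of this clean-up is shown below. It covers only hyphen-separated words with position numbers, phosphate marks on S, T or Y, and 3 letter symbols; terminal groups, full amino acid names, bracketed modifications and motifs are not handled. The function name <it>clean_sequence</it> is an assumption made for illustration.</p>

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def clean_sequence(word):
    """Convert a hyphen-separated peptide word to 1 letter symbols."""
    residues = []
    for token in word.split("-"):
        token = re.sub(r"\d+", "", token)           # 'Gly1' becomes 'Gly'
        token = re.sub(r"^p(?=[STY]$)", "", token)  # 'pY' becomes 'Y'
        if not token:
            continue                                # skip empty fragments
        if token in THREE_TO_ONE:
            residues.append(THREE_TO_ONE[token])
        elif token.isalpha() and token.isupper():
            residues.append(token)                  # already 1 letter symbols
        else:
            residues.append("X")                    # no 1 letter symbol known
    return "".join(residues)


for example in ["Gly1-Val2-Thr3-Ser4", "TRDI-pY-ETD-pY-pY-RK", "Gly-Orn-Ala"]:
    print(example, clean_sequence(example))
```

               <p>Processing one hyphen-separated token at a time lets the 3 letter and 1 letter conventions coexist within the same word, which is convenient because abstracts often mix them.</p>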
               <p>To estimate the precision of text mining, 50 sequences with the combined score above the threshold for inclusion in PepBank were selected at random from the text mining output. Each of these positive predictions was manually checked against two criteria: whether the word contained a peptide sequence (40 out of 50 correct, precision = 0.8), and whether the word contained a peptide sequence and the sequence was also parsed 100% correctly (35 out of 50 correct, precision = 0.7). If the identified sequence was a partial protein sequence, rather than a peptide or a phage display sequence, it was considered an error: such sequences are typically entered in protein databases and do not need to be mined from text (most of the errors in precision were of this type). A sequence with one or more incorrect amino acids was also considered an error.</p>
               <p>To estimate recall, we created a separate test set of 50 sequences by searching in PubMed for recent review articles using as a query "peptide OR peptides" alone or in combination with "sequence OR sequences", and followed the PubMed abstract links for the references cited in the reviews. Peptide sequences were manually extracted from the abstracts without any automated pattern matching. The text mining output with the combined score above the threshold for inclusion in PepBank was matched against these true positive cases. Again, for each case we manually verified whether the algorithm found the word that contained this peptide sequence (12 out of 50 correct, recall = 0.24), and whether the algorithm found the word and the sequence was also parsed 100% correctly (10 out of 50 correct, recall = 0.2). Most of the errors in recall were due to blanks (often typos) inside peptide sequences or due to unrecognized amino acid modifications.</p>
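               <p>The precision and recall values reported above follow directly from the counts. The short Python sketch below simply reproduces that arithmetic; the labels and the helper name <it>rate</it> are illustrative.</p>

```python
from fractions import Fraction

# Counts reported in the text: (correct, total) for each criterion.
COUNTS = {
    "precision, sequence found": (40, 50),
    "precision, found and parsed correctly": (35, 50),
    "recall, sequence found": (12, 50),
    "recall, found and parsed correctly": (10, 50),
}


def rate(correct, total):
    """Fraction of correct cases, computed exactly and returned as a float."""
    return float(Fraction(correct, total))


for label, (correct, total) in COUNTS.items():
    print(f"{label}: {correct}/{total} = {rate(correct, total)}")
```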
               <p>The pioneering method to identify DNA and protein sequences in text, based on Markov models, was described by Wren and co-workers <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>. Our text mining method, while similar in spirit, has different goals and thus uses a different sequence identification strategy. One of our main goals was to rapidly identify peptides with potential therapeutic and diagnostic utility (including those derived from phage display peptides), rather than identifying peptide epitopes and providing an aid to their manual curation. We also use extensive context information from the abstract, and collect peptide motifs in addition to sequences. We clean the sequences and provide access to the data for biologists through a simple web-based interface for text and sequence similarity searches. We do not place a minimum length restriction on sequences, such as 6 amino acids, because many therapeutic peptides are relatively short, for example the well-known RGD motif and many others found in phage display. Given the substantial differences in goals and methods between our approach and that of others, it may be interesting in the future to develop a hybrid method combining the strengths of both approaches.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Other sources</p>
            </st>
            <p>All peptide sequences with length 20 or below were extracted from ASPD <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> and UniProt <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, and fields that mapped to PepBank were parsed and stored (for example, interactor fields from ASPD, peptide fields from UniProt). The links from PepBank to the source databases were provided for all entries. Many of the peptides were stored in UniProt as part of the longer precursor proteins, producing peptides on cleavage. These peptide sequences were extracted using the UniProt feature table by selecting those with feature key "peptide" or "chain" and feature length under 20. Additional entries were manually curated, capturing the available interaction data, from the full text articles on phage display in PDF format. The articles were chosen to represent a small but diverse selection of reports within this field.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Utility and discussion</p>
         </st>
         <sec>
            <st>
               <p>User interface</p>
            </st>
            <p>The web-based user interface to PepBank offers text search (both Quick and Advanced), as well as sequence similarity search (BLAST and Smith-Waterman algorithms). The Quick Search function offers a simple, Google-like search across all fields for biologists looking for peptide data. Advanced Search options include querying data by individual fields. Text search supports exact matching, a wildcard (*) and an any-single-character placeholder (_), which makes it possible, for example, to use a sequence pattern as a query. The results of the text search are displayed as a table sortable in the browser, with hyperlinks to the original sources (MEDLINE/PubMed, ASPD, UniProt) and to more detailed information.</p>
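            <p>To make the wildcard semantics concrete, the sketch below translates a query containing * and _ into a regular expression. How PepBank implements its text search is not described here, so the function names, the whole-field matching and the case-insensitive comparison are all assumptions made for illustration.</p>

```python
import re


def pattern_to_regex(pattern):
    """Translate a query with * and _ wildcards into a compiled regex.

    Assumed semantics: * matches any run of characters (including none),
    _ matches exactly one character, everything else is literal.
    """
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(".*")
        elif char == "_":
            parts.append(".")
        else:
            parts.append(re.escape(char))  # keep literal characters literal
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern, text):
    """True if the whole text matches the wildcard pattern."""
    return pattern_to_regex(pattern).fullmatch(text) is not None


print(matches("R_D", "RGD"))         # one placeholder, one residue
print(matches("GETRA*", "GETRAPL"))  # prefix query with a trailing wildcard
print(matches("RGD", "RGDS"))        # no wildcard, so only an exact match counts
```

            <p>Under these assumptions a query such as R_D retrieves every three-residue sequence that starts with R and ends with D, while *RGD* retrieves any sequence containing the RGD motif.</p>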
         </sec>
         <sec>
            <st>
               <p>Text search example: VEGFR-related peptides</p>
            </st>
            <p>To illustrate the utility of PepBank, we use the example of identifying peptides with affinity for VEGFR1, an important therapeutic target <abbrgrp><abbr bid="B58">58</abbr></abbrgrp>. The user can search for VEGFR using either Quick or Advanced Search, obtain a set of peptide sequences related to this target, and view details for the selected sequences. In the example shown in Figure <figr fid="F2">2</figr>, the sequence 'WHSDMEWWYLLG' is identified <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>. Prompted by these results, the user of PepBank may be interested in testing this peptide sequence in novel forms (for example, as dendrimers or conjugated to nanoparticles), or for novel biomedical applications (imaging different tumor types, atherosclerosis, or arthritis). There is currently no database where the user can easily obtain such information as it relates to molecular targets and peptide sequences. One can also query directly for a biological process (such as apoptosis or angiogenesis) or for the target cell line or tissue (such as BICR-H1 or U937).</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Web-based user interface of PepBank</p>
               </caption>
               <text>
                  <p><b>Web-based user interface of PepBank</b>. Illustration of a typical user workflow. The user enters the query with Quick or Advanced Search. The results are returned in a table sortable in the browser. The user selects the entry or entries of interest. The sequence in the example shown was obtained by text mining and was then manually curated. The score, between 0 and 1, reflects the degree of confidence in the interaction (a higher score means more confidence). Manually curated entries receive a higher score than entries from automated text mining.</p>
               </text>
               <graphic file="1471-2105-8-280-2"/>
            </fig>
            <p>To determine whether the database would yield leads against known drug targets, we randomly chose 20 defined drug targets from the set of 547 approved drug targets in DrugBank <abbrgrp><abbr bid="B60">60</abbr></abbrgrp>. The randomly chosen drug targets were not skewed towards peptide receptors and included: squalene epoxidase, RAF proto-oncogene serine/threonine-protein kinase, muscarinic acetylcholine receptor M4, opioid mu receptor (OP3), adenosine A1 receptor, GABA transaminase, amidophosphoribosyltransferase precursor, tryptophan 5-hydroxylase 1, apoptosis regulator Bcl-2, matrix protein M2, vascular endothelial growth factor receptor 2 precursor, amiloride-sensitive sodium channel gamma-subunit, ribonucleotide reductase, cAMP phosphodiesterase, coagulation factor VIII, high affinity immunoglobulin epsilon receptor alpha-subunit precursor, retinol-binding protein I, glycine alpha 2 receptor, cytochrome P450 51, GABA-A receptor subunit <it>(C. elegans)</it>. Relevant peptides were defined as those interacting with the target or its ortholog, or modulating the function of the target, for example by acting as a competitor. Relevant peptides were identified in our database for approximately 25% of the above drug targets.</p>
         </sec>
         <sec>
            <st>
               <p>Sequence similarity search examples</p>
            </st>
            <p>As an illustrative example, we performed an all-against-all BLAST search of PepBank sequences. One of the surprises was the discovery of an exact match to the sequence 'GETRAPL' from a phage display selection for peptides that bind to secreted protein acidic and rich in cysteine (SPARC) <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>. The sequence had a BLAST hit with an E-value of 0.06 to an isolate from a phage display selection of peptides that bind human saphenous vein smooth muscle cells <abbrgrp><abbr bid="B62">62</abbr></abbrgrp>. Following up on the BLAST results, we found that in addition to these two selections, the exact same sequence was isolated independently multiple times by different groups in selections with unrelated targets. GETRAPL was found in phage display selections of peptides that bind human immunodeficiency virus type 1 (HIV-1) accessory viral protein (Vpr) <abbrgrp><abbr bid="B63">63</abbr></abbrgrp>, chromatin high mobility group protein 1, box A (HMGB1) from rat <abbrgrp><abbr bid="B64">64</abbr></abbrgrp>, mouse skeletal muscle tissue <it>in vivo </it><abbrgrp><abbr bid="B65">65</abbr></abbrgrp>, and mouse brain cells <it>in vivo </it><abbrgrp><abbr bid="B66">66</abbr></abbrgrp>.</p>
            <p>We suggest that one use of PepBank is to search peptide sequences of interest with the BLAST or Smith-Waterman algorithms to find any important similarities to the known peptides collected in our database. In this example, the search can be used to remove a relatively nonspecific binder, GETRAPL. Note that the ability to search PepBank with these tools is unique to this resource: an exact match may be easy to find elsewhere, but a partial match such as GETRA used as a query finds GETRAPL only in PepBank, and not in PubMed <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> or on Google. Searching with BLAST <abbrgrp><abbr bid="B67">67</abbr></abbrgrp> or with Smith-Waterman/SSEARCH methods <abbrgrp><abbr bid="B47">47</abbr></abbrgrp> using GETRAPL as a query against the nr database <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> returns none of the peptide hits cited above. The large interaction database IntAct <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> returns no hits at all for a GETRAPL query.</p>
            <p>Another surprising discovery in the all-against-all BLAST search of PepBank sequences was the multiple occurrences of the sequence SVSVGMKPSPRP. The sequence had several exact matches over its entire length of 12 amino acids, with an E-value of 1 &#215; 10<sup>-6</sup>. It was isolated in a phage display selection for peptides that bind to DNA <abbrgrp><abbr bid="B68">68</abbr></abbrgrp>. In this selection SVSVGMKPSPRP was the only sequence studied due to its dominance (9 out of 10) in the selected pool. The exact same sequence was isolated in phage display selections for peptides binding to human monoclonal IgM <abbrgrp><abbr bid="B69">69</abbr></abbrgrp>, and to the mirror image of Alzheimer's disease amyloid peptide Abeta(1&#8211;42) <abbrgrp><abbr bid="B70">70</abbr></abbrgrp>. The sources for these sequences were MEDLINE abstract text mining, the ASPD database, and manually curated full-text articles, respectively. In addition, SVSVGMKPSPRP occurs in several patents <abbrgrp><abbr bid="B71">71</abbr><abbr bid="B72">72</abbr></abbrgrp>. Several groups note the multiple isolation of this remarkable sequence in their own and in other, unrelated, experiments <abbrgrp><abbr bid="B73">73</abbr><abbr bid="B74">74</abbr></abbrgrp>. The sequence has also been identified in a recent excellent review <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, which covers the important topic of target-unrelated sequences in phage display. Interestingly, all of the studies with both GETRAPL and SVSVGMKPSPRP were done with phage display libraries from the same manufacturer, suggesting a library- or methodology-specific phenomenon. Both sequences illustrate one of the suggested uses of PepBank, namely searching it with a sequence query using the BLAST or Smith-Waterman algorithms to find any important similarities to the known peptides.</p>
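            <p>The idea of flagging a sequence that recurs across unrelated selections can be sketched in a few lines. The example below deliberately swaps in exact-match grouping for the BLAST search described above, so it finds only identical sequences and not partial matches. The function name and the single-target control entry are hypothetical; the other sequence and target pairs are the ones listed in the text.</p>

```python
from collections import defaultdict


def promiscuous_sequences(entries, min_targets=2):
    """Group (sequence, target) pairs and keep sequences seen with
    at least min_targets distinct targets."""
    targets_by_sequence = defaultdict(set)
    for sequence, target in entries:
        targets_by_sequence[sequence.upper()].add(target)
    return {
        sequence: sorted(targets)
        for sequence, targets in targets_by_sequence.items()
        if len(targets) >= min_targets
    }


entries = [
    ("GETRAPL", "SPARC"),
    ("GETRAPL", "human saphenous vein smooth muscle cells"),
    ("GETRAPL", "HIV-1 Vpr"),
    ("GETRAPL", "HMGB1 box A (rat)"),
    ("GETRAPL", "mouse skeletal muscle tissue in vivo"),
    ("GETRAPL", "mouse brain cells in vivo"),
    ("SVSVGMKPSPRP", "DNA"),
    ("SVSVGMKPSPRP", "human monoclonal IgM"),
    ("SVSVGMKPSPRP", "mirror image of Abeta(1-42)"),
    ("ACDEFGHIK", "hypothetical single target"),  # made-up control entry
]

for sequence, targets in promiscuous_sequences(entries).items():
    print(sequence, len(targets), "targets")
```

            <p>A sequence that passes this filter is a candidate target-unrelated binder and deserves a closer look before it is pursued as a specific ligand.</p>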
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>A new text mining tool was developed and used to identify peptide sequences in MEDLINE abstracts. These data were combined with two public sources of peptide sequence data, ASPD and UniProt, as well as with manually curated peptide data. A database application was developed to query the data using text and sequence similarity search through a web-based user interface. The utility of PepBank was demonstrated using several examples of peptide sequences. The results show that the database has valuable biological and medical applications. In the future, we plan to add other public sources of peptide data, such as the peptide subset of the Molecular Interaction database (MINT) <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>, and other sources for text mining, such as full-text journal articles. We also plan to apply machine learning techniques to improve the accuracy of sequence extraction by text mining. In the next release, we plan to add the ability to download the data in a standard format, such as PSI MI, and to search the database for peptide motifs.</p>
      </sec>
      <sec>
         <st>
            <p>Availability and requirements</p>
         </st>
         <p>The database is freely available at <url>http://pepbank.mgh.harvard.edu/</url>, and the text mining source code (Peptide::Pubmed) is freely available at the same address as well as on CPAN <url>http://www.cpan.org/</url>.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>TS designed and developed the text mining algorithm, curated the database contents, co-designed the database and the interface, and wrote the manuscript. DG designed and developed the database, the web application and the interface. MK co-curated the database contents. MP designed the architecture of the entire web site and designed the database and the interface. RW provided the conceptual design and the overall guidance of the entire project and co-wrote the manuscript. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Timo Duchrow, Vladimir Kubatin, Lee Josephson, Ching Tung, Elena Aikawa, Kim Kelly and Rajesh Anbazhagan for helpful discussions and feedback on the database, Jason Brown and Brett Dikeman for system administration work and Melissa Carlson for editorial assistance. We are grateful to the authors and curators of the resources we used: ADAM (in particular, Neil Smalheiser), MEDLINE/NLM, UniProt and ASPD, and to anonymous reviewers for their comments. This work was supported in part by NIH grants PO1-AI54904 (RW), P50-CA86355 (RW), U54-CA126515 (RW).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>The Universal Protein Resource (UniProt): an expanding universe of protein information</p>
            </title>
            <aug>
               <au>
                  <snm>Wu</snm>
                  <fnm>CH</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Bairoch</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Natale</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Barker</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Boeckmann</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Ferro</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gasteiger</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Magrane</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Mazumder</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>O'Donovan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Redaschi</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Suzek</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>Database issue</issue>
            <fpage>D187</fpage>
            <lpage>91</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347523</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381842</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj161</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Database resources of the National Center for Biotechnology Information</p>
            </title>
            <aug>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Barrett</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Benson</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>SH</fnm>
               </au>
               <au>
                  <snm>Canese</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Chetvernin</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>DiCuccio</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Edgar</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Federhen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Geer</snm>
                  <fnm>LY</fnm>
               </au>
               <au>
                  <snm>Kapustin</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Khovayko</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Landsman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Maglott</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Ostell</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Pruitt</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Schuler</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Sequeira</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Sherry</snm>
                  <fnm>ST</fnm>
               </au>
               <au>
                  <snm>Sirotkin</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Souvorov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Starchenko</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Tatusova</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Wagner</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Yaschenko</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <issue>Database issue</issue>
            <fpage>D5</fpage>
            <lpage>12</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1781113</pubid>
                  <pubid idtype="pmpid" link="fulltext">17170002</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl1031</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>The International Protein Index: an integrated database for proteomics experiments</p>
            </title>
            <aug>
               <au>
                  <snm>Kersey</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Duarte</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Karavidopoulou</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Proteomics</source>
            <pubdate>2004</pubdate>
            <volume>4</volume>
            <issue>7</issue>
            <fpage>1985</fpage>
            <lpage>1988</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/pmic.200300721</pubid>
                  <pubid idtype="pmpid" link="fulltext">15221759</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>The Biomolecular Interaction Network Database and related tools 2005 update</p>
            </title>
            <aug>
               <au>
                  <snm>Alfarano</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Andrade</snm>
                  <fnm>CE</fnm>
               </au>
               <au>
                  <snm>Anthony</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Bahroos</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Bajec</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bantoft</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Betel</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Bobechko</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Boutilier</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Burgess</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Buzadzija</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Cavero</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>D'Abreo</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Donaldson</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Dorairajoo</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Dumontier</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Dumontier</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Earles</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Farrall</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Feldman</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Garderman</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Gong</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Gonzaga</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Grytsan</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Gryz</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Gu</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Haldorsen</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Halupa</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Haw</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hrvojic</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hurrell</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Isserlin</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Jack</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Juma</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Khan</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kon</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Konopinsky</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Le</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Ling</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Magidin</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Moniakis</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Montojo</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Moore</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Muskat</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Ng</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Paraiso</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Parker</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Pintilie</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Pirone</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Salama</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Sgro</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shan</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Shu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Siew</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Skinner</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Snyder</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Stasiuk</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Strumpf</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Tuekam</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Tao</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Willis</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wolting</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wrong</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Xin</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Yao</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Yates</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Zheng</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Pawson</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ouellette</snm>
                  <fnm>BF</fnm>
               </au>
               <au>
                  <snm>Hogue</snm>
                  <fnm>CW</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <issue>Database issue</issue>
            <fpage>D418</fpage>
            <lpage>24</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540005</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608229</pubid>
                  <pubid idtype="doi">10.1093/nar/gki051</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>MINT: the Molecular INTeraction database</p>
            </title>
            <aug>
               <au>
                  <snm>Chatr-aryamontri</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ceol</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Palazzi</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Nardelli</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Schneider</snm>
                  <fnm>MV</fnm>
               </au>
               <au>
                  <snm>Castagnoli</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Cesareni</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <issue>Database issue</issue>
            <fpage>D572</fpage>
            <lpage>4</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1751541</pubid>
                  <pubid idtype="pmpid" link="fulltext">17135203</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl950</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>IntAct--open source resource for molecular interaction data</p>
            </title>
            <aug>
               <au>
                  <snm>Kerrien</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Alam-Faruque</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Aranda</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bancarz</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Bridge</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Derow</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Dimmer</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Feuermann</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Friedrichsen</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Huntley</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kohler</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Khadake</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Leroy</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Liban</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lieftink</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Montecchi-Palazzi</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Orchard</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Risse</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Robbe</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Roechert</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Thorneycroft</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hermjakob</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <issue>Database issue</issue>
            <fpage>D561</fpage>
            <lpage>D565</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1751531</pubid>
                  <pubid idtype="pmpid" link="fulltext">17145710</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl958</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>MIPS: analysis and annotation of proteins from whole genomes in 2005</p>
            </title>
            <aug>
               <au>
                  <snm>Mewes</snm>
                  <fnm>HW</fnm>
               </au>
               <au>
                  <snm>Frishman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Mayer</snm>
                  <fnm>KF</fnm>
               </au>
               <au>
                  <snm>Munsterkotter</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Noubibou</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Pagel</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Rattei</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Oesterheld</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ruepp</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Stumpflen</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>Database issue</issue>
            <fpage>D169</fpage>
            <lpage>D172</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347510</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381839</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj148</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Human protein reference database--2006 update</p>
            </title>
            <aug>
               <au>
                  <snm>Mishra</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Suresh</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kumaran</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kannabiran</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Suresh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bala</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Shivakumar</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Anuradha</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Reddy</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Raghavan</snm>
                  <fnm>TM</fnm>
               </au>
               <au>
                  <snm>Menon</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hanumanthu</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gupta</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Upendran</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gupta</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mahesh</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jacob</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Mathew</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Chatterjee</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Arun</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Sharma</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chandrika</snm>
                  <fnm>KN</fnm>
               </au>
               <au>
                  <snm>Deshpande</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Palvankar</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Raghavnath</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Krishnakanth</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Karathia</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Rekha</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Nayak</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Vishnupriya</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Kumar</snm>
                  <fnm>HG</fnm>
               </au>
               <au>
                  <snm>Nagini</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kumar</snm>
                  <fnm>GS</fnm>
               </au>
               <au>
                  <snm>Jose</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Deepthi</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Mohan</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Gandhi</snm>
                  <fnm>TK</fnm>
               </au>
               <au>
                  <snm>Harsha</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Deshpande</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Sarker</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Prasad</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Pandey</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>Database issue</issue>
            <fpage>D411</fpage>
            <lpage>D414</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347503</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381900</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj141</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions</p>
            </title>
            <aug>
               <au>
                  <snm>Xenarios</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Salwinski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Duan</snm>
                  <fnm>XJ</fnm>
               </au>
               <au>
                  <snm>Higney</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Eisenberg</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <issue>1</issue>
            <fpage>303</fpage>
            <lpage>305</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99070</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752321</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.303</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>RELIC--a bioinformatics server for combinatorial peptide analysis and identification of protein-ligand interaction sites</p>
            </title>
            <aug>
               <au>
                  <snm>Mandava</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Makowski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Devarapalli</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Uzubell</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Rodi</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Proteomics</source>
            <pubdate>2004</pubdate>
            <volume>4</volume>
            <issue>5</issue>
            <fpage>1439</fpage>
            <lpage>1460</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/pmic.200300680</pubid>
                  <pubid idtype="pmpid" link="fulltext">15188413</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>ASPD (Artificially Selected Proteins/Peptides Database): a database of proteins and peptides evolved in vitro</p>
            </title>
            <aug>
               <au>
                  <snm>Valuev</snm>
                  <fnm>VP</fnm>
               </au>
               <au>
                  <snm>Afonnikov</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Ponomarenko</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Milanesi</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Kolchanov</snm>
                  <fnm>NA</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <issue>1</issue>
            <fpage>200</fpage>
            <lpage>202</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99101</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752292</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.200</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Synthetic antibiotic peptides database</p>
            </title>
            <aug>
               <au>
                  <snm>Wade</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Englund</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Protein Pept Lett</source>
            <pubdate>2002</pubdate>
            <volume>9</volume>
            <issue>1</issue>
            <fpage>53</fpage>
            <lpage>57</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.2174/0929866023408986</pubid>
                  <pubid idtype="pmpid" link="fulltext">12141924</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>APD: the Antimicrobial Peptide Database</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <issue>Database issue</issue>
            <fpage>D590</fpage>
            <lpage>D592</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308759</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681488</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh025</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>IUPAC-IUB Joint Commission on Biochemical Nomenclature (JCBN). Nomenclature and symbolism for amino acids and peptides. Recommendations 1983</p>
            </title>
            <source>Biochem J</source>
            <pubdate>1984</pubdate>
            <volume>219</volume>
            <issue>2</issue>
            <fpage>345</fpage>
            <lpage>373</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1153490</pubid>
                  <pubid idtype="pmpid">6743224</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Amino Acids and Peptides</p>
            </title>
            <aug>
               <au>
                  <snm>Barrett</snm>
                  <fnm>GC</fnm>
               </au>
               <au>
                  <snm>Elmore</snm>
                  <fnm>DT</fnm>
               </au>
            </aug>
            <publisher>Cambridge, Cambridge University Press</publisher>
            <pubdate>1998</pubdate>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Dosimetry in peptide radionuclide receptor therapy: a review</p>
            </title>
            <aug>
               <au>
                  <snm>Cremonesi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ferrari</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bodei</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Tosi</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Paganelli</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>J Nucl Med</source>
            <pubdate>2006</pubdate>
            <volume>47</volume>
            <issue>9</issue>
            <fpage>1467</fpage>
            <lpage>1475</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16954555</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Candidates for peptide receptor radiotherapy today and in the future</p>
            </title>
            <aug>
               <au>
                  <snm>Reubi</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Macke</snm>
                  <fnm>HR</fnm>
               </au>
               <au>
                  <snm>Krenning</snm>
                  <fnm>EP</fnm>
               </au>
            </aug>
            <source>J Nucl Med</source>
            <pubdate>2005</pubdate>
            <volume>46 Suppl 1</volume>
            <fpage>67S</fpage>
            <lpage>75S</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15653654</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>High- versus low-dose oxytocin for augmentation or induction of labor</p>
            </title>
            <aug>
               <au>
                  <snm>Patka</snm>
                  <fnm>JH</fnm>
               </au>
               <au>
                  <snm>Lodolce</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Johnston</snm>
                  <fnm>AK</fnm>
               </au>
            </aug>
            <source>Ann Pharmacother</source>
            <pubdate>2005</pubdate>
            <volume>39</volume>
            <issue>1</issue>
            <fpage>95</fpage>
            <lpage>101</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15572602</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Protamine as an efficient membrane-translocating peptide</p>
            </title>
            <aug>
               <au>
                  <snm>Reynolds</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Weissleder</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Josephson</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Bioconjug Chem</source>
            <pubdate>2005</pubdate>
            <volume>16</volume>
            <issue>5</issue>
            <fpage>1240</fpage>
            <lpage>1245</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bc0501451</pubid>
                  <pubid idtype="pmpid" link="fulltext">16173804</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Detection of vascular adhesion molecule-1 expression using a novel multimodal nanoparticle</p>
            </title>
            <aug>
               <au>
                  <snm>Kelly</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Allport</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Tsourkas</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Shinde-Patil</snm>
                  <fnm>VR</fnm>
               </au>
               <au>
                  <snm>Josephson</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Weissleder</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Circ Res</source>
            <pubdate>2005</pubdate>
            <volume>96</volume>
            <issue>3</issue>
            <fpage>327</fpage>
            <lpage>336</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1161/01.RES.0000155722.17881.dd</pubid>
                  <pubid idtype="pmpid" link="fulltext">15653572</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>A novel method for imaging apoptosis using a caspase-1 near-infrared fluorescent probe</p>
            </title>
            <aug>
               <au>
                  <snm>Messerli</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Prabhakar</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Shah</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Cortes</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Murthy</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Weissleder</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Breakefield</snm>
                  <fnm>XO</fnm>
               </au>
               <au>
                  <snm>Tung</snm>
                  <fnm>CH</fnm>
               </au>
            </aug>
            <source>Neoplasia</source>
            <pubdate>2004</pubdate>
            <volume>6</volume>
            <issue>2</issue>
            <fpage>95</fpage>
            <lpage>105</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1502090</pubid>
                  <pubid idtype="pmpid" link="fulltext">15140398</pubid>
                  <pubid idtype="doi">10.1593/neo.03214</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Peptidomics: the comprehensive analysis of peptides in complex biological mixtures</p>
            </title>
            <aug>
               <au>
                  <snm>Schulz-Knappe</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Zucht</snm>
                  <fnm>HD</fnm>
               </au>
               <au>
                  <snm>Heine</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Jurgens</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hess</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Schrader</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Comb Chem High Throughput Screen</source>
            <pubdate>2001</pubdate>
            <volume>4</volume>
            <issue>2</issue>
            <fpage>207</fpage>
            <lpage>217</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11281836</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>In vivo interrogation of the molecular display of atherosclerotic lesion surfaces</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Bhattacharjee</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Boisvert</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Dilley</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Edgington</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Am J Pathol</source>
            <pubdate>2003</pubdate>
            <volume>163</volume>
            <issue>5</issue>
            <fpage>1859</fpage>
            <lpage>1871</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1892421</pubid>
                  <pubid idtype="pmpid" link="fulltext">14578186</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>The nature of target-unrelated peptides recovered in the screening of phage-displayed random peptide libraries with antibodies</p>
            </title>
            <aug>
               <au>
                  <snm>Menendez</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Scott</snm>
                  <fnm>JK</fnm>
               </au>
            </aug>
            <source>Anal Biochem</source>
            <pubdate>2005</pubdate>
            <volume>336</volume>
            <issue>2</issue>
            <fpage>145</fpage>
            <lpage>157</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1464800</pubid>
                  <pubid idtype="pmpid" link="fulltext">15620878</pubid>
                  <pubid idtype="doi">10.1016/j.ab.2004.09.048</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Basic local alignment search tool</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Gish</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1990</pubdate>
            <volume>215</volume>
            <issue>3</issue>
            <fpage>403</fpage>
            <lpage>410</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2231712</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Schaffer</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <issue>17</issue>
            <fpage>3389</fpage>
            <lpage>3402</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">146917</pubid>
                  <pubid idtype="pmpid" link="fulltext">9254694</pubid>
                  <pubid idtype="doi">10.1093/nar/25.17.3389</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Identification of common molecular subsequences</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>TF</fnm>
               </au>
               <au>
                  <snm>Waterman</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1981</pubdate>
            <volume>147</volume>
            <issue>1</issue>
            <fpage>195</fpage>
            <lpage>197</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(81)90087-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">7265238</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms</p>
            </title>
            <aug>
               <au>
                  <snm>Pearson</snm>
                  <fnm>WR</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>1991</pubdate>
            <volume>11</volume>
            <issue>3</issue>
            <fpage>635</fpage>
            <lpage>650</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0888-7543(91)90071-L</pubid>
                  <pubid idtype="pmpid" link="fulltext">1774068</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>GP</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1985</pubdate>
            <volume>228</volume>
            <issue>4705</issue>
            <fpage>1315</fpage>
            <lpage>1317</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.4001944</pubid>
                  <pubid idtype="pmpid" link="fulltext">4001944</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Phage Display</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>GP</fnm>
               </au>
               <au>
                  <snm>Petrenko</snm>
                  <fnm>VA</fnm>
               </au>
            </aug>
            <source>Chem Rev</source>
            <pubdate>1997</pubdate>
            <volume>97</volume>
            <issue>2</issue>
            <fpage>391</fpage>
            <lpage>410</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/cr960065d</pubid>
                  <pubid idtype="pmpid" link="fulltext">11848876</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Small molecules, big players: the National Cancer Institute's Initiative for Chemical Genetics</p>
            </title>
            <aug>
               <au>
                  <snm>Tolliday</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Clemons</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Ferraiolo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Koehler</snm>
                  <fnm>AN</fnm>
               </au>
               <au>
                  <snm>Lewis</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Schreiber</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Gerhard</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>Eliasof</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Cancer Res</source>
            <pubdate>2006</pubdate>
            <volume>66</volume>
            <issue>18</issue>
            <fpage>8935</fpage>
            <lpage>8942</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1158/0008-5472.CAN-06-2552</pubid>
                  <pubid idtype="pmpid" link="fulltext">16982730</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>The role of structure in antibody cross-reactivity between peptides and folded proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Craig</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Sanschagrin</snm>
                  <fnm>PC</fnm>
               </au>
               <au>
                  <snm>Rozek</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lackie</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kuhn</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Scott</snm>
                  <fnm>JK</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>281</volume>
            <issue>1</issue>
            <fpage>183</fpage>
            <lpage>201</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1998.1907</pubid>
                  <pubid idtype="pmpid" link="fulltext">9680484</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Designing scaffolds of peptides for phage display libraries</p>
            </title>
            <aug>
               <au>
                  <snm>Uchiyama</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Tanaka</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Minari</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Tokui</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>J Biosci Bioeng</source>
            <pubdate>2005</pubdate>
            <volume>99</volume>
            <issue>5</issue>
            <fpage>448</fpage>
            <lpage>456</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1263/jbb.99.448</pubid>
                  <pubid idtype="pmpid">16233816</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>PenBase, the shrimp antimicrobial peptide penaeidin database: sequence-based classification and recommended nomenclature</p>
            </title>
            <aug>
               <au>
                  <snm>Gueguen</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Garnier</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Robert</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Lefranc</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Mougenot</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>de Lorgeril</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Janech</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gross</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Warr</snm>
                  <fnm>GW</fnm>
               </au>
               <au>
                  <snm>Cuthbertson</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Barracco</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Bulet</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Aumelas</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Bo</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Xiang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Tassanakajon</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Piquemal</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Bachere</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Dev Comp Immunol</source>
            <pubdate>2006</pubdate>
            <volume>30</volume>
            <issue>3</issue>
            <fpage>283</fpage>
            <lpage>288</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.dci.2005.04.003</pubid>
                  <pubid idtype="pmpid" link="fulltext">15963564</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Defensins knowledgebase: a manually curated database and information source focused on the defensins family of antimicrobial peptides</p>
            </title>
            <aug>
               <au>
                  <snm>Seebah</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Suresh</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Zhuo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Choong</snm>
                  <fnm>YH</fnm>
               </au>
               <au>
                  <snm>Chua</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Chuon</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Beuerman</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Verma</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <issue>Database issue</issue>
            <fpage>D265</fpage>
            <lpage>D268</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1669742</pubid>
                  <pubid idtype="pmpid" link="fulltext">17090586</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl866</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Scansite 2.0: Proteome-wide prediction of cell signaling interactions using short sequence motifs</p>
            </title>
            <aug>
               <au>
                  <snm>Obenauer</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Cantley</snm>
                  <fnm>LC</fnm>
               </au>
               <au>
                  <snm>Yaffe</snm>
                  <fnm>MB</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <issue>13</issue>
            <fpage>3635</fpage>
            <lpage>3641</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">168990</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824383</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg584</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>SYFPEITHI: database for MHC ligands and peptide motifs</p>
            </title>
            <aug>
               <au>
                  <snm>Rammensee</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bachmann</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Emmerich</snm>
                  <fnm>NP</fnm>
               </au>
               <au>
                  <snm>Bachor</snm>
                  <fnm>OA</fnm>
               </au>
               <au>
                  <snm>Stevanovic</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Immunogenetics</source>
            <pubdate>1999</pubdate>
            <volume>50</volume>
            <issue>3-4</issue>
            <fpage>213</fpage>
            <lpage>219</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s002510050595</pubid>
                  <pubid idtype="pmpid" link="fulltext">10602881</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>EPIMHC: a curated database of MHC-binding peptides for customized computational vaccinology</p>
            </title>
            <aug>
               <au>
                  <snm>Reche</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Glutting</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Reinherz</snm>
                  <fnm>EL</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>9</issue>
            <fpage>2140</fpage>
            <lpage>2141</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti269</pubid>
                  <pubid idtype="pmpid" link="fulltext">15657103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>MHCBN: a comprehensive database of MHC binding and non-binding peptides</p>
            </title>
            <aug>
               <au>
                  <snm>Bhasin</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Singh</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Raghava</snm>
                  <fnm>GP</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>5</issue>
            <fpage>665</fpage>
            <lpage>666</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg055</pubid>
                  <pubid idtype="pmpid" link="fulltext">12651731</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>JenPep: a database of quantitative functional peptide data for immunology</p>
            </title>
            <aug>
               <au>
                  <snm>Blythe</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Doytchinova</snm>
                  <fnm>IA</fnm>
               </au>
               <au>
                  <snm>Flower</snm>
                  <fnm>DR</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <issue>3</issue>
            <fpage>434</fpage>
            <lpage>439</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.3.434</pubid>
                  <pubid idtype="pmpid" link="fulltext">11934742</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>MPID: MHC-Peptide Interaction Database for sequence-structure-function information on peptides binding to MHC molecules</p>
            </title>
            <aug>
               <au>
                  <snm>Govindarajan</snm>
                  <fnm>KR</fnm>
               </au>
               <au>
                  <snm>Kangueane</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Tan</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Ranganathan</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>2</issue>
            <fpage>309</fpage>
            <lpage>310</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/19.2.309</pubid>
                  <pubid idtype="pmpid" link="fulltext">12538264</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Population of the HLA ligand database</p>
            </title>
            <aug>
               <au>
                  <snm>Sathiamurthy</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hickman</snm>
                  <fnm>HD</fnm>
               </au>
               <au>
                  <snm>Cavett</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Zahoor</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Prilliman</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Metcalf</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fernandez Vina</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hildebrand</snm>
                  <fnm>WH</fnm>
               </au>
            </aug>
            <source>Tissue Antigens</source>
            <pubdate>2003</pubdate>
            <volume>61</volume>
            <issue>1</issue>
            <fpage>12</fpage>
            <lpage>19</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1034/j.1399-0039.2003.610102.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">12622773</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>The HUPO PSI's molecular interaction format--a community standard for the representation of protein interaction data</p>
            </title>
            <aug>
               <au>
                  <snm>Hermjakob</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Montecchi-Palazzi</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Bader</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Wojcik</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Salwinski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Ceol</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Moore</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Orchard</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sarkans</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>von Mering</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Roechert</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Poux</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Jung</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Mersch</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kersey</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lappe</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Zeng</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Rana</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Nikolski</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Husi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Brun</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Shanker</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Grant</snm>
                  <fnm>SG</fnm>
               </au>
               <au>
                  <snm>Sander</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Pandey</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Brazma</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jacq</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Vidal</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sherman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Legrain</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Cesareni</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Xenarios</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Eisenberg</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Steipe</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Hogue</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2004</pubdate>
            <volume>22</volume>
            <issue>2</issue>
            <fpage>177</fpage>
            <lpage>183</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt926</pubid>
                  <pubid idtype="pmpid" link="fulltext">14755292</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Ruby on Rails</p>
            </title>
            <url>http://www.rubyonrails.org</url>
         </bibl>
         <bibl id="B45">
            <title>
               <p>MySQL</p>
            </title>
            <url>http://www.mysql.com</url>
         </bibl>
         <bibl id="B46">
            <title>
               <p>The National Center for Biotechnology Information (NCBI) ftp site</p>
            </title>
            <url>ftp://ftp.ncbi.nih.gov/</url>
         </bibl>
         <bibl id="B47">
            <title>
               <p>University of Virginia FASTA server</p>
            </title>
            <url>http://fasta.bioch.virginia.edu/fasta_www2/fasta_list2.shtml</url>
         </bibl>
         <bibl id="B48">
            <title>
               <p>The National Library of Medicine (NLM) ftp site</p>
            </title>
            <url>ftp://ftp.nlm.nih.gov/</url>
         </bibl>
         <bibl id="B49">
            <title>
               <p>CGMIM: automated text-mining of Online Mendelian Inheritance in Man (OMIM) to identify genetically-associated cancers and candidate genes</p>
            </title>
            <aug>
               <au>
                  <snm>Bajdik</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Kuo</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Rusaw</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Brooks-Wilson</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <fpage>78</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1274267</pubid>
                  <pubid idtype="pmpid" link="fulltext">15796777</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-6-78</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Text mining neuroscience journal articles to populate neuroscience databases</p>
            </title>
            <aug>
               <au>
                  <snm>Crasto</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Marenco</snm>
                  <fnm>LN</fnm>
               </au>
               <au>
                  <snm>Migliore</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mao</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Nadkarni</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Shepherd</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Neuroinformatics</source>
            <pubdate>2003</pubdate>
            <volume>1</volume>
            <issue>3</issue>
            <fpage>215</fpage>
            <lpage>237</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1385/NI:1:3:215</pubid>
                  <pubid idtype="pmpid" link="fulltext">15046245</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>PreBIND and Textomy--mining the biomedical literature for protein-protein interactions using a support vector machine</p>
            </title>
            <aug>
               <au>
                  <snm>Donaldson</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>de Bruijn</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Wolting</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lay</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Tuekam</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Baskin</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bader</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Michalickova</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Pawson</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hogue</snm>
                  <fnm>CW</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <fpage>11</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">153503</pubid>
                  <pubid idtype="pmpid" link="fulltext">12689350</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-4-11</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Method for determination of the amino acid sequence in peptides</p>
            </title>
            <aug>
               <au>
                  <snm>Edman</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Acta Chem Scand</source>
            <pubdate>1950</pubdate>
            <volume>4</volume>
            <fpage>283</fpage>
            <lpage>293</lpage>
         </bibl>
         <bibl id="B53">
            <title>
               <p>SH2 domains recognize specific phosphopeptide sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Songyang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Shoelson</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Chaudhuri</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gish</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Pawson</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Haser</snm>
                  <fnm>WG</fnm>
               </au>
               <au>
                  <snm>King</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Roberts</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ratnofsky</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lechleider</snm>
                  <fnm>RJ</fnm>
               </au>
               <etal/>
            </aug>
            <source>Cell</source>
            <pubdate>1993</pubdate>
            <volume>72</volume>
            <issue>5</issue>
            <fpage>767</fpage>
            <lpage>778</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0092-8674(93)90404-E</pubid>
                  <pubid idtype="pmpid">7680959</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>ADAM: another database of abbreviations in MEDLINE</p>
            </title>
            <aug>
               <au>
                  <snm>Zhou</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Torvik</snm>
                  <fnm>VI</fnm>
               </au>
               <au>
                  <snm>Smalheiser</snm>
                  <fnm>NR</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <issue>22</issue>
            <fpage>2813</fpage>
            <lpage>2818</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btl480</pubid>
                  <pubid idtype="pmpid" link="fulltext">16982707</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>Entrez Gene: gene-centered information at NCBI</p>
            </title>
            <aug>
               <au>
                  <snm>Maglott</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ostell</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Pruitt</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Tatusova</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <issue>Database issue</issue>
            <fpage>D26</fpage>
            <lpage>D31</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1761442</pubid>
                  <pubid idtype="pmpid" link="fulltext">17148475</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl993</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>The HUGO Gene Nomenclature Database, 2006 updates</p>
            </title>
            <aug>
               <au>
                  <snm>Eyre</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Ducluzeau</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Sneddon</snm>
                  <fnm>TP</fnm>
               </au>
               <au>
                  <snm>Povey</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bruford</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Lush</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>Database issue</issue>
            <fpage>D319</fpage>
            <lpage>D321</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347509</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381876</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj147</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Markov model recognition and classification of DNA/protein sequences within large text databases</p>
            </title>
            <aug>
               <au>
                  <snm>Wren</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Hildebrand</snm>
                  <fnm>WH</fnm>
               </au>
               <au>
                  <snm>Chandrasekaran</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Melcher</snm>
                  <fnm>U</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>21</issue>
            <fpage>4046</fpage>
            <lpage>4053</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti657</pubid>
                  <pubid idtype="pmpid" link="fulltext">16159926</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Development of vascular endothelial growth factor receptor (VEGFR) kinase inhibitors as anti-angiogenic agents in cancer therapy</p>
            </title>
            <aug>
               <au>
                  <snm>Underiner</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Ruggeri</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Gingrich</snm>
                  <fnm>DE</fnm>
               </au>
            </aug>
            <source>Curr Med Chem</source>
            <pubdate>2004</pubdate>
            <volume>11</volume>
            <issue>6</issue>
            <fpage>731</fpage>
            <lpage>745</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.2174/0929867043455756</pubid>
                  <pubid idtype="pmpid" link="fulltext">15032727</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>Suppression of tumor growth and metastasis by a VEGFR-1 antagonizing peptide identified from a phage display library</p>
            </title>
            <aug>
               <au>
                  <snm>An</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lei</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Song</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Jin</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Meng</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Shou</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Int J Cancer</source>
            <pubdate>2004</pubdate>
            <volume>111</volume>
            <issue>2</issue>
            <fpage>165</fpage>
            <lpage>173</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/ijc.20214</pubid>
                  <pubid idtype="pmpid" link="fulltext">15197767</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>DrugBank: a comprehensive resource for in silico drug discovery and exploration</p>
            </title>
            <aug>
               <au>
                  <snm>Wishart</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>Knox</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Guo</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Shrivastava</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hassanali</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Stothard</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Chang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Woolsey</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>Database issue</issue>
            <fpage>D668</fpage>
            <lpage>D672</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347430</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381955</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj067</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>In vivo imaging of molecularly targeted phage</p>
            </title>
            <aug>
               <au>
                  <snm>Kelly</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Waterman</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Weissleder</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Neoplasia</source>
            <pubdate>2006</pubdate>
            <volume>8</volume>
            <issue>12</issue>
            <fpage>1011</fpage>
            <lpage>1018</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1783712</pubid>
                  <pubid idtype="pmpid" link="fulltext">17217618</pubid>
                  <pubid idtype="doi">10.1593/neo.06610</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>Development of efficient viral vectors selective for vascular smooth muscle cells</p>
            </title>
            <aug>
               <au>
                  <snm>Work</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Nicklin</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Brain</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Dishart</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Von Seggern</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Hallek</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Buning</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>AH</fnm>
               </au>
            </aug>
            <source>Mol Ther</source>
            <pubdate>2004</pubdate>
            <volume>9</volume>
            <issue>2</issue>
            <fpage>198</fpage>
            <lpage>208</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.ymthe.2003.11.006</pubid>
                  <pubid idtype="pmpid" link="fulltext">14759804</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>Design and assay of inhibitors of HIV-1 Vpr cell killing and growth arrest activity using microbial assay systems</p>
            </title>
            <aug>
               <au>
                  <snm>Sankovich</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Koleski</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Baell</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Matthews</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Azad</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Macreadie</snm>
                  <fnm>IG</fnm>
               </au>
            </aug>
            <source>J Biomol Screen</source>
            <pubdate>1998</pubdate>
            <volume>3</volume>
            <issue>4</issue>
            <fpage>299</fpage>
            <lpage>304</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1177/108705719800300409</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B64">
            <title>
               <p>HMGB1 interacts with many apparently unrelated proteins by recognizing short amino acid sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Dintilhac</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bernues</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2002</pubdate>
            <volume>277</volume>
            <issue>9</issue>
            <fpage>7021</fpage>
            <lpage>7028</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M108417200</pubid>
                  <pubid idtype="pmpid" link="fulltext">11748221</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B65">
            <title>
               <p>Methods and compositions for targeting compounds to muscle. United States Patent 6399575</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>BF</fnm>
               </au>
               <au>
                  <snm>Samoilova</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <pubdate>2001</pubdate>
         </bibl>
         <bibl id="B66">
            <title>
               <p>Methods and compositions for targeting compounds to the central nervous system. United States Patent 6399575</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>BF</fnm>
               </au>
               <au>
                  <snm>Samoilova</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>HJ</fnm>
               </au>
            </aug>
            <pubdate>2002</pubdate>
         </bibl>
         <bibl id="B67">
            <title>
               <p>The National Center for Biotechnology Information (NCBI) BLAST server</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/BLAST/</url>
         </bibl>
         <bibl id="B68">
            <title>
               <p>A DNA-binding peptide from a phage display library</p>
            </title>
            <aug>
               <au>
                  <snm>Wolcke</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Weinhold</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nucleosides Nucleotides Nucleic Acids</source>
            <pubdate>2001</pubdate>
            <volume>20</volume>
            <issue>4-7</issue>
            <fpage>1239</fpage>
            <lpage>1241</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1081/NCN-100002526</pubid>
                  <pubid idtype="pmpid">11562993</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B69">
            <title>
               <p>Two human neonatal IgM antibodies encoded by different variable-region genes bind the same linear peptide: evidence for a stereotyped repertoire of epitope recognition</p>
            </title>
            <aug>
               <au>
                  <snm>Messmer</snm>
                  <fnm>BT</fnm>
               </au>
               <au>
                  <snm>Sullivan</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Chiorazzi</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Rodman</snm>
                  <fnm>TC</fnm>
               </au>
               <au>
                  <snm>Thaler</snm>
                  <fnm>DS</fnm>
               </au>
            </aug>
            <source>J Immunol</source>
            <pubdate>1999</pubdate>
            <volume>162</volume>
            <issue>4</issue>
            <fpage>2184</fpage>
            <lpage>2192</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9973494</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B70">
            <title>
               <p>Selection of D-amino-acid peptides that bind to Alzheimer's disease amyloid peptide abeta1-42 by mirror image phage display</p>
            </title>
            <aug>
               <au>
                  <snm>Wiesehan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Buder</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Linke</snm>
                  <fnm>RP</fnm>
               </au>
               <au>
                  <snm>Patt</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Stoldt</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Unger</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Schmitt</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bucci</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Willbold</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Chembiochem</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <issue>8</issue>
            <fpage>748</fpage>
            <lpage>753</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/cbic.200300631</pubid>
                  <pubid idtype="pmpid" link="fulltext">12898626</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B71">
            <title>
               <p>Control of crop pests &amp; animal parasites through direct neuronal uptake. United States Patent 20030181376</p>
            </title>
            <aug>
               <au>
                  <snm>Atkinson</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>McPherson</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Winter</snm>
                  <fnm>MD</fnm>
               </au>
            </aug>
            <pubdate>2003</pubdate>
         </bibl>
         <bibl id="B72">
            <title>
               <p>Identification of peptides that facilitate uptake and cytoplasmic and/or nuclear transport of proteins, DNA and viruses. United States Patent 20030219826</p>
            </title>
            <aug>
               <au>
                  <snm>Robbins</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Mi</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Frizzell</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Glorioso</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Gambotto</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mai</snm>
                  <fnm>JC</fnm>
               </au>
            </aug>
            <pubdate>2003</pubdate>
         </bibl>
         <bibl id="B73">
            <title>
               <p>A novel method to identify and characterise peptide mimotopes of heat shock protein 70-associated antigens</p>
            </title>
            <aug>
               <au>
                  <snm>Arnaiz</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Madrigal-Estebas</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Todryk</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>James</snm>
                  <fnm>TC</fnm>
               </au>
               <au>
                  <snm>Doherty</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Bond</snm>
                  <fnm>U</fnm>
               </au>
            </aug>
            <source>J Immune Based Ther Vaccines</source>
            <pubdate>2006</pubdate>
            <volume>4</volume>
            <fpage>2</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1482705</pubid>
                  <pubid idtype="pmpid" link="fulltext">16603084</pubid>
                  <pubid idtype="doi">10.1186/1476-8518-4-2</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B74">
            <title>
               <p>Selection by phage display of peptides targeting the HIV-1 TAR element</p>
            </title>
            <aug>
               <au>
                  <snm>Kolb</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Boiziau</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>RNA Biol</source>
            <pubdate>2005</pubdate>
            <volume>2</volume>
            <issue>1</issue>
            <fpage>28</fpage>
            <lpage>33</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17132933</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
