<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-9-S7-P21</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Poster presentation</dochead>
      <bibl>
         <title>
            <p>Proteome discovery pipeline for mass spectrometry-based proteomics</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Gough</snm>
               <fnm>Erik</fnm>
               <insr iid="I1"/>
               <email>goughes@purdue.edu</email>
            </au>
            <au id="A2">
               <snm>Oh</snm>
               <fnm>Cheolhwan</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A3">
               <snm>He</snm>
               <fnm>Jing</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A4">
               <snm>Riley</snm>
               <mi>P</mi>
               <fnm>Catherine</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A5">
               <snm>Buck</snm>
               <mi>R</mi>
               <fnm>Charles</fnm>
               <insr iid="I1"/>
            </au>
            <au id="A6">
               <snm>Zhang</snm>
               <fnm>Xiang</fnm>
               <insr iid="I2"/>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Bindley Bioscience Center, Purdue University, West Lafayette, IN 47907, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Chemistry, Center for Regulatory and Environment Analytical Metabolomics, University of Louisville, Louisville, KY 40292, USA</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <supplement>
            <title>
               <p>UT-ORNL-KBRIN Bioinformatics Summit 2008</p>
            </title>
            <editor>Eric C Rouchka and Julia Krushkal</editor>
            <note>Meeting abstracts &#8211; A single PDF containing all abstracts in this Supplement is available <a href="http://www.biomedcentral.com/content/pdf/1471-2105-9-S7-full.pdf">here</a>.</note>
         </supplement>
         <conference>
            <title>
               <p>UT-ORNL-KBRIN Bioinformatics Summit 2008</p>
            </title>
            <location>Cadiz, KY, USA</location>
            <date-range>28&#8211;30 March 2008</date-range>
            <url>http://www.kbrin.louisville.edu/summit/</url>
         </conference>
         <issn>1471-2105</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>Suppl 7</issue>
         <fpage>P21</fpage>
         <url>http://www.biomedcentral.com/1471-2105/9/S7/P21</url>
         <xrefbib>
            <pubid idtype="doi">10.1186/1471-2105-9-S7-P21</pubid>
         </xrefbib>
      </bibl>
      <history>
         <pub>
            <date>
               <day>8</day>
               <month>7</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Gough et al; licensee BioMed Central Ltd.</collab>
      </cpyrt>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Overview</p>
         </st>
         <p>We have developed the Proteome Discovery Pipeline, a stand-alone bioinformatics platform for LC/MS data analysis and biomarker discovery. Data are processed in a series of self-contained analytical steps by modules that are controlled through a graphical user interface (GUI). The GUI was developed in Visual C++ 6.0 and is multi-threaded and tabbed, with each tab representing one step in the analysis process. The modules cover spectrum deconvolution, alignment, normalization, significance tests and pattern recognition. They consist of applications developed in C++ and the R scripting language, which the GUI calls as external processes with user-supplied parameters. Molecular correlation analysis can be viewed interactively using SysNet. Figure <figr fid="F1">1</figr> shows the architecture of the Proteome Discovery Pipeline.</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Architecture of the proteome discovery pipeline</p>
            </caption>
            <text>
               <p>Architecture of the proteome discovery pipeline.</p>
            </text>
            <graphic file="1471-2105-9-S7-P21-1"/>
         </fig>
      </sec>
      <sec>
         <st>
            <p>Spectrum deconvolution</p>
         </st>
         <p>XMass <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> uses chemical noise filtering, charge state fitting and de-isotoping to improve the analysis of complex peptide samples. Overlapping peptide signals in mass spectra are deconvoluted by correlating them with modeled peptide isotopic peak profiles. These profiles are generated <it>in silico</it> from a protein database to produce reference model distributions.</p>
      </sec>
      <sec>
         <st>
            <p>Peak alignment</p>
         </st>
         <p>XAlign <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> is a two-step alignment algorithm. The first step detects significant peaks that are common to all samples. In the second step, all samples are aligned to the median sample using refined m/z and retention time variation values, with pattern recognition applied as needed.</p>
      </sec>
      <sec>
         <st>
            <p>Normalization</p>
         </st>
         <p>Several normalization methods have been developed for proteomics, including auto-scaling, reference sample, log linear model, trimmed constant mean, and average intensity.</p>
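         <p>As an illustration of one of these methods, the following is a minimal sketch of auto-scaling, assuming its usual definition: each peak is mean-centered across samples and divided by its standard deviation. It is written in Python rather than the C++ and R used by the pipeline modules, and the function name, the peak table and the handling of constant peaks are assumptions made for the example, not part of the pipeline.</p>
         <p>
```python
from statistics import mean, stdev


def auto_scale(intensities):
    """Auto-scale one peak across samples: subtract the mean, divide by the standard deviation."""
    centre = mean(intensities)
    spread = stdev(intensities)  # sample standard deviation (n - 1 denominator)
    if spread == 0:
        # Assumption: a constant peak carries no variation, so return zeros
        # instead of dividing by zero.
        return [0.0 for _ in intensities]
    return [(value - centre) / spread for value in intensities]


# Hypothetical peak table: each row is one aligned peak, each column one sample.
peak_table = [
    [120.0, 150.0, 180.0],
    [5.0, 5.0, 5.0],
]
scaled_table = [auto_scale(row) for row in peak_table]
```
         </p>
         <p>After scaling, every non-constant peak has zero mean and unit standard deviation, so abundant and scarce species contribute on the same scale to later steps.</p>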
      </sec>
      <sec>
         <st>
            <p>Statistical significance tests</p>
         </st>
         <p>Several statistical tests (two-tailed t-test, one-way ANOVA, Kolmogorov-Smirnov test and Mann-Whitney test) can be used to identify data elements that make large contributions to the protein profile of a sample or that distinguish groups of samples from others.</p>
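         <p>To make one of these tests concrete, the sketch below computes the Mann-Whitney U statistic for a single peak measured in two sample groups. It is a hedged illustration in Python, not the pipeline's own implementation: the function name and the intensity values are hypothetical, and the conversion of U into a p-value is left out.</p>
         <p>
```python
def mann_whitney_u(group_a, group_b):
    """Return the U statistic for group_a: each pair with a > b counts 1, each tie counts 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical intensities of one aligned peak in two sample groups.
control = [10.2, 11.5, 9.8, 10.9]
treated = [12.1, 13.4, 11.5, 12.8]

u_treated = mann_whitney_u(treated, control)
u_control = mann_whitney_u(control, treated)
```
         </p>
         <p>The two statistics always sum to the number of pairs (here 4 x 4 = 16). A value of U close to 0 or to the number of pairs indicates that the peak separates the two groups well.</p>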
      </sec>
      <sec>
         <st>
            <p>Pattern recognition</p>
         </st>
         <p>We have implemented principal component analysis (PCA), linear discriminant analysis (LDA), canonical discriminant analysis (CDA), and clustering objects on subsets of attributes (COSA) <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> as clustering methods.</p>
      </sec>
      <sec>
         <st>
            <p>Molecular correlation</p>
         </st>
         <p>The software package SysNet <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> provides a dynamic visualization environment for molecular correlation of 'omics data. SysNet visualizes the 'omics expression data as a two-dimensional network. It uses a circular layout in which molecular species are represented as nodes placed on circles. The intermolecular correlations are represented as links, or edges, between nodes.</p>
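         <p>The idea of a correlation network drawn on a circle can be sketched as follows. This is not SysNet code: it is a simplified Python illustration that places all nodes on a single circle, uses the Pearson correlation as the measure of intermolecular correlation, and keeps an edge when the absolute correlation reaches a threshold. The function names, the expression profiles and the threshold of 0.9 are all assumptions made for the example.</p>
         <p>
```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes neither is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def circular_layout(names, radius=1.0):
    """Place the nodes at equal angular steps on one circle; returns name -> (x, y)."""
    step = 2 * math.pi / len(names)
    return {
        name: (radius * math.cos(i * step), radius * math.sin(i * step))
        for i, name in enumerate(names)
    }


def correlation_edges(profiles, threshold=0.9):
    """Return (node, node, r) for every pair whose absolute correlation reaches the threshold."""
    names = sorted(profiles)
    edges = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(profiles[first], profiles[second])
            if abs(r) >= threshold:
                edges.append((first, second, r))
    return edges


# Hypothetical expression profiles of four molecular species across four samples.
profiles = {
    "mol_A": [1.0, 2.0, 3.0, 4.0],
    "mol_B": [2.0, 4.0, 6.0, 8.0],
    "mol_C": [4.0, 3.0, 2.0, 1.0],
    "mol_D": [1.0, 3.0, 2.0, 2.0],
}
positions = circular_layout(sorted(profiles))
edges = correlation_edges(profiles)
```
         </p>
         <p>In this toy data set, mol_A, mol_B and mol_C are perfectly correlated or anti-correlated and are therefore linked to one another, while mol_D remains an isolated node on the circle.</p>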
      </sec>
   </bdy>
   <bm>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Data preprocessing in liquid chromatography mass spectrometry based proteomics</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Asara</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Adamec</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ouzzani</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Elmagarmid</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>4054</fpage>
            <lpage>4059</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti660</pubid>
                  <pubid idtype="pmpid" link="fulltext">16150809</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>An automated method for the analysis of stable isotope labeling data for proteomics</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Hines</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Adamec</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Asara</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Naylor</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Regnier</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>J Am Soc Mass Spectrom</source>
            <pubdate>2005</pubdate>
            <volume>16</volume>
            <fpage>1181</fpage>
            <lpage>1191</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.jasms.2005.03.016</pubid>
                  <pubid idtype="pmpid" link="fulltext">15922621</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Clustering objects on subsets of attributes</p>
            </title>
            <aug>
               <au>
                  <snm>Friedman</snm>
                  <fnm>JH</fnm>
               </au>
               <au>
                  <snm>Meulman</snm>
                  <fnm>JJ</fnm>
               </au>
            </aug>
            <source>J R Statist Soc B</source>
            <pubdate>2004</pubdate>
            <volume>66</volume>
            <issue>Part 4</issue>
            <fpage>1</fpage>
            <lpage>25</lpage>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Interactive analysis of 'omics molecular expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ouyang</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Stephenson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kane</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Salt</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Prabhakar</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Burger</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Buck</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>X</fnm>
               </au>
            </aug>
            <source>BMC Systems Biology</source>
            <pubdate>2008</pubdate>
            <volume>2</volume>
            <fpage>23</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2294108</pubid>
                  <pubid idtype="pmpid" link="fulltext">18312669</pubid>
                  <pubid idtype="doi">10.1186/1752-0509-2-23</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>

