Proteome discovery pipeline for mass spectrometry-based proteomics

Gough, Erik; Oh, Cheolhwan; He, Jing; Riley, Catherine P; Buck, Charles R; Zhang, Xiang

doi:10.1186/1471-2105-9-S7-P21

Volume 9 Supplement 7

UT-ORNL-KBRIN Bioinformatics Summit 2008

Poster presentation
Open access
Published: 08 July 2008

BMC Bioinformatics volume 9, Article number: P21 (2008)

Overview

We have developed the Proteome Discovery Pipeline, a stand-alone bioinformatics platform for LC/MS data analysis and biomarker discovery. Data are processed in a series of self-contained analytical steps, each carried out by a module controlled from a graphical user interface (GUI). The GUI was developed in Visual C++ 6.0 and is multi-threaded and tabbed, with each tab representing one step in the analysis process. The included modules are spectrum deconvolution, alignment, normalization, significance tests and pattern recognition. The modules are applications developed in C++ and the R scripting language, which the GUI calls as external processes with user-supplied parameters. Molecular correlation analysis can be viewed interactively using SysNet. Figure 1 shows the architecture of the Proteome Discovery Pipeline.
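
The pattern of calling each module as an external process with user-supplied parameters can be sketched as follows. This is a minimal illustration in Python, not the Visual C++ code of the pipeline itself; the function names, the `--name value` argument convention, and any executable names passed in are assumptions made for the example.

```python
import subprocess

def build_command(base_command, params):
    """Append user-supplied parameters to a base command as --name value pairs.

    base_command is a list such as ["some_module.exe"] (hypothetical name).
    """
    command = list(base_command)
    for name, value in params.items():
        command.extend([f"--{name}", str(value)])
    return command

def run_pipeline(steps):
    """Run each (step_name, base_command, params) as an external process.

    Steps run in order; the first non-zero exit code stops the pipeline.
    Returns the names of the steps that completed.
    """
    completed = []
    for step_name, base_command, params in steps:
        result = subprocess.run(
            build_command(base_command, params),
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            raise RuntimeError(f"{step_name} failed: {result.stderr.strip()}")
        completed.append(step_name)
    return completed
```

Keeping each step behind a process boundary, as the overview describes, means a module can be written in C++ or R without the caller needing to know which.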

Spectrum deconvolution

XMass [1] uses chemical noise filtering, charge state fitting and de-isotoping to improve the analysis of complex peptide samples. Overlapping peptide signals in mass spectra are deconvoluted by correlation with modeled peptide isotopic peak profiles. These reference profiles are generated in silico from a protein database.
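
The idea of matching an observed isotopic cluster against modeled profiles by correlation can be illustrated with a small sketch. It is not the XMass implementation: the Pearson correlation is assumed as the similarity measure, and the profile labels and intensity values are made up for the example rather than derived from a protein database.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length intensity lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no shape information
    return cov / math.sqrt(var_x * var_y)

def best_matching_profile(observed, model_profiles):
    """Return (label, correlation) for the model profile closest in shape."""
    scored = ((label, pearson(observed, profile))
              for label, profile in model_profiles.items())
    return max(scored, key=lambda item: item[1])

# Hypothetical relative intensities of four consecutive isotopic peaks.
observed = [100.0, 62.0, 23.0, 6.0]
model_profiles = {
    "profile_a": [1.00, 0.55, 0.18, 0.04],  # illustrative values only
    "profile_b": [0.55, 1.00, 0.95, 0.60],  # illustrative values only
}
label, score = best_matching_profile(observed, model_profiles)
```

Because the correlation depends only on shape, the observed intensities do not need to be rescaled to the model before comparison.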

Peak alignment

XAlign [2] is a two-step alignment algorithm. The first step detects significant peaks that are common to all samples. In the second step, all samples are aligned to the median sample using refined m/z and retention time variation values, with pattern recognition applied as needed.
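
A heavily simplified sketch of the two-step idea is given below. It is not XAlign: peaks are assumed to be `(m/z, retention time)` pairs, common peaks are found with a fixed m/z tolerance, the "median sample" is approximated by the per-peak median retention time, and each sample is corrected by a single constant shift. All function names are hypothetical.

```python
from statistics import median

def nearest_peak(peaks, mz, mz_tol):
    """Return the peak closest to mz within mz_tol, or None if there is none."""
    candidates = [p for p in peaks if abs(p[0] - mz) <= mz_tol]
    if not candidates:
        return None
    return min(candidates, key=lambda p: abs(p[0] - mz))

def common_peaks(samples, mz_tol):
    """Step 1: find peaks of the first sample that occur in every sample."""
    groups = []
    for mz, _ in samples[0]:
        group = [nearest_peak(sample, mz, mz_tol) for sample in samples]
        if all(p is not None for p in group):
            groups.append(group)  # one matched peak per sample
    return groups

def align_retention_times(samples, mz_tol):
    """Step 2: shift each sample toward the median retention times."""
    groups = common_peaks(samples, mz_tol)
    if not groups:
        return [list(sample) for sample in samples]
    aligned = []
    for i, sample in enumerate(samples):
        # Median deviation of this sample from the per-peak median time.
        offset = median(g[i][1] - median(p[1] for p in g) for g in groups)
        aligned.append([(mz, rt - offset) for mz, rt in sample])
    return aligned
```

A constant shift per sample is the simplest possible correction; retention time drift that varies along the gradient would need a time-dependent model.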

Normalization

Several normalization methods have been developed for proteomics, including auto-scaling, reference sample, log linear model, trimmed constant mean, and average intensity.
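
Two of the listed methods can be sketched briefly under their commonly used definitions, which are assumed here and may differ in detail from the pipeline's modules: auto-scaling centers a peak's intensities across samples and divides by their standard deviation, and average intensity normalization rescales each sample to a shared mean intensity.

```python
from statistics import mean, stdev

def autoscale(values):
    """Auto-scale one peak across samples: zero mean, unit sample std dev."""
    mu = mean(values)
    sigma = stdev(values)  # requires at least two values
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def average_intensity_normalize(samples):
    """Rescale each sample so its mean intensity equals the overall mean.

    samples is a list of intensity lists, one list per sample.
    """
    sample_means = [mean(s) for s in samples]
    target = mean(sample_means)
    return [[v * target / m for v in s]
            for s, m in zip(samples, sample_means)]
```

Auto-scaling acts on one variable across samples, whereas average intensity normalization acts on one sample across variables, so the two address different sources of variation.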

Statistical significance tests

Several test methods (two-tailed t-test, one-way ANOVA, the Kolmogorov-Smirnov test and the Mann-Whitney test) can be used to identify data elements that contribute strongly to the protein profile of a sample or that distinguish groups of samples from one another.
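
As an illustration of how a per-peak test separates sample groups, the sketch below computes a two-sample t statistic for each peak and ranks peaks by its magnitude. Welch's unequal-variance form is assumed here, which may differ from the variant used in the pipeline, and the peak identifiers are hypothetical. Only the statistic is computed; a p-value additionally requires the t distribution, available for example through `scipy.stats.ttest_ind`.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for two groups of intensities (>= 2 values each)."""
    diff = mean(group_a) - mean(group_b)
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    if se == 0:
        # No within-group spread: identical means give 0, otherwise infinity.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se

def rank_peaks(peaks_a, peaks_b):
    """Order peak ids by |t|, largest group difference first.

    peaks_a and peaks_b map a peak id to its intensities in each group.
    """
    scores = {pid: abs(welch_t(peaks_a[pid], peaks_b[pid])) for pid in peaks_a}
    return sorted(scores, key=scores.get, reverse=True)
```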

Pattern recognition

We have implemented principal component analysis (PCA), linear discriminant analysis (LDA), canonical discriminant analysis (CDA), and clustering objects on subsets of attributes (COSA) [3] as pattern recognition methods.
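
To make the first of these methods concrete, the following sketch extracts the leading principal component of a small data matrix by power iteration on the covariance matrix. It uses only the Python standard library and is an assumption-laden illustration, not the pipeline's R or C++ code; a production analysis would normally use an established linear algebra routine.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the leading principal component.

    rows is a list of samples, each a list of d numeric features.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration; the start vector must not be orthogonal to the
    # leading eigenvector, which this arbitrary choice cannot guarantee.
    vec = [float(j + 1) for j in range(d)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break  # no variance left along the current direction
        vec = [x / norm for x in nxt]
    scores = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
    return vec, scores
```

The returned scores are the projections of the centered samples onto the component, which is what a PCA scores plot displays along its first axis.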

Molecular correlation

The SysNet software package [4] provides a dynamic visualization environment for molecular correlation of 'omics data. SysNet visualizes the 'omics expression data as a two-dimensional network with a circular layout: molecular species are represented as nodes placed on circles, and intermolecular correlations are represented as links, or edges, between nodes.
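
The network representation described above can be sketched as a set of node coordinates on a circle plus a list of correlation edges. This is not SysNet code: a single circle, the Pearson correlation, and an absolute-value threshold for drawing an edge are all assumptions of the example, as are the function names.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def circular_layout(node_names, radius=1.0):
    """Place nodes at equal angular spacing on a circle of the given radius."""
    n = len(node_names)
    return {
        name: (radius * math.cos(2 * math.pi * i / n),
               radius * math.sin(2 * math.pi * i / n))
        for i, name in enumerate(node_names)
    }

def correlation_edges(expression, threshold):
    """Return (node_a, node_b, r) for every pair with |r| >= threshold.

    expression maps a molecular species name to its values across samples.
    """
    edges = []
    for a, b in combinations(expression, 2):
        r = pearson(expression[a], expression[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges
```

Separating the layout from the edge list allows the correlation threshold to be changed interactively without moving any nodes.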

References

1. Zhang X, Asara J, Adamec J, Ouzzani M, Elmagarmid A: Data preprocessing in liquid chromatography mass spectrometry based proteomics. Bioinformatics 2005, 21: 4054–4059. 10.1093/bioinformatics/bti660
2. Zhang X, Hines W, Adamec J, Asara J, Naylor S, Regnier F: An automated method for the analysis of stable isotope labeling data for proteomics. J Am Soc Mass Spectrom 2005, 16: 1181–1191. 10.1016/j.jasms.2005.03.016
3. Friedman JH, Meulman JJ: Clustering objects on subsets of attributes. J R Statist Soc B 2004, 66(Part 4):1–25.
4. Zhang M, Ouyang Q, Stephenson A, Kane MD, Salt DE, Prabhakar S, Burger J, Buck C, Zhang X: Interactive analysis of 'omics molecular expression data. BMC Systems Biology 2008, 2: 23. 10.1186/1752-0509-2-23

Author information

Authors and Affiliations

Bindley Bioscience Center, Purdue University, West Lafayette, IN, 47907, USA
Erik Gough, Cheolhwan Oh, Jing He, Catherine P Riley & Charles R Buck
Department of Chemistry, Center for Regulatory and Environment Analytical Metabolomics, University of Louisville, Louisville, KY, 40292, USA
Xiang Zhang

Corresponding author

Correspondence to Erik Gough.

Rights and permissions

Open Access. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Gough, E., Oh, C., He, J. et al. Proteome discovery pipeline for mass spectrometry-based proteomics. BMC Bioinformatics 9 (Suppl 7), P21 (2008). https://doi.org/10.1186/1471-2105-9-S7-P21
