This article is part of the supplement: UT-ORNL-KBRIN Bioinformatics Summit 2010

Open Access Poster presentation

Development of tools for the automated analysis of spectra generated by tandem mass spectrometry

Sally Ellingson1, Joe Hughes2, Dylan Storey1*, Rick Weber3 and Nathan VerBerkmoes4

Author Affiliations

1 Genome Sciences and Technology, University of Tennessee Knoxville, Knoxville, TN 37996, USA

2 Department of Ecology and Evolutionary Biology, University of Tennessee Knoxville, Knoxville, TN 37996, USA

3 Department of Electrical Engineering and Computer Science, University of Tennessee Knoxville, Knoxville, TN 37996, USA

4 Chemical Sciences Division, Oak Ridge National Lab, Oak Ridge TN 37831, USA

BMC Bioinformatics 2010, 11(Suppl 4):P27  doi:10.1186/1471-2105-11-S4-P27

Published: 23 July 2010

© 2010 Storey et al; licensee BioMed Central Ltd.


While multiple tools exist for the analysis and identification of spectra generated in shotgun proteomics experiments, few easily implemented tools allow for the automated analysis of spectral quality. Knowing the quality of the spectra from an experiment can help a researcher determine possible reasons for the misidentification, or lack of identification, of spectra in a sample.

Materials and methods

We are developing an automated, high-throughput method that analyses spectra from 2D-LC-MS/MS datasets to determine the quality of each spectrum and of the run as a whole. We will then compare our program to existing programs that perform a similar function. Our program calculates a quality score based on the following metrics: signal-to-noise ratio, absolute signal intensity, number of peaks, predicted mass distances between peaks, and percentage of incoming mass accounted for by peaks. These scores are then graphed against the outputs of common database search algorithms in order to display the following four categories: High-quality/Identified, High-quality/Unidentified, Low-quality/Identified, and Low-quality/Unidentified. We are currently testing the algorithm against 2D-LC-MS/MS runs of a mixed protein standard and of blanks containing no peptide spectra. The application samples are a time series of metaproteomes collected from environmental ground waters after biostimulation.
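As a rough illustration of the scoring and categorisation described above, the following Python sketch computes three of the listed metrics for a single spectrum and assigns one of the four categories. It is not the authors' implementation: the peak representation, the use of the median intensity as a noise estimate, the function names, and the 0.5 quality threshold are all assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Tuple

# Hypothetical peak representation: (m/z, intensity).
Peak = Tuple[float, float]


@dataclass
class SpectrumMetrics:
    signal_to_noise: float
    total_intensity: float
    peak_count: int


def compute_metrics(peaks: List[Peak]) -> SpectrumMetrics:
    """Compute three simple quality metrics for one MS/MS spectrum."""
    intensities = [intensity for _, intensity in peaks]
    if not intensities:
        return SpectrumMetrics(0.0, 0.0, 0)
    # Assumption: the median peak intensity approximates the noise level.
    noise = median(intensities)
    snr = max(intensities) / noise if noise > 0 else 0.0
    return SpectrumMetrics(snr, sum(intensities), len(intensities))


def categorize(quality_score: float, identified: bool,
               quality_threshold: float = 0.5) -> str:
    """Place a spectrum in one of the four quality/identification categories.

    quality_score is assumed to be scaled to [0, 1]; the threshold is an
    illustrative value, not one taken from the abstract.
    """
    quality = "High-quality" if quality_score >= quality_threshold else "Low-quality"
    status = "Identified" if identified else "Unidentified"
    return f"{quality}/{status}"


if __name__ == "__main__":
    spectrum = [(100.0, 1.0), (200.0, 2.0), (300.0, 10.0)]
    print(compute_metrics(spectrum))
    print(categorize(0.8, identified=False))
```

In this sketch, a spectrum that scores well but receives no database match falls into High-quality/Unidentified, the category most likely to merit follow-up. How the individual metrics are combined into a single score is left open here, since the abstract does not specify it.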