Open Access Research article

A robust linear regression based algorithm for automated evaluation of peptide identifications from shotgun proteomics by use of reversed-phase liquid chromatography retention time

Hua Xu1, Lanhao Yang2 and Michael A Freitas1

1Department of Molecular Virology, Immunology and Medical Genetics, Comprehensive Cancer Center, the Ohio State University Medical Center, Columbus, 43210, OH, USA

2Department of Chemistry, the Ohio State University, Columbus, 43210, OH, USA


BMC Bioinformatics 2008, 9:347 doi:10.1186/1471-2105-9-347

Published: 19 August 2008

Abstract

Background

Rejection of false positive peptide matches in database searches of shotgun proteomic experimental data is highly desirable. Several methods have been developed that use peptide retention time to refine and improve peptide identifications from database search algorithms. This report describes the implementation of an automated approach to reduce false positives and validate peptide matches.

Results

A robust linear regression based algorithm was developed to automate the evaluation of peptide identifications obtained from shotgun proteomic experiments. The algorithm scores peptides based on their predicted and observed reversed-phase liquid chromatography retention times. The robust algorithm does not require internal or external peptide standards to train or calibrate the linear regression model used for peptide retention time prediction. The algorithm is generic and can be incorporated into any database search program to perform automated evaluation of the candidate peptide matches based on their retention times. It provides a statistical score for each peptide match based on its retention time.
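
The general idea can be illustrated with a minimal sketch. The abstract does not specify the retention time predictor, the weighting scheme, or the score formula, so everything below is an assumption made for illustration: a single hypothetical per-peptide hydrophobicity index as the predictor, iteratively reweighted least squares with Tukey bisquare weights as the robust fit, and a two-sided normal tail probability of the scaled residual as the score. The function names and the toy data are likewise hypothetical and are not taken from the paper.

```python
import math
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def robust_line_fit(x, y, c=4.685, iterations=20):
    """Assumed robust fit: IRLS with Tukey bisquare weights.

    Returns (a, b, scale), where scale is a MAD-based estimate of the
    residual spread. No peptide standards are used: the candidate
    matches themselves train the model, and outliers are down-weighted.
    """
    w = [1.0] * len(x)
    for _ in range(iterations):
        a, b = weighted_line_fit(x, y, w)
        residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Median absolute residual, rescaled to match a normal std. dev.
        scale = 1.4826 * median(abs(r) for r in residuals) or 1e-9
        cutoff = c * scale
        # Points beyond the cutoff get zero weight in the next fit.
        w = [(1 - (r / cutoff) ** 2) ** 2 if abs(r) < cutoff else 0.0
             for r in residuals]
    return a, b, scale


def rt_score(x, observed_rt, a, b, scale):
    """Assumed score: two-sided normal tail probability of the residual."""
    z = (observed_rt - (a + b * x)) / scale
    return math.erfc(abs(z) / math.sqrt(2))


# Toy data (hypothetical): ten matches that follow rt = 5 + 2 * index
# with small deviations, then two matches with inconsistent retention times.
index = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 3, 8]
rts = [7.1, 8.9, 11.1, 12.9, 15.1, 16.9, 19.1, 20.9, 23.1, 24.9, 40.0, 2.0]

a, b, scale = robust_line_fit(index, rts)
scores = [rt_score(xi, yi, a, b, scale) for xi, yi in zip(index, rts)]

for xi, yi, s in zip(index, rts, scores):
    print(f"index={xi:2d}  rt={yi:5.1f}  score={s:.3g}")
```

In this sketch the two inconsistent matches receive scores near zero while the fitted line stays close to the trend of the consistent ones, which is the behaviour a robust fit is meant to provide when no standards are available for calibration.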

Conclusion

Analysis of peptide matches where the retention time score was included resulted in a significant reduction of false positive matches with little effect on the number of true positives. Overall higher sensitivities and specificities were achieved for database searches carried out with MassMatrix, Mascot and X!Tandem after implementation of the retention time based score algorithm.

