PAnalyzer: A software tool for protein inference in shotgun proteomics
1 Department of Communications Engineering, University of the Basque Country (UPV/EHU), Alda. Urquijo s/n, Bilbao, 48013, Spain
2 Proteomics Core Facility-SGIKER, University of the Basque Country (UPV/EHU), Barrio Sarriena s/n, Leioa, 48940, Spain
3 Department of Biochemistry and Molecular Biology, University of the Basque Country (UPV/EHU), Barrio Sarriena s/n, Leioa, 48940, Spain
4 Department of Genetics, Physical Anthropology and Animal Physiology, University of the Basque Country (UPV/EHU), Barrio Sarriena s/n, Leioa, 48940, Spain
5 Proteolysis in diseases, IPATIMUP - Institute of Molecular Pathology and Immunology of the University of Porto, Rua Dr. Roberto Frias s/n, Porto, 4200-465, Portugal
BMC Bioinformatics 2012, 13:288 doi:10.1186/1471-2105-13-288
Published: 5 November 2012
Protein inference from peptide identifications in shotgun proteomics must deal with ambiguities that arise from peptides shared between different proteins, a common situation in higher eukaryotes. Recently, data-independent acquisition (DIA) approaches have emerged as an alternative to traditional data-dependent acquisition (DDA) in shotgun proteomics experiments. MSE is one of the DIA approaches used in QTOF instruments. MSE data require specialized software to process the acquired spectra and to perform peptide and protein identification. However, the software currently available does not group the identified proteins transparently according to peptide evidence categories. Furthermore, inspecting, comparing and reporting the results requires tedious manual intervention. Here we report a software tool that addresses these limitations for MSE data.
In this paper we present PAnalyzer, a software tool focused on the protein inference process of shotgun proteomics. Our approach considers all the identified proteins and groups them when necessary, indicating their confidence by means of different evidence categories. PAnalyzer can read protein identification files in the XML output format of the ProteinLynx Global Server (PLGS) software provided by Waters Corporation for their MSE data, and also in the mzIdentML format recently standardized by HUPO-PSI. Multiple files can be read simultaneously, in which case they are treated as technical replicates. Results are saved as CSV and HTML files and, when the input is a single mzIdentML file, also as mzIdentML. An MSE analysis of a real sample is presented to compare the results of PAnalyzer and ProteinLynx Global Server.
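To make the idea of grouping proteins by peptide evidence concrete, the following is a minimal Python sketch. It is not PAnalyzer's actual algorithm: the function name `classify_proteins`, the input layout (a mapping from protein accession to its set of identified peptides) and the three labels are assumptions chosen for illustration only, and the real tool's evidence categories may be defined differently.

```python
from collections import defaultdict


def classify_proteins(protein_peptides):
    """Assign an illustrative evidence label to each protein.

    protein_peptides: dict mapping a protein accession to the set of
    peptide sequences identified for it (hypothetical input layout).
    """
    # Invert the mapping: which proteins does each peptide belong to?
    peptide_proteins = defaultdict(set)
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            peptide_proteins[peptide].add(protein)

    labels = {}

    # A protein with at least one peptide found in no other protein
    # has unambiguous evidence of its own.
    for protein, peptides in protein_peptides.items():
        if any(len(peptide_proteins[p]) == 1 for p in peptides):
            labels[protein] = "unique-evidence"

    # Remaining proteins only have shared peptides. Those with exactly
    # the same peptide set cannot be told apart from one another.
    by_peptide_set = defaultdict(list)
    for protein, peptides in protein_peptides.items():
        if protein not in labels:
            by_peptide_set[frozenset(peptides)].append(protein)

    for members in by_peptide_set.values():
        label = "indistinguishable" if len(members) > 1 else "shared-only"
        for protein in members:
            labels[protein] = label

    return labels


# Toy input with made-up accessions and peptide names.
example = {
    "P1": {"pepA", "pepB"},
    "P2": {"pepB", "pepC"},
    "P3": {"pepD"},
    "P4": {"pepD"},
    "P5": {"pepB"},
}
result = classify_proteins(example)
```

In this toy input, P1 and P2 each have a peptide of their own, P3 and P4 share an identical peptide set, and P5 is supported only by a peptide that also occurs in other proteins. A fuller treatment would also need to handle proteins whose shared peptides overlap only partially, which this sketch does not attempt.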
We present a software tool to deal with the ambiguities that arise in the protein inference process. Key contributions are support for MSE data analyzed by ProteinLynx Global Server and the integration of technical replicates. PAnalyzer is an easy-to-use, multiplatform, free software tool.