Integrated multi-level quality control for proteomic profiling studies using mass spectrometry
- Equal contributors
1 Clinical and Biomedical Proteomics Group, Cancer Research UK Clinical Centre, Leeds Institute of Molecular Medicine, St James's University Hospital, Beckett Street, Leeds LS9 7TF, UK
2 Section of Epidemiology and Biostatistics, Leeds Institute of Molecular Medicine, St James's University Hospital, Beckett Street, Leeds LS9 7TF, UK
3 Department of Clinical Biochemistry and Immunology, Leeds General Infirmary, Great George Street, Leeds LS1 3EX, UK
BMC Bioinformatics 2008, 9:519 doi:10.1186/1471-2105-9-519
Published: 4 December 2008
Proteomic profiling using mass spectrometry (MS) is one of the most promising methods for the analysis of complex biological samples such as urine, serum and tissue for biomarker discovery. Such experiments are often conducted using MALDI-TOF (matrix-assisted laser desorption/ionisation time-of-flight) and SELDI-TOF (surface-enhanced laser desorption/ionisation time-of-flight) MS. These profiling methods make it possible to identify changes in protein expression that differentiate disease states, as well as individual proteins or patterns that may be useful as biomarkers. However, quality control (QC) processes that reliably identify low-quality spectra, so that such data can be removed before further analysis, are often overlooked. In this paper we describe rigorous methods for assessing the quality of spectral data. These procedures are presented in a user-friendly, web-based program. The data retained after QC are then examined using variance components analysis to quantify the amount of variance due to some of the factors in the experimental design.
Using data from a SELDI profiling study of serum from patients with different levels of renal function, we show how the algorithms described in this paper may be used to detect systematic variability within and between sample replicates, pooled samples, and SELDI chips and spots. Manual inspection of the spectra identified as being of poor quality confirmed the efficacy of the algorithms. Variance components analysis showed that the technical variance attributable to day of profile generation and experimental array was relatively small.
Using the techniques described in this paper, it is possible to reliably detect poor-quality data within proteomic profiling experiments undertaken by MS. Removing these spectra at the initial stages of the analysis substantially improves confidence in putative biomarker identification and makes inter-experimental comparisons more reliable.
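As a rough illustration of what a variance components analysis estimates, the sketch below applies the classical method-of-moments (ANOVA) estimator to a balanced one-way random-effects design, for example one peak intensity measured on several replicate spots per chip. It is not the model or software used in this study, whose design involves more factors. The function name, the grouping by chip, and the intensity values are all hypothetical.

```python
import numpy as np


def one_way_variance_components(groups):
    """Estimate variance components for a balanced one-way random-effects design.

    groups: 2-D array-like of shape (k, n), one row per group (e.g. per chip,
            a hypothetical grouping) and n replicate measurements per row.
    Returns (between_group_variance, within_group_variance).
    """
    y = np.asarray(groups, dtype=float)
    k, n = y.shape
    group_means = y.mean(axis=1)
    grand_mean = y.mean()

    # Mean squares from the one-way ANOVA table.
    ms_between = n * np.sum((group_means - grand_mean) ** 2) / (k - 1)
    ms_within = np.sum((y - group_means[:, None]) ** 2) / (k * (n - 1))

    # E[MS_between] = sigma_within^2 + n * sigma_between^2, so solve for the
    # between-group component and truncate negative estimates at zero.
    var_between = max((ms_between - ms_within) / n, 0.0)
    return var_between, ms_within


# Illustrative (made-up) intensities: 3 chips, 4 replicate spots each.
intensities = [
    [10.2, 10.8, 9.9, 10.5],
    [11.9, 12.4, 12.1, 11.6],
    [9.1, 9.6, 9.4, 8.9],
]
between, within = one_way_variance_components(intensities)
total = between + within
print(f"between-chip share of variance: {between / total:.2f}")
print(f"within-chip share of variance:  {within / total:.2f}")
```

In a design with several crossed or nested factors (day, array, spot, sample), a mixed-effects model fitted by restricted maximum likelihood would be the more usual choice. The one-way case is shown only because it can be computed directly from the ANOVA mean squares.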