Open Access Highly Accessed Methodology article

Exon level integration of proteomics and microarray data

Danny A Bitton1, Michał J Okoniewski1, Yvonne Connolly2 and Crispin J Miller1*

Author affiliations

1 Cancer Research UK, Applied Computational Biology and Bioinformatics Group, Paterson Institute for Cancer Research, The University of Manchester, Christie Hospital Site, Wilmslow Road, Manchester, M20 4BX, UK

2 Cancer Research UK, Proteomics Service, Paterson Institute for Cancer Research, The University of Manchester, Christie Hospital Site, Wilmslow Road, Manchester, M20 4BX, UK

Citation and License

BMC Bioinformatics 2008, 9:118  doi:10.1186/1471-2105-9-118

Published: 25 February 2008

Abstract

Background

Previous studies comparing quantitative proteomics and microarray data have generally found poor correspondence between the two. We hypothesised that this might in part be because the different assays target different parts of the expressed genome and might therefore be subject to confounding effects from processes such as alternative splicing.

Results

Using a genome database as a platform for integration, we combined quantitative protein mass spectrometry with Affymetrix Exon array data at the level of individual exons. We found significantly higher degrees of correlation than have been previously observed (r = 0.808). The study was performed using cell lines in equilibrium in order to reduce a major potential source of biological variation, thus allowing the analysis to focus on the data integration methods in order to establish their performance.
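As an illustration only, the kind of exon-level comparison described above can be sketched as follows. This is a minimal sketch, not the authors' pipeline: it assumes that peptides and array probesets have already been mapped to exon identifiers through genome coordinates, and the function names (`exon_level_pairs`, `pearson`), the exon identifiers and all numeric values are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def exon_level_pairs(peptide_ratios, probeset_ratios):
    """Pair protein and transcript measurements exon by exon.

    Each input is an iterable of (exon_id, log2_ratio) tuples. Values that
    map to the same exon are averaged, and only exons observed on both
    platforms are kept.
    """
    def mean_by_exon(rows):
        grouped = defaultdict(list)
        for exon_id, value in rows:
            grouped[exon_id].append(value)
        return {exon: sum(vals) / len(vals) for exon, vals in grouped.items()}

    protein = mean_by_exon(peptide_ratios)
    transcript = mean_by_exon(probeset_ratios)
    shared = sorted(protein.keys() & transcript.keys())
    return {exon: (protein[exon], transcript[exon]) for exon in shared}


# Hypothetical toy data: log2 ratios between two cell lines.
peptides = [("exonA", 1.0), ("exonA", 1.4), ("exonB", -0.5),
            ("exonC", 0.2), ("exonD", 2.0)]
probesets = [("exonA", 1.1), ("exonB", -0.7), ("exonB", -0.3),
             ("exonC", 0.4), ("exonE", 0.9)]

pairs = exon_level_pairs(peptides, probesets)
r = pearson([p for p, _ in pairs.values()], [t for _, t in pairs.values()])
```

Exons seen on only one platform (`exonD` and `exonE` here) drop out of the comparison, so the correlation is computed only where both assays measure the same part of the expressed genome.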

Conclusion

We conclude that part of the variation observed when integrating microarray and proteomics data may occur as a consequence both of the data analysis and of the high granularity to which studies have until recently been limited. The approach opens up, for the first time, the possibility of considering combined microarray and proteomics datasets at the level of individual exons and isoforms, which is important given the high proportion of alternative splicing observed in the human genome.