Open Access Research article

PNAC: a protein nucleolar association classifier

Michelle S Scott1*, François-Michel Boisvert2, Angus I Lamond2 and Geoffrey J Barton1

Author Affiliations

1 Division of Biological Chemistry and Drug Discovery, College of Life Sciences, University of Dundee, Dow Street, Dundee DD1 5EH, UK

2 Wellcome Trust Centre for Gene Regulation and Expression, College of Life Sciences, University of Dundee, Dow Street, Dundee DD1 5EH, UK

BMC Genomics 2011, 12:74  doi:10.1186/1471-2164-12-74

Published: 27 January 2011

Abstract

Background

Although primarily known as the site of ribosome subunit production, the nucleolus is involved in numerous and diverse cellular processes. Recent large-scale proteomics projects have identified thousands of human proteins that associate with the nucleolus. However, in most cases, we know neither the fraction of each protein pool that is nucleolus-associated nor whether that association is permanent or conditional.

Results

To describe the dynamic localisation of proteins in the nucleolus, we investigated the extent of nucleolar association of proteins by first collating an extensively curated, literature-derived dataset. This dataset then served to train a probabilistic predictor that integrates gene and protein characteristics. Unlike most previous experimental and computational studies of the nucleolar proteome, which produce large static lists of nucleolar proteins regardless of their extent of nucleolar association, our predictor models the fluidity of the nucleolus by considering different classes of nucleolar-associated proteins. The new method classifies every human protein as nucleolar-enriched, nucleolar-nucleoplasmic, nucleolar-cytoplasmic or non-nucleolar. Leave-one-out cross-validation tests reveal sensitivity values for these four classes ranging from 0.72 to 0.90 and positive predictive values ranging from 0.63 to 0.94. The overall accuracy of the classifier was 0.85 on an independent literature-based test set and 0.74 on a large independent quantitative proteomics dataset. While the three nucleolar-association groups display vastly different Gene Ontology biological process signatures and evolutionary characteristics, they collectively represent the best-characterised nucleolar functions.
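For readers less familiar with the evaluation measures quoted above, the following minimal Python sketch shows how per-class sensitivity, positive predictive value and overall accuracy can be computed for a four-class predictor. It is an illustration only, not the authors' implementation: the function names and the toy labels are hypothetical and invented for this example, and only the four class names come from the text.

```python
from collections import Counter

# The four classes named in the abstract.
CLASSES = [
    "nucleolar-enriched",
    "nucleolar-nucleoplasmic",
    "nucleolar-cytoplasmic",
    "non-nucleolar",
]


def per_class_metrics(true_labels, predicted_labels, classes=CLASSES):
    """Return {class: (sensitivity, ppv)}; a value is None when undefined."""
    pairs = Counter(zip(true_labels, predicted_labels))
    metrics = {}
    for c in classes:
        tp = pairs[(c, c)]  # proteins of class c predicted as c
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        fp = sum(n for (t, p), n in pairs.items() if t != c and p == c)
        sensitivity = tp / (tp + fn) if (tp + fn) else None
        ppv = tp / (tp + fp) if (tp + fp) else None
        metrics[c] = (sensitivity, ppv)
    return metrics


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of proteins assigned to their annotated class."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy labels, invented purely for illustration (not data from the study).
toy_true = ["nucleolar-enriched"] * 4 + ["non-nucleolar"] * 4
toy_pred = ["nucleolar-enriched"] * 3 + ["non-nucleolar"] * 5

toy_metrics = per_class_metrics(toy_true, toy_pred)
toy_accuracy = overall_accuracy(toy_true, toy_pred)
```

Sensitivity asks what fraction of the proteins annotated to a class the predictor recovers, whereas positive predictive value asks what fraction of the proteins assigned to a class truly belong to it. In a leave-one-out setting, each predicted label would come from a model trained on all other proteins.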

Conclusions

Our proteome-wide classification of nucleolar association provides a novel representation of the dynamic content of the nucleolus. This model of nucleolar localisation thus increases the coverage while providing accurate and specific annotations of the nucleolar proteome. It will be instrumental in better understanding the central role of the nucleolus in the cell and its interaction with other subcellular compartments.