Methodology article (Open Access)

A method for increasing expressivity of Gene Ontology annotations using a compositional approach

Rachael P Huntley (1), Midori A Harris (2), Yasmin Alam-Faruque (1), Judith A Blake (3), Seth Carbon (4), Heiko Dietze (4), Emily C Dimmer (1), Rebecca E Foulger (1), David P Hill (3), Varsha K Khodiyar (5), Antonia Lock (2), Jane Lomax (1), Ruth C Lovering (5), Prudence Mutowo-Meullenet (1), Tony Sawford (1), Kimberly Van Auken (6), Valerie Wood (2) and Christopher J Mungall (4)*

Author Affiliations

1 European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK

2 Department of Biochemistry, Cambridge Systems Biology Centre, University of Cambridge, Sanger Building, 80 Tennis Court Road, Cambridge CB2 1GA, UK

3 The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA

4 Lawrence Berkeley National Laboratory, Genomics Division, Berkeley, CA 94720, USA

5 Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, UK

6 California Institute of Technology, Division of Biology 156-29, Pasadena, CA 91125, USA

BMC Bioinformatics 2014, 15:155  doi:10.1186/1471-2105-15-155

Published: 21 May 2014

Abstract

Background

The Gene Ontology project integrates data about the function of gene products across a diverse range of organisms, allowing the transfer of knowledge from model organisms to humans, and enabling computational analyses for interpretation of high-throughput experimental and clinical data. The core data structure is the annotation, an association between a gene product and a term from one of the three ontologies comprising the GO. Historically, it has not been possible to provide additional information about the context of a GO term, such as the target gene or the location of a molecular function. This has limited the specificity of knowledge that can be expressed by GO annotations.

Results

The GO Consortium has introduced annotation extensions that enable manually curated GO annotations to capture additional contextual details. Extensions represent effector–target relationships, such as localization dependencies, substrates of protein modifiers, and regulation targets of signaling pathways and transcription factors, as well as spatial and temporal aspects of processes, such as cell or tissue type or developmental stage. We describe the content and structure of annotation extensions, provide examples, and summarize their current usage.
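
As a rough illustration of what such an extension might look like as data, the sketch below parses an extension string into relation and target pairs. It assumes the textual form `relation(ID)`, with commas joining parts of one extension and pipes separating independent extensions, as in the annotation extension column of GO annotation files; the abstract itself does not spell out this syntax. The names `Extension` and `parse_extensions` are hypothetical, and the relation names and identifiers in the example string are placeholders chosen for illustration rather than a real curated annotation.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed syntax for a single extension part: relation(Prefix:LocalID)
_PART = re.compile(r"^\s*(\w+)\(([^()\s]+)\)\s*$")


@dataclass(frozen=True)
class Extension:
    """One relation/target pair attached to a GO annotation (hypothetical type)."""
    relation: str  # e.g. a relation describing a regulation target or location
    target: str    # identifier of a gene product or ontology term


def parse_extensions(field: str) -> List[List[Extension]]:
    """Split an extension field into groups of relation/target pairs.

    Pipes are assumed to separate independent extensions; commas are
    assumed to join parts that apply together.
    """
    groups: List[List[Extension]] = []
    for group in field.split("|"):
        if not group.strip():
            continue  # tolerate empty fields and stray separators
        parts = []
        for item in group.split(","):
            match = _PART.match(item)
            if match is None:
                raise ValueError(f"malformed extension part: {item!r}")
            parts.append(Extension(match.group(1), match.group(2)))
        groups.append(parts)
    return groups


# Illustrative input only: the identifiers below are placeholders.
example = "has_regulation_target(UniProtKB:P12345),occurs_in(CL:0000084)"
parsed = parse_extensions(example)
for ext in parsed[0]:
    print(ext.relation, "->", ext.target)
```

Keeping each extension as an explicit relation/target pair, rather than an opaque string, is what would make the directional links between gene products queryable in downstream analyses.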

Conclusions

The additional contextual information captured by annotation extensions improves the utility of functional annotation by representing dependencies between annotations to terms in the different ontologies of GO, external ontologies, or an organism’s gene products. These enhanced annotations can also support sophisticated queries and reasoning, and will provide curated, directional links between many gene products to support pathway and network reconstruction.

Keywords:
Gene Ontology; Functional annotation; Annotation extension; Manual curation