This article is part of the supplement: Italian Society of Bioinformatics (BITS): Annual Meeting 2011

Open Access Research

Argot2: a large scale function prediction tool relying on semantic similarity of weighted Gene Ontology terms

Marco Falda1*, Stefano Toppo1, Alessandro Pescarolo1, Enrico Lavezzo2, Barbara Di Camillo3, Andrea Facchinetti3, Elisa Cilia4, Riccardo Velasco4 and Paolo Fontana4

Author Affiliations

1 Department of Molecular Medicine, University of Padova, via U. Bassi 58/B, 35121, Padova, Italy

2 Department of Molecular Medicine, University of Padova, via Gabelli 63, 35121, Padova, Italy

3 Department of Information Engineering, University of Padova, via Gradenigo 6, 35131, Padova, Italy

4 Istituto Agrario San Michele all'Adige Research and Innovation Centre, Foundation Edmund Mach, via E. Mach 1, 38010, San Michele all'Adige (Trento), Italy

BMC Bioinformatics 2012, 13(Suppl 4):S14 doi:10.1186/1471-2105-13-S4-S14

Published: 28 March 2012

Abstract

Background

Predicting protein function has become increasingly demanding in the era of next-generation sequencing technology. Assigning a curator-reviewed function to every single sequence is impracticable. Easy-to-use bioinformatics tools that provide automatic and reliable annotations at a genomic scale are urgently needed. In this scenario, the Gene Ontology has provided the means to standardize annotation with a structured vocabulary that can be readily exploited by computational methods.

Results

Argot2 is a web-based function prediction tool able to annotate nucleic acid or protein sequences, from small datasets up to entire genomes. It accepts as input a list of sequences in FASTA format, which are processed using BLAST searches against UniProtKB and HMMER searches against Pfam. The sequences are then annotated with GO terms retrieved from the UniProtKB-GOA database, and the terms are weighted using the e-values from BLAST and HMMER. The weighted GO terms are processed according to both their associated scores and the semantic similarity relations described by the Gene Ontology. The algorithm is based on the original idea developed in a previous tool called Argot. The engine has been completely rewritten to improve both accuracy and computational efficiency, thus allowing for the annotation of complete genomes.
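
The weighting step can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this example assumes a -log10(e-value) transform and a simple sum over hits; the function names, the accessions and the e-values are hypothetical and chosen only for illustration. The semantic similarity processing that follows the weighting is not shown.

```python
import math
from collections import defaultdict


def hit_weight(evalue, floor=1e-300):
    """Turn an e-value into a non-negative weight.

    Assumption: weight = -log10(e-value); this transform is illustrative,
    not taken from the abstract. The floor avoids log10(0) for e-values
    reported as exactly zero, and weights are clamped at 0 for e-values > 1.
    """
    return max(0.0, -math.log10(max(evalue, floor)))


def weight_go_terms(hits, annotations):
    """Accumulate a score per GO term over all database hits.

    hits:        list of (hit_id, evalue) pairs from BLAST or HMMER
    annotations: dict mapping hit_id -> set of GO term identifiers
    Assumption: scores from different hits are simply summed.
    """
    scores = defaultdict(float)
    for hit_id, evalue in hits:
        weight = hit_weight(evalue)
        for go_term in annotations.get(hit_id, ()):
            scores[go_term] += weight
    return dict(scores)


# Hypothetical hits for one query sequence (placeholder ids and e-values).
hits = [("P12345", 1e-50), ("Q67890", 1e-10), ("PF00069", 1e-20)]
annotations = {
    "P12345": {"GO:0004672", "GO:0005524"},
    "Q67890": {"GO:0005524"},
    "PF00069": {"GO:0004672"},
}

scores = weight_go_terms(hits, annotations)
# Rank GO terms from the highest to the lowest accumulated score.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for go_term, score in ranked:
    print(f"{go_term}\t{score:.1f}")
```

In this sketch a GO term supported by several strong hits accumulates a higher score than a term supported by a single weak hit, which is the general intuition behind weighting terms by e-value.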

Conclusions

The revised algorithm has already been employed and successfully tested in in-house genome projects of grape and apple, and has shown high precision and recall in all our benchmark conditions. It has also been successfully compared with Blast2GO, one of the methods most commonly employed for sequence annotation. The server is freely accessible at http://www.medcomp.medicina.unipd.it/Argot2.