This article is part of the supplement: The Second Automated Function Prediction Meeting.

Open Access Proceedings

Gene function prediction based on genomic context clustering and discriminative learning: an application to bacteriophages

Jason Li1, Saman K Halgamuge1, Christopher I Kells1 and Sen-Lin Tang2

1Dynamic Systems & Control Group, DoMME, University of Melbourne, Melbourne, Australia

2Research Center for Biodiversity, Academia Sinica, Taipei, Taiwan

BMC Bioinformatics 2007, 8(Suppl 4):S6 doi:10.1186/1471-2105-8-S4-S6

Published: 22 May 2007

Abstract

Background

Existing methods for whole-genome comparison require prior knowledge of related species and offer little automation of the function prediction process. Bacteriophage genomes are one example of genomes that these methods cannot easily analyze. This work addresses these shortcomings and aims to provide an automated system for gene function prediction.

Results

We have developed a novel system called SynFPS to perform gene function prediction over completed genomes. The prediction system is initialized by clustering a large collection of weakly related genomes into groups based on their resemblance in gene distribution. From each individual group, data are then extracted and used to train a Support Vector Machine that makes gene function predictions. Experiments were conducted with 9 different gene functions over 296 bacteriophage genomes. Cross-validation results gave an average prediction accuracy of ~80%, which is comparable to other genomic-context-based prediction methods. Functional predictions were also made for 3 uncharacterized genes and 12 genes that cannot be identified by sequence alignment. The software is publicly available at http://www.synteny.net/.
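The two-stage idea described above (group genomes by gene distribution, then train one classifier per group) can be illustrated with a minimal sketch. This is not the SynFPS implementation: the abstract does not specify the features, the clustering algorithm, or the SVM kernel, so everything here is an assumption made for illustration. The genome profiles, the cosine-similarity threshold, the two-dimensional gene features, and the helper names (`cluster_genomes`, `train_linear_svm`, `train_per_group`, `predict`) are hypothetical, and the classifier is a plain linear SVM fitted by subgradient descent on the hinge loss.

```python
import math

def cosine(u, v):
    """Cosine similarity between two gene-distribution profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def cluster_genomes(profiles, threshold=0.9):
    """Greedy clustering (assumed, not the paper's method): a genome joins
    the first group whose first member is similar enough, else starts a group."""
    groups = []
    for i, profile in enumerate(profiles):
        for members in groups:
            if cosine(profiles[members[0]], profile) >= threshold:
                members.append(i)
                break
        else:
            groups.append([i])
    return groups

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=100):
    """Linear SVM via subgradient descent on the L2-regularised hinge loss.
    Labels must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularisation shrink
            if margin < 1.0:                         # inside margin: hinge step
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(model, x):
    """Return +1 if the gene is predicted to have the function, else -1."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def train_per_group(groups, genes_by_genome):
    """Train one SVM per genome group from the genes of its member genomes."""
    models = []
    for members in groups:
        X = [x for g in members for x, _ in genes_by_genome[g]]
        y = [label for g in members for _, label in genes_by_genome[g]]
        models.append(train_linear_svm(X, y))
    return models

# Hypothetical toy data: one gene-distribution profile per genome, and
# per-gene (features, label) pairs for a single target function.
profiles = [[5, 1, 0], [4, 1, 0], [0, 1, 5], [0, 2, 4]]
genes_by_genome = {
    0: [([1.0, 0.0], 1), ([0.0, 1.0], -1)],
    1: [([0.9, 0.1], 1), ([0.1, 0.9], -1)],
    2: [([0.0, 1.0], 1), ([1.0, 0.0], -1)],
    3: [([0.1, 0.9], 1), ([0.9, 0.1], -1)],
}

groups = cluster_genomes(profiles)
models = train_per_group(groups, genes_by_genome)
print(groups)
print(predict(models[0], [0.95, 0.05]), predict(models[1], [0.95, 0.05]))
```

In this toy setup the same gene features receive opposite predictions in the two groups, which is the reason for training group-specific models: the genomic context that signals a function in one group of genomes need not signal it in another.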

Conclusion

The proposed system employs genomic context to predict gene function and detect gene correspondence in whole-genome comparisons. Although our experimental focus is on bacteriophages, the method may be extended to other microbial genomes, as they share a number of characteristics with phage genomes, such as gene order conservation.
