Research article

Large-scale Gene Ontology analysis of plant transcriptome-derived sequences retrieved by AFLP technology

Alessandro Botton1, Giulio Galla1, Ana Conesa2, Christian Bachem3, Angelo Ramina1* and Gianni Barcaccia1

Author Affiliations

1 Department of Environmental Agronomy and Crop Science, University of Padova, Viale dell'Università 16, Campus of Agripolis, 35020 Legnaro, Italy

2 Bioinformatics Department, Centro de Investigación Príncipe Felipe, Avda. Autopista Saler 16, 46013 Valencia, Spain

3 Department of Plant Sciences, Laboratory of Plant Breeding, Wageningen University and Research Centrum, PO Box 386, 6700 AJ Wageningen, The Netherlands

BMC Genomics 2008, 9:347  doi:10.1186/1471-2164-9-347

Published: 24 July 2008

Abstract

Background

After ten years of use of AFLP (Amplified Fragment Length Polymorphism) technology for DNA fingerprinting and mRNA profiling, large repertoires of genome- and transcriptome-derived sequences are available in public databases for model, crop and tree species. AFLP marker systems have been, and are still being, extensively exploited for genome scanning and gene mapping, while cDNA-AFLP is used for transcriptome profiling and for cloning differentially expressed genes. The evaluation, annotation and classification of genomic markers and expressed transcripts would be of great utility for both functional genomics and systems biology research in plants. This may be achieved by means of the Gene Ontology (GO), which consists of three structured vocabularies (i.e. ontologies) that describe the genes, transcripts and proteins of any organism in terms of their associated cellular component, biological process and molecular function in a species-independent manner. In this paper, the functional annotation of about 8,000 AFLP-derived ESTs retrieved from the NCBI databases was carried out using GO terminology.
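As an illustration of how annotations map onto the three GO vocabularies, the following minimal Python sketch counts annotated sequences per ontology. The GO identifiers are real terms, but the sequence IDs, the tiny lookup table and the function name are hypothetical and are not taken from the study; a real analysis would load the full ontology and the Blast2GO output instead.

```python
from collections import Counter

# Hypothetical lookup: GO term -> (ontology, name). The IDs are real GO terms,
# but a real analysis would load the full ontology rather than hard-code it.
GO_TERMS = {
    "GO:0005634": ("cellular_component", "nucleus"),
    "GO:0009507": ("cellular_component", "chloroplast"),
    "GO:0006412": ("biological_process", "translation"),
    "GO:0003677": ("molecular_function", "DNA binding"),
    "GO:0016301": ("molecular_function", "kinase activity"),
}

# Hypothetical annotations: sequence ID -> GO terms assigned to it.
ANNOTATIONS = {
    "EST_001": ["GO:0005634", "GO:0003677"],
    "EST_002": ["GO:0009507", "GO:0006412"],
    "EST_003": ["GO:0016301"],
}


def count_by_ontology(annotations, go_terms):
    """Count sequences per ontology; a sequence counts once per ontology."""
    counts = Counter()
    for terms in annotations.values():
        # Collect the distinct ontologies touched by this sequence's terms.
        ontologies = {go_terms[t][0] for t in terms if t in go_terms}
        counts.update(ontologies)
    return counts


counts = count_by_ontology(ANNOTATIONS, GO_TERMS)
for ontology in ("cellular_component", "biological_process", "molecular_function"):
    print(ontology, counts[ontology])
```

Counting each sequence once per ontology, rather than once per term, keeps heavily annotated sequences from dominating the summary; this is a design choice of the sketch, not a description of the authors' procedure.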

Results

Descriptive statistics on the type, size and nature of gene sequences obtained by means of AFLP technology were calculated. The gene products associated with mRNA transcripts were then classified according to the three main GO vocabularies. A comparison of the functional content of cDNA-AFLP records was also performed by splitting the sequence dataset into monocots and dicots and comparing each subset to all annotated ESTs of rice and Arabidopsis, respectively. On the whole, the statistical parameters adopted for the in silico analysis of AFLP-derived, transcriptome-anchored sequences proved to be critical for obtaining reliable GO results. Such an exhaustive annotation may offer a suitable platform for functional genomics, particularly in non-model species.
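A comparison of this kind can be sketched as a test on a 2x2 table of counts for a single GO term (annotated versus not annotated, test set versus reference set). The abstract does not name the test that was used; the example below assumes a two-sided Fisher's exact test, one common choice for such comparisons, written in plain Python. The function name and all counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical counts for one GO term:
# 30 of 400 cDNA-AFLP sequences versus 150 of 4000 reference ESTs carry the term.
p_value = fisher_exact_two_sided(30, 400 - 30, 150, 4000 - 150)
print(f"p = {p_value:.4g}")
```

When many GO terms are tested in this way, the resulting p-values would normally need a multiple-testing correction, which this sketch leaves out.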

Conclusion

Reliable GO annotations of AFLP-derived sequences can be obtained by optimizing the experimental steps and the statistical parameters adopted. The Blast2GO software proved to be a comprehensive bioinformatics solution for annotation-based functional analysis. According to the whole set of GO annotations, the AFLP technology generates thorough information on angiosperm gene products and shares common features across angiosperm species and families. The utility of this technology for structural and functional genomics in plants can be enhanced by serial annotation analyses of genome-anchored fragments and of organ/tissue-specific repertoires of transcriptome-derived fragments.