Open Access Database

EuroPineDB: a high-coverage web database for maritime pine transcriptome

Noé Fernández-Pozo1, Javier Canales1, Darío Guerrero-Fernández2, David P Villalobos1, Sara M Díaz-Moreno1, Rocío Bautista2, Arantxa Flores-Monterroso1, M Ángeles Guevara3, Pedro Perdiguero4, Carmen Collada3,4, M Teresa Cervera3,4, Álvaro Soto3,4, Ricardo Ordás5, Francisco R Cantón1, Concepción Avila1, Francisco M Cánovas1 and M Gonzalo Claros1,2*

Author Affiliations

1 Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Campus de Teatinos s/n, Universidad de Málaga, 29071 Málaga, Spain

2 Plataforma Andaluza de Bioinformática, Edificio de Bioinnovación, C/Severo Ochoa 34, Universidad de Málaga, 29590 Málaga, Spain

3 Departamento de Ecología y Genética Forestal, CIFOR-UNIA, Carretera de La Coruña, km 7,5, 28040 Madrid, Spain

4 UM Genómica y Ecofisiología Forestal INIA-UPM, Universidad Politécnica de Madrid, Madrid, Spain

5 Área de Fisiología Vegetal, Departamento BOS, Instituto Universitario de Biotecnología de Asturias, Universidad de Oviedo, 33071 Oviedo, Spain

BMC Genomics 2011, 12:366  doi:10.1186/1471-2164-12-366

Published: 15 July 2011

Abstract

Background

Pinus pinaster is an economically and ecologically important species that is becoming a woody gymnosperm model. Its enormous genome size makes whole-genome sequencing approaches hard to apply. Therefore, the expressed portion of the genome has to be characterised, and the results and annotations have to be stored in dedicated databases.

Description

EuroPineDB is the largest sequence collection available for a single pine species, Pinus pinaster (maritime pine): it comprises 951 641 raw sequence reads obtained from non-normalised cDNA libraries and high-throughput sequencing of adult (xylem, phloem, roots, stem, needles, cones, strobili) and embryonic (germinated embryos, buds, callus) maritime pine tissues. Using open-source tools, sequences were pre-processed, assembled, and extensively annotated (GO, EC and KEGG terms, descriptions, SNPs, SSRs, ORFs and InterPro codes). As a result, 10.5× coverage of the P. pinaster genome was obtained and the sequences were assembled into 55 322 UniGenes. A total of 32 919 (59.5%) of the P. pinaster UniGenes were annotated with at least one description, revealing at least 18 466 different genes. The complete database, which is designed to be scalable, maintainable, and expandable, is freely available at http://www.scbi.uma.es/pindb/ and will be periodically updated. It can be queried by gene libraries, pine species, annotations, UniGenes and microarrays (i.e., the sequences are distributed in two-colour microarrays; this is the only conifer database that provides this information). Small assemblies can be viewed using a dedicated visualisation tool that connects them with SNPs. Retrieval mechanisms for sequences and gene annotations are provided, and any sequence or annotation set shown on-screen can be downloaded.

Conclusions

EuroPineDB, with its integrated information, can be used to reveal new knowledge, offers an easy-to-use collection of information that directly supports experimental work (including microarray hybridisation), and provides deeper knowledge of the maritime pine transcriptome.