Open Access Research article

Deep sequencing for de novo construction of a marine fish (Sparus aurata) transcriptome database with a large coverage of protein-coding transcripts

Josep A Calduch-Giner1, Azucena Bermejo-Nogales1, Laura Benedito-Palos1, Itziar Estensoro2, Gabriel Ballester-Lozano1, Ariadna Sitjà-Bobadilla2 and Jaume Pérez-Sánchez1*

Author affiliations

1 Nutrigenomics and Fish Growth Endocrinology Group, Department of Marine Species Biology, Culture and Pathology, Institute of Aquaculture Torre de la Sal, Castellón, CSIC, Spain

2 Fish Pathology Group, Department of Marine Species Biology, Culture and Pathology, Institute of Aquaculture Torre de la Sal, Castellón, CSIC, Spain


Citation and License

BMC Genomics 2013, 14:178 doi:10.1186/1471-2164-14-178

Published: 15 March 2013



The gilthead sea bream (Sparus aurata) is the main fish species cultured in the Mediterranean area and constitutes an interesting research model. Nevertheless, transcriptomic and genomic data are still scarce for this highly valuable species. A transcriptome database was constructed by de novo assembly of gilthead sea bream sequences derived from public repositories of mRNA and collections of expressed sequence tags, together with new high-quality reads from five normalized cDNA 454 libraries of skeletal muscle (1), intestine (1), head kidney (2) and blood (1).


Sequencing of the new 454 normalized libraries produced 2,945,914 high-quality reads, and the de novo global assembly yielded 125,263 unique sequences with an average length of 727 nt. BLAST searches against protein and nucleotide databases annotated 63,880 sequences corresponding to 21,384 gene descriptions, which were curated for redundancies and for frameshifts at the homopolymer regions of open reading frames and are hosted in an online database. Among the annotated gene descriptions, 16,177 were mapped in the Ingenuity Pathway Analysis (IPA) database, and 10,899 were eligible for functional analysis, with representation in 341 out of 372 IPA canonical pathways. The high proportion of randomly selected stickleback transcripts recovered by BLAST search against the gilthead sea bream nucleotide database demonstrated its high coverage of protein-coding transcripts.
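The counts reported above can be put on a common scale by expressing them as proportions. The following is a minimal sketch in Python, not part of the original analysis: the variable names and the `percent` helper are illustrative assumptions, and only the counts themselves are taken from the text.

```python
# Counts quoted in the results paragraph above.
# All names below are illustrative assumptions; only the numbers come from the text.
unique_sequences = 125_263      # sequences from the de novo global assembly
annotated_sequences = 63_880    # sequences annotated by BLAST
gene_descriptions = 21_384      # distinct gene descriptions after annotation
mapped_in_ipa = 16_177          # gene descriptions mapped in the IPA database
eligible_for_analysis = 10_899  # gene descriptions eligible for functional analysis
pathways_represented = 341      # IPA canonical pathways with representation
pathways_total = 372            # IPA canonical pathways considered


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


summary = {
    "assembled sequences with annotation": percent(annotated_sequences, unique_sequences),
    "gene descriptions mapped in IPA": percent(mapped_in_ipa, gene_descriptions),
    "gene descriptions eligible for functional analysis": percent(eligible_for_analysis, gene_descriptions),
    "IPA canonical pathways represented": percent(pathways_represented, pathways_total),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

These ratios only restate the reported counts. In particular, the share of gene descriptions mapped in IPA is a different quantity from the estimated proportion of actively transcribed genes covered by the database.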


The newly assembled gilthead sea bream transcriptome represents an advance in genomic resources for this species, as it probably contains more than 75% of actively transcribed genes, and it constitutes a valuable tool to assist functional genomics studies and future genome projects.

Keywords: Sparus aurata; Next-generation sequencing; De novo assembly; Transcriptome; Database