Open Access Research article

High-throughput transcriptome sequencing and preliminary functional analysis in four Neotropical tree species

Louise Brousseau1,2, Alexandra Tinaut3, Caroline Duret1, Tiange Lang4, Pauline Garnier-Gere5,6 and Ivan Scotti1*

Author Affiliations

1 INRA, UMR 0745 EcoFoG, Campus agronomique BP 709, F-97387 Cedex, France

2 INRA, UMR 1137 EEF, allée de l’Arboretum, 54280 Champenoux, France

3 University of French West Indies and French Guiana, UMR EcoFoG, Campus agronomique BP 709, F-97387 KOUROU, Cedex, French Guiana

4 Key Laboratory of Tropical Forest Ecology, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla, Yunnan 666303, China

5 INRA, UMR 1202 BIOGECO, F-33610 Cestas, France

6 BIOGECO, UMR 1202, University of Bordeaux, F-33400 Talence, France


BMC Genomics 2014, 15:238  doi:10.1186/1471-2164-15-238

Published: 27 March 2014

Abstract

Background

The Amazonian rainforest is predicted to suffer from ongoing environmental changes. Despite the need to evaluate the impact of such changes on tree genetic diversity, genomic resources for its tree species are almost entirely lacking.

Results

In this study, we analysed the transcriptome of four tropical tree species (Carapa guianensis, Eperua falcata, Symphonia globulifera and Virola michelii) with contrasting ecological features, belonging to four widespread botanical families (Meliaceae, Fabaceae, Clusiaceae and Myristicaceae, respectively). We sequenced cDNA libraries from three organs (leaves, stems, and roots) using 454 pyrosequencing. We developed an R- and BioPerl-based bioinformatic procedure for de novo assembly, gene functional annotation and marker discovery. Mismatch identification takes into account single-base quality values as well as the likelihood of false variants as a function of contig depth and number of sequenced chromosomes. Between 17103 (for Symphonia globulifera) and 23390 (for Eperua falcata) contigs were assembled. Organs varied in the number of unigenes they apparently express, with the highest numbers in roots. Patterns of gene expression were similar across species, with metabolism of aromatic compounds standing out as an overrepresented gene function. Transcripts corresponding to several gene functions were found to be over- or underrepresented in each organ. We identified between 4434 (for Symphonia globulifera) and 9076 (for Virola surinamensis) well-supported mismatches. The resulting overall mismatch density ranged from 0.89 (S. globulifera) to 1.05 (V. surinamensis) mismatches/100 bp in variation-containing contigs.
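To make the mismatch-filtering idea and the density unit concrete, the following is a minimal sketch in Python, not the authors' R/BioPerl procedure. It assumes a simple binomial error model in which every read at a site carries the same error probability, derived from a Phred-scaled quality value, and it ignores the number of sequenced chromosomes. All function names are hypothetical.

```python
from math import comb


def phred_to_error(q):
    """Convert a Phred-scaled quality value to an error probability."""
    return 10 ** (-q / 10)


def prob_false_variant(depth, alt_count, error_rate):
    """Probability of seeing at least `alt_count` erroneous bases among
    `depth` reads, assuming independent errors (binomial upper tail).

    Simplification (an assumption of this sketch): any sequencing error
    is counted as supporting the alternative base.
    """
    return sum(
        comb(depth, k) * error_rate ** k * (1 - error_rate) ** (depth - k)
        for k in range(alt_count, depth + 1)
    )


def mismatch_density(n_mismatches, total_bp):
    """Mismatches per 100 bp over the contigs that contain variation."""
    return 100.0 * n_mismatches / total_bp


if __name__ == "__main__":
    # Hypothetical site: 20 reads deep, 4 reads carry the variant, Q20 bases.
    p = prob_false_variant(depth=20, alt_count=4, error_rate=phred_to_error(20))
    print(f"P(>= 4 errors out of 20 reads at Q20) = {p:.2e}")
    # Hypothetical totals, chosen only to illustrate the unit.
    print(mismatch_density(n_mismatches=89, total_bp=10_000))
```

Under this model, a candidate variant would be retained only when the tail probability falls below a chosen threshold. Deeper contigs therefore require more supporting reads before a mismatch counts as well supported.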

Conclusion

The relative representation of gene functions in the four transcriptomes suggests that secondary metabolism may be particularly important in tropical trees. The differential representation of transcripts among tissues suggests differential gene expression, which opens the way to functional studies in these non-model, ecologically important species. We found substantial numbers of mismatches in the four species. These newly identified putative variants are a first step towards acquiring much-needed genomic resources for tropical tree species.

Keywords:
454-Pyrosequencing; Tropical rainforest tree species; Polymorphism discovery