Transcriptome sequencing and microarray design for functional genomics in the extremophile Arabidopsis relative Thellungiella salsuginea (Eutrema salsugineum)
1 Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, D-14476 Potsdam, Germany
2 FELDA Agricultural Services Sdn Bhd, Tingkat 7, Balai Felda, Jalan Gurney 1, 54000 Kuala Lumpur, Malaysia
3 Center for Computational Biology and Bioinformatics, Columbia University, 10032 New York, NY, USA
4 Institute of Biotechnology, University of Vilnius, V. Graičiūno 8, LT-02241 Vilnius, Lithuania
5 Max-Planck-Institute for Molecular Genetics, Ihnestrasse 63-73, D-14195 Berlin, Germany
6 Institut für Biologie I, RWTH Aachen, Worringer Weg 1, D-52056 Aachen, Germany
7 IBG-2: Pflanzenwissenschaften, Forschungszentrum Jülich, D-52425 Jülich, Germany
8 Max F. Perutz Laboratories, University and Medical University of Vienna, Dr. Bohrgasse 9, A-1030 Vienna, Austria
BMC Genomics 2013, 14:793 doi:10.1186/1471-2164-14-793. Published: 14 November 2013
Most molecular studies of plant stress tolerance have been performed with Arabidopsis thaliana, although it is not particularly stress tolerant and may lack protective mechanisms required to survive extreme environmental conditions. Thellungiella salsuginea has attracted interest as an alternative plant model species with high tolerance of various abiotic stresses. While the T. salsuginea genome has recently been sequenced, its annotation is still incomplete and transcriptomic information is scarce. In addition, functional genomics investigations in this species are severely hampered by a lack of affordable tools for genome-wide gene expression studies.
Here, we report the de novo assembly and annotation of the Thellungiella transcriptome based on 454 pyrosequencing, together with the development and validation of a T. salsuginea microarray. ESTs were generated from a non-normalized and a normalized library, both synthesized from RNA pooled from samples covering different tissues and abiotic stress conditions. Each library yielded partially unique sequences, indicating that both are needed to obtain comprehensive transcriptome coverage. More than 1 million sequence reads were assembled into 42,810 unigenes, approximately 50% of which could be functionally annotated. These unigenes were compared to all available Thellungiella genome sequence information. In addition, the groups of Late Embryogenesis Abundant (LEA) proteins, Mitogen Activated Protein (MAP) kinases and protein phosphatases were annotated in detail. We also predicted the target genes for 384 putative miRNAs. From the sequence information, we constructed a 44 k Agilent oligonucleotide microarray. Comparison of same-species and cross-species hybridization results showed superior performance of the newly designed array for T. salsuginea samples. The newly developed microarrays and the MapMan software were used to investigate transcriptional responses of T. salsuginea and Arabidopsis during cold acclimation.
This study provides the first comprehensive transcriptome information for the extremophile Arabidopsis relative T. salsuginea. The data constitute a more than three-fold increase in the number of publicly available unigene sequences and will greatly facilitate genome annotation. In addition, we have designed and validated the first genome-wide microarray for T. salsuginea, which will be commercially available. Together with the publicly available MapMan software, this microarray will become an important tool for functional genomics of plant stress tolerance.