Anthurium andraeanum is one of the most popular tropical flowers. In temperate and cold zones, a much greater risk of cold stress occurs in the supply of Anthurium plants. Unlike the freeze-tolerant model plants, Anthurium plants are particularly sensitive to low temperatures. Improvement of chilling tolerance in Anthurium may significantly increase its production and extend its shelf-life. To date, no previous genomic information has been reported in Anthurium plants.
Using Illumina sequencing technology, we generated over two billion base of high-quality sequence in Anthurium, and demonstrated de novo assembly and annotation of genes without prior genome information. These reads were assembled into 44,382 unigenes (mean length = 560 bp). Based on similarity search with known protein in the non-redundant (nr) protein database, 27396 unigenes (62%) were functionally annotated with a cut-off E-value of 10-5. Further, DGE tags were mapped to the assembled transcriptome for gene expression analysis under cold stress. In total, 4363 differentially expressed genes were identified. Among these genes, 292, 805 and 708 genes were up-regulated after 1-h, 5-h and 24-h cold treatment, respectively. Then we mapped these cold-induced genes to the KEGG database. Specific enrichment was observed in photosynthesis pathway, metabolic pathways and oxidative phosphorylation pathway in 1-h cold-treated plants. After a 5-h cold treatment, the metabolic pathways and oxidative phosphorylation pathway were significantly identified as the top two pathways. After 24-h cold treatment, mRNA surveillance pathway, RNA transport pathway and plant-pathogen interaction pathway were significantly enriched. Together, a total of 39 cold-inducible transcription factors were identified, including subsets of AP2/ERF, Zinc figure, NAC, MYB and bZIP family members.
Our study is the first to provide the transcriptome sequence resource for Anthurium plants, and demonstrate its digital gene expression profiling under cold conditions using the assembled transcriptome data for reference. These data provides a valuable resource for genetic and genomic studies under abiotic conditions for Anthurium plants.
Keywords:Anthurium; Cold; Transcriptome; Digital gene expression
Low temperature is an environmental abiotic stimulus. To adapt to environmental changes, plants have various physiological response and defense systems to withstand chilling and freezing conditions. The regulatory mechanism in higher plants had been analyzed by studying a number of genes responding to cold stress at the transcriptional level [1,2]. In Arabidopsis thaliana, for example, thousands of genes were thought to be involved in abiotic stress [3,4]. Generally, the cold stress-inducible genes were classified into two groups: one group was that directly protect plant against environmental stresses; and the other was that regulate gene expression network and signaling in stress response . Recent progress had been made in analyzing the functions of stress-inducible genes, not only to understand the mechanisms of cold stress, but also to improve the chilling tolerance of plants by gene transfer. Genetic studies had identified many transcription factors that extensively involved in the regulation network of cold-inducible genes . The most well documented pathways involved a class of DREB/CBF transcription factors, which specific binding to the DRE/CRT cis-elements in the promoters of the target genes [5,7]. In A. thaliana, recent studies also focused on the upstream regulators of DREB/CBF. Several proteins, including ICE1, MYB15 and CAMTA3, had been identified as regulators of DREB1/CBF gene expression [8,9].
Genome-wide analyses have dramatically improved the efficiency of gene discovery. With the advent of next-generation sequencing, large scale transcriptome data became available in both model species and non-model species [10,11]. Although microarray technologies continue to make progress, genome-scale studies have been actively pursued for gene discovery and construction of biochemical and regulatory networks [12,13]. Using the transcriptome data for reference, DGE tag profiling generates a digital output proportional, allowing digital quantification and comparation across different experiments. A number of studies have used high-throughput expression analyses by stress conditions in higher plants. In A. thaliana, about 30% of the transcriptome was regulated by abiotic stress, and 2,409 genes were identified of great importance to cold, salt, and drought tolerance . In wheat, the effect of low temperature on transcriptome reprogramming was explored, and over 2% of the wheat genome showed a greater than two-fold changes by cold stress . In roots of cucumber, large-scale differentially expressed genes in several regulatory pathways were identified under waterlogging stress by digital gene expression profile . Although transcriptome changes have been identified, the biochemical functions of many stress-regulated genes remain unknown.
Anthurium andraeanum is one of the most popular tropical flowers in markets worldwide. Unlike the freeze-tolerant A. thaliana, Anthurium plants are irreversibly injured by long-term exposure to temperatures lower than 6°C. In temperate and cold areas, a much greater risk of cold stress occurs in the supply of Anthurium plants, especially in the process of production and transportation. After duration of long-time cold exposure, Anthurium plants cease to grow, and wilting symptoms begin to appear, with visible signs of yellowing leaves. It has been demonstrated that the physiological changes of cold response is mediated through the differential expression of many genes in model plants [4,17]. However, little is known about the cold-regulated genes and related pathways in Anthurium. To date, no previous genomic information has been reported in Anthurium plants, and fewer than 300 ESTs have previously been deposited in Genbank.
In this study, we presented the first comprehensive transcriptome characterization for Anthurium plants, and explored the effect of low temperature on global changes in the transcriptome. Using Illumina sequencing technology, we generated over two billion base of high-quality sequence, and demonstrated de novo assembly and annotation of genes without prior genome information. Furthermore, we compared the gene expression profiles of cold-treated and control Anthurium plants using DGE system, and identified numerous differentially and specifically expressed transcripts of cold-regulated genes. This represents a fully characterized Anthurium transcriptome, and provides a valuable resource for genetic and genomic studies of abiotic stress in the future.
Sequencing and reads assembly
To obtain a global overview of Anthurium transcriptome, a mixed cDNA sample of seedlings of cold-treated and control plants was prepared and sequenced using Illumina HiSeq™ 2000. Sequencing of normalized cDNA libraries resulted in approximately 2 Gbp of sequence data. After the first assembly, 411,416 contigs were obtained, and the mean contig size was 138 bp with lengths. Then the contigs were further assembled into 73,444 scaffolds with paired-end reads joining. The size distribution of contigs and scaffolds are shown in Additional file 1. After gap fillings of scaffolds, the final assembly consisted of 44,382 unigenes (Table 1). The length distributions of unigenes are given in Figure 1.
Format: PDF Size: 94KB Download file
This file can be viewed with: Adobe Acrobat Reader
Table 1. Sequence statistics of the Anthurium transcriptome
Figure 1. Histogram of unigene length distributions and the proportion of sequences annotated. Unigenes were searched in the NCBI nr protien database using blastx with a cut-off E-value of 1.0E-5.
The unigene sequences were firstly compared to the non-redundant (nr) protein database with a cut-off E-value of 10-5. As a result, 27,396 unigenes (62%) were annotated. Figure 1 indicated that the ratio of unigenes that could be matched to known genes had a linear relationship with the length of unigene sequences. As shown in the figure, 98% unigenes over 1,000 bp were annotated with gene names, whereas 46% sequences between 200 to 500 bp could be matched to known proteins (Figure 1). For E-value distribution, 61% homolog sequences ranged between 1.0E-5 to 1.0E-50, while 39% sequences had a threshold E-value less than 1.0E-50 that showed strong homology (Figure 2A). The species that provided the best BLASTx matches (first hit) was Oryza sativa, and there were more than ten thousand genes with the highest homology. The next closest species was A. thaliana, which showed 21% homology with Anthurium andraeanum (Figure 2B).
Figure 2. Characteristics of homology search of query sequences aligned by BLASTx to the nr database. (A) E-value distribution of unigenes BLASTx hits in the nr database with an E-value threshold of 1.0E-5. (B) Species distribution of the first BLAST hits for each sequence with a cut-off E-value of 1.0E-5.
We used Gene Ontology (GO) to classify functions of the annotated genes. Using Blast2GO suite, 13,705 sequences (31%) were categorized to three different GO trees (biological process, cellular component and molecular function). The three main categories were further classified to 54 functional groups. Seven groups, ‘cellular process’, ‘metabolic process’, ‘cell’, ‘cell part’, ‘organelle’, ‘binding’ and ‘catalytic’ are dominant clusters in GO classification (Table 2; Figure 3). We further performed phylogenetic classification using clusters of Orthologous Groups (COG) database. In total, 23,367 genes were matched, and they were grouped into 25 functional classes. The cluster for ‘General function prediction only’ (3,121) and ‘Transcription’ (2,276) were the two largest groups in percentage, which represent 13% and 10% respectively (Table 2; Figure 4). To make further understanding of the transcriptome data, we carried out pathway analysis with Kyoto Encyclopedia of Genes and Genomes (KEGG) mapping of the Anthurium andraeanum transcriptome. Totally 14,105 sequences were identified with pathway annotation, and they were functionally assigned to 125 KEGG pathways. The ‘metabolic pathways’ contributed to the greatest parts (3,473 members, 24.62%), followed by ‘spliceosome’ (1,320 members, 9.36%) and ‘plant-pathogen interaction’ (1,180 members, 8.37%) (Table 2; Additional file 2).
Table 2. All-in-one list of Anthurium transcriptome annotations
Figure 3. Histogram of Gene Ontology (GO) classifications. Biological process (brown), Cellular components (yellow) and Molecular function (blue). Percentages indicate the proportion of unigenes that have GO annotations.
Figure 4. Distribution of genes in the transcriptome with COG functional classification. 23367 sequences have a COG classification among 25 categories.
Format: XLSX Size: 18KB Download file
Digital gene expression (DGE) library sequencing
To generate digital expression signatures for Anthurium plants under low temperature, we used Solexa (Illumina) technology for sequencing. Briefly, using DGE technology, we sequenced six libraries: control plant samples before transferred from 23 to 6°C (warm sample), and cold-treated samples at 1, 2, 5, 10, 24 h under 6°C after transfer. To facilitate our analysis, three cold-treated samples (1, 5 and 24 h) were selected for comparation with control plants in the following analysis. In total, the DGE libraries generated between 4.6 and 4.9 million raw reads. After removing low quality reads, total number of clean tags per library ranged from 4.2 to 4.5 million, and number of clean tags entitles with unique nucleotide sequences ranged from 1,327,437 to 2,008,977 (Table 3). To evaluate the normality of the DGE data, we performed analysis of the distribution of tag expression. As shown in Figure 5, the distribution of total tags and distinct clean tags over different tag-abundance categories displayed similar patterns for all four DGE libraries. Small number categories of mRNA had high abundance, while the rest majority stays at a very low level of expression.
Table 3. Statistics of DGE sequencing of Anthurium leaves under low temperature
Figure 5. Distribution of total tags and distinct tags of DGE sequencing. The left figure is the distribution of total tag number, and the right figure is the distribution of distinct clean tags. Numbers in the brackets indicates the range of copy numbers.
Mapping DGE sequences to the reference transcriptome database
To reveal the molecular events of the DGE profiles, application of the transcriptome data was convened into further study of its digital gene expression signatures under low temperature in Anthurium plants. We mapped the tag sequences of the four DGE libraries to our transcriptome reference database. The reference database contained 44,382 unigenes. Among the generated distinct tags, 41,139 to 57,996 distinct tags were mapped to a gene in the reference transcriptome database. Over 40% sequences in the reference tag database could be mapped to a unique tag (Table 3). The saturation analysis was performed to check whether the number of detected genes increased proportionally to the sequence amount. Additional file 3 showed the trend of saturation, and the number of detected genes almost ceased to increase when sequencing amount reached 2 million or higher. The gene expression level was determined by number concentrations using TPM (number of transcripts per million clean tags). The normalized final counts were measured. As summarized in Additional file 4, only a small number of genes were highly expressed, most of the genes had fewer than 10 copies.
Format: PDF Size: 133KB Download file
This file can be viewed with: Adobe Acrobat Reader
Additional file 4. The gene expression level in DGE libraries. TPM (Transcripts Per Million clean tags) is a standardized indicator, pointing out number of transcript copies in every 1 million clean tags. The number of unambiguous clean tags for each gene was calculated and then normalized to TPM.
Format: XLSX Size: 261KB Download file
Analysis of differential gene expression
A total of 4,363 significantly changed genes were detected between cold-treated and control libraries. We found that over 50% of the differentially expressed genes had no homologues in the NCBI database. Screening of differentially expressed genes, the results showed that both up- and down-regulation of gene expression occur, but that, the expression of genes changed over time of cold treatments. Among all differentially expressed genes, 292 were quickly induced by cold while 1,743 were down-regulated at 1 h (Figure 6; Additional file 5). After 5-h cold treatment, 805 genes were up-regulated, and 1,768 were down-regulated (Figure 6; Additional file 6). After 24-h cold stress treatment, 708 genes were up-regulated, and 1,158 were down-regulated (Figure 6; Additional file 7).
Figure 6. Differentially expressed genes (DEGs) for Anthurium leaves under low temperature. The number of up-regulated and down-regulated genes in cold treated plants was shown. We use “FDR (False Discovery Rate) ≤ 0.001 and the absolute value of log2Ratio ≥ 1” as the threshold to judge the significance of gene expression difference. Control plants were grown at 23°C. 1 h, 5 h, 24 h, plants were treated under 6°C low temperature for 1 h, 5 h and 24 h, respectively.
Additional file 5. The differentially expressed genes between 1-h cold treatment and control plants. TPM: transcript copies per million tags. FDR: false discovery rate. We used “FDR ≤ 0.001 and the absolute value of log2Ratio ≥ 1” as the threshold to judge the significance of gene expression difference.
Format: XLSX Size: 355KB Download file
Additional file 6. The differentially expressed genes between 5-h cold treatment and control plants. TPM: transcript copies per million tags. FDR: false discovery rate. We used “FDR ≤ 0.001 and the absolute value of log2Ratio ≥ 1” as the threshold to judge the significance of gene expression difference.
Format: XLSX Size: 252KB Download file
Additional file 7. The differentially expressed genes between 24-h cold treatment and control plants. TPM: transcript copies per million tags. FDR: false discovery rate. We used “FDR ≤ 0.001 and the absolute value of log2Ratio ≥ 1” as the threshold to judge the significance of gene expression difference.
Format: PDF Size: 122KB Download file
This file can be viewed with: Adobe Acrobat Reader
Metabolic pathway by KEGG analysis of cold up-regulated genes
In this study, we concentrated on up-regulated genes by cold stress, and genes with expression ratios greater than 2 times compared with unstressed plants were defined as cold-inducible genes. To understand the biological function of these genes, we mapped the cold-induced genes to terms in KEGG database, with a view to identifies significantly enriched metabolic pathways or signal transduction pathways in DEGs comparing with the whole genome background. Among the mapped pathways, six pathway were significantly enriched (Qvalue ≤ 0.05) after the 1-h cold treatment. Notably, specific enrichment was observed in photosynthesis pathway, metabolic pathways and oxidative phosphorylation pathway in 1-h cold-treated plants. After a 5-h cold treatment, the metabolic pathways and oxidative phosphorylation pathway were significantly identified as the top two pathways. After 24-h cold treatment, we found that transcripts involved in mRNA surveillance pathway, RNA transport pathway and plant-pathogen interaction pathway were significantly enriched (Table 4).
Table 4. Significantly enriched pathways of up-regulated genes by cold stress in Anthurium
Analysis of cold up-regulated genes identified as putative transcription factors
Given that transcription factors appear to have a major effect on the network of cold-responsive genes, an objective of our work was to identify cold-inducible transcription factors. Together, a total of 39 up-regulated genes encode known or putative transcription factors. As shown in Table 5, twelve genes were characterized as AP2/ERF family transcription factors, constituting a large proportion of cold-inducible transcription factors. Among these genes, three dehydration-responsive element binding proteins (DREB) were found to be significantly induced by cold stress in Anthurium. A subset of Zinc figure family members, including WRKY subfamily, were involved in responses to cold stimuli. Next, relative small group of NAC, MYB and bZIP transcription factors family members was characterized as cold-inducible transcription factors (Table 5). To further evaluate the role of these transcription factors, we analyzed the expression level of certain transcription factors by calculating the number of unambiguous clean tags and then normalizing to TPM (transcript copies per million clean tags). As shown in Figure 7, these transcription factor genes were significantly up-regulated by cold stress. However, the induction model varied among transcription factor genes. For example, the expression of AP2/ERF1 and Znf1 was induced shortly after starting cold treatment, and the expression increased gradually over the treatment time. AP2/ERF2 and Znf2 had an expression peak at 1 h, and then the expressions decreased quickly to a relative low level.
Table 5. Transcription factors up-regulated by cold stress in Anthurium
Figure 7. Analysis of differentially expressed transcription factor genes in Anthurium under low temperature. The gene expression levels were determined by calculating the number of unambiguous tags for each gene and then normalized to TPM (transcript copies per million tags).
Evaluation of de novo transcriptome assembly quality
Despite the great advances in DNA sequencing technologies, no genomic data is available to date for Anthurium plants. With the development of RNA-seq, transcriptome analysis has become an attractive alternative for in-depth analysis at high resolution. In this study, we carried out de novo transcriptome assembly using short-read (Illumina) sequencing. To date, there are many studies using the Roche 454 GS platform [18,19], which has relative longer reads but at the expense of less sequence data per run. However, the Illumina generates greater depth of sequencing, ensuring more complete coverage of the transcriptome comparing with 454 GS platform. In this study, despite using shorter reads, our assembly is comparable to other published transcriptomes using 454 GS platform. As shown in Figure 1, more than13% unigenes were greater than 1 kb, and 34% unigenes were greater than 500 bp. These results demonstrated the effectiveness of the assembly in capturing a large portion of the transcriptome. Another useful metric is the proportion of the unigenes and its corresponding BLAST hit. Due to lack of genomic resources for Anthurium, the proportions of unigenes that have significant similarity to known proteins in GenBank were considered as ‘gold standard’ reference in our studies. Nearly 61.73% of our unigenes had matches in nr protein database, and this value was as high as other comparable statistics reported in de novo assemblies.
Analysis of differentially expressed genes
It has long been known that extensive changes of gene expression occur when plants are exposed to cold stress . Generally, both up- and down-regulation of gene expression occur under cold conditions. It was reported that more genes were up-regulated than down-regulated under cold stress in model plants . In A. thaliana, it was reported that 302 genes were found to be cold responsive, and 88 (27%) decreased in abundance . In wheat, over 2% of the whole genome showed altered levels of expression in response to cold stress. Among these genes, 1,711 genes were induced by a cold shock, and 1,402 were down-regulated . In this study, 4,363 differently expressed genes were identified in Anthurium, and the relative genomic proportion devoted to cold stress is unknown due to the lack of genome resources. About 30% genes were found to be cold inducible, more than 70% decreased in abundance. These results suggested that plants responses vary in their abilities to adapt to cold stress. Gene expression in tropical plants might extensively differ from that in freeze-tolerant model plants conferring their response to cold stimuli. Among the differentially expressed genes regulated by cold stress in Anthurium, over 50% of them had no homologues in the NCBI database. Some of these genes might represent new cold-responsible transcripts which have not been reported in model plants.
Photosynthesis is highly sensitive to changing temperatures. In barley, it has been reported that light and photosynthetic activity play an important role in plant frost resistance under cold conditions . In meadow fescue, about 50% of the differentially changed proteins were involved in photosynthesis during cold conditions . In this study, to understand the biological function of the genes in Anthurium, we mapped the differentially expressed genes to terms in KEGG database. The result showed that photosynthesis pathway was significantly enriched after 1-h cold treatment, suggesting a role in early response to low temperature. Detailed information of photosynthesis pathway in KEGG database indicated that photosystems I and II (PSI and PSII) were rapidly influenced by cold stress (see Additional file 8). The electrons from PSI and PSII might transfer to oxygen, and thereby leads to substantial generation of reactive oxygen species. Enrichment of oxidative phosphorylation pathway after 1-h cold treatment suggested that light-induced ROS production might occur. However, this hypothesis still lack of experimental evidence.
Additional file 8. Significantly enriched pathways by 1-h cold stress in Anthurium. (A) Significantly enriched pathways of differentially expressed genes by 1-h cold stress in Anthurium. (B) Detailed information of photosynthesis pathway in KEGG database.
Format: PDF Size: 248KB Download file
This file can be viewed with: Adobe Acrobat Reader
Major transcription factors of cold stress response
Plants have evolved physiological and biochemical ways in order to cope with cold stress. These responses require expression of large-scale cold-responsive genes. In model plants, transcription factors have been described to control the network of many target genes through direct binding to cis-elements of gene promoter regions [22-24]. In Anthurium, 39 up-regulated genes were identified as putative transcription factors, including AP2/ERF, Zn-finger, WRKY, NAC, MYB, bZIP families. These transcription factors exhibited different induction patterns over time periods (Table 5; Figure 7). Based on our results of Anthurium plants, the ‘early rapid response’ (phase-1) to cold stress might occurred within the initial 1 hour, while the ‘early slow response’ (phase-2) occurred between 1 and 5 hours. Majority of cold up-regulated transcription factors were activated at phase-1 and phase-2, and the transcription factors activated at phase-1 might play a key role in the activation of the coordinated expression of down-stream genes. Fewer transcription factors exhibited ‘late response’ profiles (phase-3) until 24 hours, and these transcription factors were likely controlled by phase-1 transcription factors.
One group of the most studied transcription factors involved in cold responses is the AP2/ERF family transcription factors, which includes four major subfamilies: the AP2, ERF, RAV and dehydration-responsive element-binding protein (DREB/CBF) subfamilies . Of these, the DREB subfamily has been described as major factors involved in plant cold-stress responses . The role of DREBs/CBFs has been well established in numerous plants, such as A. thaliana, rice , maize  and wheat . The DREB1s/CBFs expression levels were positively correlated with cold tolerance and it activate the expression of downstream cold-responsive genes via specific binding to the DRE/CRT cis-acting element in their promoters . In Anthurium, three DREB genes were found to be quickly and significantly induced by cold stress. The well-established DREB1/CBF pathway might functionally act in Anthurium plants.
Although the DREB1/CBF regulon appeared to be one of the main regulatory pathways involved in cold stress, and multiple low-temperature regulatory pathways in addition to the DREB1/CBF pathway were also studied. In A. thaliana, transcriptome-profiling indicated at least 28% of the cold-responsive genes were independent of the CBF cold response pathway, including at least 15 transcription factors . In wheat, at least one-third of the cold-inducible genes were not responsive to CBF regulation . Discovery of novel regulators and regulons by cold stress has always been a continuing story. Recently, several zinc-figure transcription factors were discovered to be novel regulators in A. thaliana and rice [30,31]. A novel MYBS3 transcription factor was identified in rice, and it could repress the well-known DREB1/CBF dependent signaling pathway . In Anthurium plants, many transcription factors in addition to AP2/ERF family were actively induced by cold stress. Notably, zinc-figure proteins constitute the largest cold-inducible transcription factor family within the initial 1 hour in Anthurium (Table 5). Early activation of these transcription factors implied that certain zinc-figure proteins might acts in parallel to the DREB /CBF pathway. Certain regulatory mechanisms might exist involving these transcription factors and their regulons in cold stress. Further investigation should focus on the role of these transcription factors and the related pathways under abiotic stress.
In this study, we used high-throughput sequencing data to characterize the transcriptome of Anthurium, a species for which little genomic data are available. Further, DGE tags were mapped to the assembled transcriptome for further gene expression analysis. A large number of candidate genes involved in cold stress were identified. This represents a fully characterized transcriptome, and provides a valuable resource for genetic and genomic studies in Anthurium plants. Additionally, DGE profiling provides new leads for functionally studies of genes involved in abiotic stress.
Anthurium cv. Alabama were used as experimental materials. Micropropagated plants were transferred and grown in the greenhouse for three month in Zhejiang academy of agricultural sciences. Control plants were then grown in a growth chamber under continuous light (20–40 μm sec-1 m-2) at 23°C. For cold treatments, plants were transferred to 6°C under the same light source. Fresh tissues were collected at various periods (1, 2, 5, 10 and 24 h). Samples of stems and leaves were separated and frozen in liquid nitrogen for further analysis.
RNA extraction and library preparation for tanscriptome
Total RNA was isolated using TRIzol reagent (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. Poly (A) + mRNA was purified with oligo (dT) beads, and then the mRNA was randomly segmented into small fragments by divalent cations (Fragmentation Buffer, Illumina, Hayward, CA) at 94°C for 5 min. These short fragments were used as templates to synthesize the first-strand cDNA using random hexamer primers. The second-strand cDNA was synthesized using RNaseH and DNA polymerase I. Short cDNA fragments were purified with QiaQuick PCR extraction kit. After that, the cDNA fragments were connected with sequencing adapters according to Illumina’s protocol (San Diego, CA USA). After agarose gel electrophoresis, the target fragments of 300–500 bp were selected for PCR amplification to create the final cDNA library.
Sequencing, de novo assembly and functional annotation
A mixed cDNA sample of control plants and 1, 2, 5, 10 and 24 h cold-treated plants were prepared for sequencing. The library was sequenced using Illumina HiSeq™ 2000. Raw reads were cleaned by removing adaptor sequences. Empty reads and reads with unknown sequences ‘N’ were removed before data analysis. Then de novo assembly was carried out with short reads assembling program-SOAPdenovo . Firstly, short reads were combined with certain length of overlap to form longer fragments without N, which were called contigs. Then the reads were mapped back to contigs, and congtigs from the same transcript as well as the distances between these contigs were detected with paired-end reads. Next, SOAPdenovo connected the contigs using N to represent unknown sequences between each two contigs, and then scaffolds were made. Paired-end reads were used again for gap filling of scaffolds to get sequences with least Ns and cannot be extended on either end. The final sequences were defined as unigenes. After assembly, unigenes were compared to the non-redundant (nr) protein database with a cut-off E-value of 10-5. The Gene Ontology  was analyzed using Blast2go software . In a final step, the COG  and KEGG pathway  annotation was performed, and the best aligning results were used to decide sequence direction of unigenes. The raw data are available in the ArrayExpress database (http://www.ebi.ac.uk/arrayexpress webcite) under accession number E-MTAB-1955.
Digital gene expression library preparation and sequencing
Total RNAs were extracted from cold-treated and control plants. About 6 μg of total RNA was prepared, and treated with Oligo (dT) magnetic beads adsorption to purify mRNA. Then the first- and second- strand cDNA were synthesized using Oligo (dT) as a primer. The 5′-ends of tags were generated by restriction enzyme NlaIII, which recognize and cut off the CATG sites. Subsequently, the cDNA fragments connected to Oligo (dT) beads were purified, and the Illumina adaptor 1 was ligated to the 5′ end of the digested cDNA fragments. The junction of Illumina adaptor 1 and CATG site was the recognition site of endonuclease MmeI. MmeI cut at 17 bp downstream of the CATG site, producing tags with adaptor 1. After removal of 3′ fragments with magnetic beads precipitation, Illumina adaptor 2 was introduced to 3′ ends of tags, acquiring 21 bp tags with different adaptors at both ends to form a tag library. After 15 cycles of linear PCR amplification, 105 bp fragments were purified by TBE-polyacrylamide gel electrophoresis. After denaturation, the single-chain molecules were fixed onto the Illumina Sequencing Chip. Each molecule grew into a single-molecule cluster sequencing template through in situ amplification. Then four types of nucleotides labeled by four colors were added, and sequencing was performed with the method of sequencing by synthesis (SBS). Each tunnel generated millions of raw reads with sequencing length of 49 bp. The DGE data are available in the ArrayExpress database (http://www.ebi.ac.uk/arrayexpress webcite) under accession number E-MTAB-1955.
Analysis and mapping of DGE tags
Sequencing-received raw image data was transformed by base calling into sequence data, which was called raw data or raw reads. Raw sequences have 3′ adaptor fragments as well as a few low quality sequences. Therefore, several types of impurities were filtered. The raw sequences were transformed into clean tags by removing adaptor sequences, empty reads and low quality sequences (reads with unknown sequences ‘N’). A virtual database containing all the possible CATG + 17 bases length sequences of the reference gene sequences was created. All clean tags were mapped to the reference sequences and only 1 bp mismatch was considered. Clean tags mapped to reference sequences from multiple genes were filtered. Remainder clean tags were designed as unambiguous clean tags. The number of unambiguous clean tags for each gene was calculated and then normalized to TPM (number of transcripts per million clean tags). A rigorous algorithm was used to identify differentially expressed genes between two samples . FDR (false discovery rate) was used to determine the threshold of P value in multiple analyses. We used “FDR ≤ 0.001 and the absolute value of log2Ratio (cold/control) ≥1” as the threshold to judge the significance of gene expression difference. More stringent criteria with smaller FDR and bigger fold-change value can be used to identify DEGs. For pathway enrichment analysis, we looked for significantly enriched metabolic pathways or signal transduction pathways in DEGs comparing with the whole genome background.
The authors declare that they have no competing interests.
XJL and DQT conceived and designed the experimental plan. XYP, YMY, WYW performed experiments. FZ, YYG and XLS analyzed and interpreted the sequence data. FQS and XJL drafted the manuscript. All authors read and approved the final manuscript.
We are grateful to Dr. Qiang Wei of Department of Molecular Physiology and Biophysics, Center for Human Genetics Research, Vanderbilt University, for kind help with bioinformatics analysis. This study is supported by the Natural Science Foundation of Zhejiang Province (LQ12C15003).
Biol Plantarum 2008, 52:601-615. Publisher Full Text
Kosmala A, Bocian A, Rapacz M, Jurczyk B, Zwierzykowski Z: Identification of leaf proteins differentially accumulated during cold acclimation between Festuca pratensis plants with distinct levels of frost tolerance.
Maruyama K, Takeda M, Kidokoro S, Yamada K, Sakuma Y, Urano K, Fujita M, Yoshiwara K, Matsukura S, Morishita Y, Sasaki R, Suzuki H, Saito K, Shibata D, Shinozaki K, Yamaguchi-Shinozaki K: Metabolic pathways involved in cold acclimation identified by integrated analysis of metabolites and transcripts regulated by DREB1A and DREB2A.
Physiol Plantarum 2006, 126:62-71. Publisher Full Text
Seki M, Narusaka M, Ishida J, Nanjo T, Fujita M, Oono Y, Kamiya A, Nakajima M, Enju A, Sakurai T, Satou M, Akiyama K, Taji T, Yamaguchi-Shinozaki K, Carninci P, Kawai J, Hayashizaki Y, Shinozaki K: Monitoring the expression profiles of 7000 Arabidopsis genes under drought, cold and high-salinity stresses using a full-length cDNA microarray.
Kaplan B, Davydov O, Knight H, Galon Y, Knight MR, Fluhr R, Fromm H: Rapid transcriptome changes induced by cytosolic Ca2+ transients reveal ABRE-related sequences as Ca2+-responsive cis elements in Arabidopsis.
Ramsey SA, Klemm SL, Zak DE, Kennedy KA, Thorsson V, Li B, Gilchrist M, Gold ES, Johnson CD, Litvak V, Navarro G, Roach JC, Rosenberger CM, Rust AG, Yudkovsky N, Aderem A, Shmulevich I: Uncovering a macrophage transcriptional program by integrating evidence from motif scanning and expression dynamics.
Amit I, Garber M, Chevrier N, Leite AP, Donner Y, Eisenhaure T, Guttman M, Grenier JK, Li WB, Zuk O, Schubert LA, Birditt B, Shay T, Goren A, Zhang X, Smith Z, Deering R, McDonald RC, Cabili M, Bernstein BE, Rinn JL, Meissner A, Root DE, Hacohen N, Regev A: Unbiased reconstruction of a mammalian transcriptional network mediating pathogen responses.
Monroy AF, Dryanova A, Malette B, Oren DH, Farajalla MR, Liu W, Danyluk J, Ubayasena LWC, Kane K, Scoles GJ, Sarhan F, Gulick PJ: Regulatory gene candidates and gene expression analysis of cold acclimation in winter and spring wheat.
Wall PK, Leebens-Mack J, Chanderbali AS, Barakat A, Wolcott E, Liang H, Landherr L, Tomsho LP, Hu Y, Carlson JE, Ma H, Schuster SC, Soltis DE, Soltis PS, Altman N, de Pamphilis CW: Comparison of next generation sequencing technologies for transcriptome characterization.
Barakat A, DiLoreto DS, Zhang Y, Smith C, Baier K, Powell WA, Wheeler N, Sederoff R, Carlson JE: Comparison of the transcriptomes of American chestnut (Castanea dentata) and Chinese chestnut (Castanea mollissima) in response to the chestnut blight infection.
Gilmour SJ, Zarka DG, Stockinger EJ, Salazar MP, Houghton JM, Thomashow MF: Low temperature regulation of the Arabidopsis CBF family of AP2 transcriptional activators as an early step in cold-induced COR gene expression.
Taji T, Ohsumi C, Iuchi S, Seki M, Kasuga M, Kobayashi M, Yamaguchi-Shinozaki K, Shinozaki K: Important roles of drought- and cold-inducible genes for galactinol synthase in stress tolerance in Arabidopsis thaliana.
Dubouzet JG, Sakuma Y, Ito Y, Kasuga M, Dubouzet EG, Miura S, Seki M, Shinozaki K, Yamaguchi-Shinozaki K: OsDREB genes in rice, Oryza sativa L., encode transcription activators that function in drought-, high-salt- and cold-responsive gene expression.
Biol Plantarum 2009, 53:475-483. Publisher Full Text
Physiol Plant 2006, 126:72-80. Publisher Full Text
Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, He G, Chen Y, Pan Q, Liu Y, Tang J, Wu G, Zhang H, Shi Y, Liu Y, Yu C, Wang B, Lu Y, Han C, Cheung DW, Yiu SM, Peng S, Xiaoqian Z, Liu G, Liao X, Li Y, Yang H, Wang J, Lam TW, Wang J: SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler.
Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, Kiryutin B, Galperin MY, Fedorova ND, Koonin EV: The COG database: new developments in phylogenetic classification of proteins from complete genomes.