Open Access Highly Accessed Research article

An efficient approach to finding Siraitia grosvenorii triterpene biosynthetic genes by RNA-seq and digital gene expression analysis

Qi Tang1,2, Xiaojun Ma1,2*, Changming Mo2, Iain W Wilson3, Cai Song2, Huan Zhao1, Yanfang Yang4, Wei Fu1 and Deyou Qiu4*

Author Affiliations

1 Institute of Medicinal Plant, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing 100193, China

2 Guangxi Branch Institute, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Nanning 530023, China

3 CSIRO Plant Industry, PO Box 1600, Canberra ACT 2001, Australia

4 The Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China

BMC Genomics 2011, 12:343  doi:10.1186/1471-2164-12-343

Published: 5 July 2011

Additional files

Additional file 1:

Overview of Siraitia grosvenorii transcriptome sequencing and assembly. (A) Size distribution of Illumina sequencing contigs. (B) Size distribution of Illumina sequencing scaffolds, including those obtained after paired-end joining and gap filling. (C) Size distribution of Illumina sequencing unigenes, including those obtained after paired-end joining and gap filling.

Format: XLS Size: 22KB

Open Data

Additional file 2:

Top BLAST hits from NCBI nr database. BLAST results against the NCBI nr database are shown for all distinct sequences with an E-value below the 10^-5 cut-off.

Format: XLS Size: 12.4MB

Additional file 3:

Top BLAST hits from NCBI Swissprot database. BLAST results against the NCBI Swissprot database are shown for all distinct sequences with an E-value below the 10^-5 cut-off.

Format: XLS Size: 8.4MB

Additional file 4:

Distribution of total tags and distinct tags over different tag abundance categories. (A) Distribution of total clean tags. Numbers in square brackets indicate the range of copy numbers for a specific category of tags. For example, [2,5] means that all the tags in this category have 2 to 5 copies. Numbers in parentheses show the total tag copy number and ratio for all the tags in that category. (B) Distribution of distinct clean tags. Numbers in square brackets indicate the range of copy numbers for a specific category of tags. Numbers in parentheses show the total number of tag types in that category.
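The two panels count the same tags in different ways, which a small sketch may make concrete. The following is a minimal illustration, not the authors' pipeline: the bin edges other than [2,5], the function names, and the example tag sequences are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical abundance categories; only [2,5] is taken from the caption.
BINS = [(2, 5), (6, 10), (11, 20), (21, 50), (51, 100), (101, None)]


def bin_label(copies):
    """Return the category label for a copy number, or None if it fits no bin."""
    for low, high in BINS:
        if copies >= low and (high is None or copies <= high):
            return f"[{low},{high}]" if high is not None else f">{low - 1}"
    return None


def summarize(tag_counts):
    """Return (total, distinct) Counters keyed by category label.

    total    sums the copy numbers of all tags in a category (panel A).
    distinct counts how many different tag sequences fall in it (panel B).
    """
    total, distinct = Counter(), Counter()
    for _tag, copies in tag_counts.items():
        label = bin_label(copies)
        if label is None:
            continue  # e.g. single-copy tags, which fall outside the assumed bins
        total[label] += copies
        distinct[label] += 1
    return total, distinct


# Made-up tags and copy numbers, for illustration only.
example = {"AAAA": 3, "CCCC": 5, "GGGG": 7, "TTTT": 150, "ACGT": 1}
total, distinct = summarize(example)
```

Here the [2,5] category holds two distinct tags with eight copies in total, which is the distinction drawn between panels (B) and (A).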

Format: TIFF Size: 233KB

Additional file 5:

Differentially expressed genes between 3 DAF and 50 DAF. TPM: transcript copies per million tags. Raw intensity: the total number of tags sequenced for each gene. FDR: false discovery rate. We used FDR < 0.001 and an absolute value of log2Ratio ≥ 1 as the threshold for judging the significance of a difference in gene expression. To calculate the log2Ratio and FDR, we used a TPM value of 0.001 instead of 0 for genes that are not expressed in one sample.
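The threshold described in this caption can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the ratio is assumed to be the later sample over the earlier one, the cut-off is taken as an absolute log2Ratio of at least 1 (a two-fold change), and the FDR is assumed to be supplied from a separate calculation.

```python
import math


def tpm(raw_count, total_clean_tags):
    """Transcript copies per million tags for one gene in one sample."""
    return raw_count * 1_000_000 / total_clean_tags


def log2_ratio(tpm_early, tpm_late, floor=0.001):
    """log2(late / early), replacing a TPM of 0 with 0.001 as in the caption."""
    early = tpm_early if tpm_early > 0 else floor
    late = tpm_late if tpm_late > 0 else floor
    return math.log2(late / early)


def is_differentially_expressed(tpm_early, tpm_late, fdr,
                                fdr_cutoff=0.001, min_abs_log2=1.0):
    """Apply FDR < 0.001 and |log2Ratio| >= 1 (assumed direction: late / early)."""
    ratio = log2_ratio(tpm_early, tpm_late)
    return fdr < fdr_cutoff and abs(ratio) >= min_abs_log2
```

For instance, a gene with a TPM of 10 in the earlier sample and 40 in the later one has a log2Ratio of 2 and would pass if its FDR were below 0.001. A gene with a TPM of 0 in one sample is handled through the 0.001 substitution rather than by dividing by zero.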

Format: XLS Size: 1.3MB

Additional file 6:

Differentially expressed genes between 3 DAF and 70 DAF

Format: XLS Size: 1.3MB

Additional file 7:

Differentially expressed genes between 50 DAF and 70 DAF

Format: XLS Size: 691KB

Additional file 8:

Putative CYP450 genes in the Solexa cDNA library

Format: XLS Size: 36KB

Additional file 9:

Putative UDPG genes in the Solexa cDNA library

Format: XLS Size: 35KB