This article is part of the supplement: 22nd International Conference on Genome Informatics: Bioinformatics

Open Access Highly Accessed Proceedings

Optimizing de novo transcriptome assembly from short-read RNA-Seq data: a comparative study

Qiong-Yi Zhao1, Yi Wang2, Yi-Meng Kong1, Da Luo3, Xuan Li1* and Pei Hao4*

Author Affiliations

1 Key Laboratory of Synthetic Biology, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China

2 Institute of Massive Computing, Software Engineering Institute, East China Normal University, 3663 North Zhongshan Road, Shanghai, 200062, China

3 State Key Laboratory of Biocontrol, Sun Yat Sen University, Guangzhou, 510275, China

4 Shanghai Center for Bioinformation Technology, 100 Qinzhou Road, Shanghai, 200235, China

BMC Bioinformatics 2011, 12(Suppl 14):S2  doi:10.1186/1471-2105-12-S14-S2

Published: 14 December 2011

Additional files

Additional file 1:

Basic statistics for de novo assembly with D. melanogaster data sets. The outcomes of transcript assemblies by each method: SOAPdenovo, SOAPdenovo-MK, ABySS, trans-ABySS, Oases, Oases-MK and Trinity. Only assembled transcripts of at least 100 bases are included. Low-quality transcripts are defined as transcripts with more than 5% ambiguous nucleotides. #Because Trinity does not yet include a scaffolding step, the low-quality transcript measure for Trinity is shown as “-”.

Format: XLSX Size: 15KB

Open Data
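
The inclusion and quality criteria described for Additional file 1 (transcripts of at least 100 bases; low quality if more than 5% of nucleotides are ambiguous) can be illustrated with a minimal sketch. This is not the authors' code. The function names are hypothetical, and treating every character other than A, C, G or T as ambiguous is an assumption made here for illustration.

```python
def ambiguous_fraction(seq: str) -> float:
    """Fraction of bases that are not A, C, G or T.

    Assumption: any other character (for example N) counts as ambiguous.
    """
    if not seq:
        return 0.0
    unambiguous = set("ACGT")
    n_ambiguous = sum(1 for base in seq.upper() if base not in unambiguous)
    return n_ambiguous / len(seq)


def classify_transcripts(transcripts, min_length=100, max_ambiguous=0.05):
    """Split transcripts into those kept and those flagged as low quality.

    transcripts: dict mapping a transcript name to its nucleotide sequence.
    Returns (kept, low_quality), two lists of names; low_quality is a
    subset of kept.
    """
    kept, low_quality = [], []
    for name, seq in transcripts.items():
        if len(seq) < min_length:
            continue  # shorter than the 100-base cutoff, not counted
        kept.append(name)
        if ambiguous_fraction(seq) > max_ambiguous:
            low_quality.append(name)  # more than 5% ambiguous nucleotides
    return kept, low_quality


# Made-up sequences, for illustration only.
example = {
    "t1": "ACGT" * 30,              # 120 bases, no ambiguous bases
    "t2": "ACGT" * 20,              # 80 bases, below the length cutoff
    "t3": "ACGT" * 25 + "N" * 10,   # 110 bases, 10 of them ambiguous
}
kept, low_quality = classify_transcripts(example)
```

With these made-up inputs, t2 is excluded for length, and t3 is kept but flagged as low quality because 10 of its 110 bases are ambiguous.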

Additional file 2:

Basic statistics for de novo assembly with S. pombe data sets. The outcomes of transcript assemblies by each method are shown.

Format: XLSX Size: 14KB

Open Data

Additional file 3:

Basic assembly statistics and BLASTX hits to the UniProt database using the C. sinensis 2.3g data set. The outcomes of transcript assemblies by each method and the measurements from the previous study are shown. §Measurements that are not available in the previous study are shown as “-” in the table.

Format: XLSX Size: 13KB

Open Data

Additional file 4:

List of C4H-related transcripts assembled by Trinity and Oases-MK. BLASTX results against the KEGG database with E-value ≤ 1.0e-5 are shown; only transcripts whose top BLASTX hit is cinnamate 4-hydroxylase (EC 1.14.13.11) are listed.

Format: XLSX Size: 16KB

Open Data

Additional file 5:

Sequences of C4H-related transcripts assembled by Trinity and Oases-MK. FASTA-formatted sequences of the C4H-related transcripts listed in Additional file 4 are shown.

Format: TXT Size: 26KB

Open Data