Open Access Research article

Optimization of de novo transcriptome assembly from high-throughput short read sequencing data improves functional annotation for non-model organisms

Berat Z Haznedaroglu1, Darryl Reeves2, Hamid Rismani-Yazdi1,3 and Jordan Peccia1*

Author Affiliations

1 Department of Chemical and Environmental Engineering, Yale University, New Haven, CT 06511, USA

2 Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06511, USA

3 Now at the Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA

BMC Bioinformatics 2012, 13:170 doi:10.1186/1471-2105-13-170

Published: 18 July 2012

Additional files

Additional file 1:

This spreadsheet lists the annotated KOIs that are missing from the single k-mer assemblies (provided as separate tabs) but present in the clustered assembly obtained with CD-HIT-EST at a sequence identity threshold of 1.0.

Format: XLS, Size: 595KB

Open Data

Additional file 2:

This spreadsheet lists the annotated KOIs that are missing from the single k-mer assemblies (provided as separate tabs) but present in the clustered assembly obtained with the Oases multi-k option.

Format: XLS, Size: 622KB

Open Data

Additional file 3:

This spreadsheet lists the annotated KOIs that are missing from the clustered assembly obtained with CD-HIT-EST at a sequence identity threshold of 1.0 but present in the corresponding single k-mer assemblies (provided as separate tabs).

Format: XLS, Size: 69KB

Open Data

Additional file 4:

This spreadsheet lists the annotated KOIs that are missing from the clustered assembly obtained with the Oases multi-k option but present in the corresponding single k-mer assemblies (provided as separate tabs).

Format: XLS, Size: 84KB

Open Data

Additional file 5:

This file provides a representative workflow for generating an optimized de novo transcriptome assembly.

Format: PPT, Size: 200KB

Open Data

Additional file 6:

This file contains the in-house scripts used in the study.

Format: ZIP, Size: 441KB

Open Data