Performance of various assembly algorithms. Assembled sequences were assessed by comparison to a reference set of 65 homoeologous triplets. Assemblers are abbreviated as A: ABySS, V: Velvet, O: Velvet/Oases and M: MIRA. Results for the transcript assembler Trinity (T), which became available more recently, are also shown for comparison. Subscripts on A, V, O and T indicate k-mer size; subscripts on M indicate assembly parameters as listed in Additional file 1: Table S1. Panel A shows the fraction of homoeologs identified (%ID >98%) vs. the fraction of contigs with evidence of chimeric assembly (for details, see Methods). A perfect assembly would appear near the bottom right-hand corner of the plot. Note the high number of chimeric assemblies, i.e. lack of homoeolog-specificity, exhibited by the de Bruijn graph-based Oases and Trinity assemblers. The larger k-mer sizes approach the average length of the Illumina reads, thereby decreasing the coverage per contig. Panel B shows the fraction of homoeologs identified (%ID >98%) plotted against the fraction of contigs with an alignment length larger than 50% of the relevant homoeolog length (see Methods), giving an indication of the fraction of sequence covered by individual contigs. In this panel, a perfect assembly would appear towards the top right-hand corner of the plot. Note that the Velvet/Oases assemblies tend to produce the longest contigs, but at the expense of homoeolog-specificity (Panel A).
Schreiber et al. BMC Genomics 2012 13:492 doi:10.1186/1471-2164-13-492
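The three quantities plotted in Panels A and B can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record type `ContigHit`, the function `panel_metrics` and all field names are hypothetical, the chimera flag is assumed to have been computed beforehand (the actual criterion is described in Methods and is not reproduced here), and the choice of denominators (all reference homoeologs for the identified fraction, all aligned contigs for the other two) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ContigHit:
    """Best alignment of one contig to a reference homoeolog (hypothetical record)."""
    contig_id: str
    homoeolog_id: str       # reference homoeolog the contig aligns to best
    pct_identity: float     # percent identity of that alignment
    aln_length: int         # alignment length in bases
    homoeolog_length: int   # length of the reference homoeolog in bases
    chimeric: bool          # assumed precomputed; criterion not modelled here


def panel_metrics(hits, n_reference_homoeologs,
                  id_threshold=98.0, cov_threshold=0.5):
    """Return (fraction identified, fraction chimeric, fraction long).

    fraction identified: reference homoeologs matched by at least one
        contig with percent identity above id_threshold.
    fraction chimeric:   contigs flagged as chimeric.
    fraction long:       contigs whose alignment spans more than
        cov_threshold of the matched homoeolog's length.
    """
    identified = {h.homoeolog_id for h in hits
                  if h.pct_identity > id_threshold}
    frac_identified = len(identified) / n_reference_homoeologs

    n_contigs = len(hits)
    if n_contigs == 0:
        return frac_identified, 0.0, 0.0

    frac_chimeric = sum(h.chimeric for h in hits) / n_contigs
    frac_long = sum(h.aln_length > cov_threshold * h.homoeolog_length
                    for h in hits) / n_contigs
    return frac_identified, frac_chimeric, frac_long
```

Under these assumptions, each assembly (one point in the figure) yields one such triple: the first and second values give its position in Panel A, and the first and third its position in Panel B. For a reference set of 65 triplets, `n_reference_homoeologs` would be 195 if every homoeolog is counted individually.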