Figure 5.

iMetAMOS classification output identifies possible contamination. For sample ERR233356, retrieved from the Sequence Read Archive, the majority of the data clearly originates from a Mycobacterium. However, a significant fraction of the data (~10% of reads or ~29% of the assembly) belongs to other, mostly unidentified, organisms. A subset comprising 3% of the reads (1.81 Mbp of the assembly) is identified as S. aureus and covers over 60% of the S. aureus genome. iMetAMOS automatically identified this potential contaminant and binned the contigs by genus so that the user can easily confirm and remove it.

Koren et al. BMC Bioinformatics 2014 15:126   doi:10.1186/1471-2105-15-126