MAKER2 vs. ab initio predictors on second-generation genomes. We compared the performance of the ab initio predictor SNAP to the annotation pipeline MAKER2 on two second-generation genomes: L. humile (Argentine ant) and S. mediterranea (flatworm). Pfam domain content was used to evaluate the performance of these algorithms, under the assumption that a poorly annotated genome will be globally depleted for domains relative to well-annotated genomes. (A) The average Pfam domain content of six well-annotated eukaryotic reference proteomes: H. sapiens, M. musculus, D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae. These data provide an upper bound for the expected domain content of a newly sequenced genome. The region of the pie chart outlined in red indicates the percentage of genes containing a Pfam domain; these are further subdivided by GO molecular function. (B) The Pfam domain content of SNAP ab initio gene predictions compared to MAKER2-SNAP gene annotations for the L. humile genome. (C) The Pfam domain content of SNAP ab initio gene predictions compared to MAKER2-SNAP gene annotations for the S. mediterranea genome.
Holt and Yandell BMC Bioinformatics 2011 12:491 doi:10.1186/1471-2105-12-491
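The domain-content metric described in the caption can be illustrated with a minimal sketch: the fraction of genes in a gene set that contain at least one Pfam domain. This is not the authors' code. The function name `domain_content`, the input layout (gene IDs plus `(gene_id, pfam_accession)` pairs), and the toy data are all assumptions made for illustration; in practice the pairs would come from a domain search run against the predicted proteins.

```python
from typing import Iterable, Tuple


def domain_content(gene_ids: Iterable[str],
                   hits: Iterable[Tuple[str, str]]) -> float:
    """Return the fraction of genes with at least one Pfam domain hit.

    gene_ids: all genes in the gene set (hypothetical identifiers).
    hits: (gene_id, pfam_accession) pairs; a gene may appear several times.
    """
    genes = set(gene_ids)
    if not genes:
        raise ValueError("gene set is empty")
    # Count each gene once, no matter how many domains it carries,
    # and ignore hits for genes that are not in the gene set.
    genes_with_domain = {gene for gene, _pfam in hits if gene in genes}
    return len(genes_with_domain) / len(genes)


# Toy, made-up data (not results from the paper).
ab_initio_genes = ["g1", "g2", "g3", "g4"]
ab_initio_hits = [("g1", "PF00069"), ("g1", "PF07714")]

annotated_genes = ["a1", "a2", "a3", "a4"]
annotated_hits = [("a1", "PF00069"), ("a2", "PF00096"), ("a3", "PF00001")]

ab_initio_fraction = domain_content(ab_initio_genes, ab_initio_hits)
annotated_fraction = domain_content(annotated_genes, annotated_hits)
print(f"ab initio: {ab_initio_fraction:.2f}, annotated: {annotated_fraction:.2f}")
```

Under the caption's assumption, a gene set whose fraction falls well below that of the reference proteomes would be read as more poorly annotated. The sketch counts genes rather than domains, matching the "percentage of genes containing a Pfam domain" shown in the pie charts.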