Quality assessment of genome assembly using CEGMA for five PST isolates. Of the 248 core eukaryotic genes (CEGs), 88.7% could be identified in the three PST US isolate genomes (PST-130, PST-43 and PST-21). The CEGMA pipeline distinguishes between CEGs found as complete copies (A) and those found as partial fragments (B), and groups the CEGs by level of conservation across higher eukaryotes, with group 4 being the most conserved. Complete gene coverage was high for all US isolates, indicating that few CEGs were split across contigs. For the two UK isolates (PST-08/21 and PST-87/7), complete gene coverage was lower than partial gene coverage, indicating greater fragmentation of these genomes.

Cantu et al. BMC Genomics 2013, 14:270. doi:10.1186/1471-2164-14-270