Histogram of the observed number of genes in each sample frequency class compared with the number predicted by the finite supragenome model. The predicted number of genes for each frequency class was calculated using the results from our revised finite supragenome model (trained on all 17 strains). The observed and predicted numbers of core genes (2,266), which are found in all 17 strains, agreed exactly; these values are not shown to avoid distorting the scale of the graph. Distributed genes are those present in two or more strains but not in all of them (2 to 16 strains here).
Boissy et al. BMC Genomics 2011 12:187 doi:10.1186/1471-2164-12-187
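As a rough illustration of how the observed side of such a histogram could be tallied, the sketch below counts genes per frequency class from a presence/absence table. It is not the authors' code: the function names, the `presence` mapping, and the toy four-strain data are hypothetical, and it does not implement the finite supragenome model that produces the predicted counts.

```python
from collections import Counter


def frequency_class_counts(presence, n_strains):
    """Count genes per frequency class.

    presence: hypothetical mapping of gene id -> set of strain ids carrying it.
    Returns {k: number of genes found in exactly k strains} for k = 1..n_strains.
    """
    counts = Counter(len(strains) for strains in presence.values())
    return {k: counts.get(k, 0) for k in range(1, n_strains + 1)}


def split_classes(counts, n_strains):
    """Split class counts into core, distributed, and single-strain genes."""
    core = counts[n_strains]  # present in every strain
    distributed = sum(counts[k] for k in range(2, n_strains))  # 2..n-1 strains
    single_strain = counts[1]  # present in exactly one strain
    return core, distributed, single_strain


# Toy data (assumption): 5 genes across 4 strains A-D.
toy_presence = {
    "g1": {"A", "B", "C", "D"},
    "g2": {"A", "B", "C", "D"},
    "g3": {"A", "B"},
    "g4": {"C"},
    "g5": {"A", "C", "D"},
}
toy_counts = frequency_class_counts(toy_presence, n_strains=4)
toy_split = split_classes(toy_counts, n_strains=4)
```

With 17 strains, the same split would place classes 2 to 16 in the distributed set and class 17 in the core set, matching the ranges stated in the caption.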