Number of k-mers. The plot shows the number of distinct k-mers found in the sequencing data from chr21 at different coverage levels, obtained by random subsampling of the data. The hg18 genome sequence of chr21 contains 32.5 million distinct k-mers. Without filtering, the number of k-mers found continues to increase at a steady rate beyond 5-fold coverage. When unique k-mers (those observed only once in the data) are removed, the number of remaining k-mers approaches the ideal number at around 7-fold coverage, and the rate of increase slows considerably.
Melsted and Pritchard BMC Bioinformatics 2011 12:333 doi:10.1186/1471-2105-12-333
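The counting behind such a curve can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes reads are plain strings held in memory, that "unique" means a k-mer observed exactly once in the subsample, and that subsampling keeps a random fraction of the reads. The function names (`kmer_counts`, `distinct_kmers`, `subsample`, `coverage_curve`) are hypothetical, and reverse complements are not merged.

```python
import random
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurrence across all reads (exact, in memory)."""
    counts = Counter()
    for read in reads:
        # Reads shorter than k yield an empty range and contribute nothing.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def distinct_kmers(reads, k, min_count=1):
    """Number of distinct k-mers seen at least min_count times.

    min_count=1 gives the unfiltered count; min_count=2 drops k-mers
    seen only once (assumed here to be what the caption calls "unique").
    """
    counts = kmer_counts(reads, k)
    return sum(1 for c in counts.values() if c >= min_count)


def subsample(reads, fraction, seed=0):
    """Keep a random fraction of the reads, without replacement."""
    rng = random.Random(seed)
    n = round(len(reads) * fraction)
    return rng.sample(reads, n)


def coverage_curve(reads, k, fractions, seed=0):
    """Return (fraction, unfiltered, filtered) for each subsampling fraction."""
    rows = []
    for f in fractions:
        subset = subsample(reads, f, seed)
        rows.append((f, distinct_kmers(subset, k, 1), distinct_kmers(subset, k, 2)))
    return rows
```

Plotting the second and third columns of `coverage_curve` against coverage would give the unfiltered and filtered curves respectively. Exact counting with a `Counter` is only practical for small inputs; it is used here to make the definition of the two curves explicit.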