Sequence space coverage, entropy of genomes and the potential to detect non-human DNA in human samples
1 Genomics and Computational Biology Graduate Group, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
2 Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA 19104, USA
3 Systems Biology Division, Molecular and Cellular Oncogenesis Program, Wistar Institute, Philadelphia, PA 19104, USA
BMC Genomics 2008, 9:509 doi:10.1186/1471-2164-9-509
Published: 30 October 2008
Additional file 1:
Supplementary-1. Coverage of n-mer space (n = 1–20) for the Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, Saccharomyces cerevisiae, and Escherichia coli (K-12) genomes by the stochastic searching algorithm.
Format: TXT Size: 5KB
Additional file 2:
Supplementary-2. The 10-mer space coverage for 433 fully sequenced microbial genomes, with GC content and genome sizes.
Format: TXT Size: 24KB
Additional file 3:
A full list of high-frequency n-mers for the Homo sapiens genome.
Format: ZIP Size: 1.4MB
Additional file 4:
Supplementary-4. The NCBI genome sequence accession IDs for Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, and Saccharomyces cerevisiae.
Format: TXT Size: 1KB
Additional file 5:
Supplementary-5. The NCBI genome sequence accession IDs for the 433 microbial genomes.
Format: TXT Size: 69KB
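
For readers unfamiliar with the quantity tabulated in Additional files 1 and 2, n-mer space coverage can be read as the fraction of the 4^n possible n-mers that occur at least once in a genome. The sketch below is only illustrative: it uses exact enumeration over a single strand rather than the stochastic searching algorithm named above, and the function name `nmer_coverage` and the choice to skip windows containing non-ACGT characters are assumptions, not details taken from the article.

```python
def nmer_coverage(sequence: str, n: int) -> float:
    """Return the fraction of the 4**n possible n-mers seen in `sequence`.

    Assumptions (not from the article): only the given strand is scanned,
    and any window containing a character other than A, C, G, T is skipped.
    """
    seq = sequence.upper()
    valid = set("ACGT")
    seen = set()
    # Slide a window of length n across the sequence, one base at a time.
    for i in range(len(seq) - n + 1):
        window = seq[i:i + n]
        if set(window) <= valid:
            seen.add(window)
    return len(seen) / 4 ** n


if __name__ == "__main__":
    toy = "ACGTACGTTTGA"  # toy sequence, not real genomic data
    for n in (1, 2, 3):
        print(n, nmer_coverage(toy, n))
```

Exact enumeration of this kind is practical for small n; for larger n on large genomes the set of observed n-mers grows quickly, which is presumably one motivation for a sampling-based estimate.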