Open Access Software

snp-search: simple processing, manipulation and searching of SNPs from high-throughput sequencing

Ali Al-Shahib* and Anthony Underwood

Author Affiliations

Applied Bioinformatics and Laboratory Informatics Unit, Microbiology Services, Public Health England, 61 Colindale Avenue, London NW9 5EQ, UK

BMC Bioinformatics 2013, 14:326  doi:10.1186/1471-2105-14-326

Published: 19 November 2013



Background

A typical bacterial pathogen genome mapping project can identify thousands of single nucleotide polymorphisms (SNPs). Interpreting SNP data is complex, and it is difficult to conceptualise the data contained within the large flat files that most SNP calling algorithms produce. One solution is to construct a database that can be queried with simple commands, so that SNP interrogation and output are both easy and comprehensible.
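To make the database idea concrete, the following is a minimal sketch in Python using the standard `sqlite3` module. It is not snp-search's own implementation or schema: the table layout, the function names `load_vcf` and `variant_positions`, the two-strain VCF excerpt, and the simplification to haploid genotypes with a single ALT allele are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical two-sample VCF excerpt (tab-separated), for illustration only.
VCF_TEXT = "\n".join([
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tstrainA\tstrainB",
    "chr1\t101\t.\tA\tG\t50\tPASS\t.\tGT\t1\t0",
    "chr1\t250\t.\tC\tT\t12\tPASS\t.\tGT\t1\t1",
    "chr1\t733\t.\tG\tA\t99\tPASS\t.\tGT\t0\t1",
])


def load_vcf(conn, vcf_text):
    """Load one row per (position, strain) into an assumed 'snps' table."""
    conn.execute(
        "CREATE TABLE snps (chrom TEXT, pos INTEGER, ref TEXT, alt TEXT, "
        "qual REAL, strain TEXT, allele TEXT)"
    )
    samples = []
    for line in vcf_text.splitlines():
        if line.startswith("##"):
            continue  # meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos, _id, ref, alt, qual = fields[:6]
        for strain, gt in zip(samples, fields[9:]):
            # Assumption: haploid calls, GT "1" means the single ALT allele.
            allele = alt if gt == "1" else ref
            conn.execute(
                "INSERT INTO snps VALUES (?, ?, ?, ?, ?, ?, ?)",
                (chrom, int(pos), ref, alt, float(qual), strain, allele),
            )


def variant_positions(conn, strain, min_qual):
    """Positions where a strain differs from the reference above a quality cutoff."""
    rows = conn.execute(
        "SELECT pos FROM snps WHERE strain = ? AND allele != ref "
        "AND qual >= ? ORDER BY pos",
        (strain, min_qual),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
load_vcf(conn, VCF_TEXT)
print(variant_positions(conn, "strainA", 30))
```

Once the calls sit in a table, a question such as "which positions differ from the reference in this strain above a given quality" becomes a one-line query rather than a scan through a flat file.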


Results

Here we present snp-search, a tool that manages SNP data and supports its manipulation and searching. After creating a SNP database from a VCF file, snp-search can be used to convert the selected SNP data into FASTA sequences, construct phylogenies, look for unique SNPs, and output contextual information about each SNP. The FASTA output from snp-search is particularly useful for generating robust phylogenetic trees based on SNP differences across the conserved positions in whole genomes. Queries can be designed to answer critical genomic questions, such as the association of SNPs with particular phenotypes.
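Two of the operations named above, FASTA conversion and the search for unique SNPs, can be sketched in a few lines of Python. This is an illustrative approximation rather than snp-search's actual code: the `calls` data, the strain names, and the functions `snp_fasta` and `unique_snps` are hypothetical, and the sketch assumes every strain has a call at every listed position.

```python
# Hypothetical allele calls: position -> {strain: base}. Every strain is
# assumed to have a call at every position listed here.
calls = {
    101: {"strainA": "G", "strainB": "A", "strainC": "A"},
    250: {"strainA": "T", "strainB": "T", "strainC": "C"},
    733: {"strainA": "G", "strainB": "A", "strainC": "A"},
    900: {"strainA": "C", "strainB": "C", "strainC": "C"},
}


def snp_fasta(calls):
    """Concatenate the bases at variable positions into one FASTA record per strain."""
    if not calls:
        return ""
    strains = sorted(next(iter(calls.values())))
    # Keep only positions where at least two strains carry different bases.
    variable = [p for p in sorted(calls) if len(set(calls[p].values())) > 1]
    records = []
    for strain in strains:
        seq = "".join(calls[p][strain] for p in variable)
        records.append(f">{strain}\n{seq}")
    return "\n".join(records)


def unique_snps(calls, strain):
    """Positions where the given strain carries a base that no other strain has."""
    unique = []
    for pos in sorted(calls):
        base = calls[pos][strain]
        others = [b for s, b in calls[pos].items() if s != strain]
        if base not in others:
            unique.append(pos)
    return unique


print(snp_fasta(calls))
print(unique_snps(calls, "strainA"))
```

Because each record has the same length and column order, output of this shape can be passed to a tree-building program as an alignment of SNP positions.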


Conclusions

snp-search is a tool that manages SNP data and outputs useful information that can be used to test important biological hypotheses.

Keywords: Single Nucleotide Polymorphisms (SNP); Variant Call Format (VCF); SQL database; High-throughput Sequencing; Next Generation Sequencing (NGS); Ruby; Phylogeny