Software

SNP-VISTA: An interactive SNP visualization tool

Nameeta Shah1,2, Michael V Teplitsky2, Simon Minovitsky2, Len A Pennacchio2,3, Philip Hugenholtz3, Bernd Hamann1,2 and Inna L Dubchak2,3*

Author Affiliations

1 Institute for Data Analysis and Visualization (IDAV), Department of Computer Science, University of California, Davis, One Shields Ave., Davis, CA 95616, USA

2 Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA

3 DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA


BMC Bioinformatics 2005, 6:292  doi:10.1186/1471-2105-6-292

Published: 8 December 2005



Recent advances in sequencing technologies promise to provide a better understanding of the genetics of human disease as well as the evolution of microbial populations. Single Nucleotide Polymorphisms (SNPs) are established genetic markers that aid in the identification of loci affecting quantitative traits and/or disease in a wide variety of eukaryotic species. With today's technological capabilities, it has become possible to re-sequence a large set of appropriate candidate genes in individuals with a given disease in an attempt to identify causative mutations. In addition, SNPs have been used extensively in efforts to study the evolution of microbial populations, and the recent application of random shotgun sequencing to environmental samples enables more extensive SNP analysis of co-occurring and co-evolving microbial populations. The program is available online [1].


We present two variants of an interactive visualization tool, SNP-VISTA, that aid in the analysis of the following types of data: A) large-scale re-sequencing data of disease-related genes, for the discovery of associated and/or causative alleles (GeneSNP-VISTA); and B) massive amounts of ecogenomics data, for studying homologous recombination in microbial populations (EcoSNP-VISTA). The main features and capabilities of SNP-VISTA are: 1) mapping of SNPs to gene structure; 2) classification of SNPs based on their location in the gene, frequency of occurrence in samples, and allele composition; 3) clustering based on user-defined subsets of SNPs, highlighting haplotypes as well as recombinant sequences; 4) integrated visualization of protein evolutionary conservation; and 5) display of automatically calculated recombination points that are user-editable.
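To make features 1) and 2) concrete, the following is a minimal sketch of how SNPs might be located in a set of aligned sequences and then classified by gene region, frequency, and allele composition. It is not SNP-VISTA's implementation: the function names, the exon coordinates, the 5% rarity cutoff, and the definition of minor-allele frequency as one minus the major-allele frequency are all assumptions made for illustration.

```python
from collections import Counter


def find_snps(aligned):
    """Return (position, allele counts) for every polymorphic alignment column."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    snps = []
    for pos in range(length):
        # Count only unambiguous bases; gaps and N are ignored.
        counts = Counter(seq[pos] for seq in aligned if seq[pos] in "ACGT")
        if len(counts) > 1:
            snps.append((pos, counts))
    return snps


def classify(pos, counts, exons, rare_cutoff=0.05):
    """Label one SNP by gene region, frequency class, and allele composition.

    `exons` is a hypothetical list of half-open (start, end) intervals.
    """
    total = sum(counts.values())
    # Assumed definition: frequency of everything except the major allele.
    minor_freq = 1 - counts.most_common(1)[0][1] / total
    in_exon = any(start <= pos < end for start, end in exons)
    return {
        "pos": pos,
        "region": "exon" if in_exon else "non-coding",
        "minor_freq": minor_freq,
        "freq_class": "rare" if minor_freq < rare_cutoff else "common",
        "alleles": "/".join(sorted(counts)),
    }


# Toy alignment of four samples; columns 1 and 4 are polymorphic.
samples = ["ACGTAC", "ACGTAC", "ATGTAC", "ACGTCC"]
for pos, counts in find_snps(samples):
    print(classify(pos, counts, exons=[(0, 3)]))
```

A real tool would read the gene structure from an annotation file and work on far larger alignments, but the per-column counting step stays the same.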


The main strength of SNP-VISTA is its graphical interface and use of visual representations, which let the user explore large-scale SNP data interactively and thereby understand it better.