Using a color-coded ambigraphic nucleic acid notation to visualize conserved palindromic motifs within and across genomes
1 An independent investigator, Frederick, MD, USA
2 Department of Visual Studies, College of Arts and Sciences, State University of New York at Buffalo, Buffalo, NY, USA
BMC Genomics 2014, 15:52 doi:10.1186/1471-2164-15-52
Published: 22 January 2014
Ambiscript is a graphically designed nucleic acid notation that uses symbol symmetries to support sequence complementation, highlight biologically relevant palindromes, and facilitate the analysis of consensus sequences. Although the original Ambiscript notation was designed to represent consensus sequences for multiple sequence alignments, its black-on-white ambiguity characters cannot reflect the statistical distribution of nucleotides found at each position. We now propose a color-augmented ambigraphic notation to encode the frequency of positional polymorphisms in these consensus sequences.
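As a point of reference, the palindromes referred to here are reverse-complement palindromes: sequences that read the same on both strands, such as the EcoRI recognition site GAATTC. The following minimal Python sketch tests for that property on plain A/C/G/T strings. The function names are illustrative assumptions and are not part of Ambiscript, which conveys the same relationship visually through character symmetry rather than by computation.

```python
# Illustrative sketch only: plain-text check for reverse-complement
# palindromes. Function names are hypothetical, not part of Ambiscript.

# Translation table mapping each unambiguous base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    for site in ("GAATTC", "GATTACA"):
        print(site, is_palindrome(site))
```

The sketch handles only the four unambiguous bases; IUPAC ambiguity codes would need an extended complement table.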
We have implemented this color-coding approach by creating an Adobe Flash® application (http://www.ambiscript.org) that shades and colors modified Ambiscript characters according to the prevalence of the encoded nucleotide at each position in the alignment. The resulting graphic helps viewers perceive biologically relevant patterns in multiple sequence alignments by uniquely combining color, shading, and character symmetries to highlight palindromes and inverted repeats in conserved DNA motifs.
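The idea of shading by prevalence can be sketched in a few lines of Python. This is not the Flash application's code, and the linear gray-level mapping below is an assumption chosen purely for illustration; the actual color and shading scheme is the one defined by the application itself. The function names `column_profile` and `gray_level` are likewise hypothetical.

```python
# Illustrative sketch only: per-column nucleotide prevalence in an
# ungapped alignment, mapped to a gray level. The linear mapping is an
# assumption, not the scheme used by the Ambiscript application.
from collections import Counter


def column_profile(alignment):
    """For each column, return (most common base, its relative frequency)."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*alignment):
        counts = Counter(base.upper() for base in column)
        base, count = counts.most_common(1)[0]
        profile.append((base, count / len(column)))
    return profile


def gray_level(frequency):
    """Map prevalence to an 8-bit gray value: 1.0 -> 0 (black), 0.0 -> 255."""
    return round(255 * (1.0 - frequency))


if __name__ == "__main__":
    alignment = ["GAATTC", "GAATTC", "GACTTC", "GAATTA"]
    for base, freq in column_profile(alignment):
        print(base, f"{freq:.2f}", gray_level(freq))
```

Under this assumed mapping, fully conserved positions render darkest and variable positions fade toward white, so conserved motifs stand out at a glance.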
Juxtaposing an intuitive color scheme over the deliberate character symmetries of an ambigraphic nucleic acid notation yields a highly functional notation that maximizes information content and embodies key principles of graphic excellence put forth by the statistician and graphic design theorist Edward Tufte.