An automated graphics tool for comparative genomics: the Coulson plot generator
1 LGC Genomics Ltd, Pindar Road, Hoddesdon, Hertfordshire EN11 0WZ, UK
2 Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, UK
3 Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge CB2 1QP, UK
BMC Bioinformatics 2013, 14:141 doi:10.1186/1471-2105-14-141
Published: 27 April 2013
Comparative analysis is an essential component of biology. When applied to genomics, for example, analysis may require comparing the predicted presence and absence of genes across a group of genomes under consideration. Frequently, genes can be grouped into small categories based on functional criteria, for example membership of a multimeric complex, participation in a metabolic or signaling pathway, or shared sequence features and/or paralogy. These patterns of retention and loss are highly informative for the prediction of function, and hence possible biological context, and can provide great insights into the evolutionary history of cellular functions. However, a standard spreadsheet is a poor visual format from which to extract patterns within such a dataset.
We devised the Coulson plot, a new graphical representation that uses a matrix of pie charts to display comparative genomics data. Each pie describes a complex or process from a separate taxon, and is divided into sectors corresponding to the number of proteins (subunits) in that complex or process. The predicted presence or absence of each protein in a complex is indicated by occupancy of the corresponding sector; this format is visually highly accessible and makes pattern recognition rapid and reliable. A key to the identity of each subunit, hierarchical naming of taxa, and coloring are also included. A Java-based application, the Coulson plot generator (CPG), automates graphic production, taking a tab- or comma-delimited text file as input and generating an editable portable document format (PDF) or SVG file.
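The sector-occupancy idea described above can be illustrated with a minimal sketch. This is not the CPG implementation (CPG is a Java application) and it does not reproduce CPG's actual input layout, which is not specified here. It assumes a hypothetical tab-delimited table with one taxon per row, one subunit per column, and cells of `1` (present) or `0` (absent), and it draws a single column of pies for one complex as SVG. The function names, the subunit names, and the presence/absence values are all made up for illustration.

```python
import csv
import io
import math
from xml.sax.saxutils import escape


def parse_presence_table(text, delimiter="\t"):
    """Parse a hypothetical layout: header row of subunit names, then one
    row per taxon with '1' (present) or '0' (absent) for each subunit."""
    rows = list(csv.reader(io.StringIO(text.strip()), delimiter=delimiter))
    subunits = rows[0][1:]
    table = {row[0]: [cell.strip() == "1" for cell in row[1:]] for row in rows[1:]}
    return subunits, table


def pie_svg(presence, cx, cy, r, fill="#4477aa"):
    """Return SVG elements for one pie: one equal sector per subunit,
    filled when the subunit is present and white when it is absent."""
    n = len(presence)
    if n == 1:
        # A single sector is a full circle, which an SVG arc cannot express.
        colour = fill if presence[0] else "white"
        return [f'<circle cx="{cx}" cy="{cy}" r="{r}" fill="{colour}" stroke="black"/>']
    parts = []
    for i, present in enumerate(presence):
        # Sector i runs clockwise from angle a0 to a1, starting at 12 o'clock.
        a0 = 2 * math.pi * i / n - math.pi / 2
        a1 = 2 * math.pi * (i + 1) / n - math.pi / 2
        x0, y0 = cx + r * math.cos(a0), cy + r * math.sin(a0)
        x1, y1 = cx + r * math.cos(a1), cy + r * math.sin(a1)
        large = 1 if (a1 - a0) > math.pi else 0
        colour = fill if present else "white"
        parts.append(
            f'<path d="M{cx:.1f},{cy:.1f} L{x0:.1f},{y0:.1f} '
            f'A{r:.1f},{r:.1f} 0 {large} 1 {x1:.1f},{y1:.1f} Z" '
            f'fill="{colour}" stroke="black"/>'
        )
    return parts


def coulson_column_svg(table, r=20, gap=10):
    """Stack one pie per taxon vertically, with the taxon name beside it."""
    step = 2 * r + gap
    body = []
    for row, (taxon, presence) in enumerate(table.items()):
        cy = gap + r + row * step
        body.extend(pie_svg(presence, gap + r, cy, r))
        body.append(f'<text x="{2 * gap + 2 * r}" y="{cy}">{escape(taxon)}</text>')
    height = gap + len(table) * step
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="300" height="{height}">'
        + "".join(body)
        + "</svg>"
    )


# Illustrative data only: subunit names and values are invented.
EXAMPLE = (
    "taxon\tsub1\tsub2\tsub3\n"
    "Homo sapiens\t1\t1\t0\n"
    "Trypanosoma brucei\t1\t0\t0\n"
)

if __name__ == "__main__":
    _, example_table = parse_presence_table(EXAMPLE)
    print(coulson_column_svg(example_table))
```

A full Coulson plot would repeat this column once per complex or process to form the matrix, and would add the subunit key and taxon hierarchy; those parts are omitted from the sketch.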
CPG software may be used to rapidly convert spreadsheet data to a graphical matrix pie chart format. The representation retains essentially all of the information from the spreadsheet but presents it in a graphically rich format that makes comparisons and the identification of patterns significantly clearer. While the Coulson plot format is highly useful in comparative genomics, its original purpose, the software can be used to visualize any dataset where entity occupancy is compared between different classes.
CPG software is available at SourceForge (http://sourceforge.net/projects/coulson) and at http://dl.dropbox.com/u/6701906/Web/Sites/Labsite/CPG.html