Composition Profiler: a tool for discovery and visualization of amino acid composition differences
1 Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA
2 Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
3 Institute for Biological Instrumentation, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
BMC Bioinformatics 2007, 8:211 doi:10.1186/1471-2105-8-211Published: 19 June 2007
Composition Profiler is a web-based tool for semi-automatic discovery of enrichment or depletion of amino acids, either individually or grouped by their physico-chemical or structural properties.
The program takes two samples of amino acids as input: a query sample and a reference sample. The latter provides a suitable background amino acid distribution, and should be chosen according to the nature of the query sample, for example, a standard protein database (e.g. SwissProt, PDB), a representative sample of proteins from the organism under study, or a group of proteins with a contrasting functional annotation. The results of the analysis of amino acid composition differences are summarized in textual and graphical form.
As an exploratory data mining tool, our software can be used to guide feature selection for protein function or structure predictors. For classes of proteins with significant differences in frequencies of amino acids having particular physico-chemical (e.g. hydrophobicity or charge) or structural (e.g. α helix propensity) properties, Composition Profiler can be used as a rough, light-weight visual classifier.