This article is part of the supplement: Probabilistic Modeling and Machine Learning in Structural and Systems Biology
Improved functional prediction of proteins by learning kernel combinations in multilabel settings
ETH Zurich, Institute of Computational Science, Universität-Str. 6, CH-8092 Zurich, Switzerland
BMC Bioinformatics 2007, 8(Suppl 2):S12 doi:10.1186/1471-2105-8-S2-S12
Published: 3 May 2007
We develop a probabilistic model for combining kernel matrices to predict the function of proteins. It extends previous approaches by handling multiple labels, which arise naturally in the context of protein function.
Explicit modeling of multilabels significantly improves the ability to learn protein function from multiple kernels. The performance and the interpretability of the inference model are further improved by simultaneously predicting the subcellular localization of proteins and by combining pairwise classifiers into consistent class membership estimates.
For functional prediction of proteins, multilabels provide valuable information that should be included adequately in the training of classifiers. Learning of functional categories benefits from co-prediction of subcellular localization. Pairwise separation rules give very detailed insight into the relevance of different measurements such as sequence, structure, interaction data, or expression data. A preliminary version of the software can be downloaded from http://www.inf.ethz.ch/personal/vroth/KernelHMM/.
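To make the notion of combining kernel matrices concrete, the following minimal sketch forms a convex combination of two kernel matrices computed from different data sources for the same proteins. It is only an illustration and not the probabilistic model of this article: the weights here are fixed by hand rather than learned, and the function names (`rbf_kernel`, `combine_kernels`), the toy feature matrices, and the `gamma` values are all assumptions introduced for this example.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X (hypothetical helper)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(d2, 0.0))

def combine_kernels(kernels, weights):
    """Convex combination of kernel matrices.

    Nonnegative weights keep the result positive semidefinite,
    so the combined matrix is again a valid kernel.
    """
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be nonnegative and not all zero")
    w = w / w.sum()  # normalize so the weights sum to one
    K = sum(wi * Ki for wi, Ki in zip(w, kernels))
    return K, w

# Toy data: 5 proteins described by two made-up feature sets,
# standing in for, e.g., sequence-derived and expression-derived features.
rng = np.random.default_rng(0)
X_seq = rng.normal(size=(5, 3))
X_expr = rng.normal(size=(5, 4))

K_seq = rbf_kernel(X_seq, gamma=0.5)
K_expr = rbf_kernel(X_expr, gamma=0.1)

# Hand-picked weights (an assumption; a learning method would estimate them).
K, w = combine_kernels([K_seq, K_expr], [2.0, 1.0])
```

In a setting like the one described above, the weight attached to each kernel is what makes the result interpretable: a larger weight suggests that the corresponding measurement is more relevant for the classification task at hand.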