This article is part of the supplement: Proceedings of the Tenth Annual MCBIOS Conference

Open Access Proceedings

Drug activity prediction using multiple-instance learning via joint instance and feature selection

Zhendong Zhao1, Gang Fu2, Sheng Liu1, Khaled M Elokely2, Robert J Doerksen2,3*, Yixin Chen1* and Dawn E Wilkins1*

Author Affiliations

1 Department of Computer and Information Science, School of Engineering, University of Mississippi, University, 38677, USA

2 Department of Medicinal Chemistry, School of Pharmacy, University of Mississippi, University, 38677, USA

3 Research Institute of Pharmaceutical Sciences, School of Pharmacy, University of Mississippi, University, 38677, USA

BMC Bioinformatics 2013, 14(Suppl 14):S16  doi:10.1186/1471-2105-14-S14-S16

Published: 9 October 2013

Abstract

Background

In drug discovery and development, it is crucial to determine which conformers (instances) of a given molecule are responsible for its observed biological activity and, at the same time, to recognize the most representative subset of features (molecular descriptors). Because bioactive conformers are difficult to obtain experimentally, computational approaches such as machine learning are needed. Multiple Instance Learning (MIL) is a machine learning method capable of tackling this type of problem. In the MIL framework, each instance is represented as a feature vector, which usually resides in a high-dimensional feature space. The high dimensionality may provide significant information for learning tasks, but it may also include a large number of irrelevant or redundant features that degrade learning performance. Reducing the dimensionality of the data can therefore facilitate the classification task and improve the interpretability of the model.

Results

In this work we propose a novel approach, named multiple instance learning via joint instance and feature selection. The iterative joint instance and feature selection is achieved using an instance-based feature mapping and 1-norm regularized optimization. The proposed approach was tested on four biological activity datasets.
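
As a rough illustration of the two ingredients named above, the following Python sketch maps each bag of conformers to a vector of similarities to candidate prototype instances and then fits a sparse linear classifier whose nonzero weights pick out prototypes. It is not the authors' implementation: the Gaussian max-similarity mapping, the use of L1-regularized logistic regression solved by proximal gradient descent in place of a 1-norm regularized solver, the function names (`bag_to_features`, `fit_l1_logistic`), the parameters (`sigma`, `lam`, `step`, `iters`), and the toy 2-D data are all assumptions made for illustration. Only the instance-selection step is shown; the alternation with feature selection is omitted.

```python
import math


def bag_to_features(bag, prototypes, sigma=1.0):
    """Map a bag (list of instance vectors) to one similarity per prototype.

    Assumed similarity: max over instances of exp(-||x - p||^2 / sigma^2).
    """
    feats = []
    for p in prototypes:
        best = 0.0
        for x in bag:
            d2 = sum((xi - pi) ** 2 for xi, pi in zip(x, p))
            best = max(best, math.exp(-d2 / sigma ** 2))
        feats.append(best)
    return feats


def soft_threshold(v, t):
    """Proximal operator of t * |v|; sets small weights to exactly zero."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.05, step=0.2, iters=2000):
    """L1-regularized logistic regression via proximal gradient descent.

    X: list of feature vectors, y: labels in {0, 1}. Returns (weights, bias).
    The bias is not penalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # Gradient step on the smooth loss, then shrinkage for the L1 term.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


# Hypothetical toy data: each bag is a molecule, each instance a 2-D
# "conformer". Active bags contain an instance near (1, 1).
bags = [
    [[1.0, 1.0], [0.0, 0.1]],    # active
    [[0.9, 1.1], [-0.2, 0.0]],   # active
    [[0.1, 0.0], [-0.1, 0.2]],   # inactive
    [[0.0, -0.1], [0.2, 0.1]],   # inactive
]
labels = [1, 1, 0, 0]

# Every training instance is a candidate prototype.
prototypes = [x for bag in bags for x in bag]
X = [bag_to_features(bag, prototypes) for bag in bags]

w, b = fit_l1_logistic(X, labels)
selected = [k for k, wk in enumerate(w) if wk != 0.0]
predicted = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
             for xi in X]

print("selected prototype indices:", selected)
print("predicted labels:", predicted)
```

In this sketch the L1 penalty does the selecting: prototypes whose similarity column does not help separate active from inactive bags end up with a weight of exactly zero, so the surviving indices can be read as the chosen prototype instances.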

Conclusions

The empirical results demonstrate that the selected instances (prototype conformers) and features (pharmacophore fingerprints) have competitive discriminative power and that the selection process converges quickly.