Discrimination of outer membrane proteins with improved performance
Department of Computer Science, Utah State University, Logan, UT 84322, USA
BMC Bioinformatics 2008, 9:47 doi:10.1186/1471-2105-9-47Published: 24 January 2008
Outer membrane proteins (OMPs) perform diverse functional roles in Gram-negative bacteria. Identification of outer membrane proteins is an important task.
This paper presents a method for distinguishing outer membrane proteins (OMPs) from non-OMPs (that is, globular proteins and inner membrane proteins (IMPs)). First, we calculated the average residue compositions of OMPs, globular proteins and IMPs separately using a training set. Then for each protein from the test set, its distances to the three groups were calculated based on residue composition using a weighted Euclidean distance (WED) approach. Proteins from the test set were classified into OMP versus non-OMP classes based on the least distance. The proposed method can distinguish between OMPs and non-OMPs with 91.0% accuracy and 0.639 Matthews correlation coefficient (MCC). We then improved the method by including homologous sequences into the calculation of residue composition and using a feature-selection method to select the single residue and di-peptides that were useful for OMP prediction. The final method achieves an accuracy of 96.8% with 0.859 MCC. In direct comparisons, the proposed method outperforms previously published methods.
The proposed method can identify OMPs with improved performance. It will be very helpful to the discovery of OMPs in a genome scale.