DNdisorder: predicting protein disorder using boosting and deep networks
1 Department of Computer Science, University of Missouri, Columbia, MO 65211, USA
2 Informatics Institute, University of Missouri, Columbia, MO 65211, USA
3 C. Bond Life Science Center, University of Missouri, Columbia, MO 65211, USA
BMC Bioinformatics 2013, 14:88 doi:10.1186/1471-2105-14-88
Published: 6 March 2013
A number of proteins contain regions that do not adopt a stable tertiary structure in their native state. Such regions, known as disordered regions, have been shown to participate in many vital cell functions and are increasingly being examined as drug targets.
This work presents a new sequence-based approach for the prediction of protein disorder. The method, which participated in the CASP10 experiment, uses boosted ensembles of deep networks to make predictions. In a 10-fold cross-validation procedure on a dataset of 723 proteins, the method achieved an average balanced accuracy of 0.82 and an area under the ROC curve of 0.90. These results are achieved in part by a boosting procedure that steadily increases balanced accuracy and the area under the ROC curve over several rounds. The method also performed competitively when evaluated against a number of state-of-the-art disorder predictors on the CASP9 and CASP10 benchmark datasets.
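
For readers unfamiliar with the two evaluation measures quoted above, the following is a minimal sketch of how balanced accuracy and the area under the ROC curve can be computed from per-residue labels and predicted scores. It is not the authors' evaluation code. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; balanced accuracy is taken here as the mean of sensitivity and specificity, and the AUC is computed as the fraction of (disordered, ordered) residue pairs ranked correctly.

```python
def balanced_accuracy(labels, scores, threshold=0.5):
    """Mean of sensitivity and specificity at a given score threshold.

    labels: 1 for a disordered residue, 0 for an ordered residue.
    scores: predicted disorder scores (higher means more likely disordered).
    threshold: assumed cut-off; not taken from the paper.
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    pos = sum(labels)
    neg = len(labels) - pos
    sensitivity = tp / pos
    specificity = tn / neg
    return 0.5 * (sensitivity + specificity)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Counts the fraction of (positive, negative) pairs in which the
    positive residue receives the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy, made-up data for eight residues (illustrative only).
labels = [1, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.2, 0.6, 0.1, 0.8, 0.3, 0.05]

print(round(balanced_accuracy(labels, scores), 3))
print(round(roc_auc(labels, scores), 3))
```

Because ordered residues typically far outnumber disordered ones, balanced accuracy weights the two classes equally, whereas the AUC summarizes ranking quality across all possible thresholds.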