This article is part of the supplement: IEEE 7th International Conference on BioInformatics and BioEngineering at Harvard Medical School

Open Access Research

Predicting protein disorder by analyzing amino acid sequence

Jack Y Yang1* and Mary Qu Yang2

Author affiliations

1 Harvard Medical School, Harvard University, P.O. Box 400888, Cambridge, MA 02115, USA

2 National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20852, USA

Citation and License

BMC Genomics 2008, 9(Suppl 2):S8  doi:10.1186/1471-2164-9-S2-S8

Published: 16 September 2008



Many protein regions, and some entire proteins, have no definite tertiary structure and instead exist as dynamic, disordered ensembles under different physicochemical conditions. These proteins and regions are known as Intrinsically Unstructured Proteins (IUP). IUP have been associated with a wide range of protein functions, as well as with diseases characterized by protein misfolding and aggregation.


Identifying IUP is an important task in structural and functional genomics. We extract useful features from amino acid sequences and develop machine learning algorithms for this task. We compare our IUP predictor with the PONDR predictors (mainly neural-network-based), DisEMBL (also based on neural networks) and GlobPlot (based on disorder propensity).


We find that augmenting the feature set with features derived from physicochemical properties of amino acids (such as hydrophobicity and sequence complexity) and using an ensemble method are both beneficial. The IUP predictor is a viable alternative software tool for identifying intrinsically unstructured regions and proteins.
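
To make the kind of sequence-derived feature mentioned above more concrete, the following is a minimal sketch of per-residue hydrophobicity and complexity features computed over a sliding window. It is an illustration only, not the authors' implementation: the use of the Kyte-Doolittle hydropathy scale, Shannon entropy as the complexity measure, the window size, and the function name `window_features` are all assumptions that the abstract does not specify.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy scale (assumed here; the abstract does not name a scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_features(seq, window=5):
    """Return one (mean hydropathy, Shannon entropy) pair per residue.

    Each pair is computed over a window centered on the residue and
    truncated at the sequence ends. `window` is a hypothetical default.
    """
    half = window // 2
    feats = []
    for i in range(len(seq)):
        win = seq[max(0, i - half): i + half + 1]
        n = len(win)
        # Assumption: non-standard residues contribute a neutral 0.0.
        hydropathy = sum(KD.get(aa, 0.0) for aa in win) / n
        # Shannon entropy (bits) of the residue composition in the window;
        # low values indicate low-complexity regions.
        counts = Counter(win)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        feats.append((hydropathy, entropy))
    return feats
```

Feature vectors of this kind could then be passed to one or more classifiers whose outputs are combined, which is the general idea behind an ensemble method; how the IUP predictor actually combines its models is not described in this abstract.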