Open Access Research article

Low-complexity regions within protein sequences have position-dependent roles

Alain Coletta1,2,3*, John W Pinney4, David Y Weiss Solís5,6, James Marsh2, Steve R Pettifer2 and Teresa K Attwood1

Author affiliations

1 Faculty of Life Sciences, University of Manchester, Manchester M13 9PL, UK

2 School of Computer Science, University of Manchester, Manchester M13 9PL, UK

3 Switch Laboratory, Department of Applied Biological Sciences, Vrije Universiteit Brussel, 1050, Belgium

4 Centre for Bioinformatics, Division of Molecular Biosciences, Imperial College London, London SW7 2AZ, UK

5 Institute of Interdisciplinary Research (IRIBHM), School of Medicine, Free University of Brussels, 1070 Brussels, Belgium

6 IRIDIA-CoDE, Université Libre de Bruxelles, Ave. F. Roosevelt 50, 1050 Brussels, Belgium


Citation and License

BMC Systems Biology 2010, 4:43  doi:10.1186/1752-0509-4-43

Published: 13 April 2010



Regions of protein sequences with biased amino acid composition, so-called Low-Complexity Regions (LCRs), are abundant in the protein universe. A number of studies have revealed that i) these regions show significant divergence across protein families; and ii) the genetic mechanisms from which they arise lend them remarkable degrees of compositional plasticity. They have therefore proved difficult to compare using conventional sequence analysis techniques, and functions remain to be elucidated for most of them. Here we undertake a systematic investigation of LCRs in order to explore their possible functional significance, placed in the particular context of Protein-Protein Interaction (PPI) networks and Gene Ontology (GO)-term analysis.


In keeping with previous results, we found that LCR-containing proteins tend to have more binding partners across different PPI networks than proteins that have no LCRs. More specifically, our study suggests i) that LCRs are preferentially positioned towards the protein sequence extremities and that, in contrast with centrally located LCRs, these terminal LCRs show a correlation between their lengths and the degrees of connectivity of the proteins that contain them; and ii) that centrally located LCRs are enriched for transcription-related GO terms, while terminal LCRs are enriched for translation and stress response-related terms.
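As a rough illustration of the positional analysis summarised above, the following minimal sketch labels an LCR as terminal or central from its relative midpoint and computes a correlation coefficient between two lists of values, such as LCR lengths and protein degrees. The abstract does not state how the terminal/central boundary was defined or which correlation statistic was used, so the function names, the `terminal_fraction` cut-off of 0.2 and the choice of Pearson's coefficient are all assumptions made here for illustration only.

```python
import math


def lcr_position_class(start, end, protein_length, terminal_fraction=0.2):
    """Label an LCR 'terminal' or 'central' from its relative midpoint.

    start and end are 1-based, inclusive residue positions.
    terminal_fraction is a hypothetical cut-off, not a value from the paper.
    """
    midpoint = (start + end) / 2 / protein_length
    if midpoint <= terminal_fraction or midpoint >= 1 - terminal_fraction:
        return "terminal"
    return "central"


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

In such a scheme, one would group LCRs by their position label and then call `pearson(lengths, degrees)` separately for the terminal and the central group, where `degrees` holds the number of interaction partners of each LCR-containing protein in a given PPI network.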


Our results suggest not only that LCRs may be involved in flexible binding associated with specific functions, but also that their positions within a sequence may be important in determining both their binding properties and their biological roles.