Open Access Research article

Solubility of recombinant Src homology 2 domains expressed in E. coli can be predicted by TANGO

Thorny Cecilie Bie Andersen1, Kjersti Lindsjø1, Cecilie Dahl Hem1, Lise Koll1, Per Eugen Kristiansen2, Lars Skjeldal3, Amy H Andreotti4 and Anne Spurkland1*

Author Affiliations

1 Department of Anatomy, Institute of Basal Medical Sciences, University of Oslo, Oslo, Norway

2 Department of Biosciences, University of Oslo, Oslo, Norway

3 Department of Chemistry, Biochemistry and Food science, Norwegian University of Life Sciences, Ås, Norway

4 Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa, USA


BMC Biotechnology 2014, 14:3  doi:10.1186/1472-6750-14-3

Published: 14 January 2014



Signalling proteins often contain several well-defined and conserved protein domains. Structural analysis of such domains by nuclear magnetic resonance (NMR) spectroscopy or X-ray crystallography can greatly inform our understanding of protein function. A limiting step is often the production of sufficient amounts of recombinant protein, and there is no established way to predict whether a protein will be soluble when expressed in E. coli. Here we report our experience with expression of a Src homology 2 (SH2) domain.


The SH2 domain of the SH2D2A protein (also known as T cell specific adapter protein, TSAd) forms insoluble aggregates when expressed as various GST-fusion proteins in Escherichia coli (E. coli). Altering the flanking sequences or the growth temperature influenced the expression and solubility of TSAd-SH2; however, the overall yield of soluble protein remained low. The algorithm TANGO, which predicts amyloid fibril formation in eukaryotic cells, identified a hydrophobic sequence within the TSAd-SH2 domain with a high propensity for beta-aggregation. Mutating this sequence to the corresponding amino acids of the related HSH2 (or ALX) SH2 domain increased the yield of soluble TSAd-SH2 domains. High beta-aggregation values predicted by TANGO correlated with low solubility of recombinant SH2 domains as reported in the literature.
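TANGO itself is not reproduced here. As a much cruder stand-in for the idea of locating a hydrophobic stretch within a domain sequence, the following Python sketch scans a sequence with a sliding window and reports the window with the highest mean Kyte-Doolittle hydropathy. The window length of 7, the function names, and the toy sequence are assumptions made for illustration only. Hydropathy is not the same as beta-aggregation propensity, so this should be read as a rough screening idea rather than a substitute for TANGO.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(sequence, window=7):
    """Return (start_index, mean_hydropathy) for every full window.

    Hypothetical helper; this is NOT the TANGO algorithm.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    # Raises KeyError if the sequence contains a non-standard residue code.
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        (start, sum(scores[start:start + window]) / window)
        for start in range(len(seq) - window + 1)
    ]


def most_hydrophobic_window(sequence, window=7):
    """Return (start_index, peptide, mean_hydropathy) of the top-scoring window."""
    start, score = max(window_hydropathy(sequence, window), key=lambda item: item[1])
    return start, sequence.upper()[start:start + window], score


if __name__ == "__main__":
    # Toy sequence invented for illustration; it is NOT the TSAd-SH2 sequence.
    toy = "MKDESGRKVIVLLFVAGDEKRS"
    print(most_hydrophobic_window(toy, window=7))
```

A stretch flagged this way would only be a candidate for closer inspection. Whether it actually drives aggregation would still need a dedicated predictor such as TANGO and experimental confirmation.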


Solubility of recombinant proteins expressed in E. coli can be predicted by TANGO, an algorithm developed to determine the aggregation propensity of peptides. Targeted mutations that introduce the corresponding amino acids from similar protein domains may increase the solubility of recombinant proteins.

Keywords: Bacterial inclusion bodies; Protein aggregation; Recombinant protein expression; SH2 domain; SH2D2A; Protein solubility