Email updates

Keep up to date with the latest news and content from BMC Genomics and BioMed Central.

Open Access Research article

Expansion of tandem repeats in sea anemone Nematostella vectensis proteome: A source for gene novelty?

Guy Naamati1, Menachem Fromer1 and Michal Linial2*

Author Affiliations

1 School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, 91904, Israel

2 Department of Biological Chemistry, Institute of Life Sciences, The Sudarsky Center for Computational biology, The Hebrew University of Jerusalem, Jerusalem, 91904, Israel

For all author emails, please log on.

BMC Genomics 2009, 10:593  doi:10.1186/1471-2164-10-593

Published: 10 December 2009

Abstract

Background

The complete proteome of the starlet sea anemone, Nematostella vectensis, provides insights into gene invention dating back to the Cnidarian-Bilaterian ancestor. With the addition of the complete proteomes of Hydra magnipapillata and Monosiga brevicollis, the investigation of proteins having unique features in early metazoan life has become practical. We focused on the properties and the evolutionary trends of tandem repeat (TR) sequences in Cnidaria proteomes.

Results

We found that 11-16% of N. vectensis proteins contain tandem repeats. Most TRs cover 150 amino acid segments that are comprised of basic units of 5-20 amino acids. In total, the N. Vectensis proteome has about 3300 unique TR-units, but only a small fraction of them are shared with H. magnipapillata, M. brevicollis, or mammalian proteomes. The overall abundance of these TRs stands out relative to that of 14 proteomes representing the diversity among eukaryotes and within the metazoan world. TR-units are characterized by a unique composition of amino acids, with cysteine and histidine being over-represented. Structurally, most TR-segments are associated with coiled and disordered regions. Interestingly, 80% of the TR-segments can be read in more than one open reading frame. For over 100 of them, translation of the alternative frames would result in long proteins. Most domain families that are characterized as repeats in eukaryotes are found in the TR-proteomes from Nematostella and Hydra.

Conclusions

While most TR-proteins have originated from prediction tools and are still awaiting experimental validations, supportive evidence exists for hundreds of TR-units in Nematostella. The existence of TR-proteins in early metazoan life may have served as a robust mode for novel genes with previously overlooked structural and functional characteristics.