Non-random retention of protein-coding overlapping genes in Metazoa
- Equal contributors
1 Department of Biology and Genetics for Medical Sciences, University of Milan, 20133 Milan, Italy
2 Center for Genomic Medicine, Kyoto University Graduate School of Medicine, Konoe-cho, Yoshida, Sakyo-ku, 606-8501 Kyoto, Japan
3 Institute of Biomedical Technologies, National Research Council, Via Fantoli 16/15, 20138 Milan, Italy
4 European Molecular Biology Laboratory, Meyerhofstr.1, 69012 Heidelberg, Germany
5 Department of Experimental Oncology, European Institute of Oncology, Via Ripamonti 435, 20141 Milan, Italy
6 FIRC Institute of Molecular Oncology Foundation, Via Adamello 16, 20139 Milan, Italy
BMC Genomics 2008, 9:174 doi:10.1186/1471-2164-9-174
Published: 16 April 2008
Although the overlap of transcriptional units occurs frequently in eukaryotic genomes, its evolutionary and biological significance remains largely unclear. Here we report a comparative analysis of overlaps between genes coding for well-annotated proteins in five metazoan genomes (human, mouse, zebrafish, fruit fly and worm).
In all analyzed species, the observed number of overlapping genes is lower than expected under functional neutrality, suggesting that gene overlap is negatively selected. Comparison with the random distribution also shows that retained overlaps do not exhibit random features: antiparallel overlaps are significantly enriched, while overlaps lying on the same strand and those involving coding sequences are highly underrepresented. We confirm that overlap is mostly species-specific and provide evidence that it frequently originates through the acquisition of terminal, non-coding exons. Finally, we show that overlapping genes tend to be significantly co-expressed in a breast cancer cDNA library obtained by 454 deep sequencing, and that different overlap types display different patterns of reciprocal expression.
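The observed-versus-expected comparison can be illustrated with a minimal sketch. It is not the procedure used in this study: the abstract does not specify the null model, so the sketch assumes one in which every gene keeps its length but is placed at a uniform random position and strand on a single chromosome. The function names (`count_overlaps`, `shuffle_genes`, `expected_overlaps`) and the toy coordinates are hypothetical.

```python
import random
from typing import Dict, List, Tuple

# A gene is (start, end, strand) with inclusive coordinates; strand is "+" or "-".
Gene = Tuple[int, int, str]


def count_overlaps(genes: List[Gene]) -> Dict[str, int]:
    """Count overlapping gene pairs, split by relative strand orientation."""
    counts = {"same_strand": 0, "antiparallel": 0}
    ordered = sorted(genes)  # sort by start coordinate
    for i, (_, end1, strand1) in enumerate(ordered):
        for start2, _, strand2 in ordered[i + 1:]:
            if start2 > end1:
                break  # sorted by start, so no later gene can overlap gene i
            key = "same_strand" if strand1 == strand2 else "antiparallel"
            counts[key] += 1
    return counts


def shuffle_genes(genes: List[Gene], chrom_length: int,
                  rng: random.Random) -> List[Gene]:
    """Assumed null model: keep gene lengths, randomize position and strand."""
    shuffled = []
    for start, end, _ in genes:
        length = end - start
        new_start = rng.randint(1, chrom_length - length)
        shuffled.append((new_start, new_start + length, rng.choice("+-")))
    return shuffled


def expected_overlaps(genes: List[Gene], chrom_length: int,
                      n_iter: int = 1000, seed: int = 0) -> Dict[str, float]:
    """Average overlap counts over n_iter random placements."""
    rng = random.Random(seed)
    totals = {"same_strand": 0, "antiparallel": 0}
    for _ in range(n_iter):
        counts = count_overlaps(shuffle_genes(genes, chrom_length, rng))
        for key in totals:
            totals[key] += counts[key]
    return {key: value / n_iter for key, value in totals.items()}


# Toy data (invented for illustration only).
toy_genes = [(1, 100, "+"), (50, 150, "-"), (120, 200, "-"), (300, 400, "+")]
observed = count_overlaps(toy_genes)
expected = expected_overlaps(toy_genes, chrom_length=10_000, n_iter=200)
```

Under this assumed null model, same-strand and antiparallel overlaps are equally likely, so a departure of the observed counts from that balance would mirror the kind of non-random pattern reported above. A realistic analysis would also need to account for chromosome structure, gene density and exon-level annotation.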
Our data suggest that overlap between protein-coding genes is selected against in Metazoa. However, when retained it may be used as a species-specific mechanism for the reciprocal regulation of neighboring genes. The tendency of overlaps to involve non-coding regions of the genes leads to the speculation that the advantages achieved by an overlapping arrangement may be optimized by evolving regulatory non-coding transcripts.