Email updates

Keep up to date with the latest news and content from BMC Genomics and BioMed Central.

Open Access Research article

Assessing the genomic evidence for conserved transcribed pseudogenes under selection

Amit N Khachane and Paul M Harrison*

Author Affiliations

Department of Biology, McGill University, Stewart Biology Building, 1205 Docteur Penfield Ave., Montreal, QC, H3A 1B1 Canada

For all author emails, please log on.

BMC Genomics 2009, 10:435  doi:10.1186/1471-2164-10-435

Published: 15 September 2009

Abstract

Background

Transcribed pseudogenes are copies of protein-coding genes that have accumulated indicators of coding-sequence decay (such as frameshifts and premature stop codons), but nonetheless remain transcribed. Recent experimental evidence indicates that transcribed pseudogenes may regulate the expression of homologous genes, through antisense interference, or generation of small interfering RNAs (siRNAs). Here, we assessed the genomic evidence for such transcribed pseudogenes of potential functional importance, in the human genome. The most obvious indicators of such functional importance are significant evidence of conservation and selection pressure.

Results

A variety of pseudogene annotations from multiple sources were pooled and filtered to obtain a subset of sequences that have significant mid-sequence disablements (frameshifts and premature stop codons), and that have clear evidence of full-length mRNA transcription. We found 1750 such transcribed pseudogene annotations (TPAs) in the human genome (corresponding to ~11.5% of human pseudogene annotations). We checked for syntenic conservation of TPAs in other mammals (rhesus monkey, mouse, rat, dog and cow). About half of the human TPAs are conserved in rhesus monkey, but strikingly, very few in mouse (~3%). The TPAs conserved in rhesus monkey show evidence of selection pressure (relative to surrounding intergenic DNA) on: (i) their GC content, and (ii) their rate of nucleotide substitution. This is in spite of distributions of Ka/Ks (ratios of non-synonymous to synonymous substitution rates), congruent with a lack of protein-coding ability. Furthermore, we have identified 68 human TPAs that are syntenically conserved in at least two other mammals. Interestingly, we observe three TPA sequences conserved in dog that have intermediate character (i.e., evidence of both protein-coding ability and pseudogenicity), and discuss the implications of this.

Conclusion

Through evolutionary analysis, we have identified candidate sequences for functional human transcribed pseudogenes, and have pinpointed 68 strong candidates for further investigation as potentially functional transcribed pseudogenes across multiple mammal species.