Open Access Highly Accessed Research article

Detecting transcription of ribosomal protein pseudogenes in diverse human tissues from RNA-seq data

Peter Tonner1, Vinodh Srinivasasainagendra2, Shaojie Zhang1* and Degui Zhi2*

Author affiliations

1 Department of Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL, 32816, USA

2 Department of Biostatistics, Section on Statistical Genetics, University of Alabama at Birmingham, Birmingham, AL, 35294, USA

Citation and License

BMC Genomics 2012, 13:412. doi:10.1186/1471-2164-13-412

Published: 21 August 2012

Abstract

Background

Ribosomal proteins (RPs) have about 2000 pseudogenes in the human genome. While anecdotal reports of RP pseudogene transcription exist, it is unclear to what extent these pseudogenes are transcribed. RP pseudogene transcription is difficult to identify with microarrays because transcripts from the parent genes and the pseudogenes can cross-hybridize. Transcriptome sequencing (RNA-seq) has recently provided an opportunity to ascertain the transcription of pseudogenes. The main challenge for pseudogene expression discovery in RNA-seq data is uniquely identifying reads that map to pseudogene regions, because these reads are typically also similar to the parent genes.

Results

Here we developed a specialized pipeline for pseudogene transcription discovery. We first constructed a “composite genome” that includes the entire human genome sequence as well as the mRNA sequences of real ribosomal protein genes. We then mapped all sequence reads to the composite genome and retained only exact matches. Moreover, we restricted our analysis to strictly defined mappable regions and calculated RPKM values as the measure of pseudogene transcription levels. We report evidence for the transcription of RP pseudogenes in 16 human tissues. By analyzing the Human Body Map 2.0 RNA-seq data with our pipeline, we found that one RP pseudogene (PGOHUM-249508) is transcribed with an RPKM of 170 in thyroid. Three other RP pseudogenes are transcribed with RPKM > 10, a level similar to that of the normal RP genes, in white blood cell, kidney, and testes, respectively. An additional thirteen RP pseudogenes have RPKM > 5, corresponding to the 20th–30th percentile among all genes. Unlike ribosomal protein genes, which are constitutively expressed in almost all tissues, RP pseudogenes are differentially expressed, suggesting that they may contribute to tissue-specific biological processes.
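The read filtering and RPKM steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Alignment` record, the function names, and the input layout are hypothetical, and the requirement that a read has exactly one alignment in the composite genome is an assumption added here to make the uniqueness idea concrete (the abstract states only that exact matches were retained). The RPKM function uses the standard definition, reads per kilobase of region per million mapped reads.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Alignment:
    """Hypothetical record for one alignment of a read to the composite genome."""
    read_id: str
    target: str       # chromosome or RP mRNA sequence name in the composite genome
    mismatches: int   # number of mismatches reported by the aligner


def unique_exact_hits(alignments: Iterable[Alignment]) -> List[Alignment]:
    """Keep reads whose single alignment is an exact match.

    Assumption: a read that also aligns elsewhere (for example to the
    parent gene's mRNA) is treated as ambiguous and dropped.
    """
    by_read = {}
    for aln in alignments:
        by_read.setdefault(aln.read_id, []).append(aln)
    return [
        alns[0]
        for alns in by_read.values()
        if len(alns) == 1 and alns[0].mismatches == 0
    ]


def rpkm(read_count: int, region_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of region per million mapped reads."""
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and total mapped reads must be positive")
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)
```

In this sketch, `region_length_bp` would be the length of the mappable part of a pseudogene rather than its full annotated length, in line with the restriction to strictly defined mappable regions.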

Conclusions

Using a specialized bioinformatics method, we identified the transcription of ribosomal protein pseudogenes in human tissues from RNA-seq data.

Keywords:
Ribosomal protein; Pseudogene; Transcription; RNA-seq data