This article is part of the supplement: The 2007 International Conference on Bioinformatics & Computational Biology (BIOCOMP'07)

Open Access Research

Using RNase sequence specificity to refine the identification of RNA-protein binding regions

Xin Wang (1,2,3), Guohua Wang (1,2,4), Changyu Shen (1), Lang Li (1), Xinguo Wang (5), Sean D Mooney (2,6), Howard J Edenberg (6,7,8), Jeremy R Sanford (7)* and Yunlong Liu (1,2,8)*

Author Affiliations

1 Division of Biostatistics, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202, USA

2 Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA

3 College of Automation, Harbin Engineering University, Harbin, Heilongjiang 150001, China

4 School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, 150001, China

5 The Center for Genomics and Bioinformatics, Indiana University, Bloomington, IN 47405, USA

6 Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202, USA

7 Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN 46202, USA

8 Center for Medical Genomics, Indiana University School of Medicine, Indianapolis, IN 46202, USA

BMC Genomics 2008, 9(Suppl 1):S17  doi:10.1186/1471-2164-9-S1-S17

Published: 20 March 2008

Abstract

Massively parallel pyrosequencing is a high-throughput technology that can sequence hundreds of thousands of DNA/RNA fragments in a single experiment. Combining it with immunoprecipitation-based biochemical assays, such as cross-linking immunoprecipitation (CLIP), provides a genome-wide method to detect the sites at which proteins bind DNA or RNA. In a CLIP-pyrosequencing experiment, the resolution of the detected protein binding regions is partially determined by the length of the detected RNA fragments (CLIP amplicons) after trimming by RNase digestion. These fragments are usually 50 to 70 nucleotides long, and many genomic regions are marked by multiple RNA fragments. In this paper, we report an empirical approach that refines the localization of protein binding regions by using the distribution pattern of the detected RNA fragments and the sequence specificity of RNase digestion. We demonstrate the approach on two example regions to which multiple amplicons map.
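
As a rough illustration of why multiple amplicons mapping to one region can narrow a binding site, the sketch below computes the segment shared by a set of overlapping fragments and the per-position coverage. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the coordinates are invented, the function names `common_overlap` and `coverage` are hypothetical, and the RNase sequence-specificity step is not modeled at all.

```python
from typing import Dict, List, Optional, Tuple

# Half-open [start, end) coordinates of an amplicon on a reference sequence.
Interval = Tuple[int, int]


def common_overlap(amplicons: List[Interval]) -> Optional[Interval]:
    """Return the segment shared by every amplicon, or None if there is none."""
    if not amplicons:
        return None
    start = max(s for s, _ in amplicons)
    end = min(e for _, e in amplicons)
    return (start, end) if start < end else None


def coverage(amplicons: List[Interval]) -> Dict[int, int]:
    """Count how many amplicons cover each position."""
    depth: Dict[int, int] = {}
    for s, e in amplicons:
        for pos in range(s, e):
            depth[pos] = depth.get(pos, 0) + 1
    return depth


# Hypothetical amplicons (55 to 60 nt each) mapped to the same region.
example = [(100, 160), (120, 175), (130, 190)]
shared = common_overlap(example)
depth = coverage(example)
```

In this invented example the three fragments jointly span 90 positions, but only a 30-position segment is covered by all of them, which is the intuition behind using the distribution pattern of fragments to tighten a candidate binding region.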