This article is part of the supplement: Selected articles from the Ninth Asia Pacific Bioinformatics Conference (APBC 2011)

Open Access Research

UMARS: Un-MAppable Reads Solution

Sung-Chou Li (1,2,3), Wen-Ching Chan (1,2,4), Chun-Hung Lai (3), Kuo-Wang Tsai (3), Chun-Nan Hsu (4,5), Yuh-Shan Jou (3), Hua-Chien Chen (6), Chun-Hong Chen (7) and Wen-chang Lin (1,3)*

Author Affiliations

1 Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan

2 Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan

3 Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan

4 Institute of Information Sciences, Academia Sinica, Taipei, Taiwan

5 Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292, USA

6 Molecular Medicine Research Center, Chang Gung University, Taoyuan, Taiwan

7 Division of Molecular and Genomic Medicine, National Health Research Institutes, Zhunan Town, Miaoli County, Taiwan

BMC Bioinformatics 2011, 12(Suppl 1):S9  doi:10.1186/1471-2105-12-S1-S9

Published: 15 February 2011

Abstract

Background

Un-MAppable Reads Solution (UMARS) is a user-friendly web service that retrieves valuable information from sequence reads that cannot be mapped back to reference genomes. Next-generation sequencing (NGS) technology has recently emerged as a powerful tool for generating high-throughput sequencing data and has been applied to many kinds of biological research. In a typical analysis, adaptor-trimmed NGS reads are first mapped back to reference sequences, such as genomes or transcripts. However, a fraction of the reads fails to map to these reference sequences. Such un-mappable reads are usually attributed to sequencing errors and discarded without further consideration.

Methods

We investigated the possible sources and biological relevance of un-mappable reads. To this end, we developed UMARS to scan un-mappable reads for viral genomic fragments and for exon-exon junctions of novel alternative splicing isoforms. As alignment targets, we first collected viral genomes and sequences of exon-exon junctions. We then constructed the UMARS pipeline as an automatic alignment interface.
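As an illustration of the exon-exon junction idea described above, the following minimal Python sketch builds junction sequences by joining the 3' end of one exon to the 5' start of a downstream exon, then checks which reads span a junction boundary. It is not the UMARS implementation: the function names, the flank length, the minimum overhang, and the use of exact substring matching in place of a real aligner are all assumptions made for this example.

```python
from itertools import combinations


def build_junctions(exons, flank):
    """Join the last `flank` bases of exon i to the first `flank` bases of exon j.

    `exons` is an ordered list of exon sequences from one transcript
    (hypothetical input). Every pair i < j is considered, so exon-skipping
    combinations are included. Returns {(i, j): (sequence, boundary)},
    where `boundary` is the index at which exon j begins. `flank` must be >= 1.
    """
    junctions = {}
    for i, j in combinations(range(len(exons)), 2):
        left = exons[i][-flank:]
        right = exons[j][:flank]
        junctions[(i, j)] = (left + right, len(left))
    return junctions


def spans_junction(read, junction_seq, boundary, min_overhang):
    """True if `read` matches exactly and covers at least `min_overhang`
    bases on each side of the junction boundary."""
    start = junction_seq.find(read)
    while start != -1:
        end = start + len(read)
        if start <= boundary - min_overhang and end >= boundary + min_overhang:
            return True
        start = junction_seq.find(read, start + 1)
    return False


def scan_reads(reads, junctions, min_overhang=4):
    """Map each junction key to the list of reads that span it."""
    hits = {}
    for read in reads:
        for key, (seq, boundary) in junctions.items():
            if spans_junction(read, seq, boundary, min_overhang):
                hits.setdefault(key, []).append(read)
    return hits


# Toy data, invented for illustration only.
demo_exons = ["AAAACCCC", "GGGGTTTT", "ACGTACGT"]
demo_reads = ["CCCCACGT", "AACCCC", "TTTTACGT"]
demo_hits = scan_reads(demo_reads, build_junctions(demo_exons, flank=6))
print(demo_hits)
```

Requiring an overhang on both sides of the boundary keeps reads that lie entirely within one exon from being counted as junction evidence. A practical pipeline would use a mismatch-tolerant aligner rather than exact matching.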

Results

We demonstrate the applicability of UMARS with two alignment cases. First, UMARS detected the expected EBV genomic fragments. Second, it detected exon-exon junctions among un-mappable reads. Further experimental validation confirmed the authenticity of the results reported by the UMARS pipeline. The UMARS service is freely available to the academic community and can be accessed via http://musk.ibms.sinica.edu.tw/UMARS/.

Conclusions

In this study, we have shown that some un-mappable reads are not caused by sequencing errors. They can originate from viral infection or transcript splicing. Our UMARS pipeline provides another way to examine and recycle the un-mappable reads that are commonly discarded as garbage.