Open Access Research article

Wavelet-based identification of DNA focal genomic aberrations from single nucleotide polymorphism arrays

Youngmi Hur1 and Hyunju Lee2*

Author Affiliations

1 Dept of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, USA

2 Dept of Information and Communications, Gwangju Institute of Science and Technology, Gwangju, South Korea

BMC Bioinformatics 2011, 12:146  doi:10.1186/1471-2105-12-146

Published: 11 May 2011

Abstract

Background

Copy number aberrations (CNAs) are an important molecular signature in cancer initiation, development, and progression. However, these aberrations often span broad chromosomal regions, making it hard to distinguish cancer-related genes from genes that are not closely related to cancer but happen to lie in broadly aberrant regions. With high-resolution data sets such as single nucleotide polymorphism (SNP) microarrays now available, it has become important to develop computational methods that detect driver genes related to cancer development located in the focal regions of CNAs.

Results

In this study, we introduce a novel method referred to as the wavelet-based identification of focal genomic aberrations (WIFA). Because wavelet analysis is a multi-resolution approach, it makes it possible to effectively identify focal genomic aberrations within broadly aberrant regions. The proposed method integrates multiple cancer samples, which enables the detection of aberrations that are consistent across samples. We apply this method to glioblastoma multiforme and lung cancer data sets from the SNP microarray platform, and confirm that it detects previously known cancer-related genes in both cancer types with high accuracy. In addition, applying this approach to a lung cancer data set identifies focal amplification regions that contain known oncogenes but are not reported by GISTIC, a recent CNA detection algorithm: SMAD7 (chr18q21.1) and FGF10 (chr5p12).
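To illustrate why a multi-resolution view helps, the following is a minimal sketch, not the WIFA algorithm itself. It assumes a single synthetic copy-number profile, an unnormalized Haar approximation, and a hypothetical threshold of 1.0; the function names and all numeric values are invented for illustration. The coarse approximation captures the broad gain, and the residual at finer scales isolates the focal peak sitting on top of it.

```python
def haar_approximation(signal, levels):
    """Coarse Haar approximation: the mean over blocks of 2**levels probes.

    Assumes len(signal) is divisible by 2**levels.
    """
    approx = list(signal)
    for _ in range(levels):
        # One Haar step: replace each adjacent pair by its average.
        approx = [(approx[i] + approx[i + 1]) / 2
                  for i in range(0, len(approx), 2)]
    block = 2 ** levels
    # Upsample back to probe resolution by repeating each block mean.
    return [a for a in approx for _ in range(block)]


def split_broad_focal(signal, levels):
    """Split a profile into a broad (coarse) part and a focal (fine) residual."""
    broad = haar_approximation(signal, levels)
    focal = [s - b for s, b in zip(signal, broad)]
    return broad, focal


# Synthetic profile of 64 probes (illustrative values only):
# a broad gain of 0.5 over probes 16..47, plus a focal gain of 1.5
# on probes 28 and 29 inside that broad region.
profile = [0.0] * 64
for i in range(16, 48):
    profile[i] += 0.5
for i in (28, 29):
    profile[i] += 1.5

broad, focal = split_broad_focal(profile, levels=3)

# Hypothetical threshold on the fine-scale residual.
THRESHOLD = 1.0
focal_probes = [i for i, v in enumerate(focal) if v > THRESHOLD]
print(focal_probes)
```

A threshold applied to the raw profile would have to be chosen relative to the broad gain, whereas the residual measures each probe against its local coarse-scale background. Integrating several samples, as the method described above does, is not covered by this sketch.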

Conclusions

Our results suggest that WIFA can be used to reveal cancer-related genes in various cancer data sets.