Identification of exonic regions in DNA sequences using cross-correlation and noise suppression by discrete wavelet transform
1 School of Engineering-Emerging Technologies, University of Tabriz, Tabriz 5166614761, Iran
2 Photonics and Nanocrystals Research Lab. (PNRL), Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz 5166614761, Iran
3 Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz 5166614761, Iran
BMC Bioinformatics 2011, 12:430 doi:10.1186/1471-2105-12-430
Published: 3 November 2011
The identification of protein coding regions (exons) in DNA sequences using signal processing techniques is an important component of bioinformatics and biological signal processing. In this paper, a new method is presented for the identification of exonic regions in DNA sequences. The method is based on cross-correlation, a technique that can identify periodic regions in DNA sequences.
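As a rough illustration of how cross-correlation can expose periodicity, the sketch below correlates a binary indicator sequence for one nucleotide with a period-3 impulse train at each of the three possible phase shifts. This is not the paper's algorithm: the indicator mapping, the period-3 template, the normalisation, and the names `indicator` and `period3_score` are all assumptions made for the example.

```python
from math import sqrt

def indicator(seq, base):
    """Binary indicator sequence: 1.0 where `base` occurs, else 0.0."""
    return [1.0 if ch == base else 0.0 for ch in seq.upper()]

def period3_score(signal):
    """Largest normalised cross-correlation between `signal` and a
    period-3 impulse train, taken over the three phase shifts.
    (Hypothetical helper for illustration only.)"""
    n = len(signal)
    if n == 0:
        return 0.0
    mean = sum(signal) / n
    centred = [x - mean for x in signal]
    energy = sum(c * c for c in centred)
    best = 0.0
    for shift in range(3):
        # Reference signal: an impulse at every third position.
        template = [1.0 if (i - shift) % 3 == 0 else 0.0 for i in range(n)]
        t_mean = sum(template) / n
        t_centred = [t - t_mean for t in template]
        t_energy = sum(t * t for t in t_centred)
        denom = sqrt(energy * t_energy)
        if denom > 0.0:
            num = sum(c * t for c, t in zip(centred, t_centred))
            best = max(best, num / denom)
    return best

periodic = period3_score(indicator("ATG" * 10, "A"))    # toy period-3 input
aperiodic = period3_score(indicator("AACC" * 6, "A"))   # toy period-4 input
print(periodic, aperiodic)
```

On these toy inputs the period-3 sequence scores close to 1 and the period-4 sequence close to 0. On real sequences such a score would be computed over sliding windows, which is where the choice of window length enters.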
The method reduces the dependence of identification accuracy on window length. The proposed algorithm was applied to several eukaryotic datasets, and the results were compared with those of other established methods. The proposed method increased the accuracy of exon detection by 4% to 41% relative to the most common digital signal processing methods for exon prediction.
We demonstrated that periodic signals can be estimated using cross-correlation. In addition, discrete wavelet transform (DWT) can minimise noise while maintaining the signal. The proposed algorithm, which combines cross-correlation and DWT, significantly increases the accuracy of exonic region identification.
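The noise-suppression idea can be sketched with a single-level Haar DWT: decompose the signal into approximation and detail coefficients, shrink the small detail coefficients, and reconstruct. The Haar wavelet, the single decomposition level, soft thresholding, and the threshold value are assumptions chosen for brevity; the abstract does not state which wavelet or thresholding rule the authors used.

```python
from math import sqrt

def haar_dwt(x):
    """Single-level Haar DWT of an even-length sequence."""
    if len(x) % 2 != 0:
        raise ValueError("input length must be even")
    s = sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def soft_threshold(coeffs, t):
    """Shrink each coefficient towards zero by t; zero those below t."""
    return [(abs(c) - t) * (1.0 if c > 0 else -1.0) if abs(c) > t else 0.0
            for c in coeffs]

def denoise(x, t):
    """Threshold the detail coefficients only, then reconstruct."""
    approx, detail = haar_dwt(x)
    return haar_idwt(approx, soft_threshold(detail, t))

noisy = [1.0, 1.2, 1.0, 0.8]          # toy values, not data from the paper
smoothed = denoise(noisy, t=1.0)      # threshold is an arbitrary choice
print(smoothed)
```

With the threshold set to 0 the reconstruction returns the input unchanged, which is a quick check that the transform pair is consistent. Larger thresholds remove more of the fine-scale fluctuation while the approximation coefficients keep the slowly varying part of the signal.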