Alu distribution and mutation types of cancer genes
1 Department of Computer Science, Xavier University of Louisiana, 1 Drexel Drive, New Orleans LA 70125, USA
2 IBM T.J. Watson Research, 19 Skyline Drive, Hawthorne NY 10532, USA
3 Tulane Cancer Center, Tulane School of Public Health and Tropical Medicine, New Orleans, Louisiana 70122, USA
BMC Genomics 2011, 12:157 doi:10.1186/1471-2164-12-157
Published: 23 March 2011
Alu elements are the most abundant retrotransposable elements, comprising ~11% of the human genome. Many studies have highlighted the role of Alu elements in genetic instability and how their contribution to a range of mutagenic events can lead to cancer. As yet, little has been done to quantitatively assess the association between Alu distribution and genes that are causally implicated in oncogenesis.
We investigated the effect of various Alu densities on the mutation-type-based classification of cancer genes. To establish a direct relationship between Alus and the cancer genes of interest, genome-wide Alu-related densities were measured using genes, rather than sliding windows of fixed length, as the units. Several novel genomic features, such as the density of adjacent Alu pairs and the number of Alu-Exon-Alu triplets, were developed to extend the investigation through multivariate statistical analysis and to gain further biological insight. In addition, we characterized the genome-wide intron Alu distribution with a mixture model that distinguished genes containing Alu elements from those with no Alus, and evaluated the gene-level effect of the 5'-TTAAAA motif associated with Alu insertion sites using a two-step regression analysis method.
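As an illustration of gene-level features of this kind, the following minimal Python sketch computes an Alu density per gene and counts adjacent Alu pairs and Alu-Exon-Alu triplets. The exact definitions used in the study are not given here, so the sketch makes its own assumptions: features are (start, end) coordinate pairs within one gene, density is the number of Alus per kilobase of gene length, and a pair or triplet is a run of consecutive features when Alus and exons are ordered by start position. All function names are hypothetical.

```python
def alu_density_per_kb(alus, gene_start, gene_end):
    """Number of Alus lying fully inside the gene, per kilobase of gene length.

    Assumption: density is count-based; a base-coverage definition is equally plausible.
    """
    inside = [a for a in alus if gene_start <= a[0] and a[1] <= gene_end]
    return 1000.0 * len(inside) / (gene_end - gene_start)


def _ordered_labels(alus, exons):
    """Labels ('A' for Alu, 'E' for exon) of all features sorted by start coordinate."""
    features = [(s, e, "A") for s, e in alus] + [(s, e, "E") for s, e in exons]
    return [label for _, _, label in sorted(features)]


def count_adjacent_alu_pairs(alus, exons):
    """Count consecutive Alu, Alu runs (assumed definition of an adjacent Alu pair)."""
    labels = _ordered_labels(alus, exons)
    return sum(1 for i in range(len(labels) - 1) if labels[i:i + 2] == ["A", "A"])


def count_alu_exon_alu_triplets(alus, exons):
    """Count exons whose immediate neighbours on both sides are Alus (assumed definition)."""
    labels = _ordered_labels(alus, exons)
    return sum(1 for i in range(len(labels) - 2) if labels[i:i + 3] == ["A", "E", "A"])


# Toy gene spanning 10 kb with four Alus and two exons (made-up coordinates).
toy_alus = [(100, 400), (1500, 1800), (5000, 5300), (5400, 5700)]
toy_exons = [(500, 700), (2000, 2200)]
density = alu_density_per_kb(toy_alus, 0, 10000)
pairs = count_adjacent_alu_pairs(toy_alus, toy_exons)
triplets = count_alu_exon_alu_triplets(toy_alus, toy_exons)
```

Using the gene rather than a fixed-length window as the unit means each value can be attached directly to a gene's cancer classification, which is the motivation stated above.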
The study resulted in several novel findings worthy of further investigation: (1) recessive cancer genes (tumor suppressor genes) are enriched with Alu elements (p < 0.01) compared to dominant cancer genes (oncogenes) and to the entire set of genes in the human genome; (2) Alu-related genomic features can be used to cluster cancer genes into biologically meaningful groups; (3) the retention of exon Alus has been restricted during the development of the human genome, and the distribution profile suggests an upper limit to the chromosome-level exon Alu densities; (4) within individual chromosomes, the intron Alu densities of genes with at least one intron Alu repeat are well fitted by a Gamma distribution; (5) the effect of the 5'-TTAAAA motif on Alu densities varies across chromosomes.
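To make the mixture-plus-Gamma description in finding (4) concrete, the sketch below fits a two-part model to a list of per-gene intron Alu densities: a point mass at zero for genes with no intron Alus, and a Gamma distribution for the remaining genes. The estimation procedure used in the study is not stated here; this example substitutes a simple method-of-moments fit (shape = mean^2 / variance, scale = variance / mean), and the function name and input values are hypothetical.

```python
from statistics import fmean, pvariance


def fit_zero_gamma_mixture(densities):
    """Return (zero_weight, shape, scale) for a zero / Gamma two-part model.

    zero_weight is the fraction of genes with density 0; shape and scale are
    method-of-moments Gamma estimates from the positive densities only.
    """
    positives = [d for d in densities if d > 0]
    if len(set(positives)) < 2:
        raise ValueError("need at least two distinct positive densities")
    zero_weight = 1.0 - len(positives) / len(densities)
    mean = fmean(positives)
    var = pvariance(positives, mu=mean)  # population variance of the positive part
    shape = mean * mean / var
    scale = var / mean
    return zero_weight, shape, scale


# Made-up densities for six genes on one chromosome; two genes have no intron Alus.
zero_weight, shape, scale = fit_zero_gamma_mixture([0.0, 0.0, 1.0, 2.0, 3.0, 4.0])
```

A maximum-likelihood fit would generally be preferable for real data; the moment estimates are shown only because they need nothing beyond the standard library.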