LINE FUSION GENES: a database of LINE expression in human genes
1 PBBRC, Interdisciplinary Research Program of Bioinformatics, Pusan National University, Busan 609-735, Korea
2 Division of Biological Sciences, College of Natural Sciences, Pusan National University, Busan 609-735, Korea
3 National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology, 52 Oun-dong, Yuson-gu, Daejeon 305-333, Korea
BMC Genomics 2006, 7:139. doi:10.1186/1471-2164-7-139. Published: 7 June 2006
Long Interspersed Nuclear Elements (LINEs) are the most abundant retrotransposons in humans. About 79% of human genes are estimated to contain at least one segment of LINE per transcription unit. Recent studies have shown that LINE elements can affect protein sequences, splicing patterns and expression of human genes.
We have developed a database, LINE FUSION GENES, that documents LINE expression across human genes. We searched the 28,171 genes listed in the NCBI database for LINE elements and analyzed their structures and expression patterns. The results show that the mRNA sequences of 1,329 genes were affected by LINE expression. The LINE expression types were classified according to whether the LINE lies in the 5' UTR, exon or 3' UTR sequence of the mRNA. The database also provides further information, such as the tissue distribution and chromosomal location of the genes, and the domain structure that is changed by LINE integration. All accession numbers are linked to the NCBI database so that users can retrieve the corresponding mRNA sequences.
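The positional classification described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `classify_line_overlap`, the coordinate convention (0-based, half-open positions along the mRNA), and the reading of "exon" as the coding region between the start and stop of the CDS are all assumptions made for illustration.

```python
from typing import List


def classify_line_overlap(line_start: int, line_end: int,
                          cds_start: int, cds_end: int) -> List[str]:
    """Return the mRNA regions that a LINE-derived segment overlaps.

    Hypothetical helper. All coordinates are 0-based, half-open
    positions on the mRNA: [line_start, line_end) is the LINE segment
    and [cds_start, cds_end) is the coding region (an assumed stand-in
    for the "exon" class in the text).
    """
    if line_start >= line_end or cds_start >= cds_end:
        raise ValueError("intervals must be non-empty")

    regions = []
    # Any part of the segment upstream of the coding region is 5' UTR.
    if line_start < cds_start:
        regions.append("5' UTR")
    # Standard half-open interval overlap test against the coding region.
    if line_start < cds_end and line_end > cds_start:
        regions.append("exon")
    # Any part of the segment downstream of the coding region is 3' UTR.
    if line_end > cds_end:
        regions.append("3' UTR")
    return regions


# Example with made-up coordinates: coding region at [100, 400).
print(classify_line_overlap(10, 50, 100, 400))    # segment entirely upstream
print(classify_line_overlap(80, 120, 100, 400))   # segment spans the CDS start
print(classify_line_overlap(450, 500, 100, 400))  # segment entirely downstream
```

A segment that straddles a boundary is reported in both regions, which is one possible design choice; a database could equally assign each LINE to a single class by its largest overlap.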
We believe that our work will interest genome scientists and might help them to gain insight into the implications of LINE expression for human evolution and disease.