This article is part of the supplement: Selected Proceedings of the 2010 AMIA Summit on Translational Bioinformatics
Assessment of genetic variation for the LINE-1 retrotransposon from next generation sequence data
1 Computer Engineering and Computer Science Department, Speed School of Engineering, University of Louisville, Louisville, KY 40292, USA
2 Biochemistry and Molecular Biology Department, School of Medicine, University of Louisville, Louisville, KY 40292, USA
BMC Bioinformatics 2010, 11(Suppl 9):S12 doi:10.1186/1471-2105-11-S9-S12
Published: 28 October 2010
In humans, copies of the Long Interspersed Nuclear Element 1 (LINE-1) retrotransposon comprise 21% of the reference genome, and have been shown to modulate expression and produce novel splice isoforms of transcripts from genes that span or neighbor the LINE-1 insertion site.
In this work, newly released pilot data from the 1000 Genomes Project are analyzed to detect previously unreported full-length insertions of the LINE-1 retrotransposon. By direct analysis of the sequence data, we identified 22 previously unreported LINE-1 insertion sites in the data reported for a mother/father/daughter trio.
We demonstrate that next-generation sequencing data, together with emerging high-quality datasets from individual genome projects, allow us to assess how much humans differ from one another with respect to the LINE-1 retrotransposon, and provide a wealth of testable hypotheses about the impact this diversity may have on the health of individuals and populations.