LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms
1 Bioinformatics Research Centre (BIRC), School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
2 Institute for Infocomm Research (I2R), A*STAR (Agency for Science, Technology, and Research), 1 Fusionopolis, Singapore 138632, Singapore
3 NCBI, NLM, National Institutes of Health, 8600 Rockville Pike, Bethesda, Maryland 20894, USA
4 Genome Institute of Singapore, A*STAR, Biopolis, Singapore 138672, Singapore
BMC Bioinformatics 2014, 15:48 doi:10.1186/1471-2105-15-48Published: 17 February 2014
As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs.
Recently, an algorithm called “LDsplit” has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified.
LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. It is the first computational method that utilizes the genetic variation of recombination hotspots among individuals, opening a new avenue for motif finding. Tested on an established motif and simulated datasets, LDsplit shows promise to discover novel DNA motifs for meiotic recombination hotspots.