KIRMES: kernel-based identification of regulatory modules in euchromatic sequences

Schultheiss, Sebastian J; Busch, Wolfgang; Lohmann, Jan; Kohlbacher, Oliver; Rätsch, Gunnar

doi:10.1186/1471-2105-10-S13-O1

Volume 10 Supplement 13

Highlights from the Fifth International Society for Computational Biology (ISCB) Student Council Symposium

Oral presentation
Open access
Published: 19 October 2009


BMC Bioinformatics volume 10, Article number: O1 (2009)


Background

We predict transcription factor (TF) target genes based on their regulatory sequence. A TF binding site is a short sequence segment (~10 bp) near a gene's regulatory region that is recognized by the corresponding TF. Overrepresented motifs can be identified in the regulatory sequences of a gene set that is enriched with targets of a specific TF. Gibbs-sampling methods, which try to identify position weight matrices that characterize binding sites, have been successful for small genomes but are problematic in higher eukaryotes, where motifs are degenerate and form cis-regulatory modules [1].
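To make the notion of a position weight matrix concrete, the following is a minimal sketch of how such a matrix can be built from aligned binding sites and used to scan a regulatory sequence. It is not the procedure of any Gibbs sampler mentioned here; the function names `build_pwm` and `best_hit`, the uniform background of 0.25 per base, the pseudocount, and the toy sites are all assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned sites.

    Assumes a uniform background distribution (hypothetical choice).
    Returns one dict per motif column, mapping base -> log2 odds score.
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for i in range(width):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pwm

def best_hit(pwm, sequence):
    """Slide the matrix along the sequence; return (score, start) of the best window."""
    width = len(pwm)
    scored = [
        (sum(pwm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)

# Toy, made-up binding sites and promoter fragment (not data from the study).
toy_sites = ["TGACGT", "TGACGA", "TGACGT", "TCACGT"]
toy_pwm = build_pwm(toy_sites)
score, start = best_hit(toy_pwm, "AAATGACGTCCC")
```

A matrix like this scores each position independently, which is one reason degenerate motifs in higher eukaryotes are hard to capture with it.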

Methods

Our method classifies genes as TF targets. We use de novo motif finding and subsequently apply a Support Vector Machine with a kernel that captures information about the motifs, their relative location, and sequence conservation (see Figure 1). The weighted degree kernel with shifts (WDS) computes the similarity of fixed-length sequences. We extend this kernel with conservation information and motif co-occurrence information to obtain the Regulatory Modules kernel [2]. KIRMES is available on our Galaxy server at http://galaxy.tuebingen.mpg.de. Using positional oligomer importance matrices [3], we make the output of the kernel interpretable by displaying a sequence logo of the oligomers that contributed most to the correct classification.
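As an illustration of how a weighted degree kernel with shifts compares two fixed-length sequences, here is a minimal, unoptimized sketch. It follows the commonly published form of the kernel, with degree weights beta_d = 2(D - d + 1) / (D(D + 1)) and shift weights delta_s = 1 / (2(s + 1)); the function name `wds_kernel`, the use of one constant maximum shift for all positions, and the default parameter values are assumptions, and the sketch does not include the conservation or co-occurrence extensions of the Regulatory Modules kernel.

```python
def wds_kernel(x, y, max_degree=3, max_shift=2):
    """Weighted degree kernel with shifts for two equal-length sequences.

    Counts k-mers (k = 1..max_degree) that match at the same position or at
    positions offset by up to max_shift, down-weighting longer shifts.
    """
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    length = len(x)
    total = 0.0
    for d in range(1, max_degree + 1):
        # Degree weight: shorter k-mers receive a larger weight.
        beta = 2.0 * (max_degree - d + 1) / (max_degree * (max_degree + 1))
        for pos in range(length - d + 1):
            for s in range(max_shift + 1):
                if pos + s + d > length:
                    break  # shifted k-mer would run past the sequence end
                # Shift weight: exact positional matches count most.
                delta = 1.0 / (2 * (s + 1))
                # Compare in both directions so the kernel stays symmetric.
                matches = (x[pos + s:pos + s + d] == y[pos:pos + d]) \
                    + (x[pos:pos + d] == y[pos + s:pos + s + d])
                total += beta * delta * matches
    return total

# Toy sequences: the second is the first shifted by one base.
k_no_shift = wds_kernel("AACGT", "ACGTT", max_degree=3, max_shift=0)
k_shift = wds_kernel("AACGT", "ACGTT", max_degree=3, max_shift=1)
```

With `max_shift=0` the function reduces to the plain weighted degree kernel. Allowing a shift lets the displaced `ACGT` in the toy pair contribute to the similarity, which is the tolerance to small positional differences that the shifted variant is meant to provide.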

Results

We compared our method to a state-of-the-art Gibbs sampler, PRIORITY [4], on its own dataset and with its published settings, measuring classification success. We achieve correct predictions on 74% of their sets vs. 63% for PRIORITY. We also applied KIRMES to gene sets obtained from Arabidopsis thaliana microarray experiments. Using conservation as weighting for the WDS kernel improves performance. These results illustrate the power of our approach in exploiting the relationship between motifs as well as conservation to improve the recognition of TF targets. Interpretable results and an easy-to-use web service make this a valuable tool for any researcher interested in gene regulation.

References

1. Gupta M, Liu J: De novo cis-regulatory module elicitation for eukaryotic genomes. Proc Natl Acad Sci USA 2005, 102(20):7079–7084.
2. Schultheiss SJ, Busch W, Lohmann JU, Kohlbacher O, Rätsch G: KIRMES: Kernel-based identification of regulatory modules in euchromatic sequences. Bioinformatics 2009. epub: 23 April 2009.
3. Sonnenburg S, Zien A, Philips P, Rätsch G: POIMs: Positional Oligomer Importance Matrices – understanding support vector machine-based signal detectors. Bioinformatics 2008, 24(13):i6–14.
4. Gordan R, Narlikar L, Hartemink A: A fast, alignment-free, conservation-based method for transcription factor binding site discovery. In Lecture Notes in Computer Science: RECOMB 2008. Volume 4955. Springer, Heidelberg, Germany; 98–111.


Author information

Authors and Affiliations

Machine Learning in Biology Research Group, Friedrich Miescher Laboratory of the Max Planck Society, 72076, Tuebingen, Germany
Sebastian J Schultheiss & Gunnar Rätsch
Max Planck Institute for Developmental Biology, 72076, Tuebingen, Germany
Sebastian J Schultheiss, Wolfgang Busch & Jan Lohmann
Biology Department, Duke University, Durham, NC, 27710, USA
Wolfgang Busch
Department of Stem Cell Research, University of Heidelberg, 69120, Heidelberg, Germany
Jan Lohmann
Simulation of Biological Systems, Wilhelm Schickard Institute for Computer Science, University of Tuebingen, 72076, Tuebingen, Germany
Oliver Kohlbacher

Rights and permissions

Open Access. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Schultheiss, S.J., Busch, W., Lohmann, J. et al. KIRMES: kernel-based identification of regulatory modules in euchromatic sequences. BMC Bioinformatics 10 (Suppl 13), O1 (2009). https://doi.org/10.1186/1471-2105-10-S13-O1
