Open Access Research article

Annotating novel genes by integrating synthetic lethals and genomic information

Daniel Schöner1,5*, Markus Kalisch2, Christian Leisner3, Lukas Meier2, Marc Sohrmann3,5, Mahamadou Faty4, Yves Barral3, Matthias Peter3,5, Wilhelm Gruissem1,5 and Peter Bühlmann2,5

Author Affiliations

1 Institute of Plant Science, ETH Zurich, Universitaetsstr. 2, 8092 Zurich, Switzerland

2 Seminar for Statistics, ETH Zurich, Leonhardstr. 27, 8092 Zurich, Switzerland

3 Institute of Biochemistry, ETH Zurich, Schafmattstr. 18, 8093 Zurich, Switzerland

4 Friedrich Miescher Institute, Maulbeerstrasse 66, Basel, Switzerland

5 Competence Center for Systems Physiology and Metabolic Diseases (CC-SPMD), Zurich, Switzerland


BMC Systems Biology 2008, 2:3  doi:10.1186/1752-0509-2-3

Published: 14 January 2008

Abstract

Background

Large-scale screening for synthetic lethality is a common tool in yeast genetics for systematically searching for genes that play a role in specific biological processes. The amount of data from a single large-scale screen often far exceeds the capacity to characterize every identified target experimentally. Thus, there is a need for computational tools that select promising candidate genes and so reduce the number of follow-up experiments to a manageable size.

Results

We analyze synthetic lethality data for arp1 and jnm1, two spindle migration genes, to identify novel members of this process. To this end, we use an unsupervised statistical method that integrates additional information from biological data sources, such as gene expression, phenotypic profiling, RNA degradation and sequence similarity. Unlike existing methods that require large amounts of synthetic lethal data, our method relies only on synthetic lethality information from two single screens. Using a Multivariate Gaussian Mixture Model, we determine the best subset of features that assigns the target genes to two groups. The approach identifies a small group of genes as candidates involved in spindle migration. Experimental testing confirms the majority of our candidates, and we present she1 (YBL031W) as a novel gene involved in spindle migration. As a second example, we also applied the statistical methodology to TOR2 signaling.
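To make the two-group idea concrete, the following is a minimal sketch of fitting a two-component Gaussian mixture to a gene-by-feature matrix and reading off group membership from the posterior probabilities. It is not the authors' implementation. The function name `fit_two_group_mixture`, the diagonal-covariance simplification, the deterministic initialization and the synthetic data are all assumptions made for illustration, and the search over feature subsets mentioned above is not shown.

```python
import math
import random


def fit_two_group_mixture(X, n_iter=100, var_floor=1e-6):
    """EM for a two-component Gaussian mixture with diagonal covariances.

    X is a list of feature vectors, one per gene (hypothetical layout).
    Returns, for each gene, the posterior probability of component 1.
    """
    n, d = len(X), len(X[0])
    # Deterministic start: rows with the smallest and largest feature sums.
    order = sorted(range(n), key=lambda i: sum(X[i]))
    means = [list(X[order[0]]), list(X[order[-1]])]
    # Both components start from the pooled per-feature variance.
    pooled_mean = [sum(row[j] for row in X) / n for j in range(d)]
    pooled_var = [
        max(sum((row[j] - pooled_mean[j]) ** 2 for row in X) / n, var_floor)
        for j in range(d)
    ]
    variances = [pooled_var[:], pooled_var[:]]
    weights = [0.5, 0.5]

    def log_density(x, k):
        # log( weight_k * N(x | mean_k, diag(var_k)) )
        s = math.log(weights[k])
        for j in range(d):
            v = variances[k][j]
            s -= 0.5 * (math.log(2 * math.pi * v) + (x[j] - means[k][j]) ** 2 / v)
        return s

    resp = [0.5] * n
    for _ in range(n_iter):
        # E-step: posterior probability of component 1 for every gene.
        for i, x in enumerate(X):
            a, b = log_density(x, 0), log_density(x, 1)
            m = max(a, b)  # subtract the max for numerical stability
            ea, eb = math.exp(a - m), math.exp(b - m)
            resp[i] = eb / (ea + eb)
        # M-step: re-estimate weight, mean and variance of each component.
        for k in (0, 1):
            r = [resp[i] if k == 1 else 1.0 - resp[i] for i in range(n)]
            total = sum(r)
            if total < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = total / n
            for j in range(d):
                mu = sum(r[i] * X[i][j] for i in range(n)) / total
                var = sum(r[i] * (X[i][j] - mu) ** 2 for i in range(n)) / total
                means[k][j] = mu
                variances[k][j] = max(var, var_floor)
    return resp


# Synthetic stand-in data (assumption): 30 "background" genes and
# 6 "candidate" genes, each described by 3 numeric features.
random.seed(0)
background = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(30)]
candidates = [[random.gauss(4.0, 1.0) for _ in range(3)] for _ in range(6)]
X = background + candidates

posterior = fit_two_group_mixture(X)
labels = [int(p > 0.5) for p in posterior]  # 1 = smaller "candidate" group
```

In this sketch the second component is seeded from the gene with the largest feature sum, so on the synthetic data it tends to capture the small shifted group. With real screen data the features would first need to be put on comparable scales, and which component corresponds to the candidates would have to be decided from the fitted parameters.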

Conclusion

We demonstrate the general use of Multivariate Gaussian Mixture Modeling for selecting candidate genes for experimental characterization from synthetic lethality data sets. For the given example, integration of different data sources contributes to the identification of genetic interaction partners of arp1 and jnm1 that play a role in the same biological process.