Research article

Combining calls from multiple somatic mutation-callers

Su Yeon Kim1*, Laurent Jacob2 and Terence P Speed1,3*

Author Affiliations

1 Department of Statistics, University of California at Berkeley, Berkeley CA 94720, USA

2 Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Lyon 1, CNRS, INRA, UMR5558 Villeurbanne, France

3 Walter and Eliza Hall Institute of Medical Research and the University of Melbourne, Parkville, Victoria, Australia

BMC Bioinformatics 2014, 15:154. doi:10.1186/1471-2105-15-154

Published: 21 May 2014

Abstract

Background

Accurate somatic mutation-calling is essential for insightful mutation analyses in cancer studies. Several mutation-callers are publicly available and more are likely to appear. Nonetheless, mutation-calling is still challenging and there is unlikely to be one established caller that systematically outperforms all others. Therefore, fully utilizing multiple callers can be a powerful way to construct a list of final calls for one’s research.

Results

Using a set of mutations from multiple callers that are impartially validated, we present a statistical approach for building a combined caller, which can be applied to combine calls in a wider dataset generated using a similar protocol. Using the mutation outputs and the validation data from The Cancer Genome Atlas endometrial study (6,746 sites), we demonstrate how to build a statistical model that predicts the probability of each call being a somatic mutation, based on the detection status of multiple callers and a few associated features.
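The idea of predicting a per-site probability from the detection status of several callers can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three callers, their sensitivities and false-positive rates, and the simulated validation labels are hypothetical, and plain logistic regression stands in for whatever model form the paper actually uses. The associated features mentioned above are left out, so the only inputs are 0/1 detection flags.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_validated_sites(n, seed=0):
    """Simulate n candidate sites with 0/1 calls from three hypothetical callers.

    The sensitivities and false-positive rates are invented for this sketch.
    Only sites reported by at least one caller are kept, mimicking a
    validation set assembled from caller outputs.
    """
    rng = random.Random(seed)
    sensitivity = [0.9, 0.7, 0.8]      # assumed values, not from the paper
    false_pos_rate = [0.3, 0.1, 0.2]   # assumed values, not from the paper
    X, y = [], []
    while len(X) < n:
        truth = 1 if rng.random() < 0.5 else 0
        calls = [
            1 if rng.random() < (s if truth else f) else 0
            for s, f in zip(sensitivity, false_pos_rate)
        ]
        if sum(calls) == 0:
            continue  # a site no caller reports never reaches validation
        X.append(calls)
        y.append(truth)
    return X, y


def predict_proba(w, x):
    """Probability that a site is a true somatic mutation; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Train on the simulated validation set, then score two detection patterns.
X, y = simulate_validated_sites(400)
w = fit_logistic(X, y)
p_all_three = predict_proba(w, [1, 1, 1])   # reported by every caller
p_first_only = predict_proba(w, [1, 0, 0])  # reported by the first caller only
print(f"P(somatic | all three callers) = {p_all_three:.3f}")
print(f"P(somatic | first caller only) = {p_first_only:.3f}")
```

Thresholding the predicted probability at different cut-offs yields call lists of varying stringency from the same fitted model, which is what lets a single combined caller be compared against each individual caller at its own operating point.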

Conclusion

The approach allows us to build a combined caller across the full range of stringency levels, which outperforms all of the individual callers.

Keywords:
Cancer genome; Somatic mutation-calling; Combining calls; Stacking