Open Access Research article

Swiftly Computing Center Strings

Franziska Hufsky1,2*, Léon Kuchenbecker3, Katharina Jahn3, Jens Stoye3 and Sebastian Böcker1

Author Affiliations

1 Lehrstuhl für Bioinformatik, Friedrich-Schiller-Universität Jena, Ernst-Abbe-Platz 2, Jena, Germany

2 Max Planck Institute for Chemical Ecology, Beutenberg Campus, Jena, Germany

3 AG Genominformatik, Technische Fakultät, Universität Bielefeld, Bielefeld, Germany


BMC Bioinformatics 2011, 12:106  doi:10.1186/1471-2105-12-106

Published: 19 April 2011



The center string (or closest string) problem is a classic computer science problem with important applications in computational biology. Given k input strings and a distance threshold d, we search for a string whose Hamming distance to each input string is at most d. This problem is NP-complete.
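
To make the problem statement concrete, the following minimal Python sketch checks whether a candidate is a center string and, for very small inputs only, finds one by exhaustive enumeration. The function names (`hamming`, `is_center`, `brute_force_center`) are illustrative assumptions and are not taken from the paper; the enumeration takes exponential time and is not one of the algorithms studied here.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_center(candidate, strings, d):
    """True if candidate is within Hamming distance d of every input string."""
    return all(hamming(candidate, s) <= d for s in strings)


def brute_force_center(strings, d):
    """Return some center string for threshold d, or None if none exists.

    At position i it suffices to try the characters that occur in column i:
    replacing any other character by one from the column never increases
    a distance. Hypothetical helper, feasible for toy inputs only.
    """
    columns = [sorted(set(column)) for column in zip(*strings)]
    for chars in product(*columns):
        candidate = "".join(chars)
        if is_center(candidate, strings, d):
            return candidate
    return None
```

For the inputs `ACGT`, `ACGA` and `TCGT`, no center string exists for d = 0 because the inputs differ, while `ACGT` is a center string for d = 1.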


In this paper, we focus on exact methods for the problem that are also swift in application. We first introduce data reduction techniques that allow us to infer that certain instances have no solution, or that a center string must satisfy certain conditions. We describe how to use this information to speed up two previously published search tree algorithms. Then, we describe a novel iterative search strategy that is efficient in practice and to which some of our reduction techniques can also be applied. Finally, we present results of an evaluation study for two different data sets from a biological application.
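
As an illustration of what data reduction can look like for this problem, the sketch below implements two standard observations; they are assumptions chosen for the example and may differ from the techniques developed in the paper. First, if two input strings differ in more than 2d positions, the triangle inequality rules out any center string. Second, at a position where all input strings agree, a center string can safely use the shared character. The function names are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def rejects_instance(strings, d):
    """True if some pair of inputs differs in more than 2*d positions.

    By the triangle inequality, no string can then be within distance d
    of both members of that pair, so the instance has no solution.
    """
    return any(hamming(a, b) > 2 * d for a, b in combinations(strings, 2))


def split_positions(strings):
    """Separate unanimous columns from columns that still need a decision.

    Returns (fixed, undecided): fixed maps a position to the character all
    inputs share there; undecided lists the remaining positions.
    """
    fixed, undecided = {}, []
    for i, column in enumerate(zip(*strings)):
        if len(set(column)) == 1:
            fixed[i] = column[0]
        else:
            undecided.append(i)
    return fixed, undecided
```

A search algorithm would then only need to branch on the undecided positions, and could skip the search entirely whenever `rejects_instance` returns true.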


We find that the running time for computing the optimal center string is dominated by the subroutine calls for d = d_opt - 1 and d = d_opt. Our data reduction is very effective for both, either rejecting unsolvable instances or solving trivial positions. We find that this speeds up computations considerably.
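
The role of the calls for d = d_opt - 1 and d = d_opt can be sketched as follows, assuming that the optimal threshold is found by calling a decision procedure for increasing d until it first succeeds. The decision procedure used here is a brute-force stand-in, and the names `decide` and `optimal_center` are hypothetical; the paper's search tree algorithms would take the place of `decide`.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def decide(strings, d):
    """Brute-force stand-in for a decision procedure (toy inputs only).

    Returns a center string for threshold d, or None if none exists.
    """
    columns = [sorted(set(column)) for column in zip(*strings)]
    for chars in product(*columns):
        candidate = "".join(chars)
        if all(hamming(candidate, s) <= d for s in strings):
            return candidate
    return None


def optimal_center(strings):
    """Return (d_opt, center) by trying increasing thresholds d.

    The search starts at half the largest pairwise distance (rounded up),
    since no smaller d can be feasible. The last failing call is the one
    for d_opt - 1, and the first successful call is the one for d_opt.
    """
    largest = max(
        (hamming(a, b) for a, b in combinations(strings, 2)), default=0
    )
    length = len(strings[0])
    for d in range((largest + 1) // 2, length + 1):
        center = decide(strings, d)
        if center is not None:
            return d, center
    raise AssertionError("unreachable: d = length always admits a center")
```

For the inputs `AA`, `CC` and `GG`, the lower bound is d = 1, the call for d = 1 fails, and the call for d = 2 succeeds, so d_opt = 2.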