Bio++: a set of C++ libraries for sequence analysis, phylogenetics, molecular evolution and population genetics
1 CNRS UMR 5171 – Génome, Populations, Interactions, Adaptation (GPIA), Université Montpellier 2, France
2 CNRS UMR 5554 – Institut des Sciences de l'Evolution de Montpellier (ISE-M), Université Montpellier 2, France
BMC Bioinformatics 2006, 7:188 doi:10.1186/1471-2105-7-188
Published: 4 April 2006
Many bioinformatics applications in bio-sequence analysis, molecular evolution and population genetics share input/output methods, data storage requirements and data analysis algorithms. Such common features can be conveniently bundled into reusable libraries, which enable the rapid development of new methods and robust applications.
We present Bio++, a set of object-oriented libraries written in C++. Available components include classes for data storage and handling (nucleotide/amino-acid/codon sequences, trees, distance matrices, population genetics datasets), various input/output formats, basic sequence manipulation (concatenation, transcription, translation, etc.), phylogenetic analysis (maximum parsimony, Markov models, distance methods, likelihood computation and maximization), population genetics/genomics (diversity statistics, neutrality tests, various multi-locus analyses) and various algorithms for numerical calculus.
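To give a concrete sense of what "basic sequence manipulation" covers, here is a minimal standalone sketch of transcription and reverse complementation on plain strings. The function names `transcribe`, `complement` and `reverseComplement` are hypothetical and chosen for illustration only; they are not the Bio++ API, which works on dedicated sequence and alphabet classes. The sketch assumes upper-case, unambiguous DNA (A, C, G, T).

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>

// Hypothetical helpers for illustration only; these are NOT Bio++ functions.

// Transcribe a DNA coding strand into RNA by replacing every T with U.
std::string transcribe(std::string dna) {
    std::replace(dna.begin(), dna.end(), 'T', 'U');
    return dna;
}

// Return the Watson-Crick complement of one upper-case nucleotide.
// Assumption: only A, C, G and T are accepted; anything else throws.
char complement(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'C': return 'G';
        case 'G': return 'C';
        default:  throw std::invalid_argument("unsupported nucleotide");
    }
}

// Reverse the sequence, then complement each position.
std::string reverseComplement(const std::string& dna) {
    std::string out(dna.rbegin(), dna.rend());
    std::transform(out.begin(), out.end(), out.begin(), complement);
    return out;
}
```

With these definitions, `transcribe("ATGC")` yields `"AUGC"` and `reverseComplement("ATGC")` yields `"GCAT"`. A library implementation would additionally have to handle ambiguity codes, gaps and lower-case input, which this sketch deliberately leaves out.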
The implementation aims to be both efficient and user-friendly. Special attention was given to the library design to make extension and the development of new methods easy. We defined a general hierarchy of classes that allows developers to implement their own algorithms while remaining compatible with the rest of the libraries. Bio++ source code is distributed free of charge under the CeCILL general public licence from its website http://kimura.univ-montp2.fr/BioPP.
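The extensibility goal described above can be illustrated with a small sketch of the general pattern: an abstract interface, one implementation supplied by a library, one added by a user, and a generic routine that accepts either. All names (`SequenceStatistic`, `GcContent`, `SequenceLength`, `applyToAll`) are hypothetical and are not taken from Bio++; the actual class hierarchy is documented on the project website.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// All names below are hypothetical and only illustrate the design pattern;
// they are NOT part of the Bio++ API.

// Abstract interface: any per-sequence statistic.
class SequenceStatistic {
public:
    virtual ~SequenceStatistic() = default;
    virtual double compute(const std::string& sequence) const = 0;
};

// An implementation that a library might ship: fraction of G and C bases.
class GcContent : public SequenceStatistic {
public:
    double compute(const std::string& sequence) const override {
        if (sequence.empty()) return 0.0;  // avoid division by zero
        std::size_t gc = 0;
        for (char base : sequence) {
            if (base == 'G' || base == 'C') ++gc;
        }
        return static_cast<double>(gc) / static_cast<double>(sequence.size());
    }
};

// A user-defined extension: it only needs to implement the interface.
class SequenceLength : public SequenceStatistic {
public:
    double compute(const std::string& sequence) const override {
        return static_cast<double>(sequence.size());
    }
};

// Generic routine written once against the interface; it works unchanged
// with library-provided and user-provided statistics alike.
std::vector<double> applyToAll(const std::vector<std::string>& sequences,
                               const SequenceStatistic& statistic) {
    std::vector<double> results;
    results.reserve(sequences.size());
    for (const std::string& s : sequences) {
        results.push_back(statistic.compute(s));
    }
    return results;
}
```

Because `applyToAll` depends only on the abstract interface, a newly written statistic can be passed to it without modifying existing code, which is the kind of compatibility the class hierarchy is meant to provide.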