Open Access Highly Accessed Software

inTB - a data integration platform for molecular and clinical epidemiological analysis of tuberculosis

Patrícia Soares1, Renato J Alves1, Ana B Abecasis1,2, Carlos Penha-Gonçalves1, M Gabriela M Gomes1 and José B Pereira-Leal1*

Author Affiliations

1 Instituto Gulbenkian de Ciência, Rua da Quinta Grande 6, Apartado 14, Oeiras P-2781-901, Portugal

2 Present address: Instituto de Higiene e Medicina Tropical, Lisbon, Portugal

BMC Bioinformatics 2013, 14:264  doi:10.1186/1471-2105-14-264

Published: 30 August 2013

Abstract

Background

Tuberculosis is currently the second highest cause of death from infectious diseases worldwide. The emergence of multidrug and extensive drug resistance is threatening to make tuberculosis incurable. There is growing evidence that the genetic diversity of Mycobacterium tuberculosis may have important clinical consequences. Therefore, combining genetic, clinical and socio-demographic data is critical to understanding the epidemiology of this infectious disease, and how virulence and other phenotypic traits evolve over time. This requires dedicated bioinformatics platforms capable of integrating these heterogeneous data and enabling their analysis.

Results

We developed inTB, a web-based system for integrated warehousing and analysis of clinical, socio-demographic and molecular data for Mycobacterium sp. isolates. As a database, it can organize and display data from any of the standard genotyping methods (SNP, MIRU-VNTR, RFLP and spoligotype), as well as an extensive array of clinical and socio-demographic variables that are used in multiple countries to characterize the disease. Through the inTB interface it is possible to insert and download data, browse the database and search for specific parameters. New isolates are automatically classified into strains according to an internal reference, and data that are uploaded or typed in are checked for internal consistency. As an analysis framework, the system provides simple point-and-click analysis tools that allow multiple types of data plotting, as well as simple ways to download data for external analysis. Individual trees for each genotyping method are available, as well as a super tree combining all of them. The integrative nature of inTB allows the user to generate trees for filtered subsets of data that combine molecular and clinical/socio-demographic information. inTB is built on open source software, can be easily installed locally and can be easily adapted to other diseases. Its design allows for use by research laboratories, hospitals or public health authorities. The full source code, as well as ready-to-use packages, is available at http://www.evocell.org/inTB.
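To make the idea of consistency checking and reference-based classification concrete, the following is a minimal sketch in Python. It is not inTB's actual code: the function names, the mismatch threshold and the toy reference table are assumptions made for illustration. It assumes a spoligotype is stored as a 43-character binary string (one character per spacer), checks that format, and then assigns the label of the closest reference pattern by Hamming distance.

```python
from typing import Optional

SPACERS = 43  # a standard spoligotype records presence/absence of 43 spacers

# Toy reference table, for illustration only (NOT inTB's internal reference).
# The "Beijing-like" entry follows the commonly described pattern of spacers
# 1-34 absent and 35-43 present; the second entry is purely hypothetical.
REFERENCE = {
    "Beijing-like": "0" * 34 + "1" * 9,
    "all-present (hypothetical)": "1" * SPACERS,
}


def validate_spoligotype(pattern: str) -> str:
    """Return the trimmed pattern, or raise ValueError if it is malformed."""
    cleaned = pattern.strip()
    if len(cleaned) != SPACERS:
        raise ValueError(f"expected {SPACERS} spacers, got {len(cleaned)}")
    if set(cleaned) - {"0", "1"}:
        raise ValueError("pattern may only contain '0' and '1'")
    return cleaned


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(pattern: str, max_mismatches: int = 0) -> Optional[str]:
    """Return the closest reference label, or None if nothing is close enough."""
    cleaned = validate_spoligotype(pattern)
    best_label, best_dist = None, SPACERS + 1
    for label, ref in REFERENCE.items():
        dist = hamming(cleaned, ref)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= max_mismatches else None
```

With the default threshold of zero mismatches, only exact matches are labelled; raising `max_mismatches` tolerates a few differing spacers. A real system would likely rely on a curated reference and on lineage-specific rules rather than on plain distance.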

Conclusions

To the best of our knowledge, this is the only system capable of integrating different types of molecular data with clinical and socio-demographic data, providing researchers and clinicians with easy-to-use analysis tools that were not previously available.