This article is part of the supplement: Eleventh International Conference on Bioinformatics (InCoB2012): Computational Biology

Open Access Proceedings

Bayesian prediction of bacterial growth temperature range based on genome sequences

Dan B Jensen1*, Tammi C Vesth1, Peter F Hallin2, Anders G Pedersen1 and David W Ussery1

Author Affiliations

1 Technical University of Denmark, Center for Systems Biology, Denmark

2 Novozymes A/S, Denmark

BMC Genomics 2012, 13(Suppl 7):S3  doi:10.1186/1471-2164-13-S7-S3

Published: 13 December 2012

Additional files

Additional File 1:

A rooted tree showing the bootstrap values (*.pdf).

Format: PDF Size: 7KB

This file can be viewed with: Adobe Acrobat Reader

Open Data

Additional File 2:

16S rRNA tree dendrogram, colour-coded for all genomes (NEXUS tree file; can be viewed with a phylogenetic tree viewer such as TreeView).

Format: TXT Size: 14KB

Additional File 3:

The sequences of the members of the class-associated protein families (*.txt).

Format: TXT Size: 472KB

Additional File 4:

The likelihood of the members of the class-associated protein families (*.txt).

Format: TXT Size: 2KB

Additional File 5:

The predictive performance of the naïve Bayesian inference program, achieved when implementing a Gaussian likelihood function of the observed structural characteristics (*.txt).

Format: TXT Size: 3KB

Additional File 6:

The predictive performance of the naïve Bayesian inference program, achieved when implementing the observed protein family frequencies alone as likelihoods (*.txt).

Format: TXT Size: 2KB

Additional File 7:

The predictive performance of the naïve Bayesian inference program, achieved when combining the observed protein family frequencies with the Gaussian likelihood functions of observed structural characteristics (*.txt).

Format: TXT Size: 3KB

Additional File 8:

Means and standard deviations of the structural features of the training set, used as the basis for predictions under the assumption that the features follow a Gaussian distribution (*.doc).

Format: DOCX Size: 35KB
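
As an illustration of how per-class means and standard deviations can drive a naive Bayesian prediction under a Gaussian assumption, the following is a minimal sketch. It is not the authors' script (that is Additional File 11); the feature names, the class labels, and every number in `CLASS_STATS` and `PRIORS` are made-up placeholders, not values from Additional File 8.

```python
import math


def gaussian_log_pdf(x, mean, sd):
    """Log density of a normal distribution N(mean, sd^2) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mean) ** 2 / (2 * sd ** 2)


def posterior(features, class_stats, priors):
    """Naive Bayes posterior over classes for one genome.

    features:    {feature_name: observed value}
    class_stats: {class_name: {feature_name: (mean, sd)}}
    priors:      {class_name: prior probability}
    """
    log_scores = {}
    for cls, stats in class_stats.items():
        # Work in log space: log prior plus the sum of per-feature log likelihoods.
        log_score = math.log(priors[cls])
        for name, value in features.items():
            mean, sd = stats[name]
            log_score += gaussian_log_pdf(value, mean, sd)
        log_scores[cls] = log_score
    # Normalise with the log-sum-exp trick to avoid underflow.
    top = max(log_scores.values())
    total = sum(math.exp(s - top) for s in log_scores.values())
    return {cls: math.exp(s - top) / total for cls, s in log_scores.items()}


# Hypothetical training-set statistics: (mean, standard deviation) per feature.
CLASS_STATS = {
    "psychrophile": {"gc_content": (0.40, 0.05), "mean_protein_length": (330.0, 20.0)},
    "mesophile": {"gc_content": (0.50, 0.08), "mean_protein_length": (310.0, 20.0)},
    "thermophile": {"gc_content": (0.45, 0.06), "mean_protein_length": (290.0, 20.0)},
}
# Hypothetical flat prior over the three classes.
PRIORS = {cls: 1.0 / len(CLASS_STATS) for cls in CLASS_STATS}

if __name__ == "__main__":
    genome = {"gc_content": 0.45, "mean_protein_length": 290.0}
    print(posterior(genome, CLASS_STATS, PRIORS))
```

Working in log space keeps the product of many small densities numerically stable; whether the original script does the same is not stated here.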

Additional File 9:

The strains selected for the training set, along with their thermophilicity classification (*.txt).

Format: TXT Size: 3KB

Additional File 10:

The strains selected for the test set, along with their thermophilicity classification (*.txt).

Format: TXT Size: 1KB

Additional File 11:

The Python script used to perform the prediction-related calculations (*.py; can be read with any text viewer).

Format: PY Size: 5KB

Additional File 12:

The Python script used to calculate the probability that a genome belongs to each of the three groups (*.py; can be read with any text viewer).

Format: PY Size: 4KB

Additional File 13:

The Python script used for the combined prediction, which takes the posterior probabilities from the sequence feature-based predictions and uses them as prior probabilities in the predictions based on protein family presence (*.py; can be read with any text viewer).

Format: PY Size: 6KB
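
The chaining described for this script, where one prediction's posterior becomes the next prediction's prior, can be sketched as follows. This is an illustrative reconstruction, not the contents of Additional File 13: the function name, the family identifiers, the per-class frequencies, and the `floor` guard against zero likelihoods are all assumptions.

```python
def update_with_families(prior, present_families, family_freq, floor=1e-6):
    """Bayes update of class probabilities using protein family presence.

    prior:            {class_name: probability}, e.g. the posterior from a
                      feature-based prediction
    present_families: iterable of family identifiers found in the genome
    family_freq:      {family_id: {class_name: P(family present | class)}}
    floor:            assumed lower bound that stops a zero frequency from
                      eliminating a class outright
    """
    unnormalised = {}
    for cls, p in prior.items():
        score = p
        for fam in present_families:
            if fam not in family_freq:
                continue  # families without a likelihood entry carry no evidence
            score *= max(family_freq[fam].get(cls, 0.0), floor)
        unnormalised[cls] = score
    total = sum(unnormalised.values())
    return {cls: s / total for cls, s in unnormalised.items()}


# Hypothetical posterior from a structural-feature prediction, reused as prior.
STRUCTURAL_POSTERIOR = {"psychrophile": 0.1, "mesophile": 0.6, "thermophile": 0.3}

# Hypothetical per-class frequencies of two class-associated families.
FAMILY_FREQ = {
    "fam_A": {"psychrophile": 0.05, "mesophile": 0.10, "thermophile": 0.90},
    "fam_B": {"psychrophile": 0.20, "mesophile": 0.30, "thermophile": 0.80},
}

if __name__ == "__main__":
    combined = update_with_families(STRUCTURAL_POSTERIOR, ["fam_A", "fam_B"], FAMILY_FREQ)
    print(combined)
```

In this made-up example the structural features alone favour the mesophile class, while the two observed families shift most of the probability mass to the thermophile class.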

Additional File 14:

The Python script used to evaluate the predictions for each class individually by treating the three classes as two: the class the genome belongs to, and the class it does not belong to (*.py; can be read with any text viewer).

Format: PY Size: 3KB
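
Collapsing three classes into "target" versus "everything else" can be sketched as below. This is a hedged illustration rather than the script in Additional File 14: the label lists are invented, and sensitivity and specificity are assumed here as example metrics, since the listing does not say which measures the authors computed.

```python
def one_vs_rest_counts(true_labels, predicted_labels, target):
    """Confusion counts for one class, with all other classes merged into 'rest'."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == target and pred == target:
            tp += 1
        elif truth != target and pred == target:
            fp += 1
        elif truth == target:
            fn += 1
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


def sensitivity_specificity(counts):
    """Return (sensitivity, specificity); None where the denominator is zero."""
    positives = counts["tp"] + counts["fn"]
    negatives = counts["tn"] + counts["fp"]
    sensitivity = counts["tp"] / positives if positives else None
    specificity = counts["tn"] / negatives if negatives else None
    return sensitivity, specificity


# Invented labels for six genomes, purely for illustration.
TRUE = ["thermophile", "mesophile", "mesophile", "psychrophile", "thermophile", "mesophile"]
PRED = ["thermophile", "mesophile", "thermophile", "mesophile", "thermophile", "mesophile"]

if __name__ == "__main__":
    for cls in ("psychrophile", "mesophile", "thermophile"):
        counts = one_vs_rest_counts(TRUE, PRED, cls)
        print(cls, counts, sensitivity_specificity(counts))
```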