BMC Genomics

Open Access Highly Accessed Research article

Towards the identification of essential genes using targeted genome sequencing and comparative analysis

Adam M Gustafson1*, Evan S Snitkin1*, Stephen CJ Parker1, Charles DeLisi1,2 and Simon Kasif1,2,3

Author Affiliations

1 Bioinformatics Graduate Program, Boston University, Boston, MA 02215 USA

2 Department of Biomedical Engineering, Boston University, MA 02215 USA

3 Children's Hospital Informatics Program of the Harvard-MIT Division of Health Sciences and Technology, Boston, MA, USA

BMC Genomics 2006, 7:265 doi:10.1186/1471-2164-7-265

Published: 19 October 2006

Additional files

Additional File 5:

List of organisms used to calculate phyletic retention. Organisms are identified by their KEGG three-letter codes unless otherwise noted.

Format: XLS Size: 23KB

This file can be viewed with: Microsoft Excel Viewer

Open Data

Additional File 3:

Feature matrices for S. cerevisiae. Both a raw-data feature matrix and an entropy-discretized feature matrix are included.

Format: XLS Size: 4.5MB

Additional File 4:

Feature matrices for E. coli. Both a raw-data feature matrix and an entropy-discretized feature matrix are included.

Format: XLS Size: 2.6MB

Additional File 1:

CMIM feature ranking. This Excel file includes tables showing the CMIM feature ranking.

Format: XLS Size: 20KB

Additional File 2:

Performance of naïve Bayes classifiers using subsets of features. For each feature set analyzed in this paper (e.g. SC_GenProt, EC_GenProt), features were ranked by CMIM from most to least informative. The PPV for the top 1, 5, 10 and 15% of predictions is shown for naïve Bayes classifiers constructed using the top N features.

Format: XLS Size: 28KB

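The PPV figures described for Additional File 2 can be illustrated with a minimal Python sketch. It assumes that PPV (positive predictive value) for the top k% of predictions means the fraction of truly essential genes among the k% of genes with the highest predicted probability of essentiality. The function name, the integer rounding rule and the toy data are hypothetical and are not taken from the paper or its files.

```python
def ppv_at_top_percent(probabilities, is_essential, percent):
    """Return the PPV among the top `percent`% highest-scoring predictions.

    probabilities: predicted probability of essentiality, one per gene.
    is_essential:  known labels (True if essential), in the same order.
    percent:       integer percentage of top-ranked genes to keep (1-100).
    """
    if len(probabilities) != len(is_essential):
        raise ValueError("probabilities and labels must have equal length")
    if not probabilities:
        raise ValueError("at least one prediction is required")
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")

    # Rank genes from most to least likely to be essential.
    ranked = sorted(
        zip(probabilities, is_essential),
        key=lambda pair: pair[0],
        reverse=True,
    )

    # Assumption: round the cutoff down, but always keep at least one gene.
    n_top = max(1, len(ranked) * percent // 100)
    top = ranked[:n_top]

    # PPV = true positives / all genes predicted positive (the top slice).
    true_positives = sum(1 for _, label in top if label)
    return true_positives / n_top


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    probs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    labels = [True, True, False, True, False,
              False, False, False, False, False]
    for pct in (1, 5, 10, 15):
        print(pct, ppv_at_top_percent(probs, labels, pct))
```

With real data, the probabilities would come from a classifier's output (for example, the per-gene values described for Additional File 6) and the labels from experimentally determined essentiality; how ties and fractional cutoffs were handled in the original analysis is not stated here.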

Additional File 6:

Naïve Bayes classification results. For each of the feature sets used on E. coli and S. cerevisiae, the probability of a gene being essential, as reported by the naïve Bayes classifier, is provided.

Format: XLS Size: 1.5MB
