Table 1

M. tuberculosis dataset

| strain ID   | source          | resist. | # of genes | lab. |
|-------------|-----------------|---------|------------|------|
| H37Rv       | NC_000962       | DS      | 3988(26)   | S    |
| H37Ra       | NC_009525       | DS      | 4034(22)   | C    |
| F11         | NC_009565       | DS      | 3941(5)    | B    |
| KZN 4207(T) | PLoS One [16]   | DS      | 3902(47)   | T    |
| KZN 4207(B) | Broad Institute | DS      | 3996(4)    | B    |
| KZN 1435    | Broad Institute | MDR     | 4059(10)   | B    |
| KZN V2475   | PLoS One [16]   | MDR     | 3893(3792) | T    |
| KZN 605     | Broad Institute | XDR     | 4024(26)   | B    |
| KZN R506    | PLoS One [16]   | XDR     | 3902(46)   | T    |

Details of the input strains for the M. tuberculosis case study. In the '# of genes' column, the first number is the number of annotated genes; the second (in brackets) is the number of genes excluded from the study because of unusual start or stop codons or a sequence length not divisible by three. To avoid ambiguity when the same strain was sequenced by two labs, we add a suffix (T or B) to the strain name. The characters in the last column, 'lab.', denote the sequencing laboratories: B, The Broad Institute; T, Texas A&M University; C, Chinese National Human Genome Center at Shanghai; S, Sanger Institute.
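The exclusion rule described in the caption can be illustrated with a minimal sketch. It is not the authors' code: the source does not say which codons count as "usual", so the sets below assume the bacterial genetic code (NCBI translation table 11), and the function names `exclusion_reason` and `split_genes` as well as the toy sequences are hypothetical.

```python
# Assumed "usual" codons (bacterial genetic code, NCBI translation table 11).
# The source does not list which codons it treats as usual.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def exclusion_reason(seq):
    """Return why a coding sequence would be excluded, or None to keep it."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        return "length not divisible by three"
    if seq[:3] not in START_CODONS:
        return "unusual start codon"
    if seq[-3:] not in STOP_CODONS:
        return "unusual stop codon"
    return None


def split_genes(genes):
    """Split {gene_id: sequence} into kept ids and {excluded id: reason}."""
    kept, excluded = [], {}
    for gene_id, seq in genes.items():
        reason = exclusion_reason(seq)
        if reason is None:
            kept.append(gene_id)
        else:
            excluded[gene_id] = reason
    return kept, excluded


# Toy sequences for illustration only, not real M. tuberculosis genes.
toy = {
    "geneA": "ATGAAATGA",  # passes all three checks
    "geneB": "ATGAAATG",   # 8 nt, not a multiple of three
    "geneC": "CCCAAATGA",  # does not begin with an assumed start codon
    "geneD": "GTGAAACCC",  # does not end with an assumed stop codon
}
kept, excluded = split_genes(toy)
print(kept)
print(excluded)
```

With a filter of this kind, the bracketed value in the '# of genes' column would correspond to `len(excluded)` for a strain, and the first value to `len(kept) + len(excluded)`.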

Wozniak et al. BMC Genomics 2011 12(Suppl 2):S6 doi:10.1186/1471-2164-12-S2-S6
