Table 7

E. coli dataset

| strain ID           | source (GenBank ID) | # of genes | genome length (bp) | lab. |
|---------------------|---------------------|------------|--------------------|------|
| O26:H11 11368       | AP010953            | 5363(4)    | 5697240            | University of Tokyo |
| O157:H7 EC4115      | CP001164            | 5315(0)    | 5572075            | J. Craig Venter Institute |
| O157:H7 EDL933      | AE005174            | 5348(10)   | 5528445            | University of Wisconsin |
| O157:H7 TW14359     | CP001368            | 5263(6)    | 5528136            | University of Washington |
| O157:H7 Sakai       | BA000007            | 5360(5)    | 5498450            | GIRC |
| O103:H2 12009       | AP010958            | 5053(4)    | 5449314            | University of Tokyo |
| O55:H7 CB9615       | CP001846            | 5014(0)    | 5386352            | Nankai University |
| O111:H- 11128       | AP010960            | 4971(4)    | 5371077            | University of Tokyo |
| 042                 | FN554766            | 4792(18)   | 5241977            | Wellcome Trust Sanger Institute |
| CFT073              | AE014075            | 5378(4)    | 5231428            | University of Wisconsin |
| ED1a                | CU928162            | 4914(4)    | 5209548            | Genoscope |
| UMN026              | CU928163            | 4825(4)    | 5202090            | Genoscope |
| 55989               | CU928145            | 4762(4)    | 5154862            | Institut Pasteur and Genoscope |
| ETEC H10407         | FN649414            | 4696(3)    | 5153435            | Wellcome Trust Sanger Institute |
| IAI39               | CU928164            | 4731(7)    | 5132068            | Genoscope |
| ABU 83972           | CP001671            | 4793(6)    | 5131397            | Georg-August-University Goettingen |
| IHE3034             | CP001969            | 4757(3)    | 5108383            | IGS |
| APEC O1             | CP000468            | 4467(3)    | 5082025            | Iowa State University |
| SMS-3-5             | CP000970            | 4742(3)    | 5068389            | TIGR |
| UTI89               | CP000243            | 5066(13)   | 5065741            | Washington University |
| S88                 | CU928161            | 4695(3)    | 5032268            | Genoscope |
| UM146               | CP002167            | 4650(0)    | 4993013            | MBRI |
| E24377A             | CP000800            | 4755(0)    | 4979619            | TIGR |
| O127:H6 E2348/69    | FM180568            | 4553(4)    | 4965553            | Sanger Institute |
| 536                 | CP000247            | 4629(2)    | 4938920            | University of Goettingen |
| W                   | CP002185            | 4478(4)    | 4900968            | AIBN/KRIBB |
| SE11                | AP009240            | 4679(0)    | 4887515            | Kitasato Institute for Life Sciences |
| O83:H1 NRG 857C     | CP001855            | 4429(13)   | 4747819            | Public Health Agency of Canada Laboratory for Foodborne Zoonoses |
| ATCC 8739           | CP000946            | 4180(7)    | 4746218            | US DOE Joint Genome Institute |
| SE15                | AP009378            | 4338(0)    | 4717338            | Kitasato University |
| IAI1                | CU928160            | 4353(4)    | 4700560            | Genoscope |
| K-12 substr. DH10B  | CP000948            | 4125(5)    | 4686137            | University of Wisconsin-Madison |
| K-12 substr. W3110  | AP009048            | 4225(9)    | 4646332            | Nara Institute of Science and Technology |
| HS                  | CP000802            | 4383(3)    | 4643538            | TIGR |
| K-12 substr. MG1655 | U00096              | 4144(7)    | 4639675            | University of Wisconsin-Madison |
| DH1                 | CP001637            | 4159(4)    | 4630707            | US DOE Joint Genome Institute |
| BL21-Gold(DE3)pLysS | CP001665            | 4208(8)    | 4629812            | US DOE Joint Genome Institute |
| BW2952              | CP001396            | 4083(5)    | 4578159            | TEDA School of Biological Sciences and Biotechnology |
| BL21(DE3) BL21      | AM946981            | 4227(4)    | 4570938            | Austrian Center for Biopharmaceutical Technology |
| B REL606            | CP000819            | 4158(6)    | 4558953            | International E. coli B Consortium |
| BL21(DE3)           | CP001509            | 4181(23)   | 4558947            | Korea Research Institute of Bioscience and Biotechnology |


Details of the input strains for the E. coli case study. In the '# of genes' column, the first number is the count of annotated genes; the second (in parentheses) is the count of genes excluded from the study because of an unusual start or stop codon or a sequence length not divisible by three.
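
The exclusion rule described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names `is_excluded` and `parse_gene_count` are hypothetical, and the sets of "usual" start and stop codons are an assumption (the common bacterial codons of translation table 11), since the caption does not list them.

```python
import re

# ASSUMPTION: the source does not say which codons count as "usual".
# These sets follow the common bacterial starts and stops (translation table 11).
USUAL_STARTS = {"ATG", "GTG", "TTG"}
USUAL_STOPS = {"TAA", "TAG", "TGA"}


def is_excluded(cds: str) -> bool:
    """Return True if a coding sequence fails any of the caption's three criteria."""
    seq = cds.upper()
    if len(seq) % 3 != 0:
        # Sequence length not divisible by three.
        return True
    # Unusual start codon or unusual stop codon.
    return seq[:3] not in USUAL_STARTS or seq[-3:] not in USUAL_STOPS


def parse_gene_count(cell: str) -> tuple[int, int]:
    """Split a '# of genes' cell such as '5363(4)' into (annotated, excluded)."""
    match = re.fullmatch(r"\s*(\d+)\((\d+)\)\s*", cell)
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    return int(match.group(1)), int(match.group(2))
```

For example, `parse_gene_count("5363(4)")` yields the annotated and excluded counts for strain O26:H11 11368 as two integers, and `is_excluded` flags any sequence whose length is not a multiple of three before it checks the codons.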

Wozniak et al. BMC Genomics 2011 12(Suppl 2):S6   doi:10.1186/1471-2164-12-S2-S6
