Table 2

Comparison of T4 and CP220 Structural Proteins

| CP220 CDS | T4 Length^a | CP220 Length^a | T4 Gene Function^b | CP220/T4 Identity (%) | E value^c |
|---|---|---|---|---|---|
| CPT_0001 | 610 | 744 | Gp17 Terminase subunit with nuclease and ATPase activity; binds single-stranded DNA, Gp16 and Gp20 | 143/436 (32%) | 4e-47 |
| CPT_0005 | 516 | 503 | Gp39 DNA topoisomerase large subunit | 186/516 (35%) | 4e-73 |
| CPT_0009 | 342 | 347 | Gp61 DNA primase subunit | 73/330 (22%) | 0.017 |
| CPT_0010 | 319 | 304 | Gp44 Clamp loader subunit, DNA polymerase accessory protein | 95/319 (29%) | 8e-27 |
| CPT_0011 | 305 | 352 | RNaseH ribonuclease | 56/151 (37%) | 6e-16 |
| CPT_0029 | 157 | 178 | Gp49 EndoVII packaging and recombination endonuclease VII | 28/108 (25%) | 0.11 |
| CPT_0030 | 524 | 569 | Gp20 Portal vertex protein of head | 130/456 (28%) | 5e-40 |
| CPT_0033 | 487 | 429 | Gp30 DNA ligase | 128/484 (26%) | 2e-23 |
| CPT_0034 | 659 | 516 | Gp18 Tail sheath monomer | 58/235 (24%) | 3e-04 |
| CPT_0037 | 134 | 120 | Gp25 Baseplate wedge subunit | 26/66 (39%) | 2e-07 |
| CPT_0041 | 660 | 1214 | Gp6 Baseplate wedge subunit | 86/389 (22%) | 0.025 |
| CPT_0045 | 163 | 252 | Gp19 Tail tube protein | 49/176 (27%) | 9e-11 |
| CPT_0046 | 150 | 158 | Gp4 Head completion protein | 44/147 (29%) | 5e-08 |
| CPT_0048 | 185 | 255 | Gp55 Sigma factor for T4 late transcription | 33/123 (26%) | 0.37 |
| CPT_0051 | 521 | 444 | Gp23 Major head protein | 131/412 (31%) | 2e-41 |
| CPT_0053 | 659 | 578 | Gp18 Tail sheath protein | 149/436 (34%) | 1e-51 |
| CPT_0058 | 163 | 196 | Gp19 Tail tube protein | 41/170 (24%) | 5e-09 |
| CPT_0115 | 898 | 882 | Gp43 DNA polymerase | 244/901 (27%) | 4e-56 |
| CPT_0125 | 475 | 445 | Gp41 DNA primase-helicase subunit | 120/453 (26%) | 1e-34 |
| CPT_0148 | 587 | 472 | UvsW RNA-DNA and DNA-DNA helicase, ATPase | 124/407 (30%) | 5e-42 |
| CPT_0174 | 272 | 456 | Gp15 Tail sheath stabilizer and completion protein | 54/214 (25%) | 1e-05 |
| CPT_0177 | 301 | 306 | Gp32 Single-stranded DNA binding protein | 65/212 (30%) | 6e-12 |
| CPT_0181 | 374 | 361 | RnlA RNA ligase 1 and tail fiber attachment catalyst | 80/297 (26%) | 3e-10 |
| CPT_0193 | 212 | 236 | Gp21 Prohead core protein protease | 52/153 (33%) | 2e-08 |

a Length refers to the number of amino-acid residues in the CDS and is calculated from the genomic sequence

b T4 gene products and functions are according to Miller [26]

c Identity and E values are from NCBI BLAST results, performed on 5th January 2009
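
The identity column gives identical residues over aligned length, followed by a whole-number percentage. As a minimal sketch, assuming the percentage is the ratio truncated (not rounded) to a whole percent, which is what several rows appear to show, the figure can be recomputed as follows. The function name `percent_identity` and the selection of rows are illustrative and not part of the original analysis.

```python
def percent_identity(identical: int, aligned: int) -> int:
    """Return identical/aligned as a whole-number percentage, truncated.

    Assumption: truncation rather than rounding, chosen because it
    matches the sample rows below.
    """
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return identical * 100 // aligned


# Identical residues and aligned length copied from three rows of the table.
sample_rows = {
    "CPT_0001": (143, 436),
    "CPT_0010": (95, 319),
    "CPT_0193": (52, 153),
}

for cds, (identical, aligned) in sample_rows.items():
    pct = percent_identity(identical, aligned)
    print(f"{cds}: {identical}/{aligned} ({pct}%)")
```

Note that the aligned length is the length of the BLAST alignment, not the full length of either protein, so the percentage describes only the aligned region.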

Timms et al. BMC Genomics 2010 11:214   doi:10.1186/1471-2164-11-214
