Table 1

Benchmark data for the comparative performance of a translated alignment. Six mammalian protein-coding genes were aligned either as DNA (using ClustalW; default parameters) or via their translations as amino acids (using transAlign; genetic code specified, otherwise default parameters). All analyses used ClustalW v1.83 on an 800-MHz dual-processor Macintosh G4 running Mac OS X 10.3.5. The alignment score is taken relative to the corresponding sequence from a manually aligned data set and is the complement of the Hamming distance (i.e., matching bases score 1 and mismatches score 0). The alignment score was calculated for each individual sequence and then averaged over all sequences in each data set. Gene symbols follow the HUGO Gene Nomenclature Committee (HGNC; [21]).

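| Data set | No. of sequences | Unaligned sequence length | DNA alignment: alignment time (sec) | DNA alignment: average alignment score | Amino-acid alignment: amino-acid alignment time (sec) | Amino-acid alignment: DNA profile alignment time (sec) | Amino-acid alignment: transAlign processing time (sec) | Amino-acid alignment: total time (sec) | Amino-acid alignment: average alignment score |
|---|---|---|---|---|---|---|---|---|---|
| BDNF | 100 | 256-768 | 475 | 579.28 | 52 | 14 | 0 | 66 | 774.61 |
| MTCYB | 2484 | 388-1200 | 1216963 | 437.54 | 127309 | 13823 | 34 | 141166 | 860.75 |
| RAG1 | 128 | 543-3141 | 2804 | 2346.46 | 307 | n/a | 3 | 310 | 2345.13 |
| RAG2 | 196 | 326-1584 | 6492 | 1583.85 | 733 | n/a | 3 | 736 | 1583.95 |
| RBP3 | 484 | 627-1292 | 45122 | 598.26 | 4004 | 10636 | 9 | 14649 | 579.71 |
| VWF | 182 | 711-1310 | 8384 | 862.06 | 921 | n/a | 4 | 925 | 1002.16 |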
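
The alignment score described in the caption can be illustrated with a minimal Python sketch. It is not the code used to produce the table. The function names, the toy sequences, and the handling of gaps and unequal row lengths are all assumptions made for illustration: each aligned sequence is compared column by column with the same sequence in the reference alignment, a column scores 1 only when both rows carry the same base, gap characters never count as matches, and the shorter row is padded with gaps.

```python
from itertools import zip_longest

GAP = "-"  # assumed gap symbol


def sequence_score(test_row, reference_row):
    """Count the columns in which both rows carry the same base.

    Assumption: gap/gap columns score 0, and the shorter row is
    padded with gaps so that trailing columns are mismatches.
    """
    pairs = zip_longest(test_row.upper(), reference_row.upper(), fillvalue=GAP)
    return sum(1 for a, b in pairs if a == b and a != GAP)


def average_alignment_score(test_alignment, reference_alignment):
    """Average the per-sequence scores over sequences found in both alignments.

    Both arguments are dicts mapping a sequence name to its aligned row.
    """
    shared = test_alignment.keys() & reference_alignment.keys()
    if not shared:
        raise ValueError("no sequence names in common")
    total = sum(
        sequence_score(test_alignment[name], reference_alignment[name])
        for name in shared
    )
    return total / len(shared)


# Toy data, invented for illustration only (not taken from the benchmark).
reference = {"seq1": "ATG-CCA", "seq2": "ATGGCCA"}
test = {"seq1": "ATGC-CA", "seq2": "ATGGCCA"}
print(average_alignment_score(test, reference))  # prints 6.0
```

In the toy data, seq1 matches the reference in 5 of 7 columns and seq2 in all 7, giving an average of 6.0. Under this definition a higher score means closer agreement with the manual alignment, and the maximum attainable score depends on sequence length, so scores are comparable within a data set but not across data sets.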


Bininda-Emonds BMC Bioinformatics 2005 6:156   doi:10.1186/1471-2105-6-156
