BMC Bioinformatics

Research article (Open Access, Highly Accessed)

Towards realistic benchmarks for multiple alignments of non-coding sequences

Jaebum Kim1 and Saurabh Sinha1,2*

Author Affiliations

1 Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA

2 Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA

BMC Bioinformatics 2010, 11:54 doi:10.1186/1471-2105-11-54

Published: 26 January 2010

Additional files

Additional file 1:

Performance of multiple alignment tools compared by alignment sensitivity. The scores were calculated by using all synthetic data sets (left panel), and by using only data sets where the expected number of insertions is two times more than the number of deletions or vice versa (middle and right panels respectively).

Format: DOC; Size: 142 KB

Additional file 2:

Performance of multiple alignment tools compared by alignment specificity. The scores were calculated by using all synthetic data sets (left panel), and by using only data sets where the expected number of insertions is two times more than the number of deletions or vice versa (middle and right panels respectively).

Format: DOC; Size: 146 KB

Additional file 3:

An example data set from the benchmark, shown (in part) with the true alignment (top panel) and the alignments computed by each of the programs.

Format: DOC; Size: 1.2 MB

Additional file 4:

Dependence of the performance of each alignment program (sensitivity, left; specificity, right) on various descriptive statistics of the data sets.

Format: DOC; Size: 2.5 MB

Additional file 5:

Performance of multiple alignment tools compared by alignment sensitivity of pairs of species. The scores were calculated by using all synthetic data sets (left panel), and by using only data sets where the expected number of insertions is two times more than the number of deletions or vice versa (middle and right panels respectively).

Format: DOC; Size: 191 KB

Additional file 6:

Performance of multiple alignment tools compared by alignment specificity of pairs of species. The scores were calculated by using all synthetic data sets (left panel), and by using only data sets where the expected number of insertions is two times more than the number of deletions or vice versa (middle and right panels respectively).

Format: DOC; Size: 195 KB

Additional file 7:

Comparison of estimated alignment sensitivity and specificity, using Mlagan or Pecan, as obtained from the Pollard et al. [21] benchmark and from our benchmark.

Format: DOC; Size: 34 KB

Additional file 8:

Comparison of estimated alignment sensitivity and specificity as obtained from the Pollard et al. benchmark.

Format: DOC; Size: 847 KB

Additional file 9:

Phylogenetic trees and branch lengths in Newick format.

Format: TXT; Size: 1 KB

Additional file 10:

Genome-wide distribution of the fraction of conserved blocks, estimated using phastCons conservation scores and multiple alignments of Drosophila non-coding sequences obtained from the UCSC Genome Browser Database.

Format: DOC; Size: 206 KB

Additional file 11:

Descriptive statistics of traditional and new benchmarks.

Format: DOC; Size: 2.3 MB
