Open Access Highly Accessed Research article

Gene Ontology term overlap as a measure of gene functional similarity

Meeta Mistry1 and Paul Pavlidis2*

Author Affiliations

1 CIHR/MSFHR Graduate Program in Bioinformatics, University of British Columbia, Canada

2 Department of Psychiatry and Centre for High-throughput Biology, University of British Columbia, British Columbia, Canada

BMC Bioinformatics 2008, 9:327  doi:10.1186/1471-2105-9-327

Published: 4 August 2008

Additional files

Additional File 1:

TO correlation scores for multiple sets.

Format: DOC Size: 37KB Download file

This file can be viewed with: Microsoft Word Viewer

Open Data

Additional File 2:

Correlation values amongst various similarity measures.

Format: DOC Size: 74KB Download file

Additional File 3:

TO scores versus scores generated using vector-based measures. For every gene pair in the 100 k set of gene pairs, the term overlap was calculated and plotted against the scores generated by the Cosine, Kappa, and Weighted Cosine measures.

Format: DOC Size: 59KB Download file

Additional File 4:

NTO scores versus TO, Resnik, Lin, and Jiang scores. For every gene pair in the 100 k set of gene pairs, the normalized term overlap was calculated and plotted against the term overlap scores (A), the averaged variant scores of each of the three semantic similarity measures (B-D), and the maximum variant scores of each of the three semantic similarity measures (E-G).

Format: DOC Size: 187KB Download file

Additional File 5:

Comparing sequence and semantic similarity ("average" variants). A BLAST sequence analysis was carried out to calculate a sequence similarity score for each gene pair in the 100 k set for which sequence data were available. Of those gene pairs, we considered only the 53,264 that obtained a score greater than zero. Intervals were taken along the x-axis (ln[Bit Score]), and the (A) Resnik, (B) Lin, and (C) Jiang scores for the gene pairs falling in each interval were averaged and plotted.

Format: DOC Size: 43KB Download file
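
The interval averaging described for Additional File 5 can be illustrated with a minimal sketch. This is not the authors' code: the function name `average_by_ln_bit_score`, the use of equal-width intervals, and the default number of intervals are assumptions made for illustration only. The input is a list of (bit score, semantic similarity score) tuples, one per gene pair.

```python
import math
from collections import defaultdict


def average_by_ln_bit_score(pairs, n_bins=10):
    """Average semantic scores within equal-width intervals of ln(bit score).

    pairs: iterable of (bit_score, semantic_score) tuples, one per gene pair.
    Returns (interval_start, interval_end, mean_score, count) for each
    non-empty interval. Equal-width intervals and n_bins are assumptions.
    """
    # Keep only gene pairs whose sequence similarity score is greater than zero.
    kept = [(math.log(bit), sem) for bit, sem in pairs if bit > 0]
    if not kept:
        return []

    lo = min(x for x, _ in kept)
    hi = max(x for x, _ in kept)
    # Fall back to a width of 1.0 when every pair has the same ln(bit score).
    width = (hi - lo) / n_bins or 1.0

    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, sem in kept:
        # Clamp the index so the largest x lands in the last interval.
        idx = min(int((x - lo) / width), n_bins - 1)
        sums[idx] += sem
        counts[idx] += 1

    return [
        (lo + i * width, lo + (i + 1) * width, sums[i] / counts[i], counts[i])
        for i in sorted(counts)
    ]
```

Calling the function once per measure (for example, once with the Resnik scores, once with Lin, once with Jiang) would give the per-interval means that a plot of this kind needs; pairs with a bit score of zero are dropped before binning, as in the description above.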
