Table 7. The effect of similarity data.

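| Query Term | Query Type | Prominence Model | Focused Subgraph | Average No. Neighbors | Average Consistent Neighbors | Average Ratio | Q(R) | UROC (R) |
|---|---|---|---|---|---|---|---|---|
| autoimmune | protein | Hubs & Authorities | With Sim | 1413.76 | 1.98 | 0 | 439 | 11773 |
| autoimmune | protein | Hubs & Authorities | No Sim | 69.8 | 4.64 | 0.13 | 2062 | 57865 |
| stromelysin | protein | Hubs & Authorities | With Sim | 338.52 | 27.98 | 0.083 | 10348 | 263293 |
| stromelysin | protein | Hubs & Authorities | No Sim | 214.91 | 19.3 | 0.22 | 6690 | 181282 |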

These quality results were calculated for the stromelysin and autoimmune focused subgraphs when searching for proteins. The Hubs & Authorities values were computed using the Max scoring method. To compare these results with our previous results, we recomputed all performance measures for the focused subgraphs that include similarity relations, but using the ranking produced without considering these relations. Interestingly, when similarity data are used, the top-scoring entity for the 'stromelysin' query is a protein (docid 986092) whose definition does not contain the query term; neither do the DNA sequences, the UniGene clusters, or the enzyme family related to this protein. However, this protein, membrane type 5 matrix metalloproteinase, is significantly similar to many stromelysin proteins.

Shafer et al. BMC Bioinformatics 2006 7:71   doi:10.1186/1471-2105-7-71
