Table 3

SPARQL query example 1: descriptive statistical analysis of dataset contents

SELECT ?neoplasm ?variation (COUNT(?variation) AS ?occurrence)
WHERE {
  ?sample NCIT:Neoplasm_by_Morphology ?neoplasm .
  ?somatic_mutation logvd:hasSample ?sample .
  ?variation_id rdfs:label ?variation .
  ?somatic_mutation logvd:hasVariation ?variation_id .
}
GROUP BY ?neoplasm ?variation
ORDER BY ?neoplasm


?neoplasm                               ?variation              ?occurrence
Acinar cell carcinoma                   NM_000546.1:c.186A>C    1
Acinar cell carcinoma                   NM_000546.1:c.408del1   1
Acinar cell carcinoma                   NM_000546.1:c.454del1   1
Acinar cell carcinoma                   NM_000546.1:c.590T>G    1
Acute leukemia, NOS                     NM_000546.1:c.524G>A    2
Acute megakaryoblastic leukemia         NM_000546.1:c.605G>T    1
Acute megakaryoblastic leukemia         NM_000546.1:c.734G>T    1
Acute monocytic leukemia                NM_000546.1:c.584T>C    1
Acute myeloid leukemia with maturation  NM_000546.1:c.743G>A    1
Acute myeloid leukemia with maturation  NM_000546.1:c.862A>T    1
......                                  ......                  ......


This query selects each neoplasm and its associated gene variation, along with the number of such associations across all somatic mutations in the dataset. Only the first 10 results are shown, and the SPARQL query prefixes are omitted.
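
The grouping and counting performed by the query above can be sketched in plain Python. This is only an illustration of the join and aggregation logic, not the authors' implementation: the triples are a toy in-memory list, the identifiers sample1, mut1 and var1 (and so on) are hypothetical, and the helper names objects and count_variations_by_neoplasm are assumptions introduced here. Only the predicate names and the labels are taken from the table.

```python
from collections import Counter

# Toy triples (subject, predicate, object). The sample, mutation and variation
# identifiers are hypothetical; they are NOT taken from the real dataset.
TRIPLES = [
    ("sample1", "NCIT:Neoplasm_by_Morphology", "Acinar cell carcinoma"),
    ("sample2", "NCIT:Neoplasm_by_Morphology", "Acute leukemia, NOS"),
    ("sample3", "NCIT:Neoplasm_by_Morphology", "Acute leukemia, NOS"),
    ("mut1", "logvd:hasSample", "sample1"),
    ("mut2", "logvd:hasSample", "sample2"),
    ("mut3", "logvd:hasSample", "sample3"),
    ("mut1", "logvd:hasVariation", "var1"),
    ("mut2", "logvd:hasVariation", "var2"),
    ("mut3", "logvd:hasVariation", "var2"),
    ("var1", "rdfs:label", "NM_000546.1:c.186A>C"),
    ("var2", "rdfs:label", "NM_000546.1:c.524G>A"),
]


def objects(subject, predicate):
    """Return every object of the triples matching (subject, predicate, ?o)."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def count_variations_by_neoplasm():
    """Join the four triple patterns, then count per (neoplasm, variation)."""
    counts = Counter()
    for mutation, predicate, sample in TRIPLES:
        if predicate != "logvd:hasSample":
            continue
        for neoplasm in objects(sample, "NCIT:Neoplasm_by_Morphology"):
            for variation_id in objects(mutation, "logvd:hasVariation"):
                for label in objects(variation_id, "rdfs:label"):
                    # One matching solution per combination, as in GROUP BY + COUNT.
                    counts[(neoplasm, label)] += 1
    rows = [(neoplasm, label, n) for (neoplasm, label), n in counts.items()]
    # Equivalent of ORDER BY ?neoplasm.
    return sorted(rows, key=lambda row: row[0])


if __name__ == "__main__":
    for neoplasm, variation, occurrence in count_variations_by_neoplasm():
        print(neoplasm, variation, occurrence, sep="\t")
```

With this toy data the two mutations that share a neoplasm and a variation label collapse into one row with an occurrence of 2, which mirrors the "Acute leukemia, NOS" row of the table.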

Zappa et al. BMC Bioinformatics 2012 13(Suppl 4):S7   doi:10.1186/1471-2105-13-S4-S7
