SPARQL#
This notebook contains excerpts from
SPARQL 1.1 Query Language. W3C Recommendation 21 March 2013. https://www.w3.org/TR/sparql11-query/ (license: https://www.w3.org/Consortium/Legal/2023/doc-license )
The aim is to reproduce individual SPARQL queries in Python with RDFLib.
1.2.4 Terminology#
2.2 Multiple Matches#
from rdflib import Graph # RDFLib: https://rdflib.readthedocs.io/en/stable/
G = Graph().parse(format = "ttl", data = """
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
_:a foaf:name "Johnny Lee Outlaw" .
_:a foaf:mbox <mailto:jlow@example.com> .
_:b foaf:name "Peter Goodguy" .
_:b foaf:mbox <mailto:peter@example.org> .
_:c foaf:mbox <mailto:carol@example.org> .
""")
Q = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE
{ ?x foaf:name ?name .
?x foaf:mbox ?mbox }
"""
qres = G.query(Q)
qres
<rdflib.plugins.sparql.processor.SPARQLResult at 0x7f2f187451c0>
for row in qres:
    print(row)
(rdflib.term.Literal('Johnny Lee Outlaw'), rdflib.term.URIRef('mailto:jlow@example.com'))
(rdflib.term.Literal('Peter Goodguy'), rdflib.term.URIRef('mailto:peter@example.org'))
2.5 Creating Values with Expressions#
G = Graph().parse(format = "ttl", data = """
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
_:a foaf:givenName "John" .
_:a foaf:surname "Doe" .
""")
Q = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE {
?P foaf:givenName ?G ;
foaf:surname ?S
BIND(CONCAT(?G, " ", ?S) AS ?name)
}
"""
[row for row in G.query(Q) ]
[(rdflib.term.Literal('John Doe'),)]
2.6 Building RDF Graphs#
SPARQL has several query forms. The SELECT query form returns variable bindings. The CONSTRUCT query form returns an RDF graph.
JB: What does this look like in RDFLib? See https://rdflib.readthedocs.io/en/stable/intro_to_sparql.html#run-a-query :
The query method returns a rdflib.query.Result instance.
For SELECT queries, iterating over this returns rdflib.query.ResultRow instances, each containing a set of variable bindings.
For CONSTRUCT/DESCRIBE queries, iterating over the result object gives the triples.
For ASK queries, iterating yields the single boolean answer; evaluating the result object in a boolean context (i.e. bool(result)) gives the same answer.
G = Graph().parse(format = "ttl", data = """
@prefix org: <http://example.com/ns#> .
_:a org:employeeName "Alice" .
_:a org:employeeId 12345 .
_:b org:employeeName "Bob" .
_:b org:employeeId 67890 .
""")
Q = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX org: <http://example.com/ns#>
CONSTRUCT { ?x foaf:name ?name }
WHERE { ?x org:employeeName ?name }
"""
Results:
qres = G.query(Q)
type(qres)
rdflib.plugins.sparql.processor.SPARQLResult
[ row for row in qres ]
[(rdflib.term.BNode('naed90cf8c98542bebb93164f0aa48201b2'),
rdflib.term.URIRef('http://xmlns.com/foaf/0.1/name'),
rdflib.term.Literal('Bob')),
(rdflib.term.BNode('naed90cf8c98542bebb93164f0aa48201b1'),
rdflib.term.URIRef('http://xmlns.com/foaf/0.1/name'),
rdflib.term.Literal('Alice'))]
which can be serialized in RDF/XML as:
# https://rdflib.readthedocs.io/en/stable/intro_to_parsing.html#saving-rdf
v_byte = qres.serialize(format="xml")
v_byte
b'<?xml version="1.0" encoding="utf-8"?>\n<rdf:RDF\n xmlns:ns1="http://xmlns.com/foaf/0.1/"\n xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"\n>\n <rdf:Description rdf:nodeID="naed90cf8c98542bebb93164f0aa48201b2">\n <ns1:name>Bob</ns1:name>\n </rdf:Description>\n <rdf:Description rdf:nodeID="naed90cf8c98542bebb93164f0aa48201b1">\n <ns1:name>Alice</ns1:name>\n </rdf:Description>\n</rdf:RDF>\n'
# https://stackoverflow.com/questions/6269765/what-does-the-b-character-do-in-front-of-a-string-literal
print(v_byte.decode('UTF-8'))
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
xmlns:ns1="http://xmlns.com/foaf/0.1/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>
<rdf:Description rdf:nodeID="naed90cf8c98542bebb93164f0aa48201b2">
<ns1:name>Bob</ns1:name>
</rdf:Description>
<rdf:Description rdf:nodeID="naed90cf8c98542bebb93164f0aa48201b1">
<ns1:name>Alice</ns1:name>
</rdf:Description>
</rdf:RDF>
v_utf8 = qres.serialize(format="turtle").decode('UTF-8')
print(v_utf8)
@prefix ns1: <http://xmlns.com/foaf/0.1/> .
[] ns1:name "Alice" .
[] ns1:name "Bob" .
3 RDF Term Constraints (Informative)#
The examples in this section share one input graph:
Data:
G = Graph().parse(format = "ttl", data = """
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix : <http://example.org/book/> .
@prefix ns: <http://example.org/ns#> .
:book1 dc:title "SPARQL Tutorial" .
:book1 ns:price 42 .
:book2 dc:title "The Semantic Web" .
:book2 ns:price 23 .
""")
Q_3_1a = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?title
WHERE { ?x dc:title ?title
FILTER regex(?title, "^SPARQL")
}
"""
[row for row in G.query(Q_3_1a) ]
[(rdflib.term.Literal('SPARQL Tutorial'),)]
Q_3_1b = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?title
WHERE { ?x dc:title ?title
FILTER regex(?title, "web", "i" )
}
"""
[row for row in G.query(Q_3_1b) ]
[(rdflib.term.Literal('The Semantic Web'),)]
8 Negation#
G = Graph().parse(format = "ttl", data = """
@prefix : <http://example/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
:alice rdf:type foaf:Person .
:alice foaf:name "Alice" .
:bob rdf:type foaf:Person .
""")
Q = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person
WHERE
{
?person rdf:type foaf:Person .
FILTER NOT EXISTS { ?person foaf:name ?name }
}
"""
[row for row in G.query(Q) ]
[(rdflib.term.URIRef('http://example/bob'),)]
G = Graph().parse(format = "ttl", data = """
@prefix : <http://example/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
:Kuh_123 a :Kuh .
:Kuh_123 a :Tier .
:Kuh rdfs:subClassOf :Tier .
:Igel a :Tier .
""")
Q = """
PREFIX : <http://example/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?z
WHERE
{
?a rdfs:subClassOf ?b .
?z a ?b .
FILTER NOT EXISTS { ?z a ?a}
}
"""
[row for row in G.query(Q) ]
[(rdflib.term.URIRef('http://example/Igel'),)]
Named Graphs#
TriG is a plain-text serialization format for RDF graphs that supports named graphs and RDF datasets (source: https://en.wikipedia.org/wiki/TriG_(syntax) ).
Example from rdflib.html#rdflib.graph.Dataset#
from rdflib import Graph, Dataset, URIRef, Literal
# Create a new Dataset
ds = Dataset()
# simple triples go to the default graph
ds.add((URIRef("http://example.org/a"),
URIRef("http://www.example.org/b"),
Literal("foo")))
<Graph identifier=Nef60142883c04a02965fdfd9940a68ea (<class 'rdflib.graph.Dataset'>)>
# Create a graph in the dataset, if the graph name has already been
# used, the corresponding graph will be returned
# (ie, the Dataset keeps track of the constituent graphs)
g = ds.graph(URIRef("http://www.example.com/gr"))
g
<Graph identifier=http://www.example.com/gr (<class 'rdflib.graph.Graph'>)>
# add triples to the new graph as usual
g.add(
(URIRef("http://example.org/x"),
URIRef("http://example.org/y"),
Literal("bar")) )
<Graph identifier=http://www.example.com/gr (<class 'rdflib.graph.Graph'>)>
# alternatively: add a quad to the dataset -> goes to the graph
ds.add(
(URIRef("http://example.org/x"),
URIRef("http://example.org/z"),
Literal("foo-bar"),g) )
<Graph identifier=Nef60142883c04a02965fdfd9940a68ea (<class 'rdflib.graph.Dataset'>)>
for t in ds.triples((None,None,None)):
    print(t)
(rdflib.term.URIRef('http://example.org/a'), rdflib.term.URIRef('http://www.example.org/b'), rdflib.term.Literal('foo'))
# querying quads() returns quads; the fourth argument can be unrestricted
# (None) or restricted to a graph
for q in ds.quads((None, None, None, g)):
    print(q)
# Note that in the call above -
# ds.quads((None,None,None,"http://www.example.com/gr"))
# would have been accepted, too
(rdflib.term.URIRef('http://example.org/x'), rdflib.term.URIRef('http://example.org/y'), rdflib.term.Literal('bar'), rdflib.term.URIRef('http://www.example.com/gr'))
(rdflib.term.URIRef('http://example.org/x'), rdflib.term.URIRef('http://example.org/z'), rdflib.term.Literal('foo-bar'), rdflib.term.URIRef('http://www.example.com/gr'))
for c in ds.graphs():
    print(c)
<urn:x-rdflib:default> a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'Memory'].
<http://www.example.com/gr> a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'Memory'].
https://rdflib.readthedocs.io/en/stable/intro_to_parsing.html
ex2 = """PREFIX eg: <http://example.com/person/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
eg:graph-1 {
eg:drewp a foaf:Person .
eg:drewp eg:says "Hello World" .
}
eg:graph-2 {
eg:nick a foaf:Person .
eg:nick eg:says "Hi World" .
}"""
from rdflib import Dataset
from rdflib.namespace import RDF
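ds_trig = Dataset()
ds_trig.parse(data = ex2, format = "trig")
# use a separate loop variable for the graph name so the dataset is not shadowed
for s, p, o, graph_name in ds_trig.quads((None, RDF.type, None, None)):
    print(s, graph_name)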
http://example.com/person/drewp http://example.com/person/graph-1
http://example.com/person/nick http://example.com/person/graph-2
Example: Wikipedia -> TriG#
# Source: https://en.wikipedia.org/wiki/TriG_(syntax)
trig_example = """@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix swp: <http://www.w3.org/2004/03/trix/swp-1/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ex: <http://www.example.org/vocabulary#> .
@prefix : <http://www.example.org/exampleDocument#> .
:G1 { :Monica ex:name "Monica Murphy" .
:Monica ex:homepage <http://www.monicamurphy.org> .
:Monica ex:email <mailto:monica@monicamurphy.org> .
:Monica ex:hasSkill ex:Management }
:G2 { :Monica rdf:type ex:Person .
:Monica ex:hasSkill ex:Programming }
:G3 { :G1 swp:assertedBy _:w1 .
_:w1 swp:authority :Chris .
_:w1 dc:date "2003-10-02"^^xsd:date .
:G2 swp:quotedBy _:w2 .
:G3 swp:assertedBy _:w2 .
_:w2 dc:date "2003-09-03"^^xsd:date .
_:w2 swp:authority :Chris .
:Chris rdf:type ex:Person .
:Chris ex:email <mailto:chris@bizer.de> }"""
https://rdflib.readthedocs.io/en/stable/apidocs/rdflib.html#rdflib.graph.Dataset
from rdflib import Dataset
ds2 = Dataset()
ds2.parse(format = "trig", data = trig_example)
len(ds2)
15
# https://rdflib.readthedocs.io/en/stable/apidocs/rdflib.html#rdflib.graph.Dataset
# querying triples return them all regardless of the graph
ds2
<Graph identifier=N965e82fa329e4274968afccde53129f5 (<class 'rdflib.graph.Dataset'>)>
for s, p, o, g in ds2.quads((None, RDF.type, None, None)):
    print(s, g)
http://www.example.org/exampleDocument#Monica http://www.example.org/exampleDocument#G2
http://www.example.org/exampleDocument#Chris http://www.example.org/exampleDocument#G3