SPARQL/Triples
Introduction
The statement "The sky has the color blue" consists of a subject ("the sky"), a predicate ("has the color"), and an object ("blue").
SPO, or "subject, predicate, object", is known as a (semantic) triple; in Wikidata it is commonly referred to as a statement about data.
SPO is also the basic syntax layout for querying RDF data structures, or any graph database or triplestore, such as the Wikidata Query Service (WDQS).
See also: Semantic triple (English Wikipedia).
In the Wikidata Query Service (WDQS), triples are used to describe the query pattern in the WHERE clause of the SELECT statement:
# ?child  father   Bach
?child  wdt:P22  wd:Q1339.
In this case the triple ?child wdt:P22 wd:Q1339 specifies that the variable ?child must have Bach as father.
Any of the triple parts Subject, Predicate and Object may be variables. This makes this selection very versatile.
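To make the idea of a triple pattern with variables concrete, the following Python sketch mimics what a query engine does with a single pattern. It is only an illustration under stated assumptions: the triples are a tiny hand-made list rather than data fetched from Wikidata, the items wd:child_a, wd:child_b and wd:place_x are made-up placeholders, and match is a hypothetical helper, not part of WDQS or any library.

```python
# Toy in-memory "triplestore": (subject, predicate, object) tuples.
# wd:child_a, wd:child_b and wd:place_x are made-up placeholder items.
TRIPLES = [
    ("wd:child_a", "wdt:P22", "wd:Q1339"),   # child_a has father Bach
    ("wd:child_b", "wdt:P22", "wd:Q1339"),   # child_b has father Bach
    ("wd:Q1339", "wdt:P937", "wd:place_x"),  # Bach has work location place_x
]


def match(pattern, triples):
    """Return one {variable: value} binding per triple that fits the pattern."""
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice in one pattern must get a single value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break  # a fixed part of the pattern does not match
        else:
            results.append(binding)
    return results


# Variable in the Subject position: who has father Bach?
children = match(("?child", "wdt:P22", "wd:Q1339"), TRIPLES)
print(children)

# Variables in the Predicate and Object positions: everything stated about Bach.
about_bach = match(("wd:Q1339", "?predicate", "?object"), TRIPLES)
print(about_bach)
```

The same function handles a variable in any of the three positions, which is the versatility described above.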
Triples with the same subject
Additional variables can be added by adding more triples. In the simplest case these triples share the same subject.
SELECT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
?child wdt:P22 wd:Q1339. # ?child has father Bach
?child wdt:P21 ?gender.
?child wdt:P569 ?birth_date.
?child wdt:P570 ?date_of_death.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
The first triple selects all the children of Bach. The additional triples link each of these children to a value for gender, birth date and date of death. The variable ?child links all of them together.
If you look closely at the result, you may notice that Johann Christoph Friedrich Bach appears on 2 lines, because he has 2 different birth dates: 21 and 23 June 1732. In his case the triple ?child wdt:P569 ?birth_date. resulted in 2 values. For further details, see removing duplicates and modifiers.
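Why one child can produce two result rows becomes clearer when the join on the shared variable is written out. The Python sketch below is a simplified model, not how WDQS is actually implemented: the helpers extend and solve are hypothetical, wd:child_b and date_b are made-up placeholders, and only the two birth dates of Q57225 come from the text above.

```python
from collections import Counter

# Toy data: Q57225 has two birth dates, the placeholder child_b has one.
TRIPLES = [
    ("wd:Q57225", "wdt:P22", "wd:Q1339"),
    ("wd:child_b", "wdt:P22", "wd:Q1339"),
    ("wd:Q57225", "wdt:P569", "1732-06-21"),
    ("wd:Q57225", "wdt:P569", "1732-06-23"),
    ("wd:child_b", "wdt:P569", "date_b"),
]


def extend(solution, pattern, triple):
    """Extend a partial solution with one triple, or return None on a mismatch."""
    new = dict(solution)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # An already bound variable (such as ?child) must keep its value.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def solve(patterns, triples):
    """Join the patterns one after another on their shared variables."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for solution in solutions:
            for triple in triples:
                new = extend(solution, pattern, triple)
                if new is not None:
                    next_solutions.append(new)
        solutions = next_solutions
    return solutions


rows = solve(
    [("?child", "wdt:P22", "wd:Q1339"), ("?child", "wdt:P569", "?birth_date")],
    TRIPLES,
)
rows_per_child = Counter(row["?child"] for row in rows)
print(rows_per_child)
```

Every matching birth-date triple yields its own row, so a child with two birth dates is listed twice.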
OPTIONAL triples
If a subject has no value for one of the triples, that subject is excluded from the result. To include it anyway, use the OPTIONAL keyword.
SELECT DISTINCT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
?child wdt:P22 wd:Q76. # ?child has father Obama
OPTIONAL{ ?child wdt:P21 ?gender. }
OPTIONAL{ ?child wdt:P569 ?birth_date. }
OPTIONAL{ ?child wdt:P570 ?date_of_death. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Both children are shown, even though one of the variables (in this case the date of death) has no value.
See the chapter OPTIONAL for a full description.
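The difference between a required and an OPTIONAL triple can be sketched as the difference between an inner join and a left join. The Python below is a minimal model under assumptions: required and optional are hypothetical helper names, and wd:child_a, wd:child_b, date_a and date_b are made-up placeholders for the two children and their birth dates.

```python
# Toy data: two placeholder children of Q76, each with a birth date
# and neither with a date of death (P570).
TRIPLES = [
    ("wd:child_a", "wdt:P22", "wd:Q76"),
    ("wd:child_b", "wdt:P22", "wd:Q76"),
    ("wd:child_a", "wdt:P569", "date_a"),
    ("wd:child_b", "wdt:P569", "date_b"),
]


def matches(solution, pattern, triples):
    """Yield every extension of `solution` by a triple that fits `pattern`."""
    for triple in triples:
        new = dict(solution)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield new


def required(solutions, pattern, triples):
    """Plain triple: a row without a match is dropped."""
    return [new for s in solutions for new in matches(s, pattern, triples)]


def optional(solutions, pattern, triples):
    """OPTIONAL triple: a row without a match is kept unchanged."""
    result = []
    for s in solutions:
        extended = list(matches(s, pattern, triples))
        result.extend(extended or [s])
    return result


children = required([{}], ("?child", "wdt:P22", "wd:Q76"), TRIPLES)
death = ("?child", "wdt:P570", "?date_of_death")

strict = required(children, death, TRIPLES)   # no child has P570
lenient = optional(children, death, TRIPLES)  # both children survive
print(len(strict), len(lenient))
```

In the lenient result the variable ?date_of_death simply stays unbound, which corresponds to the empty cell in the query result.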
Complex triples
Triples are not limited to one subject. In fact, triples can be linked in any conceivable way.
You could, for instance, list the coordinates of the birthplaces of the children of Bach:
SELECT ?child ?childLabel ?placeofbirthLabel ?coordinates
WHERE
{
?child wdt:P22 wd:Q1339. # ?child has father Bach
?child wdt:P19 ?placeofbirth.
?placeofbirth wdt:P625 ?coordinates.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?placeofbirthLabel
You could even see these birthplaces (Köthen, Leipzig and Weimar) on a map by using #defaultView:Map
#defaultView:Map
SELECT ?placeofbirthLabel ?coordinates
(GROUP_CONCAT(DISTINCT ?childLabel; SEPARATOR=", ") AS ?children)
WHERE
{
?child wdt:P22 wd:Q1339. # ?child has father Bach
?child wdt:P19 ?placeofbirth.
?placeofbirth wdt:P625 ?coordinates.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?child rdfs:label ?childLabel.
?placeofbirth rdfs:label ?placeofbirthLabel.
}
}
GROUP BY ?placeofbirthLabel ?coordinates
If you click on a red dot, you get the additional data specified above with the variables ?placeofbirthLabel and ?children. This query needs GROUP BY, GROUP_CONCAT and DISTINCT, and all labels have to be defined explicitly in the SERVICE clause.
You can toggle between the map display and the standard table display with the Display drop-down list, at the right side of the Run button.
See more about views at Map views or all views
Triples by number of variables
Triples with one variable
An example of a triple with one variable for Subject would be
SELECT ?child ?childLabel
WHERE
{
?child wdt:P22 wd:Q1339. # ?child has father Bach
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This will list all Subjects (as variable ?child) with Predicate father (P22) and Object Johann Sebastian Bach (Q1339).
An example of a triple with one variable for Predicate would be
SELECT ?predicate ?pLabel
WHERE
{
wd:Q57225 ?predicate wd:Q1339. # Johann Christoph Friedrich Bach ?predicate Johann Sebastian Bach
BIND( IRI(REPLACE( STR(?predicate),"prop/direct/","entity/" )) AS ?p).
# or ?p wikibase:directClaim ?predicate.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This will list all Predicates (as variable ?predicate) with Subject Johann Christoph Friedrich Bach (Q57225) and Object Johann Sebastian Bach (Q1339).
It shows that Johann Sebastian Bach is not only his father (P22) but also his teacher, since the result includes student of (P1066).
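The BIND line in the query above is easy to overlook. A predicate such as wdt:P22 is an IRI in the prop/direct/ namespace, while labels are attached to the property entity wd:P22 in the entity/ namespace, so the predicate IRI is rewritten before the label service is asked for ?pLabel. The Python sketch below only mirrors that string replacement; the function name property_entity is an assumption made for illustration.

```python
WDT = "http://www.wikidata.org/prop/direct/"  # namespace behind the wdt: prefix
WD = "http://www.wikidata.org/entity/"        # namespace behind the wd: prefix


def property_entity(predicate_iri):
    """Mirror REPLACE(STR(?predicate), "prop/direct/", "entity/")."""
    return predicate_iri.replace("prop/direct/", "entity/")


p = property_entity(WDT + "P22")
print(p)  # http://www.wikidata.org/entity/P22
```

The commented alternative in the query, ?p wikibase:directClaim ?predicate, reaches the same property entity by following a triple instead of editing the IRI text.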
An example of a triple with one variable for Object would be
SELECT ?workloc ?worklocLabel
WHERE
{
wd:Q1339 wdt:P937 ?workloc. # Bach work location ?workloc
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This will list all Objects (as variable ?workloc) with Subject Johann Sebastian Bach (Q1339) and Predicate work location (P937).
Triples with two variables
An example of a triple with 2 variables and only a fixed value for Subject would list all raw information available in Wikidata about Bach
SELECT ?predicate ?object
WHERE
{
wd:Q1339 ?predicate ?object. # Bach
}
See the section on triples with three variables below for further usage.
An example of a triple with 2 variables and only a fixed value for Predicate would list all subjects (probably airports) with an IATA airport code
SELECT ?subject ?subjectLabel ?object
WHERE
{
?subject wdt:P238 ?object. # IATA airport code
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?object
One use could be to check for duplicate IATA codes:
SELECT ?object (COUNT(?subject) AS ?count)
(MIN(?subject) AS ?subject1) (MAX(?subject) AS ?subject2)
(GROUP_CONCAT(DISTINCT ?subjectLabel; SEPARATOR=", ") AS ?subjectLabels)
WHERE
{
?subject wdt:P238 ?object. # IATA airport code
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?subject rdfs:label ?subjectLabel.
}
}
GROUP BY ?object
HAVING(COUNT(?subject) > 1)
ORDER BY ?object
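The GROUP BY and HAVING steps of the duplicate check can be pictured as collecting subjects per code and keeping only the codes with more than one subject. The following Python sketch shows that logic on made-up data: the airports wd:airport_a to wd:airport_c and the codes CODE1 and CODE2 are placeholders, not real Wikidata items or IATA codes.

```python
from collections import defaultdict

# Toy data: two placeholder airports share CODE1, one has CODE2.
TRIPLES = [
    ("wd:airport_a", "wdt:P238", "CODE1"),
    ("wd:airport_b", "wdt:P238", "CODE1"),
    ("wd:airport_c", "wdt:P238", "CODE2"),
]

# GROUP BY ?object: collect all subjects per code.
subjects_by_code = defaultdict(list)
for subject, predicate, code in TRIPLES:
    if predicate == "wdt:P238":
        subjects_by_code[code].append(subject)

# HAVING(COUNT(?subject) > 1): keep only codes used by several subjects.
duplicates = {
    code: sorted(subjects)
    for code, subjects in subjects_by_code.items()
    if len(subjects) > 1
}
print(duplicates)
```

Codes that occur only once, like CODE2 here, are filtered out, so the result contains just the candidates that need checking.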
An example of a triple with 2 variables and only a fixed value for Object would list all subjects related to Bach
SELECT ?subject ?subjectLabel ?subjectDescription ?predicate ?pLabel
WHERE
{
?subject ?predicate wd:Q1339. # Bach
BIND( IRI(REPLACE( STR(?predicate),"prop/direct/","entity/" )) AS ?p).
# or ?p wikibase:directClaim ?predicate.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?subject
Another possibility for a triple with a fixed value for Object is to list all subjects with the value "ABC"; the result includes, for instance, Albacete Airport
SELECT ?subject ?subjectLabel ?subjectDescription ?predicate ?pLabel
WHERE
{
?subject ?predicate "ABC".
BIND( IRI(REPLACE( STR(?predicate),"prop/direct/","entity/" )) AS ?p).
# or ?p wikibase:directClaim ?predicate.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?subject
Triples with three variables
When you use a triple with all 3 parts as variables (one for Subject, one for Predicate and one for Object), you basically list the whole database. This can be done for small databases, and can also be used to get a rough idea of the available data and of all available properties.
All raw information available in Wikidata about the children of Bach:
SELECT ?subject ?predicate ?object
WHERE
{
?subject ?predicate ?object.
?subject wdt:P22 wd:Q1339. # subject has father Bach
}
ORDER BY ?subject ?predicate ?object
LIMIT 10000
The same query but grouped by predicate:
SELECT DISTINCT ?subject ?subjectLabel ?predicate
(GROUP_CONCAT(DISTINCT ?object; SEPARATOR=", ") AS ?objects)
WHERE
{
?subject ?predicate ?object.
?subject wdt:P22 wd:Q1339. # subject has father Bach
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?subject ?subjectLabel ?predicate
ORDER BY ?subject ?subjectLabel ?predicate
LIMIT 10000
From the query below you can discover triples about the date the Wikidata page was last updated, the total number of statements, the number of sitelinks etc. These are schema:dateModified, wikibase:statements and wikibase:sitelinks respectively.
SELECT ?subject ?subjectLabel ?datemodified ?statements ?sitelinks
WHERE
{
?subject wdt:P22 wd:Q1339. # subject has father Bach
?subject schema:dateModified ?datemodified.
?subject wikibase:statements ?statements.
?subject wikibase:sitelinks ?sitelinks.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}