SPARQL/Modifiers
There are 4 optional modifiers to the SELECT ... WHERE { .. }
. See the chapter on SELECT for an introduction.
The five modifiers are GROUP BY ...
, HAVING ...
, ORDER BY ...
, LIMIT ...
and OFFSET ...
.
GROUP BY
[edit | edit source]Let us consider this list of children of Bach, listing also their mother
SELECT ?mother ?motherLabel ?child ?childLabel
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P25 ?mother.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Let us group this list by the mothers.
SELECT ?mother ?motherLabel (COUNT(?child) AS ?children)
(GROUP_CONCAT(DISTINCT ?childLabel; SEPARATOR=", ") AS ?names)
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P25 ?mother.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?mother rdfs:label ?motherLabel.
?child rdfs:label ?childLabel.
}
}
GROUP BY ?mother ?motherLabel
We GROUP BY
both variables ?mother
and ?motherLabel
because if we would leave out the label we would end up with an error "Bad Aggregate".
First we added (COUNT(?child) AS ?children)
to count the number of children. COUNT
is one of the Aggregate functions MIN
, MAX
, SUM
, AVG
, COUNT
or SAMPLE
.
Notice first that it counts the variable ?child
. Notice as well that the syntax is (COUNT(?var1) AS ?var2)
.
As second information we added a combined list of children per mother by using (GROUP_CONCAT(DISTINCT ?var1; SEPARATOR=", ") AS ?var2)
.
When using GROUP_CONCAT
and a label, all labels should be defined explicitly in the SERVICE.
HAVING
[edit | edit source]HAVING
is always used in combination with GROUP BY
SELECT ?mother ?motherLabel (COUNT(?child) AS ?children)
(GROUP_CONCAT(DISTINCT ?childLabel; SEPARATOR=", ") AS ?names)
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P25 ?mother.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?mother rdfs:label ?motherLabel.
?child rdfs:label ?childLabel.
}
}
GROUP BY ?mother ?motherLabel
HAVING (COUNT(?child)>7)
HAVING
will filter out groups that does not meet the condition specified. In this case only one mother is shown, with 13 children.
As COUNT(?child)
is bound to the variable ?children
the HAVING
clause can also be written as HAVING (?children>7)
.
The HAVING
clause can be useful for finding duplicates for instance like HAVING (COUNT(?var)>1)
.
ORDER BY
[edit | edit source]ORDER BY something
sorts the results by something
. something
can be any expression or variables. This expression can also be wrapped in either ASC()
or DESC()
to specify the sorting order (ascending or descending). (If you don’t specify either, the default is ascending sort, so ASC(something)
is equivalent to just something
.)
SELECT ?mother ?motherLabel ?child ?childLabel
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P25 ?mother.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?motherLabel) ?childLabel
LIMIT
[edit | edit source]LIMIT count
cuts off the result list at count
results, where count
is any natural number. For example, LIMIT 10
limits the query to ten results. LIMIT 1
only returns a single result.
This could be useful to get the top 10 results, or only 10 random results to see how the data looks like.
OFFSET
[edit | edit source]OFFSET count
can be used to skip the first number of results. OFFSET 100
LIMIT 10
returns records 101-110.
References
[edit | edit source]