SPARQL/Subqueries

From Wikibooks, open books for an open world

SPARQL allows one SELECT query to be nested inside another. The inner SELECT query is called a subquery and is logically evaluated first. Only the variables listed in the subquery's SELECT clause are visible to the outer query, where they can be used like any other variable.

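A minimal example follows. Because the outer pattern and the subquery share no variable, the result contains every combination of ?x and ?y (16 rows):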

SELECT ?x ?y WHERE {
  VALUES ?x { 1 2 3 4 }
  {
    SELECT ?y WHERE { VALUES ?y { 5 6 7 8 }  }
  }  # end of subquery
} # end of main query

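As a small illustration of variable scope, the sketch below assumes the standard SPARQL rule that only variables named in a subquery's SELECT clause are visible outside it. The variable ?z is a hypothetical name introduced for this example: it is bound inside the subquery but not projected, so the outer query never sees its values.

```sparql
SELECT ?x ?y ?z WHERE {
  VALUES ?x { 1 2 }
  {
    # ?z is bound here but not listed in SELECT, so it stays local to the subquery
    SELECT ?y WHERE { VALUES (?y ?z) { (5 "a") (6 "b") } }
  }
}
```

The result has four rows, one for each combination of ?x and ?y, and the ?z column is empty in all of them.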

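The example below lists the population of each country and expresses it as a percentage of the world's total population. A subquery computes the world's total population. It returns only ?worldpopulation, so the ?country and ?population variables used inside it do not interfere with the variables of the same name in the outer query.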

SELECT ?countryLabel ?population (round(?population/?worldpopulation*1000)/10 AS ?percentage)
WHERE {
  ?country wdt:P31 wd:Q3624078;    # is a sovereign state
           wdt:P1082 ?population.

  { 
    # subquery to determine ?worldpopulation
    SELECT (sum(?population) AS ?worldpopulation)
    WHERE { 
      ?country wdt:P31 wd:Q3624078;    # is a sovereign state
               wdt:P1082 ?population. 
    }
  }

  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".}
}
ORDER BY desc(?population)

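A subquery can also return a variable that the outer query shares, in which case the two are joined on that variable. The sketch below is a variation on the query above and has not been tested against the live endpoint: it assumes that wdt:P30 (continent) is set for the countries concerned, and it expresses each population as a percentage of its continent's total rather than the world's. The names ?continent and ?continentPopulation are introduced for this illustration only.

```sparql
SELECT ?countryLabel ?continentLabel ?population
       (ROUND(?population / ?continentPopulation * 1000) / 10 AS ?percentage)
WHERE {
  ?country wdt:P31 wd:Q3624078;    # is a sovereign state
           wdt:P30 ?continent;
           wdt:P1082 ?population.

  {
    # subquery: one row per continent, joined to the outer query on ?continent
    SELECT ?continent (SUM(?population) AS ?continentPopulation)
    WHERE {
      ?country wdt:P31 wd:Q3624078;
               wdt:P30 ?continent;
               wdt:P1082 ?population.
    }
    GROUP BY ?continent
  }

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ?continentLabel DESC(?population)
```

A country recorded with more than one continent would appear once per continent and be counted in each continent's total, so the percentages are only as reliable as the underlying data.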

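The syntax of a query with a subquery is shown below. A subquery has the same form as an ordinary SELECT query and is enclosed in curly braces { }. Modifiers that belong to the subquery, such as GROUP BY, ORDER BY and LIMIT, go inside those braces, after the subquery's WHERE clause.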

SELECT  ... query result variables ...
WHERE 
{
        ... query pattern ...

        { # subquery
          SELECT  ... subquery result variables ...
          WHERE 
          {
                  ... subquery pattern ...
          }
                  ... optional subquery modifiers ...
        } # end of subquery

}
        ... optional query modifiers ...
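A concrete instance of this template, as a minimal sketch using only inline data: the subquery's ORDER BY and LIMIT keep the two largest values, and the outer query then works with just those two. The variable names ?n and ?square are chosen for illustration.

```sparql
SELECT ?n ?square WHERE {
  { # subquery
    SELECT ?n WHERE { VALUES ?n { 4 1 3 2 5 } }
    ORDER BY DESC(?n)   # subquery modifiers: keep the two largest values
    LIMIT 2
  } # end of subquery
  BIND(?n * ?n AS ?square)
}
ORDER BY ?n             # query modifier: applies to the final result
```

This returns two rows, ?n = 4 with ?square = 16 and ?n = 5 with ?square = 25. The outer ORDER BY sorts only the rows that the subquery let through.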

Subqueries can be used, often with a LIMIT, to avoid a query timeout by splitting the task into smaller parts. As an example, this query times out:

# 100 humans whose month of death is exactly 6 months away from their month of birth, on the same day of the month.
SELECT DISTINCT ?itemLabel ?item WHERE {
  ?item wdt:P31 wd:Q5 ;
        p:P569/psv:P569 [wikibase:timePrecision ?datePrecision1; wikibase:timeValue ?naissance] ;
        p:P570/psv:P570 [wikibase:timePrecision ?datePrecision2; wikibase:timeValue ?mort ].
  filter(?datePrecision1>10)
  filter(?datePrecision2>10)
  
  bind(month(?mort) - month(?naissance) as ?mois)
  bind(day(?mort) - day(?naissance) as ?jour)
  
  filter(abs(?mois) = 6)
  filter(?jour = 0)

  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".}
}
ORDER BY ?itemLabel
LIMIT 100

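The same query does not time out when the LIMIT is placed on the selected items in a subquery and the label service is called outside it, so that labels are fetched for only 100 items. Note that the results are not identical: the LIMIT is now applied before the outer ORDER BY, so the query returns 100 arbitrary matching items sorted by label rather than the first 100 in label order.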

# 100 humans whose month of death is exactly 6 months away from their month of birth, on the same day of the month.
SELECT DISTINCT ?itemLabel ?item WHERE {
  {
    SELECT DISTINCT ?item WHERE {
      ?item wdt:P31 wd:Q5 ; 
            p:P569/psv:P569 [wikibase:timePrecision ?datePrecision1; wikibase:timeValue ?naissance] ;
            p:P570/psv:P570 [wikibase:timePrecision ?datePrecision2; wikibase:timeValue ?mort ].

      filter(?datePrecision1>10)
      filter(?datePrecision2>10)

      bind(month(?mort) - month(?naissance) as ?mois)
      bind(day(?mort) - day(?naissance) as ?jour)
      filter(abs(?mois) = 6)
      filter(?jour = 0)
    }
    LIMIT 100
  }

  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".}
}
ORDER BY ?itemLabel