Jump to content

SPARQL/Sentences

From Wikibooks, open books for an open world

Comma, Semicolon and Period

[edit | edit source]

At the Basics chapter we’ve seen all children of Johann Sebastian Bach – more specifically: all items with the father Johann Sebastian Bach. But Bach had two wives, and so those items have two different mothers: what if we only want to see the children of Johann Sebastian Bach with his first wife, Maria Barbara Bach (Q57487)? Try writing that query, based on the one above.

Done that? Okay, then onto the solution! The simplest way to do this is to add a second triple with that restriction:

SELECT ?child ?childLabel
WHERE
{
  ?child wdt:P22 wd:Q1339.
  ?child wdt:P25 wd:Q57487.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

Try it!

In English, this reads:

Child has father Johann Sebastian Bach. Child has mother Maria Barbara Bach.

That sounds a bit awkward, doesn’t it? In natural language, we’d abbreviate this:

Child has father Johann Sebastian Bach and mother Maria Barbara Bach.

In fact, it’s possible to express the same abbreviation in SPARQL as well: if you end a triple with a semicolon (;) instead of a period, you can add another predicate-object pair. This allows us to abbreviate the above query to:

SELECT ?child ?childLabel
WHERE
{
  ?child wdt:P22 wd:Q1339;
         wdt:P25 wd:Q57487.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

Try it!

which has the same results, but less repetition in the query.

Now suppose that, out of those results, we’re interested only in those children who also were also composers and pianists. The relevant properties and items are occupation (P106), composer (Q36834) and pianist (Q486748). Try updating the above query to add these restrictions!

Here’s my solution:

SELECT ?child ?childLabel
WHERE
{
  ?child wdt:P22 wd:Q1339;
         wdt:P25 wd:Q57487;
         wdt:P106 wd:Q36834;
         wdt:P106 wd:Q486748.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

Try it!

This uses the ; abbreviation two more times to add the two required occupations. But as you might notice, there’s still some repetition. This is as if we said:

Child has occupation composer and occupation pianist.

which we would usually abbreviate as:

Child has occupation composer and pianist.

And SPARQL has some syntax for that as well: just like a ; allows you to append a predicate-object pair to a triple (reusing the subject), a , allows you to append another object to a triple (reusing both subject and predicate). With this, the query can be abbreviated to:

SELECT ?child ?childLabel
WHERE
{
  ?child wdt:P22 wd:Q1339;
         wdt:P25 wd:Q57487;
         wdt:P106 wd:Q36834,
                  wd:Q486748.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

Try it!

Note: indentation and other whitespace doesn’t actually matter – I’ve just indented the query to make it more readable. You can also write this as:

SELECT ?child ?childLabel
WHERE
{
  ?child wdt:P22 wd:Q1339;
         wdt:P25 wd:Q57487;
         wdt:P106 wd:Q36834, wd:Q486748.
  # both occupations in one line
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

Try it!

or, rather less readable:

SELECT ?child ?childLabel
WHERE
{
  ?child wdt:P22 wd:Q1339;
  wdt:P25 wd:Q57487;
  wdt:P106 wd:Q36834,
  wd:Q486748.
  # no indentation; makes it hard to distinguish between ; and ,
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

Try it!

Luckily, the WDQS editor indents lines for you automatically, so you usually don’t have to worry about this.

Alright, let’s summarize here. We’ve seen that queries are structured like text. Each triple about a subject is terminated by a period. Multiple predicates about the same subject are separated by semicolons, and multiple objects for the same subject and predicate can be listed separated by commas.

SELECT ?s1 ?s2 ?s3
WHERE
{
  ?s1 p1 o1;
      p2 o2;
      p3 o31, o32, o33.
  ?s2 p4 o41, o42.
  ?s3 p5 o5;
      p6 o6.
}

Brackets ([ ])

[edit | edit source]

Now I want to introduce one more abbreviation that SPARQL offers. So if you’ll humor me for one more hypothetical scenario…

Suppose we’re not actually interested in Bach’s children. (Who knows, perhaps that’s actually true for you!) But we are interested in his grandchildren. (Hypothetically.) There’s one complication here: a grandchild may be related to Bach via the mother or the father. That’s two different properties, which is inconvenient. Instead, let’s flip the relation around: Wikidata also has a “child” property, child (P40), which points from parent to child and is gender-independent. With this information, can you write a query that returns Bach’s grandchildren?

Here’s my solution:

SELECT ?grandChild ?grandChildLabel
WHERE
{
  wd:Q1339 wdt:P40 ?child.
  ?child wdt:P40 ?grandChild.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

Try it!

In natural language, this reads:

Bach has a child ?child. ?child has a child ?grandChild.

Once more, I propose that we abbreviate this English sentence, and then I want to show you how SPARQL supports a similar abbreviation. Observe how we actually don’t care about the child: we don’t use the variable except to talk about the grandchild. We could therefore abbreviate the sentence to:

Bach has as child someone who has a child ?grandChild.

Instead of saying who Bach’s child is, we just say “someone”: we don’t care who it is. But we can refer back to them because we’ve said “someone who”: this starts a relative clause, and within that relative clause we can say things about “someone” (e. g., that he or she “has a child ?grandChild”). In a way, “someone” is a variable, but a special one that’s only valid within this relative clause, and one that we don’t explicitly refer to (we say “someone who is this and does that”, not “someone who is this and someone who does that” – that’s two different “someone”s).

In SPARQL, this can be written as:

SELECT ?grandChild ?grandChildLabel
WHERE
{
  wd:Q1339 wdt:P40 [ wdt:P40 ?grandChild ].
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

Try it!

You can use a pair of brackets ([]) in place of a variable, which acts as an anonymous variable. Inside the brackets, you can specify predicate-object pairs, just like after a ; after a normal triple; the implicit subject is in this case the anonymous variable that the brackets represent. (Note: also just like after a ;, you can add more predicate-object pairs with more semicolons, or more objects for the same predicate with commas.)

And that’s it for triple patterns! There’s more to SPARQL, but as we’re about to leave the parts of it that are strongly analogous to natural language,

Summary

[edit | edit source]

I’d like to summarize that relationship once more:

natural language example SPARQL example
sentence Juliet loves Romeo. period juliet loves romeo.
conjunction (clause) Romeo loves Juliet and kills himself. semicolon romeo loves juliet; kills romeo.
conjunction (noun) Romeo kills Tybalt and himself. comma romeo kills tybalt, romeo.
relative clause Juliet loves someone who kills Tybalt. brackets juliet loves [ kills tybalt ].

References

[edit | edit source]