SPARQL/Printable version
This is the print version of SPARQL You won't see this message or any elements not part of the book's content when you print or preview this page. |
The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/SPARQL
Basics
SPARQL may look complicated, but the simple basics will already get you a long way – if you want, you can stop reading after this chapter, and you’ll already know enough to write many interesting queries. The other chapters just add information about more topics that you can use to write different queries. Each of them will empower you to write even more awesome queries, but none of them are necessary – you can stop reading at any point and hopefully still walk away with a lot of useful knowledge!
Also, if you’ve never heard of Wikidata, SPARQL, or WDQS before, here’s a short explanation of those terms:
- Wikidata is a knowledge database. It contains lots of statements, like “the capital of Canada is Ottawa”, or “the Mona Lisa is painted in oil paint on poplar wood”, or “gold has a thermal conductivity of 25.418 joule per mole kelvin”.
- SPARQL is a language to formulate questions (queries) for knowledge databases. With the right database, a SPARQL query could answer questions like “what is the most popular tonality in music?” or “which character was portrayed by the most actors?” or “what’s the distribution of blood types?” or “which authors’ works entered the public domain this year?”.
- WDQS, the Wikidata Query Service, brings the two together: You enter a SPARQL query, it runs it against Wikidata’s dataset and shows you the result.
SPARQL basics
[edit | edit source]A simple SPARQL query looks like this:
SELECT ?a ?b ?c
WHERE
{
x y ?a.
m n ?b.
?b f ?c.
}
The SELECT
clause lists variables that you want returned (variables start with a question mark), and the WHERE
clause contains restrictions on them, mostly in the form of triples.
All information in Wikidata (and similar knowledge databases) is stored in the form of triples;
when you run the query, the query service tries to fill in the variables with actual values so that the resulting triples appear in the knowledge database,
and returns one result for each combination of variables it finds.
A triple can be read like a sentence (which is why it ends with a period), with a subject, a predicate, and an object:
SELECT ?fruit
WHERE
{
?fruit hasColor yellow.
?fruit tastes sour.
}
The results for this query could include, for example, “lemon”. In Wikidata, most properties are “has”-kind properties, so the query might instead read:
SELECT ?fruit
WHERE
{
?fruit color yellow.
?fruit taste sour.
}
which reads like “?fruit
has color ‘yellow’” (not “?fruit
is the color of ‘yellow’” – keep this in mind for property pairs like “parent”/“child”!).
However, that’s not a good example for WDQS. Taste is subjective, so Wikidata doesn’t have a property for it. Instead, let’s think about parent/child relationships, which are mostly unambiguous.
Our first query
[edit | edit source]Suppose we want to list all children of the baroque composer Johann Sebastian Bach. Using pseudo-elements like in the queries above, how would you write that query?
Hopefully you got something like this:
SELECT ?child
WHERE
{
# either this...
?child parent Bach.
# or this...
?child father Bach.
# or this.
Bach child ?child.
# (note: everything after a ‘#’ is a comment and ignored by WDQS.)
}
The first two triples say that the ?child
must have the parent/father Bach; the third says that Bach must have the child ?child
. Let’s go with the second one for now.
So what remains to be done in order to turn this into a proper WDQS query? On Wikidata, items and properties are not identified by human-readable names like “father” (property) or “Bach” (item). (For good reason: “Johann Sebastian Bach” is also the name of a German painter, and “Bach” might also refer to the surname, the French commune, the Mercury crater, etc.) Instead, Wikidata items and properties are assigned an identifier. To find the identifier for an item, we search for the item and copy the Q-number of the result that sounds like it’s the item we’re looking for (based on the description, for example). To find the identifier for a property, we do the same, but search for “P:search term” instead of just “search term”, which limits the search to properties. This tells us that the famous composer Johann Sebastian Bach is Q1339, and the property to designate an item’s father is P22.
And last but not least, we need to include prefixes. For simple WDQS triples, items should be prefixed with wd:
, and properties with wdt:
. (But this only applies to fixed values – variables don’t get a prefix!)
Putting this together, we arrive at our first proper WDQS query:
SELECT ?child
WHERE
{
# ?child father Bach
?child wdt:P22 wd:Q1339.
}
Click that “Try it” link, then “Run” the query on the WDQS page. What do you get?
child |
---|
wd:Q57225 |
wd:Q76428 |
… |
Well that’s disappointing. You just see the identifiers. You can click on them to see their Wikidata page (including a human-readable label), but isn’t there a better way to see the results?
Well, as it happens, there is! (Aren’t rhetorical questions great?) If you include the magic text
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
somewhere within the WHERE
clause, you get additional variables: For every variable ?foo
in your query, you now also have a variable ?fooLabel
, which contains the label of the item behind ?foo
. If you add this to the SELECT
clause, you get the item as well as its label:
SELECT ?child ?childLabel
WHERE
{
# ?child father Bach
?child wdt:P22 wd:Q1339.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Try running that query – you should see not only the item numbers, but also the names of the various children.
child | childLabel |
---|---|
wd:Q57225 | Johann Christoph Friedrich Bach |
wd:Q76428 | Carl Philipp Emanuel Bach |
… | … |
This completes the basics. Try amending this by varying the properties.
References
[edit | edit source]
Wikidata Query Service
Wikimedia runs the public service instance of WDQS (Wikidata Query Service), which is available for use at http://query.wikidata.org/.
GUI
[edit | edit source]The GUI at the home page of http://query.wikidata.org/ allows you to edit and submit SPARQL queries to the query engine. The results are displayed as an HTML table. Note that every query has a unique URL which can be bookmarked for later use. Going to this URL will put the query in the edit window, but will not run it - you still have to click "Execute" for that.
One can also generate a short URL for the query via a URL shortening service by clicking the "Generate short URL" link on the right - this will produce the shortened URL for the current query.
The "Add prefixes" button generates the header containing standard prefixes for SPARQL queries. The full list of prefixes that can be useful is listed in the RDF format documentation. Note that most common prefixes work automatically, since WDQS supports them out of the box.
The GUI also features a simple entity explorer which can be activated by clicking on the "🔍" symbol next to the entity result. Clicking on the entity Q-id itself will take you to the entity page on wikidata.org.
Default views
[edit | edit source]If you run the query in the WDQS GUI, you can choose which view to present by specifying a comment: #defaultView:viewName
at the beginning of the query. Supported views are:
Table
- default view, displays the results as a table of valuesMap
- displays coordinate points if any present in the resultImageGrid
- displays images present in the result as a gridBubbleChart
- displays bubble chart for numbers found in the resultTreeMap
- displays hierarchical tree map for numbers found in the resultTimeline
- for results having dates, displays timeline placing each row at appropriate timeDimensions
- displays rows as lines between points on the scales representing each columnGraph
- displays result as a connected graph, usinglinkTo
column
SPARQL endpoint
[edit | edit source]SPARQL queries can be submitted directly to the SPARQL endpoint with a GET request to https://query.wikidata.org/sparql?query=SPARQL
POST requests can be sent to query.wikidata.org/bigdata/namespace/wdq/sparql
. The result is returned as XML by default, or as JSON if either the query parameter format=json
or the header Accept: application/sparql-results+json
are provided.
JSON format is standard SPARQL 1.1 Query Results JSON Format.
Supported formats
[edit | edit source]The following output formats are currently supported by the SPARQL endpoint:
Format | HTTP Header | Query parameter | Description |
---|---|---|---|
XML | Accept: application/sparql-results+xml | format=xml | XML result format, is returned by default. As specified in https://www.w3.org/TR/rdf-sparql-XMLres/ |
JSON | Accept: application/sparql-results+json | format=json | JSON result format, as in: https://www.w3.org/TR/sparql11-results-json/ |
TSV | Accept: text/tab-separated-values | As specified in https://www.w3.org/TR/sparql11-results-csv-tsv/ | |
CSV | Accept: text/csv | As specified in https://www.w3.org/TR/sparql11-results-csv-tsv/ | |
Binary RDF | Accept: application/x-binary-rdf-results-table |
Query timeout
[edit | edit source]There is a hard query deadline configured which is set to 60 seconds. That is true both for the GUI and the public SPARQL endpoint. If your query does not execute in the allowed time, try to optimize it, or reduce the amount of data it returns.
Every query will timeout when it takes more time to execute than this configured deadline. You may want to optimize the query or report a problematic query here mw:Wikidata_query_service/Problematic_queries
Introduction to Wikidata Query Service
[edit | edit source]See SPARQL/Wikidata Query Service - Introduction
References
[edit | edit source]Other SPARQL Endpoints
[edit | edit source]See www.w3.org/wiki/SparqlEndpoints for a list of other SPARQL Endpoints.
Wikidata Query Service - Introduction
How to use Wikidata for a simple query
[edit | edit source]Let’s go through a simple example demonstrating how to get a list of all known cats in the world.
Get a list of all the cats in the world
[edit | edit source]Use this URL to get to the Query Service: https://query.wikidata.org
- Click Examples
- Select Cats from the list that appears in the new window
- Click Run just below the code box
This has now given us a list of all the famous and infamous cats on the Internet – or at least the ones Wikidata knows about. This is great, provided you are a cat person and not, say, a dog person.
What about dogs?
[edit | edit source]If you want to list the dogs instead (or anything else for that matter), there are two ways to edit your query:
- Use the visual query editor to change the item from cat to dog:
- Manually edit the query Each item on Wikidata is uniquely identified using a code. This code is Q146 for “cat”. To find the corresponding code for “dog”, you can look it up on Wikidata:
- Go to https://www.wikidata.org
- Type “dog” in the search box
- Click on the first result from the drop-down menu
- Take note of the item number for “dog” from the Wikidata page that opens (Q144)
To change your query from “cat” to “dog”, simply replace Q146 by Q144 in the SPARQL query editor.
Run the program, and you will be presented with a list of all the famous and infamous dogs on the Internet
Diving into the User Interface
[edit | edit source]How to change the language that the results are displayed in?
[edit | edit source]With this search query tool, you can not only customize and search for simple or aggregated, compound, nested and complex queries but you can also search in ANY language and get results in ANY language too.
- Change the default language code from “en” (English) to any other language code, say “ja” for Japanese.
#Cats
SELECT ?item ?itemLabel WHERE
{
?item wdt:P31 wd:Q146 .
SERVICE wikibase:label { bd:serviceParam wikibase:language "ja" }
}
How to change the sorting order of the results displayed?
[edit | edit source]The order of the items displayed can be easily changed in the interface. Just click on the relevant columns of the search results to customize the sorting order:
How to change the language of the UI?
[edit | edit source]Click the word English and select the language from the list displayed:
Notice the UI has changed to the selected language:
But I want pictures of cats! How to search for images?
[edit | edit source]So far, we’ve seen the output of the queries displayed as a table. Let’s try for some images:
Click the “Examples” button and select the “Even more cats, with pictures” example. Run the query and cat images should pop up at the lower half of the screen.#
This is what displays the result as images instead of a table. You can manually switch how the data is displayed by using the “Display” menu in the lower right corner of the result. Try switching the menu to “Table” to see the same result in a table view.
Let’s have a look at what’s new in the query, compared to the query in the previous chapter. Written in plain English, the query would read “Give me all items that have something to do with cats. Also give me the image for each item. And by the way, display the result as a grid of images, not as a table.”
#defaultView:ImageGrid
What looks like a comment, is actually an instruction to display the query result as images instead of a table.?x1
This means that we query for any predicate, not only for “is an instance of”. Think of predicates like “depicts” (P180) or “is named after” (P138). We are not using ?x1 anywhere else in the query, meaning the placeholder stands for “I don’t care what the predicate is, give me all results that somehow have a cat as the subject”OPTIONAL
Inside the curly braces you see another statement that adds optional data to the result. The placeholder ?item is reused and coupled with the predicate P18 (meaning “(has) image” and a new placeholder ?pic. Wrapping the statement with OPTIONAL means that items do not necessarily have to have a P18 predicate to show up in the list and that the ?pic placeholder can be empty.
Try modifying the query:
- Search for dogs instead of cats. Hint: Change the Object part of the first statement in WHERE.
- Only show items that “depict” cats (you’ll get paintings). Hint: replace ?x1 with something else.
- Remove the OPTIONAL
There are other ways to show your data, but not all of them are always applicable, which is why some are grayed out in the menu. “Image map” is only selectable if the result data actually contains image URLs. In one of the following chapters you’ll learn how to display items as points on a map.
How to share your query for others to use?
[edit | edit source]If you want to share your query with someone else – say, on social media – you can create a short link for your query.
- Complete your query
- Click the LINK icon:
- Copy the URL listed there. This is the URL for the query.
Finding things on maps
[edit | edit source]There are more ways to visualize the query results. If the query asks for geocoordinates, the results can be displayed on a map.
Look at this example of lighthouses in Norway. When you run the query, you’ll see red dots that mark the location of lighthouses on the Norwegian coast.
There is even more
[edit | edit source]There are more visualizations for you to explore:
- Bubble Chart
- Tree Map
- Dimensions
- Graph
Look through the query examples and try them out!
See also
[edit | edit source]See also this Video Querying Wikidata with SPARQL for Absolute Beginners
References
[edit | edit source]
Prefixes
WDQS understands many shortcut abbreviations, known as prefixes. Some are internal to Wikidata wd
, wdt
, p
, ps
, bd
, etc. and many others are commonly used external prefixes, like rdf
, skos
, owl
, schema
, etc.
In the following query, we are asking for items where there is a statement of "P279 = Q7725634" or in fuller terms, selecting subjects that have a predicate of "subclass of" with an object of = "literary work".
For simple WDQS triples, items should be prefixed with wd:
, and properties with wdt:
. This only applies to fixed values – variables don’t get a prefix.
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wds: <http://www.wikidata.org/entity/statement/>
PREFIX wdv: <http://www.wikidata.org/value/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX bd: <http://www.bigdata.com/rdf#>
# The below SELECT query does the following:
# Selects all the items(?s subjects) and their descriptions(?desc)
# that have(WHERE) the statement(?s subject) has a direct property(wdt:) = P279 <subclasses of>
# with a value of entity(wd:) = Q7725634 <Literary Work>
# and Optionally return the label and description using the Wikidata label service
SELECT ?s ?desc WHERE {
?s wdt:P279 wd:Q7725634 .
OPTIONAL {
?s rdfs:label ?desc
FILTER (LANG(?desc) = "en").
}
}
Without the use of prefixes this should be written as
# The below SELECT query does the following:
# Selects all the items(?s subjects) and their descriptions(?desc)
# that have(WHERE) the statement(?s subject) has a direct property(wdt:) = P279 <subclasses of>
# with a value of entity(wd:) = Q7725634 <Literary Work>
# and Optionally return the label and description using the Wikidata label service
SELECT ?s ?desc WHERE {
?s <http://www.wikidata.org/prop/direct/P279> <http://www.wikidata.org/entity/Q7725634> .
OPTIONAL {
?s <http://www.w3.org/2000/01/rdf-schema#label> ?desc
FILTER (LANG(?desc) = "en").
}
}
Actually all mentioned prefixes are built in in the Wikidata Query Service, so they can be left out
# The below SELECT query does the following:
# Selects all the items(?s subjects) and their descriptions(?desc)
# that have(WHERE) the statement(?s subject) has a direct property(wdt:) = P279 <subclasses of>
# with a value of entity(wd:) = Q7725634 <Literary Work>
# and Optionally return the label and description using the Wikidata label service
SELECT ?s ?desc WHERE {
?s wdt:P279 wd:Q7725634 .
OPTIONAL {
?s rdfs:label ?desc
FILTER (LANG(?desc) = "en").
}
}
The full list of built in prefixes is
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wds: <http://www.wikidata.org/entity/statement/>
PREFIX wdv: <http://www.wikidata.org/value/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX wdref: <http://www.wikidata.org/reference/>
PREFIX psv: <http://www.wikidata.org/prop/statement/value/>
PREFIX psn: <http://www.wikidata.org/prop/statement/value-normalized/>
PREFIX pqv: <http://www.wikidata.org/prop/qualifier/value/>
PREFIX pqn: <http://www.wikidata.org/prop/qualifier/value-normalized/>
PREFIX pr: <http://www.wikidata.org/prop/reference/>
PREFIX prv: <http://www.wikidata.org/prop/reference/value/>
PREFIX prn: <http://www.wikidata.org/prop/reference/value-normalized/>
PREFIX wdno: <http://www.wikidata.org/prop/novalue/>
PREFIX wdata: <http://www.wikidata.org/wiki/Special:EntityData/>
PREFIX schema: <http://schema.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX bds: <http://www.bigdata.com/rdf/search#>
PREFIX gas: <http://www.bigdata.com/rdf/gas#>
PREFIX hint: <http://www.bigdata.com/queryHints#>
References
[edit | edit source]- mw:Wikibase/Indexing/RDF_Dump_Format#Prefixes_used - details about each prefix
- Wikidata query service/User Manual
Sentences
Comma, Semicolon and Period
[edit | edit source]At the Basics chapter we’ve seen all children of Johann Sebastian Bach – more specifically: all items with the father Johann Sebastian Bach. But Bach had two wives, and so those items have two different mothers: what if we only want to see the children of Johann Sebastian Bach with his first wife, Maria Barbara Bach (Q57487)? Try writing that query, based on the one above.
Done that? Okay, then onto the solution! The simplest way to do this is to add a second triple with that restriction:
SELECT ?child ?childLabel
WHERE
{
?child wdt:P22 wd:Q1339.
?child wdt:P25 wd:Q57487.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
In English, this reads:
Child has father Johann Sebastian Bach. Child has mother Maria Barbara Bach.
That sounds a bit awkward, doesn’t it? In natural language, we’d abbreviate this:
Child has father Johann Sebastian Bach and mother Maria Barbara Bach.
In fact, it’s possible to express the same abbreviation in SPARQL as well: if you end a triple with a semicolon (;
) instead of a period, you can add another predicate-object pair. This allows us to abbreviate the above query to:
SELECT ?child ?childLabel
WHERE
{
?child wdt:P22 wd:Q1339;
wdt:P25 wd:Q57487.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
which has the same results, but less repetition in the query.
Now suppose that, out of those results, we’re interested only in those children who also were also composers and pianists. The relevant properties and items are occupation (P106), composer (Q36834) and pianist (Q486748). Try updating the above query to add these restrictions!
Here’s my solution:
SELECT ?child ?childLabel
WHERE
{
?child wdt:P22 wd:Q1339;
wdt:P25 wd:Q57487;
wdt:P106 wd:Q36834;
wdt:P106 wd:Q486748.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This uses the ;
abbreviation two more times to add the two required occupations. But as you might notice, there’s still some repetition. This is as if we said:
Child has occupation composer and occupation pianist.
which we would usually abbreviate as:
Child has occupation composer and pianist.
And SPARQL has some syntax for that as well: just like a ;
allows you to append a predicate-object pair to a triple (reusing the subject), a ,
allows you to append another object to a triple (reusing both subject and predicate). With this, the query can be abbreviated to:
SELECT ?child ?childLabel
WHERE
{
?child wdt:P22 wd:Q1339;
wdt:P25 wd:Q57487;
wdt:P106 wd:Q36834,
wd:Q486748.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Note: indentation and other whitespace doesn’t actually matter – I’ve just indented the query to make it more readable. You can also write this as:
SELECT ?child ?childLabel
WHERE
{
?child wdt:P22 wd:Q1339;
wdt:P25 wd:Q57487;
wdt:P106 wd:Q36834, wd:Q486748.
# both occupations in one line
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
or, rather less readable:
SELECT ?child ?childLabel
WHERE
{
?child wdt:P22 wd:Q1339;
wdt:P25 wd:Q57487;
wdt:P106 wd:Q36834,
wd:Q486748.
# no indentation; makes it hard to distinguish between ; and ,
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Luckily, the WDQS editor indents lines for you automatically, so you usually don’t have to worry about this.
Alright, let’s summarize here. We’ve seen that queries are structured like text. Each triple about a subject is terminated by a period. Multiple predicates about the same subject are separated by semicolons, and multiple objects for the same subject and predicate can be listed separated by commas.
SELECT ?s1 ?s2 ?s3
WHERE
{
?s1 p1 o1;
p2 o2;
p3 o31, o32, o33.
?s2 p4 o41, o42.
?s3 p5 o5;
p6 o6.
}
Brackets ([ ])
[edit | edit source]Now I want to introduce one more abbreviation that SPARQL offers. So if you’ll humor me for one more hypothetical scenario…
Suppose we’re not actually interested in Bach’s children. (Who knows, perhaps that’s actually true for you!) But we are interested in his grandchildren. (Hypothetically.) There’s one complication here: a grandchild may be related to Bach via the mother or the father. That’s two different properties, which is inconvenient. Instead, let’s flip the relation around: Wikidata also has a “child” property, child (P40), which points from parent to child and is gender-independent. With this information, can you write a query that returns Bach’s grandchildren?
Here’s my solution:
SELECT ?grandChild ?grandChildLabel
WHERE
{
wd:Q1339 wdt:P40 ?child.
?child wdt:P40 ?grandChild.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
In natural language, this reads:
Bach has a child
?child
.?child
has a child?grandChild
.
Once more, I propose that we abbreviate this English sentence, and then I want to show you how SPARQL supports a similar abbreviation. Observe how we actually don’t care about the child: we don’t use the variable except to talk about the grandchild. We could therefore abbreviate the sentence to:
Bach has as child someone who has a child
?grandChild
.
Instead of saying who Bach’s child is, we just say “someone”: we don’t care who it is. But we can refer back to them because we’ve said “someone who”: this starts a relative clause, and within that relative clause we can say things about “someone” (e. g., that he or she “has a child ?grandChild
”). In a way, “someone” is a variable, but a special one that’s only valid within this relative clause, and one that we don’t explicitly refer to (we say “someone who is this and does that”, not “someone who is this and someone who does that” – that’s two different “someone”s).
In SPARQL, this can be written as:
SELECT ?grandChild ?grandChildLabel
WHERE
{
wd:Q1339 wdt:P40 [ wdt:P40 ?grandChild ].
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
You can use a pair of brackets ([]
) in place of a variable, which acts as an anonymous variable. Inside the brackets, you can specify predicate-object pairs, just like after a ;
after a normal triple; the implicit subject is in this case the anonymous variable that the brackets represent. (Note: also just like after a ;
, you can add more predicate-object pairs with more semicolons, or more objects for the same predicate with commas.)
And that’s it for triple patterns! There’s more to SPARQL, but as we’re about to leave the parts of it that are strongly analogous to natural language,
Summary
[edit | edit source]I’d like to summarize that relationship once more:
natural language | example | SPARQL | example |
---|---|---|---|
sentence | Juliet loves Romeo. | period | juliet loves romeo.
|
conjunction (clause) | Romeo loves Juliet and kills himself. | semicolon | romeo loves juliet; kills romeo.
|
conjunction (noun) | Romeo kills Tybalt and himself. | comma | romeo kills tybalt, romeo.
|
relative clause | Juliet loves someone who kills Tybalt. | brackets | juliet loves [ kills tybalt ].
|
References
[edit | edit source]
Triples
Introduction
[edit | edit source]The statement "The sky has the color blue", consists of a subject ("the sky"), a predicate ("has the color"), and an object ("blue").
SPO or "subject, predicate, object" is known as a (Semantic) triple, or commonly referred to in Wikidata as a statement about data.
SPO is also used as a form of basic syntax layout for querying RDF data structures, or any graph database or triplestore, such as the Wikidata Query Service (WDQS).
See also w:en:Semantic triple
In Wikidata Query Service (WDQS) triples are used to describe the Query pattern in the WHERE
clause of the SELECT
statement
# ?child father Bach ?child wdt:P22 wd:Q1339.
In this case the triple ?child wdt:p22 wd:Q1339
specifies that the variable ?child
must have the parent/father Bach.
Any of the triple parts Subject, Predicate and Object may be variables. This makes this selection very versatile.
Triples with the same subject
[edit | edit source]Aditional variables can be added by adding additional triples. In the simplest case these triples use the same subject.
SELECT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P21 ?gender.
?child wdt:P569 ?birth_date.
?child wdt:P570 ?date_of_death.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
The first triple selects all the children of Bach. The additional triples links all these triples with a value for gender, birth date and date of death. The variable ?child
links all of them together.
If you look closely at the result you might have noticed that Johann Christoph Friedrich Bach has 2 lines in the list because there are 2 different birth dates, 21 and 23 of June 1732. In his case ?child wdt:P569 ?birth_date.
resulted into 2 values. See for further details at removing duplicates and modifiers.
OPTIONAL triples
[edit | edit source]If not all subjects have a value for a certain triple the subject is excluded. To have it included the OPTIONAL
keyword comes in handy.
SELECT DISTINCT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
?child wdt:P22 wd:Q76.# ?child has father Obama
OPTIONAL{ ?child wdt:P21 ?gender. }
OPTIONAL{ ?child wdt:P569 ?birth_date. }
OPTIONAL{ ?child wdt:P570 ?date_of_death. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Both children are shown, even if one of the variables (in this case the date of death) is not filled in.
See the chapter OPTIONAL for a full description.
Complex triples
[edit | edit source]Triples are not limited to one subject. In fact triples can be linked in any thinkable way.
You would for instance be able to list the coordinates of the birth places of the children of Bach
SELECT ?child ?childLabel ?placeofbirthLabel ?coordinates
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P19 ?placeofbirth.
?placeofbirth wdt:P625 ?coordinates.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?placeofbirthLabel
You could even see these birthplaces (Köthen, Leipzig and Weimar) on a map by using #defaultView:Map
#defaultView:Map
SELECT ?placeofbirthLabel ?coordinates
(GROUP_CONCAT(DISTINCT ?childLabel; SEPARATOR=", ") AS ?children)
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P19 ?placeofbirth.
?placeofbirth wdt:P625 ?coordinates.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?child rdfs:label ?childLabel.
?placeofbirth rdfs:label ?placeofbirthLabel.
}
}
GROUP BY ?placeofbirthLabel ?coordinates ?children
If you click on a red dot you will get additional data as specified above with the variables ?placeofbirthLabel
and ?children
. We had to use GROUP BY
, GROUP_CONCAT
, DISTINCT
and all labels should be defined explicitly in the SERVICE
.
You can toggle between the Map display and standard table display by the Display drop down list, at the right side of the Run button.
See more about views at Map views or all views
Triples by number of variables
[edit | edit source]Triples with one variable
[edit | edit source]An example of a triple with one variable for Subject would be
SELECT ?child ?childLabel
WHERE
{
?child wdt:P22 wd:Q1339. # ?child has father Bach
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This will list all Subjects (as variable ?child
) with Predicate father (P22) and Object Johann Sebastian Bach (Q1339).
An example of a triple with one variable for Predicate would be
SELECT ?predicate ?pLabel
WHERE
{
wd:Q57225 ?predicate wd:Q1339. # Johann Christoph Friedrich Bach ?predicate Johann Sebastian Bach
BIND( IRI(REPLACE( STR(?predicate),"prop/direct/","entity/" )) AS ?p).
# or ?p wikibase:directClaim ?predicate.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This will list all Predicates (as variable ?predicate
) with Object Johann Christoph Friedrich Bach (Q57225) and Subject Johann Sebastian Bach (Q1339).
It shows that he is not only his father (P22) but also student of (P1066) him
An example of a triple with one variable for Object would be
SELECT ?workloc ?worklocLabel
WHERE
{
wd:Q1339 wdt:P937 ?workloc. # Bach work location ?workloc
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This will list all Objects (as variable ?workloc
) with Subject Johann Sebastian Bach (Q1339) and Predicate work location (P937).
Triples with two variables
[edit | edit source]An example of a triple with 2 variables and only a fixed value for Subject would list all raw information available in Wikidata about Bach
SELECT ?predicate ?object
WHERE
{
wd:Q1339 ?predicate ?object. # Bach
}
See further at next section with 3 variables for further usage
An example of a triple with 2 variables and only a fixed value for Predicate would list all subjects (probably airports) with an IATA airport code
SELECT ?subject ?subjectLabel ?object
WHERE
{
?subject wdt:P238 ?object. # IATA airport code
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?object
An usage could be to check for duplicate IATA codes:
SELECT ?object (COUNT(?subject) AS ?count)
(MIN(?subject) AS ?subject1) (MAX(?subject) AS ?subject2)
(GROUP_CONCAT(DISTINCT ?subjectLabel; SEPARATOR=", ") AS ?subjectLabels)
WHERE
{
?subject wdt:P238 ?object. # IATA airport code
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?subject rdfs:label ?subjectLabel.
}
}
GROUP BY ?object
HAVING(COUNT(?subject) > 1)
ORDER BY ?object
An example of a triple with 2 variables and only a fixed value for Object would list all subjects related to Bach
SELECT ?subject ?subjectLabel ?subjectDescription ?predicate ?pLabel
WHERE
{
?subject ?predicate wd:Q1339. # Bach
BIND( IRI(REPLACE( STR(?predicate),"prop/direct/","entity/" )) AS ?p).
# or ?p wikibase:directClaim ?predicate.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?subject
An other possibility of a triple with fixed value for Object would list all subjects with value "ABC", and will show for instance airport Albacete Airport
SELECT ?subject ?subjectLabel ?subjectDescription ?predicate ?pLabel
WHERE
{
?subject ?predicate "ABC".
BIND( IRI(REPLACE( STR(?predicate),"prop/direct/","entity/" )) AS ?p).
# or ?p wikibase:directClaim ?predicate.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?subject
Triples with three variables
[edit | edit source]When you would use triples with all 3 as variables (one for Subject, one for Predicate and one for Object) you basically will list out the whole database. This can be done for small databases, and can be used as well to get a rough idea of the available data, on all available properties.
All raw information available in Wikidata about the children of Bach:
SELECT ?subject ?predicate ?object
WHERE
{
?subject ?predicate ?object.
?subject wdt:P22 wd:Q1339. # subject has father Bach
}
ORDER BY ?subject ?predicate ?object
LIMIT 10000
The same query but grouped by predicate:
SELECT DISTINCT ?subject ?subjectLabel ?predicate
(GROUP_CONCAT(DISTINCT ?object; SEPARATOR=", ") AS ?objects)
WHERE
{
?subject ?predicate ?object.
?subject wdt:P22 wd:Q1339. # subject has father Bach
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?subject ?subjectLabel ?predicate
ORDER BY ?subject ?subjectLabel ?predicate
LIMIT 10000
From the query below you can discover triples about the date the Wikidata page was last updated, the total number of statements, the number of sitelinks etc. These are schema:dateModified
, wikibase:statements
and wikibase:sitelinks
respectively.
SELECT ?subject ?subjectLabel ?datemodified ?statements ?sitelinks
WHERE
{
?subject wdt:P22 wd:Q1339. # subject has father Bach
?subject schema:dateModified ?datemodified.
?subject wikibase:statements ?statements.
?subject wikibase:sitelinks ?sitelinks.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
SELECT
The SELECT
clause consists of 2 or 3 parts.
SELECT ... query result variables ...
WHERE {
... query pattern ...
}
... optional query modifiers ...
GROUP BY ...
HAVING ...
ORDER BY ...
LIMIT ...
The first part is the query result variables. These will be shown when the query is executed. The second part is the WHERE
clause with the query pattern. This defines the data selection and generates the variables, The last part are the optional modifiers.
Example
SELECT ?child ?childLabel
WHERE
{
# ?child father Bach
?child wdt:P22 wd:Q1339.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Query result variables
[edit | edit source]SELECT ?child ?childLabel ...
In the above example ?child
and ?childLabel
are the variables. The variables are separated by spaces. A variable can be displayed with another name (Alias) by using AS
, like (?child AS ?Child_of_Bach)
. Notice that the alias again should be a variable, and the combination should start and end with brackets (?a AS ?b)
.
PS. For aliases of Labels the label should be defined explicitly in the SERVICE
. Alternatively the variable could be named as requested there as well.
SELECT (?child AS ?Child_of_Bach) (?childLabel AS ?Name)
WHERE
{
# ?child father Bach
?child wdt:P22 wd:Q1339.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?child rdfs:label ?childLabel.
}
}
Query pattern
[edit | edit source]The query pattern specifies the data selection and generates the variables,
# ?child father Bach ?child wdt:P22 wd:Q1339.
In this case the triple ?child wdt:p22 wd:Q1339
specifies that the variable ?child
must have the parent/father Bach.
Any of the triple parts Subject, Predicate and Object may be variables. This makes this selection very versatile.
Additional triples can be added, for instance to show gender, birth date and date of death. Each sentence should end with a period. The new variable should be added at the top (query result variables) to display them.
SELECT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P21 ?gender.
?child wdt:P569 ?birth_date.
?child wdt:P570 ?date_of_death.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Optional query modifiers
[edit | edit source]See the chapter modifiers for a full description.
The above query can be sorted by the birth date by adding ORDER BY ?birth_date
after the last curly bracket of the WHERE { }
clause
SELECT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P21 ?gender.
?child wdt:P569 ?birth_date.
?child wdt:P570 ?date_of_death.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?birth_date
Removing duplicates
[edit | edit source]You might have noticed that the above query resulted in 21 records, while Bach had only 20 children. This is caused by 2 entries of Johann Christoph Friedrich Bach, because there are 2 different birth dates, 21 and 23 of June 1732.
Generally it is advised to use DISTINCT
in the result variables like
SELECT DISTINCT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P21 ?gender.
?child wdt:P569 ?birth_date.
?child wdt:P570 ?date_of_death.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?child
DISTINCT
removes duplicate entries, but in this case that does not help as the entries are different. To be helpful ALL parameters should have been equal.
What helps is to group by child and combine the values of ?birth_date
. Also ?date_of_death
and ?genderLabel
are grouped to be consistent
SELECT ?child ?childLabel
(GROUP_CONCAT(DISTINCT ?genderLabel; SEPARATOR=", ") AS ?genderLabels)
(GROUP_CONCAT(DISTINCT ?birth_date; SEPARATOR=", ") AS ?birth_dates)
(GROUP_CONCAT(DISTINCT ?date_of_death; SEPARATOR=", ") AS ?dates_of_death)
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P21 ?gender.
?child wdt:P569 ?birth_date.
?child wdt:P570 ?date_of_death.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?child rdfs:label ?childLabel.
?gender rdfs:label ?genderLabel.
}
}
GROUP BY ?child ?childLabel
ORDER BY ?birth_dates
The optional query modifier used is GROUP BY
, and the variables are combined by using (GROUP_CONCAT(DISTINCT ?var1; SEPARATOR=", ") AS ?var2)
. The ORDER BY
has been modified by using the combined variable ?birth_dates
, instead of ?birth_date
. All labels should be defined explicitly in the SERVICE
.
See the chapter modifiers for a full description.
Another way to remove duplicates it so show only one of the possible values, by using one of the Aggregate functions MIN
, MAX
, SUM
or AVG
.
SELECT ?child ?childLabel
(MIN(?genderLabel) AS ?genderlabel1)
(MIN(?birth_date) AS ?birth_date1)
(MAX(?date_of_death) AS ?date_of_death1)
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P21 ?gender.
?child wdt:P569 ?birth_date.
?child wdt:P570 ?date_of_death.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?child rdfs:label ?childLabel.
?gender rdfs:label ?genderLabel.
}
}
GROUP BY ?child ?childLabel
ORDER BY ?birth_date1
The optional query modifier used is also GROUP BY
, and the variables are combined by using (MIN(?var) AS ?var1)
. The ORDER BY
has been modified by using the first birth date ?birth_date1
, instead of ?birth_date
. All labels should also be defined explicitly in the SERVICE
.
Adding missing entries
[edit | edit source]Let's try one of the above queries for Obama (Q76):
SELECT DISTINCT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
?child wdt:P22 wd:Q76.# ?child has father Obama
?child wdt:P21 ?gender.
?child wdt:P569 ?birth_date.
?child wdt:P570 ?date_of_death.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?child
This results in NO results although Obama has 2 children. The reason is that to match this query, a potential result (a child) must match all the triples we listed: it must have a gender, and a birth data, and a date of death. If one or more of those properties don't exist, it won’t match. And that’s not what we want in this case: we primarily want a list of all the children – if additional data is available, we’d like to include it, but we don’t want that to limit our list of results.
The solution is to tell WDQS that those triples are OPTIONAL
:
SELECT DISTINCT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
?child wdt:P22 wd:Q76.# ?child has father Obama
OPTIONAL{ ?child wdt:P21 ?gender. }
OPTIONAL{ ?child wdt:P569 ?birth_date. }
OPTIONAL{ ?child wdt:P570 ?date_of_death. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?child
Now both children are shown, and it is clear that the date of death is not filled in yet, as these children are still alive.
See the chapter OPTIONAL for a full description.
Fool proof: Add missing and remove duplicate entries
[edit | edit source]A fool proof list of children and their details of any given father would be
SELECT ?child ?childLabel
(GROUP_CONCAT(DISTINCT ?genderLabel; SEPARATOR=", ") AS ?genderLabels)
(GROUP_CONCAT(DISTINCT ?birth_date; SEPARATOR=", ") AS ?birth_dates)
(GROUP_CONCAT(DISTINCT ?date_of_death; SEPARATOR=", ") AS ?dates_of_death)
WHERE
{
?child wdt:P22 wd:Q76.# ?child has father Obama
OPTIONAL{ ?child wdt:P21 ?gender. }
OPTIONAL{ ?child wdt:P569 ?birth_date. }
OPTIONAL{ ?child wdt:P570 ?date_of_death. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?child rdfs:label ?childLabel.
?gender rdfs:label ?genderLabel.
}
}
GROUP BY ?child ?childLabel
ORDER BY ?birth_dates
This combines the OPTIONAL
clause in case a property is not present and the GROUP_CONCAT
clause in case of multiple properties.
OPTIONAL
With a triple ?person wdt:P569 ?birth_date
the birth date of a person can be retrieved. But there might be occasions when a persons birth date is not known. In that case this triple acts as a selection: Only persons with birth dates are selected.
And that’s not what we want in this case: we primarily want to include this person and if additional data is available, we’d like to list that, but we don’t want that to limit our list of results.
The solution is to tell WDQS that those triples are OPTIONAL
:
SELECT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
?child wdt:P22 wd:Q76.# ?child has father Obama
OPTIONAL{ ?child wdt:P21 ?gender. }
OPTIONAL{ ?child wdt:P569 ?birth_date. }
OPTIONAL{ ?child wdt:P570 ?date_of_death. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?child
Without OPTIONAL
none of two the children of Obama would be shown, because the date of death is not filled for any of the children.
In general you should NOT include all optional triplets into one OPTIONAL { }
sentence.
SELECT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
?child wdt:P22 wd:Q76.# ?child has father Obama
OPTIONAL{ ?child wdt:P21 ?gender. # In general you should NOT include all optional triplets into one sentence
?child wdt:P569 ?birth_date.
?child wdt:P570 ?date_of_death.
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?child
The result is that all variables are blank when only one of them is missing in the data.
Cases where this might be useful is when the variables are tightly linked.
FILTER
FILTER(condition)
is a clause you can insert into your SPARQL query to, well, filter the results. Inside the parentheses, you can put any expression of boolean type, and only those results where the expression returns true
are used.
FILTER on values
[edit | edit source]For example, to get a list of all humans born in 2015, we first get all humans with their date of birth –
SELECT ?person ?personLabel ?dob
WHERE
{
?person wdt:P31 wd:Q5;
wdt:P569 ?dob.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
– and then filter that to only return the results where the year of the date of birth is 2015. There are two ways to do that: extract the year of the date with the YEAR
function, and test that it’s 2015 –
FILTER(YEAR(?dob) = 2015).
– or check that the date is between Jan. 1st (inclusive), 2015 and Jan. 1st, 2016 (exclusive):
FILTER("2015-01-01"^^xsd:dateTime <= ?dob && ?dob < "2016-01-01"^^xsd:dateTime).
I’d say that the first one is more straightforward, but it turns out the second one is much faster, so let’s use that:
SELECT ?person ?personLabel ?dob
WHERE
{
?person wdt:P31 wd:Q5;
wdt:P569 ?dob.
FILTER("2015-01-01"^^xsd:dateTime <= ?dob && ?dob < "2016-01-01"^^xsd:dateTime).
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
FILTER on values in Labels
[edit | edit source]Another possible use of FILTER
is related to labels. The label service is very useful if you just want to display the label of a variable. But if you want to do stuff with the label – for example: check if it starts with “Mr. ” – you’ll find that it doesn’t work:
SELECT ?human ?humanLabel
WHERE
{
?human wdt:P31 wd:Q15632617. # fictional human
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
FILTER(STRSTARTS(?humanLabel, "Mr. ")). # This does not work
}
This query finds all instances of Q15632617 and tests if their label starts with "Mr. "
(STRSTARTS
is short for “string starts [with]”; there’s also STRENDS
and CONTAINS
). The reason why this doesn’t work is that the label service adds its variables very late during query evaluation; at the point where we try to filter on ?humanLabel
, the label service hasn’t created that variable yet.
Fortunately, the label service isn’t the only way to get an item’s label. Labels are also stored as regular triples, using the predicate rdfs:label
. Of course, this means all labels, not just English ones; if we only want English labels, we’ll have to filter on the language of the label:
FILTER(LANG(?label) = "en").
The LANG
function returns the language of a monolingual string, and here we only select those labels that are in English. The full query is:
SELECT ?human ?humanLabel
WHERE
{
?human wdt:P31 wd:Q15632617; # fictional human
rdfs:label ?humanLabel.
FILTER(LANG(?humanLabel) = "en").
FILTER(STRSTARTS(?humanLabel, "Mr. ")).
}
We get the label with the ?human rdfs:label ?label
triple, restrict it to English labels, and then check if it starts with “Mr. ”.
FILTER NOT EXISTS
[edit | edit source]Let's add images to these fictional humans
SELECT ?human ?humanLabel ?image
WHERE
{
?human wdt:P31 wd:Q15632617; # fictional human
rdfs:label ?humanLabel.
FILTER(LANG(?humanLabel) = "en").
FILTER(STRSTARTS(?humanLabel, "Mr. ")).
OPTIONAL{ ?human wdt:P18 ?image. }
}
The code OPTIONAL{ ?human wdt:P18 ?image. }
shows all the fictional humans that start with “Mr. ” and shows an image if available.
To select only fictional humans starting with “Mr. ” with an image the code ?human wdt:P18 ?image.
can be used.
To select those fictional humans without an image the code FILTER NOT EXISTS{ ?human wdt:P18 ?image. }
should be used
SELECT ?human ?humanLabel ?image
WHERE
{
?human wdt:P31 wd:Q15632617; # fictional human
rdfs:label ?humanLabel.
FILTER(LANG(?humanLabel) = "en").
FILTER(STRSTARTS(?humanLabel, "Mr. ")).
FILTER NOT EXISTS{ ?human wdt:P18 ?image. } # without images
}
MINUS
[edit | edit source]A similar way of negation provided in SPARQL is MINUS
which evaluates both its arguments, then calculates solutions in the left-hand side that are not compatible with the solutions on the right-hand side.
SELECT ?human ?humanLabel ?image
WHERE
{
?human wdt:P31 wd:Q15632617; # fictional human
rdfs:label ?humanLabel.
FILTER(LANG(?humanLabel) = "en").
FILTER(STRSTARTS(?humanLabel, "Mr. ")).
MINUS{ ?human wdt:P18 ?image. } # without images
}
References
[edit | edit source]
UNION
Suppose the set A: 28 capitals in European Union SELECT ?city ?cityLabel
WHERE {
wd:Q458 wdt:P150 ?country. # European Union contains administrative territorial entity
?country wdt:P36 ?city. # capital
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
|
Suppose the set B: 3 cities with millions of inhabitants in European Union. SELECT ?city ?cityLabel
WHERE {
wd:Q458 wdt:P150 ?country. # European Union contains administrative territorial entity
?city wdt:P17 ?country. # city in a (European) country
?city wdt:P31 wd:Q1637706. # city with millions of inhabitants
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
|
The intersection of A and B, big capitals of the European Union, can simply be achieved by combining all the triples. This results in 1 city.
SELECT ?city ?cityLabel
WHERE {
wd:Q458 wdt:P150 ?country. # European Union contains administrative territorial entity
?country wdt:P36 ?city. # capital
?city wdt:P17 ?country. # city in a (European) country
?city wdt:P31 wd:Q1637706. # city with millions of inhabitants
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
All capitals of the European union, excluding big cities can be achieved by filtering using FILTER NOT EXISTS { }
. This results in the other 27 capitals.
SELECT ?city ?cityLabel
WHERE {
wd:Q458 wdt:P150 ?country. # European Union contains administrative territorial entity
?country wdt:P36 ?city. # capital
?city wdt:P17 ?country. # city in a (European) country
FILTER NOT EXISTS{ ?city wdt:P31 wd:Q1637706. } # NOT a city with millions of inhabitants
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Finally a UNION
of capitals and big cities result in 30 cities, one of which was deduplicated by DISTINCT
.
SELECT DISTINCT ?city ?cityLabel
WHERE {
wd:Q458 wdt:P150 ?country. # European Union contains administrative territorial entity
{ ?country wdt:P36 ?city. } # capital
UNION
{ ?city wdt:P17 ?country. # city in a (European) country
?city wdt:P31 wd:Q1637706. # a city with millions of inhabitants
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Mind that the 2 parts of the union should both be within brackets { ... } UNION { ... }
.
Two simpler examples of UNION
are
{?a wdt:P27 wd:Q55. } UNION { ?a wdt:P27 wd:Q29999. }
country of citizenship (P27) for Netherlands (Q55) or Kingdom of the Netherlands (Q29999).{?child wdt:P22 ?parent. } UNION {?child wdt:P25 ?parent. }
father (P22) or mother (P25).
- The last code can be simplified by using property path
?child wdt:P22|wdt:P25 ?parent.
An overview of all kind of joins:
Venn diagram | Mathematical | SPARQL | |
---|---|---|---|
And | A. B. | ||
A. FILTER NOT EXISTS{ B. } | |||
A. OPTIONAL{ B. } | |||
Or | { A. } UNION { B. } |
SERVICE - Label
You can fetch the label, alias, or description of entities you query, with language fallback, using the specialized service with the URI <http://wikiba.se/ontology#label>. The service is very helpful when you want to retrieve labels, as it reduces the complexity of SPARQL queries that you would otherwise need to achieve the same effect.
The service can be used in one of the two modes: manual and automatic.
Automatic Label SERVICE
[edit | edit source]In automatic mode, you only need to specify the service template, e.g.:
PREFIX wikibase: <http://wikiba.se/ontology#>
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
and WDQS will automatically generate labels as follows:
- If an unbound variable in
SELECT
is named?NAMELabel
, then WDQS produces the label (rdfs:label
) for the entity in variable?NAME
. - If an unbound variable in
SELECT
is named?NAMEAltLabel
, then WDQS produces the alias (skos:altLabel
) for the entity in variable?NAME
. - If an unbound variable in
SELECT
is named?NAMEDescription
, then WDQS produces the description (schema:description
) for the entity in variable?NAME
.
In each case, the variable in ?NAME
should be bound, otherwise the service fails.
Example, showing the list of EU country names and capitals in french. For demonstation also Description and AltLabel are shown
SELECT ?country ?countryLabel ?capitalLabel ?capitalDescription ?capitalAltLabel
WHERE {
wd:Q458 wdt:P150 ?country. # European Union contains administrative territorial entity
OPTIONAL{ ?country wdt:P36 ?capital. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "fr". }
}
In this example WDQS automatically creates the labels ?countryLabel
, ?capitalLabel
, ?capitalDescription
and ?capitalAltLabel
.
Manual Label SERVICE
[edit | edit source]In the manual mode, you explicitly bind the label variables within the service call, but WDQS will still provide language resolution and fallback.
Manual Label service is mandatory for using labels in some SPARQL functions like
GROUP_CONCAT
- Aggregate functions
MIN
,MAX
,SUM
orAVG
Example:
SELECT *
WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "fr,de,en".
wd:Q123 rdfs:label ?q123Label.
wd:Q123 skos:altLabel ?q123Alt.
wd:Q123 schema:description ?q123Desc.
}
}
This will consider labels and descriptions in French, German and English, and if none are available, will use the Q-id as the label.
Manual labels can also be used to list labels in more languages, for instance European countries in English, German and French
SELECT ?country ?country_EN ?country_DE ?country_FR
WHERE {
wd:Q458 wdt:P150 ?country. # European Union contains administrative territorial entity
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?country rdfs:label ?country_EN.
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "de".
?country rdfs:label ?country_DE.
} hint:Prior hint:runLast false.
SERVICE wikibase:label { bd:serviceParam wikibase:language "fr".
?country rdfs:label ?country_FR.
} hint:Prior hint:runLast false.
}
PS: hint:Prior hint:runLast false.
is added to prevent error: "there can be only one "run last" join in any group". [1]
It is also possible to write the above query by using rdfs:label
directly without the wikibase:label
SERVICE
:
SELECT ?country ?country_EN ?country_DE ?country_FR
WHERE {
wd:Q458 wdt:P150 ?country. # European Union contains administrative territorial entity
OPTIONAL {?country rdfs:label ?country_EN FILTER (LANG(?country_EN) = "en")}.
OPTIONAL {?country rdfs:label ?country_DE FILTER (LANG(?country_DE) = "de")}.
OPTIONAL {?country rdfs:label ?country_FR FILTER (LANG(?country_FR) = "fr")}.
}
Languages
[edit | edit source]You can specify a list of languages to be used as fallback in case a label does not exist in a language
You specify your preferred language(s) for the label with one or more of bd:serviceParam wikibase:language "language-code"
triples. Each string can contain one or more language codes, separated by commas. WDQS considers languages in the order in which you specify them. If no label is available in any of the specified languages, the Q-id of the entity (without any prefix) is its label.
SERVICE wikibase:label { bd:serviceParam wikibase:language "fr,de,en" }
The Wikidata Query Service website auto-magically replaces [AUTO_LANGUAGE]
with the language code of current user's interface. For example, if the user's UI is in French, the SPARQL's code bd:serviceParam wikibase:language "[AUTO_LANGUAGE],de,en"
will be converted to bd:serviceParam wikibase:language "fr,de,en"
before being sent to the query service.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],de,en" }
References
[edit | edit source]mw:Wikidata query service/User Manual
SERVICE - around and box
The service allows to search for items with coordinates located within a certain radius of the center or within a certain bounding box.
Most of the examples below use a map view by using #defaultView:Map
at the top of the query. You may switch to table display to see the underlying data.
The property most commonly used for locations is coordinate location (P625).
Search around point
[edit | edit source]Example Airports within 100km from Berlin:
# Airports within 100km from Berlin
#defaultView:Map
SELECT ?place ?placeLabel ?location ?dist
WHERE {
# Berlin coordinates
wd:Q64 wdt:P625 ?berlinLoc .
SERVICE wikibase:around {
?place wdt:P625 ?location .
bd:serviceParam wikibase:center ?berlinLoc .
bd:serviceParam wikibase:radius "100" .
bd:serviceParam wikibase:distance ?dist.
}
FILTER EXISTS {
# Is an airport
?place wdt:P31/wdt:P279* wd:Q1248784 .
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ASC(?dist)
The first line of the around
service call must have format ?item
predicate
?location
, where the result of the search will bind ?item
to items within the specified location and ?location
to their coordinates. The parameters supported are:
Predicate | Meaning |
---|---|
wikibase:center | The point around which search is performed. Must be bound for search to work. |
wikibase:radius | Distance from the center. Currently the distance is always in kilometers, other units are not supported yet. |
wikibase:globe | The globe which is being searched. Optional, default it's Earth (wd:Q2). |
wikibase:distance | The variable receiving distance information |
Search within box
[edit | edit source]Example of box search Schools between San Jose, CA and Sacramento, CA:
# Schools between San Jose, CA and Sacramento, CA
#defaultView:Map
SELECT ?place ?placeLabel ?location
WHERE {
wd:Q16553 wdt:P625 ?SJloc.
wd:Q18013 wdt:P625 ?SCloc.
SERVICE wikibase:box {
?place wdt:P625 ?location.
bd:serviceParam wikibase:cornerSouthWest ?SJloc.
bd:serviceParam wikibase:cornerNorthEast ?SCloc.
}
?place wdt:P31/wdt:P279* wd:Q3914.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Coordinates may be specified directly:
# Schools between San Jose, CA and Sacramento, CA
#same as previous
#defaultView:Map
SELECT ?place ?placeLabel ?location
WHERE {
SERVICE wikibase:box {
?place wdt:P625 ?location.
bd:serviceParam wikibase:cornerWest "Point(-121.872777777 37.304166666)"^^geo:wktLiteral.
bd:serviceParam wikibase:cornerEast "Point(-121.486111111 38.575277777)"^^geo:wktLiteral.
}
?place wdt:P31/wdt:P279* wd:Q3914.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
The first line of the box
service call must have format ?item
predicate
?location
, where the result of the search will bind ?item
to items within the specified location and ?location
to their coordinates. The parameters supported are:
Predicate | Meaning |
---|---|
wikibase:cornerSouthWest | The south-west corner of the box. |
wikibase:cornerNorthEast | The north-east corner of the box. |
wikibase:cornerWest | The western corner of the box. |
wikibase:cornerEast | The eastern corner of the box. |
wikibase:globe | The globe which is being searched. Optional, default it's Earth (wd:Q2). |
wikibase:cornerSouthWest
and wikibase:cornerNorthEast
should be used together, as well as wikibase:cornerWest
and wikibase:cornerEast
, and can not be mixed. If wikibase:cornerWest
and wikibase:cornerEast
predicates are used, then the points are assumed to be the coordinates of the diagonal of the box, and the corners are derived accordingly.
Distance function
[edit | edit source]The function geof:distance
returns distance between two points, in kilometers. Example usage:
# Airports within 100km from Berlin
SELECT ?place ?placeLabel ?location ?dist
WHERE {
# Berlin coordinates
wd:Q64 wdt:P625 ?berlinLoc.
SERVICE wikibase:around {
?place wdt:P625 ?location.
bd:serviceParam wikibase:center ?berlinLoc.
bd:serviceParam wikibase:radius "100".
}
# Is an airport
?place wdt:P31/wdt:P279* wd:Q1248784.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
BIND(geof:distance(?berlinLoc, ?location) as ?dist)
}
ORDER BY ?dist
# Places around 0°,0°
SELECT ?place ?placeLabel ?location ?dist
WHERE {
SERVICE wikibase:around {
?place wdt:P625 ?location.
bd:serviceParam wikibase:center "Point(0 0)"^^geo:wktLiteral.
bd:serviceParam wikibase:radius "250".
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
BIND(geof:distance("Point(0 0)"^^geo:wktLiteral, ?location) as ?dist)
}
ORDER BY ?dist
References
[edit | edit source]
SERVICE - mwapi
Mediawiki API Service allows to call out to Mediawiki API from SPARQL, and receive the results from inside the SPARQL query.
The query is initiated by SERVICE with URL wikibase:mwapi. Currently supported Mediawiki endpoints are: *.wikipedia.org, commons.wikimedia.org, www.mediawiki.org, www.wikidata.org, test.wikidata.org
.
Currently the following services are supported:
Service | Documentation | Inputs | Outputs | Description |
---|---|---|---|---|
Generator | see here | generator, prop, pprop | title, item, pageid, lastrevid, timestamp | Call any generator API. Use "generator" parameter to specify, and specific generator parameters to further amend the search (see the example below). |
Categories | see here | titles,cllimit | category, title | Get a list of categories on the page. |
Search | see here | srsearch,srwhat,srlimit | title | Full-text search in wiki. |
EntitySearch | see here | search,language,type,limit | item,label | Wikibase entity search, by title. |
Required parameters are in bold. Please refer to the service documentation (linked in Documentation column) for the meaning of input parameters.
Please see full description on Mediawiki API Service documentation page.
Example finding members of wikipedia category
[edit | edit source]SELECT * WHERE {
wd:Q6501349 wdt:P910 ?category . # Parking lot - Main category
?link schema:about ?category; schema:isPartOf <https://en.wikipedia.org/>; schema:name ?title .
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:api "Generator" .
bd:serviceParam wikibase:endpoint "en.wikipedia.org" .
bd:serviceParam mwapi:gcmtitle ?title .
bd:serviceParam mwapi:generator "categorymembers" .
bd:serviceParam mwapi:gcmprop "ids|title|type" .
bd:serviceParam mwapi:gcmlimit "max" .
bd:serviceParam wikibase:limit 50 .
# out
?subcat wikibase:apiOutput mwapi:title .
?ns wikibase:apiOutput "@ns" .
?item wikibase:apiOutputItem mwapi:item .
}
}
Depicts statements with Dutch labels, of files in one Commons category
[edit | edit source]SELECT ?file ?title ?depicts ?depicts_label
WITH
{ SELECT ?file ?title
WHERE
{ SERVICE wikibase:mwapi
{
bd:serviceParam wikibase:api "Generator" .
bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
bd:serviceParam mwapi:gcmtitle "Category:Historia Naturalis van Rudolf II" .
bd:serviceParam mwapi:generator "categorymembers" .
bd:serviceParam mwapi:gcmtype "file" .
bd:serviceParam mwapi:gcmlimit "max" .
?title wikibase:apiOutput mwapi:title .
?pageid wikibase:apiOutput "@pageid" .
}
BIND (URI(CONCAT('https://commons.wikimedia.org/entity/M', ?pageid)) AS ?file)
}
} AS %get_files
WHERE
{ INCLUDE %get_files
?file wdt:P180 ?depicts .
service <https://query.wikidata.org/sparql> {
OPTIONAL {?depicts rdfs:label ?depicts_label FILTER (lang(?depicts_label) = 'nl') }
}
}
References
[edit | edit source]
Modifiers
There are 4 optional modifiers to the SELECT ... WHERE { .. }
. See the chapter on SELECT for an introduction.
The five modifiers are GROUP BY ...
, HAVING ...
, ORDER BY ...
, LIMIT ...
and OFFSET ...
.
GROUP BY
[edit | edit source]Let us consider this list of children of Bach, listing also their mother
SELECT ?mother ?motherLabel ?child ?childLabel
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P25 ?mother.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Let us group this list by the mothers.
SELECT ?mother ?motherLabel (COUNT(?child) AS ?children)
(GROUP_CONCAT(DISTINCT ?childLabel; SEPARATOR=", ") AS ?names)
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P25 ?mother.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?mother rdfs:label ?motherLabel.
?child rdfs:label ?childLabel.
}
}
GROUP BY ?mother ?motherLabel
We GROUP BY
both variables ?mother
and ?motherLabel
because if we would leave out the label we would end up with an error "Bad Aggregate".
First we added (COUNT(?child) AS ?children)
to count the number of children. COUNT
is one of the Aggregate functions MIN
, MAX
, SUM
, AVG
, COUNT
or SAMPLE
.
Notice first that it counts the variable ?child
. Notice as well that the syntax is (COUNT(?var1) AS ?var2)
.
As second information we added a combined list of children per mother by using (GROUP_CONCAT(DISTINCT ?var1; SEPARATOR=", ") AS ?var2)
.
When using GROUP_CONCAT
and a label, all labels should be defined explicitly in the SERVICE.
HAVING
[edit | edit source]HAVING
is always used in combination with GROUP BY
SELECT ?mother ?motherLabel (COUNT(?child) AS ?children)
(GROUP_CONCAT(DISTINCT ?childLabel; SEPARATOR=", ") AS ?names)
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P25 ?mother.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?mother rdfs:label ?motherLabel.
?child rdfs:label ?childLabel.
}
}
GROUP BY ?mother ?motherLabel
HAVING (COUNT(?child)>7)
HAVING
will filter out groups that does not meet the condition specified. In this case only one mother is shown, with 13 children.
As COUNT(?child)
is bound to the variable ?children
the HAVING
clause can also be written as HAVING (?children>7)
.
The HAVING
clause can be useful for finding duplicates for instance like HAVING (COUNT(?var)>1)
.
ORDER BY
[edit | edit source]ORDER BY something
sorts the results by something
. something
can be any expression or variables. This expression can also be wrapped in either ASC()
or DESC()
to specify the sorting order (ascending or descending). (If you don’t specify either, the default is ascending sort, so ASC(something)
is equivalent to just something
.)
SELECT ?mother ?motherLabel ?child ?childLabel
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P25 ?mother.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?motherLabel) ?childLabel
LIMIT
[edit | edit source]LIMIT count
cuts off the result list at count
results, where count
is any natural number. For example, LIMIT 10
limits the query to ten results. LIMIT 1
only returns a single result.
This could be useful to get the top 10 results, or only 10 random results to see how the data looks like.
OFFSET
[edit | edit source]OFFSET count
can be used to skip the first number of results. OFFSET 100
LIMIT 10
returns records 101-110.
References
[edit | edit source]
Aggregate functions
Aggregate functions are used in combination with modifier GROUP BY
.
Available aggregate functions are:
COUNT
: the number of elements. You can also writeCOUNT(*)
to simply count all results.SUM
,AVG
: the sum or average of all elements, respectively. If the elements aren't numbers, you'll get weird results.MIN
,MAX
: the minimum or maximum value of all elements, respectively. This works for all value types; numbers are sorted numerically, strings and other types lexically.SAMPLE
: any element. This is occasionally useful if you know there's only one result, or if you don't care which one is returned.GROUP_CONCAT
: concatenates all elements.
Aggregate functions with GROUP BY
[edit | edit source]See modifier GROUP BY
for examples.
# The syntax is like this
SELECT ?group1 ?group2
(COUNT(?var1) AS ?var2)
(GROUP_CONCAT(DISTINCT ?var3; SEPARATOR=", ") AS ?var4)
WHERE
{
...
}
GROUP BY ?group1 ?group2
When using aggregate functions of a label, all labels should be defined explicitly in the SERVICE.
Additionally, you can add a DISTINCT
modifier for any of these functions to eliminate duplicate results. For example, if there are two results but they both have the same value in ?var
, then COUNT(?var)
will return 2
but COUNT(DISTINCT ?var)
will only return 1
. You often have to use DISTINCT
when your query can return the same item multiple times – this can happen if, for example, you use ?item wdt:P31/wdt:P279* ?class
, and there are multiple paths from ?item
to ?class
: you will get a new result for each of those paths, even though all the values in the result are identical. (If you're not grouping, you can also eliminate those duplicate results by starting the query with SELECT DISTINCT
instead of just SELECT
.)
Aggregate functions with HAVING
[edit | edit source]Aggregate functions can also be used with the HAVING
modifier.
See for instance this query to check for duplicate IATA codes for airports:
SELECT ?object (COUNT(?subject) AS ?count)
(MIN(?subject) AS ?subject1) (MAX(?subject) AS ?subject2)
(GROUP_CONCAT(DISTINCT ?subjectLabel; SEPARATOR=", ") AS ?subjectLabels)
WHERE
{
?subject wdt:P238 ?object. # IATA airport code
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?subject rdfs:label ?subjectLabel.
}
}
GROUP BY ?object
HAVING(COUNT(?subject) > 1)
ORDER BY ?object
References
[edit | edit source]
Bad aggregate
The error Bad aggregate might appear when coding mistakes upon using GROUP BY
. See also the chapter modifiers for a description of GROUP BY
.
The error "Bad aggregate" does not specify what exactly is the mistake, hence this explanation to help you find out the cause.
This is a correct example. This list the children of Bach, grouped by their mother:
SELECT ?mother ?motherLabel (COUNT(?child) AS ?children)
(GROUP_CONCAT(DISTINCT ?childLabel; SEPARATOR=", ") AS ?names)
WHERE
{
?child wdt:P22 wd:Q1339.# ?child has father Bach
?child wdt:P25 ?mother.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?mother rdfs:label ?motherLabel.
?child rdfs:label ?childLabel.
}
}
GROUP BY ?mother ?motherLabel
The first condition is that the grouping variables should match.
- In the
SELECT
the grouping is by both variables?mother
and?motherLabel
. - In the
GROUP BY
the same variables?mother
?motherLabel
should be listed.
Try for yourself what happens when you use GROUP BY ?mother
.
At the bottom of the long list of the errors you might find the text "Non-aggregate variable in select expression: motherLabel".
If the GROUP BY
is left out completely the error message is similar "Non-aggregate variable in select expression: mother".
The second condition is that the rest of the variables in the SELECT
should each result into one value.
- You need expressions that will result into one value, for instance
COUNT
or one of the other Aggregate functionsMIN
,MAX
,SUM
orAVG
. - You can also use
GROUP_CONCAT
to list all values into one variable, as all values of?childLabel
into?names
. - All variables should be converted with expressions as described above, even if a variable has one value only.
Try for yourself what happens when you change in the rest of the SELECT
code.
If for instance you would use ?childLabel
you might find at the bottom of the long list of the errors the text "Non-aggregate variable in select expression: childLabel".
The error message is always "Non-aggregate variable in select expression: xxxxx", whether a variable is missing from the GROUP BY
or a variable not aggregated at the SELECT
clause.
Property paths
Property paths
[edit | edit source]Statements in a triplestore have a particular Property in the triples. In SPARQL queries you can also write down property paths in the triples.
Property paths are a shorthand to write down a path of properties between two items. The simplest path is just a single property, which forms an ordinary triple:
?item wdt:P31 ?class.
You can add path elements with a forward slash (/
).
?item wdt:P31/wdt:P279/wdt:P279 ?class.
This is equivalent to either of the following:
?item wdt:P31 ?temp1.
?temp1 wdt:P279 ?temp2.
?temp2 wdt:P279 ?class.
?item wdt:P31 [ wdt:P279 [ wdt:P279 ?class ] ].
Exercise: (re)write the “grandchildren of Bach” query to use this syntax.
An asterisk (*
) after a path element means “zero or more of this element”.
?item wdt:P31/wdt:P279* ?class.
# means:
?item wdt:P31 ?class
# or
?item wdt:P31/wdt:P279 ?class
# or
?item wdt:P31/wdt:P279/wdt:P279 ?class
# or
?item wdt:P31/wdt:P279/wdt:P279/wdt:P279 ?class
# or ...
If there are no other elements in the path, ?a something* ?b
means that ?b
might also just be ?a
directly, with no path elements between them at all.
A plus (+
) is similar to an asterisk, but means “one or more of this element”. The following query finds all descendants of Bach:
SELECT ?descendant ?descendantLabel
WHERE
{
wd:Q1339 wdt:P40+ ?descendant.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
If we used an asterisk instead of a plus here, the query results would include Bach himself.
A question mark (?
) is similar to an asterisk or a plus, but means “zero or one of this element”.
You can separate path elements with a vertical bar (|
) instead of a forward slash; this means “either-or”: the path might use either of those properties. (But not both – an either-or path segment always matches a path of length one.)
You can also group path elements with parentheses (()
), and freely combine all these syntax elements (/|*+?
). This means that another way to find all descendants of Bach is:
SELECT ?descendant ?descendantLabel
WHERE
{
?descendant (wdt:P22|wdt:P25)+ wd:Q1339.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Instead of using the “child” property to go from Bach to his descendants, we use the “father” and “mother” properties to go from the descendants to Bach. The path might include two mothers and one father, or four fathers, or father-mother-mother-father, or any other combination. (Though, of course, Bach can’t be the mother of someone, so the last element will always be father.)
Summary of the codes after a path element:
Code | Meaning |
---|---|
? (Question mark) |
zero or one of this element |
* (Asterisk) |
zero or more of this element |
+ (Plus) |
one or more of this element |
Inverse link
[edit | edit source]Instead of the normal Triple "subject, predicate, object" it is also possible to write it as inverse link "object, predicate, subject". This can be done by adding ^
in front of the predicate.
For normal triples this is not very useful, but for property paths it avoids using dummy variables.
For example this query finds the siblings of Johan Sebastian Bach, by querying siblings with the same father.
SELECT ?sibling ?siblingLabel
WHERE
{
# Bach father/has father sibling
wd:Q1339 wdt:P22/^wdt:P22 ?sibling. # ^ = Inverse link
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
With dummy variable this can be written as
SELECT ?sibling ?siblingLabel
WHERE
{
# Bach father/has father sibling
wd:Q1339 wdt:P22 ?dummy.
?dummy ^wdt:P22 ?sibling. # ^ = Inverse link
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Or without inverse link:
SELECT ?sibling ?siblingLabel
WHERE
{
# Bach father/has father sibling
wd:Q1339 wdt:P22 ?dummy.
?sibling wdt:P22 ?dummy.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Code | Meaning |
---|---|
^ (Circumflex) |
Inverse link |
Instances and classes
[edit | edit source]Most Wikidata properties are “has” relations: has child, has father, has occupation. But sometimes (in fact, frequently), you also need to talk about what something is. But there are in fact two kinds of relations there:
- Gone with the Wind is a film.
- A film is a work of art.
Gone with the Wind is one particular film. It has a particular director (Victor Fleming), a specific duration (238 minutes), a list of cast members (Clark Gable, Vivien Leigh, …), and so on.
Film is a general concept. Films can have directors, durations, and cast members, but the concept “film” as such does not have any particular director, duration, or cast members. And although a film is a work of art, and a work of art usually has a creator, the concept of “film” itself does not have a creator – only particular instances of this concept do.
This difference is why there are two properties for “is” in Wikidata: P31
and P279
. Gone with the Wind is a particular instance of the class “film”; the class “film” is a subclass (more specific class; specialization) of the more general class “work of art”.
So what does this mean for us when we’re writing SPARQL queries? When we want to search for “all works of art”, it’s not enough search for all items that are directly instances of “work of art”:
SELECT ?work ?workLabel
WHERE
{
?work wdt:P31 wd:Q838948. # instance of work of art
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
As I’m writing this, that query only returns 2815 results – obviously, there are more works of art than that! The problem is that this misses items like Gone with the Wind, which is only an instance of “film”, not of “work of art”. “film” is a subclass of “work of art”, but we need to tell SPARQL to take that into account when searching.
One possible solution to this is the []
syntax we talked about: Gone with the Wind is an instance of some subclass of “work of art”. (For exercise, try writing that query!) But that still has problems:
- We’re no longer including items that are directly instances of work of art.
- We’re still missing items that are instances of some subclass of some other subclass of “work of art” – for example, Snow White and the Seven Dwarfs is an animated film, which is a film, which is a work of art. In this case, we need to follow two “subclass of” statements – but it might also be three, four, five, any number really.
The solution: ?item wdt:P31/wdt:P279* ?class
. This means that there’s one “instance of” and then any number of “subclass of” statements between the item and the class.
SELECT ?work ?workLabel
WHERE
{
?work wdt:P31/wdt:P279* wd:Q838948. # instance of any subclass of work of art
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 1000
I don’t recommend running that query for all works of art. WDQS can handle it (just barely), but your browser might crash when trying to display the results because there’s so many of them. For that reason a LIMIT 1000
is inserted.
Now you know how to search for all works of art, or all buildings, or all human settlements: the magic incantation wdt:P31/wdt:P279*
, along with the appropriate class. This uses some more SPARQL features that I haven’t explained yet, but quite honestly, this is almost the only relevant use of those features, so you don’t need to understand how it works in order to use WDQS effectively.
References
[edit | edit source]
Variables
A query variable is marked by the use of either "?" or "$"; the "?" or "$" is not part of the variable name. In a query, $abc and ?abc identify the same variable.
In this book only variables marked with "?" are used.
Allowable characters are [a-z], [A-Z], [0-9], _, · and also diacrictics like áéíóú etcetera.
Example
SELECT $child_of_Bach $child_of_BachLabel
WHERE
{
$child_of_Bach wdt:P22 wd:Q1339.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Expressions and Functions
Expressions
[edit | edit source]BIND
[edit | edit source]The BIND( expression AS ?variable ).
clause can be used to assign the result of an expression to a variable (usually a new variable, but you can also overwrite existing ones).
# Persons died by capital punishment
SELECT ?person ?personLabel ?personDescription ?age
WHERE
{
?person wdt:P31 wd:Q5;
wdt:P569 ?born;
wdt:P570 ?died;
wdt:P1196 wd:Q8454.
BIND(?died - ?born AS ?ageInDays).
BIND(?ageInDays/365.2425 AS ?ageInYears).
BIND(FLOOR(?ageInYears) AS ?age).
# or, as one expression:
#BIND(FLOOR((?died - ?born)/365.2425) AS ?age).
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
BIND
can also be used to simply bind constant values to variables in order to increase readability. For example, a query that finds all female priests:
SELECT ?woman ?womanLabel
WHERE
{
?woman ?instanceOf ?human;
?sexOrGender ?female;
?occupation ?priest.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
# linking to wikidata Q and P codes
BIND(wdt:P31 AS ?instanceOf).
BIND(wd:Q5 AS ?human).
BIND(wdt:P21 AS ?sexOrGender).
BIND(wd:Q6581072 AS ?female).
BIND(wdt:P106 AS ?occupation).
BIND(wd:Q42603 AS ?priest).
}
The above gives the same result as
SELECT ?woman ?womanLabel
WHERE
{
?woman wdt:P31 wd:Q5; # instance of human
wdt:P21 wd:Q6581072; # sex or gender female
wdt:P106 wd:Q42603. # occupation priest
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
IF
[edit | edit source]The IF( condition, thenExpression, elseExpression )
expressions are used to assign different values based on a condition.
For instance to calculate the age, precisely to the day. In the expression BIND( .... - (IF( condition,1,0 ) AS ?age)
the condition determines to subtract 1 or 0 (nothing) from the ?age
based on the month and the day of the birth and the death.
# Query to find all musicians who have already died
# calculate their age (full years) at death
# count how many of them died at each age
#defaultView:LineChart
SELECT ?age (COUNT (DISTINCT ?a) AS ?count)
WHERE {
?a wdt:P31 wd:Q5. #instance of human
?a wdt:P106/wdt:P279 wd:Q639669. #occupation a subclass of musician
?a p:P569/psv:P569 ?birth_date_node.
?a p:P570/psv:P570 ?death_date_node.
?birth_date_node wikibase:timeValue ?birth_date.
?death_date_node wikibase:timeValue ?death_date.
BIND( YEAR(?death_date) - YEAR(?birth_date) -
IF(MONTH(?death_date)<MONTH(?birth_date) ||
(MONTH(?death_date)=MONTH(?birth_date) && DAY(?death_date)<DAY(?birth_date)),1,0) AS ?age )
# calculate the age, precisely to the day (times and timezones ignored)
FILTER(?age > 10 && ?age < 100). #ignore outlyers, several of which are probably errors
}
GROUP BY ?age
ORDER BY ?age
Here an example to divide population in groups. The ?layer
in the #defaultView:Map
shows it in different colors.
#Big cities, grouped into map layers by population
#defaultView:Map
SELECT DISTINCT ?city ?cityLabel (SAMPLE(?location) AS ?location) (MAX(?population) AS ?population) (SAMPLE(?layer) AS ?layer)
WHERE
{
?city wdt:P31/wdt:P279* wd:Q515;
wdt:P625 ?location;
wdt:P1082 ?population.
FILTER(?population >= 500000).
BIND(
IF(?population < 1000000, "<1M",
IF(?population < 2000000, "1M-2M",
IF(?population < 5000000, "2M-5M",
IF(?population < 10000000, "5M-10M",
IF(?population < 20000000, "10M-20M",
">20M")))))
AS ?layer).
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?city ?cityLabel
ORDER BY ?population
Operators
[edit | edit source]The familiar mathematical operators are available: +
, -
, *
, /
to add, subtract, multiply or divide numbers.
The operators <
, >
, =
, <=
, >=
are available to compare values. Comparison is also defined for other types; for example, "abc" < "abd"
is true (lexical comparison), as is "2016-01-01"^^xsd:dateTime > "2015-12-31"^^xsd:dateTime
and wd:Q4653 != wd:Q283111
. The inequality test ≠ is written !=
. The !
can also be used as a prefix to functions that results a boolean like !BOUND
and !REGEX
.
Boolean conditions can be combined with &&
(logical and: a && b
is true if both a
and b
are true) and ||
(logical or: a || b
is true if either (or both) of a
and b
is true).
Examples IF( ?a != ?b, ... , ... )
and IF( ?a = ?b && ?c = ?d, ... , ... )
.
VALUES
[edit | edit source]The VALUES ?var { val1 ... }
clause generates a variable with one or more values. The values can be of any type, like numeric, strings, or even items, like VALUES ?number { 1 2 3 }
, VALUES ?abc { "a" "b" "c" }
or VALUES ?city { wd:Q84 wd:Q90 }.
SELECT ?city ?cityLabel WHERE {
VALUES ?city { wd:Q84 wd:Q90 }. # London, Paris
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Values can have more dimensions too. The dimensions are grouped by ( ) like VALUES ( ?varA ?varB ... ) { ( valA1 valB1 ... ) ... }
.
SELECT ?country ?capital WHERE {
VALUES ( ?country ?capital ) {
( "United Kingdom" "London" )
( "France" "Paris" )
( "Etc" UNDEF ) # Use UNDEF for an undefined value
}
}
Functions
[edit | edit source]General functions
[edit | edit source]DATATYPE
[edit | edit source]Each value in SPARQL has a type, which tells you what kind of value it is and what you can do with it. The most important types are:
- item, like
wd:Q42
for Douglas Adams (Q42). - boolean, with the two possible values
true
andfalse
. Boolean values aren’t stored in statements, but many expressions return a boolean value, e. g.2 < 3
(true
) or"a" = "b"
(false
). - string, a piece of text. String literals are written in double quotes.
- monolingual text, a string with a language tag attached. In a literal, you can add the language tag after the string with an
@
sign, e. g."Douglas Adams"@en
. - numbers, either integers (
1
) or decimals (1.23
). - dates. Date literals can be written by adding
^^xsd:dateTime
(case sensitive –^^xsd:datetime
won’t work!) to an ISO 8601 date string:"2012-10-29"^^xsd:dateTime
.
# Date related properties of Bach
SELECT ?predicate ?pLabel ?object
WHERE
{
wd:Q1339 ?predicate ?object. # Bach
FILTER(DATATYPE(?object) = xsd:dateTime).
BIND( IRI(REPLACE( STR(?predicate),"prop/direct/","entity/" )) AS ?p).
# or ?p wikibase:directClaim ?predicate.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?object
Some datatypes are rdf:langString, xsd:string, xsd:integer, xsd:dateTime, xsd:decimal or "<http://www.opengis.net/ont/geosparql#wktLiteral>".
STR
[edit | edit source]The STR( value )
function converts values to a string. It also resolves prefixes, for instance wd:Q1339 will be converted into http://www.wikidata.org/entity/Q1339. The reversal of this is the IRI( string )
, which will convert a string to an IRI.
SELECT ?item ?itemLabel ?string ?stringLabel ?iri ?iriLabel
WHERE {
VALUES ?item { wd:Q1339 }.
BIND( STR(?item) AS ?string ).
BIND( IRI(?string) AS ?iri ).
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
IRI
[edit | edit source]the IRI( string )
, which will convert a string to an IRI, by applying the prefixes.
In the example below a Predicate like wdt:P569
get converted to a string http://www.wikidata.org/prop/direct/P569. To get a Label for the predicate it first needs to be replaced to get an entity (http://www.wikidata.org/entity/P569), which with IRI
gets converted to wd:P569
. The label of this can then be displayed as "date of birth (P569)".
# Date related properties of Bach
SELECT ?object ?predicate ?string ?entity ?p ?pLabel
WHERE
{
wd:Q1339 ?predicate ?object. # Bach
FILTER(DATATYPE(?object) = xsd:dateTime).
BIND( STR(?predicate) AS ?string ).
BIND( REPLACE( ?string,"prop/direct/","entity/" ) AS ?entity ).
BIND( IRI(?entity) AS ?p ).
# or all on one line:
# BIND( IRI(REPLACE( STR(?predicate),"prop/direct/","entity/" )) AS ?p).
# This can also be written as: ?p wikibase:directClaim ?predicate.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?object
LANG
[edit | edit source]The LANG( string )
function retrieves the language tag of Labels, Descriptions, Aliases and of Monolingual texts.
#Countries in European Union with native name and language
SELECT ?country ?countryLabel ?nativename ?language
WHERE
{
wd:Q458 wdt:P150 ?country. # European Union contains administrative territorial entity
OPTIONAL { ?country wdt:P1705 ?nativename.
BIND( LANG(?nativename) AS ?language). }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?language)
BOUND
[edit | edit source]The function BOUND( variable )
returns true if variable is bound to a value. It returns false otherwise. The BOUND
expression is used often with IF(condition,thenExpression,elseExpression)
expressions. For example, suppose you have a query that shows some humans, and instead of just showing their label, you’d like to display their pseudonym (P742) if they have one, and only use the label if a pseudonym doesn’t exist. For this, you select the pseudonym in an OPTIONAL
clause (it has to be optional – you don’t want to throw out results that don’t have a pseudonym), and then use BIND(IF(BOUND(…
to select either the pseudonym or the label.
SELECT ?writer ?label
WHERE
{
?writer wdt:P31 wd:Q5; # French writer
wdt:P27 wd:Q142;
wdt:P106 wd:Q36180;
wdt:P569 ?dob.
FILTER("1751-01-01"^^xsd:dateTime <= ?dob && ?dob < "1801-01-01"^^xsd:dateTime) # born in the second half of the 18th century
?writer rdfs:label ?writerLabel. # get the English label
FILTER(LANG(?writerLabel) = "en")
OPTIONAL { ?writer wdt:P742 ?pseudonym. } # get the pseudonym, if it exists
BIND(IF(BOUND(?pseudonym),?pseudonym,?writerLabel) AS ?label). # bind the pseudonym, or if it doesn’t exist the English label, as ?label
}
IN and NOT IN
[edit | edit source]The function value IN( list of values )
checks if a value is found in a list of values. For example 2 IN( 1, 2, 3 )
is true. A similar function value NOT IN( list of values )
checks if a value is not found in a list of values.
# Railway stations in London or Paris
SELECT ?cityLabel ?station ?stationLabel ?location
WHERE {
?station wdt:P31 wd:Q55488. # is a railway station
?station wdt:P131* ?city.
?station wdt:P625 ?location
FILTER( ?city IN(wd:Q84, wd:Q90) ). # in London or Paris
SERVICE wikibase:label { bd:serviceParam wikibase:language "en,fr". }
}
isBLANK
[edit | edit source]The isBLANK( variable )
expression checks if a value is an "unknown value".
#Demonstrates "unknown value" handling
SELECT ?human ?humanLabel
WHERE
{
?human wdt:P21 ?gender
FILTER isBLANK(?gender) .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
Wikidata value can also have special value "no value". With normal triples this simply results in the absence of a result.
# persons who were stateless (country of citizenship: no value) for some time (start time and end time qualifiers)
SELECT ?person ?personLabel ?start ?end
WHERE {
?person wdt:P31 wd:Q5;
p:P27 [
a wdno:P27; # no value for country of citizenship
pq:P580 ?start;
pq:P582 ?end
].
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?end - ?start)
The prefix wdno:
selects the "no value" node.
The code a
is a special word.
Functions on strings
[edit | edit source]STRLEN
[edit | edit source]The function STRLEN( string )
returns the length of a string. For instance STRLEN(?name)
where ?name
is "ABC" will return 3.
SUBSTR
[edit | edit source]The function SUBSTR( string, beginposition, stringlength )
returns a substring of a string beginning at the position marked by beginposition, and will have a length as indicated. For instance SUBSTR( ?name,2,3 )
where ?name
= "ABCDEFGH" will return "BCD". A substring without length like SUBSTR( ?name,2 )
will return "BCDEFGH".
UCASE
[edit | edit source]The function UCASE( string )
returns the string in Upper case. For instance a string "Abc" will be returned as "ABC".
LCASE
[edit | edit source]The function LCASE( string )
returns the string in Lower case. For instance a string "Abc" will be returned as "abc".
STRSTARTS
[edit | edit source]The function STRSTARTS( string, comparestring )
checks if the string starts with the compare string. For instance STRSTARTS( "ABCDEFGH", "ABC" )
returns true, indicating that the string starts with "ABC".
STRENDS
[edit | edit source]The function STRENDS( string, comparestring )
checks if the string ends with the compare string. For instance STRENDS( "ABCDEFGH", "FGH" )
returns true, indicating that the string ends with "FGH".
CONTAINS
[edit | edit source]The function CONTAINS( string, comparestring )
checks if the string contains the compare string. For instance CONTAINS( "ABCDEFGH", "DEF" )
returns true, indicating that the string contains "DEF".
STRBEFORE
[edit | edit source]The function STRBEFORE( string, comparestring )
returns the part of the string before the compare string. For instance STRBEFORE( "ABCDEFGH", "DEF" )
returns "ABC". If the compare string is not found it returns "".
STRAFTER
[edit | edit source]The function STRAFTER( string, comparestring )
returns the part of the string after the compare string. For instance STRAFTER( "ABCDEFGH", "DEF" )
returns "GH". If the compare string is not found, it returns "".
ENCODE_FOR_URI
[edit | edit source]The function ENCODE_FOR_URI( string )
converts the special characters in the string, to be able to use it in an web URL. For instance ENCODE_FOR_URI( "ABC DËFGH" )
returns "ABC%20D%C3%8BFGH". The reverse of this function is wikibase:decodeUri( string )
.
CONCAT
[edit | edit source]The function CONCAT( string1, string2 ... )
returns the concatenation of 2 or more strings. For instance CONCAT( "ABCDEFGH", "XYZ" )
returns "ABCDEFGHXYZ".
LANGMATCHES
[edit | edit source]The function LANGMATCHES( languagetag, languagerange )
checks if the language tag matches the language range. For example this code FILTER LANGMATCHES( LANG(?label), "fr" ).
filters labels in the french language. It is similar to FILTER (LANG(?label) = "fr").
with the difference that the LANGMATCHES filter will also output language tags with regions, e.g. "fr-BE" while (FILTER (LANG(?label) = "fr").
will only output the strings with the exact tag "fr". A language range of "*" matches any non-empty language-tag string.
REGEX
[edit | edit source]The function REGEX( string, pattern, flag )
checks if the string matches the pattern. It is mostly used in a FILTER
clause like FILTER REGEX( string, pattern ).
.
The pattern may contain different special characters. Blazegraph instances such as the Wikidata Query Service interpret the pattern as a Java Pattern
,[1] which is a subset of ICU regular expressions. The table below lists the most common special characters:
|
|
|
The flag is optional. Flag "i" means the match is case-insensitive.
# Names of human settlements ending in "-ow" or "-itz" in Germany
#defaultView:Map
SELECT DISTINCT ?item ?itemLabel ?coord
WHERE
{
?item wdt:P31/wdt:P279* wd:Q486972; # instance/subclass of human settlement
wdt:P17 wd:Q183; # Germany
rdfs:label ?itemLabel;
wdt:P625 ?coord;
FILTER (LANG(?itemLabel) = "de").
FILTER REGEX (?itemLabel, "(ow|itz)$").
}
LIMIT 1000
Examples
- title might be a rhyme
FILTER(REGEX(?title, "^\\w*(\\w{3})(\\W+\\w*\\1)+$", "i") && !REGEX(?title, "^(\\w+)(\\W+\\1)+$", "i")).
- title is an alliteration
FILTER(REGEX(STR(?title), "^(\\p{L})\\w+(?:\\W+\\1\\w+){2,}$", "i")).
PS: A single \ is used as an escape symbol in strings, so \\ is used to indicate a single \.
REPLACE
[edit | edit source]The function REPLACE( string, pattern, replacement, flag )
returns the string after replacing all occurrences of pattern
in string
with replacement
. pattern
is interpreted the same way as in REGEX
. The replacement
can contain $n
or ${name}
, which are replaced by the corresponding numbered or named capture group in the pattern.[1][2] An optional flag
affects the regular expression pattern, just as with the flag
argument to REGEX()
. For example, REPLACE( "ABCDEFGH", "DEF", "_def_" )
returns "ABC_def_GH". REPLACE( "ABCDEFGH", "[AEIOU]", "" )
removes all the vowels from the original string.
Functions on numbers
[edit | edit source]ABS
[edit | edit source]The function ABS( number )
returns the absolute value of a number. For instance, ABS( -1 )
returns 1.
ROUND
[edit | edit source]The function ROUND( number )
returns the rounded value of a number. For instance, ROUND( 1.4 )
returns 1, while ROUND( 1.6 )
returns 2.
CEIL
[edit | edit source]The function CEIL( number )
returns the largest number (round up). For instance, both CEIL( 1.4 )
and CEIL( 1.6 )
return 2.
FLOOR
[edit | edit source]The function FLOOR( number )
returns the smallest number (round down). For instance, both FLOOR( 1.4 )
and FLOOR( 1.6 )
return 1.
RAND
[edit | edit source]The function RAND( )
returns a random value between 0 and 1. For instance, RAND( )
returns 0.7156405780739334.
COUNT, MIN, MAX, AVG and SUM
[edit | edit source]The functions COUNT
, MIN
, MAX
, AVG
, and SUM
can only be used as Aggregate functions.
# average age of painters by century
SELECT ?century (AVG(?age) AS ?average_age) (ROUND(AVG(?age)) AS ?rounded_average_age)
WHERE
{
?item wdt:P31 wd:Q5. # is a human
?item wdt:P106 wd:Q1028181. # occupation painter
?item wdt:P569 ?born.
?item wdt:P570 ?died.
FILTER( ?died > ?born ).
BIND( (?died - ?born)/365.2425 AS ?age )
BIND( FLOOR(YEAR(?born)/100)*100 AS ?century )
}
GROUP BY ?century
ORDER BY ?century
For instance, BIND( MAX( ?var1, ?var2 ) AS ?max)
does NOT work. Instead, use the expression BIND( IF( ?var1>?var2, ?var1, ?var2 ) AS ?max)
.
Functions on dates and times
[edit | edit source]NOW
[edit | edit source]The function NOW( )
returns the current date and time.
With constructions like NOW() + "P1D"^^xsd:duration
it is possible to add or subtract days from the current date.
NOW() + "P1M"^^xsd:duration
will add 1 month.
NOW() + "P1Y"^^xsd:duration
will add 1 year.
You can add or subtract any combination of years, months, days and even hours, minutes and seconds using "P1Y1M1DT0H0M0.000S"^^xsd:duration
.
YEAR, MONTH and DAY
[edit | edit source]The function YEAR( datevalue )
returns the year of the date value. The functions MONTH
and DAY
return the month or day.
HOURS, MINUTES and SECONDS
[edit | edit source]The function HOURS( datevalue )
returns the hour of the date value. The functions MINUTES
and SECONDS
returns the minutes or seconds.
Currently Wikidata does not hold date values in hour, minutes or seconds.
TIMEZONE and TZ
[edit | edit source]The function TIMEZONE( datevalue )
returns the time zone of the date value. Currently in Wikidata all dates have a TIMEZONE
of "PT0S" for UTC. Other valid values would be between -PT14H and PT14H indicating the time offset in hours.
The function TZ( datevalue )
returns the time zone as a simple literal of the date value. Currently in Wikidata all dates have a TZ
of "Z" for UTC.
Example with dates
# Query to find all musicians who have already died
# calculate their age (full years) at death
# count how many of them died at each age
#defaultView:LineChart
SELECT ?age (COUNT (DISTINCT ?a) AS ?count)
WHERE {
?a wdt:P31 wd:Q5. #instance of human
?a wdt:P106/wdt:P279 wd:Q639669. #occupation a subclass of musician
?a p:P569/psv:P569 ?birth_date_node.
?a p:P570/psv:P570 ?death_date_node.
?birth_date_node wikibase:timeValue ?birth_date.
?death_date_node wikibase:timeValue ?death_date.
BIND( YEAR(?death_date) - YEAR(?birth_date) -
IF(MONTH(?death_date)<MONTH(?birth_date) ||
(MONTH(?death_date)=MONTH(?birth_date) && DAY(?death_date)<DAY(?birth_date)),1,0) AS ?age )
# calculate the age, precisely to the day (times and timezones ignored)
FILTER(?age > 10 && ?age < 100). #ignore outliers, several of which are probably errors
}
GROUP BY ?age
ORDER BY ?age
Functions on coordinates
[edit | edit source]geof:distance
[edit | edit source]The function geof:distance
returns distance between two points, in kilometers.
Example usage:
# distance between 2 cities
SELECT ?city1 ?city1Label ?location1 ?city2 ?city2Label ?location2 ?dist
WHERE
{
VALUES ?city1 { wd:Q84 }. # London
VALUES ?city2 { wd:Q90 }. # Paris
?city1 wdt:P625 ?location1.
?city2 wdt:P625 ?location2.
BIND(geof:distance(?location1, ?location2) as ?dist)
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
geof:longitude and geof:latitude
[edit | edit source]The functions geof:longitude
and geof:latitude
convert a well-known text literal (<http://www.opengis.net/ont/geosparql#wktLiteral>) of a Point geometry to its longitude and latitude, respectively.
Example usage:
# Museums in Barcelona with coordinates
SELECT ?item ?itemLabel ?coord ?lon ?lat
WHERE
{
?item wdt:P131 wd:Q1492; # in the administrative territory of Barcelona
wdt:P31 wd:Q33506; # is a museum
wdt:P625 ?coord.
BIND(geof:longitude(?coord) AS ?lon)
BIND(geof:latitude(?coord) AS ?lat)
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
wikibase:geoLongitude and wikibase:geoLatitude
[edit | edit source]It is also possible to retrieve the longitude and latitude of a coordinate node using wikibase:geoLongitude
and wikibase:geoLatitude
qualifiers. See here for an explanation.
# Museums in Barcelona with coordinates
SELECT ?item ?itemLabel ?coord ?lon ?lat
WHERE
{
?item wdt:P131 wd:Q1492; # in the administrative territory of Barcelona
wdt:P31 wd:Q33506; # is a museum
p:P625 [
ps:P625 ?coord;
psv:P625 [
wikibase:geoLongitude ?lon;
wikibase:geoLatitude ?lat;
]
].
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
AUTO_COORDINATES
[edit | edit source]The code BIND("[AUTO_COORDINATES]" as ?loc)
returns the current location, if you allow your browser to use it.
# Drinking establishments near me
SELECT DISTINCT ?pub ?pubLabel ?dist
WHERE
{
BIND("[AUTO_COORDINATES]" as ?loc) .
SERVICE wikibase:around {
?pub wdt:P625 ?location .
bd:serviceParam wikibase:center ?loc.
bd:serviceParam wikibase:radius "1" .
}
FILTER EXISTS { ?pub wdt:P31/wdt:P279* wd:Q5307737 }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
BIND(geof:distance(?loc, ?location) as ?dist)
} ORDER BY ?dist
Aggregate functions
[edit | edit source]See Aggregate functions for COUNT, MIN, MAX, SUM, AVG, SAMPLE and GROUP_CONCAT
References
[edit | edit source]- ↑ a b The Wikidata Query Service uses a Wikimedia fork of Blazegraph that depends on OpenJDK 8. Other instances may run different versions of Java with different levels of Unicode support.
- ↑ QLever uses the
\n
syntax instead, which you need to escape as\\n
inside a string literal.
Federated query
Federated query
[edit | edit source]Federated query is the ability to take a query and provide solutions based on information from many different sources.
A building block is the ability to have one query be able to issue a query on another SPARQL endpoint during query execution.
Example
[edit | edit source]Federated query comparing Wikidata and UK Parliament database, to show differences of more than 10km in locations of UK parliamentary constituencies (source Wikidata:Wikidata:Status updates/2019 06 10)
# compare lat/long of Parliament and Wikidata constituency records
#defaultView:Map{"hide":["?line"]}
PREFIX parliament:<https://id.parliament.uk/schema/>
SELECT DISTINCT ?constituency ?parlcoord ?item ?itemLabel ?wdcoord ?dist ?line WHERE {
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
SERVICE <https://api.parliament.uk/sparql>
{ ?constituency parliament:constituencyGroupHasConstituencyArea ?area .
?area parliament:latitude ?lat . ?area parliament:longitude ?long .
bind(SUBSTR(str(?constituency),26) as ?parlid) . }
BIND(concat("Point(",str(?long)," ",str(?lat),")") as ?parlcoord)
# get constituencies from Parliament with coordinates
?item wdt:P6213 ?parlid . ?item wdt:P31 wd:Q27971968 . ?item wdt:P625 ?wdcoord .
# now get them from Wikidata with coordinates
BIND(geof:distance(?parlcoord, ?wdcoord) as ?dist) . filter (?dist >= 10)
# now find out the distance (in kms)
?item p:P625 ?statementnode. ?statementnode psv:P625 ?valuenode.
?valuenode wikibase:geoLatitude ?wikilat . ?valuenode wikibase:geoLongitude ?wikilon.
BIND(CONCAT('LINESTRING (', STR(?wikilon), ' ', STR(?wikilat), ',', STR(?long), ' ', STR(?lat), ')') AS ?str) .
BIND(STRDT(?str, geo:wktLiteral) AS ?line)
}
References
[edit | edit source]- www.wikidata.org/wiki/Wikidata:SPARQL_query_service/Federated_queries
- www.w3.org/2009/sparql/wiki/Feature:BasicFederatedQuery
- www.oclc.org/developer/news/2016/federated-queries-with-sparql.en.html
Subqueries
SPARQL allows one SELECT query to be nested inside another. The inner SELECT query is called a subquery and is evaluated first. The subquery result variable(s) can then be used in the outer SELECT query.
Simplest example:
SELECT ?x ?y WHERE {
VALUES ?x { 1 2 3 4 }
{
SELECT ?y WHERE { VALUES ?y { 5 6 7 8 } }
} # \subQuery
} # \mainQuery
The example below calculates the population of each country in the world, expressing the population as a percentage of the world's total population. In order to calculate the world's total population, it uses a subquery.
SELECT ?countryLabel ?population (round(?population/?worldpopulation*1000)/10 AS ?percentage)
WHERE {
?country wdt:P31 wd:Q3624078; # is a sovereign state
wdt:P1082 ?population.
{
# subquery to determine ?worldpopulation
SELECT (sum(?population) AS ?worldpopulation)
WHERE {
?country wdt:P31 wd:Q3624078; # is a sovereign state
wdt:P1082 ?population.
}
}
SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".}
}
ORDER BY desc(?population)
The syntax of a query with a subquery is shown below. A subquery is basically the same as a simple query and is enclosed within { brackets }.
SELECT ... query result variables ... WHERE { ... query pattern ... { # subquery SELECT ... subquery result variables ... WHERE { ... subquery pattern ... } ... optional subquery modifiers ... } # end of subquery } ... optional query modifiers ...
Subqueries can be used, often with a LIMIT
, to avoid a query timeout by fractioning the task. As an example, this query is timing out :
#100 humans with exactly 6 months between their month of birthday and their month of death.
SELECT DISTINCT ?itemLabel ?item WHERE {
?item wdt:P31 wd:Q5 ;
p:P569/psv:P569 [wikibase:timePrecision ?datePrecision1; wikibase:timeValue ?naissance] ;
p:P570/psv:P570 [wikibase:timePrecision ?datePrecision2; wikibase:timeValue ?mort ].
filter(?datePrecision1>10)
filter(?datePrecision2>10)
bind(month(?mort) - month(?naissance) as ?mois)
bind(day(?mort) - day(?naissance) as ?jour)
filter(abs(?mois) = 6)
filter(?jour = 0)
SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".}
}
ORDER BY ?itemLabel
LIMIT 100
but the same query putting the limit on the selected items in a subquery and the label service outside it didn't timeout :
#100 humans with exactly 6 months between their month of birthday and their month of death.
SELECT DISTINCT ?itemLabel ?item WHERE {
{
SELECT DISTINCT ?item WHERE {
?item wdt:P31 wd:Q5 ;
p:P569/psv:P569 [wikibase:timePrecision ?datePrecision1; wikibase:timeValue ?naissance] ;
p:P570/psv:P570 [wikibase:timePrecision ?datePrecision2; wikibase:timeValue ?mort ].
filter(?datePrecision1>10)
filter(?datePrecision2>10)
bind(month(?mort) - month(?naissance) as ?mois)
bind(day(?mort) - day(?naissance) as ?jour)
filter(abs(?mois) = 6)
filter(?jour = 0)
}
LIMIT 100
}
SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".}
}
ORDER BY ?itemLabel
Templates
Within Wikidata Query Service Query Helper the comment #TEMPLATE can build a simple template where the user can choose one or more variables to change a query without needing to know the SPARQL query language.
See below an example to select presidents and their spouses from any country:
#TEMPLATE={"template":"Presidents of ?country and their spouses","variables":{"?country":{"query":" SELECT ?id WHERE { ?id wdt:P31 wd:Q6256 . }"} } }
SELECT ?p ?pLabel ?ppicture ?w ?wLabel ?wpicture WHERE {
BIND(wd:Q30 AS ?country) # United States of America
?country (p:P6/ps:P6) ?p. # Head of government
?p wdt:P26 ?w. # Spouse
OPTIONAL {
?p wdt:P18 ?ppicture.
?w wdt:P18 ?wpicture.
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
The "template" contains the texts and one ore more variables.
The "variables" list the variables and optionally a "query" to select possible values, in this case ?id is instance of country.
- If no query is needed the syntax is "?var1":{} In that case that might be
#TEMPLATE={"template":"Presidents of ?country and their spouses","variables":{"?country":{} } }
Mind: BIND(wd:Q30 AS ?country) is used as a default for the variable ?country.
References
[edit | edit source]
WIKIDATA Qualifiers, References and Ranks
The data on WIKIDATA contains more info than only the triples. For a full description see Wikidata:Glossary.
Let us look at Douglas Adams (Q42) and where he is educated at (P69).
Qualifiers
[edit | edit source]Let us list the education of Douglas Adams and the qualifiers Start time end End time:
SELECT ?education ?educationLabel ?starttime ?endtime
WHERE
{
wd:Q42 p:P69 ?statement.
?statement ps:P69 ?education.
?statement pq:P580 ?starttime.
?statement pq:P582 ?endtime.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?starttime
The prefix p:
points not to the object, but to a statement node. This node then is the subject of other triples.
The prefix ps:
within the statement node retrieves the object.
The prefix pq:
within the statement node retrieves the qualifier information.
The code can be abbreviated a lot with the [ ] syntax by eliminating the variable ?statement
.
SELECT ?education ?educationLabel ?starttime ?endtime
WHERE
{
wd:Q42 p:P69 [ps:P69 ?education;
pq:P580 ?starttime;
pq:P582 ?endtime;
].
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?starttime
References
[edit | edit source]Let us list the education of Douglas Adams and the stated in (P248) reference:
SELECT ?education ?educationLabel ?ref ?refLabel
WHERE
{
wd:Q42 p:P69 ?statement.
?statement ps:P69 ?education.
?statement prov:wasDerivedFrom ?refnode.
?refnode pr:P248 ?ref.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
The prefix p:
points not to the object, but to a statement node. This node then is the subject of other triples.
The prefix ps:
within the statement node retrieves the object.
The prov:wasDerivedFrom
within the statement node points to a new reference node.
The prefix pr:
within the reference node retrieves the reference information.
The code can be abbreviated a lot with the [ ] syntax by eliminating the variables ?statement
and ?refnode
.
SELECT ?education ?educationLabel ?ref ?refLabel
WHERE
{
wd:Q42 p:P69 [ ps:P69 ?education;
prov:wasDerivedFrom
[ pr:P248 ?ref;
]
].
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
You might have noticed that only one of the 2 educations is listed in the queries above. To list both we need to introduce OPTIONAL{ }
. As this can only be used with full sentences we need to use the full expanded syntax with triples as short sentence:
SELECT ?education ?educationLabel ?ref ?refLabel
WHERE
{
wd:Q42 p:P69 ?statement.
?statement ps:P69 ?education.
OPTIONAL{ ?statement prov:wasDerivedFrom ?refnode.
?refnode pr:P248 ?ref.
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Ranks
[edit | edit source]Let us list the education of Douglas Adams and the ranks of them:
SELECT ?education ?educationLabel ?rank
WHERE
{
wd:Q42 p:P69 ?statement.
?statement ps:P69 ?education.
?statement wikibase:rank ?rank.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
The prefix p:
points not to the object, but to a statement node. This node then is the subject of other triples.
The prefix ps:
within the statement node retrieves the object.
The wikibase:rank
within the statement node retrieves the rank information.
The code can be abbreviated a lot with the [ ] syntax by eliminating the variable ?statement
.
SELECT ?education ?educationLabel ?rank
WHERE
{
wd:Q42 p:P69 [ps:P69 ?education;
wikibase:rank ?rank;
].
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
An example with different ranks is the (historcal) countries in which Berlin lied.
# Berlins countries and ranking
SELECT ?country ?countryLabel ?rank
WHERE
{
wd:Q64 p:P17 [ps:P17 ?country;
wikibase:rank ?rank;
].
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
The 3 possible values for ranks are wikibase:PreferredRank
, wikibase:NormalRank
and wikibase:DeprecatedRank
Compare this with the normal triples, which will select only the value(s) with the highest rank. In this case only the Preferred Rank value Germany (Q183).
# Berlins countries via normal triples
SELECT ?country ?countryLabel
WHERE
{
wd:Q64 wdt:P17 ?country.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Ignore ranks in queries
[edit | edit source]Ranks might cause unexpected results. For example, consider this query, that will give you back all Dutch municipalities that share a border with Alphen aan den Rijn (Q213246):
select ?muni ?muniLabel where {
?muni wdt:P31 wd:Q2039348;
wdt:P47 wd:Q213246.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
This will not show Boskoop (Q894442), because its rank for municipality of the Netherlands (Q2039348) is 'normal' while the other three values for instance of (P31) are 'preferred'. To see Boskoop in the previous query as well, rewrite it like this:
select ?muni ?muniLabel where {
?muni p:P31 [ps:P31 wd:Q2039348];
wdt:P47 wd:Q213246.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Or use a property path to shorten the query like this:
select ?muni ?muniLabel where {
?muni p:P31/ps:P31 wd:Q2039348;
wdt:P47 wd:Q213246.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Summary
[edit | edit source]Example | ||
---|---|---|
Statements | wd:Q42 wdt:P69 wd:Q691283. or wd:Q42 p:P69 ?s. ?s ps:P69 wd:Q691283. or wd:Q42 p:P69 [ ps:P69 wd:Q691283 ]. |
|
Rank | wd:Q42 p:P69 [ wikibase:rank ?rank ]. | |
Qualifier | wd:Q42 p:P69 [ pq:P580 ?qualifier ]. | |
Reference | wd:Q42 p:P69 [ prov:wasDerivedFrom [ pr:P248 ?ref ] ]. |
A full query of Douglas Adams education, with rank, qualifiers and references could look like
# Douglas Adams education, with rank, qualifiers and references
SELECT ?education ?educationLabel ?rank ?starttime ?endtime ?ref ?refLabel
WHERE
{
wd:Q42 p:P69 ?statement.
?statement ps:P69 ?education.
# rank
?statement wikibase:rank ?rank.
# qualifiers
OPTIONAL{ ?statement pq:P580 ?starttime. }
OPTIONAL{ ?statement pq:P582 ?endtime. }
# references
OPTIONAL{ ?statement prov:wasDerivedFrom ?refnode.
?refnode pr:P248 ?ref.
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?starttime
References
[edit | edit source]
WIKIDATA Language links and Badges
Language links
[edit | edit source]#Female scientists with most number of sitelinks (but not English Wikipedia)
SELECT ?item ?itemLabel (COUNT(DISTINCT ?languagelink) AS ?linkcount) (SAMPLE( ?languagelink ) AS ?sample )
WHERE {
?item wdt:P31 wd:Q5. # is a human
?item wdt:P21 wd:Q6581072. # gender female
?item wdt:P106 wd:Q901. # occupation scientist
?languagelink schema:about ?item.
FILTER NOT EXISTS {
?enlanguagelink schema:about ?item.
?enlanguagelink schema:inLanguage "en".
?enlanguagelink schema:isPartOf <https://en.wikipedia.org/>
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en,de,es,ar,fr" }
}
GROUP BY ?item ?itemLabel
ORDER BY DESC(?linkcount)
The code schema:about
links an item to the language link.
The code schema:inLanguage
within the language link retrieves the language.
The code schema:isPartOf
within the language link retrieves the wikimedia project, like <https://en.wikipedia.org/>.
Badges
[edit | edit source]Badges are a kind of marker attached to a language link, which could identify, for example, that the article is a featured article badge (Q17437796) on a specific site. They do not describe the external entity but the page on the specific site.
Here a query that ranks the wikimedia projects badges.
SELECT ?wiki ?badge ?badgeLabel (COUNT(DISTINCT ?languagelink) AS ?count) (SAMPLE(?item) AS ?sample)
WHERE {
?languagelink schema:isPartOf ?wiki.
?languagelink wikibase:badge ?badge.
?languagelink schema:about ?item.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?badge rdfs:label ?badgeLabel.
}
}
GROUP BY ?wiki ?badge ?badgeLabel
ORDER BY DESC(?count)
LIMIT 100
The code schema:isPartOf
(within the language link) retrieves the wikimedia project, like <https://en.wikipedia.org/>.
The code wikibase:badge
(within the language link) retrieves the badge, like featured article badge (Q17437796).
The code schema:about
links an item to the language link.
WIKIDATA Precision, Units and Coordinates
The data on WIKIDATA contains more info than only the triples. For a full description see Wikidata:Glossary.
Values on Wikidata often have additional info like precision, units etc. Wikidata's solution for almost everything is more triples. And it means more prefixes.
Entities
[edit | edit source]For entities there is no additional information.
Strings
[edit | edit source]For strings there is no additional information.
Time
[edit | edit source]# examples of dates, precision, time zones and calendars
SELECT ?time ?timeprecision ?timezone ?timecalendar ?timecalendarLabel
WHERE
{
{ wd:Q5598 p:P569/psv:P569 ?timenode. } # Jul 15, 1606
UNION
{ wd:Q220 p:P571/psv:P571 ?timenode. } # 13 April 753 BCE
UNION
{ wd:Q1 p:P580/psv:P580 ?timenode. } # 13798 million years BCE
?timenode wikibase:timeValue ?time.
?timenode wikibase:timePrecision ?timeprecision.
?timenode wikibase:timeTimezone ?timezone.
?timenode wikibase:timeCalendarModel ?timecalendar.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
The prefix p:
points to a statement node.
The prefix psv:
within a statement node retrieves a time node.
The wikibase:timeValue
within the time node retrieves the time.
The wikibase:timePrecision
within a time node retrieves the precision of the time.
- The codes for precision are 0: billion years, 1: hundred million years, 3: million years, 4: hundred thousand years, 5: ten thousand years, 6: millennium, 7: century, 8: decade, 9: year, 10: month, 11: day, 12: hour, 13: minute, 14: second.
The wikibase:timeTimezone
within a time node retrieves the timezone, as an offset from UTC in minutes.
The wikibase:timeCalendarModel
within a time node retrieves the calendar, a common used value is proleptic Gregorian calendar (Q1985727).
Comment on filtering on dates.
On filtering dates the code ^^xsd:dateTime
should be added, like:
FILTER("2015-01-01"^^xsd:dateTime <= ?dob && ?dob < "2016-01-01"^^xsd:dateTime).
Monolingual texts
[edit | edit source]For monolingual text there is no additional information. The text is represented as a string literal with language tag. It has only simple value.
#Countries in European Union with native name and language
SELECT ?country ?countryLabel ?nativename ?language
{
wd:Q458 wdt:P150 ?country. # European Union contains administrative territorial entity
OPTIONAL { ?country wdt:P1705 ?nativename.
BIND( LANG(?nativename) AS ?language). }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?language)
Coordinates
[edit | edit source]# Museums in Barcelona with coordinates
SELECT ?item ?itemLabel ?coord ?lon ?lat
WHERE
{
?item wdt:P131 wd:Q1492. # in the administrative territory of Barcelona
?item wdt:P31 wd:Q33506. # is a museum
?item p:P625 ?coordinate.
?coordinate ps:P625 ?coord.
?coordinate psv:P625 ?coordinate_node.
?coordinate_node wikibase:geoLongitude ?lon.
?coordinate_node wikibase:geoLatitude ?lat.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
The prefix p:
points to a statement node.
The prefix ps:
within a statement node for a coordinate retrieves the full coordinate, like Point(2.1749 41.3834).
The prefix psv:
within a statement node retrieves a coordinate node.
The wikibase:geoLongitude
within the coordinate node retrieves the longitude value.
The wikibase:geoLatitude
within a coordinate node retrieves the latitude value.
The wikibase:geoGlobe
within a coordinate node retrieves the globe object. For coordinates on earth it will be Earth (Q2).
The wikibase:geoPrecision
within a coordinate node retrieves the precision of the coordinate values, measured in degrees. Multiply by 111000 to convert to meters.
Here an example of mountains not located on Earth.
# Mountains, with coordinates, not located on Earth
SELECT ?item ?name ?coord ?lon ?lat ?globe ?globeLabel
{
?item wdt:P31 wd:Q8502; # is a mountain
p:P625 [
ps:P625 ?coord;
psv:P625 [
wikibase:geoLongitude ?lon;
wikibase:geoLatitude ?lat;
wikibase:geoGlobe ?globe;
] ;
]
FILTER ( ?globe != wd:Q2 ) # globe is not earth
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" .
?item rdfs:label ?name.
?globe rdfs:label ?globeLabel.
}
}
ORDER BY ?globeLabel ?name
Quantities
[edit | edit source]# Longest rivers in the USA
SELECT ?item ?itemLabel ?length ?unitLabel ?lowerbound ?upperbound ?precision ?length2 ?conversion ?length_in_m
WHERE
{
?item wdt:P31/wdt:P279* wd:Q4022. # rivers
?item wdt:P17 wd:Q30. # country USA
?item p:P2043 ?stmnode. # length
?stmnode psv:P2043 ?valuenode.
?valuenode wikibase:quantityAmount ?length.
?valuenode wikibase:quantityUnit ?unit.
?valuenode wikibase:quantityLowerBound ?lowerbound.
?valuenode wikibase:quantityUpperBound ?upperbound.
BIND((?upperbound-?lowerbound)/2 AS ?precision).
BIND(IF(?precision=0, ?length, (CONCAT(str(?length), "±", str(?precision)))) AS ?length2).
# conversion to SI unit
?unit p:P2370 ?unitstmnode. # conversion to SI unit
?unitstmnode psv:P2370 ?unitvaluenode.
?unitvaluenode wikibase:quantityAmount ?conversion.
?unitvaluenode wikibase:quantityUnit wd:Q11573. # meter
BIND(?length * ?conversion AS ?length_in_m).
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?length_in_m)
LIMIT 10
The prefix p:
points to a statement node.
The prefix psv:
within a statement node retrieves a value node.
The wikibase:quantityAmount
within the value node retrieves a quantity value.
The wikibase:quantityUnit
within a value node retrieves a unit. Not all quantities have units.
The wikibase:quantityLowerBound
and wikibase:quantityUpperBound
can be used to indicate the precision.
Within the Unit you can retrieve the statement for converting the unit to the SI unit etc. See in the example the length of the Mississippi River is in miles, and can be converted to SI units. The conversion could also be needed if some lengths are in meters and some in kilometers to so.
Some variables are used only for demonstration. Without those the query will be:
# Longest rivers in the USA
SELECT ?item ?itemLabel ?length2 ?unitLabel ?length_in_m
WHERE
{
?item wdt:P31/wdt:P279* wd:Q4022. # rivers
?item wdt:P17 wd:Q30. # country USA
?item p:P2043 ?stmnode. # length
?stmnode psv:P2043 ?valuenode.
?valuenode wikibase:quantityAmount ?length.
?valuenode wikibase:quantityUnit ?unit.
?valuenode wikibase:quantityLowerBound ?lowerbound.
?valuenode wikibase:quantityUpperBound ?upperbound.
BIND((?upperbound-?lowerbound)/2 AS ?precision).
BIND(IF(?precision=0, ?length, (CONCAT(str(?length), "±", str(?precision)))) AS ?length2).
# conversion to SI unit
?unit p:P2370 ?unitstmnode. # conversion to SI unit
?unitstmnode psv:P2370 ?unitvaluenode.
?unitvaluenode wikibase:quantityAmount ?conversion.
?unitvaluenode wikibase:quantityUnit wd:Q11573. # meter
BIND(?length * ?conversion AS ?length_in_m).
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?length_in_m)
LIMIT 10
The code can be abbreviated a lot with the [ ] syntax by eliminating the variables for the nodes and unused variables.
# Longest rivers in the USA
SELECT ?item ?itemLabel ?length2 ?unitLabel ?length_in_m
WHERE
{
?item wdt:P31/wdt:P279* wd:Q4022. # rivers
?item wdt:P17 wd:Q30. # country USA
?item p:P2043/psv:P2043 [ # length
wikibase:quantityAmount ?length;
wikibase:quantityUnit ?unit;
wikibase:quantityLowerBound ?lowerbound;
wikibase:quantityUpperBound ?upperbound;
]
BIND((?upperbound-?lowerbound)/2 AS ?precision).
BIND(IF(?precision=0, ?length, (CONCAT(str(?length), "±", str(?precision)))) AS ?length2).
# conversion to SI unit
?unit p:P2370/psv:P2370 [ # conversion to SI unit
wikibase:quantityAmount ?conversion;
wikibase:quantityUnit wd:Q11573; # meter
]
BIND(?length * ?conversion AS ?length_in_m).
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?length_in_m)
LIMIT 10
Normalised units
[edit | edit source]Instead of converting by SPARQL code as above the units can be converted automatically by using prefix psn:
(normalised) instead of psv:
.
# Longest rivers in the USA, normalised units
SELECT ?item ?itemLabel ?length ?unitLabel ?lowerbound ?upperbound ?precision ?length2
WHERE
{
?item wdt:P31/wdt:P279* wd:Q4022. # rivers
?item wdt:P17 wd:Q30. # country USA
?item p:P2043 ?stmnode. # length
?stmnode psn:P2043 ?valuenode. # normalised value
?valuenode wikibase:quantityAmount ?length.
?valuenode wikibase:quantityUnit ?unit.
?valuenode wikibase:quantityLowerBound ?lowerbound.
?valuenode wikibase:quantityUpperBound ?upperbound.
BIND((?upperbound-?lowerbound)/2 AS ?precision).
BIND(IF(?precision=0, ?length, (CONCAT(str(?length), "±", str(?precision)))) AS ?length2).
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?length)
LIMIT 10
The prefix psv:
within a statement node retrieves a value node.
The prefix psn:
within a statement node retrieves a normalised value node. Note that a query with psn:
will give you only the values with units presents in this list.[1]
Notice that the units changed from km and miles into meters, and that the values are all calculated accordingly. Also the lower- and upper-bound values change accordingly. The precision now is in meters.
Normalized quantity values are value nodes that are parallel to the original data nodes but represented in base units. They are connected to their parent nodes by predicates with prefix having "v" replaced with "n" - i.e. psn:
, prn:
(for references) and pqn:
(for qualifiers).
The code can be abbreviated a lot with the [ ] syntax by eliminating the variables for the nodes and unused variables.
# Longest rivers in the USA, normalised units
SELECT ?item ?itemLabel ?length2 ?unitLabel
WHERE
{
?item wdt:P31/wdt:P279* wd:Q4022. # rivers
?item wdt:P17 wd:Q30. # country USA
?item p:P2043/psn:P2043 [ # length, normalised
wikibase:quantityAmount ?length;
wikibase:quantityUnit ?unit;
wikibase:quantityLowerBound ?lowerbound;
wikibase:quantityUpperBound ?upperbound;
]
BIND((?upperbound-?lowerbound)/2 AS ?precision).
BIND(IF(?precision=0, ?length, (CONCAT(str(?length), "±", str(?precision)))) AS ?length2).
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?length)
LIMIT 10
Summary
[edit | edit source]item | Statement node | Value node |
---|---|---|
p:Pxxx | ||
psv:Pxxx or psn:Pxxx (normalised) |
||
Time values | ||
wikibase:timeValue | ||
wikibase:timePrecision | ||
wikibase:timeTimezone | ||
wikibase:timeCalendarModel | ||
Coordinate values | ||
wikibase:geoLongitude | ||
wikibase:geoLatitude | ||
wikibase:geoGlobe | ||
wikibase:geoPrecision | ||
Quantity values | ||
wikibase:quantityAmount | ||
wikibase:quantityUnit | ||
wikibase:quantityLowerBound | ||
wikibase:quantityUpperBound | ||
wikibase:quantityAmount |
WIKIDATA Lexicographical data
The data on WIKIDATA contains more info than only triples with concepts: Q-items are related to a thing or an idea. Since 2018, Wikidata has also stored a new type of data: words, phrases and sentences, in many languages, described in many languages. This information is stored in new types of entities, called Lexemes (L), Forms (F) and Senses (S).
Comment This chapter is not yet complete. Please help expand this.
Glossary | SPARQL code |
A Lexeme is a lexical element of a language, such as a word, a phrase, or a prefix (see Lexeme on Wikipedia). Lexemes are Entities in the sense of the Wikibase data model. A Lexeme is described using the following information:
|
|
|
|
|
|
Prefixes
[edit | edit source]Prefixes used only for Lexicograpical data are:
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
PREFIX dct: <http://purl.org/dc/terms/>
Examples
[edit | edit source]Get swedish gloss of specific lexeme
[edit | edit source]SELECT ?sense ?gloss
WHERE {
VALUES ?l {wd:L35455}. # Swedish noun "vara"
?l ontolex:sense ?sense.
?sense skos:definition ?gloss.
# Get only the swedish gloss, exclude otherwise
FILTER(LANG(?gloss) = "sv")
}
Get senses of a specific lexeme that has P5137 (item for this sense)
[edit | edit source]SELECT ?sense ?gloss
WHERE {
VALUES ?l {wd:L39751}. # Swedish adjective "smaklös"
?l ontolex:sense ?sense.
?sense skos:definition ?gloss.
# Exclude lexemes without a linked QID from at least one sense
?sense wdt:P5137 []. # has P5137 (item for this sense)
}
Lexemes describing a color
[edit | edit source]# By Vesihiisi
SELECT ?l ?lemma ?languageLabel WHERE {
?l a ontolex:LexicalEntry;
dct:language ?language;
wikibase:lemma ?lemma .
?l wdt:P31 wd:Q376431. # color term
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ?languageLabel
Words per language
[edit | edit source]Here is an overview of the number of words per language
SELECT (?language AS ?label) (COUNT(*) AS ?count)
WHERE {
?l a ontolex:LexicalEntry ; wikibase:lemma ?word .
BIND( LANG(?word) AS ?language )
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?language
ORDER BY DESC(?count)
English and American English
[edit | edit source]This query lists all words that are different in English and American English
SELECT ?l ?english ?american
WHERE {
?l wikibase:lemma ?english . FILTER(LANG(?english)="en-gb")
?l wikibase:lemma ?american . FILTER(LANG(?american)="en")
FILTER(?english!=?american)
}
ORDER BY ?english
Overview of Lexical categories
[edit | edit source]Here is an overview of the most used Lexical categories in English:
SELECT ?categoryLabel (COUNT(*) AS ?count)
WHERE {
?l a ontolex:LexicalEntry ; wikibase:lemma ?word ; wikibase:lexicalCategory ?category; dct:language ?language.
?language wdt:P218 'en'
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?categoryLabel
ORDER BY DESC(?count)
Longest words
[edit | edit source]Here as example a list of the longest words in English
SELECT DISTINCT ?l ?word ?len
WHERE {
{
?l a ontolex:LexicalEntry ; dct:language wd:Q1860 ; wikibase:lemma ?word .
BIND(strlen(?word) as ?len)
} UNION {
?l a ontolex:LexicalEntry ; dct:language wd:Q1860 ; ontolex:lexicalForm/ontolex:representation ?word .
BIND(strlen(?word) as ?len)
}
}
order by DESC(?len)
LIMIT 20
Adjectives
[edit | edit source]This example shows (English) adjectives and their positive, comparative and superlative degrees. By changing VALUES ?language { wd:Q1860 }
this query can be changed into any language.
# adjectives
SELECT DISTINCT ?l ?word (GROUP_CONCAT(DISTINCT ?subfeatLabel; SEPARATOR=", ") AS ?subfeatures)
(GROUP_CONCAT(DISTINCT ?positive; SEPARATOR=", ") AS ?Positive)
(GROUP_CONCAT(DISTINCT ?comparative; SEPARATOR=", ") AS ?Comparative)
(GROUP_CONCAT(DISTINCT ?superlative; SEPARATOR=", ") AS ?Superlative)
WHERE {
VALUES ?language { wd:Q1860 } # English
?l a ontolex:LexicalEntry ; wikibase:lemma ?word; wikibase:lexicalCategory wd:Q34698 . # adjective
?l dct:language ?language.
OPTIONAL {
?l ontolex:lexicalForm ?form1 .
?form1 ontolex:representation ?positive ; wikibase:grammaticalFeature wd:Q3482678 . # positive
OPTIONAL { ?form1 wikibase:grammaticalFeature ?subfeat . FILTER(?subfeat != wd:Q3482678 ) }
}
?l ontolex:lexicalForm ?form2 .
?form2 ontolex:representation ?comparative ; wikibase:grammaticalFeature wd:Q14169499 . # comparative
OPTIONAL { ?form2 wikibase:grammaticalFeature ?subfeat . FILTER(?subfeat != wd:Q14169499 ) }
?l ontolex:lexicalForm ?form3 .
?form3 ontolex:representation ?superlative ; wikibase:grammaticalFeature wd:Q1817208 . # superlative
OPTIONAL { ?form3 wikibase:grammaticalFeature ?subfeat . FILTER(?subfeat != wd:Q1817208 ) }
# use ?word if ?positive is blank
BIND(IF(BOUND(?positive),?positive,?word) AS ?positive).
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
?subfeat rdfs:label ?subfeatLabel.
}
}
GROUP BY ?word ?l
ORDER BY ?word ?l
LIMIT 20000
Verbs
[edit | edit source]This example shows (English) verbs and their conjugations. This query is very complex because conjugations in Wikidata are modeled very complex. By changing VALUES ?language { "en" }
this query can be changed into any language. Currently only a few verbs are conjugated.
# verbs
SELECT ?l ?word (GROUP_CONCAT(DISTINCT ?subfeatLabel; SEPARATOR=", ") AS ?subfeatures)
?single1 ?single2 ?single3 ?plural1 ?plural2 ?plural3
WHERE {
VALUES ?language { "en" }
?l a ontolex:LexicalEntry ; wikibase:lemma ?word; wikibase:lexicalCategory ?category .
FILTER(?category = wd:Q24905 ) # verb
FILTER(LANG(?word) = ?language)
OPTIONAL {
?l ontolex:lexicalForm ?form1 .
{ ?form1 ontolex:representation ?single1 ; wikibase:grammaticalFeature wd:Q51929218 . # first-person singular
} UNION
{ ?form1 ontolex:representation ?single1 ; wikibase:grammaticalFeature wd:Q21714344 . # first person
FILTER NOT EXISTS{ ?form1 wikibase:grammaticalFeature wd:Q146786 . } # without plural
FILTER NOT EXISTS{ ?form1 wikibase:grammaticalFeature wd:Q51929154 . } # without plural person
} UNION
{ ?form1 ontolex:representation ?single1 ; wikibase:grammaticalFeature wd:Q51929131 . # singular person
FILTER NOT EXISTS{ ?form1 wikibase:grammaticalFeature wd:Q21714344 . } # without first person
FILTER NOT EXISTS{ ?form1 wikibase:grammaticalFeature wd:Q51929049 . } # without second person
FILTER NOT EXISTS{ ?form1 wikibase:grammaticalFeature wd:Q51929074 . } # without third person
} UNION
{ ?form1 ontolex:representation ?single1 ; wikibase:grammaticalFeature wd:Q110786 . # singular
FILTER NOT EXISTS{ ?form1 wikibase:grammaticalFeature wd:Q21714344 . } # without first person
FILTER NOT EXISTS{ ?form1 wikibase:grammaticalFeature wd:Q51929049 . } # without second person
FILTER NOT EXISTS{ ?form1 wikibase:grammaticalFeature wd:Q51929074 . } # without third person
}
FILTER(LANG(?single1) = ?language )
OPTIONAL { ?form1 wikibase:grammaticalFeature ?subfeat .
FILTER(?subfeat != wd:Q51929218 && ?subfeat != wd:Q21714344 ) # not first-person singular / first person
FILTER(?subfeat != wd:Q51929131 && ?subfeat != wd:Q110786 ) # not singular person / singular
FILTER(?subfeat != wd:Q51929049 && ?subfeat != wd:Q51929074 ) } # not second person / third person
}
OPTIONAL {
?l ontolex:lexicalForm ?form2 .
{ ?form2 ontolex:representation ?single2 ; wikibase:grammaticalFeature wd:Q51929369 . # second-person singular
} UNION
{ ?form2 ontolex:representation ?single2 ; wikibase:grammaticalFeature wd:Q51929049 . # second person
FILTER NOT EXISTS{ ?form2 wikibase:grammaticalFeature wd:Q146786 . } # without plural
FILTER NOT EXISTS{ ?form2 wikibase:grammaticalFeature wd:Q51929154 . } # without plural person
} UNION
{ ?form2 ontolex:representation ?single2 ; wikibase:grammaticalFeature wd:Q51929131 . # singular person
FILTER NOT EXISTS{ ?form2 wikibase:grammaticalFeature wd:Q21714344 . } # without first person
FILTER NOT EXISTS{ ?form2 wikibase:grammaticalFeature wd:Q51929049 . } # without second person
FILTER NOT EXISTS{ ?form2 wikibase:grammaticalFeature wd:Q51929074 . } # without third person
} UNION
{ ?form2 ontolex:representation ?single2 ; wikibase:grammaticalFeature wd:Q110786 . # singular
FILTER NOT EXISTS{ ?form2 wikibase:grammaticalFeature wd:Q21714344 . } # without first person
FILTER NOT EXISTS{ ?form2 wikibase:grammaticalFeature wd:Q51929049 . } # without second person
FILTER NOT EXISTS{ ?form2 wikibase:grammaticalFeature wd:Q51929074 . } # without third person
}
FILTER(LANG(?single2) = ?language )
OPTIONAL { ?form2 wikibase:grammaticalFeature ?subfeat .
FILTER(?subfeat != wd:Q51929369 && ?subfeat != wd:Q51929049 ) # not second-person singular / second person
FILTER(?subfeat != wd:Q51929131 && ?subfeat != wd:Q110786 ) # not singular person / singular
FILTER(?subfeat != wd:Q21714344 && ?subfeat != wd:Q51929074 ) } # not first person / third person
}
OPTIONAL {
?l ontolex:lexicalForm ?form3 .
{ ?form3 ontolex:representation ?single3 ; wikibase:grammaticalFeature wd:Q51929447 . # third-person singular
} UNION
{ ?form3 ontolex:representation ?single3 ; wikibase:grammaticalFeature wd:Q51929074 . # third person
FILTER NOT EXISTS{ ?form3 wikibase:grammaticalFeature wd:Q146786 . } # without plural
FILTER NOT EXISTS{ ?form3 wikibase:grammaticalFeature wd:Q51929154 . } # without plural person
} UNION
{ ?form3 ontolex:representation ?single3 ; wikibase:grammaticalFeature wd:Q51929131 . # singular person
FILTER NOT EXISTS{ ?form3 wikibase:grammaticalFeature wd:Q21714344 . } # without first person
FILTER NOT EXISTS{ ?form3 wikibase:grammaticalFeature wd:Q51929049 . } # without second person
FILTER NOT EXISTS{ ?form3 wikibase:grammaticalFeature wd:Q51929074 . } # without third person
} UNION
{ ?form3 ontolex:representation ?single3 ; wikibase:grammaticalFeature wd:Q110786 . # singular
FILTER NOT EXISTS{ ?form3 wikibase:grammaticalFeature wd:Q21714344 . } # without first person
FILTER NOT EXISTS{ ?form3 wikibase:grammaticalFeature wd:Q51929049 . } # without second person
FILTER NOT EXISTS{ ?form3 wikibase:grammaticalFeature wd:Q51929074 . } # without third person
}
FILTER(LANG(?single3) = ?language )
OPTIONAL { ?form3 wikibase:grammaticalFeature ?subfeat .
FILTER(?subfeat != wd:Q51929447 && ?subfeat != wd:Q51929074 ) # not third-person singular / third person
FILTER(?subfeat != wd:Q51929131 && ?subfeat != wd:Q110786 ) # not singular person / singular
FILTER(?subfeat != wd:Q21714344 && ?subfeat != wd:Q51929049 ) } # not first person / second person
}
OPTIONAL {
?l ontolex:lexicalForm ?form4 .
{ ?form4 ontolex:representation ?plural1 ; wikibase:grammaticalFeature wd:Q51929290 . # first-person plural
} UNION
{ ?form4 ontolex:representation ?plural1 ; wikibase:grammaticalFeature wd:Q21714344 . # first person
FILTER NOT EXISTS{ ?form4 wikibase:grammaticalFeature wd:Q110786 . } # without singular
FILTER NOT EXISTS{ ?form4 wikibase:grammaticalFeature wd:Q51929131 . } # without singular person
} UNION
{ ?form4 ontolex:representation ?plural1 ; wikibase:grammaticalFeature wd:Q51929154 . # plural person
FILTER NOT EXISTS{ ?form4 wikibase:grammaticalFeature wd:Q21714344 . } # without first person
FILTER NOT EXISTS{ ?form4 wikibase:grammaticalFeature wd:Q51929049 . } # without second person
FILTER NOT EXISTS{ ?form4 wikibase:grammaticalFeature wd:Q51929074 . } # without third person
} UNION
{ ?form4 ontolex:representation ?plural1 ; wikibase:grammaticalFeature wd:Q146786 . # plural
FILTER NOT EXISTS{ ?form4 wikibase:grammaticalFeature wd:Q21714344 . } # without first person
FILTER NOT EXISTS{ ?form4 wikibase:grammaticalFeature wd:Q51929049 . } # without second person
FILTER NOT EXISTS{ ?form4 wikibase:grammaticalFeature wd:Q51929074 . } # without third person
}
FILTER(LANG(?plural1) = ?language )
OPTIONAL { ?form4 wikibase:grammaticalFeature ?subfeat .
FILTER(?subfeat != wd:Q51929290 && ?subfeat != wd:Q21714344 ) # not first-person plural / first person
FILTER(?subfeat != wd:Q51929154 && ?subfeat != wd:Q146786 ) # not plural person / plural
FILTER(?subfeat != wd:Q51929049 && ?subfeat != wd:Q51929074 ) } # not second person / third person
}
OPTIONAL {
?l ontolex:lexicalForm ?form5 .
{ ?form5 ontolex:representation ?plural2 ; wikibase:grammaticalFeature wd:Q51929403 . # second-person plural
} UNION
{ ?form5 ontolex:representation ?plural2 ; wikibase:grammaticalFeature wd:Q51929049 . # second person
FILTER NOT EXISTS{ ?form5 wikibase:grammaticalFeature wd:Q110786 . } # without singular
FILTER NOT EXISTS{ ?form5 wikibase:grammaticalFeature wd:Q51929131 . } # without singular person
} UNION
{ ?form5 ontolex:representation ?plural2 ; wikibase:grammaticalFeature wd:Q51929154 . # plural person
FILTER NOT EXISTS{ ?form5 wikibase:grammaticalFeature wd:Q21714344 . } # without first person
FILTER NOT EXISTS{ ?form5 wikibase:grammaticalFeature wd:Q51929049 . } # without second person
FILTER NOT EXISTS{ ?form5 wikibase:grammaticalFeature wd:Q51929074 . } # without third person
} UNION
{ ?form5 ontolex:representation ?plural2 ; wikibase:grammaticalFeature wd:Q146786 . # plural
FILTER NOT EXISTS{ ?form5 wikibase:grammaticalFeature wd:Q21714344 . } # without first person
FILTER NOT EXISTS{ ?form5 wikibase:grammaticalFeature wd:Q51929049 . } # without second person
FILTER NOT EXISTS{ ?form5 wikibase:grammaticalFeature wd:Q51929074 . } # without third person
}
FILTER(LANG(?plural2) = ?language )
OPTIONAL { ?form5 wikibase:grammaticalFeature ?subfeat .
FILTER(?subfeat != wd:Q51929403 && ?subfeat != wd:Q51929049 ) # not second-person plural / second person
FILTER(?subfeat != wd:Q51929154 && ?subfeat != wd:Q146786 ) # not plural person / plural
FILTER(?subfeat!= wd:Q21714344 && ?subfeat != wd:Q51929074 ) } # not first person / third person
}
OPTIONAL {
?l ontolex:lexicalForm ?form6 .
{ ?form6 ontolex:representation ?plural3 ; wikibase:grammaticalFeature wd:Q51929517 . # third-person plural
} UNION
{ ?form6 ontolex:representation ?plural3 ; wikibase:grammaticalFeature wd:Q51929074 . # third person
FILTER NOT EXISTS{ ?form6 wikibase:grammaticalFeature wd:Q110786 . } # without singular
FILTER NOT EXISTS{ ?form6 wikibase:grammaticalFeature wd:Q51929131 . } # without singular person
} UNION
{ ?form6 ontolex:representation ?plural3 ; wikibase:grammaticalFeature wd:Q51929154 . # plural person
FILTER NOT EXISTS{ ?form6 wikibase:grammaticalFeature wd:Q21714344 . } # without first person
FILTER NOT EXISTS{ ?form6 wikibase:grammaticalFeature wd:Q51929049 . } # without second person
FILTER NOT EXISTS{ ?form6 wikibase:grammaticalFeature wd:Q51929074 . } # without third person
} UNION
{ ?form6 ontolex:representation ?plural3 ; wikibase:grammaticalFeature wd:Q146786 . # plural
FILTER NOT EXISTS{ ?form6 wikibase:grammaticalFeature wd:Q21714344 . } # without first person
FILTER NOT EXISTS{ ?form6 wikibase:grammaticalFeature wd:Q51929049 . } # without second person
FILTER NOT EXISTS{ ?form6 wikibase:grammaticalFeature wd:Q51929074 . } # without third person
}
FILTER(LANG(?plural3) = ?language )
OPTIONAL { ?form6 wikibase:grammaticalFeature ?subfeat .
FILTER(?subfeat != wd:Q51929517 && ?subfeat != wd:Q51929074 ) # not third-person plural / third person
FILTER(?subfeat != wd:Q51929154 && ?subfeat != wd:Q146786 ) # not plural person / plural
FILTER(?subfeat != wd:Q21714344 && ?subfeat != wd:Q51929049 ) } # not first person / second person
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
?subfeat rdfs:label ?subfeatLabel.
}
}
GROUP BY ?l ?word ?single1 ?single2 ?single3 ?plural1 ?plural2 ?plural3
ORDER BY ?word ?single1 ?single2 ?single3 ?plural1 ?plural2 ?plural3
LIMIT 20000
Articles in all languages
[edit | edit source]This example shows articles in many languages.
# articles in all languages
SELECT ?l ?language ?categoryLabel ?word ?subfeatures
(GROUP_CONCAT(DISTINCT ?masculine; SEPARATOR=", ") AS ?Masculine )
(GROUP_CONCAT(DISTINCT ?feminine; SEPARATOR=", ") AS ?Feminine )
(GROUP_CONCAT(DISTINCT ?neuter; SEPARATOR=", ") AS ?Neuter )
WHERE {
SELECT ?l ?language ?categoryLabel ?word (GROUP_CONCAT(DISTINCT ?subfeatLabel; SEPARATOR=", ") AS ?subfeatures)
?masculine ?feminine ?neuter
WHERE {
VALUES ?categories { wd:Q103184 wd:Q2865743 wd:Q3813849 } # article or definite article or indefinite article
?l a ontolex:LexicalEntry ; wikibase:lemma ?word; wikibase:lexicalCategory ?category .
FILTER(?category = ?categories ) # article or definite article or indefinite article
BIND(LANG(?word) as ?language)
OPTIONAL {
?l ontolex:lexicalForm ?form1 .
?form1 ontolex:representation ?masculine ; wikibase:grammaticalFeature wd:Q499327 . # masculine
FILTER(LANG(?masculine) = ?language )
OPTIONAL { ?form1 wikibase:grammaticalFeature ?subfeat . FILTER(?subfeat != wd:Q499327 ) }
}
OPTIONAL {
?l ontolex:lexicalForm ?form2 .
?form2 ontolex:representation ?feminine ; wikibase:grammaticalFeature wd:Q1775415 . # feminine
FILTER(LANG(?feminine) = ?language )
OPTIONAL { ?form2 wikibase:grammaticalFeature ?subfeat . FILTER(?subfeat != wd:Q1775415 ) }
}
OPTIONAL {
?l ontolex:lexicalForm ?form3 .
?form3 ontolex:representation ?neuter ; wikibase:grammaticalFeature wd:Q1775461 . # neuter
FILTER(LANG(?neuter) = ?language )
OPTIONAL { ?form3 wikibase:grammaticalFeature ?subfeat . FILTER(?subfeat != wd:Q1775461 ) }
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
?subfeat rdfs:label ?subfeatLabel.
?category rdfs:label ?categoryLabel.
}
}
GROUP BY ?language ?l ?categoryLabel ?word ?masculine ?feminine ?neuter
ORDER BY ?language ?categoryLabel ?subfeatures ?word ?masculine ?feminine ?neuter
}
GROUP BY ?language ?l ?categoryLabel ?word ?subfeatures
ORDER BY ?language ?l ?categoryLabel ?word ?subfeatures ?masculine ?feminine ?neuter
External tools
[edit | edit source]See Wikidata:Tools/Lexicographical data for a list of external tools for Lexicographical data.
References
[edit | edit source]
Wikimedia Commons Query Service
Wikimedia Commons images use structured data too, see Structured Data on Commons (SDoC) . With Wikimedia Commons Query Service (WCQS) the data can be queried. This is available per 1/2/2022 at https://commons-query.wikimedia.org/ It was previously available in beta phase at https://wcqs-beta.wmflabs.org/.
Find below an example of Depictions of Douglas Adams shown as image grid.
#Show images of Douglas Adams in an image grid
#defaultView:ImageGrid
SELECT ?file ?image WHERE {
?file wdt:P180 wd:Q42 .
?file schema:url ?image.
}
Mind: use |project=sdc
where sdc =Structured Data on Commons, the default is |project=wd
for Wikidata.
See more examples on WCQS itself.
The query below shows all properties of an image:
# all properties of an image
SELECT DISTINCT ?file ?predicate ?pLabel ?o ?oLabel
WHERE {
VALUES ?file { sdc:M107651852 } # an example image
{ ?file ?predicate ?o.
BIND( IRI(REPLACE( STR(?predicate),"prop/direct/","entity/" )) AS ?p).
}
UNION
{ ?file ?predicate1 [ ?predicate ?o ]. # qualifiers
BIND( IRI((REPLACE( REPLACE( STR(?predicate), "(direct/|statement/|value/|value-normalized/|qualifier/|reference/)", ""),"prop/","entity/"))) AS ?p).
}
FILTER( CONTAINS( STR(?predicate), "/prop/direct/") || CONTAINS( STR(?predicate), "/prop/qualifier/") || CONTAINS( STR(?predicate), "/prop/reference/")
|| CONTAINS( STR(?predicate), "schema.org") || CONTAINS( STR(?predicate), "w3.org") )
SERVICE <https://query.wikidata.org/sparql> {
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
?p rdfs:label ?pLabel .
?o rdfs:label ?oLabel .
}
}
}
LIMIT 100
This reveals some hidden statements like: date Modified / width / height / contentSize (bytes). The query below gives an example how to use it.
# show hidden statements
SELECT ?file ?instance_of ?instance_ofLabel ?inception_date ?dateModified ?width ?height ?contentSize
WHERE {
VALUES ?file { sdc:M107651852 } # an example image
OPTIONAL{ ?file wdt:P31 ?instance_of. }
OPTIONAL{ ?file wdt:P571 ?inception_date. }
OPTIONAL{ ?file schema:dateModified ?dateModified. }
OPTIONAL{ ?file schema:width ?width. }
OPTIONAL{ ?file schema:height ?height. }
OPTIONAL{ ?file schema:contentSize ?contentSize. }
SERVICE <https://query.wikidata.org/sparql> {
SERVICE wikibase:label {
bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
?instance_of rdfs:label ?instance_ofLabel .
}
}
}
This example with combination of SERVICE - mwapi shows Depicts statements with Dutch labels, of files in one Commons category
SELECT ?file ?title ?depicts ?depicts_label
WITH
{ SELECT ?file ?title
WHERE
{ SERVICE wikibase:mwapi
{
bd:serviceParam wikibase:api "Generator" .
bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
bd:serviceParam mwapi:gcmtitle "Category:Historia Naturalis van Rudolf II" .
bd:serviceParam mwapi:generator "categorymembers" .
bd:serviceParam mwapi:gcmtype "file" .
bd:serviceParam mwapi:gcmlimit "max" .
?title wikibase:apiOutput mwapi:title .
?pageid wikibase:apiOutput "@pageid" .
}
BIND (URI(CONCAT('https://commons.wikimedia.org/entity/M', ?pageid)) AS ?file)
}
} AS %get_files
WHERE
{ INCLUDE %get_files
?file wdt:P180 ?depicts .
service <https://query.wikidata.org/sparql> {
OPTIONAL {?depicts rdfs:label ?depicts_label FILTER (lang(?depicts_label) = 'nl') }
}
}
References
[edit | edit source]- WCQSbeta2-release-notes on 1/2/2022
- Commons-l announcement of WCQS on 22/7/2020
Views
This page helps you to understand the various possibilities to display results on Wikidata Query. The query example page contains many good examples you can try!
After running a query, you can choose with the "Display" button which result view you want to display. Depending on your query and the data type, some of the views will be available.
#Countries in European Union with Flags and Population
# to be displayed as
# 1) Table
# 2) Image Grid
# 3) Map
# 4) Bar Chart
# 5) Bubble Chart
SELECT ?country ?countryLabel (MAX(?population) AS ?Population) (SAMPLE(?flag_image) AS ?Flag) (MAX(?coordinate) AS ?coor)
WHERE {
wd:Q458 wdt:P150 ?country. # European Union contains administrative territorial entity
OPTIONAL { ?country wdt:P41 ?flag_image. }
OPTIONAL { ?country wdt:P36 ?capital. }
OPTIONAL { ?country wdt:P1082 ?population. }
OPTIONAL { ?capital wdt:P625 ?coordinate. }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
?country rdfs:label ?countryLabel
}
}
GROUP BY ?country ?countryLabel
ORDER BY DESC(?Population)
|
Default View
[edit | edit source]The default view can be changed for each SPARQL query by using a comment
#defaultView:[NAME]
For example:
#defaultView:ImageGrid
Table (default)
[edit | edit source]Shows the result data as table. It is the default view for results and can display every data type.
This will map every variable from the SPARQL select clause to a formatted table column.
-
Table Result
-
Table Result on small screens with flipped columns
Image Grid
[edit | edit source]The #defaultView:ImageGrid
shows the result data as a grid with images.
Every grid item has a big picture and opens the gallery when clicking on it.
Contents an item is the formatted row data.
Variable | Data Type | Mapping | Description |
---|---|---|---|
* | Commons Media | Grid Item | Image columns will create a grid item. |
Rows containing no image will be ignored.
Options:
Option name | Type | Description |
---|---|---|
hide |
One or more variable names (strings, starting with ? ), single value or array |
Don’t show these variables in the result. |
#Even more cats, with pictures
#added before 2016-10
#defaultView:ImageGrid
SELECT ?itemDescription ?itemLabel ?item ?pic
WHERE
{
?item wdt:P31 wd:Q146 .
OPTIONAL {
?item wdt:P18 ?pic
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
|
Map
[edit | edit source]The #defaultView:Map
shows the result data as map with OpenStreetMap data.
Variable | Data Type | Mapping | Description |
---|---|---|---|
* | Coordinate | Map Marker | Columns will create a marker on the map. Content of that marker is the formatted row data. |
?layer | Map Marker Layer | Creates a colored map layer that allows filtering. | |
?rgb | Color | Color Map Marker | Example 000000 for black color. |
Options:
specified using #defaultView:Map{"hide":["?foo", "?bar"],"layer":"?bar"}
using the array if there are multiple variables.
Option name | Type | Description |
---|---|---|
hide |
One or more variable names (strings, starting with ? ), single value or array |
Don’t show these variables in the result. |
layer |
One or more variable names (strings, starting with ? ), single value or array |
Use these variables for the map layer, instead of the default ?layerLabel , ?layer .
|
markercluster |
boolean or object |
|
#Big cities, grouped into map layers by population
#defaultView:Map
SELECT DISTINCT ?city ?cityLabel (SAMPLE(?location) AS ?location) (MAX(?population) AS ?population) (SAMPLE(?layer) AS ?layer)
WHERE
{
?city wdt:P31/wdt:P279* wd:Q515;
wdt:P625 ?location;
wdt:P1082 ?population.
FILTER(?population >= 500000).
BIND(
IF(?population < 1000000, "<1M",
IF(?population < 2000000, "1M-2M",
IF(?population < 5000000, "2M-5M",
IF(?population < 10000000, "5M-10M",
IF(?population < 20000000, "10M-20M",
">20M")))))
AS ?layer).
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?city ?cityLabel
ORDER BY ?population
|
#Airports in Belgium
#added before 2016-10
#List of airports in Belgium
#defaultView:Map
SELECT DISTINCT ?airport ?airportLabel ?coor ?range
WHERE
{
?airport wdt:P31 wd:Q1248784 ;
?range wd:Q31;
wdt:P625 ?coor.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
|
Technically, any well-known text literal can be displayed, not just a point coordinate.
Some Geoshapes are available via property geoshape (P3896). See this example for the countries.
#Geoshapes of counties
#defaultView:Map
SELECT DISTINCT ?country ?countryLabel ?geoshape (?countryLabel AS ?layer )
WHERE
{
?country wdt:P31 wd:Q3624078.
?country wdt:P3896 ?geoshape.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Geometry values can also be constructed in SPARQL using functions like CONCAT()
and STRDT()
– see this map of Czech railway lines for an example of LINESTRING()
geometry.
#defaultView:Map
SELECT ?line ?coord1 ?coord2 ?comm1 ?comm1Label ?comm2 ?comm2Label WHERE {
?comm1 p:P31/ps:P31/wdt:P279* wd:Q55488 ; wdt:P17 wd:Q213 ; wdt:P625 ?coord1 ; wdt:P197 ?comm2 .
MINUS { ?comm1 p:P31 [ ps:P31 wd:Q55488 ; pq:P582 [] ] } .
MINUS { ?comm2 p:P31 [ ps:P31 wd:Q55488 ; pq:P582 [] ] } .
MINUS { ?comm1 p:P31 [ ps:P31 wd:Q928830 ] } .
MINUS { ?comm2 p:P31 [ ps:P31 wd:Q928830 ] } .
?comm2 wdt:P625 ?coord2 .
?comm1 p:P625 [ ps:P625 []; psv:P625 [ wikibase:geoLongitude ?coord1lon; wikibase:geoLatitude ?coord1lat; ] ] .
?comm2 p:P625 [ ps:P625 []; psv:P625 [ wikibase:geoLongitude ?coord2lon; wikibase:geoLatitude ?coord2lat; ] ] .
BIND(CONCAT('LINESTRING (', STR(?coord1lon), ' ', STR(?coord1lat), ',', STR(?coord2lon), ' ', STR(?coord2lat), ')') AS ?str) .
BIND(STRDT(?str, geo:wktLiteral) AS ?line) .
service wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
Line, Area, Bar and Scatter Chart
[edit | edit source]#defaultView:LineChart
, #defaultView:AreaChart
, #defaultView:BarChart
and #defaultView:ScatterChart
shows one of these chart types.
Variable | Data Type | Mapping | Description |
---|---|---|---|
* | Number Label DateTime |
X-Axis Y-Axis |
First will be X-axis, second Y-axis |
* | Label | Category / Legend Item (optional) | Will only be used after the above are mapped |
* | Label | Storyboard / Animation (optional) | Will only be used after the above are mapped |
LineChart
[edit | edit source]
#Number of paintings per decade
#defaultView:LineChart
SELECT ?decade (COUNT(?decade) AS ?Paintings)
WHERE {
?item wdt:P31 wd:Q3305213 . # instance of painting
?item wdt:P571 ?inception .
BIND( year(?inception) as ?year ).
BIND( ROUND(?year/10)*10 as ?decade ) .
FILTER( ?year > 1400)
} GROUP BY ?decade
ORDER BY ?decade
|
#Population of countries sharing a border with Germany
#defaultView:LineChart
SELECT ?country ?year ?population ?countryLabel WHERE {
{
SELECT ?country ?year (AVG(?population) AS ?population) WHERE {
{
SELECT ?country (str(YEAR(?date)) AS ?year) ?population WHERE {
?country wdt:P47 wd:Q183.
?country p:P1082 ?populationStatement.
?populationStatement ps:P1082 ?population.
?populationStatement pq:P585 ?date.
}
}
}
GROUP BY ?country ?year
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
|
AreaChart
[edit | edit source]
#Cancer deaths per year and cancer type
#defaultView:AreaChart
SELECT ?cid (str(SAMPLE(?year)) AS ?year ) (COUNT(*) AS ?count) ?cause WHERE {
?pid wdt:P31 wd:Q5.
?pid wdt:P509 ?cid.
?cid wdt:P279* wd:Q12078.
OPTIONAL {
?cid rdfs:label ?cause.
FILTER((LANG(?cause)) = "en")
}
?pid wdt:P570 ?_date_of_death.
BIND ( YEAR(?_date_of_death) AS ?year )
FILTER( BOUND(?cause) )
FILTER( BOUND(?year) )
FILTER( ?year > 1960 )
}
GROUP BY ?cid ?cause ?year
|
BarChart
[edit | edit source]
#Battles per year per continent and country last 80 years (animated)
#defaultView:BarChart
SELECT (SAMPLE(?_continentLabel) AS ?contient) (COUNT(?_country) AS ?battles) (SAMPLE(?_countryLabel) AS ?_countryLabel) (SAMPLE(?year) AS ?year) WHERE {
?subj wdt:P31 wd:Q178561.
OPTIONAL { ?subj wdt:P580 ?d1. }
OPTIONAL { ?subj wdt:P585 ?d2. }
OPTIONAL { ?subj wdt:P582 ?d3. }
BIND(IF(!BOUND(?d1), IF(!BOUND(?d2), ?d3, ?d2), ?d1) AS ?date)
BIND(str(YEAR(?date)) AS ?year)
?subj wdt:P276 ?_loc.
?_loc wdt:P17 ?_country.
?_country wdt:P30 ?_continent.
?_country rdfs:label ?_countryLabel.
?_continent rdfs:label ?_continentLabel.
BIND((NOW()) - ?date AS ?distance)
FILTER(BOUND(?year))
FILTER((LANG(?_countryLabel)) = "en")
FILTER((LANG(?_continentLabel)) = "en")
FILTER((0 <= ?distance) && (?distance < 31 * 12 * 80))
}
GROUP BY ?year ?_country
ORDER BY ?_countryLabel
|
ScatterChart
[edit | edit source]
# Scatter Chart river watershed area in km² plotted over length in km
PREFIX wikibaseβ: <http://wikiba.se/ontology-beta#>
#defaultView:ScatterChart
# river watershed area in km² plotted over length in km, using new normalized units support
# (rivers shorter than 500 km excluded to avoid crashing the browser when trying to render too many results)
SELECT ?river (SAMPLE(?length) AS ?length) (SAMPLE(?area) AS ?area) ?riverLabel WHERE {
?river wdt:P31/wdt:P279* wd:Q355304;
p:P2043/psn:P2043/(wikibase:quantityAmount|wikibaseβ:quantityAmount) ?lengthInMetres;
p:P2053/psn:P2053/(wikibase:quantityAmount|wikibaseβ:quantityAmount) ?areaInSquareMetres.
BIND(?lengthInMetres / 1000 AS ?length).
BIND(?areaInSquareMetres / 1000000 AS ?area).
FILTER(?length > 500).
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?river ?riverLabel
|
#Number of films by year and genre
#defaultView:ScatterChart
SELECT ?year (COUNT(?_genre) AS ?count ) (SAMPLE(?_genreLabel) AS ?label ) (?year as ?year_shown) WHERE {
?item wdt:P31 wd:Q11424.
?item wdt:P577 ?_publication_date.
?item wdt:P136 ?_genre.
?_genre rdfs:label ?_genreLabel.
BIND(str(YEAR(?_publication_date)) AS ?year)
FILTER((LANG(?_genreLabel)) = "en")
FILTER (?_publication_date >= "2000-00-00T00:00:00Z"^^xsd:dateTime)
}
GROUP BY ?_genreLabel ?year
HAVING (?count > 30)
|
Bubble Chart
[edit | edit source]The #defaultView:BubbleChart
shows a bubble chart where the size and the color of the bubble can configured.
When an item is provided a bubble is linked to Wikidata.org.
Variable | Data Type | Mapping |
---|---|---|
* | Label | Bubble Label |
* | Number | Bubble Size |
* | Item | Bubble Link (optional) |
?rgb | Color | Bubble Color (optional) |
#Overall causes of death ranking
#added before 2016-10
#defaultView:BubbleChart
SELECT ?cid ?cause (count(*) as ?count)
WHERE
{
?pid wdt:P31 wd:Q5 .
?pid wdt:P509 ?cid .
OPTIONAL {
?cid rdfs:label ?cause
filter (lang(?cause) = "en")
}
}
GROUP BY ?cid ?cause
ORDER BY DESC(?count) ASC(?cause)
|
Tree Map
[edit | edit source]#defaultView:TreeMap
shows a zoomable tree map.
Variable | Data Type | Mapping | Description |
---|---|---|---|
* | Label | Label | 1st label in a row will be 1st level in tree map. 2nd label will be 2nd level. 3rd label will be 3rd level etc.. |
* | Item | Link (optional) |
#Popular television series (in treeview)
#added before 2016-10
#defaultView:TreeMap
SELECT ?show ?showLabel ?season ?seasonLabel ?episode ?episodeLabel
WHERE
{
{ BIND(wd:Q886 as ?show) . # The Simpsons
?season wdt:P361 ?show .
?episode wdt:P361 ?season .
} UNION
{ BIND(wd:Q16538 as ?show) . # South Park
?season wdt:P361 ?show .
?episode wdt:P361 ?season .
} UNION
{ BIND(wd:Q147235 as ?show) . # How I Met Your Mother
?season wdt:P361 ?show .
?episode wdt:P361 ?season .
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
|
Tree
[edit | edit source]#defaultView:Tree
shows an expandable tree.
Variable | Data Type | Mapping | Description |
---|---|---|---|
* | Item | Node ID | First item will be a root node The following items in the same row will be nested. |
* | Label | Node Label (optional) | Labels will be appended to current node. |
* | Number | Node Label (optional) | |
* | Commons Media | Item image |
#defaultView:Tree
SELECT ?continent ?continentFlag ?continentLabel ?country ?countryLabel ?countryFlag ?region ?regionLabel ?regionFlag ?city ?cityLabel ( 'Population:' as ?popLabel ) ?pop ?cityImage
WHERE
{
{
SELECT * WHERE {
?continent wdt:P31 wd:Q5107.
?country wdt:P30 ?continent.
?country wdt:P31 wd:Q6256.
?country wdt:P150 ?region.
OPTIONAL {
?continent wdt:P242 ?continentFlag.
?country wdt:P41 ?countryFlag.
?region wdt:P41 ?regionFlag.
}
OPTIONAL {
?region wdt:P36 ?city.
?city wdt:P31 wd:Q515.
?city wdt:P1082 ?pop.
?city wdt:P18 ?cityImage.
}
}
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
|
Timeline
[edit | edit source]#defaultView:Timeline
shows an explorable timeline.
Variable | Data Type | Mapping | Description |
---|---|---|---|
* | DateTime | Item Point in Time Item End Point in Time |
First date in a row sets the item position in the timeline. If a second date is present a start and end date will be set. |
* | Commons Media | Item image |
Options:
Option name | Type | Description |
---|---|---|
hide |
One or more variable names (strings, starting with ? ), single value or array |
Don’t show these variables in the result. |
#defaultView:Timeline
SELECT ?item ?itemLabel ?launchdate (SAMPLE(?image) AS ?image)
WHERE
{
?item wdt:P31 wd:Q26529 . # Space probe
?item wdt:P619 ?launchdate .
OPTIONAL { ?item wdt:P18 ?image }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?item ?itemLabel ?launchdate
|
Dimensions
[edit | edit source]#defaultView:Dimensions
shows dimension and their links.
It also allows filtering them by selecting an axis fraction.
Variable | Data Type | Mapping | Description |
---|---|---|---|
* | Label | Dimension | Every column in the result row is mapped to a dimension |
* | Number | ||
* | DateTime |
#defaultView:Dimensions
SELECT ?elementLabel ?_boiling_point ?_melting_point ?_electronegativity ?_density ?_mass WHERE {
?element wdt:P31 wd:Q11344.
?element wdt:P2102 ?_boiling_point.
?element wdt:P2101 ?_melting_point.
?element wdt:P1108 ?_electronegativity.
?element wdt:P2054 ?_density.
?element wdt:P2067 ?_mass.
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
|
Graph
[edit | edit source]Shows an interactive graph that allow hierarchically ordering and exploring items.
Variable | Data Type | Mapping | Description |
---|---|---|---|
* | Item | Node ID | First item in a row will have an arrow pointing to the items in the same row |
* | Label | Node Label (optional) | |
* | Commons Media | Node Image (optional) | |
?rgb | Color | Node Color (optional) | |
* | Number | Node Size (optional) |
#US presidents and spouses
#defaultView:Graph
SELECT ?p ?pLabel ?ppicture ?w ?wLabel ?wpicture
WHERE
{
wd:Q30 p:P6/ps:P6 ?p .
?p wdt:P26 ?w .
OPTIONAL{
?p wdt:P18 ?ppicture .
?w wdt:P18 ?wpicture .
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
|
#Children of Genghis Khan
#added before 2016-10
#defaultView:Graph
PREFIX gas: <http://www.bigdata.com/rdf/gas#>
SELECT ?item ?itemLabel ?pic ?linkTo ?linkToLabel
WHERE
{
SERVICE gas:service {
gas:program gas:gasClass "com.bigdata.rdf.graph.analytics.SSSP" ;
gas:in wd:Q720 ;
gas:traversalDirection "Forward" ;
gas:out ?item ;
gas:out1 ?depth ;
gas:maxIterations 4 ;
gas:linkType wdt:P40 .
}
OPTIONAL { ?item wdt:P40 ?linkTo }
OPTIONAL { ?item wdt:P18 ?pic }
SERVICE wikibase:label {bd:serviceParam wikibase:language "en,ru,ja,zh,az,ca" }
}
|
Additional Information
[edit | edit source]RDF Data Type Mapping
[edit | edit source]RDF Type | RDF Data Type | Constraint | Data Type |
---|---|---|---|
URI | String starts with http://www.wikidata.org/entity/Q |
Entity | |
URI | String starts with http://commons.wikimedia.org/wiki/Special:FilePath |
Commons Media | |
Literal | http://www.w3.org/2001/XMLSchema#dateTime | DateTime | |
Literal | http://www.opengis.net/ont/geosparql#wktLiteral | Coordinate | |
Literal |
http://www.w3.org/2001/XMLSchema#double |
Number | |
Literal | Has no RDF Data Type | Label | |
Literal | Has property xml:lang |
Label | |
Literal | String matches /^[0-9A-F]{6}$/ |
Color |
Formatting
[edit | edit source]There is special formatting for the following data types:
Data Type | Format |
---|---|
Entity | Link to Wikidata.org Label is shortened to wd:Qxx Explore an item (magnify icon) |
DateTime | MM DD, YYYY Tooltip shows raw ISO timestamp |
Commons Media | Link to Commons.Wikimedia.org Displaying Gallery ( image icon ) |
Mobile / Responsive
[edit | edit source]Result views are made to work on different devices and resolutions. Their appearance may change depending on the space they can use.
Embed Mode
[edit | edit source]Result views can be linked or embedded via iframes. You can get the regarding link or code by clicking Link in the above menu. See the difference: link to a query and link to a result
or the embed code:
<iframe style="width:80vw; height:50vh;" scrolling="yes" frameborder="0" src="http://tinyurl.com/zwf4k2w">
Download results
[edit | edit source]There are several ways to export the results of a query.
JSON, TSV and CSV are three different open-standard formats to download the data.
You can also download some of the graphic views as a SVG file: this feature works with all the views except table, image grid, timeline, graph builder, map, and graph.
Default View
[edit | edit source]The default view can be changed for each SPARQL query by using a comment:
#defaultView:[NAME]
For example:
#defaultView:ImageGrid
Result view options
[edit | edit source]Result views can be configured with various options, which are specified in JSON (Q2063) format, either following the defaultView
comment or following a separate view
comment.
#defaultView:Map{"layer": "?typeLabel"}
#view:ImageGrid{"hide": "?coords"}
External tools
[edit | edit source]See Wikidata:Tools/Visualize data for a list of external tools to visualize data.
References
[edit | edit source]
FAQ
The most popular questions-and-answers can be obtained through external links:
- ↑ You can see, by example, that these 2 queries don't give exactly the same values. Some values are missing in the second one.
Try it!
#Humans with height > 2,25 m SELECT ?taillem ?item WHERE {?item wdt:P31 wd:Q5 ; p:P2048 [psv:P2048 ?t ] . ?t wikibase:quantityAmount ?taille . ?t wikibase:quantityUnit/p:P2370/psv:P2370 [wikibase:quantityAmount ?conversion ; wikibase:quantityUnit wd:Q11573] . BIND(?taille * ?conversion AS ?taillem). filter(?taillem > 2.25) } order by desc (?taillem)
Try it!#Humans with height > 2,25 m SELECT ?taille ?item WHERE {?item wdt:P31 wd:Q5 ; p:P2048/psn:P2048 [wikibase:quantityAmount ?taille ]. filter(?taille > 2.25) } order by desc (?taille)