SPARQL/Triples

From Wikibooks, open books for an open world

Introduction

The statement "The sky has the color blue" consists of a subject ("the sky"), a predicate ("has the color"), and an object ("blue").

This "subject, predicate, object" (SPO) structure is known as a (semantic) triple; in Wikidata it is commonly referred to as a statement about data.

SPO is also the basic syntax layout for querying RDF data in a graph database or triplestore, such as the one behind the Wikidata Query Service (WDQS).

See also Semantic triple on the English Wikipedia.

In the Wikidata Query Service (WDQS), triples describe the query pattern in the WHERE clause of the SELECT statement:

# ?child  father   Bach
  ?child wdt:P22 wd:Q1339.

In this case the triple ?child wdt:P22 wd:Q1339 specifies that the variable ?child must have Bach as father.

Any of the three parts of a triple (subject, predicate, object) may be a variable, which makes this kind of selection very versatile.

Triples with the same subject

Example of SPARQL Triples

Additional variables can be introduced by adding more triples. In the simplest case these triples use the same subject.

SELECT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
  ?child wdt:P22 wd:Q1339.# ?child  has father   Bach
  ?child wdt:P21 ?gender.
  ?child wdt:P569 ?birth_date.
  ?child wdt:P570 ?date_of_death.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

The first triple selects all the children of Bach. The additional triples link each of these children to a value for gender, birth date and date of death. The variable ?child ties all of the triples together.
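
Because every triple here shares the subject ?child, the pattern can also be written with the semicolon shorthand of SPARQL, which reuses the subject of the preceding triple. The following is a sketch of the same query in that form; it should be equivalent to the query above.

```sparql
SELECT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
  ?child wdt:P22 wd:Q1339;          # ?child has father Bach
         wdt:P21 ?gender;           # ... and a gender
         wdt:P569 ?birth_date;      # ... and a date of birth
         wdt:P570 ?date_of_death.   # ... and a date of death
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```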

If you look closely at the result, you may notice that Johann Christoph Friedrich Bach appears on 2 lines, because 2 different birth dates are recorded for him: 21 and 23 June 1732. In his case the triple ?child wdt:P569 ?birth_date. produced 2 values. For further details, see removing duplicates and modifiers.
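
One way to collapse such duplicate rows is to group by child and aggregate the birth dates. The sketch below assumes that showing only the earliest recorded date is acceptable; the query does not decide which date is correct, and the name ?first_birth_date is chosen here for illustration.

```sparql
SELECT ?child ?childLabel (MIN(?birth_date) AS ?first_birth_date)
WHERE
{
  ?child wdt:P22 wd:Q1339.        # ?child has father Bach
  ?child wdt:P569 ?birth_date.    # may match more than once per child
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?child ?childLabel       # one result row per child
```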

OPTIONAL triples

If a subject has no value for one of the triples, that subject is excluded from the result. To include it anyway, wrap the triple in the OPTIONAL keyword.

SELECT DISTINCT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
  ?child wdt:P22 wd:Q76.# ?child  has father   Obama
  OPTIONAL{ ?child wdt:P21 ?gender. }
  OPTIONAL{ ?child wdt:P569 ?birth_date. }
  OPTIONAL{ ?child wdt:P570 ?date_of_death. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

Both children are shown, even if one of the variables (in this case the date of death) is not filled in.
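
Each triple above has its own OPTIONAL block for a reason. As a contrasting sketch, the variant below places all three triples in a single OPTIONAL block. Such a block matches as a whole or not at all, so a child without a date of death would also lose its gender and birth date in the result.

```sparql
SELECT DISTINCT ?child ?childLabel ?genderLabel ?birth_date ?date_of_death
WHERE
{
  ?child wdt:P22 wd:Q76.            # ?child has father Obama
  OPTIONAL {                        # all three triples must match together
    ?child wdt:P21 ?gender.
    ?child wdt:P569 ?birth_date.
    ?child wdt:P570 ?date_of_death.
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```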

See the chapter OPTIONAL for a full description.

Complex triples

Triples are not limited to one subject. In fact, triples can be linked in any conceivable way: the object of one triple can serve as the subject of the next.

For instance, you can list the coordinates of the birth places of the children of Bach:

SELECT ?child ?childLabel ?placeofbirthLabel ?coordinates
WHERE
{
  ?child wdt:P22 wd:Q1339.# ?child  has father   Bach
  ?child wdt:P19 ?placeofbirth.
  ?placeofbirth wdt:P625 ?coordinates. 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?placeofbirthLabel
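
When the intermediate variable is not needed in the result, the two linked triples can be collapsed into one with a SPARQL 1.1 property path. A minimal sketch, which drops ?placeofbirth and therefore also the place label:

```sparql
SELECT ?child ?childLabel ?coordinates
WHERE
{
  ?child wdt:P22 wd:Q1339.                # ?child has father Bach
  ?child wdt:P19/wdt:P625 ?coordinates.   # place of birth, then its coordinate location
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```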

You can even show these birthplaces (Köthen, Leipzig and Weimar) on a map by using #defaultView:Map:

#defaultView:Map
SELECT ?placeofbirthLabel ?coordinates
       (GROUP_CONCAT(DISTINCT ?childLabel; SEPARATOR=", ") AS ?children)
WHERE
{
  ?child wdt:P22 wd:Q1339.# ?child  has father   Bach
  ?child wdt:P19 ?placeofbirth.
  ?placeofbirth wdt:P625 ?coordinates. 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". 
                         ?child        rdfs:label ?childLabel.
                         ?placeofbirth rdfs:label ?placeofbirthLabel.
                         }
}
GROUP BY ?placeofbirthLabel ?coordinates

If you click on a red dot you will get the additional data selected above in the variables ?placeofbirthLabel and ?children. This query needs GROUP BY and GROUP_CONCAT with DISTINCT, and every label it uses has to be defined explicitly in the SERVICE block. You can toggle between the map display and the standard table display with the Display drop-down list, to the right of the Run button.

See more about views at Map views or all views.

Triples by number of variables

Triples with one variable

An example of a triple with one variable for Subject would be:

SELECT ?child ?childLabel
WHERE
{
  ?child wdt:P22 wd:Q1339.         # ?child  has father   Bach
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

This will list all Subjects (as variable ?child) with Predicate father (P22) and Object Johann Sebastian Bach (Q1339).

An example of a triple with one variable for Predicate would be:

SELECT ?predicate ?pLabel
WHERE
{
  wd:Q57225 ?predicate wd:Q1339.         # Johann Christoph Friedrich Bach ?predicate Johann Sebastian Bach

  BIND( IRI(REPLACE( STR(?predicate),"prop/direct/","entity/" )) AS ?p). 
  # or ?p wikibase:directClaim ?predicate. 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

This will list all Predicates (as variable ?predicate) with Subject Johann Christoph Friedrich Bach (Q57225) and Object Johann Sebastian Bach (Q1339).
It shows that Johann Sebastian Bach is not only his father (P22) but also his teacher (student of, P1066).
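
The commented-out alternative in the query above can replace the BIND line. As a sketch: wikibase:directClaim links a property entity (such as wd:P22) to its direct predicate (wdt:P22), so ?p is bound to the property itself and the label service can supply ?pLabel. Note that this variant only keeps predicates that are direct claims.

```sparql
SELECT ?predicate ?pLabel
WHERE
{
  wd:Q57225 ?predicate wd:Q1339.        # Johann Christoph Friedrich Bach ?predicate Johann Sebastian Bach
  ?p wikibase:directClaim ?predicate.   # ?p is the property entity behind ?predicate
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```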

An example of a triple with one variable for Object would be:

SELECT ?workloc ?worklocLabel
WHERE
{
  wd:Q1339 wdt:P937 ?workloc.         # Bach  work location  ?workloc
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

This will list all Objects (as variable ?workloc) with Subject Johann Sebastian Bach (Q1339) and Predicate work location (P937).

Triples with two variables

An example of a triple with 2 variables and a fixed value only for Subject lists all raw information available in Wikidata about Bach:

SELECT ?predicate ?object
WHERE
{
  wd:Q1339 ?predicate ?object.         # Bach
}

See the section on triples with three variables for further usage.
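
The raw listing mixes direct statements with labels, descriptions and other kinds of triples. A hedged sketch for keeping only the direct (wdt:) statements is to filter on the predicate IRI:

```sparql
SELECT ?predicate ?object
WHERE
{
  wd:Q1339 ?predicate ?object.                    # Bach
  FILTER(STRSTARTS(STR(?predicate), STR(wdt:)))   # keep only predicates in the wdt: namespace
}
```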

An example of a triple with 2 variables and a fixed value only for Predicate lists all subjects (probably airports) with an IATA airport code:

SELECT ?subject ?subjectLabel ?object
WHERE
{
  ?subject wdt:P238 ?object.         # IATA airport code
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?object

One use is to check for duplicate IATA codes:

SELECT ?object (COUNT(?subject) AS ?count)
               (MIN(?subject) AS ?subject1) (MAX(?subject) AS ?subject2)
               (GROUP_CONCAT(DISTINCT ?subjectLabel; SEPARATOR=", ") AS ?subjectLabels)
WHERE
{
  ?subject wdt:P238 ?object.         # IATA airport code
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". 
                         ?subject rdfs:label ?subjectLabel.
                         }
}
GROUP BY ?object
HAVING(COUNT(?subject) > 1)
ORDER BY ?object

An example of a triple with 2 variables and a fixed value only for Object lists all subjects related to Bach:

SELECT ?subject ?subjectLabel ?subjectDescription ?predicate ?pLabel
WHERE
{
  ?subject ?predicate wd:Q1339.  # Bach

  BIND( IRI(REPLACE( STR(?predicate),"prop/direct/","entity/" )) AS ?p). 
  # or ?p wikibase:directClaim ?predicate. 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?subject

Another use of a triple with a fixed value for Object is to list all subjects that have the value "ABC"; this will show, for instance, Albacete Airport:

SELECT ?subject ?subjectLabel ?subjectDescription ?predicate ?pLabel
WHERE
{
  ?subject ?predicate "ABC". 

  BIND( IRI(REPLACE( STR(?predicate),"prop/direct/","entity/" )) AS ?p). 
  # or ?p wikibase:directClaim ?predicate. 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?subject
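
The literal "ABC" in the query above is a plain string, which is a different value from a language-tagged literal. Labels and aliases in Wikidata carry a language tag, so a sketch that looks for subjects named "ABC" in English would use "ABC"@en instead. The LIMIT is only a precaution added for this illustration.

```sparql
SELECT ?subject ?predicate
WHERE
{
  ?subject ?predicate "ABC"@en.   # language-tagged literal, for example an English label or alias
}
LIMIT 100
```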

Triples with three variables

If you use a triple with all three parts as variables (one for Subject, one for Predicate and one for Object), you essentially list the whole database. This is feasible for small databases, and it can also be used to get a rough idea of the available data and properties. On Wikidata the pattern is combined with a second triple that restricts the subjects, as in the examples in this section.

All raw information available in Wikidata about the children of Bach:

SELECT ?subject ?predicate ?object 
WHERE
{
  ?subject ?predicate ?object.
  ?subject wdt:P22 wd:Q1339.		# subject has father   Bach
}
ORDER BY ?subject ?predicate ?object
LIMIT 10000

The same query but grouped by predicate:

SELECT DISTINCT ?subject ?subjectLabel ?predicate 
       (GROUP_CONCAT(DISTINCT ?object; SEPARATOR=", ") AS ?objects)
WHERE
{
  ?subject ?predicate ?object.
  ?subject wdt:P22 wd:Q1339.		# subject has father   Bach
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?subject ?subjectLabel ?predicate
ORDER BY ?subject ?subjectLabel ?predicate
LIMIT 10000
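
To get a rough overview of the available properties without reading every row, the triples can also be counted per predicate. A minimal sketch, using the same restriction to the children of Bach:

```sparql
SELECT ?predicate (COUNT(?object) AS ?count)
WHERE
{
  ?subject wdt:P22 wd:Q1339.     # subject has father Bach
  ?subject ?predicate ?object.   # every triple about that subject
}
GROUP BY ?predicate
ORDER BY DESC(?count)
```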

The raw listings above also reveal triples about the date the Wikidata page was last updated, the total number of statements, the number of sitelinks and so on. These are schema:dateModified, wikibase:statements and wikibase:sitelinks respectively. The query below selects them for each child of Bach.

SELECT ?subject ?subjectLabel ?datemodified ?statements ?sitelinks 
WHERE
{
  ?subject wdt:P22 wd:Q1339.		# subject has father   Bach
  ?subject schema:dateModified ?datemodified.
  ?subject wikibase:statements ?statements.
  ?subject wikibase:sitelinks  ?sitelinks.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
