SPARQL/FILTER

From Wikibooks, open books for an open world
Jump to navigation Jump to search

FILTER(condition) is a clause you can insert into your SPARQL query to, well, filter the results. Inside the parentheses, you can put any expression of boolean type, and only those results where the expression returns true are used.

FILTER on values[edit | edit source]

For example, to get a list of all humans born in 2015, we first get all humans with their date of birth –

SELECT ?person ?personLabel ?dob
WHERE
{
  ?person wdt:P31 wd:Q5;
          wdt:P569 ?dob.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } 
}

– and then filter that to only return the results where the year of the date of birth is 2015. There are two ways to do that: extract the year of the date with the YEAR function, and test that it’s 2015 –

FILTER(YEAR(?dob) = 2015).

– or check that the date is between Jan. 1st (inclusive), 2015 and Jan. 1st, 2016 (exclusive):

FILTER("2015-01-01"^^xsd:dateTime <= ?dob && ?dob < "2016-01-01"^^xsd:dateTime).

I’d say that the first one is more straightforward, but it turns out the second one is much faster, so let’s use that:

SELECT ?person ?personLabel ?dob
WHERE
{
  ?person wdt:P31 wd:Q5;
          wdt:P569 ?dob.
  FILTER("2015-01-01"^^xsd:dateTime <= ?dob && ?dob < "2016-01-01"^^xsd:dateTime).
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } 
}

Try it!

FILTER on values in Labels[edit | edit source]

Another possible use of FILTER is related to labels. The label service is very useful if you just want to display the label of a variable. But if you want to do stuff with the label – for example: check if it starts with “Mr. ” – you’ll find that it doesn’t work:

SELECT ?human ?humanLabel
WHERE
{
  ?human wdt:P31 wd:Q15632617.                # fictional human
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  FILTER(STRSTARTS(?humanLabel, "Mr. ")).     # This does not work
}

Try it!

This query finds all instances of Q15632617 and tests if their label starts with "Mr. " (STRSTARTS is short for “string starts [with]”; there’s also STRENDS and CONTAINS). The reason why this doesn’t work is that the label service adds its variables very late during query evaluation; at the point where we try to filter on ?humanLabel, the label service hasn’t created that variable yet.

Fortunately, the label service isn’t the only way to get an item’s label. Labels are also stored as regular triples, using the predicate rdfs:label. Of course, this means all labels, not just English ones; if we only want English labels, we’ll have to filter on the language of the label:

FILTER(LANG(?label) = "en").

The LANG function returns the language of a monolingual string, and here we only select those labels that are in English. The full query is:

SELECT ?human ?humanLabel
WHERE
{
  ?human wdt:P31 wd:Q15632617;  # fictional human
         rdfs:label ?humanLabel.
  FILTER(LANG(?humanLabel) = "en").
  FILTER(STRSTARTS(?humanLabel, "Mr. ")).
}

Try it!

We get the label with the ?human rdfs:label ?label triple, restrict it to English labels, and then check if it starts with “Mr. ”.

FILTER NOT EXISTS[edit | edit source]

Let's add images to these fictional humans

SELECT ?human ?humanLabel ?image
WHERE
{
  ?human wdt:P31 wd:Q15632617;  # fictional human
         rdfs:label ?humanLabel.
  FILTER(LANG(?humanLabel) = "en").
  FILTER(STRSTARTS(?humanLabel, "Mr. ")).
  OPTIONAL{ ?human wdt:P18 ?image. }
}

Try it!

The code OPTIONAL{ ?human wdt:P18 ?image. } shows all the fictional humans that start with “Mr. ” and shows an image if available.
To select only fictional humans starting with “Mr. ” with an image the code ?human wdt:P18 ?image. can be used.
To select those fictional humans without an image the code FILTER NOT EXISTS{ ?human wdt:P18 ?image. } should be used

SELECT ?human ?humanLabel ?image
WHERE
{
  ?human wdt:P31 wd:Q15632617;  # fictional human
         rdfs:label ?humanLabel.
  FILTER(LANG(?humanLabel) = "en").
  FILTER(STRSTARTS(?humanLabel, "Mr. ")).
  FILTER NOT EXISTS{ ?human wdt:P18 ?image. }  # without images
}

Try it!

MINUS[edit | edit source]

A similar way of negation provided in SPARQL is MINUS which evaluates both its arguments, then calculates solutions in the left-hand side that are not compatible with the solutions on the right-hand side.

SELECT ?human ?humanLabel ?image
WHERE
{
  ?human wdt:P31 wd:Q15632617;  # fictional human
         rdfs:label ?humanLabel.
  FILTER(LANG(?humanLabel) = "en").
  FILTER(STRSTARTS(?humanLabel, "Mr. ")).
  MINUS{ ?human wdt:P18 ?image. }  # without images
}

Try it!

References[edit | edit source]