I'm trying to retrieve ALL movie titles, with their aliases. I'm using queries like these (with increasing OFFSET) and at first it seems to work:
SELECT ?itemLabel ?itemAltLabel WHERE {
  ?item wdt:P31 wd:Q11424.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 1000
OFFSET 0
While it retrieves a lot of valid movie titles, some are missing, even though I can find them on the Wikidata site. I'm new to SPARQL and haven't managed to make the following changes to the query:
- For debugging, I want to filter by itemLabel, something like `?itemLabel = 'fight club'`. I tried different options but none worked. Can you help me build such a query?
- I want to exclude movies that have no itemLabel. These currently return their ID as itemLabel, e.g. "Q12345". How do I add something like `?itemLabel != ""`? Or should it be `?itemLabel NOT LIKE 'Q[0-9]+'` somehow?
- Sorting: I wonder if the missing titles might be due to not adding any ordering. I'm just running queries with LIMIT 1000 and incrementing the OFFSET by 1000 until there are no results. Could the sorting change between queries? If so, should I just add `ORDER BY ?refName`?
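
To make the paging scheme from the last point concrete, here is a minimal Python sketch of it. It is an illustration under assumptions, not the actual client code: `run_query` is a hypothetical callable that sends a SPARQL string to the endpoint and returns `(itemLabel, itemAltLabel)` rows, `ORDER BY ?item` is only a guess at a stable sort key (it is not confirmed to fix the missing titles, and sorting the full result set may be slow), and `has_real_label` is a client-side stand-in for the "Q12345" exclusion rather than a SPARQL filter.

```python
import re
from typing import Callable, Iterator, List, Tuple

Row = Tuple[str, str]  # (itemLabel, itemAltLabel)

# Same query as above, plus an ASSUMED stable ordering on ?item.
QUERY_TEMPLATE = """SELECT ?item ?itemLabel ?itemAltLabel WHERE {{
  ?item wdt:P31 wd:Q11424.
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY ?item
LIMIT {limit}
OFFSET {offset}"""

# Labels that are just an entity ID, e.g. "Q12345".
QID_PATTERN = re.compile(r"Q[0-9]+")


def build_page_query(offset: int, limit: int = 1000) -> str:
    """Return the SPARQL text for one page of results."""
    return QUERY_TEMPLATE.format(limit=limit, offset=offset)


def has_real_label(label: str) -> bool:
    """True unless the label is empty or only an entity ID."""
    return bool(label) and QID_PATTERN.fullmatch(label) is None


def fetch_all(run_query: Callable[[str], List[Row]],
              limit: int = 1000) -> Iterator[Row]:
    """Page through results until an empty page comes back.

    run_query is a HYPOTHETICAL callable supplied by the caller.
    """
    offset = 0
    while True:
        rows = run_query(build_page_query(offset, limit))
        if not rows:
            break
        for row in rows:
            if has_real_label(row[0]):
                yield row
        offset += limit
```

With some `run_query` implementation in hand, `list(fetch_all(run_query))` would collect every page. The open question remains whether paging without an explicit order can skip or repeat rows between requests.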
I could be making some stupid syntax mistakes, so please provide full working queries if you can. If there's anything else you think might prevent me from getting ALL the available titles, let me know.
I'm running the queries here: https://query.wikidata.org/