I could not find any help for a problem I run into when trying to get a list of all basketball players from Wikidata. First I get the number of players (it is somewhere around 130k). Then I create a query with a specific offset and a limit of 2000. The problem is that I get the same 2000 players every time, whatever the offset is.
(However, when I run the query at https://query.wikidata.org/, the results are different for each offset.)
Here is the part of my Python code where the query is created:
while(numberOfPlayers > 0):
    numberOfPlayers-=2000
    offset = 0
    queryPlayersBlock = """SELECT ?item ?itemLabel
    WHERE
    {
        ?item wdt:P106 wd:Q3665646.
        SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
    }
    offset """+str(offset)+"""
    limit 2000
    """
    players = get_results(endpoint_url,queryPlayersBlock)["results"]["bindings"]
    for i in range (0,len(players)):
        dataFile.write(str(players[i]["itemLabel"]["value"]+" : "+players[i]["item"]["value"].removeprefix("http://www.wikidata.org/entity/")+"\n"))
    offset+=2000
I found this in the SPARQL documentation: "Using LIMIT and OFFSET to select different subsets of the query solutions will not be useful unless the order is made predictable by using ORDER BY." But when I add ORDER BY, I get the error "Query timeout limit reached".
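
For reference, here is a minimal sketch of the paging I am trying to achieve, as I understand the documentation. It only builds the query strings and does not contact the endpoint. The names `QUERY_TEMPLATE`, `page_offsets` and `build_page_query` are hypothetical helpers made up for this illustration, and ordering by `?item` is my assumption about what a predictable order could look like. The offsets are computed once, up front, so each page gets a different value. I have not verified that this avoids the timeout.

```python
# Hypothetical template: same query as above, plus ORDER BY ?item (an assumption).
# Literal SPARQL braces are doubled so that str.format leaves them alone.
QUERY_TEMPLATE = """SELECT ?item ?itemLabel
WHERE
{{
  ?item wdt:P106 wd:Q3665646.
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }}
}}
ORDER BY ?item
LIMIT {limit}
OFFSET {offset}
"""


def page_offsets(total, page_size):
    """Return the starting offset of every page needed to cover `total` rows."""
    return list(range(0, total, page_size))


def build_page_query(offset, page_size=2000):
    """Fill the template with the page size and the offset of one page."""
    return QUERY_TEMPLATE.format(limit=page_size, offset=offset)


if __name__ == "__main__":
    # Print the OFFSET line of each page for a small made-up total of 5000 rows.
    for offset in page_offsets(5000, 2000):
        print(build_page_query(offset).splitlines()[-1])
```

In my real code, each string returned by `build_page_query` would be passed to `get_results(endpoint_url, ...)` in place of `queryPlayersBlock`.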