0

I have this Wikidata query that returns all the football stadiums with the names, coordinates, club labels and stuff like this. But I cannot figure out how to also get the country and city names where stadiums are located (and possibly the coordinates of the cities too).

Here is my query:

SELECT ?club ?clubLabel ?venue ?venueLabel ?coordinates
WHERE
{
    ?club wdt:P31 wd:Q476028 .
    ?club wdt:P115 ?venue .
    ?venue wdt:P625 ?coordinates .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

Link to test the query

EDIT 19th november 2020:

I need the timezone of the cities so I tried this query after looking at the documentation but it does not return the value. Just links like "wd:Q6723" :

SELECT DISTINCT ?timezone ?club ?locationLabel ?countryLabel ?clubLabel ?venue ?venueLabel ?coordinates 
WHERE
{
    ?venue (wdt:P421|wd:Q12143) ?timezone .
    ?club wdt:P31 wd:Q476028 .
    ?club wdt:P115 ?venue .
    ?venue wdt:P625 ?coordinates .
    OPTIONAL {?club wdt:P159|(wdt:P115/(wdt:P131|wdt:P276)) ?location . 
              OPTIONAL { ?location wdt:P17 ?country . }
              }     
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
} limit 500
user2360545
  • 131
  • 1
  • 2
  • 12
  • Just as a tip for the future, Wikidata has an excellent helpdesk specifically for queries: https://www.wikidata.org/wiki/Wikidata:Request_a_query I can highly recommend it. – Ainali Nov 17 '20 at 08:10

1 Answers1

1

Split over two now. Original query:

SELECT DISTINCT ?club ?locationLabel ?countryLabel ?clubLabel ?venue 
?venueLabel ?coordinates 
WHERE {
    ?club wdt:P31 wd:Q476028 .
    ?club wdt:P115 ?venue .
    ?venue wdt:P625 ?coordinates .
    OPTIONAL {
        ?club wdt:P159|(wdt:P115/(wdt:P131|wdt:P276)) ?location . 
        OPTIONAL { ?location wdt:P17 ?country . }
    }     
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
} limit 500

(Update #2: Previously, I asked for the club's timezone. But, of course, that's not the sort of data usually recorded for a club. Instead, you have to go via the location/venue/headquarters or similar, and possibly a level up to region/country because some suburb also doesn't have timezone data.

This is the general idea how it should work, but it's running into a timeout, and so am I:

SELECT DISTINCT ?timezone ?timezoneLabel ?offset 
?club ?clubLabel 
WHERE {
       ?club wdt:P31 wd:Q476028 .
       
       # via country. not perfect, because some have multiple timezones, but shoud be faster  
       ?club wdt:P17/wdt:P421 ?timezone . 
       
        # what I really want to do; all sorts of alternatives 
       #?club wdt:P115?/(wdt:P159|wdt:P276)/wdt:P131?/wdt:P421 ?timezone . 

    ?timezone wdt:P2907 ?offset.
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
} limit 500

Short explanation:

This uses three new things. OPTIONAL makes the following statement, well, optional. Clubs where nothing can be found will still be included in the output. The second OPTIONAL ist nested in the first, as it's pointless to ask for the country of a location that we haven't found.

The pipe symbol (|) allows for alternatives. Here, I'm asking for "headquarter location (P159) or check for two different ways to specify the location of the stadium. The slash, used in the latter case, denotes a path (club / venue / "located in district|location").

If there is missing data (there will be missing data), you may want to look at examples and figure out if there are other common patterns that locations are recorded. You could, for example, move the inner OPTIONAL outside for cases where the club has a country statement but no other, more specific, location.

Update: I've included the timezone as requested in the comment. To note:

  • ?timezoneLabel gets the timezone's label (= name), just as ?clubLabel gets the club's. The apppended "...Label" is a "magic" function that translates from IDs to huma-readable labels. It is enabled by including that SERVICE wikibase:label... line.

  • As you might want to use these timezones, I've included the marked line that gets the numeric offset in hours.

  • The offset may vary because UTC doesn't have dalight savings time. There should be multiple lines in the results for such cases, and you would need to read the ''qualifiers'' to see when they apply. Alternatively, maybe substract the offset from some other timezone's offset (i. e. yours) and you might get lucky and they cancel out.

Matthias Winkelmann
  • 15,870
  • 7
  • 64
  • 76
  • That's amazing. I am currently starting to learn this specific langage but I need time before mastering it. My goal is to get the timezone of the cities. So I looked at the documentation and I tried the query edited in my first post but I do not have the value. I just have "wd:Q6723" as a result in the column. May I ask you to correct it ? – user2360545 Nov 19 '20 at 00:57
  • Thank you for your help. Unfortunately, it does not work as expected. In fact, it returns only 133 rows compared to the 9065 rows from my initial query. – user2360545 Nov 19 '20 at 20:30
  • yeah, I've tried splitting it in two. I could swear it worked once, then just started throwing timeouts, sorry. Might have to do it in multiple steps, i. e. get the city first. – Matthias Winkelmann Nov 19 '20 at 20:58