
I'm entering the following query at http://dbpedia.org/sparql:

PREFIX geo:  <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?s ?name ?value ?lat ?lng
WHERE { 
    ?s a <http://dbpedia.org/ontology/PopulatedPlace> .
    ?s <http://dbpedia.org/property/name> ?name .
    ?s <http://dbpedia.org/property/populationTotal> ?value .
    FILTER (?lng > -8.64 && ?lng < 2.1 && ?lat < 61.1 && ?lat > 49.35 )
    ?s geo:lat ?lat .
    ?s geo:long ?lng .
}

(The bounding box is intended to cover the UK. The other option is to add ?s <http://dbpedia.org/ontology/country> <http://dbpedia.org/resource/United_Kingdom> . to the query, but some places might not have been tagged with the UK as their country.)
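As a quick sanity check on the bounding-box approach, the FILTER condition can be sketched in Python. The bounds are copied from the query above; the sample coordinates are approximate and only illustrative. The sketch suggests that a plain rectangle this size also takes in places outside the UK, such as Dublin and Calais.

```python
# Bounds copied from the FILTER clause in the query above.
MIN_LAT, MAX_LAT = 49.35, 61.1
MIN_LNG, MAX_LNG = -8.64, 2.1


def in_bounding_box(lat: float, lng: float) -> bool:
    """Mirror the FILTER: strict inequalities on both axes."""
    return MIN_LAT < lat < MAX_LAT and MIN_LNG < lng < MAX_LNG


# Approximate (lat, lng) pairs, for illustration only.
samples = {
    "London": (51.51, -0.13),
    "Dublin": (53.35, -6.26),
    "Calais": (50.95, 1.86),
    "Paris": (48.86, 2.35),
}

for place, (lat, lng) in samples.items():
    print(place, in_bounding_box(lat, lng))
```

If non-UK places matter, combining the box with the country triple (where it exists) may be worth trying.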

The problem is that it only returns around 290 places. Using population instead of populationTotal gives 1588 places, and I can't figure out which of the two is semantically the right one to use.

Is this a limitation with the underlying data, or is there something that could be improved in the way I'm formulating the query?

Note: this question is mainly academic now as I got the info from http://download.geonames.org/export/dump/GB.zip, but I'd much prefer to use open data and the semantic web, so posting up this question to see if there was something I was missing, or to find out if there is a shortcoming in how the data is being scraped from Wikipedia and whether I can muck in.

EoghanM

1 Answer


Your query only returns locations that have a value for populationTotal. For example, if Town A has "10,000" for populationTotal and Town B has no populationTotal value at all (the equivalent of NULL in a relational database), only Town A will be returned.

If you want to return all locations in the UK, you need to make the population pattern optional by wrapping it in OPTIONAL { ... }. This query will show you all the locations, along with the population for those that have that data.

PREFIX geo:  <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?s ?name ?value ?lat ?lng
WHERE { 
    ?s a <http://dbpedia.org/ontology/PopulatedPlace> .
    ?s <http://dbpedia.org/property/name> ?name .
    OPTIONAL { ?s <http://dbpedia.org/property/populationTotal> ?value . }
    FILTER (?lng > -8.64 && ?lng < 2.1 && ?lat < 61.1 && ?lat > 49.35 )
    ?s geo:lat ?lat .
    ?s geo:long ?lng .
}
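
To make the difference concrete, here is a minimal Python sketch of how a required pattern and an OPTIONAL pattern treat a place with no population value. The town names and the dictionary are hypothetical toy data, not DBpedia results, and the two helper functions are only an analogy for the matching behaviour, not how a SPARQL engine is implemented.

```python
# Hypothetical toy data: Town B has no populationTotal value at all.
places = ["Town A", "Town B"]
population_total = {"Town A": 10000}


def required_pattern(places, populations):
    """Like a plain triple pattern: keep only places that have a value."""
    return [(p, populations[p]) for p in places if p in populations]


def optional_pattern(places, populations):
    """Like OPTIONAL { ... }: keep every place; a missing value stays unbound (None)."""
    return [(p, populations.get(p)) for p in places]


print(required_pattern(places, population_total))
print(optional_pattern(places, population_total))
```

The first call drops Town B entirely, while the second keeps it with an empty population, which is what the OPTIONAL version of the query does.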
KevLoughrey
  • I considered that but thought that 290 and 1588 were very low numbers for places which had those attributes defined. Maybe I need to dive in and improve the code that parses out the population from the infobox? – EoghanM Oct 21 '15 at 14:39
  • @EoghanM It looks like a combination of the code being at fault (for example it doesn't account for [situations where Population is split](https://en.wikipedia.org/wiki/Collinstown) into regions) and Wikipedia just [not having](https://en.wikipedia.org/wiki/Freshford,_County_Kilkenny) the population info. – KevLoughrey Oct 21 '15 at 15:34
  • Thanks Kev! Looks like we're local. Ping me at eoghan at getthere dot ie (which is what I'm working on) :D – EoghanM Oct 23 '15 at 09:57