I'm looking to obtain Wikipedia article titles and locations (lat/long) over a large area, one too large for an individual URL query to GeoNames' findNearbyWikipedia service or its wikipediaBoundingBox service (the URL template for the latter appears in my code below).

(The second service is better in that it searches a bounding box, whereas the first only searches a radius around a point; on the other hand, the first can return its results as JSON (by adding '&format=json'), whereas the second cannot.)
I wouldn't have a problem if there were no limit on the query's search area, or on the number of results it returns. Is there a way of getting around these limits?
So I'm looking for help finding a good way to automate this: making lots of bounding-box queries in a grid-like fashion, parsing the results (probably with Python), and storing them in my database.
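To make the grid idea concrete, here is a rough sketch of how the boxes might be generated (untested; the column count and coordinates are just examples, and the right grid size would depend on whatever area limit the API actually enforces):

    def make_grid(north, south, west, east, n_cols):
        """Split a wide area into n_cols adjacent bounding boxes along longitude.

        The same idea extends to rows of latitude for a full 2-D grid.
        """
        step = (east - west) / float(n_cols)
        boxes = []
        for k in range(n_cols):
            boxes.append({'north': north, 'south': south,
                          'west': west + k * step,
                          'east': west + (k + 1) * step})
        return boxes

    boxes = make_grid(51.990, 51.917, -3.377, -2.987, 3)
    # -> three boxes matching the hand-written list in my code below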
Beyond that, this is the code I've come up with so far, but I'm stuck:
import urllib2

url = 'http://api.geonames.org/wikipediaBoundingBox?north=%s&south=%s&east=%s&west=%s&username=demo'

# one dict per bounding box: three boxes side by side, west to east
data_coords = [
    {'north': 51.990, 'south': 51.917, 'east': -3.247, 'west': -3.377},
    {'north': 51.990, 'south': 51.917, 'east': -3.117, 'west': -3.247},
    {'north': 51.990, 'south': 51.917, 'east': -2.987, 'west': -3.117},
]

for i in data_coords:
    # all four substitutions must go inside a single tuple
    response = urllib2.urlopen(url % (i['north'], i['south'], i['east'], i['west']))
    print response.read()
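For the parsing and storage steps, something along these lines is what I'm picturing (untested sketch: the wikipediaBoundingBoxJSON endpoint name and the maxRows parameter are my reading of the GeoNames docs, and the SQLite file and schema are just placeholders):

    import json
    import sqlite3
    import urllib2

    # placeholder local database and schema
    conn = sqlite3.connect('wiki_articles.db')
    conn.execute('CREATE TABLE IF NOT EXISTS articles (title TEXT, lat REAL, lng REAL)')

    # JSON variant of the service; maxRows should raise the small default result cap.
    # The 'demo' account is rate-limited, so a real username would be needed.
    json_url = ('http://api.geonames.org/wikipediaBoundingBoxJSON'
                '?north=%s&south=%s&east=%s&west=%s&maxRows=100&username=demo')

    for box in data_coords:
        response = urllib2.urlopen(json_url % (box['north'], box['south'],
                                               box['east'], box['west']))
        result = json.loads(response.read())
        for entry in result.get('geonames', []):
            conn.execute('INSERT INTO articles VALUES (?, ?, ?)',
                         (entry['title'], entry['lat'], entry['lng']))

    conn.commit()
    conn.close()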
Any help would be appreciated, thanks!