
I read several thousand coordinates from a file and want to get the corresponding country for each one. I tried to disable the time limit, but it still doesn't work: the script stops after 150-160 coordinates. How can I handle this?

#!/usr/bin/python
# -*- coding: utf-8 -*-
import os, sys


with open('alagridsor.txt') as f:
    lines = f.read().splitlines()    


for sor in range(1, 9271):
    print(sor) 

    koor = lines[sor]

    from geopy.geocoders import Nominatim
    from geopy.exc import GeocoderTimedOut

    geolocator = Nominatim()
    location = geolocator.reverse(koor, timeout=None)
    cim = location.raw['address']['country']

    print(cim)

    f = open('out.txt', 'a')
    f.write(cim.encode('utf8'))
    f.write("\n")

1 Answer


Problems

  1. Calling f.read() without a size argument reads the entire contents of the file into memory at once. This becomes a problem if the file is larger than your machine's available memory.
  2. Opening the output file on every iteration of the for loop is expensive, and the file is never closed.

Possible Solution

#!/usr/bin/python
# -*- coding: utf-8 -*-
import time
from geopy.geocoders import Nominatim

geolocator = Nominatim(timeout=None)
fobj_out = open('out.txt', 'a')
with open('alagridsor.txt') as fobj_in:
    for koor in fobj_in:
        location = geolocator.reverse(koor.rstrip())
        cim = location.raw['address']['country']
        fobj_out.write(cim.encode('utf8'))
        fobj_out.write("\n")
        time.sleep(0.5)     # delay 0.5 seconds (500 milliseconds) between requests
fobj_out.close()
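
A fixed delay may still not be enough if the service starts rejecting requests, as happened with the "HTTP Error 429: Too Many Requests" error reported in the comments. The following is a minimal sketch, not part of the original answer, of a helper that pauses before every call and retries with a growing delay when a call fails. The name call_with_retry and all of its parameters are hypothetical. The default of one second between calls is an assumption based on the public Nominatim usage policy (at most one request per second).

```python
import time


def call_with_retry(func, arg, min_delay=1.0, max_retries=3, backoff=2.0,
                    retry_on=(Exception,), sleep=time.sleep):
    """Call func(arg), pausing before each attempt and retrying on failure.

    All names here are illustrative; nothing in this helper is geopy API.
    """
    delay = min_delay
    for attempt in range(max_retries + 1):
        sleep(delay)              # throttle every request, including the first
        try:
            return func(arg)
        except retry_on:
            if attempt == max_retries:
                raise             # out of retries: let the caller see the error
            delay *= backoff      # wait longer before the next attempt
```

In the script above, the call inside the loop could then become location = call_with_retry(geolocator.reverse, koor.rstrip(), retry_on=(GeocoderServiceError,)), with GeocoderServiceError imported from geopy.exc, and the time.sleep(0.5) line could be dropped because the helper already pauses before each request.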
  • Maybe it's a bit faster now, but the main problem remains the same: I hit the geopy time limit. – Máté Brunner Aug 18 '17 at 06:43
  • (1) Add timeout=None when instantiating the Nominatim class, e.g. geolocator = Nominatim(timeout=None) – ikolim Aug 18 '17 at 14:42
  • (1) Add timeout=None when instantiating the Nominatim class, e.g. geolocator = Nominatim(timeout=None). (2) Remove the line 'koor.rstrip()'. (3) Remove timeout=None in the for loop, e.g. location = geolocator.reverse(koor.rstrip()). Please refer to the updated code in "Possible Solution". I have just tested these changes with the free world cities database, 'worldcitiespop.txt', from MaxMind. After processing 1,286 lines of coordinates, I encountered the error "geopy.exc.GeocoderServiceError: HTTP Error 429: Too Many Requests". – ikolim Aug 18 '17 at 15:14
  • Yes, I managed to solve the time limit problem, and then got the same 'too many requests' error... Do you have a solution for that? – Máté Brunner Aug 20 '17 at 10:26
  • Add a delay between requests, e.g. time.sleep(0.1) # delay of 100 milliseconds. Insert the delay after the write statement, as shown in my updated solution. You will also have to add an import statement, "import time". – ikolim Aug 21 '17 at 03:31
  • Please retry with time.sleep(0.5) # delay of 500 milliseconds. – ikolim Aug 21 '17 at 14:34