I am using the code below to batch-download JSON files from a text file containing a list of URLs. The links aren't standardized: they can be http or https, and they may or may not end with '.json'.
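
    import os
    import urllib2

    def save_json(url):
        # Build a flat filename from the URL: drop separators and query
        # characters, and keep any '.json' / '.JSON' as a '.json' extension.
        filename = (url.replace('/', '').replace(':', '')
                       .replace('.', '|').replace('|json', '.json').replace('|JSON', '.json')
                       .replace('|', '').replace('?', '').replace('=', '').replace('&', ''))
        path = "U:/location/json"
        fullpath = os.path.join(path, filename)
        response = urllib2.urlopen(url)
        webContent = response.read()
        # 'wb' writes the downloaded bytes unchanged (no newline translation on Windows)
        with open(fullpath, 'wb') as f:
            f.write(webContent)

    with open('U:/location/index_dl.txt') as f:
        p = f.read()
    url_list = p.split('\n')  # '\n' is the line-break delimiter; change it if needed
    for url in url_list:
        if url.strip():  # skip blank lines, e.g. a trailing newline at the end of the file
            save_json(url.strip())
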
Every so often I get the error:
    Errno 10054 An existing connection was forcibly closed by the remote host.
Question: Does anyone know of another way to batch-download a list of JSON links from the web, or a way to handle this error when it occurs?
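To make "handle this error" concrete, here is a rough sketch of the kind of thing I imagine: catch the connection error and retry a few times with a pause in between. I don't know whether this is the right approach. The names `fetch`, `fetch_with_retry`, `attempts` and `delay` are made up for illustration, and the retry count, pause and timeout values are guesses. It relies on socket errors (which is where Errno 10054 comes from) and urllib's `URLError` both being `IOError` subclasses.

```python
import time

try:
    from urllib2 import urlopen          # Python 2, as in the code above
except ImportError:
    from urllib.request import urlopen   # Python 3 equivalent


def fetch(url, timeout=30):
    """Download the raw bytes at url (hypothetical helper)."""
    response = urlopen(url, timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()


def fetch_with_retry(url, attempts=3, delay=2.0, fetcher=fetch, sleep=time.sleep):
    """Call fetcher(url), retrying when a connection-level error is raised.

    Returns the downloaded content, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetcher(url)
        except IOError as exc:
            # socket.error (including Errno 10054) and URLError are IOError subclasses
            print("attempt %d/%d failed for %s: %s" % (attempt, attempts, url, exc))
            if attempt < attempts:
                sleep(delay * attempt)  # wait a little longer before each retry
    return None
```

In the loop, `save_json` would then call `fetch_with_retry(url)` instead of `urllib2.urlopen(url).read()`, and skip the URL (perhaps writing it to a "failed" list) when the result is `None`, so one dropped connection doesn't stop the whole batch.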
Thanks in advance! SJB