I'm performing queries in a server with Open Source Routing Machine (OSRM) deployed. I send a set of coordinates and obtain a n x n matrix of network distances over a streets network.
In order to improve the speed of the computations, I want to use "ThreadPoolExecutor" to parallelize queries.
So far, I'm setting the connection in two ways, both giving me the same error:
def osrm_query(url_input):
'Send request'
response = requests.get(url_input)
r = response.json()
return r
def osrm_query_2(url_input):
'Send request'
s = requests.Session()
retries = Retry(total=3,
backoff_factor=0.1,
status_forcelist=[ 500, 502, 503, 504 ])
s.mount('https://', HTTPAdapter(max_retries=retries))
response = s.get(url_input)
r = response.json()
return r
I generate the set of URLs (in the _urls list) that I want to send as requests and parallelize this way:
with ThreadPoolExecutor(max_workers=5) as executor:
for each in executor.map(osrm_query_2, _urls):
r.append(each)
So far, all works ok, but then, when parsing more than 40,000 URLs, I get this error back:
OSError: [WinError 10048] Only one usage of each socket address (protocol/networ
k address/port) is normally permitted
As far as I understand, the problem is that I send too many request from my machine, exhausting the number of ports available to send the request (it looks that this has nothing to do with the machine I'm sending the request to).
How can I fix this? There is a way to tell the treadPoolExecutor to re-use the connections?