I am using concurrent futures to download reports from a remote server using an API. To inform me that the report has downloaded correctly, I just have the function print out its ID.
I have an issue where there are rare times that a report download will hang in-definitely. I do not get a Timeout Error or a Connection Reset error, just hanging there for hours until I kill the whole process. This is a known issue with the API with no known workaround.
I did some research and switched to using a Pebble based approach to implement a timeout on the function. My aim is then to record the ID of the report that failed to download and start again.
Unfortunately, I ran into a bit of a brick wall as I do not know how to actually retrieve the ID of the report I failed to download. I am using a similar layout to this answer:
from pebble import ProcessPool
from concurrent.futures import TimeoutError
def sometimes_stalling_download_function(report_id):
...
return report_id
with ProcessPool() as pool:
future = pool.map(sometimes_stalling_download_function, report_id_list, timeout=10)
iterator = future.result()
while True:
try:
result = next(iterator)
except StopIteration:
break
except TimeoutError as error:
print("function took longer than %d seconds" % error.args[1])
#Retrieve report ID here
failed_accounts.append(result)
What I want to do is retrieve the report ID in the event of a timeout but it does not seem to be reachable from that exception. Is it possible to have the function output the ID anyway in the case of a timeout exception or will I have to re-think how I am downloading the reports entirely?