I am using Python's multiprocessing module for scientific parallel processing. My code uses several worker processes that do the heavy lifting and a single writer process that persists the results to disk. The data to be written is sent from the worker processes to the writer process via a Queue. The data itself is rather simple and consists solely of a tuple holding a filename and a list with two floats. After several hours of processing, the writer process often gets stuck. More precisely, the following block of code
while True:
    try:
        item = queue.get(timeout=60)
        break
    except Exception as error:
        logging.info("Writer: Timeout occurred {}".format(str(error)))
never exits the loop, and I get continuous 'Timeout' messages.
I also implemented a logging process which outputs, among other things, the status of the queue. Even though I get the timeout error message above, a call to qsize() consistently reports a full queue (size=48 in my case).
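The logging process's queue check looks roughly like this (a simplified sketch, not the exact code; the interval and iteration count are placeholders). Note that the docs describe qsize() as only approximate for multiprocessing queues, which may be relevant here:

```python
import logging
import time


def monitor(queue, interval=10.0, iterations=3):
    # Periodically log the (approximate) number of items in the queue.
    for _ in range(iterations):
        logging.info("Monitor: queue size ~ {}".format(queue.qsize()))
        time.sleep(interval)
```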
I have thoroughly checked the documentation on the Queue object and can find no explanation for why get() times out while the queue is reported as full at the same time.
Any ideas?
Edit:
I modified the code to catch the queue.Empty exception explicitly (imported via from queue import Empty):
while True:
    try:
        item = queue.get(timeout=60)
        break
    except Empty as error:
        logging.info("Writer: Timeout occurred {}".format(str(error)))