I am on Windows and running a multi-threaded Python app that asynchronously saves data to .csv files. As reported here, here and here, at some point I get the following error:
<type 'exceptions.IOError'>
Traceback (most recent call last):
  File "results_path", line 422, in function
    df_results.to_csv(results_file)
IOError: [Errno 24] Too many open files
This proposes a fix that wraps every file I/O operation in a with-statement:
with open(results_path, 'a') as results_file:
    df_results.to_csv(results_file)
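As an aside, to_csv also accepts a path and a mode directly, so pandas can open and close the handle itself; I do not know whether this avoids the leak, but it takes one open() call out of my code:

df_results.to_csv(results_path, mode='a')  # pandas opens and closes the file internally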
However, I am still getting the IOError described above (in a nutshell, none of the SO questions solved my issue). Apparently, the with-statement does not properly close the .csv file after the append operation.
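One thing I still plan to test is serializing the writes with a lock, in case concurrent appends from several threads pile up handles faster than they are released (a minimal sketch; csv_lock and append_results are my own names):

import threading

csv_lock = threading.Lock()  # my own lock; serializes all .csv appends across threads

def append_results(df_results, results_path):
    with csv_lock:                                   # only one thread writes at a time
        with open(results_path, 'a') as results_file:
            df_results.to_csv(results_file)
    assert results_file.closed                       # the handle is closed once the with-block exits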
First, I now increase the allowed number of open files, which admittedly just delays the crash:
import win32file
max_open_files = 2048 # Windows-specific threshold for max. open file count
win32file._setmaxstdio(max_open_files)
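To confirm the new limit actually took effect, the counterpart getter can be read back (assuming _getmaxstdio is exposed by your pywin32 build, as it is in recent ones):

assert win32file._getmaxstdio() == max_open_files  # verify the CRT accepted the new limit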
Second, my temporary workaround is (A) to repeatedly count the open .csv files, and (B) to forcefully restart the whole script if the count gets anywhere near the threshold allowed on Windows:
from psutil import Process
import os, sys

proc = Process()

open_file_count = 0                            # Count of open .csv files
for open_file in proc.open_files():            # Iterate over this process's open files
    if ".csv" in str(open_file):               # Count only files of .csv type
        open_file_count += 1

if open_file_count > (max_open_files / 2):     # Threshold, see above
    os.execl(sys.executable, sys.executable, *sys.argv)  # Force restart
This approach is ugly and inefficient in many ways (it loops through all open files in every iteration/thread). At the very least, it needs to work without forcefully restarting the whole script.
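A cheaper check I am considering (untested) uses psutil's Windows-only handle counter instead of enumerating every open file:

if proc.num_handles() > (max_open_files / 2):  # Windows-only; counts all handles, not just .csv files
    os.execl(sys.executable, sys.executable, *sys.argv)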
Q1: How do I properly close .csv files using Python on Windows?
Q2: If closing fails after an I/O operation, how can I forcefully close all open .csv files at once?
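For Q2, the only idea I have so far is a last-resort sweep over the garbage collector's tracked objects, closing anything that looks like an open .csv handle; I am not sure this is safe or reliable (close_leaked_csv_handles is my own name):

import gc

def close_leaked_csv_handles():
    # Last-resort sweep: close every still-tracked, still-open file object
    # whose name ends in .csv. Only finds handles the GC knows about.
    for obj in gc.get_objects():
        try:
            if hasattr(obj, 'close') and not getattr(obj, 'closed', True) \
                    and str(getattr(obj, 'name', '')).endswith('.csv'):
                obj.close()
        except Exception:
            pass  # some tracked objects raise on attribute access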