I am attempting to launch multiple subprocesses under Python 2.7 using subprocess32.Popen(). Each one runs the same pre-compiled binary but operates on files in its own directory. Most of the time things work fine, but all too often I get OSError: [Errno 14] Bad address. Here is the code:
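# Capture the child's output in files in the current working directory.
self.gld_stdout_file = open('stdout', 'w+')
self.gld_stderr_file = open('stderr', 'w+')
...
# shell=True takes a single command string, so join the argument list.
subprocess.Popen(" ".join(gld_open_str), shell=True, stderr=self.gld_stderr_file,
                 stdout=self.gld_stdout_file, bufsize=-1, close_fds=ON_POSIX,
                 env={'TEMP': temp_path})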
This error occurs in about 5-10% of the Popen() calls, while other Popen() calls in the same loop work just fine. From what I have read, it could come from an error in lower-level socket calls that I am not making directly (e.g. here or here).
Any ideas on why I am getting this error?
And more importantly:
How might I fix it?
For reference, we are using subprocess32, which supposedly offers improved stability with multiple subprocess calls. Also, if relevant, the entire scheme is wrapped up into a larger MPI-based HPC parallel call, such that multiple compute nodes are attempting to do the same thing at the same time. Fearing there might be some conflict or filesystem challenge with multiple attempts to execute the same file, we are already copying the binary to each of these nodes before execution.
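For context, here is a minimal sketch of the setup that the snippets in this post appear to rely on. The import fallback and the ON_POSIX definition follow a common idiom; both are assumptions shown for illustration, not copied from the actual code base.

```python
import sys

try:
    # Backport of the Python 3 subprocess module for Python 2.7 on POSIX.
    import subprocess32 as subprocess
except ImportError:
    # Fall back to the standard library module if the backport is absent.
    import subprocess

# True on POSIX systems; used as the close_fds argument so the child
# does not inherit extra file descriptors there.
ON_POSIX = 'posix' in sys.builtin_module_names
```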
Also, I see the same problem using shell=False, as in:
subprocess.Popen(gld_open_list, shell=False, stderr=self.gld_stderr_file,
                 stdout=self.gld_stdout_file, bufsize=-1, close_fds=ON_POSIX,
                 env={'TEMP': temp_path})
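As a stopgap while the root cause is unknown, the failing call could be wrapped so that it retries only on errno 14 and re-raises everything else. This is a minimal sketch, not a fix: `popen_with_retry` and its `retries` and `delay` parameters are hypothetical names, and it assumes the failure is transient, which has not been confirmed.

```python
import errno
import subprocess  # on Python 2.7, `import subprocess32 as subprocess` can be swapped in
import sys
import time


def popen_with_retry(args, retries=3, delay=0.5, **popen_kwargs):
    """Hypothetical helper: call Popen, retrying only on EFAULT (errno 14)."""
    for attempt in range(retries + 1):
        try:
            return subprocess.Popen(args, **popen_kwargs)
        except OSError as exc:
            # Re-raise anything that is not "Bad address", and give up
            # once the retry budget is exhausted.
            if exc.errno != errno.EFAULT or attempt == retries:
                raise
            # Record each failure so the failure rate can still be measured.
            sys.stderr.write(
                "Popen failed with EFAULT (attempt %d), retrying\n" % (attempt + 1)
            )
            time.sleep(delay)
```

With the variables from the snippets above, it could be called as `popen_with_retry(gld_open_list, shell=False, stderr=self.gld_stderr_file, stdout=self.gld_stdout_file, bufsize=-1, close_fds=ON_POSIX, env={'TEMP': temp_path})`. Logging each failed attempt keeps the underlying problem visible instead of silently hiding it.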