
In my program I have a utility function for executing shell commands; here's a simplified version of it:

import subprocess
import time

def run_command(cmd):
    s = time.time()
    print('starting subprocess')
    proc = subprocess.Popen(cmd.split(),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            universal_newlines=True)
    print('subprocess started after ({0}) seconds'.format(time.time() - s))
    return proc

My program uses this function from different threads to execute commands. Occasionally the "Popen" line takes around 70 seconds to complete. Out of thousands of invocations per day across different program runs, this happens about 4-5 times. As far as I know, Popen is non-blocking. What is weird to me is that when it does happen, it takes the same ~70 seconds to start. It's important to note that while this happens, I have 3-4 other threads that are waiting in a loop:

while some_counter > 0:
    time.sleep(0.5) 

They do so for at most 60 seconds. After they give up and finish their flow, it takes another ~14 seconds until the "Popen" call finishes. Is there a problem with running "Popen" from some threads in parallel with other threads sitting in a "wait loop"?
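One way to test whether the combination described above (Popen from worker threads plus sleep-loop threads) is itself the problem is a small reproduction harness. This is a minimal sketch of my own, not from the question: the thread counts, iteration counts, the `true` command, and the slow-call threshold are all illustrative assumptions.

```python
# Hypothetical reproduction sketch: spawn Popen calls from several
# worker threads while other threads spin in a sleep loop, and log
# any Popen call whose startup exceeds a threshold.
import subprocess
import threading
import time

SLOW_THRESHOLD = 5.0  # seconds; anything above this gets logged

def popen_worker(n_iterations):
    for _ in range(n_iterations):
        start = time.time()
        proc = subprocess.Popen(['true'],
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE,
                                universal_newlines=True)
        elapsed = time.time() - start
        if elapsed > SLOW_THRESHOLD:
            print('Popen took {0:.1f}s'.format(elapsed))
        proc.communicate()  # reap the child to avoid zombies

def wait_worker(stop_event):
    # Mimics the "wait loop" threads from the question.
    while not stop_event.is_set():
        time.sleep(0.5)

stop = threading.Event()
waiters = [threading.Thread(target=wait_worker, args=(stop,)) for _ in range(4)]
spawners = [threading.Thread(target=popen_worker, args=(50,)) for _ in range(3)]
for t in waiters + spawners:
    t.start()
for t in spawners:
    t.join()
stop.set()
for t in waiters:
    t.join()
```

If the stall never reproduces under this harness even at high iteration counts, that would point away from the threading pattern and toward something environmental on the affected machines.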

Update 1: I now see that this problem started after I switched from Fedora 27 + Python 3.6 to Fedora 31 + Python 3.7.
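Since Popen on POSIX creates the child via fork and exec under the hood, one way to narrow the problem down is to time a bare fork/exec/reap cycle outside of `subprocess` entirely. A minimal sketch, assuming a POSIX system; the helper name and the `true` command are my own:

```python
# Sketch (my own, not from the question): time a bare fork + exec +
# reap cycle. If the same multi-second stall shows up here, the
# problem is below Python, not in the subprocess module.
import os
import time

def time_fork_exec():
    start = time.time()
    pid = os.fork()
    if pid == 0:
        # Child: replace ourselves with a trivial command.
        os.execvp('true', ['true'])
    os.waitpid(pid, 0)  # parent reaps the child
    return time.time() - start

elapsed = time_fork_exec()
print('fork+exec+reap took {0:.4f} seconds'.format(elapsed))
```

Running this in the same threads and at the same times the real commands run would help separate a Python-level cause from a system-level one.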

Bob Sacamano
  • The usual reason for delays of this duration on random process start is botched dns/user info lookup (this happens often in all kinds of remote auth situations, such as active directory or ldap based ones). Do you see any delay when running an `id` command? – oakad Dec 23 '19 at 14:10
  • I think that the command that hangs is irrelevant as it is a different one each time. – Bob Sacamano Dec 24 '19 at 06:56
  • Not sure what are you talking about... anyway I ran the id command both directly in shell and via Popen, no delay. But it is not a good indication as I said because I get this issue on different runs on different machines and I can't reproduce it... It happens 1 in 1000(or even more). – Bob Sacamano Dec 24 '19 at 09:04
  • 1
    Have you checked your `syslog` for the time when this happens? Also, `sar` logs, if you have them. Perhaps it is not Python; perhaps it is something on the system, and you only see it on your program because you're looking (in this case, logging). – caxcaxcoatl Dec 26 '19 at 01:59
  • It almost seems like you might be using up many process ids and recycling them may force a period of non-use before creating a process with it. When you say thousands per day, what's the peak rate? – karmakaze Dec 29 '19 at 08:02
  • Umm what prints do you suggest I add to test this theory? – Bob Sacamano Dec 30 '19 at 08:35
  • Hey Bob, managed to fix this? I'm having a similar issue. – jamesfdearborn Jan 02 '20 at 22:46
  • Hey James, no. Did you managed to get a consistent reproduction? What is your environment(os, python version)? – Bob Sacamano Jan 05 '20 at 07:09
  • Hey Bob, are you using complete paths to your executables, if not what does your PATH variable look like - could be something as simple as an nfs stall as the shell searches for the command in your PATH? – Andrew Atrens Jan 08 '20 at 13:47
  • 1
    Also of possible interest - https://docs.python.org/3/library/subprocess.html .. if you read down far enough Popen child process can block on output to OS pipe buffer .. hope this helps. – Andrew Atrens Jan 08 '20 at 17:57
  • use async await with subprocess like this https://fredrikaverpil.github.io/2017/06/20/async-and-await-with-subprocesses/ – divyang4481 Jan 12 '20 at 09:12

0 Answers0