
I have a Python program that continuously reads the output of another program, launched via subprocess.Popen and connected via subprocess.PIPE.

The problem I am facing is that it sometimes loses a significant portion of the output of the launched program.

For example, monitoring inotify events via a pipe to inotifywait loses many events.

This is the relevant code:


    import select
    import subprocess

    # srcroot is defined elsewhere in the program
    process = subprocess.Popen(["inotifywait", "-q", "-r", "-m",
      "--format", "%e:::::%w%f", srcroot], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    polling = select.poll()
    polling.register(process.stdout)
    process.stdout.flush()

    while True:
        process.stdout.flush()
        if polling.poll(max_seconds*1000):
            line = process.stdout.readline()
            if len(line) > 0:
                print line[:-1]

Executing the command inotifywait -q -r -m --format %e:::::%w%f /opt/fileserver/ > /tmp/log1 and moving some files around (to generate inotify events) gives a >8000-line file. On the other hand, running my ./pscript.py > /tmp/log2 gives a file with about 5000 lines.
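A likely contributor (not confirmed in the thread) is mixing select.poll() with the buffered process.stdout.readline(): poll() only sees the raw file descriptor, while readline() can leave data sitting in Python's userspace buffer, so the loop may stall or skip output. A sketch of draining the raw fd with os.read() instead, using a hypothetical throwaway producer in place of inotifywait:

```python
import os
import select
import subprocess
import sys

# Hypothetical producer standing in for inotifywait: prints 1000 lines quickly.
producer = subprocess.Popen(
    [sys.executable, "-c",
     "import sys\n"
     "for i in range(1000): sys.stdout.write('event:::::/tmp/file%d\\n' % i)"],
    stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

polling = select.poll()
polling.register(producer.stdout.fileno(), select.POLLIN)

buf = b""
lines = []
while True:
    if not polling.poll(1000):       # 1 s timeout, as with max_seconds
        break
    # Drain everything the kernel has, instead of one buffered readline()
    chunk = os.read(producer.stdout.fileno(), 65536)
    if not chunk:                    # EOF: the producer exited
        break
    buf += chunk
    while b"\n" in buf:
        line, buf = buf.split(b"\n", 1)
        lines.append(line.decode())

producer.wait()
print(len(lines))                    # all 1000 lines arrive
```

Because os.read() works on the same file descriptor that poll() watches, no data can hide in a userspace buffer between iterations.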

  • Try reading from stderr as well and printing it, to check whether the lost data is actually there: `print process.stderr.read()` – Anand S Kumar Aug 02 '15 at 07:29
  • Unfortunately the above example was somewhat simplified, as I was already checking for stderr. Thank you anyway. – shodanshok Aug 02 '15 at 10:41

1 Answer


You're ignoring stderr completely in your example. Try creating the process like this:

    process = subprocess.Popen(["inotifywait", "-q", "-r", "-m",
      "--format", "%e:::::%w%f", srcroot], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

Furthermore, I'd use inotify directly with one of its Python bindings rather than spawning a process with inotifywait.
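In the absence of a third-party binding, the direct approach the answer suggests can be sketched with ctypes against the raw inotify syscalls (Linux only; the constants are from <sys/inotify.h>, and the watched directory and file name here are illustrative):

```python
import ctypes
import os
import struct
import tempfile

# Event-mask constants from <sys/inotify.h>
IN_CREATE = 0x00000100
IN_MOVED_TO = 0x00000080

libc = ctypes.CDLL("libc.so.6", use_errno=True)

fd = libc.inotify_init()
if fd < 0:
    raise OSError(ctypes.get_errno(), "inotify_init failed")

# Watch a fresh temporary directory for created/moved-in files
watch_dir = tempfile.mkdtemp()
wd = libc.inotify_add_watch(fd, watch_dir.encode(), IN_CREATE | IN_MOVED_TO)
if wd < 0:
    raise OSError(ctypes.get_errno(), "inotify_add_watch failed")

# Trigger an event: create a file inside the watched directory
open(os.path.join(watch_dir, "demo_file"), "w").close()

# Each event is: int wd, uint32 mask, uint32 cookie, uint32 len, char name[len]
data = os.read(fd, 4096)
names = []
offset = 0
while offset < len(data):
    wd_, mask, cookie, name_len = struct.unpack_from("iIII", data, offset)
    name = data[offset + 16: offset + 16 + name_len].rstrip(b"\0").decode()
    names.append(name)
    offset += 16 + name_len

os.close(fd)
os.unlink(os.path.join(watch_dir, "demo_file"))
os.rmdir(watch_dir)
print(names)
```

Reading events straight from the inotify file descriptor removes the extra pipe and process, which is one less place for output to be lost.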

  • Unfortunately the above example was somewhat simplified, as I was already checking for stderr. Moreover, I cannot use pyinotify due to its slow performance (I tried it, and it is OK for some thousands of files, but in my case it cannot even create a sufficient number of watches)... Thank you anyway. – shodanshok Aug 02 '15 at 10:43
  • You don't have to set thousands of watches. You set a watch per folder you want to monitor and the application will call your callback. If you need to watch subfolders as well you just need to specify it when you create the watch. – noxdafox Aug 02 '15 at 11:26
  • At a high level, yes. But at a low level, pyinotify needs to establish a watch for each different file/directory. This process is time-consuming and has significant scalability problems. Pure C implementations are much faster, mainly because all the code is compiled. – shodanshok Aug 02 '15 at 12:12