
Following the Python docs on replacing shell pipelines, I have a piece of code that looks like this:

p1 = Popen(["tac" , "/var/log/some_process_log.output"], stderr=PIPE, stdout=PIPE)
p2 = Popen(["head", "-n", "1000"], stdin=p1.stdout, stdout=outfile)
p1.stdout.close()  # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]

Here outfile is the open file object where I want to redirect the output of the head command. The log file is very large, hence the head.

The chaining is like p1 | p2 | p3 | ... | pn > outfile (see the sketch below).
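
To make the structure concrete, here is a rough sketch of how I build the chain. Only two stages are shown, and the output path "out.txt" is a placeholder:

from subprocess import Popen, PIPE

# Sketch: chain N commands, each stage reading the previous stage's stdout.
commands = [["tac", "/var/log/some_process_log.output"],
            ["head", "-n", "1000"]]  # ... further stages would go here

procs = []
prev_stdout = None
with open("out.txt", "wb") as outfile:  # placeholder output path
    for i, cmd in enumerate(commands):
        last = i == len(commands) - 1
        proc = Popen(cmd, stdin=prev_stdout,
                     stdout=outfile if last else PIPE, stderr=PIPE)
        if prev_stdout is not None:
            prev_stdout.close()  # let the earlier stage receive SIGPIPE
        prev_stdout = proc.stdout
        procs.append(proc)
    procs[-1].communicate()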

If there's an error in the execution of p1 (e.g. the user does not have read permission on /var/log/some_process_log.output), the error message in p1.stderr is not piped through when I call pn.communicate().

If I call p1.stderr.readline() at every stage, it takes a long time to process. The Python docs also warn about this:

Note: The data read is buffered in memory, so do not use this method if the data size is large or unlimited.

I am avoiding subprocess.check_output since it does not handle piping, and it would also need the unsafe shell=True.
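
For reference, the check_output version I'm trying to avoid would look like this (a sketch only):

from subprocess import check_output

# The approach being avoided: the whole pipeline as a single shell string.
output = check_output(
    "tac /var/log/some_process_log.output | head -n 1000",
    shell=True,  # the shell is needed to interpret the pipe; unsafe with untrusted input
)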

Any help would be appreciated. Thanks

1 Answer


You could create a separate pipe with:

import os

errread, errwrite = os.pipe()  # returns a (read fd, write fd) pair

And set the write-end as the stderr for all your Popen instances:

p1 = Popen(["tac" , "/var/log/some_process_log.output"], stderr=errwrite, stdout=PIPE)
p2 = Popen(["head", "-n", "1000"], stdin=p1.stdout, stderr=errwrite, stdout=outfile)

Remember to close the write end in the parent once both processes have been started; otherwise reads on errread will never see EOF:

os.close(errwrite)

And get your error messages with either:

data_group = os.read(errread, buf_size)  # buf_size = max bytes to read per call

or:

import io
data = io.open(errread, 'rb', buf_size).read()  # reads until EOF
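
Putting it together for the original two-stage pipeline, a minimal sketch (the output path "out.txt" and the 4096 buffer size are placeholders):

import os
from subprocess import Popen, PIPE

errread, errwrite = os.pipe()

with open("out.txt", "wb") as outfile:  # placeholder output path
    p1 = Popen(["tac", "/var/log/some_process_log.output"],
               stderr=errwrite, stdout=PIPE)
    p2 = Popen(["head", "-n", "1000"],
               stdin=p1.stdout, stderr=errwrite, stdout=outfile)
    p1.stdout.close()   # allow p1 to receive SIGPIPE if p2 exits
    os.close(errwrite)  # close the parent's copy so the reads below can hit EOF
    p2.communicate()
    p1.wait()

# Drain everything the stages wrote to their shared stderr pipe.
chunks = []
while True:
    chunk = os.read(errread, 4096)
    if not chunk:  # EOF: all write ends are closed
        break
    chunks.append(chunk)
os.close(errread)
print(b"".join(chunks).decode(errors="replace"))

Note that this drains stderr only after the pipeline finishes; if the stages can produce more stderr than the OS pipe buffer holds, you would need to read errread concurrently to avoid blocking the children.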