10

I am executing a curl command through subprocess. The curl command starts video processing on another server and waits for the response. Once processing is done, the remote server returns a JSON object. I check the status of the subprocess using the value of poll(): None means the process has not completed, 0 means it completed successfully, and 1 means an error.

I get the correct response if processing takes around 30 minutes or less on the remote server, but if it takes longer, I only ever get None, even though I can see that the remote server has finished processing and already returned the JSON object.

Can anyone tell me the possible reason for poll() returning only None after a certain time? Thank you in advance.

My Popen object is:

object = subprocess.Popen(str(curlCmd), shell=True,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)

I call object.poll() every 2 seconds to check whether the process has completed successfully.

jfs
Avichal Badaya
  • my Popen object is :- PObject = subprocess.Popen(str(curlCmd), shell=True,stdout=subprocess.PIPE,stderr=subprocess.PIPE) and I am calling object.poll() after every 2 seconds to check if the process is successfully completed or not. – Avichal Badaya Aug 13 '12 at 15:10
  • your code has `PIPE`; do you read from `object.stdout/stderr`? – jfs Aug 14 '12 at 03:09
  • Yes, I read from the object. It works for processes that are not too long and gives the correct output. I just don't understand why .poll() keeps returning None for long processes even after the process has terminated. Does it have something to do with the memory buffer? – Avichal Badaya Aug 14 '12 at 11:55
  • Could you provide more details? Does it happen for each long (>30 min) job? Could you write [a short script](http://sscce.org/) that does nothing but reproduce the problem? Does running with the latest subprocess version help (either try a newer Python version or install [`subprocess32`](http://pypi.python.org/pypi/subprocess32/))? – jfs Aug 14 '12 at 13:59
  • btw, why do you call `curl`? Have you tried [`requests` library](http://docs.python-requests.org/en/latest/index.html)? – jfs Aug 14 '12 at 14:02
  • Yes, it happens for processes taking over 30-35 minutes. If I call the curl command without subprocess, it gives the right response. I haven't tried subprocess32 (I will check it out). Also, using requests shouldn't make any difference, should it? – Avichal Badaya Aug 14 '12 at 17:17
  • requests is an http library i.e., there'll be no subprocess/curl in your code. – jfs Aug 14 '12 at 19:00
  • Yeah, but I am not sure it could be used in my scenario without subprocess. I am calling a function on a remote server and checking its response in a loop, while also allocating new processes to other servers if I get any request in the SQS queue. So I need to track which process is in which state and take a few actions accordingly. Subprocess poll() seemed to work perfectly until I realized that it does not work for long processes. Meanwhile, when I call just the curl command against the remote server, I get the output. – Avichal Badaya Aug 14 '12 at 19:15
  • `requests` provides event hooks that, as a side effect, would be more efficient than polling the subprocess. – jfs Aug 14 '12 at 20:04
  • It seems like a known issue in Popen.poll. Can you try the solutions outlined in this link? http://www.gossamer-threads.com/lists/python/bugs/633489 – pg2286 Aug 13 '12 at 18:06

4 Answers

6

.poll() is None means that the child is still running.

The curl process may hang as soon as it fills its stdout or stderr OS pipe buffer (around 64K on my Linux box) unless you read from your end of the pipe.

That is, while you are waiting for the child to finish, the child is waiting for you to drain the pipe buffer: a deadlock. From the subprocess docs:

This [.wait()] will deadlock when using stdout=PIPE or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use Popen.communicate() when using pipes to avoid that.
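As a minimal sketch of what the docs recommend, the snippet below uses communicate() to read both pipes to EOF before waiting. The child command here is a hypothetical stand-in for the curl call (a Python one-liner that writes about 200 KB, which is more than a typical pipe buffer); it is not the asker's actual command.

```python
import subprocess
import sys

# Hypothetical stand-in for the curl command: a child that writes ~200 KB
# to stdout, which is more than a typical 64K pipe buffer can hold.
child_cmd = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 200000)"]

proc = subprocess.Popen(child_cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# communicate() reads both pipes until EOF and then waits for the child,
# so the child never blocks on a full pipe buffer.
out, err = proc.communicate()

print(proc.returncode, len(out))
```

Note that communicate() blocks until the child exits, so it fits best when nothing else needs to happen in the same thread while waiting.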

You could use threads or asyncio to consume the pipes concurrently.
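A rough sketch of the thread-based approach, assuming you still want to poll periodically as in the question: one reader thread per pipe drains the output while the main thread calls poll(). The `drain` helper and the child command are hypothetical names introduced for illustration, and the child is a stand-in for curl.

```python
import subprocess
import sys
import threading
import time

def drain(pipe, chunks):
    """Read a pipe until EOF so the child never blocks on a full buffer."""
    for chunk in iter(lambda: pipe.read(4096), b""):
        chunks.append(chunk)
    pipe.close()

# Hypothetical stand-in for the curl command: writes ~200 KB to stdout.
child_cmd = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 200000)"]
proc = subprocess.Popen(child_cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

out_chunks, err_chunks = [], []
readers = [
    threading.Thread(target=drain, args=(proc.stdout, out_chunks)),
    threading.Thread(target=drain, args=(proc.stderr, err_chunks)),
]
for t in readers:
    t.daemon = True
    t.start()

# Poll as in the question; other work could happen between checks.
while proc.poll() is None:
    time.sleep(0.1)

# Wait for the readers to reach EOF before using the collected output.
for t in readers:
    t.join()

stdout_data = b"".join(out_chunks)
stderr_data = b"".join(err_chunks)
print(proc.returncode, len(stdout_data))
```

Because the pipes are emptied continuously, poll() can report the exit code as soon as the child finishes, regardless of how much output it produced.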

jfs
1

In Python 2.7, and possibly 3.x, you can use:

object._internal_poll(_deadstate=127)

instead of the regular Popen.poll() as a workaround. This will return 127 instead of None if the process has terminated.

Of course, this is an internal method of the module, and there is no guarantee it will keep working after a Python library update.

hegemon
1

To complement hegemon's answer, this code works on Python 2.7:

process = subprocess.Popen(cmd, cwd=current_dir, shell=True, stdout=subprocess.PIPE, stdin=subprocess.PIPE, stderr=subprocess.PIPE )
process._internal_poll(_deadstate='dead')

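while timeout > 0:
  if process.poll() != 'dead':  # compare strings with !=, not with identity ("is not")
    pass  # the rest of the loop body is cut off in the original answer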
Tomas
0

Finally, I ended up using PyCurl instead of creating a subprocess and calling the curl command through it. It seems to be a reported issue with subprocess, where the .poll() method returns None after a certain time; the reason is still unclear. I would like to warn people who use the subprocess poll() method (without wait()/communicate()) to be aware of this if they are running long processes. Thank you J.F. Sebastian and Pranav for your directions.

Avichal Badaya