1

I have observed this weird problem in python logging class, I have two scripts, one is called from other one. First script waits for other script to end while other script logs a huge log using logging.info

Here is the code snippet

#!/usr/bin/env python

import subprocess
import time
import sys

chars = ["/","-","\\","|"]
i = 0
command = 'sudo python /home/tejto/test/writeIssue.py'
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
while process.poll() is None:
    print chars[i],
    sys.stdout.flush()
    time.sleep(.3)
    print "\b\b\b",
    sys.stdout.flush()
    i = (i + 1)%4
output = process.communicate()

and the other script is

#!/usr/bin/env python

import os
import logging as log_status

class upgradestatus():
    def __init__(self):
        if (os.path.exists("/tmp/updatestatus.txt")):
                os.remove("/tmp/updatestatus.txt")

        logFormatter = log_status.Formatter("%(asctime)s [%(levelname)-5.5s]  %(message)s")
        logger = log_status.getLogger()
        logger.setLevel(log_status.DEBUG)

        fileHandler = log_status.FileHandler('/tmp/updatestatus.txt', "a")
        fileHandler.setLevel(log_status.DEBUG)
        fileHandler.setFormatter(logFormatter)
        logger.addHandler(fileHandler)

        consoleHandler = log_status.StreamHandler()
        consoleHandler.setLevel(log_status.DEBUG)
        consoleHandler.setFormatter(logFormatter)
        logger.addHandler(consoleHandler)

    def status_change(self, status):
            log_status.info(str(status))

class upgradeThread ():
    def __init__(self, link):
        self.upgradethreadstatus = upgradestatus()
    self.upgradethreadstatus.status_change("Entered upgrade routine")
    procoutput = 'very huge logs, mine were 145091 characters'
    self.upgradethreadstatus.status_change(procoutput)
    self.upgradethreadstatus.status_change("Exiting upgrade routine")

if __name__ == '__main__':
    upgradeclass = upgradeThread(sys.argv[1:]

If I run first script, both scripts hangs,The problem seems to be with the code while process.poll() is None , If I comment this code every thing works fine. (Not able to relate this with my problem!!)

PS I tried to debug python logging classes too in which i found that the process gets stuck with emit function of StreamHandlerclass, in which it stucks at stream.write function call and doesn't come out after writing the huge logs, however my exit logs are not coming.

So what can be the problem in these scripts by which a deadlock situation is arising ?

Edit 1 (code with threading)

script.py

#!/usr/bin/env python 

import subprocess
import time
import sys
import threading

def launch():
    command = ['python', 'script2.py']
    process = subprocess.Popen(command,stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
    output = process.communicate()

t = threading.Thread(target=launch)
t.start()

chars = ["/","-","\\","|"]
i = 0

while t.is_alive:
    print chars[i],
    sys.stdout.flush()
    time.sleep(.3)
    print "\b\b\b",
    sys.stdout.flush()
    i = (i + 1)%4
t.join()

script2.py

#!/usr/bin/env python  
import os
import sys
import logging as log_status
import time

class upgradestatus():
    def __init__(self):
        if (os.path.exists("/tmp/updatestatus.txt")):
                        os.remove("/tmp/updatestatus.txt")

        logFormatter = log_status.Formatter("%(asctime)s [%(levelname)-5.5s]  %(message)s")
        logger = log_status.getLogger()
        logger.setLevel(log_status.DEBUG)

        fileHandler = log_status.FileHandler('/tmp/updatestatus.txt', "a")
        fileHandler.setLevel(log_status.DEBUG)
        fileHandler.setFormatter(logFormatter)
        logger.addHandler(fileHandler)

        consoleHandler = log_status.StreamHandler()
        consoleHandler.setLevel(log_status.DEBUG)
        consoleHandler.setFormatter(logFormatter)
        logger.addHandler(consoleHandler)

    def status_change(self, status):
        log_status.info(str(status))

class upgradeThread ():
    def __init__(self, link):
        self.upgradethreadstatus = upgradestatus()
        self.upgradethreadstatus.status_change("Entered upgrade routine")
        procoutput = "Please put logs of 145091 characters over here otherwise the situtation wouldn't remain same or run any command whose output is larger then 145091 characters"
    self.upgradethreadstatus.status_change(procoutput)
        time.sleep(1)
        self.upgradethreadstatus.status_change("Exiting upgrade routine")

if __name__ == '__main__':
    upgradeclass = upgradeThread(sys.argv[1:])

In this case t.is_alive function is not returning false (No idea, but launch function has already returned so ideally it should return false!!:()

Tejendra
  • 1,874
  • 1
  • 20
  • 32

1 Answers1

5

The stdout buffer is constipated.

The process.communicate() call is not getting executed since it comes after while process.poll() is None:. So writeIssue.py is trying to write too many bytes to stdout and it is all getting buffered in the subprocess.PIPE and will not be pulled out of the PIPE until communicate is called.

The buffer has finite size. When the buffer is full, stream.write will block until the buffer has space. If the buffer is never emptied (as is happening in your code), then the process deadlocks.

The fix is to call communicate() before the buffer is completely filled. You could do that by launching the writeIssue.py in a thread and calling communicate() concurrently while the while-thread-is-alive loop is run in the main thread.


script.py:

import subprocess
import time
import sys
import threading

def launch():
    command = ['python', 'script2.py']
    process = subprocess.Popen(command)
    process.communicate()

t = threading.Thread(target=launch)
t.start()

chars = ["/","-","\\","|"]
i = 0
while t.is_alive():
    print chars[i],
    sys.stdout.flush()
    time.sleep(.3)
    print "\b\b\b",
    sys.stdout.flush()
    i = (i + 1)%4

t.join()

script2.py:

import sys
import logging
import time

class UpgradeStatus():
    def __init__(self):
        logFormatter = logging.Formatter("%(asctime)s [%(levelname)-5.5s]  %(message)s")
        self.logger = logging.getLogger()
        self.logger.setLevel(logging.DEBUG)

        consoleHandler = logging.StreamHandler()
        consoleHandler.setLevel(logging.DEBUG)
        consoleHandler.setFormatter(logFormatter)
        self.logger.addHandler(consoleHandler)

    def status_change(self, status):
        self.logger.info(str(status))

class UpgradeThread():
    def __init__(self, link):
        self.upgradethreadstatus = UpgradeStatus()
        self.upgradethreadstatus.status_change("Entered upgrade routine")
        for i in range(5):
            procoutput = 'very huge logs, mine were 145091 characters'
            self.upgradethreadstatus.status_change(procoutput)
            time.sleep(1)
        self.upgradethreadstatus.status_change("Exiting upgrade routine")

if __name__ == '__main__':
    upgradeclass = UpgradeThread(sys.argv[1:])

Note that if you have two threads writing to stdout concurrently, the output will get garbled. If you want to avoid that, then all output should be handled by a single thread armed with a queue. All other threads or processes that wish to write output should push strings or logging records to the queue for the dedicated output thread to handle.

That output thread can then use a for loop to pull output from the queue:

for message from iter(queue.get, None):
    print(message)
unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677
  • gotcha! this was the info i was looking for...Well do you have any idea weather I can clear(empty) the buffer of already running process, so that stream.write can move ahead ? – Tejendra Jan 06 '16 at 04:37
  • When `subprocess.Popen` is called with `stdout=subprocess.PIPE`, the stdout pipe is "cleared" by calling `process.stdout.read()`. The problem with calling `read()` directly (such as in your main thread's loop) is that it blocks. There are at least two ways around this problem: call `read` in a thread -- that's what communicate does, and also the code above, or use `select.select` to poll stdout and stderr to find out if there is something to be read. There is an example using this latter approach [here](http://stackoverflow.com/q/17190221/190597). – unutbu Jan 06 '16 at 13:44
  • @unutbu...well i tried your example but it didn't work out for me,,the difference between mine and your code is that i am not using "for i in range(5):", instead i am taking a very huge string in procoutput and logging it....script2.py is still stuck...however the script is now stuck after logging the last log "Exiting upgrade routine" – Tejendra Jan 10 '16 at 05:50
  • Avoid using `shell=True` if possible since it can be [a security risk](https://docs.python.org/2/library/subprocess.html#frequently-used-arguments). If you use `shell=True` then `command` should be a string. If you use `shell=False`, then `command` should be a list. Also, there is an IndentationError in your scipt2.py. – unutbu Jan 11 '16 at 11:13
  • With `shell=True` and `command = ['python', 'script2.py']`, the command that gets executed (on Unix) is `/bin/sh -c python script2.py`. Try it; you'll see it hangs because `script2.py` is getting passed to `/bin/sh`, not `python`, and so you simply get a python interpreter waiting for input. The python process never ends. If you use `shell=True` but make `command = "python script2.py"`, then the command that gets executed is `/bin/sh -c "python script2.py"`. The quotes around `python script2.py` makes `/bin/sh` execute `python script2.py` thus passing `script2.py` to `python`. – unutbu Jan 11 '16 at 11:19
  • @unutbu...even if i don't use PIPE in stdout and stdin and take shell=False, t.is_alive is still stuck...that seems to a different problem :( – Tejendra Jan 11 '16 at 11:23
  • How about redirecting stdout and stederr to a `NUL` [http://stackoverflow.com/questions/6735917/redirecting-stdout-to-nothing-in-python], **Any Comments ?** – Tejendra Jan 15 '16 at 13:31