I was wondering whether it is possible to have the Hive CLI flush messages to stderr as they occur. I am currently trying to execute a multi-stage query (this is just a sample, not the actual query):

SELECT COUNT(*) FROM (
    SELECT user FROM users
    WHERE datetime = '05-10-2013'
    UNION ALL
    SELECT user FROM users
    WHERE datetime = '05-10-2013'
) a

This launches 3 jobs; however, if job 1 fails because it is killed, I don't want job 2 to run. My code currently looks like the following, but Hive does not write to stderr until all the subqueries have finished, and only then does it return the error.

import subprocess
import time

def execute_hive_query(query):
    return_code = None
    cmd = ["hive", "-e", query]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # Poll every 10 seconds until the hive process exits.
    while return_code is None:
        out = proc.stdout.read()
        error = proc.stderr.read()
        handle_hive_exception(out, error)
        time.sleep(10)
        return_code = proc.poll()

def handle_hive_exception(stdout, stderr):
    # Treat any stderr output as a failure.
    if stderr != '':
        raise Exception(stderr)

Thanks!

Brad Ruderman
  • `.stdout.read()` will block until the subprocess closes its stdout, which usually happens when the subprocess exits. You need a non-blocking read, which could be implemented using [threads, or select.select, or fcntl, etc.](http://stackoverflow.com/q/375427/4279). In addition, you might encounter [the block buffering issue](http://stackoverflow.com/a/12471855/4279). – jfs Jun 01 '13 at 06:53
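
The thread-based approach mentioned in the comment above could look roughly like the sketch below. It is a minimal illustration, not a tested Hive integration: `run_streaming`, `stream_lines`, and the `is_fatal` predicate are hypothetical names introduced here, and what counts as a fatal line in Hive's stderr output is an assumption you would need to adjust. It does not address block buffering inside the child process.

```python
import queue
import subprocess
import threading

def stream_lines(pipe, tag, q):
    """Push each line read from pipe onto q as (tag, line); push (tag, None) at EOF."""
    for line in iter(pipe.readline, ''):
        q.put((tag, line))
    pipe.close()
    q.put((tag, None))

def run_streaming(cmd, is_fatal):
    """Run cmd and inspect stderr line by line as it arrives.

    If is_fatal(line) is true for a stderr line, the process is killed and
    RuntimeError is raised. Otherwise returns (return_code, stdout_text, stderr_lines).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True, bufsize=1)
    q = queue.Queue()
    # One reader thread per pipe, so neither pipe can fill up and stall the child.
    for tag, pipe in (("out", proc.stdout), ("err", proc.stderr)):
        threading.Thread(target=stream_lines, args=(pipe, tag, q), daemon=True).start()

    out_lines, err_lines = [], []
    open_pipes = 2
    while open_pipes:
        tag, line = q.get()  # blocks only until the next line or EOF marker
        if line is None:
            open_pipes -= 1
        elif tag == "out":
            out_lines.append(line)
        else:
            err_lines.append(line)
            if is_fatal(line):
                proc.kill()
                proc.wait()
                raise RuntimeError(line.strip())
    return proc.wait(), "".join(out_lines), err_lines
```

A call might look like `run_streaming(["hive", "-e", query], lambda line: "FAILED" in line)`; matching on "FAILED" is only an assumption about what a failing job prints. A predicate is used instead of treating any stderr output as an error, since the CLI may also write progress messages to stderr.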

1 Answer

I suspect that the stages of the query are being executed in parallel. If they were executed serially, the failure of one stage would cause the entire job to fail.

Try setting `hive.exec.parallel=false` for your query.
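
One way to apply that setting from the Python wrapper in the question is to pass it on the command line with the Hive CLI's `--hiveconf` option. The helper below is only a sketch; `build_hive_command` is a hypothetical name, and it just builds the argument list without running anything.

```python
def build_hive_command(query, parallel=False):
    """Build the hive CLI argument list with hive.exec.parallel set explicitly."""
    setting = "hive.exec.parallel=" + ("true" if parallel else "false")
    return ["hive", "--hiveconf", setting, "-e", query]
```

The returned list can be passed to `subprocess.Popen` in place of `["hive", "-e", query]`.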

Owen
  • Nope, that did not work; same issue. Even if it had worked, it wouldn't be a long-term fix, because I want the queries to run in parallel. I just want the errors flushed to standard error. I opened an issue on Hive's Jira here: https://issues.apache.org/jira/browse/HIVE-4631 – Brad Ruderman May 30 '13 at 03:21