
I have an Oozie workflow running on a CDH4 cluster of four machines (one master-for-everything and three "dumb" workers). The Hive metastore runs on the master using MySQL (the driver is present), and the Oozie server also runs on the master, using MySQL as well. Through the web interface I can import into and query Hive as expected, but when I run the same queries within an Oozie workflow, they fail. Even adding "IF EXISTS" leads to the error below. I tried adding the connection information as properties to the Hive job, without any success.

Can anybody give me a hint? Did I miss anything? Any further information needed?

This is the output of the job's log:

  Script [drop.sql] content:
  ------------------------
  DROP TABLE IF EXISTS performance_log;

  ------------------------

  Hive command arguments :
  -f
  drop.sql

  =================================================================

  >>> Invoking Hive command line now >>>

  Intercepting System.exit(10001)

  <<< Invocation of Main class completed <<<

  Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [10001]

  Oozie Launcher failed, finishing Hadoop job gracefully

And this is the error message:

  FAILED: SemanticException [Error 10001]: Table not found performance_log
  Intercepting System.exit(10001)
  Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [10001]
Mario Mueller
  • This Hue blog post demonstrates how to [run a Hive action](http://gethue.tumblr.com/post/60937985689/hadoop-tutorials-ii-2-execute-hive-queries-and) in an Oozie workflow. – Romain Sep 19 '13 at 04:40

2 Answers


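The problem is that the other nodes don't know where your MySQL metastore is. The Hive action runs on one of the worker nodes, so without that information it cannot find the table, and you get the "Table not found" error.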
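For context, this is the kind of metastore connection information that hive-site.xml carries and that the worker nodes are missing. It is only a sketch: the property names are the standard Hive metastore ones, but the host name `master.example.com`, the database name `metastore`, and the credentials are placeholders I made up, not values from the question. When these properties are absent, Hive falls back to its default embedded Derby metastore, which is local to the node and would not contain your tables.

```xml
<!-- Excerpt of a hive-site.xml for a MySQL-backed metastore.
     Host, database name, user and password are hypothetical placeholders. -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://master.example.com:3306/metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>changeme</value>
  </property>
</configuration>
```

If the copy of hive-site.xml on your master already contains entries like these, that is the file the Hive action needs to see.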

You need to do two things:

  1. Copy hive-site.xml into the Oozie workflow directory (in HDFS).
  2. In your Hive action, tell Oozie to use that hive-site.xml via the job-xml element.

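Something like this:

  <action name="hive-node">
    <hive xmlns="uri:oozie:hive-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- hive-site.xml copied into the workflow directory -->
      <job-xml>hive-site.xml</job-xml>
      <script>drop.sql</script>
    </hive>
    <!-- ok/error transitions stay as in your existing workflow -->
  </action>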

This should work.

Thanks

Suvarna Pattayil
user2230605
  • If you are in Hue, remember to set this in the workflow properties so that it applies to all the Hive nodes/steps in the workflow. – B.Mr.W. Jul 10 '14 at 14:21
  • This is the ideal solution given how Oozie currently works. Honestly, we need a way for Oozie to pick the file up from one centralized location rather than copying it into the Oozie job directory in HDFS ourselves. After all, for every change to hive-site.xml we have to replace the copy in every Oozie job directory. That is prone to errors and inconsistencies, in my opinion. – Suvarna Pattayil Mar 31 '16 at 19:39

I ran into the same problem. Apart from the solution mentioned above of specifying hive-site.xml properly, I would recommend checking the following as well:

  1. Check that the MySQL connector JAR is available on the classpath (if you are using MySQL as the metastore).
  2. For Oozie Hive actions, check that you are not adding the Hive JARs more than once, e.g. they are already present in the Oozie share lib and you have also copied them into workflow/lib.
Abhishek Gayakwad