I am running Oozie 4.0.1 on Elastic MapReduce using the 3.0.4 AMI (Hadoop 2.2.0). I've built Oozie from source, and everything installs and seems to work correctly, right up to the point of actually running a Hive job. That is, I can connect to the Web Console, submit and kill jobs using the 'oozie' command, etc. BUT... I find that actions (I've tried "Hive" and "Shell" so far) go into PREP state (according to the Oozie Web Console) but never actually start.
I've tried both coordinator (cron-style) jobs and basic workflow jobs, and I get the same behavior in both cases: the workflow reaches the Hive or shell action node and then hangs.
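For what it's worth, the coordinator variant is just a thin wrapper that triggers the same workflow; it's shaped roughly like this (the name, frequency, dates, and the ${workflowAppUri} property are placeholders here, not my exact values):

<coordinator-app name="shell-coord" frequency="${coord:minutes(10)}"
                 start="2014-05-01T00:00Z" end="2014-05-02T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.2">
    <action>
        <workflow>
            <!-- workflowAppUri points at the workflow directory in HDFS; defined in the coordinator's job.properties -->
            <app-path>${workflowAppUri}</app-path>
        </workflow>
    </action>
</coordinator-app>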
For the basic workflow job, here's what the job.properties looks like:
nameNode=hdfs://ip-redacted.ec2.internal:9000
jobTracker=ip-redacted.ec2.internal:9026
queueName=default
examplesRoot=examples
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/shell
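I'm submitting the workflow the standard way, roughly like this (assuming the Oozie server is listening on localhost:11000 and the file above is saved as job.properties in the directory I run the command from):

oozie job -oozie http://localhost:11000/oozie -config job.properties -run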
and the workflow.xml looks like:
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
    <start to="shell-node"/>
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>echo</exec>
            <argument>my_output=Hello Oozie</argument>
            <capture-output/>
        </shell>
        <ok to="check-output"/>
        <error to="fail"/>
    </action>
    <decision name="check-output">
        <switch>
            <case to="end">
                ${wf:actionData('shell-node')['my_output'] eq 'Hello Oozie'}
            </case>
            <default to="fail-output"/>
        </switch>
    </decision>
    <kill name="fail">
        <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <kill name="fail-output">
        <message>Incorrect output, expected [Hello Oozie] but was [${wf:actionData('shell-node')['my_output']}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
I don't see any messages in the oozie.log file that look particularly incriminating.
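In case it helps, this is roughly how I've been checking on the stuck job (the job ID below is just a placeholder):

# show the workflow's status and per-action state (the action just sits in PREP)
oozie job -oozie http://localhost:11000/oozie -info 0000001-140101000000000-oozie-oozi-W

# pull whatever the Oozie server has logged for the job
oozie job -oozie http://localhost:11000/oozie -log 0000001-140101000000000-oozie-oozi-W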
Any thoughts or advice are much appreciated.