I'm trying to run an Oozie coordinator with a java action workflow that runs a Camus mapper job. The coordinator appears to run and starts the workflow every 20 minutes, but the workflow then runs indefinitely, even though the same job completes in a few minutes when run on its own. I think the problem is either in how I run the job or in how the arguments are passed, but I'm not sure how to debug it. Here is the code:
/coord/job.properties
oozie.coord.application.path=hdfs://10.0.2.15:8020/user/hue/app/coord/coordinator.xml
name=camus
frequency=20
start=2015-07-30T11:40Z
end=2016-07-30T11:40Z
timezone=GMT+0530
workflow=hdfs://10.0.2.15:8020/user/hue/app/workflow/workflow.xml
nameNode=hdfs://10.0.2.15:8020
jobTracker=10.0.2.15:8021
queueName=default
properties=${nameNode}/user/hue/app/workflows/lib/config.properties
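As a quick check on what the `${properties}` argument expands to, here is a minimal Python sketch. It only mimics simple `${name}` substitution and is not Oozie's own EL evaluator; the helper names `parse_properties` and `resolve` are made up for illustration. It prints the resolved config path, the directory that holds workflow.xml, and whether the first lies under the second.

```python
import re

# Subset of the job.properties shown above.
JOB_PROPERTIES = """\
nameNode=hdfs://10.0.2.15:8020
workflow=hdfs://10.0.2.15:8020/user/hue/app/workflow/workflow.xml
properties=${nameNode}/user/hue/app/workflows/lib/config.properties
"""

def parse_properties(text):
    """Parse key=value lines into a dict (hypothetical helper)."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props

def resolve(value, props):
    """Replace each ${name} with its value; unknown names are left as-is."""
    return re.sub(r"\$\{([^}]+)\}",
                  lambda m: props.get(m.group(1), m.group(0)),
                  value)

props = parse_properties(JOB_PROPERTIES)
config_path = resolve(props["properties"], props)
workflow_dir = props["workflow"].rsplit("/", 1)[0]
under_workflow_dir = config_path.startswith(workflow_dir + "/")

print(config_path)
print(workflow_dir)
print(under_workflow_dir)
```

With the values above, the config path points into `workflows/lib` while workflow.xml sits in `workflow`, so the last line prints `False`. Whether that difference is related to the hang is not established here.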
/coord/coordinator.xml
<coordinator-app name="${name}" frequency="${frequency}" start="${start}" end="${end}" timezone="${timezone}" xmlns="uri:oozie:coordinator:0.1">
  <action>
    <workflow>
      <app-path>${workflow}</app-path>
    </workflow>
  </action>
</coordinator-app>
/workflow/workflow.xml
<workflow-app xmlns='uri:oozie:workflow:0.4' name='camus-wf'>
  <start to='camus_job' />
  <action name='camus_job'>
    <java>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <main-class>com.linkedin.camus.etl.kafka.CamusJob</main-class>
      <arg>-P</arg>
      <arg>${properties}</arg>
    </java>
    <ok to="end" />
    <error to="fail" />
  </action>
  <kill name="fail">
    <message>Camus Job Failed</message>
  </kill>
  <end name='end' />
</workflow-app>
The shaded jar and config.properties are located in /workflow/lib/.
I'm running HDP 2.2.
Coordinator Logs:
2015-08-03 06:43:43,820 INFO CoordSubmitXCommand:543 - SERVER[sandbox.hortonworks.com] USER[root] GROUP[-] TOKEN[] APP[camus] JOB[0000000-150803063131195-oozie-oozi-C] ACTION[-] ENDED Coordinator Submit jobId=0000000-150803063131195-oozie-oozi-C
2015-08-03 06:43:43,935 INFO CoordMaterializeTransitionXCommand:543 - SERVER[sandbox.hortonworks.com] USER[root] GROUP[-] TOKEN[] APP[camus] JOB[0000000-150803063131195-oozie-oozi-C] ACTION[-] materialize actions for tz=Coordinated Universal Time,
start=Thu Jul 30 11:40:00 UTC 2015, end=Thu Jul 30 15:40:00 UTC 2015,
timeUnit 12,
frequency :20:MINUTE,
lastActionNumber 0
2015-08-03 06:43:43,971 INFO CoordMaterializeTransitionXCommand:543 - SERVER[sandbox.hortonworks.com] USER[root] GROUP[-] TOKEN[] APP[camus] JOB[0000000-150803063131195-oozie-oozi-C] ACTION[-] [0000000-150803063131195-oozie-oozi-C]: Update status from PREP to RUNNING
2015-08-03 06:43:44,113 INFO CoordActionInputCheckXCommand:543 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-150803063131195-oozie-oozi-C] ACTION[0000000-150803063131195-oozie-oozi-C@1] [0000000-150803063131195-oozie-oozi-C@1]::CoordActionInputCheck:: Missing deps:
2015-08-03 06:43:44,209 INFO CoordActionNotificationXCommand:543 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-150803063131195-oozie-oozi-C] ACTION[0000000-150803063131195-oozie-oozi-C@1] STARTED Coordinator Notification actionId=0000000-150803063131195-oozie-oozi-C@1 : WAITING
...
2015-08-03 06:43:44,267 INFO CoordActionNotificationXCommand:543 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-150803063131195-oozie-oozi-C] ACTION[0000000-150803063131195-oozie-oozi-C@12] No Notification URL is defined. Therefore nothing to notify for job 0000000-150803063131195-oozie-oozi-C action ID 0000000-150803063131195-oozie-oozi-C@12
2015-08-03 06:43:44,268 INFO CoordActionNotificationXCommand:543 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-150803063131195-oozie-oozi-C] ACTION[0000000-150803063131195-oozie-oozi-C@12] ENDED Coordinator Notification actionId=0000000-150803063131195-oozie-oozi-C@12
2015-08-03 06:43:44,433 WARN ParameterVerifier:546 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-150803063131195-oozie-oozi-C] ACTION[0000000-150803063131195-oozie-oozi-C@1] The application does not define formal parameters in its XML definition
...
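To sanity-check the materialization line in the coordinator log, here is a minimal Python sketch that counts how many 20-minute nominal times fall in the logged window (start 11:40, end 15:40 on Jul 30). It assumes the end of the window is exclusive, and the function name `count_materialized` is made up for illustration; it is not part of Oozie.

```python
from datetime import datetime, timedelta

def count_materialized(start, end, frequency_minutes):
    """Count nominal times in [start, end) spaced frequency_minutes apart."""
    count = 0
    nominal = start
    while nominal < end:
        count += 1
        nominal += timedelta(minutes=frequency_minutes)
    return count

# Window and frequency taken from the CoordMaterializeTransitionXCommand log line.
window_start = datetime(2015, 7, 30, 11, 40)
window_end = datetime(2015, 7, 30, 15, 40)
actions = count_materialized(window_start, window_end, 20)

print(actions)
```

This gives 12, which matches the highest action number (`@12`) in the log: because the coordinator's start time was already in the past when it was submitted, a batch of actions was created at once.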
Workflow Logs:
2015-08-03 06:43:44,672 INFO ActionStartXCommand:543 - SERVER[sandbox.hortonworks.com] USER[root] GROUP[-] TOKEN[] APP[camus-wf] JOB[0000001-150803063131195-oozie-oozi-W] ACTION[0000001-150803063131195-oozie-oozi-W@:start:] Start action [0000001-150803063131195-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2015-08-03 06:43:44,673 INFO ActionStartXCommand:543 - SERVER[sandbox.hortonworks.com] USER[root] GROUP[-] TOKEN[] APP[camus-wf] JOB[0000001-150803063131195-oozie-oozi-W] ACTION[0000001-150803063131195-oozie-oozi-W@:start:] [***0000001-150803063131195-oozie-oozi-W@:start:***]Action status=DONE
2015-08-03 06:43:44,673 INFO ActionStartXCommand:543 - SERVER[sandbox.hortonworks.com] USER[root] GROUP[-] TOKEN[] APP[camus-wf] JOB[0000001-150803063131195-oozie-oozi-W] ACTION[0000001-150803063131195-oozie-oozi-W@:start:] [***0000001-150803063131195-oozie-oozi-W@:start:***]Action updated in DB!
2015-08-03 06:43:45,104 INFO ActionStartXCommand:543 - SERVER[sandbox.hortonworks.com] USER[root] GROUP[-] TOKEN[] APP[camus-wf] JOB[0000001-150803063131195-oozie-oozi-W] ACTION[0000001-150803063131195-oozie-oozi-W@camus_job] Start action [0000001-150803063131195-oozie-oozi-W@camus_job] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]