Oozie has a config property called oozie.launcher.action.main.class where you can pass in the name of a "main class" for a map-reduce action (or a shell action), like so:

  <configuration>
    <property>
      <name>oozie.launcher.action.main.class</name>
      <value>com.company.MyCascadingClass</value>
    </property>
  </configuration>

But I need to pass arguments to my main class and can't see a way to do it. Any ideas?

I'm asking because I'm trying to launch a Cascading class/flow from within Oozie and all options I've tried so far have failed. If anyone has gotten Cascading to work from Oozie, let me know and I'll post another question asking that in particular.

quux00
    Did you try java action syntax: http://oozie.apache.org/docs/3.3.2/WorkflowFunctionalSpec.html#a3.2.7_Java_Action ? – Dmitry Oct 09 '13 at 11:40
    I didn't think it would work with the java action because the Cascading job needs the Hadoop jars in the classpath, but fortunately, oozie puts them in the cp for free! So it works. If you put your comment as an answer I'll accept it. – quux00 Oct 09 '13 at 13:34

1 Answer

As of Oozie 3 (I haven't tried Oozie 4 yet), the answer to my main question is: you can't. Strangely, there is no facility for passing arguments to the main class specified with the oozie.launcher.action.main.class property.

@Dmitry's suggestion in the comments to use the Oozie java action instead works for a Cascading job (or any Hadoop-dependent job), because Oozie puts all the Hadoop jars on the classpath when it launches the job.

I've documented a working example of launching a Cascading job from Oozie at my blog here: http://thornydev.blogspot.com/2013/10/launching-cascading-job-from-apache.html

Here is the workflow.xml file that worked for me:

<workflow-app xmlns='uri:oozie:workflow:0.2' name='cascading-wf'>
  <start to='stage1' />
  <action name='stage1'>
    <java>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>

      <configuration>
        <property>
          <name>mapred.job.queue.name</name>
          <value>${queueName}</value>
        </property>
      </configuration>

      <main-class>com.mycompany.MyCascade</main-class>
      <java-opts></java-opts>
      <arg>/user/myuser/dir1/dir2</arg>
      <arg>my-arg-2</arg>
      <arg>my-arg-3</arg>
      <file>lib/${EXEC}#${EXEC}</file> 
      <capture-output />
    </java>
    <ok to="end" />
    <error to="fail" />
  </action>

  <kill name="fail">
    <message>FAIL: Oh, the huge manatee!</message>
  </kill>

  <end name="end"/>
</workflow-app>

In the job.properties file that accompanies the workflow.xml, the EXEC property is defined as:

EXEC=mybig-shaded-0.0.1-SNAPSHOT.jar

and the job jar is placed in the lib subdirectory of the directory that contains these two definition files.
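For reference, here is a minimal sketch of what the receiving side might look like. It is not my actual Cascading class: the body of MyCascade, the describeArgs helper and the property keys (argCount, arg0, ...) are made up for illustration. It assumes the `<arg>` values arrive in order as the `String[]` passed to `main`, and that, because the workflow uses `<capture-output />`, Oozie expects a Java properties file written to the path given by the `oozie.action.output.properties` system property.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Properties;

// Hypothetical stand-in for com.mycompany.MyCascade; the real class
// would build and run the Cascading flow where indicated below.
public class MyCascade {

    // Illustrative helper: records the received arguments as properties
    // so they can be handed back to Oozie via <capture-output />.
    static Properties describeArgs(String[] args) {
        Properties props = new Properties();
        props.setProperty("argCount", String.valueOf(args.length));
        for (int i = 0; i < args.length; i++) {
            props.setProperty("arg" + i, args[i]);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Each <arg> element in workflow.xml becomes one entry of args, in order.
        if (args.length < 3) {
            // Throw rather than call System.exit(): an exception escaping main
            // is how a java action signals the error transition.
            throw new IllegalArgumentException(
                "expected 3 arguments, got " + args.length);
        }

        // ... build and run the Cascading flow with args[0], args[1], args[2] ...

        // Only set when the class is launched by Oozie, so guard against null
        // to keep the class runnable from the command line as well.
        String outputFile = System.getProperty("oozie.action.output.properties");
        if (outputFile != null) {
            try (OutputStream out = new FileOutputStream(outputFile)) {
                describeArgs(args).store(out, null);
            }
        }
    }
}
```

With the workflow above, args[0] would be /user/myuser/dir1/dir2, args[1] my-arg-2 and args[2] my-arg-3.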

quux00