It seems that Apache Oozie does not currently support Spark jobs. Am I right? Is there any way to integrate Spark jobs into Oozie?
Possible duplicate of [launching a spark program using oozie workflow](http://stackoverflow.com/questions/29233487/launching-a-spark-program-using-oozie-workflow) – Joe Kennedy Jul 27 '15 at 19:13
2 Answers
1
You can always run Spark as a Java action by invoking `org.apache.spark.deploy.SparkSubmit` as the main class, as shown below. Alternatively, you can use the Spark action in Oozie; its schema is defined here: https://github.com/apache/oozie/blob/master/client/src/main/resources/spark-action-0.1.xsd
<java>
<main-class>org.apache.spark.deploy.SparkSubmit</main-class>
<arg>--class</arg>
<arg>${spark_main_class}</arg>
<arg>--deploy-mode</arg>
<arg>cluster</arg>
<arg>--master</arg>
<arg>yarn</arg>
<arg>--queue</arg>
<arg>${queue_name}</arg> <!-- depends on your Oozie config -->
<arg>--num-executors</arg>
<arg>${spark_num_executors}</arg>
<arg>--executor-cores</arg>
<arg>${spark_executor_cores}</arg>
<arg>${spark_app_file}</arg>
<arg>${input}</arg> <!-- an application argument -->
<arg>${output}</arg> <!-- another application argument -->
<file>${spark_app_file}</file>
<file>${name_node}/user/spark/share/lib/spark-assembly.jar</file>
</java>
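
For comparison, here is a rough sketch of what the same job might look like with the Spark action from the `spark-action-0.1` schema linked above. It assumes an Oozie build that actually ships the Spark action; the workflow name, node names, and the `${job_tracker}` property are placeholders I made up, and the other properties are reused from the Java action above. Treat it as a starting point, not a tested workflow.

```xml
<workflow-app name="spark-example-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="spark-node"/>
    <action name="spark-node">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <!-- job_tracker is a hypothetical property; name_node matches the Java action above -->
            <job-tracker>${job_tracker}</job-tracker>
            <name-node>${name_node}</name-node>
            <!-- mirrors "master yarn" and "deploy-mode cluster" from the Java action -->
            <master>yarn</master>
            <mode>cluster</mode>
            <name>Spark example</name>
            <class>${spark_main_class}</class>
            <jar>${spark_app_file}</jar>
            <!-- remaining spark-submit options go here as a single string -->
            <spark-opts>--queue ${queue_name} --num-executors ${spark_num_executors} --executor-cores ${spark_executor_cores}</spark-opts>
            <!-- application arguments, in order -->
            <arg>${input}</arg>
            <arg>${output}</arg>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Spark action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The child elements of `<spark>` have to appear in the order the schema defines, so check the XSD for your Oozie version if validation fails.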

Karthik
1
Oozie support for Spark is coming; see the Jira. It is currently available only in trunk.
Until then, the options are to run Spark as a Java action or a Shell action.
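
A minimal sketch of the Shell action route, assuming `spark-submit` is on the `PATH` of whichever cluster node ends up running the action (that is not guaranteed, so you may need an absolute path or a wrapper script). All `${...}` properties and the node names here are hypothetical and would come from your own `job.properties` and workflow.

```xml
<workflow-app name="spark-shell-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="spark-shell-node"/>
    <action name="spark-shell-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${job_tracker}</job-tracker>
            <name-node>${name_node}</name-node>
            <!-- assumes spark-submit is installed on every node that can run the action -->
            <exec>spark-submit</exec>
            <argument>--class</argument>
            <argument>${spark_main_class}</argument>
            <argument>--master</argument>
            <argument>yarn</argument>
            <argument>--deploy-mode</argument>
            <argument>cluster</argument>
            <!-- application jar followed by its own arguments -->
            <argument>${spark_app_file}</argument>
            <argument>${input}</argument>
            <argument>${output}</argument>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

If `${spark_app_file}` is an HDFS path, whether `spark-submit` can read it directly depends on your Spark and cluster setup.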

dpeacock