
I am trying to run a set of steps in an Oozie workflow. One of the steps runs a Java program that reads its arguments from a job.properties.template file. How do I schedule this on an Azure HDInsight cluster? (I already have a cluster running.)

Also, is there any way to log on to the head node of the HDInsight cluster, the way we SSH into the master node of an EMR cluster? I read about RDP (Remote Desktop Protocol) somewhere. It would be useful if someone could give a few more pointers on this.

Riddhi Rathod

2 Answers


These articles give you some basic ideas on using Oozie and Oozie coordinator in HDInsight:

http://azure.microsoft.com/en-us/documentation/articles/hdinsight-use-oozie/
http://azure.microsoft.com/en-us/documentation/articles/hdinsight-use-oozie-coordinator-time/

This article covers Java MapReduce program development and deployment:

http://azure.microsoft.com/en-us/documentation/articles/hdinsight-develop-deploy-java-mapreduce/

Jonathan Gao

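To run a Java program through Oozie on HDInsight (for example, from a remote desktop session on the cluster), try the following.

  1. Put your JAR in the lib folder of the workflow application directory, add your properties and XML files, and then upload everything to your blob storage.

Example:

workflow.xml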

<workflow-app name="WorkflowJavaMainAction" xmlns="uri:oozie:workflow:0.2">
    <start to="javaMainAction"/>
    <action name="javaMainAction">
        <java>
            <job-tracker>jobtrackerhost:9010</job-tracker>
            <name-node>wasb://xxx@yyy.blob.core.windows.net</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>default</value>
                </property>
            </configuration>
            <main-class>packagename.classname</main-class>
        </java>
        <ok to="end"/>
        <error to="killJobAction"/>
    </action>
    <kill name="killJobAction">
        <message>"Killed job due to error: ${wf:errorMessage(wf:lastErrorNode())}"</message>
    </kill>
    <end name="end" />
</workflow-app>
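
Since the question is about a Java program that reads its arguments from a properties file, here is a rough sketch (not tested on HDInsight) of how the java action above could forward values from job.properties to the program's main(String[] args). It uses the <arg> element of the Oozie java action, placed after <main-class>. The property names inputDir and outputDir are hypothetical and would have to be defined in job.properties.

```xml
<!-- Sketch only: inputDir and outputDir are hypothetical properties defined in job.properties -->
<java>
    <job-tracker>jobtrackerhost:9010</job-tracker>
    <name-node>wasb://xxx@yyy.blob.core.windows.net</name-node>
    <main-class>packagename.classname</main-class>
    <!-- Each <arg> is passed to main() as one element of args[], in the order listed -->
    <arg>${inputDir}</arg>
    <arg>${outputDir}</arg>
</java>
```

With this in place the program would see args[0] and args[1] instead of having to locate and parse the properties file itself.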

coordinator.xml:

<coordinator-app end="${endTime}" frequency="${frequency}" name="sample_update" start="${startTime}" timezone="${timezone}" xmlns="uri:oozie:coordinator:0.2">
    <controls>
        <timeout>5</timeout>
        <concurrency>1</concurrency>
    </controls>
    <action>
        <workflow>
            <app-path>wasb://xxx@yyy.blob.core.windows.net/user/hdp/ooziejava/workflow.xml</app-path>
        </workflow>
    </action>
</coordinator-app>
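
If the workflow needs a value that is only known per scheduled run, the coordinator's <workflow> element can also carry a <configuration> block. The fragment below is a minimal sketch of that idea, assuming a hypothetical property name runTime. coord:nominalTime() is the Oozie EL function that returns the time the action was scheduled for.

```xml
<!-- Sketch only: runTime is a hypothetical property name chosen for illustration -->
<action>
    <workflow>
        <app-path>wasb://xxx@yyy.blob.core.windows.net/user/hdp/ooziejava/workflow.xml</app-path>
        <configuration>
            <property>
                <name>runTime</name>
                <!-- Nominal (scheduled) time of this coordinator action -->
                <value>${coord:nominalTime()}</value>
            </property>
        </configuration>
    </workflow>
</action>
```

The workflow could then refer to the value as ${runTime}.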

job.properties

oozie.use.system.libpath=true
oozie.coord.application.path=wasb://xxx@yyy.blob.core.windows.net/user/hdp/ooziejava/coordinator.xml
startTime=2014-11-16T07:30Z
endTime=2014-11-23T04:50Z
frequency=15
timezone=GMT+0530

Suresh Ram