
We have built a jar of our Spark (Scala) code and uploaded it to AWS EMR through S3. We intend to run this Spark code through Apache Livy. After copying the jar to the cluster, we run the following command to make the jar accessible to Livy:

hadoop fs -put /myjar.jar /

Our proof-of-concept EMR cluster has one m5.xlarge instance as master and no other nodes. We have enabled only Spark and Livy on the EMR cluster.

Now we submit our Spark jobs with a POST to http://our-machine-ssh:8998/batches with this body:

{
  "name" : "requestName",
  "className" : "ourclassname",
  "file" : "jarName",
  "args" : ["stringArgs"]
}
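For reference, the POST above can be scripted instead of sent by hand. A minimal sketch in Python using only the standard library; the Livy URL, jar path, and class name are placeholders taken from the question, not verified values:

```python
import json
import urllib.request

# Assumed Livy endpoint from the question; replace with your master node's address.
LIVY_URL = "http://our-machine-ssh:8998/batches"

def build_batch_payload(name, class_name, jar, args):
    """Build the JSON body for a Livy POST /batches request."""
    return {
        "name": name,
        "className": class_name,
        "file": jar,   # e.g. "hdfs:///myjar.jar" after the hadoop fs -put above
        "args": args,
    }

def submit_batch(payload, url=LIVY_URL):
    """POST the batch to Livy and return the parsed response (contains the batch id and state)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```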

The issue is that I want to make a queue of these POST requests so that Livy runs them one by one rather than in parallel. Also, if we send requests to the port in quick succession, Livy gives an error that the SparkUI was not found. I am facing these two issues, and any solution to them would be really helpful.
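One way to get the one-by-one behaviour from the client side is to poll each batch's state (Livy exposes GET /batches/{batchId}/state) and only submit the next job once the previous one reaches a terminal state. A minimal sketch, where `submit` and `make_get_state` are hypothetical callables the caller supplies (e.g. thin wrappers over the Livy REST endpoints):

```python
import time

# Livy terminal batch states; anything else means the batch is still in flight.
TERMINAL_STATES = {"success", "dead", "killed", "error"}

def wait_until_done(get_state, poll_interval=5.0):
    """Poll get_state() until the batch reaches a terminal state, then return that state.

    get_state is any zero-argument callable returning the current batch state,
    e.g. one that GETs /batches/{batchId}/state and reads the "state" field.
    """
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_interval)

def run_queue(jobs, submit, make_get_state, poll_interval=5.0):
    """Submit each job only after the previous one has finished.

    submit(job) POSTs to /batches and returns the new batch id;
    make_get_state(batch_id) returns a get_state callable for that batch.
    """
    results = []
    for job in jobs:
        batch_id = submit(job)
        final_state = wait_until_done(make_get_state(batch_id), poll_interval)
        results.append((batch_id, final_state))
    return results
```

This serializes the submissions without touching YARN queue configuration, at the cost of the client process staying alive while the jobs run.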

Saad Zia
  • I have realized that I need to add a "queue" key to my POST request, but I am unable to find the "queue" value. – Saad Zia Dec 18 '19 at 07:30
  • Can you try the following? You are probably doing the Livy submit via Lambda. In that case, first query YARN (over its REST endpoint) for the jobs started in the last `n` minutes, and only do the Livy submit if their count is below `m`. You can adjust n and m based on your cluster configuration (for example, start with n = 1, m = 2). This gives you the queue you wanted, but in a different way. – chendu Dec 23 '19 at 06:02
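The throttling approach suggested in the comments can be sketched as follows. The endpoint shape follows the Hadoop ResourceManager REST API (the Cluster Applications endpoint, /ws/v1/cluster/apps); the ResourceManager host and the threshold `max_active` are assumptions to adjust for your cluster:

```python
import json
import urllib.request

# Assumed ResourceManager address; on EMR this is the master node, default port 8088.
RM_APPS_URL = "http://resource-manager-host:8088/ws/v1/cluster/apps?states=ACCEPTED,RUNNING"

def count_active_apps(apps_json):
    """Count applications in a ResourceManager /ws/v1/cluster/apps response.

    The RM returns {"apps": {"app": [...]}} when applications match,
    and {"apps": null} when none do.
    """
    apps = apps_json.get("apps") or {}
    return len(apps.get("app") or [])

def should_submit(apps_json, max_active=2):
    """Allow a new Livy submit only when fewer than max_active apps are in flight."""
    return count_active_apps(apps_json) < max_active

def fetch_apps(url=RM_APPS_URL):
    """GET the currently accepted/running applications from the YARN ResourceManager."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A Lambda (or any scheduler) would call fetch_apps(), check should_submit(), and either POST the batch to Livy or retry later; this avoids the burst of simultaneous batches that appears to trigger the SparkUI error.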
