
I currently have a multi-master node setup in AWS, with Livy installed on all three master nodes. Is there an endpoint that can tell me which of the three master nodes is currently the active one? I am trying to run Spark jobs via Livy.

naval jain

1 Answer

  1. You can run aws emr describe-cluster --cluster-id j-1K48XXXXXXHCB. The same call is also available through the Java and Python APIs.
  2. Livy can be configured to use ZooKeeper, meaning you can spin up three ZooKeeper nodes and configure Livy on top of them: https://github.com/apache/incubator-livy/blob/master/conf/livy.conf.template. That way you should be able to submit a job to any of the Livy instances (I haven't tried this myself, though).
  3. Did you consider AWS Glue? It also has workflows to connect jobs, and Glue is backward compatible with PySpark and Scala Spark.

Unless you intend to run EMR 24x7 and keep it above 80% of cluster CPU and cluster memory utilization, AWS Glue is the better choice. Remember that although AWS describes EMR as a managed solution, you still have to do the sizing, check resiliency, and so on.
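
For the first option, here is a rough Python sketch of the same describe-cluster call through boto3, plus a fallback that simply probes each master for a Livy instance that answers. It rests on several assumptions: Livy listens on its default port 8998, the helper names (master_dns_from_describe, livy_responds, first_reachable) are made up for illustration, and whether MasterPublicDnsName follows the active master after a failover in a multi-master cluster is something you would need to verify yourself. A node that answers GET /sessions is only reachable, not necessarily the "active" one.

```python
import http.client
import json
import urllib.request
from typing import Callable, Iterable, Optional

LIVY_PORT = 8998  # assumption: Livy's default port; change if your cluster overrides it


def master_dns_from_describe(response: dict) -> Optional[str]:
    """Extract Cluster.MasterPublicDnsName from a DescribeCluster response."""
    return response.get("Cluster", {}).get("MasterPublicDnsName")


def describe_cluster(cluster_id: str) -> dict:
    """Call EMR DescribeCluster via boto3 (needs AWS credentials and a region)."""
    import boto3  # imported lazily so the other helpers work without boto3

    return boto3.client("emr").describe_cluster(ClusterId=cluster_id)


def livy_responds(host: str, timeout: float = 3.0) -> bool:
    """Return True if Livy's GET /sessions on this host answers with HTTP 200 and JSON."""
    url = f"http://{host}:{LIVY_PORT}/sessions"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            json.load(resp)  # raises ValueError if the body is not JSON
            return resp.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON reply
        return False


def first_reachable(
    hosts: Iterable[str],
    probe: Callable[[str], bool] = livy_responds,
) -> Optional[str]:
    """Return the first host for which the probe succeeds, or None if none do."""
    for host in hosts:
        if probe(host):
            return host
    return None
```

With credentials in place, something like master_dns_from_describe(describe_cluster("j-1K48XXXXXXHCB")) would give the DNS name that EMR reports, and first_reachable([...]) with your three master hostnames would give the first node whose Livy answers. The probe is passed in as a parameter so the selection logic can be tried without a live cluster.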

chendu