I currently have a multi-master node setup in AWS, with Livy installed on all three master nodes. Is there an endpoint that can tell me which of the three master nodes is currently active? I am trying to run Spark jobs via Livy.
1 Answer
- You can run
aws emr describe-cluster --cluster-id j-1K48XXXXXXHCB
. You can do the same with the Java or Python APIs.
- Livy can be configured to use ZooKeeper, meaning you can spin up three ZooKeeper nodes and configure Livy on top of them. See https://github.com/apache/incubator-livy/blob/master/conf/livy.conf.template for the settings. In principle you could then submit a job to any of the Livy instances (I haven't tried this, though).
- Did you consider AWS Glue? It also has workflows for chaining jobs together, and Glue is backward compatible with PySpark and Scala Spark.
Unless you want to run EMR 24x7 and keep the cluster above 80% CPU and memory utilization, AWS Glue is the better choice. Remember that although AWS calls EMR a managed solution, you still have to do the sizing, check resiliency, and so on.
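For the Python route mentioned in the first bullet, here is a rough sketch of the same describe-cluster call through boto3. The helper names `master_dns_name` and `livy_batches_url` are made up for illustration, and 8998 is only Livy's default port, so adjust it if yours differs. I am also assuming that the `MasterPublicDnsName` field EMR returns is the node you want on a multi-master cluster; please verify that on your own cluster before relying on it.

```python
from typing import Any


def master_dns_name(emr_client: Any, cluster_id: str) -> str:
    """Return the master public DNS name that EMR reports for the cluster.

    emr_client is expected to behave like boto3.client("emr").
    """
    response = emr_client.describe_cluster(ClusterId=cluster_id)
    return response["Cluster"]["MasterPublicDnsName"]


def livy_batches_url(host: str, port: int = 8998) -> str:
    """Build the Livy batch endpoint URL (hypothetical helper).

    8998 is Livy's default port; this is an assumption about your setup.
    """
    return f"http://{host}:{port}/batches"


# Example usage (needs boto3 installed and AWS credentials configured):
#   import boto3
#   emr = boto3.client("emr")
#   host = master_dns_name(emr, "j-1K48XXXXXXHCB")
#   print(livy_batches_url(host))
```

Passing the client in as an argument keeps the helpers easy to try out with a stub before pointing them at a real cluster.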

chendu