
We use Airflow to schedule our jobs on EMR, and we now want to submit Spark jobs through Apache Livy from Airflow. I need guidance on the following: which Airflow Livy operator should we use for Python 3+ PySpark jobs and Scala jobs? So far I have found https://github.com/rssanders3/airflow-spark-operator-plugin and https://github.com/panovvv/airflow-livy-operators

I would like to know which Airflow Livy operator is stable and used in production, ideally on an AWS stack.

A step-by-step installation guide for the integration would also help.

user10437665

1 Answer


I would recommend using the LivyOperator from https://github.com/apache/airflow/blob/master/airflow/providers/apache/livy/operators/livy.py

Currently, it is only available on the master branch, but you could copy the code and use it as a custom operator until we backport all the new operators to the Airflow 1.10.* series.
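Whichever operator you pick, the job ends up as a batch submitted to Livy's REST API (POST /batches), so it can help to see what that request looks like. Below is a minimal sketch using only the Python standard library, not the LivyOperator code itself. The bucket paths, the class name com.example.Etl, and the helper names build_batch_payload and submit_batch are hypothetical; the Livy URL you pass in depends on your cluster (Livy on EMR usually listens on port 8998).

```python
import json
import urllib.request


def build_batch_payload(file, class_name=None, args=None, conf=None):
    """Build the JSON body for Livy's POST /batches endpoint.

    `file` is the application to run: a .py file for PySpark,
    or a jar for Scala (in which case `class_name` names the main class).
    """
    payload = {"file": file}
    if class_name:
        payload["className"] = class_name
    if args:
        # Livy expects command-line arguments as a list of strings.
        payload["args"] = [str(a) for a in args]
    if conf:
        payload["conf"] = dict(conf)
    return payload


def submit_batch(livy_url, payload, timeout=30):
    """POST the payload to Livy and return the parsed JSON response.

    Not called here, since it needs a reachable Livy server.
    """
    request = urllib.request.Request(
        url=livy_url.rstrip("/") + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical PySpark job: only the script and its arguments are needed.
pyspark_job = build_batch_payload(
    file="s3://my-bucket/jobs/etl.py",
    args=["2020-04-15"],
)

# Hypothetical Scala job: a jar plus the main class to run.
scala_job = build_batch_payload(
    file="s3://my-bucket/jars/etl.jar",
    class_name="com.example.Etl",
    conf={"spark.executor.memory": "4g"},
)
```

The same two shapes map onto the operator's arguments: a PySpark job only needs the file, while a Scala job needs the jar as the file plus the main class name.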

kaxil
Thanks @kaxil! I was also considering https://github.com/panovvv/airflow-livy-operators Let me know your views on this. – user10437665 Apr 15 '20 at 07:16