Is it possible to use one of Airflow's operators to run an Impala query in the same way it can run Hive queries? I imagine a bash operator would work, but I would like to do it using the Airflow API.

Auren Ferguson
  • Have you checked this one out? https://airflow.readthedocs.io/en/latest/_api/airflow/hooks/hive_hooks/index.html#airflow.hooks.hive_hooks.HiveServer2Hook If you are using Impala, you may need to set it to false in the Extra field of your connection in the UI. – Chengzhi Aug 21 '19 at 17:52
  • Do you have an example? It doesn't mention Impala. – Auren Ferguson Aug 22 '19 at 08:11

2 Answers

To connect to Impala from Airflow, specify the authentication type and run_set_variable_statements in the Extra field of the connection.

To connect without an authentication mechanism, use these connection details:
Connection Type: Hive Server 2 Thrift
Port: 21050
Schema: Your Database
Extra: {"authMechanism": "NOSASL", "run_set_variable_statements": false}
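
Once a connection like this exists, a query can be run through HiveServer2Hook. The following is only a minimal sketch, not a tested setup: the connection id `impala_hs2` and the function name are hypothetical, and the import path depends on the Airflow version (the Hive provider package in Airflow 2, `airflow.hooks.hive_hooks` in 1.10).

```python
import json

# Hypothetical connection id; create it in the Airflow UI with the fields above.
IMPALA_CONN_ID = "impala_hs2"

# The Extra field from the answer, built with json.dumps so the quoting
# and the lowercase JSON boolean come out right.
IMPALA_EXTRA = json.dumps(
    {"authMechanism": "NOSASL", "run_set_variable_statements": False}
)


def run_impala_query(sql, conn_id=IMPALA_CONN_ID):
    """Run `sql` against Impala through the HiveServer2 hook and return the rows."""
    # Imported lazily so this module loads even where Airflow is not installed.
    try:
        # Airflow 2.x with the apache-hive provider installed
        from airflow.providers.apache.hive.hooks.hive import HiveServer2Hook
    except ImportError:
        # Airflow 1.10.x
        from airflow.hooks.hive_hooks import HiveServer2Hook

    hook = HiveServer2Hook(hiveserver2_conn_id=conn_id)
    return hook.get_records(sql)
```

A function like this could then be passed as the `python_callable` of a PythonOperator, which keeps the query inside the Airflow API instead of shelling out with a bash operator.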

Gokulakrishnan

I set it up as a JDBC connection. This is how I configured it:

  • Connection Id: the_id_you_want
  • Connection Type: JDBC Connection
  • Connection URL: jdbc:impala://the_url_worker.root.corp:21050
  • Login: my_username
  • Password: my_password
  • Driver Path: path to where the driver .jar file is located
  • Driver Class: com.cloudera.impala.jdbc.Driver

Edit: this is for Airflow v2.6.0
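
With a JDBC connection like the one above in place, a query could be run through the JDBC provider's hook. This is a rough sketch under assumptions: it presumes the `apache-airflow-providers-jdbc` package is installed, and the connection id `impala_jdbc` and both function names are hypothetical.

```python
def impala_jdbc_url(host, port=21050):
    """Build a Connection URL in the same form as the one used in the answer."""
    return f"jdbc:impala://{host}:{port}"


def run_impala_jdbc_query(sql, conn_id="impala_jdbc"):
    """Run `sql` over the JDBC connection and return the rows."""
    # Imported lazily so this module loads even where Airflow is not installed.
    from airflow.providers.jdbc.hooks.jdbc import JdbcHook

    hook = JdbcHook(jdbc_conn_id=conn_id)
    return hook.get_records(sql)
```

For example, `impala_jdbc_url("the_url_worker.root.corp")` gives the Connection URL shown in the list above.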