
I have set up Delta Lake on Cloudera, and it works fine with Spark and Hive.

I have searched extensively for a way to integrate Delta Lake with Impala, but I did not find much information.

If you have done this, could you please share how?

Update:

I do not need Impala to update or delete rows in the Delta tables. Impala will only be used to query (SELECT) data from the Delta tables, which are built on top of Parquet.

Can this be done with good performance using the Delta Hive connector?

Basically, Impala will be used for ad-hoc querying, dashboarding, and BI. If users need to update or delete data, they will do so on new tables that they create themselves (Kudu can be used here), not on the original tables being queried.

I hope this clarifies things. Please suggest an approach, and let me know if more information is required.

vijayinani

2 Answers


There is no direct integration. You would have to use the Delta Hive connector, with Impala sitting on top of Hive.

This is not a common setup, since Impala cannot delete from Hive tables, only from Kudu.

Impala does not use Tez or MapReduce underneath for Hive tables.

See https://impala.apache.org/docs/build3x/html/topics/impala_refresh.html
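As a rough illustration of what the linked page covers: when files under a table that Impala can already read are changed from outside Impala (for example by Spark), Impala's cached metadata has to be reloaded with `REFRESH`, or with the heavier `INVALIDATE METADATA`. The sketch below only builds the statement text. The function name and the `sales.orders` table are hypothetical, and how the statement is sent (impala-shell, JDBC, and so on) is left out.

```python
import re

# Conservative identifier check so the sketch never builds a statement
# from arbitrary input. This is an assumption of the example, not an Impala rule.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def impala_refresh_statement(database, table, full_reload=False):
    """Return the Impala statement that reloads metadata for one table.

    REFRESH picks up new or removed data files for a known table.
    INVALIDATE METADATA discards the cached metadata entirely.
    """
    for name in (database, table):
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    verb = "INVALIDATE METADATA" if full_reload else "REFRESH"
    return f"{verb} {database}.{table}"


# Hypothetical table name, for illustration only.
print(impala_refresh_statement("sales", "orders"))
```

This does not make Impala understand Delta tables; it only applies to tables Impala can read in the first place.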

thebluephantom
  • Thanks. Do not need Impala to delete/update. Impala will be used to only query/select data from Delta (built on top of Parquet) tables. Hope this can be done with good performance using Delta Hive connector? Basically, Impala will be used for ad-hoc querying / dashboarding / BI, and if users need to update/delete, then it will be done on new tables created by the users (Kudu can be used here) and not on the original tables where select is done. Hope this clarifies. Please suggest. Let me know if more Info. is required. – vijayinani Oct 11 '22 at 01:49
  • Answer still stands. – thebluephantom Oct 11 '22 at 05:03
  • If Impala is used on top of Hive to query Delta Lake, will it use a Hive engine (i.e. Tez/MR), or will it use the Impala engine? – vijayinani Oct 12 '22 at 02:10
  • No, it has its own engine, which it runs on the Hive data it knows about. – thebluephantom Oct 12 '22 at 10:06
  • The last comment is not clear to me. Will it use the Impala engine or a Hive engine? – vijayinani Oct 14 '22 at 02:16
  • None, as it has its own engine. – thebluephantom Oct 14 '22 at 13:46
  • Your comment has confused me further. Impala having its own engine means that Impala on top of Hive will use the Impala engine when it queries Delta Lake, right? That is what I asked: will it use the Impala engine or a Hive engine (i.e. Tez/MR)? – vijayinani Oct 15 '22 at 08:21
  • No. Read that link. – thebluephantom Oct 15 '22 at 08:43

Impala does not yet have a custom handler to understand or translate symlink manifest files or Hive's SymlinkTextInputFormat.
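For context, a symlink manifest is essentially a list of the Parquet files that make up the table's current snapshot. Here is a minimal sketch of how that list can be derived, assuming a table directory whose `_delta_log` contains only JSON commit files (checkpoints are ignored here, so this is not a full Delta reader). The function name is hypothetical.

```python
import json
from pathlib import Path


def live_data_files(table_dir):
    """Replay Delta JSON commits in order and return the live data file paths."""
    log_dir = Path(table_dir) / "_delta_log"
    live = set()
    # Commit files are named with zero-padded version numbers,
    # so sorting by name replays them in version order.
    commits = sorted(p for p in log_dir.glob("*.json") if p.stem.isdigit())
    for commit in commits:
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return live
```

The returned paths are relative to the table directory, as stored in the log. An engine without a manifest or Delta handler has no way to know which of the Parquet files in the directory are still live, which is why pointing it at the raw directory can return stale or duplicate rows.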

Betta