
I am trying to migrate a data warehouse to Delta Lake. One thing I am struggling to figure out is how to connect to Delta Lake (silver and gold) tables from outside a Spark session. I want to be able to connect to these tables using BI tools like Tableau. I am not using Databricks, and I was wondering whether storing these tables in the Hive metastore could help. If not, could someone suggest an alternative approach, or tell me whether this is feasible at all?

  • You always need some kind of "compute" in the middle to do the Delta Lake work, and Databricks would be the most convenient and reliable compute at this point. What current issue with your data warehouse is causing you to migrate? You might end up with more problems with the Delta Lake approach. Are you trying to migrate a star schema? What database platform are you using? – Nick.Mc Apr 04 '21 at 04:27
  • To `migrate`: could you use Sqoop? – Koushik Roy Apr 04 '21 at 05:40
  • Connect: ODBC or JDBC? – Koushik Roy Apr 04 '21 at 05:45
  • Yes, you can use ODBC or JDBC. But you still have to connect to a Databricks cluster. – Nick.Mc Apr 04 '21 at 08:52
  • Will integration with Hive work, as mentioned here: https://docs.delta.io/latest/hive-integration.html ? I have set up a Spark cluster with Hadoop on Docker to simulate HDFS. – Cynthia Vincent Apr 04 '21 at 11:30

1 Answer


You can do this without Databricks: run a Hive metastore and the Spark Thrift server using open-source Spark and open-source Delta Lake (delta.io), then connect Tableau Desktop, for instance, to the Thrift server over JDBC/ODBC.
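As a rough sketch of that setup, the Python helper below only assembles the launch command for Spark's `sbin/start-thriftserver.sh` with the Delta Lake session extension and catalog enabled, plus the `jdbc:hive2://` URL a client would use. The Spark home `/opt/spark`, the package coordinate `io.delta:delta-core_2.12:1.0.0`, the port and the function names are all assumptions for illustration; the Delta package has to match your Spark and Scala versions.

```python
import posixpath
import shlex


def thrift_server_command(spark_home, delta_package, port=10000, master="local[*]"):
    """Build the argv for starting the Spark Thrift server with Delta Lake enabled.

    spark_home and delta_package are placeholders (assumptions); adjust them
    to your installation and to the Delta release matching your Spark version.
    """
    return [
        posixpath.join(spark_home, "sbin", "start-thriftserver.sh"),
        "--master", master,
        # Pull the Delta Lake jars at startup.
        "--packages", delta_package,
        # Enable Delta SQL support and the Delta-aware session catalog.
        "--conf", "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension",
        "--conf", "spark.sql.catalog.spark_catalog="
                  "org.apache.spark.sql.delta.catalog.DeltaCatalog",
        # Port on which JDBC/ODBC clients connect.
        "--hiveconf", f"hive.server2.thrift.port={port}",
    ]


def jdbc_url(host, port=10000, database="default"):
    """JDBC URL for the HiveServer2-compatible Thrift endpoint."""
    return f"jdbc:hive2://{host}:{port}/{database}"


if __name__ == "__main__":
    cmd = thrift_server_command("/opt/spark", "io.delta:delta-core_2.12:1.0.0")
    print(shlex.join(cmd))        # command to run on the Spark host
    print(jdbc_url("localhost"))  # URL to test with beeline or a JDBC client
```

Once the server is running, the silver and gold tables still need to be registered in the metastore, for example with `CREATE TABLE gold_sales USING DELTA LOCATION '<path>'` (the table name here is hypothetical). Tableau should then be able to reach them through its Spark SQL connector and a matching ODBC driver.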