
I have created Delta Lake tables on a Databricks cluster, and I am able to access these tables from an external system/application. However, I need to keep the cluster up and running all the time to be able to access the table data. Questions:

  1. Is it possible to access the Delta Lake tables when the cluster is down?

  2. If yes, then how can I set it up?

I tried to look this up in the docs. I found that the Databricks Premium tier has Table Access Controls, which are disabled otherwise. It says:

Enabling Table Access Control will allow users to control who can select, create, and modify databases, tables, views, and functions that they create.

I also found this doc, but I don't think it is the option for my requirement. Please suggest.

AmitG
  • Could you clarify what you mean by accessing the Delta Lake tables without Databricks cluster running? The data is sitting in cloud object storage so you could conceivably access the tables through any of the existing connector mechanisms. The Table Access Controls are part of the enterprise security package so if you want authentication to those tables, you would certainly need a Databricks cluster to ensure authorized access to the data. – Denny Lee May 18 '21 at 04:55

1 Answer


The solution I found is to store all Delta Lake tables on ADLS Gen2 storage. External resources can then access the data irrespective of the Databricks cluster's state. The cluster only needs to be up while reading a file or writing into a table; the rest of the time it can be shut down.

From the docs: in Databricks we can create Delta tables of two types, managed and unmanaged. Managed tables are those whose data is stored in DBFS (Databricks File System), while unmanaged tables are those for which an external ADLS Gen2 location can be specified.

dataframe.write.mode("overwrite").option("path","abfss://[ContainerName]@[StorageAccount].dfs.core.windows.net").saveAsTable("table")
AmitG