[Cross-posting from the Databricks community: link]
I have been working on a POC exploring Delta Live Tables with a GCS storage location.
I have some questions:
How do we access the GCS bucket? The connection has to be established using a Databricks service account. For a normal cluster, we go to the cluster page and provide the Databricks service account email under Advanced Options. For Delta Live Tables, cluster creation is not under our control, so how do we add this email to the cluster to make the GCS bucket path accessible?
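For context, this is the shape of configuration I'm trying to end up with. A minimal sketch of the pipeline settings as a Python dict, assuming DLT cluster settings accept `gcp_attributes.google_service_account` the same way the regular Clusters API does (every value in angle brackets is a placeholder):

```python
# Sketch of the DLT pipeline settings I am aiming for. Assumption: the
# "clusters" block in pipeline settings accepts gcp_attributes the same
# way the regular Clusters API does. Angle-bracket values are placeholders.
pipeline_settings = {
    "name": "<pipeline-name>",
    "storage": "gs://<bucket>/<path>",  # GCS storage location for the pipeline
    "clusters": [
        {
            "label": "default",
            "gcp_attributes": {
                # The service account email normally entered under
                # Advanced Options > Google Service Account in the cluster UI.
                "google_service_account": "<sa-name>@<project>.iam.gserviceaccount.com",
            },
        }
    ],
    "libraries": [{"notebook": {"path": "/<notebook-path>"}}],
}
```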
I also tried to edit the Delta Live Tables cluster from the UI by adding the service account email under the Google Service Account block. Saving the cluster failed with:
**Error : Dlt prefixed spark images cannot be used outside of Delta live tables service**
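Since editing the DLT cluster directly is rejected, I'm assuming the supported route is to change the pipeline settings themselves, either through the pipeline's JSON editor or the Pipelines REST API. A hedged sketch of the latter, assuming `PUT /api/2.0/pipelines/{pipeline_id}` is the edit endpoint and that it takes the full settings spec (host, token, and IDs are placeholders):

```python
# Sketch: apply the settings through the Pipelines REST API rather than
# the cluster UI. Assumes a Databricks personal access token with access
# to the pipeline; host, token, and pipeline ID are placeholders.
import requests

HOST = "https://<workspace-host>"
TOKEN = "<personal-access-token>"
PIPELINE_ID = "<pipeline-id>"

settings = {
    "id": PIPELINE_ID,
    "name": "<pipeline-name>",
    "storage": "gs://<bucket>/<path>",
    "clusters": [{
        "label": "default",
        "gcp_attributes": {
            "google_service_account": "<sa-name>@<project>.iam.gserviceaccount.com"
        },
    }],
    "libraries": [{"notebook": {"path": "/<notebook-path>"}}],
}

resp = requests.put(
    f"{HOST}/api/2.0/pipelines/{PIPELINE_ID}",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=settings,
)
resp.raise_for_status()
```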
This is the error log I get when I provide a GCS bucket path as the storage location for the Delta Live Tables pipeline:
```
DataPlaneException: Failed to start the DLT service on cluster <cluster_id>. Please check the stack trace below or driver logs for more details.
com.databricks.pipelines.execution.service.EventLogInitializationException: Failed to initialize event log
java.io.IOException: Error accessing gs://<path>
shaded.databricks.com.google.api.client.googleapis.json.GoogleJsonResponseException: 403 Forbidden
GET https://storage.googleapis.com/storage/v1/b/<path>?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata
{
  "code" : 403,
  "errors" : [ {
    "domain" : "global",
    "message" : "Caller does not have storage.objects.get access to the Google Cloud Storage object. Permission 'storage.objects.get' denied on resource (or it may not exist).",
    "reason" : "forbidden"
  } ],
  "message" : "Caller does not have storage.objects.get access to the Google Cloud Storage object. Permission 'storage.objects.get' denied on resource (or it may not exist)."
}
```
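The 403 tells me the identity the DLT cluster runs as cannot read the bucket. To separate a bucket-permission problem from a pipeline-configuration problem, I've been verifying the service account directly, outside Databricks. A minimal sketch, assuming the google-cloud-storage package and a local JSON key for the service account (key path and bucket name are placeholders):

```python
# Quick sanity check: confirm the service account itself can read the
# bucket, independent of Databricks. Assumes google-cloud-storage is
# installed and a JSON key for the service account is available locally.
from google.cloud import storage
from google.oauth2 import service_account

KEY_PATH = "/path/to/<sa-key>.json"  # placeholder: service account key file
BUCKET = "<bucket>"                  # placeholder: bucket used as DLT storage

creds = service_account.Credentials.from_service_account_file(KEY_PATH)
client = storage.Client(project=creds.project_id, credentials=creds)

# test_iam_permissions reports which of the requested permissions the
# caller (here, the service account) actually holds on the bucket.
needed = ["storage.objects.get", "storage.objects.list", "storage.objects.create"]
granted = client.bucket(BUCKET).test_iam_permissions(needed)
print("granted:", granted)
print("missing:", sorted(set(needed) - set(granted)))
```

If `storage.objects.get` shows up as missing here, the bucket IAM needs fixing first; if it is granted, then presumably the DLT cluster is simply never picking up the service account, which brings me back to the question above.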