
I am trying to migrate my Hive metastore to AWS Glue. While migrating a Delta table, when I provide the same DBFS path, I get an error: "Cannot create table: The associated location is not empty."

When I try to create the same Delta table on an S3 location, it works properly.

Is there a way to find the S3 location for the DBFS path the database points to?

kushagra

1 Answer


First configure the Databricks Runtime to use the AWS Glue Data Catalog as its metastore, and then migrate the Delta table.
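A minimal sketch of that switch: on clusters running a Databricks Runtime with built-in Glue support, you enable the Glue catalog through the cluster's Spark configuration (the cluster also needs an instance profile with Glue permissions; details vary by deployment):

```properties
# Cluster Spark config: use AWS Glue Data Catalog instead of the default Hive metastore
spark.databricks.hive.metastore.glueCatalog.enabled true
```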

Every Databricks deployment has a central Hive metastore, accessible by all clusters, that persists table metadata. Instead of the default Databricks Hive metastore, you can use an existing external Hive metastore or the AWS Glue Data Catalog.

Databricks File System (DBFS) is a distributed file system mounted into a Databricks workspace and available on Databricks clusters. DBFS is an abstraction on top of scalable object storage and offers the following benefits:

  • Allows you to mount storage objects so that you can seamlessly access data without requiring credentials.
  • Allows you to interact with object storage using directory and file semantics instead of storage URLs.
  • Persists files to object storage, so you won’t lose data after you terminate a cluster.
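Mounting a bucket looks roughly like this. This is a sketch: `dbutils` only exists on a Databricks cluster, and the bucket and mount-point names below are made up; taking `dbutils` as a parameter simply makes the helper easy to exercise outside a notebook.

```python
def mount_bucket(dbutils, bucket, mount_point):
    """Mount s3a://<bucket> at <mount_point> unless it is already mounted.

    `dbutils` is the Databricks utilities object available in notebooks.
    Returns the mount point either way.
    """
    already_mounted = {m.mountPoint for m in dbutils.fs.mounts()}
    if mount_point not in already_mounted:
        dbutils.fs.mount(source=f"s3a://{bucket}", mount_point=mount_point)
    return mount_point
```

In a notebook you would call `mount_bucket(dbutils, "my-bucket", "/mnt/my-bucket")` and then read the data with ordinary paths such as `/dbfs/mnt/my-bucket/...` or `dbfs:/mnt/my-bucket/...`.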

Is there a way to find the S3 location for the DBFS path the database points to?

You can access an AWS S3 bucket either by mounting it with DBFS or by calling the S3 APIs directly.
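To answer the question directly: `DESCRIBE DATABASE EXTENDED <db>` shows the database's location, and if that location is a `dbfs:/mnt/...` path you can translate it to the underlying S3 URI using the mount table from `dbutils.fs.mounts()`. A sketch of that translation (the helper name is hypothetical; `mounts` is the list of entries returned by `dbutils.fs.mounts()`, each with a `.mountPoint` and a `.source`):

```python
def resolve_dbfs_path(mounts, dbfs_path):
    """Translate a dbfs:/... path to its underlying storage URI.

    `mounts` is the list returned by dbutils.fs.mounts(); each entry has
    .mountPoint (e.g. "/mnt/raw") and .source (e.g. "s3a://my-bucket/raw").
    Returns None if no mount point covers the path.
    """
    path = dbfs_path[len("dbfs:"):] if dbfs_path.startswith("dbfs:") else dbfs_path
    candidates = [
        m for m in mounts
        if path == m.mountPoint or path.startswith(m.mountPoint.rstrip("/") + "/")
    ]
    if not candidates:
        return None
    # Longest matching mount point wins (e.g. /mnt/raw over the / root mount).
    best = max(candidates, key=lambda m: len(m.mountPoint))
    return best.source.rstrip("/") + path[len(best.mountPoint.rstrip("/")):]
```

For example, with `/mnt/raw` mounted on `s3a://my-bucket/raw`, a database location of `dbfs:/mnt/raw/mydb` resolves to `s3a://my-bucket/raw/mydb` (bucket and database names here are placeholders).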

Reference: "Databricks - Amazon S3"

Hope this helps.

CHEEKATLAPRADEEP