
I know my DBFS path is backed by S3. Is there any utility/function to get the exact S3 path from a DBFS path? For example,

%python
required_util('dbfs:/user/hive/warehouse/default.db/students')
>> s3://data-lake-bucket-xyz/.......

I looked through a few other discussions, for example "What s3 bucket does DBFS use?" and "How can I get the S3 location of a DBFS path", but didn't find a useful answer.


1 Answer


This information isn't really available inside the execution context of the cluster. The closest thing I can think of is the Account REST API, but you need to be an account admin to use it:

  • Get information about the specific workspace using the Get Workspace API.
  • From the result of that request, take the storage configuration ID (the storage_configuration_id field), then use the Get Storage Configuration API to retrieve the root bucket information (see the sketch below).
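Here is a minimal sketch of how those two calls could be chained with plain HTTP requests. The account ID, workspace ID, and basic-auth credentials are placeholders (not from the original post), and it assumes the AWS Account API 2.0 endpoints where the workspace object exposes storage_configuration_id and the storage configuration exposes root_bucket_info.

%python
import requests

# Hypothetical placeholders - fill in your own values
ACCOUNT_ID = "<databricks-account-id>"
WORKSPACE_ID = "<workspace-id>"
AUTH = ("<account-admin-user>", "<password>")  # must be an account admin

BASE = f"https://accounts.cloud.databricks.com/api/2.0/accounts/{ACCOUNT_ID}"

# 1. Get the workspace to find its storage configuration ID
ws = requests.get(f"{BASE}/workspaces/{WORKSPACE_ID}", auth=AUTH).json()
storage_config_id = ws["storage_configuration_id"]

# 2. Get the storage configuration to find the root S3 bucket
sc = requests.get(
    f"{BASE}/storage-configurations/{storage_config_id}", auth=AUTH
).json()
bucket = sc["root_bucket_info"]["bucket_name"]

# DBFS root paths live under this bucket, so a path like
# dbfs:/user/hive/warehouse/default.db/students maps somewhere under:
print(f"s3://{bucket}/")

Note that this only tells you which bucket backs the DBFS root; it does not translate an individual DBFS path into its exact object key.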
Alex Ott