3

I have mounted 'mybucket' using mount commands, and I am able to list all the objects using the command below:

%fs
ls /mnt/mybucket/

However, I have folders nested inside folders in 'mybucket', and I want to run the command below, but it does not work.

%fs
ls /mnt/mybucket/*/*/

Any help is much appreciated. Thanks

Alex Ott

3 Answers

0

dbutils.fs.ls and its magic variant %fs ls don't support wildcards, so you need to iterate over the files yourself, with something like this:

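def list_files(path, max_level=1, cur_level=0):
  """Yield paths under `path`, descending at most `max_level` levels."""
  for entry in dbutils.fs.ls(path):
    # Directories are reported with a trailing "/" and a size of 0.
    if entry.name.endswith("/") and entry.size == 0 and cur_level < (max_level - 1):
      yield from list_files(entry.path, max_level, cur_level + 1)
    else:
      yield entry.path

# list_files is a generator; raise max_level to descend into nested folders.
files = list_files("/mnt/mybucket", 1)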
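To mimic the `/mnt/mybucket/*/*/` pattern from the question more directly, the same idea can be written as a small wildcard expander. This is only a sketch: `expand_wildcards` is a hypothetical helper name, and it takes the listing function as a parameter (in a notebook you would pass `dbutils.fs.ls`). It assumes the pattern is an absolute path such as `/mnt/...` and, as in the code above, that directory entries have names ending in `/`.

```python
from fnmatch import fnmatchcase


def expand_wildcards(ls, pattern):
    """Return the paths that `ls <pattern>` would show if wildcards worked.

    `ls` is any callable that takes a path and returns entries with `.name`
    and `.path` attributes (assumption: dbutils.fs.ls inside a notebook).
    """
    parts = [p for p in pattern.strip("/").split("/") if p]
    # Leading segments without wildcard characters form the starting directory.
    i = 0
    while i < len(parts) and not any(c in parts[i] for c in "*?["):
        i += 1
    current = ["/" + "/".join(parts[:i])]
    # Expand the remaining segments one directory level at a time.
    for segment in parts[i:]:
        matched = []
        for directory in current:
            for entry in ls(directory):
                # Only directories can match a path segment here.
                if entry.name.endswith("/") and fnmatchcase(entry.name.rstrip("/"), segment):
                    matched.append(entry.path)
        current = matched
    # Like a shell `ls dir/`, list the contents of every matched directory.
    return [entry.path for directory in current for entry in ls(directory)]
```

In a notebook, this could be called as `expand_wildcards(dbutils.fs.ls, "/mnt/mybucket/*/*/")`. Each wildcard level costs one listing call per matched directory, so deep patterns over large buckets can be slow.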
Alex Ott
0

If you attempt to create a mount point within an existing mount point, for example:

Mount one storage account to /mnt/storage1

Mount a second storage account to /mnt/storage1/storage2

This will fail because nested mounts are not supported in Databricks. The recommended approach is to create a separate mount entry for each storage object.

For example:

Mount one storage account to /mnt/storage1

Mount a second storage account to /mnt/storage2
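A quick way to catch this before calling mount is to compare the new mount point against the existing ones. The following is a minimal sketch, not an official API: `is_nested_mount` is a hypothetical helper, and the list of existing mount points is assumed to come from something like `[m.mountPoint for m in dbutils.fs.mounts()]`.

```python
import posixpath


def is_nested_mount(new_mount, existing_mounts):
    """Return True if new_mount would sit inside an existing mount point."""
    new = posixpath.normpath(new_mount)
    for existing in existing_mounts:
        old = posixpath.normpath(existing)
        # The root "/" is skipped implicitly: "/" + "/" never prefixes a path.
        if new.startswith(old + "/"):
            return True
    return False
```

With the paths from the example above, `is_nested_mount("/mnt/storage1/storage2", ["/mnt/storage1"])` is true, while `/mnt/storage2` passes the check.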

0

Unmount and mount again.

dbutils.fs.unmount("/mnt/mount_name")

dbutils.fs.mount("s3a://%s" % aws_bucket_name, "/mnt/%s" % mount_name)
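If the mount point may or may not exist yet, the two calls can be wrapped so that unmount only runs when needed. This is a sketch under assumptions: `remount` is a hypothetical helper, and `fs` is expected to behave like `dbutils.fs` (with `mounts()`, `unmount()` and `mount()`); any credentials or extra mount options are left out.

```python
def remount(fs, source, mount_point):
    """Unmount mount_point if it is currently mounted, then mount source there."""
    # fs.mounts() is assumed to return objects with a `mountPoint` attribute.
    if any(m.mountPoint == mount_point for m in fs.mounts()):
        fs.unmount(mount_point)
    fs.mount(source, mount_point)
```

In a notebook this would be called as `remount(dbutils.fs, "s3a://%s" % aws_bucket_name, "/mnt/%s" % mount_name)`.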
Victor Kironde