I have a Google Cloud Storage bucket with the following tree (note the folder named "/"!):

"bucket-1"
   |
   |--- "data.csv"
   |
   |--- "/"
         |
         |--- "runs"
                 |
                 |--- "run-1"
                 |       |
                 |       |--- "data.csv"
                 |
                 |--- "run-2"
                         |
                         |--- "data.csv"

I want to access the objects (.csv files) in the sub-folder "/" using the Python library libcloud.

I can access data.csv which is outside of the "/" folder:

>>> client.get_object(container_name='bucket-1', object_name='/data.csv')
<Object: name=/data.csv, size=181580, hash=8252d90d95b7b1cb7b4e699b90fbcce3, provider=Google Cloud Storage ...>

Using gsutil with two slashes I can see objects in "/":

$ gsutil ls "gs://bucket-1//runs/run-1"
gs://bucket-1//runs/run-1/data.csv

However, with libcloud, calling either client.get_object(container_name='bucket-1', object_name='//runs/run-1/data.csv') or client.get_object(container_name='bucket-1', object_name='/runs/run-1/data.csv') raises this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/andrey/miniconda3/envs/mostly-cloud/lib/python3.6/site-packages/libcloud/storage/drivers/s3.py", line 342, in get_object
    object_name=object_name)
libcloud.storage.types.ObjectDoesNotExistError: <ObjectDoesNotExistError in <libcloud.storage.drivers.google_storage.GoogleStorageDriver object at 0x7f40560cd4e0>, value=None, object = //runs/run-1/data.csv>

On the other hand,

>>> client.list_container_objects(client.get_container("bucket-1"))
[<Object: name=/runs/run-1/data.csv, size=357683, hash=..., provider=Google Cloud Storage ...>, <Object: name=/runs/run-2/data.csv, size=357683, hash=..., provider=Google Cloud Storage ...>] 

So how can I get an object located in the "/" directory?

siamsot
  • You have created an illegal/incorrect file structure. You will need to delete the files and recreate the object names correctly. Remember there are no directories, they are just simulated with separator characters in the object names. – John Hanley Aug 15 '19 at 07:37
  • @JohnHanley Does that mean a single "/" on its own cannot be used as a name? Why can I access the objects with gsutil? What would a correct object name be? – spintronic Aug 15 '19 at 07:47
  • In Cloud Storage, object names cannot begin with `/`. The fact that one tool is allowing this does not matter. Most tools and libraries will break, so the solution is to fix it instead of trying to find a bandaid. Most libraries will combine multiple `/` characters into one. – John Hanley Aug 15 '19 at 08:00
  • @JohnHanley Ok, I see, thank you! – spintronic Aug 15 '19 at 08:09

1 Answer

I reproduced your scenario in order to test this behavior. I was able to create this hierarchy through:

gsutil cp your-file gs://your-bucket//abc

This is odd behavior, and it shouldn't be allowed.

If you try to create a folder with this name from the GCP console, you will see the message:

Forward slashes (/) are not allowed in folder names.

For that reason, I opened an issue on the Public Issue Tracker, where you can get feedback regarding this behavior.

Regarding the naming of your folders, you can take a look at Google's documentation on how subdirectories work.
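To make the "simulated folders" idea concrete, here is a minimal sketch in plain Python. It makes no calls to Cloud Storage or libcloud. The helper names split_gs_url and list_children are hypothetical, and the list of object names is an assumption based on the tree in the question. It only illustrates that a bucket holds a flat set of names, and that the double slash in the gsutil URL corresponds to an object name starting with "/".

```python
def split_gs_url(url):
    """Split 'gs://bucket/object' into (bucket, object_name), keeping every slash."""
    scheme = "gs://"
    if not url.startswith(scheme):
        raise ValueError("not a gs:// URL: %r" % url)
    # Only the first "/" after the bucket name is a separator;
    # everything after it belongs to the object name.
    bucket, _, object_name = url[len(scheme):].partition("/")
    return bucket, object_name


def list_children(object_names, prefix="", delimiter="/"):
    """Emulate a delimiter-based listing over a flat collection of object names."""
    children = set()
    for name in object_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        # With a separator this is a simulated "folder", without one it is an object.
        children.add(prefix + head + sep)
    return sorted(children)


# Assumed flat object names, modelled on the tree in the question.
names = ["data.csv", "/runs/run-1/data.csv", "/runs/run-2/data.csv"]

print(split_gs_url("gs://bucket-1//runs/run-1/data.csv"))
print(list_children(names))
print(list_children(names, prefix="/runs/"))
```

In this model, the folder named "/" is nothing more than the shared leading character of two object names, which is why tools that normalize slashes can no longer address those objects.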

To sum up, you shouldn't be allowed to create a folder with this name. The best course of action now is to avoid names like this and use plain names that your tools can handle.
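If you decide to recreate the objects under cleaner names, a rough sketch of planning the renames could look like this. The helpers normalized_name and rename_plan are hypothetical, and the input names are assumed from the tree in the question (in practice they might come from something like [o.name for o in client.list_container_objects(container)]). The sketch only computes the old-to-new mapping. The copy and delete itself, for example with gsutil mv, is left out.

```python
import re


def normalized_name(name):
    """Collapse repeated slashes and drop leading ones from an object name."""
    collapsed = re.sub(r"/+", "/", name)
    return collapsed.lstrip("/")


def rename_plan(object_names):
    """Map each problematic object name to a cleaned-up name.

    Raises ValueError if a cleaned name would be empty or would collide
    with a name that already exists or is already planned.
    """
    plan = {}
    taken = set(object_names)
    for old in object_names:
        new = normalized_name(old)
        if new == old:
            continue  # name is already fine
        if not new or new in taken:
            raise ValueError("cannot safely rename %r to %r" % (old, new))
        plan[old] = new
        taken.add(new)
    return plan


# Assumed object names, modelled on the tree in the question.
names = ["data.csv", "/runs/run-1/data.csv", "/runs/run-2/data.csv"]

for old, new in rename_plan(names).items():
    print("%s -> %s" % (old, new))
```

The collision check matters here: if the bucket held both "data.csv" and "/data.csv", stripping the slash would overwrite one with the other, so the sketch refuses rather than guessing.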

Noohone