Although Google Cloud Storage is a flat object store that doesn't need directory entries, adding pseudo-directory placeholders (empty objects with names ending in /) makes gcsfuse a lot faster. With the placeholders in place, you can leave out the gcsfuse --implicit-dirs option and still browse your GCS directories with very workable performance, which is not the case without them.

Q. Is there a way to issue a command to gsutil like gsutil cp -r your_directory gs://your-bucket/ that will create the directory placeholders while uploading files?

The alternative is to call the GCS API, but gsutil has a lot of useful features including parallel uploads and retry handling.

Example

Make the local tree:

$ mkdir -p your_directory/subdir
$ echo hi > your_directory/hi.txt
$ echo there > your_directory/subdir/there.txt

$ ls -lR your_directory
total 8
-rw-r--r--  1 jerry  staff   3 Jan 21 17:24 hi.txt
drwxr-xr-x  3 jerry  staff  96 Jan 21 17:24 subdir/

your_directory/subdir:
total 8
-rw-r--r--  1 jerry  staff  6 Jan 21 17:24 there.txt

Copy it to GCS with gsutil:

$ gsutil cp -r your_directory gs://your-bucket/
Copying file://your_directory/hi.txt [Content-Type=text/plain]...
Copying file://your_directory/subdir/there.txt [Content-Type=text/plain]...
/ [2 files][    9.0 B/    9.0 B]
Operation completed over 2 objects/9.0 B.

$ gsutil ls -lr gs://your-bucket/your_directory
gs://your-bucket/your_directory/:
         3  2020-01-22T01:25:38Z  gs://your-bucket/your_directory/hi.txt

gs://your-bucket/your_directory/subdir/:
         6  2020-01-22T01:25:38Z  gs://your-bucket/your_directory/subdir/there.txt
TOTAL: 2 objects, 9 bytes (9 B)

Notice that gsutil only created 2 objects (blobs) -- the text files. It did not create directory placeholder blobs your_directory/ or your_directory/subdir/.
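
To make the missing names concrete, they can be derived from the local tree. This is only a minimal sketch, assuming a POSIX shell with find, sort and sed and directory names without newlines; placeholder_names is a hypothetical helper, not a gsutil feature, and it only prints names without uploading anything:

```shell
# Hypothetical helper (not part of gsutil): print the object names of the
# directory placeholders that a recursive upload of "$1" would need.
# The body runs in a subshell, so the cd does not affect the caller.
placeholder_names() (
  src=${1%/}                              # drop one trailing slash, if any
  cd "$(dirname "$src")" || exit 1        # work relative to the parent directory
  # List every directory in the tree, then append "/" to each name.
  find "$(basename "$src")" -type d | LC_ALL=C sort | sed 's|$|/|'
)
```

For the tree above, placeholder_names your_directory prints your_directory/ and your_directory/subdir/, which are exactly the two blobs gsutil did not create.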

In a gcsfuse your-bucket your-bucket mount:

$ find your_directory
find: your_directory: No such file or directory

In a gcsfuse --implicit-dirs your-bucket your-bucket mount:

$ find your_directory
your_directory
your_directory/hi.txt
your_directory/subdir
your_directory/subdir/there.txt

The files show up, but slowly.

Back to a gcsfuse your-bucket your-bucket mount, we can make the text files show up by creating the directory placeholders:

$ mkdir your_directory
$ ls your_directory
hi.txt

$ mkdir your_directory/subdir
$ ls your_directory
hi.txt  subdir/

$ ls your_directory/subdir/
there.txt
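
The manual mkdir steps above could be scripted as a second pass after the upload. This is a rough sketch only, assuming the bucket is already mounted with gcsfuse and that mkdir on the mount creates the placeholder as in the transcript; mirror_dirs is a hypothetical helper, and I have only reasoned about it against ordinary local directories, not a real bucket:

```shell
# Hypothetical helper: recreate only the directory structure of "$1" under
# "$2" (for example a gcsfuse mount point), without copying any files.
# The body runs in a subshell, so the cd does not affect the caller.
mirror_dirs() (
  src=${1%/}                              # drop one trailing slash, if any
  dest=$(cd "$2" && pwd) || exit 1        # resolve the target to an absolute path
  cd "$(dirname "$src")" || exit 1        # work relative to the parent directory
  find "$(basename "$src")" -type d | while IFS= read -r dir; do
    mkdir -p "$dest/$dir" || exit 1       # one mkdir per directory in the tree
  done
)
```

Usage would be something like mirror_dirs your_directory /path/to/mount, where /path/to/mount stands for wherever your-bucket is mounted. It is still a workaround rather than a gsutil option, which is what the question is asking for.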
Andrew Gaul
Jerry101

1 Answer

If I understood correctly and you want to upload files while creating what appear to be empty folders (which in the background are just empty files with a "/" at the end of their path), gsutil cp -r your_directory gs://your-bucket/ does the trick.

For reference, here is how subdirectories work in GCS and the gsutil cp command.

Jose V
  • Thanks, but alas, unless there's a configuration setting or another option, `gsutil cp -r your_directory gs://your-bucket/` does not create the empty directory placeholder objects `your_directory/`, `your_directory/subdir/`, etc., and thus `gcsfuse` (without `--implicit-dirs`) won't see any of those files. – Jerry101 Jan 21 '20 at 07:14
  • Could you please clarify what you mean by "empty directory placeholder objects"? – Jose V Jan 21 '20 at 15:26
  • By "empty directory placeholder objects", I meant 0-length GCS blobs with names ending in `/`. Without them, `gcsfuse` won't see `your-bucket/your_directory` or its "contents" (unless mounted with the `--implicit-dirs` option). See the example I added to the question, and see https://github.com/GoogleCloudPlatform/gcsfuse/blob/master/docs/semantics.md#implicit-directories – Jerry101 Jan 22 '20 at 02:01