
Our application's data storage is backed by Google Cloud Storage (as well as S3 and Azure Blob Storage). We need to give access to this storage to arbitrary outside tools (uploads from local disk using CLI tools, unloads from analytical databases like Redshift, Snowflake and others). The specific use case is that users need to upload multiple big files (think of it much like m3u8 playlists for streaming video: one m3u8 playlist plus thousands of small video files). The tools and users MAY not be affiliated with Google in any way (they may not have a Google account). We also absolutely need the data transfer to go directly to the storage, bypassing our servers.

In S3 we use federation tokens to give access to a part of the S3 bucket.

So model scenario on AWS S3:

  • customer requests some data upload via our API
  • we give the customer S3 credentials that are scoped to s3://customer/project/uploadId, allowing upload of new files
  • client uses any tool to upload the data
    • client uploads s3://customer/project/uploadId/file.manifest, s3://customer/project/uploadId/file.00001, s3://customer/project/uploadId/file.00002, ...
  • other data in the bucket (be it another uploadId or project) is safe because the given credentials are scoped (see the sketch below)
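
A minimal sketch of that S3 flow, assuming boto3 and a hypothetical bucket name (customer-data-bucket); the effective permissions of the returned credentials are the intersection of the calling user's permissions and the inline policy:

    import json
    import boto3

    def issue_upload_credentials(customer: str, project: str, upload_id: str) -> dict:
        """Return temporary credentials scoped to a single upload prefix."""
        prefix = f"{customer}/{project}/{upload_id}/"
        policy = {
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::customer-data-bucket/{prefix}*"],
            }],
        }
        sts = boto3.client("sts")
        resp = sts.get_federation_token(
            Name=f"upload-{upload_id}"[:32],  # federated user name, max 32 chars
            Policy=json.dumps(policy),        # scopes the token to the prefix above
            DurationSeconds=3600,             # short-lived: 1 hour
        )
        # AccessKeyId, SecretAccessKey, SessionToken, Expiration
        return resp["Credentials"]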

In ABS we use SAS tokens for the same purpose.

GCS does not seem to have anything similar, except for Signed URLs. Signed URLs have a problem, though: each one refers to a single object. That would either require us to know in advance how many files will be uploaded (we don't), or the client would have to request a signed URL for each file separately (a strain on our API, and slow), as sketched below.
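
For context, a minimal sketch of that per-file approach with the google-cloud-storage Python client (the bucket and object names are placeholders); every single file needs its own round-trip to our API:

    from datetime import timedelta
    from google.cloud import storage

    client = storage.Client()                       # the app's service account
    bucket = client.bucket("customer-data-bucket")  # assumed bucket name

    def signed_upload_url(object_name: str) -> str:
        """One URL per object; there is no wildcard/prefix variant."""
        blob = bucket.blob(object_name)
        return blob.generate_signed_url(
            version="v4",
            expiration=timedelta(minutes=15),
            method="PUT",
        )

    # The client has to come back to our API for every single file:
    url = signed_upload_url("customer/project/uploadId/file.00001")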

ACLs seemed like a solution, but they are tied to Google-related identities only, and those can't be created on demand quickly. Service accounts are also an option, but their creation is slow and, IIUC, they are generally discouraged for this use case.

Is there a way to create short-lived credentials that are limited to a subset of a GCS bucket?

The ideal scenario would be that the service account we use in the app could generate a short-lived token that only has access to a subset of the bucket. But nothing like that seems to exist.

Tomáš Fejfar

1 Answer


Unfortunately, no. For retrieving objects, signed URLs must refer to exact objects; you'd need to generate one per object.

Using the * wildcard specifies the subdirectory you are targeting and matches all objects under it. For example, if you are trying to access objects in Folder1 in your bucket, you would use gs://Bucket/Folder1/*. However, a command like gsutil signurl -d 120s key.json gs://bucketname/folderName/** creates a signed URL for each of the matched files, not a single URL for the entire folder/subdirectory.

Reason: subdirectories are just an illusion of folders in a bucket (they are really object names that contain a '/'), so every file in a "subdirectory" gets its own signed URL. There is no way to create a single signed URL for a specific subdirectory that makes all of its files temporarily available.
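
A rough Python equivalent of what the gsutil command above ends up doing, assuming the google-cloud-storage client and a service account key file; each object under the prefix gets its own URL:

    from datetime import timedelta
    from google.cloud import storage

    client = storage.Client.from_service_account_json("key.json")

    urls = {}
    for blob in client.list_blobs("bucketname", prefix="folderName/"):
        urls[blob.name] = blob.generate_signed_url(
            version="v4",
            expiration=timedelta(seconds=120),  # matches -d 120s in the gsutil example
            method="GET",
        )
    # `urls` now maps every object under folderName/ to its own signed URL;
    # there is still no single URL covering the whole "folder".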

There is an open feature request for this: https://issuetracker.google.com/112042863. Please raise your concern there and watch for further updates.

For now, one way to accomplish this would be to write a small App Engine app that clients download from instead of hitting GCS directly. It would check authentication according to whatever mechanism you're using and then, if the check passes, generate a signed URL for that resource and redirect the user (a sketch follows below). Reference: https://stackoverflow.com/a/40428142/15803365
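
A minimal sketch of that approach, assuming Flask on App Engine; check_auth and the bucket name are placeholders for your own mechanism, and signing requires credentials that can sign (a key file, or the IAM signBlob API behind the scenes). The app never proxies the data itself, it only redirects to a short-lived signed URL:

    from datetime import timedelta
    from flask import Flask, abort, redirect, request
    from google.cloud import storage

    app = Flask(__name__)
    client = storage.Client()
    bucket = client.bucket("customer-data-bucket")  # placeholder bucket name

    def check_auth(req) -> bool:
        """Placeholder: validate your own API token/session here."""
        return "X-Api-Token" in req.headers

    @app.route("/download/<path:object_name>")
    def download(object_name):
        if not check_auth(request):
            abort(403)
        url = bucket.blob(object_name).generate_signed_url(
            version="v4",
            expiration=timedelta(minutes=5),
            method="GET",
        )
        return redirect(url, code=302)  # the client then fetches directly from GCS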

Priyashree Bhadra
Thanks for the search. I posted to the issue tracker. I still hope someone has some workaround that they use to do the same thing without signed URLs... – Tomáš Fejfar Jun 21 '22 at 09:36