
I have daily files available in a GCS bucket in the following pattern, and I need to get those blobs for further processing.

gs://bucket_name/202112032210/test/a.csv
gs://bucket_name/202112042310/test/b.csv
gs://bucket_name/202112052240/test/a1.csv

So the folder name is yyyymmdd followed by a time. The time could be anything, and I need to go through each day's folder. I am trying to check file availability for each day using the following pattern, but it didn't work:

from datetime import date
from google.cloud import storage

today = date.today()
timestr = today.strftime('%Y%m%d')
FOLDER_NAME = timestr + '*' + '/test/'   # e.g. '20211203*/test/'

storage_client = storage.Client()
bucket = storage_client.get_bucket('bucket_name')
for blob in bucket.list_blobs(prefix=FOLDER_NAME):
    file_name = blob.name.split('/')[-1].removesuffix('.csv')
    print(blob)

But I am not getting the blobs. Please suggest a workaround to get the folder for each day.

  • What does not work? – guillaume blaquiere Dec 29 '21 at 13:58
  • I am not getting the blob using /20211203*/temp/, so I need an approach to look into 202112032210/test/ – Prithwiraj Samanta Dec 29 '21 at 16:50
  • 1
    The prefix does not support wildcards AFAIK. Use a prefix of **yyyymm**. That will return all objects that start with **yyyymm**. Remember, Cloud Storage does not have folders/directories. The namespace is flat. – John Hanley Dec 29 '21 at 17:45
  • Please consider reading [this related SO question](https://stackoverflow.com/questions/51379101/how-to-get-list-blobs-to-behave-like-gsutil): it doesn't answer yours, but I think it may be of help. – jccampanero Dec 30 '21 at 23:02
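
Building on John Hanley's comment above: since the prefix is matched literally and the namespace is flat, a day-level prefix of yyyymmdd already matches every timestamped folder for that day (20211203*), and the remaining filtering can be done client-side. A minimal sketch of that approach, assuming the bucket name and /test/ subfolder from the question are placeholders (requires Python 3.9+ for `removesuffix`):

from datetime import date
from google.cloud import storage

BUCKET_NAME = 'bucket_name'                  # placeholder from the question
day_prefix = date.today().strftime('%Y%m%d') # e.g. '20211203'

storage_client = storage.Client()
# The prefix is a literal string, not a glob: 'yyyymmdd' matches every
# object whose name starts with that date, whatever the time suffix is.
for blob in storage_client.list_blobs(BUCKET_NAME, prefix=day_prefix):
    # Keep only objects under the .../test/ "folder" that end in .csv.
    if '/test/' in blob.name and blob.name.endswith('.csv'):
        file_name = blob.name.split('/')[-1].removesuffix('.csv')
        print(blob.name, file_name)

If no blobs are printed for a given day, no files were uploaded for that date, which also covers the availability check the question asks about.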

0 Answers