
I'm new to GCP and GCS. I want to upload files from the file system on my PC to a GCS bucket. I found the following code and am altering it.

But I have files that sit in folders like this: `\F1\Data\Export\PlayOnUsers\2021\12\`, where 2021 is the year and 12 the month (December).

So the path after `\F1\Data\Export\PlayOnUsers\` changes.

I need to replicate this structure in GCS, i.e. create sub-buckets `2021\` and `12\`.

How is this done? I also don't see the part where you put the credentials for GCS.

I have this code so far:

    from google.cloud import storage
    
    
    def upload_blob(bucket_name, source_file_name, destination_blob_name):
        """Uploads a file to the bucket."""
        # The ID of your GCS bucket
        bucket_name = "MyBucket-scv"
    
        # The path to your file to upload
        source_file_name = "F1/Data/Export"
    
        # The ID of your GCS object
        destination_blob_name = "storage-object-name"
    
        storage_client = storage.Client()
        bucket = storage_client.bucket(bucket_name)
        blob = bucket.blob(destination_blob_name)
    
        blob.upload_from_filename(source_file_name)
    
        print(
            "File {} uploaded to {}.".format(
                source_file_name, destination_blob_name
            )
        )
    
    upload_blob(.., .., ..)

How do I pass the parameters automatically when calling the function?
Anna

1 Answer


See Finding Credentials Automatically. Using these "Application Default Credentials" is a good practice. All you need to do is have a Service Account with suitable roles/permissions and, if you're running off-GCP (i.e. not on Compute Engine etc.), you'll need to create a Service Account key and reference it via the GOOGLE_APPLICATION_CREDENTIALS environment variable before you run your code.
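For example, a minimal sketch, assuming a Service Account key file (the path here is hypothetical):

    import os

    from google.cloud import storage

    # Point Application Default Credentials at the Service Account key.
    # You can equally set this environment variable in your shell
    # before running the script.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = r"C:\keys\service-account.json"

    # The client picks up the credentials automatically.
    storage_client = storage.Client()

The client library also provides `storage.Client.from_service_account_json(...)` if you'd rather pass the key file explicitly.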

Google Cloud Storage (GCS) does not really have a concept of folders nor "sub-buckets". In fact, everything in a GCS bucket is called an Object, but Object names may include `/` (the *nix equivalent of Windows' `\`), which is commonly (!) used to denote folder paths.

So, you only really have to worry about recursively iterating over your Windows folders (I'll leave the details to you; there's a sketch after the examples below) and then, for every file that your code finds, creating an Object in your GCS bucket whose name comprises:

  1. The bucket
  2. The folder structure with `/` instead of `\`
  3. The filename

i.e.

  • `\F1\Data\Export\PlayOnUsers\2021\12\x` becomes `gs://your-bucket/F1/Data/Export/PlayOnUsers/2021/12/x`
  • `\F1\Data\Export\PlayOnUsers\2022\01\x` becomes `gs://your-bucket/F1/Data/Export/PlayOnUsers/2022/01/x`
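
A minimal sketch of that recursive walk plus upload (bucket name and root folder here are hypothetical; adjust to your layout):

    import os

    from google.cloud import storage


    def upload_tree(bucket_name, local_root):
        """Recursively uploads every file under local_root, converting
        Windows path separators to / in the Object names."""
        storage_client = storage.Client()
        bucket = storage_client.bucket(bucket_name)

        for dirpath, _, filenames in os.walk(local_root):
            for filename in filenames:
                local_path = os.path.join(dirpath, filename)
                # \F1\Data\Export\...\x -> F1/Data/Export/.../x
                blob_name = local_path.replace(os.sep, "/").lstrip("/")
                bucket.blob(blob_name).upload_from_filename(local_path)
                print("Uploaded {} to gs://{}/{}".format(
                    local_path, bucket_name, blob_name))


    upload_tree("your-bucket", r"\F1\Data\Export")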
DazWilkin
  • Thank you! I got it about Objects. After a few attempts I have the correct code: `destination_blob_name = "Consumption/" + os.path.split(source_file_name)[-1]` makes the "Object" name. Depending on the file name I will be putting them into different folders. I think that now I don't need "/2021/12/" anymore. If I already have "Consumption/" in my bucket, will it just put the file in the already existing "Consumption" Object? – Anna Dec 23 '21 at 17:33
  • Sorry, to be clear: I just found out that all the files sit in one folder and will need to be put into different folders based on their file names. – Anna Dec 23 '21 at 17:37
  • Then you want to parse the filenames and inject `/` to mimic the folder structure, i.e. `2021-12-31.doc` becomes `gs://bucket/.../2021/12/31/doc` or similar. – DazWilkin Dec 23 '21 at 17:46
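
A minimal sketch of that filename parsing, reusing the `Consumption/` prefix from the comments and assuming names like `YYYY-MM-DD.ext` (adjust the split to the real naming scheme):

    import os


    def blob_name_for(filename):
        """Derives an Object name like Consumption/2021/12/31.doc
        from a filename like 2021-12-31.doc."""
        stem, ext = os.path.splitext(filename)  # '2021-12-31', '.doc'
        year, month, day = stem.split("-")      # assumes YYYY-MM-DD
        return "Consumption/{}/{}/{}{}".format(year, month, day, ext)


    print(blob_name_for("2021-12-31.doc"))  # Consumption/2021/12/31.doc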