I want to migrate files from Digital Ocean Storage into Google Cloud Storage programmatically, without rclone.

I know the exact location of the file in Digital Ocean Storage (DOS), and I have the signed URL for Google Cloud Storage (GCS).

How can I modify the following code so that it copies the DOS file directly into GCS, without an intermediate download to my computer?

from google.cloud import storage


def upload_to_gcs_bucket(blob_name, path_to_file, bucket_name):
    """Upload a local file to a bucket and return the blob's public URL."""

    # Explicitly use service account credentials by specifying the private key
    # file.
    storage_client = storage.Client.from_service_account_json(
        'creds.json')

    # print(list(storage_client.list_buckets()))

    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(blob_name)
    blob.upload_from_filename(path_to_file)

    # The URL is only reachable if the object is publicly readable.
    return blob.public_url
Dharmaraj
london_utku

The short answer is that you can't. There is no interface/API between Google Cloud Storage and other cloud storage services. There are services that can do this for you (see @DazWilkin's answer), but in summary, you must download and then upload using a service (yours or another). To improve performance, use a VM in the cloud. – John Hanley Mar 27 '22 at 18:11

1 Answer

Google's Storage Transfer Service should be an answer for this type of problem (particularly because DigitalOcean Spaces, like most such services, is S3-compatible). But (!) I think (I'm unfamiliar with it and unsure) it can't be used for this configuration.

There is no way to transfer files from a source to a destination without some form of intermediate transfer, but you can use memory rather than file storage as the intermediary. Memory is generally more constrained than file storage, and if you run multiple transfers concurrently, each will consume some amount of memory.
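
To illustrate using memory as the intermediary, here is a minimal sketch. The helper name `copy_via_memory` is hypothetical, and the commented usage assumes the source object is read with `boto3` through the Spaces S3-compatible endpoint, which is my assumption rather than something from the question. The whole object is held in RAM, so this only suits objects that fit comfortably in memory.

```python
def copy_via_memory(source_stream, blob, content_type="application/octet-stream"):
    """Read the whole source object into RAM, then upload it to a GCS blob.

    `source_stream` is any object with a read() method that returns bytes.
    `blob` is a google.cloud.storage Blob for the destination object.
    """
    data = source_stream.read()  # the entire object now lives in memory
    blob.upload_from_string(data, content_type=content_type)
    return len(data)


# Hypothetical usage (bucket names, key, and endpoint are placeholders):
#
#   import boto3
#   from google.cloud import storage
#
#   s3 = boto3.client("s3", endpoint_url="https://nyc3.digitaloceanspaces.com")
#   body = s3.get_object(Bucket="my-space", Key="path/to/file")["Body"]
#   client = storage.Client.from_service_account_json("creds.json")
#   blob = client.bucket("my-gcs-bucket").blob("path/to/file")
#   copy_via_memory(body, blob)
```

Each concurrent transfer holds one full object in memory, which is the constraint described above.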

It's curious that you're using Signed URLs. Signed URLs are generally provided by a third party to grant limited access to that party's buckets. If you own the destination bucket, it will be easier to access it directly using one of Google's client libraries, such as the Python Client Library.

The Python examples include uploading from file and from memory. It will likely be best to stream the files into Cloud Storage if you'd prefer to not create intermediate files. Here's a Python example
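
As a rough sketch of the streaming approach, the following copies the object in fixed-size chunks, so only one chunk at a time is held in memory. It assumes a `google-cloud-storage` version that provides `Blob.open`, and it assumes the source is read with `boto3` through the Spaces S3-compatible endpoint. The function names, the chunk size, and the `endpoint_url` parameter are illustrative choices of mine, not part of the original question.

```python
import shutil

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB per read; an arbitrary choice


def stream_to_gcs(source_stream, blob, chunk_size=CHUNK_SIZE):
    """Copy a readable binary stream into a GCS blob, chunk by chunk."""
    with blob.open("wb") as writer:
        shutil.copyfileobj(source_stream, writer, length=chunk_size)


def migrate_object(spaces_bucket, spaces_key, gcs_bucket_name, blob_name,
                   endpoint_url):
    """Stream one object from DigitalOcean Spaces into Google Cloud Storage."""
    # Imported here so stream_to_gcs can be used without these packages.
    import boto3
    from google.cloud import storage

    # Assumption: Spaces credentials are supplied through the usual
    # AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables.
    s3 = boto3.client("s3", endpoint_url=endpoint_url)
    body = s3.get_object(Bucket=spaces_bucket, Key=spaces_key)["Body"]

    client = storage.Client.from_service_account_json("creds.json")
    blob = client.bucket(gcs_bucket_name).blob(blob_name)

    stream_to_gcs(body, blob)
    return blob.public_url
```

The bytes still pass through the machine running this code, so running it on a cloud VM, as the comment on the question suggests, should shorten the network path.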

DazWilkin