
I am using the Python SDK to download a large (1 GB) object from a Cloud Storage bucket inside a Cloud Run service. The Cloud Run instance has 8 GB of memory and 4 CPUs. I tried various chunk sizes and worker counts.
Below is my code:

  from datetime import datetime, timezone
  import os

  from google.cloud import storage
  from google.cloud.storage import transfer_manager

  storage_client = storage.Client()
  bucket = storage_client.bucket('mybucket')

  # Candidate chunk sizes (32 MiB, 50 MiB, ~75 MiB, 100 MiB) and worker counts.
  chunk_list = [33554432, 52428800, 78905344, 104857600]
  work_list = [4, 8, 16, 22, 32, 48]

  for chunk in chunk_list:
    for worker in work_list:
      blob = bucket.blob('my1GbBlob')
      filename = '/tmp/myTmpFile_' + str(worker) + '_' + str(chunk)
      print('download started: ', 'worker:', worker, 'chunk_size: ', chunk)
      start_time = datetime.now(timezone.utc)
      transfer_manager.download_chunks_concurrently(
          blob, filename, chunk_size=chunk, max_workers=worker)
      delta_time = datetime.now(timezone.utc) - start_time
      execution_time_ms = round(delta_time.total_seconds() * 1000)
      print('download completed: ', worker, chunk, execution_time_ms)
      os.remove(filename)

I first ran this code on Cloud Run with 2 CPUs and 8 GB of memory, where it used 100% CPU, so I switched to Cloud Run with 4 CPUs and 8 GB, where CPU utilization is a normal 55%. The download never completes in less than 13 seconds for the 1 GB file.
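One thing worth isolating is the worker type: in recent releases of google-cloud-storage, `download_chunks_concurrently` accepts a `worker_type` argument, with process-based workers (`transfer_manager.PROCESS`) as the default. Since the downloads are network-bound and threads release the GIL during I/O, thread workers may cut the CPU overhead seen above. A minimal sketch, reusing the bucket and object names from the question and assuming a version of the library that exposes `worker_type`:

  from google.cloud import storage
  from google.cloud.storage import transfer_manager

  storage_client = storage.Client()
  blob = storage_client.bucket('mybucket').blob('my1GbBlob')

  # Thread workers are cheaper to start than processes and release the GIL
  # while waiting on network I/O, so they can reduce CPU overhead for
  # network-bound downloads.
  transfer_manager.download_chunks_concurrently(
      blob,
      '/tmp/myTmpFile_threads',
      chunk_size=33554432,  # 32 MiB
      max_workers=16,
      worker_type=transfer_manager.THREAD,  # default is transfer_manager.PROCESS
  )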

How can I achieve a higher download speed?

Notes:
1. This is a Cloud Functions gen2 environment.
2. The same script completes in about 4 seconds on the default Cloud Shell machine.
3. The machine is in us-central1 and the bucket is in the US multi-region. My Cloud Shell is located in asia-southeast1 (a single-stream baseline sketch follows these notes).
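To separate network throughput from parallelism overhead, one diagnostic is to time a plain single-stream download of the same object from inside Cloud Run and compare it against the chunked numbers. A minimal sketch, assuming the same bucket and object names as above:

  from datetime import datetime, timezone

  from google.cloud import storage

  storage_client = storage.Client()
  blob = storage_client.bucket('mybucket').blob('my1GbBlob')

  # Single-stream baseline: if this is not much slower than the chunked
  # download, the bottleneck is likely per-instance network throughput,
  # not chunk size or worker count.
  start = datetime.now(timezone.utc)
  blob.download_to_filename('/tmp/myTmpFile_baseline')
  elapsed = (datetime.now(timezone.utc) - start).total_seconds()
  print('single-stream download took', round(elapsed * 1000), 'ms')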

  • That's exactly what the download_chunks_concurrently method of the Cloud SDK does, and hence I am expecting more speed. – Djai Jul 23 '23 at 18:38

0 Answers