
I am using the latest Azure Storage SDK (azure-storage-blob 12.7.1). It works fine for smaller files but throws exceptions for larger files (> 30 MB):

azure.core.exceptions.ServiceResponseError: ('Connection aborted.', timeout('The write operation timed out'))

    from azure.storage.blob import ContainerClient

    def upload(file):
        settings = read_settings()  # loads the connection string from config
        connection_string = settings['connection_string']
        container_client = ContainerClient.from_connection_string(connection_string, 'backup')
        blob_client = container_client.get_blob_client(file)
        with open(file, "rb") as data:
            blob_client.upload_blob(data)
            print(f'{file} uploaded to blob storage')

    upload('crashes.csv')
DevMonk
  • Hi, could this be the issue https://github.com/Azure/azure-storage-python/issues/228 or perhaps https://github.com/Azure/azure-sdk-for-python/issues/12166? – IronMan Feb 27 '21 at 00:03
  • I think those are old and closed. – DevMonk Feb 27 '21 at 00:30
  • https://github.com/Azure/azure-sdk-for-python/issues/12166 seems an accurate description of your issue; please comment on it to help prioritize the work (I work at MS on the SDK team). – Laurent Mazuel Feb 27 '21 at 21:54
  • @LaurentMazuel I have commented on GitHub as you mentioned. – DevMonk Feb 28 '21 at 05:53

1 Answer


Everything seems to work for me with your code when I tried uploading a ~180 MB .txt file. But since uploading small files works for you, uploading your big file in small parts could be a workaround. Try the code below:

from azure.storage.blob import BlobClient

storage_connection_string = ''
container_name = ''
dest_file_name = ''
local_file_path = ''

blob_client = BlobClient.from_connection_string(
    storage_connection_string, container_name, dest_file_name)

# Upload 4 MB per request
chunk_size = 4 * 1024 * 1024

# Start from a fresh append blob (exists() is a method, so it must be called)
if blob_client.exists():
    blob_client.delete_blob()
blob_client.create_append_blob()

with open(local_file_path, "rb") as stream:
    while True:
        read_data = stream.read(chunk_size)
        if not read_data:
            print('uploaded')
            break
        blob_client.append_block(read_data)
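A note on the chunk size: 4 MB is also the maximum the service accepts for a single append block, so chunk_size should not be raised above it; smaller chunks trade more requests for a lower chance of any one write hitting the timeout on a slow connection.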

Result: (screenshot of the successful upload omitted)

Stanley Gong
  • Thanks for your inputs! But as per the SDK spec, the API should handle the chunking internally, so I am trying to figure out why the API is not working as expected. As mentioned above, the SDK team already has a similar issue active. – DevMonk Mar 01 '21 at 04:07
  • @DevMonk, I see, let's see what the root reason is. – Stanley Gong Mar 01 '21 at 04:11
  • @DevMonk, by the way, does the solution of setting max_single_put_size to a smaller value work for you? – Stanley Gong Mar 01 '21 at 04:23
  • Isn't the param 'max_single_put_size' part of the legacy SDK? – DevMonk Mar 01 '21 at 04:58
  • @DevMonk, actually, it is also part of the V12 SDK. Go to https://learn.microsoft.com/en-us/python/api/azure-storage-blob/azure.storage.blob.blobclient?view=azure-python and search for max_block_size or max_single_put_size; you can find them there (see the sketch after these comments). – Stanley Gong Mar 01 '21 at 06:20
  • And for the upload_blob function, I think you can also extend the timeout config: https://learn.microsoft.com/en-us/python/api/azure-storage-blob/azure.storage.blob.blobclient?view=azure-python#upload-blob-data--blob-type--blobtype-blockblob---blockblob----length-none--metadata-none----kwargs- – Stanley Gong Mar 01 '21 at 06:35
  • Those params didn't resolve the issue. I am going with your approach of chunking. (But the underlying issue is still there, which the Azure SDK team needs to resolve.) – DevMonk Mar 01 '21 at 18:24
  • I am still getting the timeout issue as well; the append approach seems to work for me. – Paul Jan 17 '22 at 02:29
  • I tried this solution but it's taking 2 minutes to upload 50 MB; is this normal? It takes about 8 seconds every time it reaches the line blob_client.append_block(read_data). – Mikhael Abdallah Jan 06 '23 at 14:02
  • I decided to use max_block_size and max_single_put_size, and this seems to lessen the timeouts when the LTE network is slow. – Paul Feb 03 '23 at 02:23
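
For reference, here is a minimal sketch of how those tuning knobs could be combined with the question's original block-blob upload. The connection string and the specific values chosen are illustrative assumptions, not verified fixes; max_block_size, max_single_put_size, timeout, and max_concurrency are documented parameters of the v12 SDK:

from azure.storage.blob import BlobClient

# Illustrative values only: lowering max_single_put_size forces a chunked
# (staged-block) upload instead of one large PUT, and max_block_size caps
# the size of each staged block.
connection_string = '<your-connection-string>'  # placeholder

blob_client = BlobClient.from_connection_string(
    connection_string,
    'backup',        # container name from the question
    'crashes.csv',   # blob name from the question
    max_block_size=4 * 1024 * 1024,       # stage 4 MB blocks
    max_single_put_size=4 * 1024 * 1024,  # chunk anything larger than 4 MB
)

with open('crashes.csv', 'rb') as data:
    # timeout is the per-request server-side timeout in seconds;
    # max_concurrency stages several blocks in parallel.
    blob_client.upload_blob(data, overwrite=True, timeout=600, max_concurrency=2)

Whether this resolves the timeout depends on the underlying SDK issue discussed above; the append-blob chunking in the answer remains the workaround that the asker reported success with.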