I'm concurrently uploading 1500 blobs (1 MB max per blob) to a container in an Azure Storage Account (StorageV2, general purpose v2).
So far I'm uploading them via the Python package azure-storage-blob with the pseudo-code below.
import asyncio
from azure.storage.blob.aio import BlobServiceClient

async def upload_blobs_async(blobs_args: list):
    # One task per blob, all started at once -- no limit on concurrency
    tasks = [asyncio.create_task(upload_blob_async(blob_arg)) for blob_arg in blobs_args]
    # Wait until every upload finishes; exceptions stay inside the tasks
    finished, pending = await asyncio.wait(tasks, return_when=asyncio.ALL_COMPLETED)
    return None
....
async def upload_blob_async(args: dict):
    # Instantiate a new BlobServiceClient using a connection string (one per blob)
    blob_service_client = BlobServiceClient.from_connection_string(CONNECTION_STRING_STORAGE)
    async with blob_service_client:
        # Instantiate a new ContainerClient (container name, not blob name)
        container_client = blob_service_client.get_container_client(args["container_name"])
        # Upload a blob to the container
        await container_client.upload_blob(name=args["blob_name"], data=args["data"])
With no restriction on the number of parallel requests, sending 1500 blobs has a huge impact on my end-to-end (E2E) response time.

What would you recommend to lower the E2E time? Using a semaphore to send requests in batches of roughly 100? Also, I need to keep the general-purpose storage account (instead of a premium account) because I use blob index tags, which are not available on premium.
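For reference, this is roughly the semaphore-based throttling I have in mind (a minimal sketch, not tested at scale; MAX_CONCURRENT_UPLOADS, the container name "my-container", and the args dict keys are my own assumptions). It also reuses a single BlobServiceClient for all uploads instead of creating one per blob, which I suspect is part of my overhead:

    import asyncio
    from azure.storage.blob.aio import BlobServiceClient

    MAX_CONCURRENT_UPLOADS = 100  # assumed batch size, to be tuned

    async def upload_blobs_throttled(blobs_args: list):
        semaphore = asyncio.Semaphore(MAX_CONCURRENT_UPLOADS)
        # One shared client and container client for all uploads
        async with BlobServiceClient.from_connection_string(CONNECTION_STRING_STORAGE) as service_client:
            container_client = service_client.get_container_client("my-container")

            async def upload_one(args: dict):
                # At most MAX_CONCURRENT_UPLOADS uploads are in flight at any time
                async with semaphore:
                    await container_client.upload_blob(
                        name=args["blob_name"], data=args["data"], overwrite=True
                    )

            # gather with return_exceptions=True so one failed upload
            # does not cancel the other 1499
            results = await asyncio.gather(
                *(upload_one(a) for a in blobs_args), return_exceptions=True
            )
        return results

Is this the right direction, and is ~100 a sensible concurrency limit for 1 MB blobs, or is there a better approach?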