In Azure SDK v11, we could specify ParallelOperationThreadCount through BlobRequestOptions. In Azure SDK v12, BlobClientOptions has no such setting, and for BlockBlobClient (CloudBlockBlob in Azure SDK v11) parallelism is mentioned only in the download methods.

We have three files: one 200MB, one 150MB, and one 20MB. For each file, we want the file to be split into blocks and have those uploaded in parallel. Is this automatically done by the BlockBlobClient? If possible, we would like to do these operations for the 3 files in parallel as well.

AGuyCalledGerald

2 Answers

You can also make use of StorageTransferOptions in v12.

Sample code:

   // Requires: using Azure.Storage; using Azure.Storage.Blobs;
   // conn_str is your storage account connection string.
   BlobServiceClient blobServiceClient = new BlobServiceClient(conn_str);
   BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient("xxx");
   BlobClient blobClient = containerClient.GetBlobClient("xxx");

   // Configure parallelism here; 8 is only an example value.
   StorageTransferOptions transferOptions = new StorageTransferOptions
   {
       MaximumConcurrency = 8
   };

   // "xxx" is the path of the local file to upload.
   blobClient.Upload("xxx", transferOptions: transferOptions);

By the way, for uploading large files, you can also use the Microsoft Azure Storage Data Movement Library, which may give better performance.

Ivan Glasenberg
  • Thanks for the advice. I have a couple of questions. Do you know if using ```StorageTransferOptions``` is better than just using the ```BlockBlobClient```? And as for the Azure Storage Data Movement Library, do you know if it truly performs better than Azure SDK v12? I wasn't sure, as it is older and Azure SDK v12 is newer. –  May 18 '20 at 05:21
  • @pegalusAlt, Storage Data Movement Library should be the fastest one to download/upload blobs, since it's optimized for uploading/downloading. – Ivan Glasenberg May 18 '20 at 05:45
  • I just ran 5 tests with Azure SDK v12 using ```BlockBlobClient``` and then 5 tests using the Storage Data Movement Library. In each series, I uploaded the files sequentially. After each test, I deleted the files. I also followed the Storage Data Movement Library recommendation of setting ```ServicePointManager.DefaultConnectionLimit = Environment.ProcessorCount * 8;``` and ```ServicePointManager.Expect100Continue = false;``` for both series of tests. I got an average of 37.463 seconds with Azure SDK v12 and 41.863 seconds with the Storage Data Movement Library. –  May 18 '20 at 06:39
  • @pegalusAlt, what size are the files you're testing with? – Ivan Glasenberg May 18 '20 at 07:36
  • I tested it on two 200MB files. Is it the case that these are not large enough to benefit from the Storage Data Movement Library? –  May 18 '20 at 08:19
  • @pegalusAlt, can you update your question with the test code? The Data Movement Library may take some time to prepare its environment, and yes, with larger data it is faster than the other approaches. – Ivan Glasenberg May 18 '20 at 08:25
  • This time I tested two 500MB files. Azure SDK v12 took 95.282 seconds, while the Storage Data Movement Library (SDML) varied a lot. In one series of 5 tests it came very close at 95.377 seconds, but in another series of 5 tests it averaged 157.399 seconds. What I noticed is that when the SDML takes a long time, it is stuck uploading one chunk while the rest of its siblings have finished. This behavior remains even when decreasing ```TransferManager.Configurations.ParallelOperations```. I will post a new question for this later today (05/18/2020) with the code, as I have to sleep now. –  May 18 '20 at 08:58
  • @pegalusAlt, OK, and please post your code here. I ran the same test (file size about 300MB), and SDML was much faster than v12. – Ivan Glasenberg May 18 '20 at 09:49
  • I have posted a new question because the performance comparison is a separate issue from this topic. https://stackoverflow.com/questions/61879697/azure-sdk-v12-performance-vs-storage-data-movement-library-sdml-performance –  May 18 '20 at 22:05

Using Fiddler, I verified that BlockBlobClient does indeed upload the files in chunks without any extra work. To upload the three files in parallel, I simply created a task for each one, added it to a list of tasks, and used await Task.WhenAll(tasks).
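
A minimal sketch of that approach, not the exact code I ran: it uses BlobClient.UploadAsync with StorageTransferOptions rather than BlockBlobClient, and the environment variable name, container name, file paths, and concurrency value are all placeholders I made up for illustration. It assumes the Azure.Storage.Blobs NuGet package is installed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Azure.Storage;
using Azure.Storage.Blobs;

public static class ParallelUploadSketch
{
    public static async Task Main()
    {
        // Assumption: the connection string is stored in this (hypothetical) environment variable.
        string connectionString =
            Environment.GetEnvironmentVariable("AZURE_STORAGE_CONNECTION_STRING");

        // "xxx" is a placeholder container name; the container must already exist.
        var containerClient = new BlobContainerClient(connectionString, "xxx");

        // Hypothetical local paths standing in for the 200MB, 150MB, and 20MB files.
        string[] paths = { "file-200mb.bin", "file-150mb.bin", "file-20mb.bin" };

        // Example value only; tune MaximumConcurrency for your network and machine.
        var transferOptions = new StorageTransferOptions { MaximumConcurrency = 4 };

        // Start one upload task per file without awaiting it yet.
        var tasks = new List<Task>();
        foreach (string path in paths)
        {
            BlobClient blobClient = containerClient.GetBlobClient(Path.GetFileName(path));
            tasks.Add(blobClient.UploadAsync(path, transferOptions: transferOptions));
        }

        // Wait until all uploads have finished; this throws if any of them failed.
        await Task.WhenAll(tasks);
    }
}
```

Because every upload is started before the first await, the three files are transferred concurrently, and MaximumConcurrency caps the number of parallel workers within each individual upload.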