I am developing an Azure application that at some point needs to upload (or download) a large number of small blobs to a single container (more than 1k blobs, less than 1 MB each). To speed this up, I'd like to use multiple threads for uploading (or downloading) the blobs.

This is the routine I use to upload a single blob:

CloudStorageAccount storageAccount = CloudStorageAccount.Parse(ConnectionString);
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer blobContainer = 
    blobClient.GetContainerReference(ContainerName);
blobContainer.CreateIfNotExist();

CloudBlob blob = blobContainer.GetBlobReference(Id);
blob.UploadByteArray(Data);

For each type used in the code above, MSDN says the following:

Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.

Does this mean I need to execute the following code in every thread? Or can I execute it only once and share a single instance of CloudBlobContainer among different threads?

CloudStorageAccount storageAccount = CloudStorageAccount.Parse(ConnectionString);
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer blobContainer = 
    blobClient.GetContainerReference(ContainerName);

I would really like to use a single instance of CloudBlobContainer across threads; otherwise it seriously slows down the whole upload (download) process.

monofilm

1 Answer

You should be fine sharing a single blob container reference as long as you are not trying to perform an update on the container itself (even then, I think it would still be fine in most scenarios, such as List). In fact, you don't even need the container reference if you are sure the container exists:

client.GetContainerReference("foo").GetBlobReference("bar");
client.GetBlobReference("foo/bar");  //same

As you can see, the only reason to get a container reference is if you want to perform an operation on the container itself (list, delete, etc.). If you keep the blob references in separate threads, you will be fine.
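
To make that concrete, here is a minimal sketch of the pattern described above: the account, client, and container reference are created once and shared, while each parallel iteration obtains its own blob reference. It uses only the calls from the question plus Parallel.ForEach from .NET 4. The class name ParallelBlobUpload, the method UploadAll, and its parameters are hypothetical names for illustration, and the sketch assumes the shared objects are not mutated while the uploads run.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

// Hypothetical helper, not part of the storage client library.
static class ParallelBlobUpload
{
    // items maps a blob name to the bytes to upload (assumed input shape).
    public static void UploadAll(
        string connectionString,
        string containerName,
        IEnumerable<KeyValuePair<string, byte[]>> items)
    {
        // Created once, then only read from the worker threads.
        CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
        CloudBlobClient client = account.CreateCloudBlobClient();
        CloudBlobContainer container = client.GetContainerReference(containerName);

        // Container-level operation: done once, before any parallel work starts.
        container.CreateIfNotExist();

        Parallel.ForEach(items, item =>
        {
            // Each iteration gets its own blob reference; none is shared.
            CloudBlob blob = container.GetBlobReference(item.Key);
            blob.UploadByteArray(item.Value);
        });
    }
}
```

If you would rather not share the container at all, the lambda can call client.GetBlobReference(containerName + "/" + item.Key) instead, which is the second form shown above.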

dunnry
  • Thanks for your reply. I am wondering whether it is also possible to share CloudBlobClient and CloudStorageAccount references among threads? – monofilm Aug 03 '11 at 08:00
  • In general yes, that should be fine, as the CloudStorageAccount is really nothing more than data and I don't believe it is mutable (you cannot change credentials once set). The client is similar, I believe. However, you can get weird behavior if you mutate the settings (like DefaultDelimiter, Retry, etc.) on different threads. To be safe, it would be best to share the CloudStorageAccount and spin up a new client in each thread. That is lightweight to do anyway. – dunnry Aug 03 '11 at 15:32
  • Can you please check this http://stackoverflow.com/questions/24229288/parallel-blob-upload-throwing-404-bad-request-intermittently – Imran Qadir Baksh - Baloch Jun 15 '14 at 12:59
  • This answer still seems to hold with the latest packages: WindowsAzure.Storage 9.3 – PenFold Jul 22 '18 at 17:37