
I want to upload a zip file in small chunks (less than 5 MB) to blob containers in Microsoft Azure Storage. I already configured a 4 MB chunk limit in BlobRequestOptions, but when I run my code and check the memory usage in Azure, it is not uploading in chunks. I am using C# with .NET Core. Because I want to zip files that are already located in Azure, I first download the individual files to a stream, add the stream to a zip archive, and then upload the zip back to the cloud. The following is my code:

if (CloudStorageAccount.TryParse(_Appsettings.GetSection("StorConf").GetSection("StorageConnection").Value, out CloudStorageAccount storageAccount)) {
 CloudBlobClient BlobClient = storageAccount.CreateCloudBlobClient();

 TimeSpan backOffPeriod = TimeSpan.FromSeconds(2);
 int retryCount = 1;
 BlobRequestOptions bro = new BlobRequestOptions() {
  SingleBlobUploadThresholdInBytes = 4096 * 1024, // 4 MB
  ParallelOperationThreadCount = 1,
  RetryPolicy = new ExponentialRetry(backOffPeriod, retryCount),
  // new
  ServerTimeout = TimeSpan.MaxValue,
  MaximumExecutionTime = TimeSpan.FromHours(3),
  //EncryptionPolicy = policy
 };

 // set blob request options for the created blob client
 BlobClient.DefaultRequestOptions = bro;

 // using specified container which comes via transaction id
 CloudBlobContainer container = BlobClient.GetContainerReference(transactionId);

 using(var zipArchiveMemoryStream = new MemoryStream()) {
  using(var zipArchive = new ZipArchive(zipArchiveMemoryStream, ZipArchiveMode.Create, true)) // new
  {

   foreach(FilesListModel FileName in filesList) {

    if (await container.ExistsAsync()) {
     CloudBlob file = container.GetBlobReference(FileName.FileName);

     if (await file.ExistsAsync()) {
      // zip: get stream and add zip entry
      var entry = zipArchive.CreateEntry(FileName.FileName, CompressionLevel.Fastest);

      // approach 1
      using(var entryStream = entry.Open()) {
       await file.DownloadToStreamAsync(entryStream, null, bro, null);

       await entryStream.FlushAsync();

       entryStream.Close();
      }
     } else {
      downlReady = "false";
     }
    } else {
     // case: Container does not exist

     //return BadRequest("Container does not exist");

    }
   }
  }

  if (downlReady == "true") {
   string zipFileName = "sample.zip";
   CloudBlockBlob zipBlockBlob = container.GetBlockBlobReference(zipFileName);

   zipArchiveMemoryStream.Position = 0;
   //zipArchiveMemoryStream.Seek(0, SeekOrigin.Begin);

   // new
   zipBlockBlob.Properties.ContentType = "application/x-zip-compressed";

   await zipArchiveMemoryStream.FlushAsync();

   await zipBlockBlob.UploadFromStreamAsync(zipArchiveMemoryStream, zipArchiveMemoryStream.Length, null, bro, null);
  }

  zipArchiveMemoryStream.Close();

 }
}

The following is a snapshot of the memory usage (see private_Memory) in the Azure Kudu process explorer:

[screenshot: memory usage]

Any suggestions would be really helpful. Thank you.

UPDATE 1:

To make it more clear: I have files which are already located in Azure Blob Storage. Now I want to read these files from the container and create a ZIP that contains all of them. The major challenge is that my code obviously loads all the files into memory to create the ZIP. Is it possible, and if so how, to read the files from a container and write the ZIP back into the same container in pieces, so that my Azure web app does NOT need to load the whole files into memory? Ideally I would read the files in pieces and start writing the ZIP at the same time, so that my Azure web app consumes less memory.

liquidwall
  • Could you tell me what you mean by "upload file in chunks"? Do you want to split a block blob into smaller chunks of blocks, upload them one by one or in parallel, and lastly combine them all into a single block blob? – Jim Xu Feb 05 '20 at 03:23
  • @JimXu First I downloaded the files (that I want to zip together) from the cloud to the stream "zipArchiveMemoryStream". Now I want to upload the stream to a block blob in small parts ( I don't want to upload the whole stream to the block blob). Finally the whole zip file will be in a single block blob. – liquidwall Feb 05 '20 at 10:13
  • Could you please tell me what you mean by "I don't want to upload the whole stream to the block blob"? Do you want to upload only a part of the stream to the blob, or split the stream into small parts and then upload them one by one? – Jim Xu Feb 05 '20 at 11:33
  • I want to upload the whole stream by splitting it into small parts, then uploading the parts one by one. – liquidwall Feb 05 '20 at 12:05
  • Please see UPDATE 1, which hopefully makes my problem clearer. @JimXu and others – liquidwall Feb 05 '20 at 14:46
  • According to your need, you can try to use [PutBlock](https://msdn.microsoft.com/en-us/library/azure/dd135726.aspx) and [PutBlockList](https://msdn.microsoft.com/en-us/library/azure/dd135726.aspx). You can zip a file and then upload each part to Azure with PutBlock. Finally, you can commit all the blocks as a single blob. For more details, please refer to https://stackoverflow.com/questions/32558511/upload-zip-chunks-and-join-them-on-azure-platform and http://syncbite.com/code/posts/80/upload-files-larger-than-4-mb-in-chunks-using-azure-blob-storage – Jim Xu Feb 06 '20 at 01:02
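
To illustrate the PutBlock/PutBlockList approach suggested in the last comment, here is a minimal sketch assuming the same WindowsAzure.Storage CloudBlockBlob type that the question already uses. The helper name UploadInBlocksAsync and the 4 MB block size are illustrative choices, not part of the original code:

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Blob;

public static class ChunkedBlobUpload
{
    // Reads 'source' in fixed-size pieces, uploads each piece as an
    // uncommitted block (Put Block), then commits the block list so the
    // pieces become a single block blob (Put Block List).
    public static async Task UploadInBlocksAsync(
        CloudBlockBlob blob, Stream source, int blockSize = 4 * 1024 * 1024)
    {
        var blockIds = new List<string>();
        var buffer = new byte[blockSize];
        int blockNumber = 0;
        int bytesRead;

        while ((bytesRead = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            // Block IDs must be base64-encoded and all of the same length.
            string blockId = Convert.ToBase64String(
                Encoding.UTF8.GetBytes(blockNumber.ToString("D6")));
            blockIds.Add(blockId);

            using (var blockData = new MemoryStream(buffer, 0, bytesRead))
            {
                await blob.PutBlockAsync(blockId, blockData, null);
            }

            blockNumber++;
        }

        // Commit all uploaded blocks as one blob.
        await blob.PutBlockListAsync(blockIds);
    }
}

With this approach only one block (here 4 MB) is buffered in memory at a time during the upload; the source stream itself still has to be readable from the start, however, so on its own it does not remove the need to build the whole zip in a MemoryStream.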

1 Answer


I found the solution by referring to this Stack Overflow post:

How can I dynamically add files to a zip archive stored in Azure blob storage?

The way to do it is to write the ZIP to the destination blob's write stream at the same time as the input files are read/downloaded, instead of buffering the whole archive in a MemoryStream first.

Below is my code snippet:

// note: despite its name, zipArchiveMemoryStream is now the destination
// blob's write stream (from OpenWriteAsync), not a MemoryStream
using (var zipArchiveMemoryStream = await zipBlockBlob.OpenWriteAsync(null, bro, null))
using (var zipArchive = new ZipArchive(zipArchiveMemoryStream, ZipArchiveMode.Create))
{
    foreach (FilesListModel FileName in filesList)
    {
        if (await container.ExistsAsync())
        {
            CloudBlob file = container.GetBlobReference(FileName.FileName);

            if (await file.ExistsAsync())
            {
                // zip: add an entry for this blob
                var entry = zipArchive.CreateEntry(FileName.FileName, CompressionLevel.Fastest);

                // download the blob directly into the zip entry stream
                using (var entryStream = entry.Open())
                {
                    await file.DownloadToStreamAsync(entryStream, null, bro, null);
                }
            }
        }
    }

    // no explicit Close() here: disposing the ZipArchive (end of its using
    // block) writes the zip central directory, and disposing the write
    // stream afterwards commits the blob
}
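
If you also want explicit control over how large each uploaded piece is, the stream returned by OpenWriteAsync uploads the blob in blocks whose size is controlled by CloudBlockBlob.StreamWriteSizeInBytes (4 MB by default in this SDK). A minimal sketch, assuming the same zipBlockBlob and bro objects as in the question:

// Assumption: zipBlockBlob and bro are the objects defined in the question.
// StreamWriteSizeInBytes sets the size of the blocks that the stream
// returned by OpenWriteAsync uploads (the SDK default is 4 MB).
zipBlockBlob.StreamWriteSizeInBytes = 4 * 1024 * 1024; // upload in 4 MB blocks

using (var zipStream = await zipBlockBlob.OpenWriteAsync(null, bro, null))
using (var zipArchive = new ZipArchive(zipStream, ZipArchiveMode.Create))
{
    // add entries exactly as in the snippet above
}

This way the web app holds roughly one block of the zip in memory at a time while the archive is being written, rather than the whole archive.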

liquidwall