
I have some code running behind an API that loops through a list of files on Azure Blob Storage, zips them up, and saves the final zip to the same storage account. I then provide a link to the zip file for my users to access.

This solution works fine provided the files are small. However, many of the files are in the 2-5 GB range, and as soon as these are tested I get an out-of-memory exception:

'Array dimensions exceeded supported range.'

I've seen systems like OneDrive and Google Drive create these archives very quickly, and I aspire to create that experience for my users. But I am also fine with notifying the user when the archive is ready to download, even if that is a few minutes later, as I will have their email address.

Here is a simplified version of the code, running in a console app:

using Microsoft.WindowsAzure.Storage;
using System.IO.Compression;


var account = CloudStorageAccount.Parse("ConnectionString");
var blobClient = account.CreateCloudBlobClient();
var container = blobClient.GetContainerReference("ContainerName");

var blob = container.GetBlockBlobReference("ZipArchive.zip");
using (var stream = await blob.OpenWriteAsync())
using (var zip = new ZipArchive(stream, ZipArchiveMode.Create))
{
    var files = new string[] {
        "files/psds/VeryLargePsd_1.psd",
        "files/psds/VeryLargePsd_2.psd",
        "files/psds/VeryLargePsd_3.psd",
        "files/zips/VeryLargeZip_1.zip",
        "files/zips/VeryLargeZip_2.zip"
    };
   
    foreach (var file in files)
    {
        var sourceBlob = container.GetBlockBlobReference(file);
        var index = file.LastIndexOf('/') + 1;
        var fileName = file.Substring(index, file.Length - index);
        var entry = zip.CreateEntry(fileName, CompressionLevel.Optimal);

        await sourceBlob.FetchAttributesAsync();
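        // The whole blob is buffered into a single managed array here; for blobs in
        // the 2-5 GB range this exceeds what a single .NET byte array can hold, which
        // is what triggers the exception above.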
        byte[] imageBytes = new byte[sourceBlob.Properties.Length];
        await sourceBlob.DownloadToByteArrayAsync(imageBytes, 0);

        using (var zipStream = entry.Open())
            zipStream.Write(imageBytes, 0, imageBytes.Length);
    }
}
INNVTV

1 Answer


As you mentioned, it works for small files but throws the error for large files.

Workarounds:

1) Upload the large files in small chunks, then zip them (a sketch of a chunked upload follows this list).

For more details, refer to this SO thread: Upload a zip file in small chunks to azure cloud blob storage

2) This tutorial shows you how to deploy an application that uploads large amounts of random data to an Azure storage account: Upload large amounts of random data in parallel to Azure storage

3) For uploading large files, you can use the Microsoft Azure Storage Data Movement Library for better performance. The library is designed for high-performance uploading, downloading, and copying of Azure Storage blobs and files.
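
To illustrate the first workaround, here is a minimal sketch of a chunked upload using the same WindowsAzure.Storage SDK as in the question; the 4 MB block size and the UploadInBlocksAsync helper name are illustrative choices, not part of the SDK.

using Microsoft.WindowsAzure.Storage.Blob;

// Illustrative helper: streams a local file into a block blob in small blocks
// instead of loading the whole file into one large array.
static async Task UploadInBlocksAsync(CloudBlockBlob blob, string filePath, int blockSize = 4 * 1024 * 1024)
{
    var blockIds = new List<string>();
    var buffer = new byte[blockSize];
    var blockNumber = 0;

    using (var file = File.OpenRead(filePath))
    {
        int read;
        while ((read = await file.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            // Block ids must be Base64 strings of equal length within a blob.
            var blockId = Convert.ToBase64String(BitConverter.GetBytes(blockNumber++));
            using (var chunk = new MemoryStream(buffer, 0, read))
                await blob.PutBlockAsync(blockId, chunk, null);
            blockIds.Add(blockId);
        }
    }

    // Commit the staged blocks, in order, to finalize the blob.
    await blob.PutBlockListAsync(blockIds);
}

// Usage (hypothetical local path):
// await UploadInBlocksAsync(container.GetBlockBlobReference("files/zips/VeryLargeZip_1.zip"), @"C:\temp\VeryLargeZip_1.zip");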

ShrutiJoshi-MT
  • Thanks for the info, however my scenario calls for a vast library of very large files. The users submit a request for selected files from the catalog to download them, so there is a wide array of combinations of zipped files that could be requested at any time. The zip occurs upon request and is not part of the upload process. I'm thinking I may need to have a VM get these requests from an event, copy each file to its local hard drive, zip them up on the VM, then push the zip back up to Azure. The link can then be sent to the requester when ready. If this works I will update. – INNVTV Dec 17 '21 at 13:53
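
For reference, a minimal sketch of the disk-based flow described in that comment, assuming the same WindowsAzure.Storage SDK as in the question; the requestedFiles array stands in for whatever a user's request contains, and the event trigger and email notification are left out.

using Microsoft.WindowsAzure.Storage;
using System.IO.Compression;

var account = CloudStorageAccount.Parse("ConnectionString");
var container = account.CreateCloudBlobClient().GetContainerReference("ContainerName");

// Illustrative stand-in for the files selected in a user's request.
var requestedFiles = new[] { "files/psds/VeryLargePsd_1.psd", "files/zips/VeryLargeZip_1.zip" };

// Work in a scratch folder on the VM's local disk.
var workDir = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
var filesDir = Path.Combine(workDir, "files");
Directory.CreateDirectory(filesDir);

// 1) Copy each requested blob to local disk; the SDK streams it to the file,
//    so nothing is held in memory as a single large array.
foreach (var file in requestedFiles)
{
    var sourceBlob = container.GetBlockBlobReference(file);
    var localPath = Path.Combine(filesDir, Path.GetFileName(file));
    await sourceBlob.DownloadToFileAsync(localPath, FileMode.Create);
}

// 2) Zip the folder on disk.
var zipPath = Path.Combine(workDir, "ZipArchive.zip");
ZipFile.CreateFromDirectory(filesDir, zipPath);

// 3) Push the finished zip back to the storage account, then send the link.
var zipBlob = container.GetBlockBlobReference("ZipArchive.zip");
await zipBlob.UploadFromFileAsync(zipPath);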