0

I'm not sure where and what am I doing wrong, but the zip that I'm creating using DotNetZip library, is creating a zip file whose contents are blank. Or the size of file in zip is showing as 0Kb and unable to open it.

Code:

public static async Task DotNetZipFileAsync(MemoryStream stream, string bucket, List<List<string>> pdfFileSet, IAmazonS3 s3Client)
        {
           
            using Ionic.Zip.ZipFile zip = new ZipFile();
            foreach (var pdfFile in pdfFileSet)
            {
                foreach (var file in pdfFile)
                {
                    GetObjectRequest request = new GetObjectRequest
                    {
                        BucketName = bucket,
                        Key = file
                    };
                    
                    using GetObjectResponse response = await s3Client.GetObjectAsync(request);
                    using Stream responseStream = response.ResponseStream;
                    ZipEntry zipEntry = zip.AddEntry(file.Split('/')[^1], responseStream);
                    await responseStream.CopyToAsync(stream);
                }
            }
            zip.Save(stream);
            stream.Seek(0,SeekOrigin.Begin);
            await stream.CopyToAsync(new FileStream(@"C:\LocalRepo\Temp.zip", FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.ReadWrite));
        }
    }
}
Aniruddha
  • 837
  • 3
  • 13
  • 37
  • What is the use of this parameter `stream`? Copying the reponse to this stream will probably consume the whole downloaded content. – Klaus Gütter Mar 24 '21 at 16:14
  • That is to write contents to stream and then create a zip file from stream. – Aniruddha Mar 24 '21 at 16:15
  • No, have a look at the [documentation](https://documentation.help/DotNetZip/7bc513f6-c21c-6a02-3964-f2a571308a33.htm): the second parameter to `AddEntry` shall be a stream readable at the moment `zip.Save()` is called. As you are disposing the responseStream before, it will not work. – Klaus Gütter Mar 24 '21 at 16:17
  • Didn't catch that, you mean `responseStream` is getting disposed, where? – Aniruddha Mar 24 '21 at 16:23
  • At the end of its scope (the `}`) – Klaus Gütter Mar 24 '21 at 16:25
  • @AlexeyRumyantsev Yes, this is also one of the problems. But because of `using Stream responseStream` it will be `Dispose()`d even if the zipEntry still has a reference. – Klaus Gütter Mar 24 '21 at 16:29
  • @KlausGütter I did change the code by moving `using Stream responseStream` to top of the function, but still it did not have any effect, same result. – Aniruddha Mar 24 '21 at 16:33
  • @KlausGütter You right I didn't mention that `using` statement. It looks like `ZipEntry` is not disposable and author has to track all his streams somewhere and explicitly dispose after all processing. – Alexey Rumyantsev Mar 24 '21 at 16:33
  • @Aniruddha remove that line `await responseStream.CopyToAsync(stream);`, remove memory stream it is redundant, introduce list of streams in start of method and add all your `responseStreams` in that list, in the end iterate and dispose all streams. `zip.Save()` you can do directly to `FileStream`. – Alexey Rumyantsev Mar 24 '21 at 16:35

2 Answers2

2

Your code has at least two problems:

  1. The read stream is completely consumed by the await responseStream.CopyToAsync(stream). You could rewind the responseStream to cope with this, but saving the data into the memory stream is completely useless.

  2. The response stream is disposed before zip.Save is called.

What you could do: keep the streams open until Save is called and dispose them afterwards. As Alexey Rumyantsev discovered (see comments), also the GetObjectResponse objects need to be kept until the ZIP file is saved.

using Ionic.Zip.ZipFile zip = new ZipFile();
var disposables = List<IDisposable>();
try
{
    foreach (var pdfFile in pdfFileSet)
    {
        foreach (var file in pdfFile)
        {
            GetObjectRequest request = new GetObjectRequest
            {
                BucketName = bucket,
                Key = file
            };
            
            var response = await s3Client.GetObjectAsync(request);
            disposables.Add(response);
            var responseStream = response.ResponseStream;
            disposables.Add(responseStream);
            ZipEntry zipEntry = zip.AddEntry(file.Split('/')[^1], responseStream);
        }
    }
    using var fileStream = new FileStream(@"C:\LocalRepo\Temp.zip", FileMode.Create, FileAccess.Write);        
    zip.Save(fileStream);
}
finally
{
    foreach (var disposable in disposables)
    {
        disposable.Dispose();
    }
}

The documentation has some hints ony how this could be made smarter.

Klaus Gütter
  • 11,151
  • 6
  • 31
  • 36
  • This code can have another problem it is with very large files, all streams are opened in same time keeping their connections alive, but then they are processed one by one, if reading one stream will take much time other can timeout. But this is outside of scope of this question and results in much more complex solution. – Alexey Rumyantsev Mar 24 '21 at 16:47
  • @Klaus Thanks for your assistance, however this isn't still working. Now the created zip file is giving an error of Invalid file – Aniruddha Mar 24 '21 at 16:51
  • I suppose you have to dispose `FileStream` by yourself, wrap it in `using`. – Alexey Rumyantsev Mar 24 '21 at 16:54
  • @AlexeyRumyantsev Thanks, but that still didn't work. – Aniruddha Mar 24 '21 at 17:01
  • 1
    Maybe your file is not truncated from previous attempts try using `FileMode.Create`. – Alexey Rumyantsev Mar 24 '21 at 17:09
  • @AlexeyRumyantsev Tried that too with no luck. Perhaps this requires thorough study of the library. – Aniruddha Mar 24 '21 at 17:14
  • Updated my answer according to the comments from @AlexeyRumyantsev – Klaus Gütter Mar 24 '21 at 17:33
  • @KlausGütter I appreciate your work and assistance, however it's still not working. A zip file get created with files in it of 0Kb. Unable to understand it, what's missing. – Aniruddha Mar 24 '21 at 18:59
  • Are you sure the responseStreams contain any data? – Klaus Gütter Mar 24 '21 at 19:25
  • Yes, it does contain data. – Aniruddha Mar 25 '21 at 04:07
  • @KlausGütter I think I've got it, problem in this line `using GetObjectResponse response = await ...` response object is disposed to early, it must live until working with it's underlying stream is finished, can you rewrite your answer to support storing tuples of response/responsestream for later disposal? – Alexey Rumyantsev Mar 25 '21 at 04:51
  • List of disposables is even better than tuples) Now this solution looks complete. – Alexey Rumyantsev Mar 25 '21 at 07:16
  • @AlexeyRumyantsev and Klaus, I know you both have done your level best to help me but nothing worked. The only thing that worked is in my answer – Aniruddha Mar 25 '21 at 15:13
  • 1
    @Aniruddha Your solution is memory inefficient and if you work with large files overallocation can become really huge,if you are ok with it then just use your solution. But if you need optimal solution then I see two outcomes, 1 - this library has bug working with streams and works only with byte arrays, in this case your solution is only possible, 2 - your are facing problem with responses timed out before you begin processing their stream, I've mentioned it in first comment, in this case you can go with enother overload of `AddEntry` where you can pass actions for opening and closing stream. – Alexey Rumyantsev Mar 25 '21 at 16:00
0
public static async Task DotNetZipFileAsync(string bucket, List<List<string>> pdfFileSet, IAmazonS3 s3Client)
{
    int read;
    using Ionic.Zip.ZipFile zip = new ZipFile();
    byte[] buffer = new byte[16 * 1024];
    
    foreach (var pdfFile in pdfFileSet)
    {
        foreach (var file in pdfFile)
        {
            GetObjectRequest request = new GetObjectRequest
            {
                BucketName = bucket,
                Key = file
            };

            using GetObjectResponse response = await s3Client.GetObjectAsync(request);
            using Stream responseStream = response.ResponseStream;
            using (MemoryStream ms = new MemoryStream())
            {
                while ((read = responseStream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    ms.Write(buffer, 0, read);
                }
                zip.AddEntry(file.Split('/')[^1], ms.ToArray());
            }
        }
    }
    using var fileStream = new FileStream(@"C:\LocalRepo\Temp.zip", FileMode.Create, FileAccess.Write);
    zip.Save(fileStream);
}
Aniruddha
  • 837
  • 3
  • 13
  • 37