
Is there a way to decompress a 3 GB gzip file containing 100 GB of JSON data to a local drive using C#? I have found the following code to decompress, but it has a limit of 8 GB:

using System;
using System.IO;
using System.IO.Compression;

public static void Decompress(FileInfo fileToDecompress)
{
    using (FileStream originalFileStream = fileToDecompress.OpenRead())
    {
        // Write the output next to the input, minus the ".gz" extension.
        string currentFileName = fileToDecompress.FullName;
        string newFileName = currentFileName.Remove(currentFileName.Length - fileToDecompress.Extension.Length);

        using (FileStream decompressedFileStream = File.Create(newFileName))
        using (GZipStream decompressionStream = new GZipStream(originalFileStream, CompressionMode.Decompress))
        {
            // CopyTo streams in chunks (80 KB by default); it does not load
            // the whole decompressed payload into memory first.
            decompressionStream.CopyTo(decompressedFileStream);
            Console.WriteLine("Decompressed: {0}", fileToDecompress.Name);
        }
    }
}

I also found a way to deserialize directly to objects from https://www.codeart.dk/blog/2020/5/reading-very-large-gzipped-json-files-in-c/, but I didn't find a way to decompress and save to a local drive.
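In case it helps to make the "decompress and save to disk" part concrete, here is a minimal sketch of a chunked copy loop that never holds more than one buffer of decompressed data in memory. The class and method names and the 1 MB buffer size are my own illustrative choices, not anything from the linked post:

using System;
using System.IO;
using System.IO.Compression;

public static class GzipHelper
{
    // Streams the decompressed bytes straight to disk in fixed-size chunks,
    // so memory usage stays at roughly the buffer size no matter how large
    // the decompressed output grows.
    public static void DecompressToFile(string gzipPath, string outputPath)
    {
        const int BufferSize = 1 << 20; // 1 MB chunks; tune as needed
        byte[] buffer = new byte[BufferSize];

        using (var input = new FileStream(gzipPath, FileMode.Open, FileAccess.Read,
                                          FileShare.Read, BufferSize, FileOptions.SequentialScan))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new FileStream(outputPath, FileMode.Create, FileAccess.Write,
                                           FileShare.None, BufferSize))
        {
            int bytesRead;
            long totalWritten = 0;
            while ((bytesRead = gzip.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, bytesRead);
                totalWritten += bytesRead;
            }
            Console.WriteLine("Decompressed {0:N0} bytes to {1}", totalWritten, outputPath);
        }
    }
}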

  • I think the issue might be that the decompression is done in memory first and only then copied to the decompressedFileStream. You'll need a way to have the decompressed data written directly to a file. – phuzi Aug 26 '22 at 15:13
  • This may be of interest: https://stackoverflow.com/questions/70933327/net-6-failing-at-decompress-large-gzip-text – phuzi Aug 26 '22 at 15:17
  • Would the [CopyTo overload](https://learn.microsoft.com/en-us/dotnet/api/system.io.stream.copyto?view=net-6.0#system-io-stream-copyto(system-io-stream-system-int32)) that accepts a buffer size help? (See the sketch after this list.) – Hans Kesting Aug 26 '22 at 15:29
  • CopyTo without a given buffer size uses an 80 KB default buffer. I don't see how changing that would do anything about an 8 GB barrier. How did you find out there is an 8 GB barrier? Via an exception? Show that. – Ralf Aug 26 '22 at 15:47
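For what it's worth, a hedged sketch of Hans Kesting's suggestion: Stream.CopyTo has an overload that takes an explicit buffer size (the parameterless form uses 81,920 bytes, as Ralf notes). Since CopyTo already streams chunk by chunk rather than buffering the whole output, a larger buffer mainly affects throughput rather than lifting any size ceiling. The variable names below come from the question's snippet:

// Drop-in replacement for the CopyTo call in the question's Decompress method;
// the second argument is the copy buffer size in bytes.
using (GZipStream decompressionStream = new GZipStream(originalFileStream, CompressionMode.Decompress))
{
    decompressionStream.CopyTo(decompressedFileStream, 1 << 20); // 1 MB buffer
}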

0 Answers