I need to decompress a BZip2 file, use the file's data, and then remove the decompressed file. The issue is that my method doesn't work when the BZip2 file is too large.

I've been using:

    using ICSharpCode.SharpZipLib.BZip2;

This is what I've been trying to do:

    private List<JObject> listWithObjects;

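    private void DecompressBzip2File(string bzipPath)
    {
        string tempPath = Path.GetRandomFileName();

        try
        {
            // Both streams are disposed by the using blocks, so Decompress is told not to own them.
            using (FileStream fs = new FileStream(bzipPath, FileMode.Open, FileAccess.Read))
            using (FileStream decompressedStream = File.Create(tempPath))
            {
                BZip2.Decompress(fs, decompressedStream, false);
            }

            LoadJson(tempPath);
        }
        finally
        {
            // Remove the temporary file even if decompression or parsing throws.
            File.Delete(tempPath);
        }
    }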

    private void LoadJson(string tempPath)
    {
        List<JObject> jsonList = new List<JObject>();

        using (StreamReader file = new StreamReader(tempPath))
        {
            string line;

            // Each line is parsed as one JSON object; the using block closes the reader.
            while ((line = file.ReadLine()) != null)
            {
                JObject jObject = JObject.Parse(line);
                jsonList.Add(jObject);
            }
        }

        listWithObjects = jsonList;
    }

It works with a ~14 MB .bz2 file, but when I tried a ~900 MB .bz2 file my program just stops. I get no error message, and my RAM usage goes crazy. I read something about buffer size, but couldn't figure out how to use it.

Does anyone have a tip on how I could decompress a large bzip2 file? Could I split the file into smaller chunks?

Jesper
  • I'd expect the problem to be in `LoadJson(tempPath)`. You can deduce this by letting it operate without the Bzip stuff on an already extracted file. See also [mcve]. – CodeCaster Dec 14 '17 at 12:20
  • You might hit the 2GB per object limit. You might want to flip the switch to x64 and set gcAllowVeryLargeObjects = true –  Dec 14 '17 at 12:42
  • It seems like I am able to decompress the file with no issues, so yeah, the problem is probably with LoadJson, as @CodeCaster pointed out. Does StreamReader have a limit on how much it can read? – Jesper Dec 14 '17 at 13:35
  • I restructured the code to use the jObject produced from each ReadLine call directly, instead of putting the objects in a list. I guess @UrbanEsc was right about hitting the object limit. – Jesper Dec 14 '17 at 18:05
  • I am doing the exact same thing in my code so I am familiar with the issue! Try it out and don't forget to uncheck "Prefer 32-bit" in project properties –  Dec 14 '17 at 19:26
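
The restructuring described in the comments could look roughly like the sketch below. It assumes each line of the decompressed data is one JSON object (as LoadJson implies) and reads straight from SharpZipLib's BZip2InputStream, so no temporary file is written and only one JObject is alive at a time. The names Bzip2JsonLines and ReadObjects are made up for illustration; this is not the asker's actual code.

```csharp
using System.Collections.Generic;
using System.IO;
using ICSharpCode.SharpZipLib.BZip2;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of SharpZipLib or Json.NET.
static class Bzip2JsonLines
{
    // Lazily yields one JObject per non-empty line of the compressed file.
    public static IEnumerable<JObject> ReadObjects(string bzipPath)
    {
        using (FileStream fs = File.OpenRead(bzipPath))
        using (var bz = new BZip2InputStream(fs))   // decompresses on the fly
        using (var reader = new StreamReader(bz))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Length == 0)
                {
                    continue; // skip blank lines instead of failing to parse them
                }

                yield return JObject.Parse(line);
            }
        }
    }
}
```

A caller would process each object inside a `foreach (JObject obj in Bzip2JsonLines.ReadObjects(path))` loop and let it go out of scope afterwards. Collecting the results into a list again would bring back the original memory problem.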