
I have a very large dataset that contains almost 700 columns, and I'm using GZipStream for compression and decompression. Compression works fine and the size of the dataset after compression is almost 40 MB, but I get an OutOfMemoryException during decompression. I'm using the code below for compression and decompression:

Compression:

public static Byte[] CompressDataSet(DataSet dataset)
{
    Byte[] data;
    MemoryStream mem = new MemoryStream();
    GZipStream zip = new GZipStream(mem, CompressionMode.Compress);
    dataset.WriteXml(zip, XmlWriteMode.WriteSchema);
    zip.Close();
    data = mem.ToArray();
    mem.Close();
    return data;

}

Decompression:

public static DataSet DecompressDataSet(Byte[] data)
{
    MemoryStream mem = new MemoryStream(data);
    GZipStream zip = new GZipStream(mem, CompressionMode.Decompress);
    DataSet dataset = new DataSet();
    dataset.ReadXml(zip, XmlReadMode.ReadSchema);
    zip.Close();
    mem.Close();
    return dataset;

}

Please recommend another compression library if GZipStream is not optimal/suitable for very large datasets. Thanks in advance.

saadsaf

1 Answer


Your issue stems from the way you're compressing the data in the first place. Have a look at the code below and let me know if you have any questions.

public static Byte[] CompressDataSet(DataSet dataSet)
{
    using (MemoryStream inputStream = new MemoryStream())
    using (MemoryStream resultStream = new MemoryStream())
    using (GZipStream gzipStream = new GZipStream(resultStream, CompressionMode.Compress))
    {
        dataSet.WriteXml(inputStream, XmlWriteMode.WriteSchema);
        inputStream.Seek(0, SeekOrigin.Begin);
        inputStream.CopyTo(gzipStream);

        // Closing the GZipStream flushes the remaining compressed data into resultStream before ToArray() is called.
        gzipStream.Close();

        return resultStream.ToArray();
    }
}

public static DataSet DecompressDataSet(Byte[] data)
{
    using (MemoryStream compressedStream = new MemoryStream(data))
    using (GZipStream gzipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
    {
        // The DataSet is returned to the caller, so don't wrap it in a using block
        // (it would be disposed before the caller gets it).
        DataSet dataset = new DataSet();
        dataset.ReadXml(gzipStream, XmlReadMode.ReadSchema);
        return dataset;
    }
}
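
For reference, here's a minimal round-trip sketch showing how these two helpers might be used together. The table and column names are made up for illustration, and the methods above are assumed to be in scope (with the usual System and System.Data usings):

// Build a small DataSet, compress it, decompress it, and verify the data survives the round trip.
DataSet original = new DataSet("Demo");
DataTable table = original.Tables.Add("Items");             // hypothetical table name
table.Columns.Add("Id", typeof(int));
table.Columns.Add("Name", typeof(string));
table.Rows.Add(1, "first");
table.Rows.Add(2, "second");

byte[] compressed = CompressDataSet(original);
DataSet restored = DecompressDataSet(compressed);

Console.WriteLine(compressed.Length);                       // size of the compressed payload in bytes
Console.WriteLine(restored.Tables["Items"].Rows.Count);     // expected: 2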
Aydin
  • the compression approach you prescribed above uses a lot of CPU and memory and keeps on running; we have to stop it forcefully as it keeps consuming huge amounts of memory and CPU – saadsaf Apr 14 '17 at 05:41
  • I'm using the same approach your question is based on. I tested with data in excess of 100 MB and it worked fine. Your implementation doesn't actually compress anything, which is why you are having issues in the first place – Aydin Apr 14 '17 at 05:47