GZIPInputStream memory leakage

Question

I am getting a java.lang.OutOfMemoryError: Java heap space when using GZIPInputStream. The Java process runs fine for some time, but eventually it fills up the memory. I guess some references are not being released to the GC, but I can't find where in the code the problem could be. I have already increased the process memory to 3 GB, but I am sure it will fill that up too after a while. The growth is progressive and happens regardless of the memory size. Does anyone have an idea how I could improve my code to prevent the memory leak?

public byte[] uncompress(byte[] msg) {
    byte[] buffer = new byte[4 * 1024];
    int length;

    try (GZIPInputStream gzis = new GZIPInputStream(new ByteArrayInputStream(msg));
         BufferedInputStream bis = new BufferedInputStream(gzis);
         ByteArrayOutputStream baos = new ByteArrayOutputStream()
    ) {
        while ((length = bis.read(buffer)) >= 0) {
            baos.write(buffer, 0, length);
        }

        final byte[] result = baos.toByteArray();
        return result;
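    } catch (IOException e) {
        // Propagate the failure: an empty catch hides errors and leaves this path without a return value
        throw new UncheckedIOException(e);
    }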
}

Welcome to Stack Overflow. Please take the [tour] to learn how Stack Overflow works and read [ask] on how to improve the quality of your question. Then [edit] your question to include your source code as a working [mcve], which can be compiled and tested by others. — Progman, Nov 10 '21 at 17:32

score 1 · Answer 1 · edited Nov 10 '21 at 18:00

I've seen a few reports floating around of Java OOM caused by zlib off-heap memory allocation, which may mean you are out of luck with conventional code.

In case it is just your code, you should monitor the live heap (try JMC), take a heap dump (or two) and check the content (Eclipse MAT lets you diff two heaps).
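Before reaching for a full heap dump, a rough first check is to log heap usage around the suspect call and see whether it climbs steadily. The sketch below uses the standard MemoryMXBean; the class name HeapProbe and the 8 MB stand-in allocation are hypothetical, not taken from the question's code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration only.
final class HeapProbe {
    /** Returns the number of heap bytes currently in use, as reported by the JVM. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        // Stand-in for one uncompress() call; replace with the real work.
        byte[] payload = new byte[8 * 1024 * 1024];
        long after = usedHeapBytes();
        System.out.printf("heap used: %d -> %d bytes (payload %d bytes)%n",
                before, after, payload.length);
    }
}
```

These numbers only cover the Java heap, so if they stay flat while the process keeps growing, that points away from your own objects and towards something outside the heap.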

edited Nov 10 '21 at 18:00

Progman

answered Nov 10 '21 at 17:29

drekbour

DuncG · Answer 2 · 2021-11-10T19:41:01.770

It is reasonable to assume the inflated stream is bigger than msg, so you can help your program do fewer re-allocations by using new ByteArrayOutputStream(msg.length). Also, get rid of the double buffering by removing BufferedInputStream and your own buffer; just call transferTo (available since Java 9), which allocates a single internal buffer.

Thus your program will do fewer memory re-allocations if you reduce it to this:

public byte[] uncompress(byte[] msg) throws IOException {
    try (GZIPInputStream gzis = new GZIPInputStream(new ByteArrayInputStream(msg));
         ByteArrayOutputStream baos = new ByteArrayOutputStream(msg.length)
    ) {
        gzis.transferTo(baos);

        return baos.toByteArray();
    }
}

However, you may still get OOM, as you may have 3 big byte[] in memory at once (the input msg, the stream's internal buffer, and a copy made either while the buffer grows or in toByteArray()). You may be able to use a bigger guess for the anticipated final length inside the ByteArrayOutputStream so that it does not re-allocate when inflating, for example: new ByteArrayOutputStream(msg.length * 4 / 3). If OOM happens at toByteArray(), it is also possible to read the internal byte[] with a sub-class.
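As a rough sketch of the sub-class idea: ByteArrayOutputStream keeps its data in the protected fields buf and count, so a sub-class can hand out a view of that buffer instead of the copy made by toByteArray(). The names ExposedByteArrayOutputStream and NoCopyGunzip are made up for this example, and returning a ByteBuffer is just one possible choice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.GZIPInputStream;

// Hypothetical sub-class: exposes the internal buffer without copying it.
final class ExposedByteArrayOutputStream extends ByteArrayOutputStream {
    ExposedByteArrayOutputStream(int initialCapacity) {
        super(initialCapacity);
    }

    /** Read-only view over the valid bytes [0, count) of the internal buffer. */
    synchronized ByteBuffer asReadOnlyBuffer() {
        return ByteBuffer.wrap(buf, 0, count).asReadOnlyBuffer();
    }
}

// Hypothetical wrapper showing how the sub-class could replace toByteArray().
final class NoCopyGunzip {
    static ByteBuffer uncompress(byte[] msg) throws IOException {
        try (GZIPInputStream gzis = new GZIPInputStream(new ByteArrayInputStream(msg))) {
            ExposedByteArrayOutputStream baos = new ExposedByteArrayOutputStream(msg.length);
            gzis.transferTo(baos); // requires Java 9+
            return baos.asReadOnlyBuffer();
        }
    }
}
```

The view shares storage with the stream, so it is only safe as long as nothing writes to that stream afterwards. The internal buffer may also be larger than the data it holds, which is the price for skipping the final copy.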

Better still, change the structure of your application to avoid full byte[] copies of the stream, and change the decompression method to write to a caller-supplied OutputStream:

public void uncompress(byte[] msg, OutputStream out) throws IOException

or

public void uncompress(InputStream msg, OutputStream out) throws IOException
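A minimal sketch of how those two signatures could be implemented, assuming Java 9+ for transferTo. The class name StreamingGunzip is hypothetical, and the methods are static here only to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical holder class for the two streaming variants.
final class StreamingGunzip {
    /** Inflates gzip data from msg into out; msg is closed, out is left open for the caller. */
    static void uncompress(InputStream msg, OutputStream out) throws IOException {
        try (GZIPInputStream gzis = new GZIPInputStream(msg)) {
            // Copies through a small internal buffer, so the inflated data
            // is never held in memory as one large byte[].
            gzis.transferTo(out);
        }
    }

    /** Convenience overload for callers that already hold the compressed bytes. */
    static void uncompress(byte[] msg, OutputStream out) throws IOException {
        uncompress(new ByteArrayInputStream(msg), out);
    }
}
```

The caller can then pass a FileOutputStream, a socket stream, or whatever consumes the data, so only the compressed input plus a small copy buffer needs to be in memory. Closing the GZIPInputStream in the try-with-resources block should also let it release the Inflater it created, which is relevant if native zlib memory is part of the problem.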
