
I've been using iText 4.2.1 and Java 1.6 to generate PDF files. My task is to add two fields with some random content to a template PDF. It works fine even with a 1 GB PDF. But now the environment demands Java 7, and I run into this out-of-memory problem. I've upgraded iText to 5.5.3, but the issue remains. The code is trivial:

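import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashMap;

import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;

public final class PdfHelper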
{
    public static void randomizePDFStream(InputStream in, OutputStream out)
    {
        try
        {
            PdfReader ReadInputPDF;
            ReadInputPDF = new PdfReader(in); // -> crash (the stack trace below shows PdfReader.<init>)
            PdfStamper stamper = new PdfStamper(ReadInputPDF, out);
            HashMap<String, String> hMap = ReadInputPDF.getInfo();
            hMap.put("Title", "RANDOM PDF TITLE: " + System.nanoTime() + ", " + System.currentTimeMillis());
            hMap.put("Subject", "RANDOM PDF SUBJECT: " + System.currentTimeMillis() + ", " + System.nanoTime());
            stamper.setMoreInfo(hMap);
            stamper.close();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}

This gives the following stack trace when using a 1 GB PDF file:

Caught: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
        at java_util_concurrent_Future$get.call(Unknown Source)
        at Main.awaitCompletion(Main.groovy:222)
        at Main$awaitCompletion.callCurrent(Unknown Source)
        at Main.run(Main.groovy:113)
Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
        at com.itextpdf.text.io.StreamUtil.inputStreamToArray(StreamUtil.java:74)
        at com.itextpdf.text.io.RandomAccessSourceFactory.createSource(RandomAccessSourceFactory.java:146)
        at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:351)
        at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:371)
        at PdfHelper.randomizePDFStream(PdfHelper.java:65)

This is called from a Groovy script with this basic code:

mPDFFiles[i] = new java.io.File(getTempDirectory(), String.format("temp_file_%s_%s.pdf", System.nanoTime(), i));
mPDFFiles[i].createNewFile();

input = new BufferedInputStream(new FileInputStream(mTemplateFiles[i]));
output = new BufferedOutputStream(new FileOutputStream(mPDFFiles[i]));

long start=System.currentTimeMillis();
PdfHelper.randomizePDFStream(input, output);
output.flush();
println "Conversion time: " + (System.currentTimeMillis()-start) + " ms."

Does anyone have an idea how to get this to work?

Bo Öberg
  • have you tried increasing the JVM heap size? http://viralpatel.net/blogs/jvm-java-increase-heap-size-setting-heap-size-jvm-heap/ – Leo Oct 08 '14 at 11:47
  • Well, of course I've tried increasing Xmx and MaxHeapSize quite a bit, but still the same problem. I'm now running with -Xms=8192m -Xmx16g -XX:MaxPermSize=8192m. And as I mentioned, using jdk1.6 everything is running fine. – Bo Öberg Oct 09 '14 at 04:59
  • You could try `PdfStamper stamper = new PdfStamper(ReadInputPDF, out, '\0', true);` (append mode), but I doubt it will help. This looks like an error specific to the data. Try another PDF, or write the PDF to a temp file and use that new temp PDF. – Joop Eggen Oct 09 '14 at 08:56
  • Which exact JRE are you using? I vaguely remember a memory-handling bug in some JRE update. – mkl Oct 09 '14 at 12:03
  • @mkl: I've tried both 1.7.0_40 and 1.7.0_51 with the same result. – Bo Öberg Oct 11 '14 at 03:21
  • Remember that the JVM uses 32-bit ints to index arrays. PdfStamper, PdfReader, etc. are probably going to use a large byte array somewhere in the process (it looks like StreamUtil.java:74 in the stack trace). If that array needs to be bigger than the JVM allows, the allocation fails. I've read other pages that suggest the maximum size of any array is somewhere between 1 and 2 GB. (One would think it would be just under 2 GB, given the 32-bit index, but in practice it seems it can be considerably smaller.) – Charles Roth Apr 19 '16 at 20:09
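To make the array-size point in the last comment concrete, here is a minimal sketch of one plausible way a file of roughly 1 GB can end up requesting an array beyond the limit: a growable buffer that doubles its capacity whenever it fills up. Whether iText's StreamUtil or the JDK buffer underneath it grows exactly like this is an assumption, not something the stack trace confirms. The class and method names are hypothetical, and nothing is actually allocated; the arithmetic is done with long values.

```java
public class ArrayLimitSketch {
    // Arrays are indexed by int, so no array can have more than
    // Integer.MAX_VALUE elements, however large the heap is.
    public static final long MAX_ARRAY_LENGTH = Integer.MAX_VALUE;

    /** True if a payload of this size could fit into a single byte[] at all. */
    public static boolean fitsInOneArray(long payloadBytes) {
        return payloadBytes >= 0 && payloadBytes <= MAX_ARRAY_LENGTH;
    }

    /**
     * Capacity that a doubling buffer (an assumed growth strategy) would
     * request in order to hold payloadBytes, starting from initialCapacity.
     */
    public static long capacityAfterDoubling(long payloadBytes, long initialCapacity) {
        if (initialCapacity <= 0) {
            throw new IllegalArgumentException("initialCapacity must be positive");
        }
        long capacity = initialCapacity;
        while (capacity < payloadBytes) {
            capacity *= 2; // grow by doubling until the payload fits
        }
        return capacity;
    }

    public static void main(String[] args) {
        long payload = (1L << 30) + 1; // one byte more than 1 GiB
        long requested = capacityAfterDoubling(payload, 32);
        System.out.println("payload fits in one array:  " + fitsInOneArray(payload));
        System.out.println("capacity requested (bytes): " + requested);
        System.out.println("request fits in one array:  " + fitsInOneArray(requested));
    }
}
```

Under that assumption, the payload itself fits in one array, but the doubled capacity does not, which would match a failure at around 1 GB rather than 2 GB.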

3 Answers


You can use command-line parameters to increase the amount of memory available to Java. Here is an example of the command-line parameters that I use - you should change the numbers as appropriate for your needs and system memory capacity:

-Xms256m -Xmx1024m -XX:+DisableExplicitGC -Dcom.sun.management.jmxremote
-XX:PermSize=256m -XX:MaxPermSize=512m
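To check whether such heap settings actually reached the JVM, a small sketch like the following can be run with the same flags. The class name is hypothetical. It also prints the largest possible array length, which is fixed by the int index and does not change with -Xmx.

```java
public class HeapReport {
    /** Maximum heap the JVM will attempt to use, in MiB (this reflects -Xmx). */
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
        // Array length is an int, so this bound is independent of the heap size.
        System.out.println("Largest possible array length: " + Integer.MAX_VALUE);
    }
}
```

Run it as, for example, `java -Xmx1024m HeapReport` and compare the reported value with the flag.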
Bruce
Nirav Prajapati

Some options of what you could do:

  1. Tell the JVM (which executes your Groovy code and the PdfStamper inside it) to allow the use of more memory (-Xmx etc.; consult your JVM documentation).
  2. Find an implementation that does not require loading the complete PDF into memory (at least not all at once).

(I'm wondering why the implementation of iText and its PdfStamper is not efficient enough to accomplish your task without using a huge amount of memory...)
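As an illustration of option 2, the sketch below reads only the last kilobyte of a PDF file with random access to find the `startxref` offset, instead of copying the whole file into a byte array. It uses only the JDK; the class and method names are hypothetical, and it is not how iText works internally. iText 5 reportedly also has a PdfReader constructor that takes a RandomAccessFileOrArray and reads the document partially, which is worth checking in the API documentation for your version.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class PdfTailReader {
    private static final String KEYWORD = "startxref";
    private static final int TAIL_SIZE = 1024;

    /**
     * Returns the number that follows the last "startxref" keyword found in
     * the final TAIL_SIZE bytes of the file, or -1 if none is found.
     */
    public static long readStartXref(String path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long length = file.length();
            int tailSize = (int) Math.min(TAIL_SIZE, length);
            byte[] tail = new byte[tailSize];
            file.seek(length - tailSize); // jump to the tail, skip everything before it
            file.readFully(tail);

            String text = new String(tail, StandardCharsets.ISO_8859_1);
            int pos = text.lastIndexOf(KEYWORD);
            if (pos < 0) {
                return -1;
            }
            int i = pos + KEYWORD.length();
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++; // skip the line break after the keyword
            }
            int start = i;
            while (i < text.length() && Character.isDigit(text.charAt(i))) {
                i++; // collect the decimal offset
            }
            return i == start ? -1 : Long.parseLong(text.substring(start, i));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java PdfTailReader <file.pdf>");
            return;
        }
        System.out.println("startxref offset: " + readStartXref(args[0]));
    }
}
```

The memory used here is bounded by the 1 KB tail buffer regardless of file size, which is the property a file-based reader would need for a 1 GB template.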

Alexis Pigeon
Johannes

The error says "Requested array size exceeds VM limit". The maximum length of an array is Integer.MAX_VALUE elements, which for a byte array is around 2 GB. The question is which VM you are running: 32 bit or 64 bit? You may try the following option (64-bit VM):

-XX:+UseCompressedOops
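To answer the 32-bit versus 64-bit question, a short sketch like this prints what the running JVM reports. The class name is hypothetical, and the `sun.arch.data.model` property is HotSpot-specific, so it may be absent on other JVMs. As far as I know, compressed oops change how object references are stored and do not raise the int-indexed array length limit.

```java
public class JvmBitness {
    /** Describes the JVM data model and OS architecture from system properties. */
    public static String describe() {
        // HotSpot-specific property; other JVMs may not define it.
        String model = System.getProperty("sun.arch.data.model");
        String arch = System.getProperty("os.arch");
        String modelText = (model == null) ? "unknown" : model + "-bit";
        return "data model: " + modelText + ", os.arch: " + arch;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```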

Lonzak