Java : PDF page preview error, expected xref

Question

I am trying to create a preview for PDF files created by Balsamiq Mockups. Around 50% of the time, no preview is generated and I get an "Expected 'xref'" error instead. What am I doing wrong?

Error log :

com.sun.pdfview.PDFParseException: Expected 'xref' at start of table
at com.sun.pdfview.PDFFile.readTrailer(PDFFile.java:974)
at com.sun.pdfview.PDFFile.parseFile(PDFFile.java:1175)
at com.sun.pdfview.PDFFile.<init>(PDFFile.java:126)
at com.sun.pdfview.PDFFile.<init>(PDFFile.java:102)

Code :

private byte[] onlyCreatePdfPreview(String path, int attachId) {
    try {
        File file = new File(path);
        RandomAccessFile raf = new RandomAccessFile(file, "r");
        FileChannel channel = raf.getChannel();
        ByteBuffer buf = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());

        PDFFile pdffile = new com.sun.pdfview.PDFFile(buf);
        PDFPage page = pdffile.getPage(0);
        Rectangle rect = new Rectangle(0, 0,
            (int) page.getBBox().getWidth(),
            (int) page.getBBox().getHeight());
        java.awt.Image img = page.getImage(
            rect.width, rect.height, //width & height
            rect, // clip rect
            null, // null for the ImageObserver
            true, // fill background with white
            true  // block until drawing is done
        );

        BufferedImage buffered = toBufferedImage(img);
        buffered = Scalr.resize(buffered, Scalr.Method.ULTRA_QUALITY, 400, 250);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ImageIO.write(buffered, "png", baos);
        baos.flush();
        return baos.toByteArray();

    } catch (Exception e) {
        e.printStackTrace();
    }
    return null; // no preview could be generated
}

What am I doing wrong? Thank you.

Final Working code

try {
    String pdfPath = zipLocation + String.valueOf(new BigInteger(130, random).toString(32));
    PdfReader reader = new PdfReader(path);
    PdfStamper pdfStamper = new PdfStamper(reader, new FileOutputStream(pdfPath));
    pdfStamper.getWriter().setPdfVersion(PdfWriter.PDF_VERSION_1_4);
    pdfStamper.close();
    reader.close();

    RandomAccessFile raf = new RandomAccessFile(pdfPath, "r");
    FileChannel channel = raf.getChannel();
    ByteBuffer buf = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
    PDFFile pdffile = new com.sun.pdfview.PDFFile(buf);
    PDFPage page = pdffile.getPage(0);
    Rectangle rect = new Rectangle(0, 0,
        (int) page.getBBox().getWidth(),
        (int) page.getBBox().getHeight());
    java.awt.Image img = page.getImage(
        rect.width, rect.height, //width & height
        rect, // clip rect
        null, // null for the ImageObserver
        true, // fill background with white
        true  // block until drawing is done
    );

    BufferedImage buffered = toBufferedImage(img);
    buffered = Scalr.resize(buffered, Scalr.Method.ULTRA_QUALITY, 400, 250);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ImageIO.write(buffered, "png", baos);
    baos.flush();
    return baos.toByteArray();
} // catch block
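
The snippets above call a `toBufferedImage` helper that is not shown anywhere in the post. The following is only a minimal sketch of what such a helper might look like, assuming it simply copies a fully loaded `java.awt.Image` into a `BufferedImage`; the class name `ImageConversionSketch` is hypothetical. (`Scalr` appears to come from the separate imgscalr library, which has to be on the classpath.)

```java
import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.image.BufferedImage;

final class ImageConversionSketch {

    // Hypothetical stand-in for the toBufferedImage helper used in the post.
    static BufferedImage toBufferedImage(Image img) {
        if (img instanceof BufferedImage) {
            return (BufferedImage) img; // already the right type, no copy needed
        }
        // Assumption: the image is fully loaded, so its size is already known.
        int width = img.getWidth(null);
        int height = img.getHeight(null);
        if (width <= 0 || height <= 0) {
            throw new IllegalArgumentException("Image is not fully loaded");
        }
        BufferedImage copy = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = copy.createGraphics();
        try {
            g.drawImage(img, 0, 0, null); // paint the source image into the copy
        } finally {
            g.dispose();
        }
        return copy;
    }

    public static void main(String[] args) {
        BufferedImage sample = new BufferedImage(40, 25, BufferedImage.TYPE_INT_RGB);
        BufferedImage result = toBufferedImage(sample);
        System.out.println(result.getWidth() + "x" + result.getHeight());
    }
}
```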

Apparently `com.sun.pdfview.PDFFile` expects the cross references to start with **xref**. But this expectation only makes sense for PDFs following a PDF reference up to revision 3 (version 1.4) published November 2001; PDFs following a later reference or even the ISO 32000 standard (part 1 or 2) have the choice of using a cross reference stream (starting with an object number) instead of a cross reference table (starting with **xref**). Thus, you should switch to software that follows a newer specification than one that is more than 15 years old. — mkl, Jun 19 '17 at 19:26
@mkl this is the latest version of the software running on Windows 10. Any idea how I can set this element manually? Thanks. — We are Borg, Jun 20 '17 at 02:39
You have to pre-process the PDF by loading it and saving it with PDF version 1.4 compatibility. You can do that manually (e.g. using Adobe Acrobat) or automatically (e.g. using iText). — mkl, Jun 20 '17 at 07:07
My comments above, by the way, assume that the *PDF files which are created by Balsamiq Mockups* are valid to start with. But I assume you have checked that. After all, if they were not valid, an exception from `com.sun.pdfview.PDFFile` would be the most correct reaction... — mkl, Jun 20 '17 at 07:11
@mkl : I will check out how to do that with iText, but the problem is that it happens randomly. — We are Borg, Jun 20 '17 at 07:48
@mkl : I changed the PDF version, I get no error, but the preview doesn't look good, just a white thumbnail. I have added the updated code at bottom of main post, can you please check it out. Thanks. — We are Borg, Jun 20 '17 at 08:23
You do not change the version of the existing PDF. You replace the existing PDF with a new PDF which only contains a single space character. To manipulate an existing PDF, please use a `PdfReader` instance to read the existing PDF and a `PdfStamper` to manipulate and store the manipulated PDF. — mkl, Jun 20 '17 at 09:30
@mkl : Thank you for the comment. I did it now with PdfReader as you suggested, but it's actually making the pdf file a 0 byte file. I have added the code at bottom of main post. There is no method in pdfStamper available to directly set version, so I had to get its writer to do it. Can you please have a look. Thank you. — We are Borg, Jun 20 '17 at 10:52
You have to close the `PdfStamper`, and you have to do this *before* closing the `PdfReader`. — mkl, Jun 20 '17 at 13:12
@mkl : pdfStamper.close() gives a java.io.EOFException. Sorry, it seems like I am bugging you with this problem, but I have no other option. Thanks a lot for your patience. I only added pdfStamper.close() before the reader.close(). — We are Borg, Jun 21 '17 at 07:36
@mkl : I also tried upgrading the itext version, from 5.4.* to latest, but any of them is simply causing a JDK core dump to initiate. — We are Borg, Jun 21 '17 at 09:31
Looking at the details you appear to use the same file path for the `PdfReader` and the `FileOutputStream` you created for the `PdfStamper` to write to. This causes issues because your original file is truncated before the `PdfStamper` had a chance to copy it all to its output. Please use different paths. — mkl, Jun 21 '17 at 09:54
@mkl : This worked finally. Thanks a lot. Can you please post an answer for me to accept? Thank you. — We are Borg, Jun 21 '17 at 10:47
Do you still have access to the code of this solution? Two things are marked as errors in my code: `toBufferedImage(img)` and `buffered = Scalr.resize(buffered, Scalr.Method.ULTRA_QUALITY, 400, 250);`. Thanks in advance @WeareBorg — Rodolfo Velasco, Oct 03 '20 at 00:19

score 5 · Accepted Answer · answered Jun 21 '17 at 16:33

(This answer collects information from comments to the question which eventually led to a solution.)

Apparently `com.sun.pdfview.PDFFile` expects the cross references to start with `xref`. But this expectation only makes sense for PDFs following a PDF Reference up to revision 3 (version 1.4) published November 2001; PDFs following a later Reference or even the ISO 32000 standard (part 1 or 2) have the choice of using a cross reference stream (starting with an object number) instead of a cross reference table (starting with `xref`).
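
To see which kind of cross-reference section a given file uses, one can look at what the `startxref` offset near the end of the file points to. The following is a rough sketch using only the JDK; the class and method names are hypothetical, and it assumes a well-formed file whose `startxref` keyword lies within the last 1024 bytes and whose offset points exactly at the first byte of the cross-reference section. It ignores special cases such as hybrid-reference files.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

final class XrefCheckSketch {

    /** Returns true if the last cross-reference section starts with the keyword "xref". */
    static boolean usesClassicXrefTable(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long length = raf.length();
            // Read the tail of the file, where "startxref <offset> %%EOF" is expected.
            int tailSize = (int) Math.min(1024, length);
            byte[] tail = new byte[tailSize];
            raf.seek(length - tailSize);
            raf.readFully(tail);
            String tailText = new String(tail, StandardCharsets.ISO_8859_1);

            int marker = tailText.lastIndexOf("startxref");
            if (marker < 0) {
                throw new IOException("No startxref keyword found near the end of the file");
            }
            // Parse the decimal byte offset that follows the keyword.
            String rest = tailText.substring(marker + "startxref".length()).trim();
            int end = 0;
            while (end < rest.length() && Character.isDigit(rest.charAt(end))) {
                end++;
            }
            if (end == 0) {
                throw new IOException("No offset found after startxref");
            }
            long offset = Long.parseLong(rest.substring(0, end));
            if (offset + 4 > length) {
                throw new IOException("startxref offset points beyond the end of the file");
            }
            // A table starts with "xref"; a stream starts with an object number.
            byte[] head = new byte[4];
            raf.seek(offset);
            raf.readFully(head);
            return "xref".equals(new String(head, StandardCharsets.ISO_8859_1));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            System.out.println(usesClassicXrefTable(args[0])
                    ? "classic xref table"
                    : "no xref keyword at startxref offset (likely a cross reference stream)");
        }
    }
}
```

A file for which this returns `false` is a likely candidate for the `Expected 'xref' at start of table` exception shown in the question.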

Thus, one should either switch to software that follows a newer specification than one more than 15 years old, or convert one's PDFs to follow the old specification, at least on the surface.

One can convert manually (e.g. using Adobe Acrobat) or automatically (e.g. using iText). (These software products are only examples; other products can also be used for this task.)

If using a current iText 5 version, the conversion looks like this:

PdfReader reader = new PdfReader(SOURCE);
PdfStamper stamper = new PdfStamper(reader, DEST);
stamper.getWriter().setPdfVersion(PdfWriter.PDF_VERSION_1_4);
stamper.close();
reader.close();

One has to take care that, if SOURCE is a file name or a random access file, DEST is not a file output stream to the same file. Otherwise the original file is truncated before the PdfStamper has had a chance to copy it all to its output.
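
The truncation can be reproduced with the JDK alone, without iText. The sketch below is only illustrative: the class and method names are hypothetical. It shows that merely opening a `FileOutputStream` on an existing file empties it, and adds a small guard one might call before creating the output stream for the conversion.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class SeparateOutputSketch {

    /** Fails fast if dest already exists and refers to the same file as source. */
    static void requireDifferentFiles(Path source, Path dest) throws IOException {
        if (Files.exists(dest) && Files.isSameFile(source, dest)) {
            throw new IllegalArgumentException("Destination must differ from source: " + source);
        }
    }

    /** Opens the file for writing (without append) and reports its size while open. */
    static long sizeAfterOpeningForWrite(Path file) throws IOException {
        try (OutputStream out = new FileOutputStream(file.toFile())) {
            // The file has been truncated by the open itself; nothing was written yet.
            return Files.size(file);
        }
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("source", ".pdf");
        Files.write(source, new byte[] {1, 2, 3, 4, 5});
        System.out.println("size before: " + Files.size(source));
        System.out.println("size after opening for write: " + sizeAfterOpeningForWrite(source));
        Files.delete(source);
    }
}
```

Calling `requireDifferentFiles(sourcePath, destPath)` before constructing the `FileOutputStream` for DEST turns the silent truncation into an immediate, explicit error.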
