I've taken a sample JPEG 2000 from the fnord examples page.

However, when I try to add that image to the PDF:

PDDocument document = new PDDocument();
PDImageXObject pdImage = PDImageXObject.createFromFileByContent(
   new File("samples/relax.jp2"), document);
PDPage page = new PDPage(new PDRectangle(pageWidth, pageHeight));
PDPageContentStream contentStream = new PDPageContentStream(document, page);
contentStream.drawImage(pdImage, matrix);
contentStream.close();

I get the exception:

Caused by: java.lang.IllegalArgumentException: Image type UNKNOWN not supported: relax.jp2
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.createFromFileByContent(PDImageXObject.java:313)

The PDFBox dependencies that I have in Maven:

    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>pdfbox</artifactId>
        <version>2.0.12</version>
    </dependency>
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>fontbox</artifactId>
        <version>2.0.12</version>
    </dependency>
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>jempbox</artifactId>
        <version>1.8.16</version>
    </dependency>       
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>jbig2-imageio</artifactId>
        <version>3.0.2</version>
    </dependency>
    <dependency>
        <groupId>com.github.jai-imageio</groupId>
        <artifactId>jai-imageio-core</artifactId>
        <version>1.4.0</version>
    </dependency>
    <dependency>
        <groupId>com.github.jai-imageio</groupId>
        <artifactId>jai-imageio-jpeg2000</artifactId>
        <version>1.3.0</version>
    </dependency>

Am I doing something wrong here, or is there a problem with PDFBox and/or the samples that I'm using?

Another Apache library, Tika, detects this sample file's MIME type as image/jp2:

TikaConfig tika = new TikaConfig();
Metadata metadata = new Metadata();
MediaType mimetype = tika.getDetector().detect(
     TikaInputStream.get(new FileInputStream("samples/relax.jp2")), metadata);
informatik01
9ilsdx 9rvj 0lo
  • What you could do is to read the file into a BufferedImage. Currently PDFBox does not support direct usage of jp2 files. One could write code for it, but jp2 is not a common format. So the question is, do you have a real application that needs to use these files directly (i.e. without conversion to BufferedImage) or did you just try a few images from everywhere for fun? – Tilman Hausherr Nov 26 '18 at 16:06

1 Answer

From PDFBox's API documentation:

createFromFileByContent()
The following file types are supported: jpg, jpeg, tif, tiff, gif, bmp and png.

Looking into the source code, createFromFileByContent() runs PDFBox's own check for known file types, which is independent of the underlying image libraries. The detection code is in FileTypeDetector.java.

This check does not recognize JPEG 2000.
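
To illustrate what a content-based check for JPEG 2000 involves, here is a minimal sketch that tests for the 12-byte JP2 signature box at the start of a file. The names Jp2Sniffer and isJp2 are hypothetical and not part of PDFBox, and raw codestreams without the JP2 wrapper are not covered:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox.
class Jp2Sniffer {
    // JP2 signature box: length 12, box type "jP  ", then 0D 0A 87 0A.
    private static final byte[] JP2_SIGNATURE = {
        0x00, 0x00, 0x00, 0x0C, 0x6A, 0x50, 0x20, 0x20, 0x0D, 0x0A, (byte) 0x87, 0x0A
    };

    // True if the given leading bytes start with the JP2 signature box.
    static boolean isJp2(byte[] header) {
        return header.length >= JP2_SIGNATURE.length
            && Arrays.equals(Arrays.copyOf(header, JP2_SIGNATURE.length), JP2_SIGNATURE);
    }

    // Reads up to the first 12 bytes of the file and checks them.
    static boolean isJp2(Path file) throws IOException {
        byte[] header = new byte[JP2_SIGNATURE.length];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    break; // file is shorter than the signature
                }
                total += n;
            }
        }
        return isJp2(Arrays.copyOf(header, total));
    }
}
```

A check like this only tells you what the file is; decoding it still depends on a JPEG 2000 reader being available.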

Actually createFromFileByExtension() might be a better bet:

if ("gif".equals(ext) || "bmp".equals(ext) || "png".equals(ext)) {
    BufferedImage bim = ImageIO.read(file);
    return LosslessFactory.createFromImage(doc, bim);
}

If you give the file a .gif, .bmp or .png extension and your ImageIO installation has a JPEG 2000 reader, this might work (not tested).
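
The ImageIO half of that idea can be checked on its own, without PDFBox. A minimal sketch follows; the names ImageDecoding, hasReaderFor and decode are hypothetical, and the format name "jpeg2000" used in the comment is an assumption about how the jai-imageio-jpeg2000 plugin registers itself:

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of PDFBox.
class ImageDecoding {
    // True if some ImageIO reader is registered under the given format name,
    // e.g. hasReaderFor("jpeg2000") (format name is an assumption).
    static boolean hasReaderFor(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    // Decodes the file with whatever ImageIO reader claims it.
    // ImageIO.read returns null when no registered reader can handle the input,
    // so that case is turned into an explicit exception.
    static BufferedImage decode(File file) throws IOException {
        BufferedImage image = ImageIO.read(file);
        if (image == null) {
            throw new IOException("No registered ImageIO reader could decode " + file);
        }
        return image;
    }
}
```

If decode() succeeds for the .jp2 file, the resulting BufferedImage could then be passed to LosslessFactory.createFromImage(doc, bim) as in the snippet above (also not tested with JPEG 2000).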

informatik01
  • Thank you. It really looks like when I'm able to recognize format with Tika, I should rename the input file according to my own rules (like image/jp2 and image/x-jbig2 are .jpg) and then use createFromFileByExtension. It looks like a bug to pdfbox. – 9ilsdx 9rvj 0lo Nov 26 '18 at 14:21