I am using PDFBox to read the XObjects in a PDF. The XObjects are of type Form. I noticed that the lower-left y and upper-right y values of their bounding boxes look wrong, although Illustrator and PDF viewers render the file correctly.

Here is my code to find the y coordinates:

    PDDocument document = PDDocument.load(new File("D:/temp/temp.pdf"));
    try {
        PDResources pdResources = document.getPage(0).getResources();
        for (COSName cosName : pdResources.getXObjectNames()) {
            PDXObject xobject = pdResources.getXObject(cosName);
            // Only form XObjects are of interest; skip images and others
            if (xobject instanceof PDFormXObject) {
                PDFormXObject form = (PDFormXObject) xobject;
                System.out.println(form.getBBox().getLowerLeftY());
                System.out.println(form.getBBox().getUpperRightY());
            }
        }
    } finally {
        // Close the document even if reading fails
        document.close();
    }

The actual printed results are: lower left y: -2494.4902, upper right y: -283.47314

However, the correct value for lower left y according to Illustrator is 2211.

I understand that the top left is (0,0), so that is not the issue. The issue is that the value -2494 lies outside the trim box.

You can check the pdf link here: https://www.justbeamit.com/zxime

Tilman Hausherr
Abadir
  • That's not the way it works... the bbox does not tell where the xobject form is to be rendered: `These boundaries shall be used to clip the form XObject and to determine its size for caching.` The display position depends on the ctm: `Each time the form XObject is painted by the Do operator, this matrix shall be concatenated with the current transformation matrix to define the mapping from form space to device space.` – Tilman Hausherr Jul 06 '17 at 07:58
  • @TilmanHausherr : can you please expand on your answer please, can you please show the data using pdf box debugger tool, it will be really helpful, where should I look? http://imgur.com/a/R64Cv – Abadir Jul 06 '17 at 08:29
  • Or is there an easy way to read the display options using pdfbox? – Abadir Jul 06 '17 at 08:39
  • Please correct the link to your PDF. The answer is more complex than just looking at the correct entry. The same form xobject can appear at several places. Btw, are you sure you really need form xobjects and not the position of some acroform fields? – Tilman Hausherr Jul 06 '17 at 09:06
  • @TilmanHausherr Hi, I managed to get the display position, your answer was the key, really appreciated :D, yes I do need the xobjects positions for now. thanks again, you are the man – Abadir Jul 06 '17 at 11:28
  • @TilmanHausherr Hi, can I contact you privately? I want to know if I am doing the right steps, I will update the question with the right answer, I cannot post the data here because the pdf is private.. – Abadir Jul 13 '17 at 08:49
  • tilman at snafu dot de – Tilman Hausherr Jul 14 '17 at 09:13

1 Answer

The bbox, by itself, does not tell where the form XObject is to be rendered: "These boundaries shall be used to clip the form XObject and to determine its size for caching." The display position depends on the CTM (current transformation matrix): "Each time the form XObject is painted by the Do operator, this matrix shall be concatenated with the current transformation matrix to define the mapping from form space to device space."
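To make the concatenation concrete without any PDFBox dependency, here is a minimal sketch using only `java.awt.geom`. The class `FormBBoxDemo`, the method `toPageSpace`, the x values and both matrices are hypothetical and chosen purely for illustration; only the two y values are taken from the question. A PDF matrix `[a b c d e f]` maps directly onto the six-argument `AffineTransform(a, b, c, d, e, f)` constructor.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

class FormBBoxDemo {

    /**
     * Maps a form XObject's /BBox from form space into the space the CTM maps to.
     * The form's /Matrix is applied first, then the CTM.
     */
    public static Rectangle2D toPageSpace(Rectangle2D bbox,
                                          AffineTransform formMatrix,
                                          AffineTransform ctm) {
        AffineTransform formToPage = new AffineTransform(ctm);
        // AffineTransform.concatenate(Tx) yields this * Tx, so Tx (the form matrix) acts first
        formToPage.concatenate(formMatrix);
        return formToPage.createTransformedShape(bbox).getBounds2D();
    }

    public static void main(String[] args) {
        // y values from the question; x extent is an assumption
        Rectangle2D bbox = new Rectangle2D.Double(0, -2494.4902, 500, 2494.4902 - 283.47314);
        // Identity is the default when a form XObject has no /Matrix entry
        AffineTransform formMatrix = new AffineTransform();
        // Hypothetical CTM: a plain vertical translation set by a "cm" operator
        AffineTransform ctm = AffineTransform.getTranslateInstance(0, 2494.4902);

        Rectangle2D onPage = toPageSpace(bbox, formMatrix, ctm);
        System.out.println("bbox in form space: " + bbox);
        System.out.println("bbox transformed:   " + onPage);
    }
}
```

With a CTM like this one, a bbox whose y values are all negative still ends up at non-negative page coordinates, which is why the raw bbox numbers alone say nothing about where the form is displayed.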

Take the PrintImageLocations.java example from the source code download or from the repository.

You'll find this segment:

    else if(xobject instanceof PDFormXObject)
    {
        PDFormXObject form = (PDFormXObject)xobject;
        showForm(form);
    }

Change it to this:

    else if(xobject instanceof PDFormXObject)
    {
        PDFormXObject form = (PDFormXObject)xobject;

        PDRectangle bbox = form.getBBox();
        // Clone the CTM so the graphics state itself is not modified
        Matrix ctm = getGraphicsState().getCurrentTransformationMatrix().clone();
        // Combine the form's own matrix with the CTM
        ctm.concatenate(form.getMatrix());
        System.out.println("Found form [" + objectName.getName() + "]");
        System.out.println("bbox: " + bbox);
        // Bounds of the bbox after mapping it through the combined matrix
        Rectangle2D transformedBBox = bbox.transform(ctm).getBounds2D();
        System.out.println("bbox transformed: " + transformedBBox);

        showForm(form);
    }

Note that the transformed bbox gives the bounds of the form XObject, but it is also used as a clipping rectangle, and it is intersected with the current clipping area, so in some cases you may not see everything.
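The effect of that intersection can be sketched with plain `java.awt.geom` rectangles. This is a simplification under a stated assumption: the current clipping area is treated as an axis-aligned rectangle, whereas in a real PDF it can be an arbitrary path. The class `FormClipDemo`, the method `visiblePart` and all numbers are hypothetical.

```java
import java.awt.geom.Rectangle2D;

class FormClipDemo {

    /**
     * Returns the part of the transformed bbox that lies inside the current clip,
     * or null if the two rectangles do not overlap at all.
     */
    public static Rectangle2D visiblePart(Rectangle2D transformedBBox, Rectangle2D currentClip) {
        Rectangle2D visible = transformedBBox.createIntersection(currentClip);
        // A non-overlapping pair produces a rectangle with non-positive width or height
        return visible.isEmpty() ? null : visible;
    }

    public static void main(String[] args) {
        // Made-up values: a tall transformed bbox and a clip the size of an A4 page
        Rectangle2D transformedBBox = new Rectangle2D.Double(0, 0, 500, 2211);
        Rectangle2D currentClip = new Rectangle2D.Double(0, 0, 595, 842);

        System.out.println("visible part: " + visiblePart(transformedBBox, currentClip));
    }
}
```

Here the form's bounds are taller than the clip, so only the lower part of the form would be painted even though the transformed bbox reports the full height.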

To verify the coordinates of "bbox transformed", open the file with the PDFDebugger command line application. Move the cursor around until the numbers match.

(We had some discussion off-site. I was also asked about other shapes; these are vector graphics. This answer shows how to get them)

Tilman Hausherr
  • Thanks a lot Tilman for your help, without you it would not be possible for me to solve it. Thanks again. – Abadir Jul 20 '17 at 19:56