I'm trying to fill an XFA form with PDFBox 2 or 3, using the example file from: https://issues.apache.org/jira/secure/attachment/12964530/XFAFormFiller.java

I'm not very familiar with PDFBox, so I'm not sure how to correct it.

  1. I get compile errors. How can I make it compile with the latest PDFBox? Are my corrections below correct?
Line 107 I tried to fix with:
Set<COSDictionary> objectsToWrite = new HashSet<>();

Line 121 I tried to fix with:
COSWriter writer = new COSWriter(fos, new RandomAccessReadBuffer(baos.toByteArray()), objectsToWrite);

Line 128 I tried to fix with:
objectsToWrite.add(dataSetsStream);
  2. What format should the XML input file have? Can someone give an example? With my corrections I get a corrupt PDF when I feed it the XML generated from dataSetsStream. The file I feed it:
<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/"
><xfa:data
><mycompany
>
.....
><mycompany
></dd:dataDescription
></xfa:datasets
>

Any help is appreciated.

Miyagi
  • My code fixes actually worked. The reason I could not get it to work was that the XML file was ISO-8859-1 encoded but I read it in as UTF-8, so the viewer got confused and complained about a bad XML tag. My fix (for my case) was to change line 92 to the following: InputStreamReader reader = new InputStreamReader(xmlData, StandardCharsets.ISO_8859_1); – Miyagi Nov 18 '21 at 14:11
  • It would be good if a PDFBox expert could verify that my fixes are correct, so I can help PDFBox document it. – Miyagi Nov 18 '21 at 14:12
  • The `XFAFormFiller` class you found was an attachment to an issue in the Apache Jira, so it is not an example file and not part of the PDFBox code base; this is also illustrated by the package `com.airvoyant.pdf` of the class. Strictly speaking, neither the original `"UTF-8"` nor your `StandardCharsets.ISO_8859_1` is correct for generic use. The correct fix would be to not use an `InputStreamReader` at all: what the code reads from the `InputStreamReader` it writes to an `OutputStream`, so the decoding by the `InputStreamReader` is incorrect in any case. – mkl Aug 02 '22 at 09:47
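
The point in the last comment can be illustrated with a small, self-contained sketch that uses only the JDK and no PDFBox classes. The class `RawXmlCopy` and its methods `copyRaw` and `copyViaReader` are hypothetical names made up for this illustration; they are not taken from `XFAFormFiller`. The sketch assumes the XML data only has to be moved from an `InputStream` to an `OutputStream`. In that case copying the bytes unchanged keeps whatever encoding the XML declaration announces, while a decode and re-encode step with the wrong charset alters the bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of PDFBox or XFAFormFiller.
public class RawXmlCopy {

    /** Copies the bytes unchanged, so no charset is involved at all. */
    static byte[] copyRaw(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    /** Decodes the bytes to chars with the given charset, then re-encodes them with the same charset. */
    static byte[] copyViaReader(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Reader reader = new InputStreamReader(in, charset);
             Writer writer = new OutputStreamWriter(out, charset)) {
            char[] buffer = new char[8192];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data: an ISO-8859-1 encoded XML document with one non-ASCII character.
        byte[] xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>Jos\u00e9</name>"
                .getBytes(StandardCharsets.ISO_8859_1);

        byte[] raw = copyRaw(new ByteArrayInputStream(xml));
        byte[] viaUtf8 = copyViaReader(new ByteArrayInputStream(xml), StandardCharsets.UTF_8);

        System.out.println("raw copy identical: " + Arrays.equals(xml, raw));
        System.out.println("UTF-8 round trip identical: " + Arrays.equals(xml, viaUtf8));
    }
}
```

With this sample the raw copy should come out byte-for-byte identical, while the UTF-8 round trip replaces the single byte for `é` with the replacement character and so no longer matches the input. Whether the same change applies cleanly to line 92 of `XFAFormFiller` depends on how the surrounding code consumes the reader, which is not shown here.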

0 Answers