PDFBox's setValue() does not set the value for every PDTextField. After saving, only some of the fields are filled in. It fails for fields whose names, as returned by getFullyQualifiedName(), look similar.

Note: for the fields whose field.getFullyQualifiedName() returns customdutiesa, customdutiesb and customdutiesc, it works for customdutiesa but not for customdutiesb, customdutiesc and so on.

@Test
public void testb3Generator() throws IOException {
    File f = new File(inputFile);

    outputFile = String.format("%s_b3-3.pdf", "123");

    // try-with-resources closes the document, so no explicit close() is needed
    try (PDDocument document = PDDocument.load(f)) {

        PDDocumentCatalog catalog = document.getDocumentCatalog();
        PDAcroForm acroForm = catalog.getAcroForm();
        int i = 0;
        // fill every top-level text field with a running number
        for (PDField field : acroForm.getFields()) {
            i++;
            if (field instanceof PDTextField) {
                PDTextField textField = (PDTextField) field;
                textField.setValue(Integer.toString(i));
            }
        }

        acroForm.flatten();

        document.save(new File(outputFile));
    }
    catch (Exception e) {
        e.printStackTrace();
    }
}

Input PDF: https://s3-us-west-2.amazonaws.com/kx-filing-docs/b3-3.pdf
Output PDF: https://kx-filing-docs.s3-us-west-2.amazonaws.com/123_b3-3.pdf

Tilman Hausherr
Abubakar

1 Answer

The problem is that under certain conditions PDFBox does not construct appearances for fields it sets the value of, and, therefore, during flattening completely forgets the field content:

// in case all tests fail the field will be formatted by acrobat
// when it is opened. See FreedomExpressions.pdf for an example of this.  
if (actions == null || actions.getF() == null ||
    widget.getCOSObject().getDictionaryObject(COSName.AP) != null)
{
    ... generate appearance ...
}

(org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(String))

I.e. if there is a JavaScript action for value formatting associated with the field and no appearance stream is yet present, PDFBox assumes it does not need to create an appearance (and probably would do it wrong anyways as it does not use that formatting action).

When the form is flattened afterwards, as in this use case, that assumption is obviously wrong, because flattening only carries over the appearance streams that already exist.
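The condition quoted above can be read as a small truth table. The following is a minimal plain-Java sketch of that check, not PDFBox code: the class name, the method name and the three boolean parameters are made up for illustration and only stand in for `actions != null`, `actions.getF() != null` and the presence of an `/AP` entry on the widget.

```java
public class AppearanceDecision {

    /**
     * Hypothetical model of the quoted PDFBox condition (not a PDFBox API).
     *
     * @param hasActions      the field has additional actions (actions != null)
     * @param hasFormatAction those actions include a format action (actions.getF() != null)
     * @param hasAppearance   the widget already has an /AP entry
     * @return true if an appearance would be generated when the value is set
     */
    public static boolean generatesAppearance(boolean hasActions,
                                              boolean hasFormatAction,
                                              boolean hasAppearance) {
        // Mirrors: actions == null || actions.getF() == null || AP != null
        return !hasActions || !hasFormatAction || hasAppearance;
    }

    public static void main(String[] args) {
        // Field without actions: appearance is generated.
        System.out.println(generatesAppearance(false, false, false));
        // Field with a format action and no appearance yet: skipped.
        System.out.println(generatesAppearance(true, true, false));
        // Same field after its actions were removed: generated again.
        System.out.println(generatesAppearance(false, false, false));
    }
}
```

Only the combination "format action present, no appearance yet" yields `false`. According to the explanation above, that is the state of the fields that come out empty after flattening.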

To force PDFBox to generate appearances for those fields, too, simply remove the actions before setting field values:

if (field instanceof PDTextField) {
    PDTextField textField = (PDTextField) field;
    textField.setActions(null);
    textField.setValue(Integer.toString(i));
}

(from FillAndFlatten test testLikeAbubakarRemoveAction)

mkl
  • @Abubakar Great! Nonetheless you should consider Tilman's advice from the meanwhile deleted comment to your question: `acroForm.getFields()` does not return a straight list of all fields, only of the top level ones. Often, like in your example document, there only are top level fields, but sometimes there are actual field hierarchies. In that case you should use `acroForm.getFieldTree()` or `acroForm.getFieldIterator()` instead to visit all fields. – mkl Apr 01 '20 at 15:08
  • In this current case I am getting all fields that are being updated. Really this is a bit more helpful. – Abubakar Apr 02 '20 at 19:49
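To illustrate the difference between visiting only the top-level fields and walking the whole field hierarchy, here is a small plain-Java model. It does not use PDFBox: `Node`, `topLevelNames` and `allNames` are hypothetical names, and the field names `totals`, `sub` and `tax` are invented. It only sketches the idea that nested fields are missed unless the tree is traversed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FieldTreeSketch {

    /** Hypothetical stand-in for a form field; this is not a PDFBox class. */
    public static final class Node {
        final String partialName;
        final List<Node> kids;

        public Node(String partialName, Node... kids) {
            this.partialName = partialName;
            this.kids = Arrays.asList(kids);
        }
    }

    /** Names seen when looking only at the top level of the form. */
    public static List<String> topLevelNames(List<Node> roots) {
        List<String> names = new ArrayList<>();
        for (Node root : roots) {
            names.add(root.partialName);
        }
        return names;
    }

    /** Fully qualified names of every node, depth-first, parents before children. */
    public static List<String> allNames(List<Node> roots) {
        List<String> names = new ArrayList<>();
        for (Node root : roots) {
            collect(root, "", names);
        }
        return names;
    }

    private static void collect(Node node, String prefix, List<String> out) {
        String qualified = prefix.isEmpty() ? node.partialName : prefix + "." + node.partialName;
        out.add(qualified);
        for (Node kid : node.kids) {
            collect(kid, qualified, out);
        }
    }

    public static void main(String[] args) {
        List<Node> roots = Arrays.asList(
                new Node("customdutiesa"),
                new Node("totals", new Node("sub"), new Node("tax")));
        System.out.println(topLevelNames(roots)); // top level only
        System.out.println(allNames(roots));      // whole hierarchy
    }
}
```

In this model the nested `totals.sub` and `totals.tax` entries appear only in the second listing, which is the situation the comment above warns about for forms with real field hierarchies.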