
I am using iText 7 for .NET. A third party has provided PDFs with form fields. The fields are present, and Adobe Acrobat opens and displays the PDFs without issues, but in iText the fields collection is empty.

I've seen the answer at "ItextSharp - Acrofields are empty" and the related knowledge-base articles on iText's site, but that fix does not work in my case: form.getAsArray(PdfName.FIELDS) returns null, so there is no array to add the fields to.

I've also checked for XFA, and it does not seem to be present:

XfaForm xfa = form.GetXfaForm();
xfa.IsXfaPresent()  // returns false

Is it possible to add PdfName.FIELDS to the document and then populate it?

Thank You

ju209192
  • Please share the pdf for analysis. – mkl Aug 21 '21 at 06:22
  • Unfortunately it's full of private data. When I redact it by deleting the info and saving in Acrobat, the fields then work. – ju209192 Aug 21 '21 at 19:19
  • Indeed, that code should not only skip links, it should skip everything except widgets. As an aside, though, you should post solutions as answers instead of edits to your question. That way you can eventually mark that answer as accepted, working answer. – mkl Aug 22 '21 at 06:07
  • Thanks @mkl, I updated to post as an answer and made the filtering more specific to subtype "Link" – ju209192 Aug 22 '21 at 19:28

1 Answer

I think I have figured out what causes the issue, and I have a short-term fix for my particular case. In this document some of the entries are of subtype "Link", not "Widget", and the fix code I was using (based on the link above, which most likely came from https://kb.itextsupport.com/home/it7kb/faq/why-are-the-acrofields-in-my-document-empty) fails on them. My fix is to skip subtype Link. I don't need the Links, although perhaps a better solution exists that would not skip them.

If I don't skip Links, when the saved PDF is loaded again it fails on

            PdfAcroForm form = PdfAcroForm.GetAcroForm(pdfDoc, true);

In the lower-level code in itext.forms, IterateFields() is called, and it passes formField.GetParent() as a parameter to PdfFormField.MakeFormField. GetParent() returns null for the Link fields, so an exception is thrown.

Below is the RUPS hierarchy to the first subtype Link field that causes a problem:

[Image: RUPS object tree leading to the first annotation of subtype Link]

So the solution for my particular issue, at the moment, is to skip annotations of subtype Link. The code is as follows:

            // Requires: using System.IO; using iText.Kernel.Pdf;
            // "pdf" is the source document (file path or stream).
            PdfReader reader = new PdfReader(pdf);
            MemoryStream dest = new MemoryStream();
            PdfWriter writer = new PdfWriter(dest);
            PdfDocument pdfDoc = new PdfDocument(reader, writer);
            PdfCatalog root = pdfDoc.GetCatalog();
            // Assumes the catalog has an /AcroForm entry; "form" is null otherwise.
            PdfDictionary form = root.GetPdfObject().GetAsDictionary(PdfName.AcroForm);
            PdfArray fields = form.GetAsArray(PdfName.Fields);
            if (fields == null)
            {
                // The /Fields array is missing entirely, so create it.
                fields = new PdfArray();
                form.Put(PdfName.Fields, fields);
            }
            // Page numbers are 1-based in iText.
            for (int i = 1; i <= pdfDoc.GetNumberOfPages(); i++)
            {
                PdfPage page = pdfDoc.GetPage(i);
                var annots = page.GetAnnotations();
                for (int j = 0; j < annots.Count; j++)
                {
                    PdfDictionary m = annots[j].GetPdfObject();
                    string subType = m?.GetAsName(PdfName.Subtype)?.GetValue() ?? "";
                    // Link annotations are not form fields; adding them breaks GetAcroForm later.
                    if (subType != "Link")
                    {
                        fields.Add(m);
                    }
                }
            }
            // Mark the changed objects so they are written out.
            fields.SetModified();
            form.SetModified();
            pdfDoc.Close();
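
As mkl suggested in the comments, a stricter variant would keep only annotations of subtype Widget instead of excluding only Links. The following is a minimal, untested sketch of that idea. The class name AcroFormRepair, the method RebuildFieldsFromWidgets and the src/dest path parameters are hypothetical names introduced here for illustration; they are not part of iText. It also assumes that every widget on the page should become a top-level field, which may not hold for documents whose widgets already have a /Parent entry.

```csharp
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Annot;

public static class AcroFormRepair
{
    // Hypothetical helper (not an iText API): fills /AcroForm /Fields
    // with the Widget annotations found on each page.
    public static void RebuildFieldsFromWidgets(string src, string dest)
    {
        PdfDocument pdfDoc = new PdfDocument(new PdfReader(src), new PdfWriter(dest));
        PdfDictionary catalog = pdfDoc.GetCatalog().GetPdfObject();
        PdfDictionary form = catalog.GetAsDictionary(PdfName.AcroForm);

        // Only attempt the repair when the document declares an AcroForm.
        if (form != null)
        {
            PdfArray fields = form.GetAsArray(PdfName.Fields);
            if (fields == null)
            {
                fields = new PdfArray();
                form.Put(PdfName.Fields, fields);
            }

            // Page numbers are 1-based in iText.
            for (int i = 1; i <= pdfDoc.GetNumberOfPages(); i++)
            {
                foreach (PdfAnnotation annot in pdfDoc.GetPage(i).GetAnnotations())
                {
                    PdfDictionary dict = annot.GetPdfObject();
                    // Keep widgets only; links, text notes, etc. are skipped.
                    if (PdfName.Widget.Equals(dict.GetAsName(PdfName.Subtype)))
                    {
                        fields.Add(dict);
                    }
                }
            }

            // Mark the changed objects so they are written out.
            fields.SetModified();
            form.SetModified();
        }

        pdfDoc.Close();
    }
}
```

Calling AcroFormRepair.RebuildFieldsFromWidgets("input.pdf", "fixed.pdf") (both paths are placeholders) and then opening the result with PdfAcroForm.GetAcroForm(pdfDoc, true) would be one way to check whether the fields now show up.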

ju209192