I have an XFA form that I want to fill automatically with Python, C#, or both.

It would be easy if the form data were stored in the datasets packet, as in typical XFA PDFs, but it is not. Here is an example of what the datasets packet contains:

<?xml version="1.0" encoding="UTF-8"?>
<topmostSubform
><Effacer
/><Rangée3 xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/" xfa:dataNode="dataGroup"
/><table2
><Rangée1
><colG
><positioner xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/" xfa:dataNode="dataGroup"
/></colG
><colD xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/" xfa:dataNode="dataGroup"
/></Rangée1
></table2
><positioner1 xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/" xfa:dataNode="dataGroup"
/><table2
><Rangée1
><colG
><positioner xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/" xfa:dataNode="dataGroup"
/></colG 

It points to another data group, which is at position 15 in the file's XDP. Using Python, I could easily access the XDP; here is its structure:

[xdp:xdp 115 0 R config 2 0 R template 3 0 R datasets 116 0 R localeSet 5 0 R xmpmeta 6 0 R xfdf 7 0 R form 117 0 R </xdp:xdp> 118 0 R ]
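For reference, the listing above alternates packet names with indirect references, so each packet occupies two consecutive array slots. A minimal sketch using only the Python standard library, with the listing pasted in as a plain string (the helper name `parse_xfa_array` is hypothetical, and no PDF library is involved), shows which slot holds the reference to each packet's stream:

```python
import re

# The XFA array exactly as printed above, treated as plain text.
XFA_LISTING = (
    "[xdp:xdp 115 0 R config 2 0 R template 3 0 R datasets 116 0 R "
    "localeSet 5 0 R xmpmeta 6 0 R xfdf 7 0 R form 117 0 R </xdp:xdp> 118 0 R ]"
)

# One packet = a name followed by an indirect reference "<obj> <gen> R".
PACKET = re.compile(r"(\S+)\s+(\d+)\s+(\d+)\s+R")


def parse_xfa_array(listing):
    """Return one dict per packet with its zero-based array positions."""
    packets = []
    for i, match in enumerate(PACKET.finditer(listing.strip("[] "))):
        packets.append({
            "name": match.group(1),
            "name_index": 2 * i,      # slot holding the packet name
            "ref_index": 2 * i + 1,   # slot holding the stream reference
            "obj": int(match.group(2)),
            "gen": int(match.group(3)),
        })
    return packets


packets = parse_xfa_array(XFA_LISTING)
by_name = {p["name"]: p for p in packets}
for p in packets:
    print(p["ref_index"], p["name"], f'{p["obj"]} {p["gen"]} R')
```

With zero-based indexing, the `form` name sits at index 14 and its stream reference `117 0 R` at index 15, which matches the position mentioned above.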

I found all the form data in the "form" stream; here is part of it:

<form xmlns="http://www.xfa.org/schema/xfa-form/2.8/" checksum="8cn7XfZQ5SK/27lJmAiqCnPmd4M=">
  <subform name="topmostSubform">
    <field name="pMultilineModified">
      <value>
        <text>N</text>
      </value>
      <assist>
        <toolTip/>
      </assist>
    </field>
    <instanceManager name="_page1"/>
    <subform name="page1">
      <instanceManager name="_sf0"/>
      <subform name="sf0">
        <instanceManager name="_container1"/>
        <subform name="container1">
          <instanceManager name="_positioner"/>
          <subform name="positioner">
            <instanceManager name="_sf_numeroEvenement"/>
            <subform name="sf_numeroEvenement">
              <instanceManager name="_container"/>
              <subform name="container">
                <instanceManager name="_Figure"/>
                <subform name="Figure">
                  <field name="NumeroEvenement">
                    <assist>
                      <toolTip/>
                    </assist>
                    <value override="1">
                      <text>12234445522</text>
                    </value>
                  </field>
                </subform>
              </subform>
            </subform>
            <field name="txt0_UnitePlaignante">
              <assist>
                <toolTip/>
              </assist>
              <value override="1">
                <text>200</text>
              </value>
            </field>
            <field name="Effacer">
              <assist>
                <toolTip/>
              </assist>
            </field>
          </subform>
        </subform>

So the real question is:

How can I use iText7 to get this "form" stream at position 15 of the XDP PdfObject, modify it, put it back into the XDP, and put the XDP back into the PDF? To replicate what I'm trying to do, try getting any element of the XDP other than datasets and putting it back into the PDF, modified or not. I wasn't able to.
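To make the "modify it" step concrete, here is a minimal sketch using only Python's `xml.etree.ElementTree`, run on a trimmed, well-formed stand-in for the form packet above. It does not cover extracting the stream from the PDF or writing it back. The helper `set_field_text`, the constant `SAMPLE_FORM`, and the new value `350` are hypothetical names and data chosen for illustration. The `checksum` attribute is left untouched, so a viewer that validates it may reject the edited packet.

```python
import xml.etree.ElementTree as ET

FORM_NS = "http://www.xfa.org/schema/xfa-form/2.8/"
# Keep the default namespace unprefixed when serializing.
ET.register_namespace("", FORM_NS)

# Trimmed, well-formed stand-in for the "form" packet (assumption: the real
# packet is much larger and more deeply nested).
SAMPLE_FORM = """<form xmlns="http://www.xfa.org/schema/xfa-form/2.8/" checksum="8cn7XfZQ5SK/27lJmAiqCnPmd4M=">
  <subform name="topmostSubform">
    <subform name="positioner">
      <field name="txt0_UnitePlaignante">
        <assist><toolTip/></assist>
        <value override="1"><text>200</text></value>
      </field>
      <field name="Effacer">
        <assist><toolTip/></assist>
      </field>
    </subform>
  </subform>
</form>"""


def set_field_text(root, field_name, new_text):
    """Set <value>/<text> of every field named field_name; return how many changed."""
    changed = 0
    for field in root.iter(f"{{{FORM_NS}}}field"):
        if field.get("name") != field_name:
            continue
        text = field.find(f"{{{FORM_NS}}}value/{{{FORM_NS}}}text")
        if text is not None:  # fields without a <value>/<text> are skipped
            text.text = new_text
            changed += 1
    return changed


root = ET.fromstring(SAMPLE_FORM)
changed = set_field_text(root, "txt0_UnitePlaignante", "350")
new_form_xml = ET.tostring(root, encoding="unicode")
print(changed)
print(new_form_xml)
```

Matching on the `name` attribute rather than a fixed path keeps the sketch independent of how deeply the subforms are nested, at the cost of touching every field that shares that name.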

I tried PyPDF2, PDFNet (Python and C#), and iText7.

I am desperate: I've been trying for weeks without finding a solution. I obviously can't use iText's FillXfaForm method, since it modifies the datasets and I want to modify the form.

Gab ПК
  • `datagroup which is in position 15 in the file XDP` - where do you get this position and what do you mean by this position? Can you attach sample PDF? – Alexey Subach Sep 11 '21 at 14:34
  • Which tools are you going to consume the result with? I can tell you for sure that if you modify something in the `<form>` DOM, then when you open the file in Acrobat the result will be as if you had erased it completely. This is because the `<form>` DOM contains a checksum that is calculated from the data in that Form DOM, and the algorithm used to calculate the checksum is proprietary and unpublished.
    – Alexey Subach Sep 11 '21 at 14:36
  • In general, depending on what you want to achieve in the end, it might or might not be possible with iText, and the options will differ. What you asked about specifically (changing the `<form>` XML) is very easy, but it very likely will not get you the desired result.
    – Alexey Subach Sep 11 '21 at 14:37
  • 1- I got this position from the PdfObject, which behaves like an array; 2- I don't want to use data from the PDF, but put data into it. Is there a way to modify the PDF so that there is no checksum? How do I control whether the PDF has its data in the `<datasets>` or in the `<form>` DOM? 3- I understand that if I modify the `<form>` DOM, my PDF is gone, so I need another alternative.
    – Gab ПК Oct 31 '21 at 16:37
