I have a PDF. The xfa:datasets node looks like this:

<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/"><xfa:data xfa:dataNode="dataGroup" /></xfa:datasets>

Thus I can extract no information about the underlying schema, so I can't generate XML to fill this.

The PDF uses JavaScript to add rows to the tables, in the following manner:

this.resolveNode('document_table.Table3._user_input').addInstance(1);

Using iTextSharp, I can find references to the form elements in the Xfa.TemplateSom. In the DomDocument I can search for and find the various elements in the XML, but this is where I'm stuck. There seems to be no relationship to the XFA data. I tried the XFA Worker demo, and it confirmed that the PDF has no XFA data.

I'm unsure what type of form this is if it's not XFA and not AcroFields.

Bruno, I bought your book and while I haven't read it cover to cover, it doesn't cover this scenario so far as I can tell. Any insight anyone has on how this type of form can be filled, I would appreciate it.

Right now, I'm wondering if I have to append JavaScript that mimics the actions and sets the data. Since there are many forms, each with a different layout, parsing through all of them manually to get the hierarchy and replicate the JavaScript on my own is onerous, and frankly I'm not sure this approach will be successful.

Erikest

1 Answer

The form is an XFA form, and the fact that the DataDescription is missing is a pity, but it's not abnormal. The description of the data is optional, not mandatory.

If it's your intention to fill the form with iText, then you should forget about JavaScript to add rows. You should not confuse manual data entry (where a user enters the data row by row, adding rows manually) with automated data entry (where you provide data in the form of an XML file). If you have a table with rows, then that table will be populated with as many rows as there are data rows in your source XML.
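To make the point about rows concrete, here is a minimal Python sketch that builds data XML with one row element per record. It assumes the data elements are named after the subforms in the question's script (document_table, Table3, user_input); the field names description and amount are hypothetical, and the real form may wrap document_table in further parent elements.

```python
import xml.etree.ElementTree as ET


def build_data_xml(rows):
    """Return data XML containing one <user_input> element per record."""
    # Element names mirror the SOM path from the question's script.
    # Any ancestors of document_table in the real form are unknown.
    root = ET.Element("document_table")
    table = ET.SubElement(root, "Table3")
    for record in rows:
        row = ET.SubElement(table, "user_input")
        for field_name, value in record.items():
            # Field names are hypothetical placeholders.
            ET.SubElement(row, field_name).text = str(value)
    return ET.tostring(root, encoding="unicode")


rows = [
    {"description": "first item", "amount": 10},
    {"description": "second item", "amount": 25},
    {"description": "third item", "amount": 7},
]
data_xml = build_data_xml(rows)
print(data_xml)
```

Three records give three user_input elements, so no addInstance call is needed on the data side; the number of rows follows from the data.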

The real question is: How do you compose that XML if there is no data description?

  • There's the ideal way: you ask the person who created the form for the schema of the data. He or she provided a data source when creating the form, so he or she knows how the data should be organized in the XML. Unfortunately, you don't always have access to the people who created the form.
  • There's the hard way: you examine the XFA XML, and you look at the data bindings in the template. I would need the XFA spec if I was given that task; I wouldn't know where to start if someone asked me off the cuff.
  • There's the pragmatic way: this is the workaround I always use. I fill out the form manually in Adobe Acrobat. I fill out every possible field, and then I save the filled out form. Then I extract the data XML. I use that data XML to create an XSD (there are tools for that). This gives me an approximation of the data description that is expected by the XFA form. Obviously, this XSD won't be perfect because it might not catch all the rules and restrictions that were added to the form, but in most of the cases I encountered, this was sufficient (and when it wasn't sufficient, some minor tweaks to the XSD did the trick).
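For the hard way, a rough first pass can be automated. The Python sketch below walks a template and prints an outline of the names that would probably appear in the data XML. It rests on my understanding of XFA's default binding, so treat these as assumptions to check against the spec: named subforms become data groups, named fields become data values, unnamed subforms add no level of their own, a bind element with match="none" opts a node out, and an occur element with a max other than 1 marks a repeating row. The sample template, including its namespace version and every name in it, is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical template fragment; a real one comes from the PDF's XFA stream.
SAMPLE = """
<template xmlns="http://www.xfa.org/schema/xfa-template/3.0/">
  <subform name="form1">
    <subform>
      <subform name="document_table">
        <subform name="Table3">
          <subform name="header"><bind match="none"/><draw name="caption"/></subform>
          <subform name="user_input">
            <occur min="1" max="-1"/>
            <field name="description"/>
            <field name="amount"/>
          </subform>
        </subform>
      </subform>
    </subform>
  </subform>
</template>
"""


def local(tag):
    """Strip the namespace part from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def bind_info(node):
    """Return (match, ref) from a direct <bind> child; match defaults to 'once'."""
    for child in node:
        if local(child.tag) == "bind":
            return child.get("match", "once"), child.get("ref")
    return "once", None


def repeats(node):
    """True if a direct <occur> child allows more than one instance."""
    for child in node:
        if local(child.tag) == "occur":
            return child.get("max", "1") != "1"
    return False


def outline(node, depth=0, lines=None):
    """Collect an indented outline of the names expected in the data XML."""
    lines = [] if lines is None else lines
    for child in node:
        kind = local(child.tag)
        if kind in ("area", "subformSet"):
            outline(child, depth, lines)  # containers without a data level
            continue
        if kind not in ("subform", "field", "exclGroup"):
            continue
        name = child.get("name")
        match, ref = bind_info(child)
        if kind == "subform" and (name is None or match == "none"):
            outline(child, depth, lines)  # assumed transparent: keep the depth
            continue
        if name is None or match == "none":
            continue
        label = "  " * depth + name
        if repeats(child):
            label += " [repeats]"
        if match == "dataRef" and ref:
            label += " -> " + ref  # explicit binding; the path comes from ref
        lines.append(label)
        if kind == "subform":
            outline(child, depth + 1, lines)
    return lines


lines = outline(ET.fromstring(SAMPLE))
print("\n".join(lines))
```

The outline is only a starting point. Explicit bindings (match="dataRef" or "global") can place data somewhere other than where the template hierarchy suggests, so the result is best checked against the data XML of one manually filled form.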

Once you know the expected structure of the XML, you shouldn't have to worry about JavaScript. The JavaScript that is there to deal with user actions is irrelevant in the context of automatic form filling. The only JavaScript that matters in the context of filling an XFA form with iText, is the JavaScript that does automatic formatting of data (e.g. format a date, a currency,...).

rhens
Bruno Lowagie
  • Yes, I definitely didn't want to have to go down a JavaScript append route, but was unsure what the options were given the lack of description of the data. The ideal way may be possible for me; I'll have to push pretty hard on a few levers. However, is it possible that they did not specify an XML data source but merely built the form field by field? I ask this because the TemplateSom and the JavaScript indicate a field hierarchy that would be odd to arrive at by first deliberately creating an XML schema or source document. – Erikest May 10 '17 at 19:54
  • I have used the pragmatic way before, not realizing that a DataDescription element was even possible, and for simple forms in small numbers it works well. In this case, I have a hundred or so forms of varying complexity to start with, and more could pile up at any moment, along with biannual changes, so I was hoping for a solution that doesn't involve a lot of manual form filling. Related to that, is there a way the form designer could tell LiveCycle Designer to include the DataDescription element and infer it from the form in its current state? – Erikest May 10 '17 at 20:01
  • It is possible that an XFA form was filled *field by field*, but to me (and probably also to you) that feels like a very cumbersome way to create the form. As for forcing LiveCycle Designer to include a DataDescription (or to create one after the fact), I don't know LiveCycle Designer well enough to answer that question. Maybe @rhens knows? – Bruno Lowagie May 11 '17 at 07:21
  • When the source of the work is non-technical people in government, you can almost be assured that they did it the long and hard way with little concern for the downstream effects. At least that has been my repeated experience. – Erikest May 11 '17 at 21:52
  • Your pragmatic way really works, thanks. In fact, for the datasets XML to be generated, the PDF has to be changed and saved. After saving, I removed the data I had entered and the datasets were still present. – nikolai.serdiuk Dec 16 '20 at 12:17