Special characters in PDF form fields and global and field-based DR

Question

I have a question regarding a weird form field behaviour.

Two PDF documents, both with text field(s) using Helvetica as the font
Both are filled with values using the same iText logic (see below)

The field value (/V) is correct for both PDFs; however, the field appearance is not. One PDF works fine, while the other scrambles special characters such as the euro symbol € or German characters like üöäß. I tried to define a substitute font (as described in the book) but never got € and ß to work.

The only difference I could find is that a /DR dictionary is defined at field level for the non-working PDF (in addition to the global one). But if I remove it, the € sign still doesn't work. Please note that I am not talking about Asian or other exotic Unicode characters here: all of them are part of the standard Helvetica font (as the other PDF proves).

Question(s):

Any ideas how to get the non-working PDF to display the characters correctly?
Or does the PDF violate the PDF spec somehow? (It was created using Acrobat, which makes that unlikely but not impossible.)
If you suggest replacing the form field font: how can I differentiate between working and non-working PDF files? I don't want to do that for perfectly valid and working files.

Update: The code is not the problem (I am certain of that, since it's the same code for both); however, for the sake of completeness, here it is:

AcroFields acroFields = stamper.getAcroFields();
try {
    // setField returns false if the field was not found or not changed
    boolean successful = acroFields.setField("Mitarbeiter", "öäüß€@");
    if (!successful) {
        // throw some exception
    }
}
catch (DocumentException de) {
    // some exception handling
}

Also, could you post links to the original forms rather than the filled-out ones? I see that the value is stored correctly in both forms, but the appearance is wrong in one of the PDFs. This means that iText didn't have access to the font at the time the form was filled out. – Bruno Lowagie, Jan 13 '15 at 13:29
Exactly: both values are stored correctly in /V, as you can see when you click inside the field. However, the appearance is broken in one of them. – Lonzak, Jan 13 '15 at 13:49
At first sight, it's an encoding problem. When I fill out a form that expects Helvetica with WinAnsi encoding, all is well. When I fill out your form, which expects Helvetica with PDFDocEncoding, the wrong characters are selected because the wrong encoding is used. – Bruno Lowagie, Jan 13 '15 at 15:08
That is exactly what I saw while debugging: a different font constructor is called in each case, one with WinAnsi and one without. I just noticed that a Times New Roman WinAnsi font is defined at the page level. Does this trigger it? – Lonzak, Jan 13 '15 at 15:22
Aha, now I see the remark you added to your update. I'll have to create an internal ticket at iText for this. I have a 2-day board meeting coming up and I don't have the time to dig into this problem. – Bruno Lowagie, Jan 13 '15 at 15:28
Please take a look at [this form](http://itextpdf.org/documents/form_helv.pdf). I have added an `/Encoding` entry to the font dictionary. This `/Encoding` entry refers to the PDFDocEncoding dictionary that is already present in the `/DR`. Now the text is rendered correctly. I checked the PDF ISO standard, and `/Encoding` is not listed as a possible entry in the resource dictionary (`/DR`). iText expects it to be in the font dictionary, but it isn't there. Your problem is caused by the fact that iText doesn't find the encoding. I'm not sure whether your PDF is correct according to the standard. – Bruno Lowagie, Jan 13 '15 at 16:04

score 1 · Accepted Answer · answered Jan 13 '15 at 16:43


I didn't find any clues about this in the PDF reference. The font that is used for the field doesn't define an encoding; however, an encoding is defined at the level of the resource dictionary (/DR). If you use that encoding, the appearance of the field is created correctly. Note that the ISO specification doesn't mention an /Encoding entry at the level of the resource dictionary.
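To illustrate why a wrong encoding scrambles the appearance, here is a minimal Java sketch that does not use iText. The two tables are deliberately partial excerpts of WinAnsi (Windows-1252) and PDFDocEncoding, and the class and method names are made up for this example. It only shows the general mechanism (a code written under one encoding and read under another); it is not necessarily the exact byte path taken in the failing form.

```java
import java.util.Map;

final class EncodingMismatchDemo {
    // Partial code-to-character table for WinAnsi (Windows-1252).
    static final Map<Integer, Character> WIN_ANSI = Map.of(
        0x80, '\u20AC',  // euro sign
        0xA0, '\u00A0',  // no-break space
        0xDF, '\u00DF',  // sharp s
        0xFC, '\u00FC'   // u with diaeresis
    );

    // Partial code-to-character table for PDFDocEncoding.
    static final Map<Integer, Character> PDF_DOC = Map.of(
        0x80, '\u2022',  // bullet
        0xA0, '\u20AC',  // euro sign
        0xDF, '\u00DF',  // sharp s
        0xFC, '\u00FC'   // u with diaeresis
    );

    // Returns the single-byte code for c in the given table, or -1 if absent.
    static int encode(Map<Integer, Character> table, char c) {
        for (Map.Entry<Integer, Character> e : table.entrySet()) {
            if (e.getValue() == c) {
                return e.getKey();
            }
        }
        return -1;
    }

    // Returns the character for a code in the given table, or '?' if absent.
    static char decode(Map<Integer, Character> table, int code) {
        Character c = table.get(code);
        return c == null ? '?' : c;
    }

    public static void main(String[] args) {
        // The euro sign is written as a WinAnsi code but read as PDFDocEncoding.
        int code = encode(WIN_ANSI, '\u20AC');
        char shown = decode(PDF_DOC, code);
        System.out.printf("code 0x%02X shows as U+%04X%n", code, (int) shown);
        // prints: code 0x80 shows as U+2022
    }
}
```

In this excerpt the euro sign sits at 0x80 in WinAnsi but at 0xA0 in PDFDocEncoding, so the writer and the reader of the appearance stream have to agree on the table.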

I've made a small update to iText; you can check the changes in revision 6693. iText now checks whether the /DR dictionary has encoding values when no encoding is defined at the level of the font. With this fix, your form is filled out correctly.
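The fallback described above can be sketched roughly as follows. This is not the code from revision 6693: the PDF dictionaries are modeled as plain `Map`s, `resolveEncoding` is a hypothetical name, and picking the first named encoding under /DR is an assumption made only to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class EncodingFallbackSketch {
    /**
     * Hypothetical helper: returns the font's own /Encoding if present;
     * otherwise falls back to the first named encoding listed under the
     * /Encoding entry of /DR. Returns null when neither level defines one.
     */
    static String resolveEncoding(Map<String, String> fontDict,
                                  Map<String, String> drEncodings) {
        String own = fontDict.get("Encoding");
        if (own != null) {
            return own;
        }
        if (drEncodings == null || drEncodings.isEmpty()) {
            return null;
        }
        // Assumption: take the first entry; a real form may list several.
        return drEncodings.values().iterator().next();
    }

    public static void main(String[] args) {
        // A font dictionary without /Encoding, as in the non-working form.
        Map<String, String> font = new LinkedHashMap<>();
        font.put("BaseFont", "Helvetica");

        // The encodings found at the level of the resource dictionary.
        Map<String, String> drEncodings = new LinkedHashMap<>();
        drEncodings.put("PDFDocEncoding", "PDFDocEncoding");

        System.out.println(resolveEncoding(font, drEncodings));
        // prints: PDFDocEncoding
    }
}
```

A font that does define its own /Encoding is left untouched by this lookup, so forms that already work would not be affected.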

answered Jan 13 '15 at 16:43

Bruno Lowagie


I highly appreciate that, thank you very much! On a side note: that PDF was generated by Adobe Acrobat, so it is their "bug". Maybe Leonard can shed some light on that... – Lonzak Jan 13 '15 at 18:54
I don't know if it's a bug. Leonard usually says: if something isn't excluded in the spec, it isn't forbidden. However: I prefer clarity ;-) – Bruno Lowagie Jan 13 '15 at 21:15
