Can't update/display PDF form with Python pdfrw lib

Question

I have some questions about PDF form filling. First, some context: I am trying to build a 100% Python PDF form-filling service, and for that I am using the pdfrw library.

Here is my code. It takes as arguments a PDF path and data_dict (JSON turned into a dict):

import pdfrw

_ANNOT_KEY = "/Annots"
_ANNOT_FIELD_KEY = "/T"
_ANNOT_VAL_KEY = "/V"
_ANNOT_RECT_KEY = "/Rect"
_SUBTYPE_KEY = "/Subtype"
_WIDGET_SUBTYPE_KEY = "/Widget"

def fill_pdf_with_values(input_pdf_path, data_dict):

    template_pdf = pdfrw.PdfReader(input_pdf_path)
    template_pdf.Root.AcroForm.update(
        pdfrw.PdfDict(NeedAppearances=pdfrw.PdfObject("true"))
    )
    for page in template_pdf.pages:
        # Read the annotations of the current page; a page may have none.
        annotations = page[_ANNOT_KEY] or []
        for annotation in annotations:
            if annotation[_SUBTYPE_KEY] != _WIDGET_SUBTYPE_KEY:
                continue
            if not annotation[_ANNOT_FIELD_KEY]:
                continue
            key = annotation[_ANNOT_FIELD_KEY][1:-1]
            if key not in data_dict.keys():
                continue
            if isinstance(data_dict[key], bool):
                if data_dict[key]:
                    # If the value is True, the checkbox will be checked.
                    # "On" is not required; by that I mean you can put whatever you want,
                    # but without this line we can't get the checkbox to work.
                    # annotation.update(pdfrw.PdfDict(AS=pdfrw.PdfName("On")))
                    annotation.update(
                        pdfrw.PdfDict(AP=data_dict[key], AS=pdfrw.PdfName("On"))
                    )
                else:
                    # If the value is False, we don't want the checkbox to be checked.
                    # annotation.update(pdfrw.PdfDict(AS=pdfrw.PdfName("Off")))
                    annotation.update(
                        pdfrw.PdfDict(AP=data_dict[key], AS=pdfrw.PdfName("Off"))
                    )
                continue
            annotation.update(pdfrw.PdfDict(AP=data_dict[key], V=data_dict[key]))
    
    output_pdf = pdfrw.PdfWriter()
    output_pdf.write("test.pdf", template_pdf)

But I am struggling to make it work. Here are my two problems:

1. Depending on the PDF viewer, the data in the text fields is not displayed, and the same goes for my checkboxes. I don't know enough about PDF to tell what differs between viewers. What do I need to set for the values to be displayed in every case?
2. I also have a big problem with one particular field: I can edit it when I open the "clean" PDF, but after I pass the PDF through my code, nothing is written and the text is not editable. Also, when I print the corresponding annotation for the "bugged" field, here is what I get (before filling):

annotation = {'/AP': {'/N': (216, 0)}, '/DA': '(/Helv 0 Tf 0 g)', '/DR': {'/Encoding': {'/PDFDocEncoding': {'/Differences': ['24', '/breve', '/caron', '/circumflex', '/dotaccent', '/hungarumlaut', '/ogonek', '/ring', '/tilde', '39', '/quotesingle', '96', '/grave', '128', '/bullet', '/dagger', '/daggerdbl', '/ellipsis', '/emdash', '/endash', '/florin', '/fraction', '/guilsinglleft', '/guilsinglright', '/minus', '/perthousand', '/quotedblbase', '/quotedblleft', '/quotedblright', '/quoteleft', '/quoteright', '/quotesinglbase', '/trademark', '/fi', '/fl', '/Lslash', '/OE', '/Scaron', '/Ydieresis', '/Zcaron', '/dotlessi', '/lslash', '/oe', '/scaron', '/zcaron', '160', '/Euro', '164', '/currency', '166', '/brokenbar', '168', '/dieresis', '/copyright', '/ordfeminine', '172', '/logicalnot', '/.notdef', '/registered', '/macron', '/degree', '/plusminus', '/twosuperior', '/threesuperior', '/acute', '/mu', '183', '/periodcentered', '/cedilla', '/onesuperior', '/ordmasculine', '188', '/onequarter', '/onehalf', '/threequarters', '192', '/Agrave', '/Aacute', '/Acircumflex', '/Atilde', '/Adieresis', '/Aring', '/AE', '/Ccedilla', '/Egrave', '/Eacute', '/Ecircumflex', '/Edieresis', '/Igrave', '/Iacute', '/Icircumflex', '/Idieresis', '/Eth', '/Ntilde', '/Ograve', '/Oacute', '/Ocircumflex', '/Otilde', '/Odieresis', '/multiply', '/Oslash', '/Ugrave', '/Uacute', '/Ucircumflex', '/Udieresis', '/Yacute', '/Thorn', '/germandbls', '/agrave', '/aacute', '/acircumflex', '/atilde', '/adieresis', '/aring', '/ae', '/ccedilla', '/egrave', '/eacute', '/ecircumflex', '/edieresis', '/igrave', '/iacute', '/icircumflex', '/idieresis', '/eth', '/ntilde', '/ograve', '/oacute', '/ocircumflex', '/otilde', '/odieresis', '/divide', '/oslash', '/ugrave', '/uacute', '/ucircumflex', '/udieresis', '/yacute', '/thorn', '/ydieresis'], '/Type': '/Encoding'}}, '/Font': {'/Helv': {'/BaseFont': '/Helvetica', '/Name': '/Helv', '/Subtype': '/Type1', '/Type': '/Font'}}}, '/F': '4', '/FT': '/Tx', '/P': (12, 0), '/Rect': 
['453.96', '455.04', '749.16', '463.2'], '/Subtype': '/Widget', '/T': '(Nomdusage)', '/TU': '(Nomdusage)', '/Type': '/Annot'}

whereas for all the other fields, which are supposed to be used the same way, I get:

annotation = {'/DA': '(/Helv 12 Tf 0 g)', '/F': '4', '/FT': '/Tx', '/MK': {}, '/P': (12, 0), '/Rect': ['129.105', '454.669', '395.032', '463.725'], '/Subtype': '/Widget', '/T': '(Nomdenaissance)', '/TU': '(Nomdenaissance)', '/Type': '/Annot'}
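One way to see exactly where the two printouts differ is a key-by-key comparison. The following is only a minimal sketch: `diff_annotations` is a hypothetical helper, and the two dictionaries are plain-dict copies of the printouts above, with the long `/DR` entry abbreviated to a placeholder.

```python
def diff_annotations(first, second):
    """Compare two annotation dictionaries key by key.

    Returns three sorted lists: keys only in `first`, keys only in
    `second`, and shared keys whose values differ.
    """
    only_first = sorted(set(first) - set(second))
    only_second = sorted(set(second) - set(first))
    changed = sorted(
        key for key in set(first) & set(second) if first[key] != second[key]
    )
    return only_first, only_second, changed


# Plain-dict copies of the two printouts (assumption: /DR is abbreviated here).
bugged = {
    "/AP": {"/N": (216, 0)},
    "/DA": "(/Helv 0 Tf 0 g)",
    "/DR": "<abbreviated>",
    "/F": "4",
    "/FT": "/Tx",
    "/P": (12, 0),
    "/Rect": ["453.96", "455.04", "749.16", "463.2"],
    "/Subtype": "/Widget",
    "/T": "(Nomdusage)",
    "/TU": "(Nomdusage)",
    "/Type": "/Annot",
}
working = {
    "/DA": "(/Helv 12 Tf 0 g)",
    "/F": "4",
    "/FT": "/Tx",
    "/MK": {},
    "/P": (12, 0),
    "/Rect": ["129.105", "454.669", "395.032", "463.725"],
    "/Subtype": "/Widget",
    "/T": "(Nomdenaissance)",
    "/TU": "(Nomdenaissance)",
    "/Type": "/Annot",
}

only_bugged, only_working, changed = diff_annotations(bugged, working)
print("only in bugged field: ", only_bugged)
print("only in working field:", only_working)
print("different values:     ", changed)
```

Apart from the name and position entries, the structural differences are that the misbehaving field carries its own `/AP` and `/DR` entries, has no `/MK`, and uses a font size of 0 instead of 12 in `/DA`.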

With this, I can't tell whether I am doing something wrong. My guess is that the annotation for this field is badly implemented in the "clean" PDF. I tried a lot of different things, but I can't find a solution on the internet.

If needed, I can provide the PDF and a data set.

Thanks for reading and for your time! I hope you can help me with this :)
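On the checkbox side, one detail may be worth checking. In the PDF format, a checkbox's /AS entry selects one of the appearance states listed under /AP /N, so a state name that is not defined there can leave a viewer with nothing to draw. The sketch below looks up the "on" state name instead of assuming it is "On". `checkbox_on_state` is a hypothetical helper, the sample widget is made up for illustration, and it is an assumption that pdfrw's dictionary objects can be passed in the same way as the plain dicts used here.

```python
def checkbox_on_state(annotation):
    """Return the name of the checked appearance state, or None.

    Looks under /AP /N for the first state that is not "/Off".
    Only the one-argument form of .get() is used, so any mapping-like
    object that supports it can be passed in.
    """
    appearance = annotation.get("/AP")
    normal = appearance.get("/N") if appearance else None
    if not normal:
        return None
    for state in normal.keys():
        if str(state) != "/Off":
            return str(state)
    return None


# Made-up checkbox widget whose checked state is called "/Yes", not "/On".
sample_checkbox = {
    "/FT": "/Btn",
    "/AP": {"/N": {"/Off": None, "/Yes": None}},
}

on_state = checkbox_on_state(sample_checkbox)
print(on_state)
```

If the lookup returns a name, that name (without the leading slash) would be the candidate to use for /AS in place of a hard-coded "On". If it returns None, the widget defines no normal appearance states to choose from.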
