
My project is to add a digital signature to an existing PDF on Windows, in C++, with PAdES support. After studying the basics of the PDF format and the output of jSignPdf, I have managed to sign any existing PDF file, and Adobe Acrobat validates the result as a PAdES B-T level signature.

My problem is now to preserve existing PDF information. As far as I have understood it, adding extra information to an existing PDF requires:

  • Creating a new section with its own xref table, trailer, startxref and %%EOF.
  • Creating some new objects to describe the signature.
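
As a rough illustration of the first point, the tail of an appended section could be assembled as in the sketch below. The function names (`xrefEntry`, `incrementalTail`) and their parameters are hypothetical, not taken from jSignPdf or any library. The sketch assumes that the new objects have consecutive object numbers, that all of them are in use with generation 0, and that the caller already knows the byte offset of every new object and of the new xref keyword.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Formats one in-use xref entry: 10-digit byte offset, 5-digit generation
// number, the keyword 'n', then a two-byte end-of-line (space + LF here),
// which gives exactly 20 bytes per entry.
std::string xrefEntry(unsigned long offset) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%010lu 00000 n \n", offset);
    return buf;
}

// Hypothetical helper: builds "xref ... trailer ... startxref ... %%EOF"
// for an appended section holding one subsection of consecutive objects.
//   firstObj - object number of the first new object
//   offsets  - byte offsets of the new objects, in object-number order
//   rootObj  - object number of the catalog that /Root should point to
//   size     - highest object number in the whole file plus one
//   prevXref - byte offset of the previous xref table (goes into /Prev)
//   newXref  - byte offset of the xref keyword written here
std::string incrementalTail(int firstObj,
                            const std::vector<unsigned long>& offsets,
                            int rootObj, int size,
                            unsigned long prevXref, unsigned long newXref) {
    std::string out = "xref\n" + std::to_string(firstObj) + " " +
                      std::to_string(offsets.size()) + "\n";
    for (unsigned long off : offsets) {
        out += xrefEntry(off);
    }
    out += "trailer\n<</Size " + std::to_string(size) +
           "/Root " + std::to_string(rootObj) + " 0 R" +
           "/Prev " + std::to_string(prevXref) + ">>\n";
    out += "startxref\n" + std::to_string(newXref) + "\n%%EOF\n";
    return out;
}
```

The /Prev entry is what chains the new section to the previous one, so a reader can still resolve objects that exist only in the older xref table.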

jSignPdf (and my tool) creates 9 new objects; 8 of them are self-contained (i.e. they reference only objects in the same section):

12 0 obj
<</F 132/Type/Annot/Subtype/Widget/Rect[0 0 0 0]/FT/Sig/DR<<>>/T(Signature1)/V 10 0 R/P 9 0 R/AP<</N 11 0 R>>>>
endobj
10 0 obj
<</Contents<....> 
/Type/Sig/SubFilter/ETSI.CAdES.detached/M(D:20181004191406+00'00')/ByteRange [0 829 60831 1193]/Filter/Adobe.PPKLite>>
endobj
13 0 obj
<</BaseFont/Helvetica/Type/Font/Subtype/Type1/Encoding/WinAnsiEncoding/Name/Helv>>
endobj
14 0 obj
<</BaseFont/ZapfDingbats/Type/Font/Subtype/Type1/Name/ZaDb>>
endobj
11 0 obj
<</Type/XObject/Resources<</ProcSet [/PDF /Text /ImageB /ImageC /ImageI]>>/Subtype/Form/BBox[0 0 0 0]/Matrix [1 0 0 1 0 0]/Length 8/FormType 1/Filter/FlateDecode>>stream
...
endstream
endobj
9 0 obj
<</Parent 8 0 R/Contents 5 0 R/Type/Page/Resources<</Font<</Helv 13 0 R>>>>/MediaBox[0 0 200 200]/Annots[12 0 R]>>
endobj
8 0 obj
<</Type/Pages/MediaBox[0 0 200 200]/Count 1/Kids[9 0 R]>>
endobj
7 0 obj
<</Type/Catalog/AcroForm<</Fields[12 0 R]/DR<</Font<</Helv 13 0 R/ZaDb 14 0 R>>>>/DA(/Helv 0 Tf 0 g )/SigFlags 3>>/Pages 8 0 R>>
endobj
15 0 obj
<</Producer(AdES Tools)/ModDate(D:20181002132630+03'00')>>
endobj
xref
7 9
.... 
trailer
<</Root 7 0 R/Prev 492/Info 15 0 R/Size 12>>
startxref
61760
%%EOF

The problem is object 9 in this section:

9 0 obj
<</Parent 8 0 R/Contents 5 0 R/Type/Page/Resources<</Font
<</Helv 13 0 R>>>>/MediaBox[0 0 200 200]/Annots[12 0 R]>>

This object refers to an object in the original PDF section (/Contents 5 0 R). My problem is how to create a correct reference to it.

The original document is this:

%PDF-1.7

1 0 obj  % entry point
<<
  /Type /Catalog
  /Pages 2 0 R
>>
endobj

2 0 obj
<<
  /Type /Pages
  /MediaBox [ 0 0 200 200 ]
  /Count 1
  /Kids [ 3 0 R ]
>>
endobj

3 0 obj
<<
  /Type /Page
  /Parent 2 0 R
  /Resources <<
    /Font <<
      /F1 4 0 R 
    >>
  >>
  /Contents 5 0 R
>>
endobj

4 0 obj
<<
  /Type /Font
  /Subtype /Type1
  /BaseFont /Times-Roman
>>
endobj

5 0 obj  % page content
<<
  /Length 44
>>
stream
BT
70 50 TD
/F1 12 Tf
(Hello, world!) Tj
ET
endstream
endobj

xref
0 6
0000000000 65535 f 
0000000010 00000 n 
0000000079 00000 n 
0000000173 00000 n 
0000000301 00000 n 
0000000380 00000 n 
trailer
<<
  /Size 6
  /Root 1 0 R
>>
startxref
492
%%EOF

Now I have created a simple PDF parser in C++, and as far as I understand I need to:

  • Find the root object (from trailer)
  • Find the pages descriptor from the root object (in this case, /Pages 2 0 R)
  • Find the first (?) kid object from the pages descriptor (in this case, /Kids [ 3 0 R ])
  • From the referenced object, get the contents (in this case, 5 0 R)
  • Build object 9 in my new PDF section.
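
The lookup chain in the steps above could be sketched roughly as follows. All function names here (`findObject`, `refAfterKey`, `firstPageContents`) are hypothetical. The sketch makes strong simplifying assumptions: the file is uncompressed with no object streams, every object has generation 0, objects are located by plain text search rather than through the xref offsets, each object number appears only once, and every key of interest holds a direct indirect reference (or an array whose first element is one).

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns the text between "<num> 0 obj" and "endobj",
// or an empty string if the object is not found. Only a match at the start
// of a line counts, so "12 0 obj" is not mistaken for "2 0 obj".
std::string findObject(const std::string& pdf, int num) {
    const std::string header = std::to_string(num) + " 0 obj";
    std::size_t pos = 0;
    while ((pos = pdf.find(header, pos)) != std::string::npos) {
        if (pos == 0 || pdf[pos - 1] == '\n' || pdf[pos - 1] == '\r') {
            const std::size_t start = pos + header.size();
            const std::size_t end = pdf.find("endobj", start);
            if (end == std::string::npos) return "";
            return pdf.substr(start, end - start);
        }
        pos += header.size();
    }
    return "";
}

// Hypothetical helper: reads the object number that follows a key, as in
// "/Pages 2 0 R" or "/Kids [ 3 0 R ]" (first array element only).
// Returns -1 if the key is missing or is not followed by a number.
int refAfterKey(const std::string& dict, const std::string& key) {
    std::size_t pos = dict.find(key);
    if (pos == std::string::npos) return -1;
    pos += key.size();
    while (pos < dict.size() &&
           (std::isspace(static_cast<unsigned char>(dict[pos])) || dict[pos] == '[')) {
        ++pos;
    }
    int num = 0;
    bool anyDigit = false;
    while (pos < dict.size() && std::isdigit(static_cast<unsigned char>(dict[pos]))) {
        num = num * 10 + (dict[pos] - '0');
        anyDigit = true;
        ++pos;
    }
    return anyDigit ? num : -1;
}

// Walks trailer -> /Root -> /Pages -> first /Kids entry -> /Contents and
// returns the object number of the page content, or -1 on any failure.
int firstPageContents(const std::string& pdf) {
    const std::size_t trailerPos = pdf.rfind("trailer");
    if (trailerPos == std::string::npos) return -1;
    const int root = refAfterKey(pdf.substr(trailerPos), "/Root");
    if (root < 0) return -1;
    const int pages = refAfterKey(findObject(pdf, root), "/Pages");
    if (pages < 0) return -1;
    const int page = refAfterKey(findObject(pdf, pages), "/Kids");
    if (page < 0) return -1;
    return refAfterKey(findObject(pdf, page), "/Contents");
}
```

For the original document shown above, this walk would go through objects 1, 2 and 3 and end at 5, the number used in /Contents 5 0 R.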

This is merely experimental though. What I need is a reference to documentation that explains the proper procedure for referring to the previous section of a PDF file, or some working C++ code that appends a new section to an existing PDF document.

If I have to, I will look through the PDF specification and/or the iText source, but I hope I can avoid that.

Thanks a lot.

πάντα ῥεῖ
Michael Chourdakis
  • The new/second xref table is just "7 9 ...." I assume that is not actually what you appended to the original PDF, but just because you had not calculated those values. If not, then that is an issue. The reference from Obj 9 to Obj 5 looks fine to me. Can you elaborate on what the issue exactly is? – Ryan Oct 05 '18 at 06:56
  • *"If I have, I will look through the PDF specification and/or iText source, but I hope I can avoid that."* - A developer manipulating files of some given format directly who hopes to *avoid looking through the specs* of that format... this sounds weird to me. You really should look through the specification, as your question shows multiple misunderstandings of the format. – mkl Oct 05 '18 at 07:12
