Edit to add PDF link

I have an Outlook add-in that edits a PDF attached to an email using iText 7.

The code (C#) appears to work correctly, and when the PDF is opened in Acrobat or Kofax Power PDF the formatting is correct. When the same PDF is opened in Nuance PDF Professional 5, only the text entered into the form fields remains visible. If the same PDF is opened in Acrobat or Kofax and a copy of it is saved, that copy then opens correctly in Nuance.

I have tried flattening the PDF, which results in no visible text in Nuance (essentially a blank PDF). I have also found that the issue only occurs for PDFs that have had a handwritten signature added to them on an iPad.

Finally, I have noticed that the file saved through iText is 50-100 KB smaller than the same PDF saved through Acrobat. I set compression to no_compression, but this did not change the result.

Any help would be greatly appreciated.

  • Please share the PDF in question. – mkl Apr 28 '23 at 13:27
  • Thanks mkl, I can’t share the PDF, however I will alter the pdf to remove anything I can not share, test it still does the same and add it. – user3555090 Apr 28 '23 at 21:14
  • @mkl here is a link to the PDF that is causing me problems. https://www.dropbox.com/s/4i2pyb7yiewuow6/Electronic%20Daily%20Timesheet%20Template%20V2.24.pdf?dl=0 – user3555090 May 02 '23 at 22:22
  • I still have been unable to resolve this issue. I have recreated the PDF and switched to iTextSharp however the issue persists. Would anyone have any insight into why this is occurring? Maybe @mkl – user3555090 May 10 '23 at 06:55
  • I've had a look at the file contents. At first glance, they appear to be correct. Can you share the example PDF both before and after editing and before and after signing? Maybe comparing the internals highlights a difference. – mkl May 10 '23 at 08:12
  • Thanks so much @mkl. Here is the same PDF once saved using "Save As" from Adobe Acrobat Reader: https://www.dropbox.com/s/yvn3vu8vukbihtn/Electronic%20Daily%20Timesheet%20Template%20V2.24%20%28Same%20file%20once%20saved%20from%20Adobe%20Acrobat%20reader%29.pdf?dl=0 – user3555090 May 10 '23 at 08:50
  • Here are the screenshots of what I see when opening each file in Nuance PDF converter. https://www.dropbox.com/s/7x69ote3ivguobm/Screenshot%202023-05-10%20184405.png?dl=0 https://www.dropbox.com/s/pjax2ma3yo2gmpa/Screenshot%202023-05-10%20184645.png?dl=0 – user3555090 May 10 '23 at 08:52

1 Answer

In comments the OP provided two states of the file: the one their code handled, which Nuance PDF Professional displays incorrectly ("Electronic Daily Timesheet Template V2.24.pdf"), and the same file after being saved by Adobe Acrobat, which Nuance PDF Professional displays correctly ("Electronic Daily Timesheet Template V2.24 (Same file once saved from Adobe Acrobat reader).pdf"). The main differences are as follows.

  • The objects of the two files hardly differ at all; essentially, only the page content has been split up into a larger number of partial streams.

    I doubt Nuance PDF Professional has trouble handling the larger partial content streams of the former document; if it did, this would cause issues much more often. Nonetheless, it cannot be completely ruled out as the cause.

  • Looking at the way the objects are stored in the file, though, there is a very relevant difference: the former file is stored as a sequence of revisions, while the latter has been flattened into a single revision. This matters because there are structural errors in the first revision of the former document (the one without fill-ins or signature scribbles): its cross-reference table is invalidly built. Adobe Acrobat, when flattening those revisions into a single one, created the latter file with a valid cross-reference table.

    This error in the former file is also known to cause issues in Adobe Acrobat in workflows with multiple digital signatures. It might also cause arbitrary issues in other PDF processing software such as Nuance PDF Professional.

    (The exact error is that the cross-reference table is segmented, i.e. split into several subsections, which is forbidden in the first revision of a PDF; compare this, this, and numerous similar Stack Overflow answers.)
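
To check another file for the same problem, a rough sketch along the following lines may help. It is not how the files above were analysed, only an illustration using the Python standard library. It assumes the first revision uses a classic `xref` table (not a cross-reference stream), that the file is not linearized, and that the earliest `xref` keyword in the file belongs to the first revision. The function names are made up for this example.

```python
import re

# A subsection header in a classic xref table is "<first object number> <entry count>".
# The 20-byte entries ("0000000000 65535 f ") end in 'n' or 'f' and therefore never match.
SUBSECTION_HEADER = re.compile(rb"^(\d+)\s+(\d+)$")


def first_xref_subsections(pdf_bytes: bytes) -> list:
    """Return (first_object, count) pairs for the earliest classic xref table."""
    # Require a line break (or start of file) before "xref" so "startxref" is skipped.
    match = re.search(rb"(?:^|[\r\n])xref[ \t]*[\r\n]+", pdf_bytes)
    if match is None:
        return []  # no classic table found, e.g. cross-reference streams only
    end = pdf_bytes.find(b"trailer", match.end())
    body = pdf_bytes[match.end():end if end != -1 else len(pdf_bytes)]
    subsections = []
    for line in re.split(rb"\r\n|\r|\n", body):
        header = SUBSECTION_HEADER.match(line.strip())
        if header:
            subsections.append((int(header.group(1)), int(header.group(2))))
    return subsections


def first_xref_is_segmented(pdf_bytes: bytes) -> bool:
    """True if the earliest xref table has more than one subsection."""
    return len(first_xref_subsections(pdf_bytes)) > 1


def count_revisions(pdf_bytes: bytes) -> int:
    """Heuristic: each revision normally ends with an %%EOF marker."""
    return pdf_bytes.count(b"%%EOF")
```

For a file on disk, one would read it with `open(path, "rb").read()` and pass the bytes to `first_xref_is_segmented`. A result of `True` for a file whose `count_revisions` is greater than 1 would match the situation described above, but a proper PDF validator gives a more reliable answer than this text scan.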

mkl
  • Thanks @mkl. Instead of rebuilding the PDF again, I edited "Electronic Daily Timesheet Template V2.24 (Same file once saved from Adobe Acrobat reader).pdf", resetting the form fields. It appears this has worked; however, I will need to test it a few more times. Thank you so much for your assistance!! – user3555090 May 11 '23 at 01:44
  • It works 100% as expected! You are incredible, thanks @mkl – user3555090 May 11 '23 at 23:53
  • That's great! (You may want to mark this answer as *accepted answer* by clicking the tick at its upper left.) – mkl May 12 '23 at 04:56