I am trying to compare two PDFs with minor differences in Java and store the result in an output PDF. Most solutions I have found merge the two PDFs into one and highlight the differences there, which does not meet the business requirement. I am looking for an open-source solution, or any suggestion, where both PDFs are retained and the differences are highlighted in each of them separately, with the result stored in an output PDF.

I have tried the compare functionality of PDFBox, but it merges the two PDFs into one and highlights the differences in that single PDF, which looks messy.

Aspose's solution seemed impressive: it also returns a single PDF file, but the output is more structured. However, my organisation doesn't allow access to their API for security reasons.

  • There's some code for this in TestPDFToImage.java in the source code download. This renders two PDFs and compares the rendering output and produces visual diff files. But this isn't very elegant to look at. – Tilman Hausherr Nov 08 '22 at 09:47
  • Thanks for that. Are you aware of a link to its repo, or anything else that I can refer to? – Aneek Biswas Nov 08 '22 at 10:35
  • It's here: https://svn.apache.org/viewvc/pdfbox/branches/2.0/pdfbox/src/test/java/org/apache/pdfbox/rendering/TestPDFToImage.java?view=markup The full source code is available at https://pdfbox.apache.org/download.html (you don't really need the repo, but you can get it at http://svn.apache.org/repos/asf/pdfbox/branches/2.0/ ). – Tilman Hausherr Nov 08 '22 at 11:31
  • Note that it isn't part of the build tests, you have to explicitly start it. This is because renderings slightly differ in different OS / different jdk versions. So it's only used by individual developers on their own machine. This is "developer code" i.e. not ready to use as an application, but you can modify that code to make something for yourself (if you think the output is useful). – Tilman Hausherr Nov 08 '22 at 11:35
  • Sure, thanks! I'll give it a try and let you know. – Aneek Biswas Nov 08 '22 at 14:11
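
The render-and-compare idea from the comments above can be sketched roughly as follows. This is not the code from TestPDFToImage.java; it is a minimal illustration that assumes each page has already been rendered to a `BufferedImage` (for example with PDFBox's `PDFRenderer`). The class name `PageDiff` and the method `highlight` are hypothetical names chosen for this example. It marks the differing pixels in a copy of one page, so calling it once per document gives two separately highlighted results instead of one merged image.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of PDFBox or pdfcompare.
class PageDiff {

    /**
     * Returns a copy of {@code page} in which every pixel that differs from
     * {@code other} is painted in {@code mark}. Neither input is modified.
     * Pixels of {@code page} that lie outside {@code other} count as different.
     */
    static BufferedImage highlight(BufferedImage page, BufferedImage other, Color mark) {
        int width = page.getWidth();
        int height = page.getHeight();
        BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                // Compare colour channels only; ignore alpha.
                int rgb = page.getRGB(x, y) & 0xFFFFFF;
                boolean insideOther = x < other.getWidth() && y < other.getHeight();
                boolean differs = !insideOther || (other.getRGB(x, y) & 0xFFFFFF) != rgb;
                out.setRGB(x, y, differs ? mark.getRGB() : rgb);
            }
        }
        return out;
    }
}
```

Calling `PageDiff.highlight(pageA, pageB, Color.RED)` and `PageDiff.highlight(pageB, pageA, Color.GREEN)` for each page pair would give one marked image per document, which could then be written into two separate output PDFs. As noted in the comments, renderings can differ slightly between operating systems and JDK versions, so an exact pixel comparison like this may need a tolerance in practice.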

1 Answer

We are using pdfcompare for this. It is based on PDFBox, but it can do exactly this.

Lonzak