5

I've been working on setting passwords on PDFs to prevent copy/paste while still allowing printing, adding watermarks, and setting an owner password to prevent further changes.

Everything works as expected; no issues there.

Then I downloaded a-pdf, a PDF protection removal tool that is free for 15 days. In the blink of an eye it removes all protection, no matter how complex the password is (I tried a 50-character password with all kinds of characters).

I see there are other methods in itextPDF to encrypt a document. I used the following:

File f = new File("C:/TEMP/zip/waterMarked.pdf");

String hardPassword = "D 5BaIZQ@ CqAk+NQCW)7Dkgb@i&02ifu!2TMX*d 0TGK(j(Kq";
byte[] hardPasswordByte = hardPassword.getBytes(); 

PdfReader reader = new PdfReader("C:/TEMP/zip/Original_document-9.pdf");

FileOutputStream out = new FileOutputStream(f);

PdfStamper stamp = new PdfStamper(reader, out);

// The first argument is the user password. If it is set, the viewer asks for a password when opening the file, which is not wanted here.
stamp.setEncryption(null, hardPasswordByte, PdfWriter.ALLOW_PRINTING, true);

// Do stuff on the stamper, then close it to save the file.

Does anyone know a better way to protect PDF documents from Java code?

IceGras
  • 7
    The problem you're facing is inherent to the type of restriction you're applying: to display a PDF, the computer needs access to pretty much the same data it needs for copying text from your document. You can't reliably allow printing and prevent copy & paste from your document. Everything you try just raises the clue barrier. – Joachim Sauer Jul 12 '11 at 13:27
  • 3
    Please don't create PDFs with copying prevented. It's extremely annoying to have to break out the OCR just to copy text from your document, which I'm going to do whether you try to prevent me or not. – endolith Sep 20 '11 at 03:26

1 Answer

12

PDF files support two passwords: a user password and an owner password. A user can view the PDF file if they know either of them. If the file has a user password, the PDF viewer asks for a password when the file is opened, and either the user or the owner password will work. If the file has only an owner password, the document is displayed without a prompt, and the password is required only when trying to change the file's access rights.

This is the flow of operations suggested by the PDF specification, but in reality it works like this:

If the file is protected with a user password, cracking it requires a brute-force approach, and the longer the password is, the longer it takes. The problem is that your real users also need the password to open the file.

If the file is protected only with an owner password, there is a default decryption key (remember, any viewer can display the PDF file without requesting a password), and the application that processes the PDF file decides whether or not to respect the document's access rights. Once the file has been decrypted, it is saved without encryption, and the output file no longer has a password.

Since your documents have only the owner password, the tool removes it without problems using the default decryption key.
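The "default decryption key" can be made concrete with a minimal sketch of the key derivation that the PDF standard security handler uses for its oldest scheme (revision 2, 40-bit RC4). This is not iText code: the class and method names are hypothetical, and the owner entry, permission value, and file ID in main are placeholders standing in for what a real tool would read from the file's encryption dictionary and trailer. Later revisions add steps and use longer keys, so treat this as an illustration rather than a full implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, not part of iText: derives the 40-bit file key
// (standard security handler, revision 2) from values stored in the PDF.
final class DefaultKeySketch {

    // 32-byte padding string defined by the PDF standard security handler.
    static final byte[] PAD = {
        (byte) 0x28, (byte) 0xBF, 0x4E, 0x5E, 0x4E, 0x75, (byte) 0x8A, 0x41,
        0x64, 0x00, 0x4E, 0x56, (byte) 0xFF, (byte) 0xFA, 0x01, 0x08,
        0x2E, 0x2E, 0x00, (byte) 0xB6, (byte) 0xD0, 0x68, 0x3E, (byte) 0x80,
        0x2F, 0x0C, (byte) 0xA9, (byte) 0xFE, 0x64, 0x53, 0x69, 0x7A
    };

    // Truncates the password to 32 bytes or fills it up with the padding string.
    static byte[] pad(byte[] password) {
        byte[] out = new byte[32];
        int n = Math.min(password.length, 32);
        System.arraycopy(password, 0, out, 0, n);
        System.arraycopy(PAD, 0, out, n, 32 - n);
        return out;
    }

    // MD5 over: padded user password, /O entry, /P (little-endian), file ID.
    // The first 5 bytes of the digest are the RC4 key for revision 2.
    static byte[] deriveKey(byte[] userPassword, byte[] ownerEntry,
                            int permissions, byte[] fileId) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update(pad(userPassword));
            md5.update(ownerEntry);
            md5.update(new byte[] {
                (byte) permissions, (byte) (permissions >>> 8),
                (byte) (permissions >>> 16), (byte) (permissions >>> 24)
            });
            md5.update(fileId);
            return Arrays.copyOf(md5.digest(), 5);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is unavailable", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder values; a real tool reads these from the PDF itself.
        byte[] ownerEntry = new byte[32];
        byte[] fileId = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);
        int permissions = -3900; // hypothetical /P value

        // Owner-password-only file: the user password is the empty string.
        byte[] key = deriveKey(new byte[0], ownerEntry, permissions, fileId);

        StringBuilder hex = new StringBuilder();
        for (byte b : key) {
            hex.append(String.format("%02x", b));
        }
        System.out.println("File key: " + hex);
    }
}
```

With an empty user password, every input to deriveKey comes from the file itself, so any program can compute the key, and the owner password never enters the calculation. That is why the length and complexity of the owner password made no difference to the removal tool.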

There are a few solutions (more or less related to iText), depending on your audience: simple PDF encryption (with the problems above) if your audience is widespread, for example when you publish papers on a website; a third-party DRM solution, which is more complex and requires various plugins installed on your users' computers; or certificate encryption (not sure if iText supports it), which is again complex and requires each user to have a digital certificate, with document access defined per user. The last two options work in a controlled enterprise environment.

iPDFdev
  • 1
    Thanks for your detailed answer! iText supports certificate encryption, but as you said, all this is quite complicated to manage, and the target audience is too widespread to enforce security that way. I was trying to improve on what already exists, but I think I will settle for the status quo. As Joachim explained too, the client still needs to be able to read the document with no restrictions, hence protection is merely goodwill on the part of the reading app. Thanks to both of you for your answers! – IceGras Jul 12 '11 at 14:35
  • 4
    Well, any solution to prevent copying the text will fail miserably as soon as you allow the user to see the document. Even if there were some fairy dust to stop them from copying the data, screenshots and OCR software will work perfectly fine. – Voo Jul 12 '11 at 19:17