
I am on the developer team of the R&OS PDF class for PHP and noticed some strange behavior in Adobe Reader XI (11.0.5).

When a PDF file includes a file identifier (the /ID entry in the trailer), Adobe Reader fails to find any text content when searching.

Once I remove the /ID entry, search works normally.

Search works in both cases in Foxit Reader and the Chrome PDF viewer.

Does anybody know why Adobe Reader (AAR) behaves like this?

I have also put both PDF files on pastebin.com so you can download and test them. Simply save each one with the extension ".pdf".

http://pastebin.com/an5NaZcv - search failed

http://pastebin.com/ZyFZNQ36 - search ok

BINARY FILE WHICH IS NOT WORKING: bug is fixed

I reported this as a bug in my application here: https://sourceforge.net/p/pdf-php/bugs/71/

Thank you in advance

Ole K
  • I can search both files for e.g. the word "gray" successfully using Adobe Reader 11.0.5; on the other hand, providing the files via pastebin, a service intended for textual data, might have destroyed some important aspect. I would advise using a file server for binary files. – mkl Nov 25 '13 at 13:07
  • The question is so specific that I think only Adobe can answer it. I agree with @mkl, pasting binary data into Pastebin might mess everything up – Noam Rathaus Nov 25 '13 at 13:16
  • Do you have `Fast Find` enabled in *Preferences*? – user2846289 Nov 25 '13 at 14:24
  • NOTE: these pdf files do not contain any binary stuff. So it should be safe to copy & paste. Anyway, I am going to upload the binaries... Thanks @mkl, which OS are you using? Maybe it's because I am using Windows 8.1?! – Ole K Nov 25 '13 at 18:55
  • I tested on Windows 7 in office. @VadimR's idea should also be checked. – mkl Nov 25 '13 at 19:15
  • *these pdf files do not contain any binary stuff.* - a common misconception. Pdf files are binary files even if they look like plain text. – mkl Nov 25 '13 at 19:18
  • Well, it actually once bit me... Disable Fast Find, purge search cache, restart Acrobat. Useful option for people, nasty trap for developers. – user2846289 Nov 25 '13 at 19:21
  • I'm now at a Win8.1 computer and I could search all the files with both Adobe Reader 11.0.4 and 11.0.5, and I could find 'gray' each time. Have you perhaps re-used an ID that was used in another document on your computer before? Maybe a prior version of the same PDF? BTW, the file made available for binary download did not make Adobe Reader want to repair it, while it wanted to repair the textual ones. – mkl Nov 25 '13 at 20:03

1 Answer

I am fairly sure I solved it by clearing the fast find cache in Adobe Reader.

My explanation of why this problem occurs:

Even without encryption, the fast find feature of Adobe Reader seems to use the /ID entry (if it is set) to identify the cached text content of a document. If a new document re-uses an /ID that is already in the cache, the search presumably runs against the stale cached text instead of the actual file.

Once I had cleared the fast find cache under "Preferences -> Search", I was able to search again.
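
To check whether two generated files really share an identifier, the /ID entry can be read straight from the file. The following is a minimal Python sketch (not part of the R&OS class). The names `read_file_ids` and `share_identifier` are made up for illustration, and it assumes an unencrypted file with a classic plain-text trailer and hex-string identifiers; files using cross-reference streams or literal-string IDs would need a real PDF parser.

```python
import re

# Matches a trailer entry such as: /ID [<0123abcd> <0123abcd>]
# Assumption: both identifiers are written as hex strings.
ID_PATTERN = re.compile(
    rb"/ID\s*\[\s*<([0-9A-Fa-f\s]*)>\s*<([0-9A-Fa-f\s]*)>\s*\]"
)


def _normalize(hex_bytes):
    """Strip whitespace inside the hex string and lowercase it."""
    return b"".join(hex_bytes.split()).lower().decode("ascii")


def read_file_ids(pdf_bytes):
    """Return (first_id, second_id) from the last /ID entry, or None if absent."""
    matches = ID_PATTERN.findall(pdf_bytes)
    if not matches:
        return None
    # With incremental updates the last trailer is the current one.
    first, second = matches[-1]
    return _normalize(first), _normalize(second)


def share_identifier(pdf_a, pdf_b):
    """True if both files carry an /ID and their first identifiers are equal."""
    ids_a = read_file_ids(pdf_a)
    ids_b = read_file_ids(pdf_b)
    return ids_a is not None and ids_b is not None and ids_a[0] == ids_b[0]
```

Reading each file with `open(path, "rb").read()` and passing the bytes to `share_identifier` would show whether two different documents were written with the same /ID.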

So in future I will use something similar to md5(#timestamp#) to make sure every document has its own unique fileIdentifier stored in the /ID entry of the document trailer.
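
A rough sketch of what such an identifier could look like, in Python for illustration only (it is not the R&OS implementation). Rather than hashing the timestamp alone, it follows the recommendation in the PDF specification to feed the current time, the file location, the file size and the document information values into a digest such as MD5. The function names and parameters are hypothetical.

```python
import hashlib
import time


def make_file_identifier(location, size_in_bytes, info, now=None):
    """Return a 32-character hex MD5 digest to use as a file identifier.

    location:      string describing where the file is written (path or URL)
    size_in_bytes: size of the file in bytes
    info:          dict with the document information entries (Title, Author, ...)
    now:           timestamp override, mainly useful for testing
    """
    timestamp = time.time() if now is None else now
    digest = hashlib.md5()
    digest.update(repr(timestamp).encode("utf-8"))
    digest.update(location.encode("utf-8"))
    digest.update(str(size_in_bytes).encode("utf-8"))
    # Sort the keys so the result does not depend on dict ordering.
    for key in sorted(info):
        digest.update(str(info[key]).encode("utf-8"))
    return digest.hexdigest()


def trailer_id_entry(identifier):
    """Format the /ID trailer entry; a newly created file uses the same value twice."""
    return "/ID [<{0}> <{0}>]".format(identifier)
```

For example, `trailer_id_entry(make_file_identifier("/tmp/report.pdf", 20480, {"Title": "Report"}))` yields a fresh `/ID [<...> <...>]` line on every run, because the current time is part of the input.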

Thank you for your hints

Regards, Ole

Ole K
  • *So in future I will use something similar to md5(#timestamp#) to make sure every document has its own unique fileIdentifier stored in /ID entry of the document trailer.* - I would propose following a routine as presented in the PDF specification. And an ID is an ID is an ID. – mkl Nov 25 '13 at 20:07
  • I fully agree, but I also think no one knew that this affects the fast find feature in AAR. Anyway, the problem is solved. – Ole K Nov 25 '13 at 22:32
  • It was new to me, too, but @VadimR seems to have experienced something like that before. ;-) – mkl Nov 26 '13 at 05:31