Adobe Reader XI search fails when pdf fileIdentifier (/ID) is set

Question

I am part of the developer team of the R&OS pdf class for PHP and noticed some strange behavior in Adobe Reader XI (11.0.5).

When a pdf file includes the fileIdentifier (the /ID entry in the trailer), Adobe Reader fails to find any of the text content when searching.

Once I remove the /ID entry, search works again.

Search in Foxit Reader and in the Chrome reader works in both cases.

Does anybody know why Adobe Reader (AAR) is behaving like this?

I have also put both pdf files on pastebin.com so you can download and test them. Simply save each one with the extension ".pdf".

http://pastebin.com/an5NaZcv - search failed

http://pastebin.com/ZyFZNQ36 - search ok

BINARY FILE WHICH IS NOT WORKING: bug is fixed

I reported this as a bug in my application here: https://sourceforge.net/p/pdf-php/bugs/71/

Thank you in advance

I can search both files for e.g. the word "gray" successfully using Adobe Reader 11.0.5; on the other hand, providing the files via pastebin, a service intended for textual data, might have destroyed some important aspect. I would advise using a file server for binary files. – mkl, Nov 25 '13 at 13:07
The question is so specific that I think only Adobe can answer it. I agree with @mkl, pasting binary data into Pastebin might mess everything up. – Noam Rathaus, Nov 25 '13 at 13:16
NOTE: these pdf files do not contain any binary stuff, so it should be safe to copy & paste. Anyway, I am going to upload the binaries... thanks @mkl. Which OS are you using? Maybe it's because I am using Windows 8.1?! – Ole K, Nov 25 '13 at 18:55
I tested on Windows 7 in the office. @VadimR's idea should also be checked. – mkl, Nov 25 '13 at 19:15
*these pdf files do not contain any binary stuff.* - a common misconception. Pdf files are binary files even if they look like plain text. – mkl, Nov 25 '13 at 19:18
Well, it actually once bit me... Disable Fast Find, purge search cache, restart Acrobat. Useful option for people, nasty trap for developers. – user2846289, Nov 25 '13 at 19:21
I'm now at a Win8.1 computer and I could search all the files with both Adobe Reader 11.0.4 and 11.0.5, and I could find 'gray' each time. Have you perhaps re-used an ID that was used in another document on your computer before? Maybe a prior version of the same PDF? BTW, the file made available for binary download did not make Adobe Reader want to repair it, while it wanted to repair the textual ones. – mkl, Nov 25 '13 at 20:03

score 0 · Accepted Answer · answered Nov 25 '13 at 19:34

I am pretty sure I solved it by clearing the fast find cache in Adobe Reader.

My explanation why this problem occurs:

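Even without encryption, the fast find feature of Adobe Reader apparently uses the /ID entry (if it is set) to identify the cached text content of a document. If a different document with the same /ID was opened before, the cache seems to hold the text of that older document, so the search looks in the wrong content.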
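To check whether two generated files really share an identifier, the /ID array can be read straight from the trailer. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the /ID is written as two hex strings (`<...>`), which is common but not the only allowed form.

```python
import re
from typing import Optional, Tuple

# Matches an /ID array written as two hex strings: /ID [<ab12...> <cd34...>]
# Assumption: IDs written as literal strings "(...)" are not handled here.
_ID_PATTERN = re.compile(
    rb"/ID\s*\[\s*<([0-9A-Fa-f\s]*)>\s*<([0-9A-Fa-f\s]*)>\s*\]"
)


def _clean(raw: bytes) -> str:
    """Drop whitespace inside a hex string and normalise to lowercase."""
    return b"".join(raw.split()).decode("ascii").lower()


def extract_file_id(pdf_bytes: bytes) -> Optional[Tuple[str, str]]:
    """Return the last /ID pair in the file as lowercase hex, or None.

    The last match is used because incremental updates append a new
    trailer at the end of the file.
    """
    matches = _ID_PATTERN.findall(pdf_bytes)
    if not matches:
        return None
    first, second = matches[-1]
    return _clean(first), _clean(second)


def share_id(pdf_a: bytes, pdf_b: bytes) -> bool:
    """True if both files carry an /ID and the first ID strings are equal."""
    id_a = extract_file_id(pdf_a)
    id_b = extract_file_id(pdf_b)
    return id_a is not None and id_b is not None and id_a[0] == id_b[0]
```

Running `share_id` over the output of two different documents from the same generator would show quickly whether the identifier is being re-used.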

Once I had cleared the fast find cache under "Preferences -> Search", I was able to search again.

So in future I will use something similar to md5(#timestamp#) to make sure every document has its own unique fileIdentifier stored in the /ID entry of the document trailer.
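As a rough illustration of that idea (not the actual code of the pdf class), the sketch below builds the identifier from more than just the timestamp. It follows the recommendation in the PDF specification to feed the current time, the file location, the file size and the values of the document information dictionary into MD5. The function names and parameters are hypothetical, and MD5 serves here only as an identifier, not as a security measure.

```python
import hashlib
import time
from typing import Mapping, Optional


def make_file_identifier(
    file_path: str,
    file_size: int,
    info: Mapping[str, str],
    now: Optional[float] = None,
) -> str:
    """Return a 32-character hex MD5 digest for use in the trailer /ID.

    `now` can be passed explicitly to make the result reproducible;
    by default the current time is used.
    """
    timestamp = time.time() if now is None else now
    digest = hashlib.md5()
    digest.update(repr(timestamp).encode("utf-8"))
    digest.update(file_path.encode("utf-8"))
    digest.update(str(file_size).encode("utf-8"))
    # Hash the info dictionary values in a stable key order.
    for key in sorted(info):
        digest.update(info[key].encode("utf-8"))
    return digest.hexdigest()


def format_id_entry(identifier: str) -> str:
    """Format the /ID array for a newly created file.

    For a new file both strings are the same; the second one is meant
    to change when the file is updated later.
    """
    return f"/ID [<{identifier}> <{identifier}>]"
```

Because the timestamp is only one of several inputs, two documents generated within the same clock tick still get different identifiers as long as their path, size or metadata differ.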

Thank you for your hints

Regards, Ole

Ole K

*So in future I will use something similar to md5(#timestamp#) to make sure every document has its own unique fileIdentifier stored in /ID entry of the document trailer.* - I would propose following a routine as presented in the PDF specification. And an ID is an ID is an ID. – mkl Nov 25 '13 at 20:07
I fully agree, but I also think no one knew that this affects the fast find feature in AAR. Anyway, the problem is solved. – Ole K Nov 25 '13 at 22:32
It was new to me, too, but @VadimR seems to have experienced something like that before. ;-) – mkl Nov 26 '13 at 05:31