Please have a look at the issue below.
1 - Apply MD5 to a .txt file containing "Hello" (without quotes, length = 5). It gives some hash value (say h1).
2 - Now the file content is changed to "Hello " (without quotes, with a trailing space, length = 6). It gives another hash value (say h2).
3 - Now the file is changed back to "Hello" (exactly as in step 1). The hash is h1 again, which makes sense.
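For reference, the three steps above can be reproduced with a short Python sketch. I actually used a freeware tool, so the script, the file name test.txt and the helper names md5_of_file and write_and_hash are only illustrative assumptions:

```python
import hashlib
import os
import tempfile


def md5_of_file(path):
    """Return the MD5 hex digest of the raw bytes stored in a file."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def write_and_hash(path, content):
    """Overwrite the file with the given bytes, then hash the file."""
    with open(path, "wb") as f:
        f.write(content)
    return md5_of_file(path)


# Hypothetical scratch file standing in for the .txt file from the steps.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "test.txt")

h1 = write_and_hash(path, b"Hello")        # step 1
h2 = write_and_hash(path, b"Hello ")       # step 2: trailing space added
h1_again = write_and_hash(path, b"Hello")  # step 3: reverted to step 1

print("step 1:", h1)
print("step 2:", h2)
print("step 3:", h1_again)
print("step 1 == step 3:", h1 == h1_again)
```

The hash depends only on the bytes stored in the file, so steps 1 and 3 agree whenever the bytes on disk are identical.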
Now the problem comes when the same procedure is applied to a .pdf file. Here, rather than changing the file content, I am changing the colour of the text and then reverting it back to the original. Doing this, I get three different hash values, one per step.
So, is the hash different because of the way the PDF reader encodes the text and metadata, or is the analogy itself wrong?
Info: I am using a freeware tool on Windows to calculate the hash.