
I have been experimenting with file sizes because I use CheckSum to avoid creating duplicates of the same file. CheckSum works exactly as I would expect. The problem is that files with the same content end up with different sizes. For example, say I have a docx file that contains my first name, "Szymon", and its size is 436,854 bytes. If I delete "Szymon" from the document and type it again in exactly the same way, the saved file has a slightly different size: 436,875 bytes instead of the initial 436,854 bytes, a difference of about 20 bytes. Why does this happen, given that both docx files contain exactly the same content?

Thanks in advance

Szymon
  • A Word document contains a lot more data than the text you typed. Actually it's just a zip file with some files in it for your text and so on. Maybe there is some undo info in it that might be the reason the file size has changed. – SBF Dec 07 '17 at 13:58
  • Thanks. The cause turned out to be the way the document file is constructed: whenever it was slightly edited, its internal structure changed, which affected the file size even though the text stayed the same. – Szymon Mar 20 '18 at 11:55
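
Since a .docx is a zip archive, as the comments above note, one way to detect duplicates by content rather than by raw bytes is to hash only the text stored inside it. The following is a minimal Python sketch, not the CheckSum approach from the question. It assumes the body text lives in `word/document.xml` inside `<w:t>` elements, and the function name `docx_text_digest` is made up for illustration. It ignores headers, footnotes, images and formatting, so two documents with the same text but different styling would be treated as equal.

```python
import hashlib
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by paragraphs (<w:p>) and text runs (<w:t>)
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_text_digest(source):
    """Return a SHA-256 hex digest of the body text of a .docx file.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper: only word/document.xml is inspected.
    """
    # A .docx is a zip archive; the main body is assumed to be word/document.xml
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    digest = hashlib.sha256()
    for paragraph in root.iter(W_NS + "p"):
        # Join all text runs so that a word split across runs hashes the same
        text = "".join(node.text or "" for node in paragraph.iter(W_NS + "t"))
        digest.update(text.encode("utf-8"))
        digest.update(b"\n")  # keep paragraph boundaries significant
    return digest.hexdigest()
```

With two files on disk (the names here are placeholders), `docx_text_digest("first.docx") == docx_text_digest("second.docx")` would then compare the typed text only, regardless of small differences in the archive's size or internal markup.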

0 Answers