My users tend to save tons of duplicate files, which consumes more and more space and generates hardware and archiving costs.
I'm thinking of creating a scheduled job to do the following (a rough sketch follows the list):
- find duplicate files (compare MD5 checksums, not just filename / size)
- keep only one original file
- replace each redundant copy with a link (shortcut) to the original file from the point above
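A minimal sketch of such a job in Python (my assumptions: a single Windows/NTFS volume, Python 3.8+, and a placeholder root path): group files by size first so only potential duplicates get hashed, keep the first file found as the original, and replace every further copy with an NTFS hardlink via `os.link`.

    import hashlib
    import os
    from collections import defaultdict

    ROOT = r"D:\user_data"  # placeholder: directory tree to deduplicate

    def md5sum(path, chunk=1 << 20):
        """Hash file contents in 1 MiB chunks to avoid loading big files into RAM."""
        h = hashlib.md5()
        with open(path, "rb") as f:
            while block := f.read(chunk):
                h.update(block)
        return h.hexdigest()

    # Pass 1: group by size -- files with a unique size cannot be duplicates,
    # so only the remaining groups need to be hashed.
    by_size = defaultdict(list)
    for dirpath, _, filenames in os.walk(ROOT):
        for name in filenames:
            p = os.path.join(dirpath, name)
            by_size[os.path.getsize(p)].append(p)

    # Pass 2: within each size group, group by MD5 and hardlink the duplicates.
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for p in paths:
            by_hash[md5sum(p)].append(p)
        for original, *copies in by_hash.values():
            for dup in copies:
                os.remove(dup)          # drop the redundant copy...
                os.link(original, dup)  # ...and point its name at the original

Two caveats with this sketch: hardlinks only work within one NTFS volume, so the tree being scanned must sit on a single drive, and the remove-then-link step is not atomic, so a production version should link to a temporary name first and then rename it over the duplicate.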
Any idea how to achieve that?
A script, a tool, or any tips?
EDIT 28.10.2021
In the meantime I've found finddupe: https://www.sentex.ca/~mwandel/finddupe/
It can create hardlinks to the original files. I've tried it: it correctly reports what is duplicated and appears to create the hardlinks, but... I can't see any difference in the HDD usage stats afterwards.
Why is that? Could it be that Windows calculates free space incorrectly?
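For reference, here is a small check I could run to confirm whether two names really point at the same file, and to read free space from the OS instead of eyeballing Explorer (a sketch; the paths are placeholders):

    import os
    import shutil

    a = r"D:\user_data\report.docx"           # placeholder paths: two names
    b = r"D:\user_data\copy_of_report.docx"   # suspected to be hardlinked

    sa, sb = os.stat(a), os.stat(b)
    # Two names are hardlinks to the same file iff they share device and file id.
    same_file = (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)
    print(f"same underlying file: {same_file}, link count: {sa.st_nlink}")

    # Read free space directly instead of relying on Explorer's folder stats.
    usage = shutil.disk_usage("D:\\")
    print(f"free: {usage.free / 2**30:.1f} GiB of {usage.total / 2**30:.1f} GiB")

On the command line, `fsutil hardlink list <file>` should likewise show all names pointing at the same data.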