I am writing some image processing code in which I download images (as BufferedImage) from URLs and pass them on to an image processor.
I want to avoid passing the same image to the image processor more than once, as the processing operation is expensive. The URLs of identical images may differ, so I cannot detect duplicates by URL. I was therefore planning to compute a checksum or hash to identify whether the code is encountering the same image again.
For MD5 I tried Fast MD5, and it generated a hex checksum more than 20K characters long for a sample image. Obviously, storing a 20K+ character hash in a database would be a problem. So I tried CRC32 (from java.util.zip.CRC32) instead, and it generated a much shorter checksum than the hash.
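To make this concrete, here is a minimal sketch of the kind of fingerprinting I have in mind. It is not my actual code: the class name ImageFingerprint and its methods are hypothetical, it uses the JDK's java.security.MessageDigest rather than Fast MD5, and it assumes that two images count as "the same" only when their dimensions and decoded ARGB pixels match exactly.

```java
import java.awt.image.BufferedImage;
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

// Hypothetical helper, for illustration only.
public class ImageFingerprint {

    // Serializes width, height and every pixel (as an ARGB int) into a byte array.
    // Assumption: identical images have identical dimensions and decoded pixels.
    static byte[] pixelBytes(BufferedImage img) {
        int w = img.getWidth();
        int h = img.getHeight();
        int[] argb = img.getRGB(0, 0, w, h, null, 0, w);
        ByteBuffer buf = ByteBuffer.allocate(8 + argb.length * 4);
        buf.putInt(w).putInt(h);
        for (int p : argb) {
            buf.putInt(p);
        }
        return buf.array();
    }

    // MD5 over the pixel bytes, hex-encoded. The digest is 16 bytes,
    // so the resulting string is 32 characters long.
    static String md5Hex(BufferedImage img) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(pixelBytes(img));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // CRC32 over the same pixel bytes; the value fits in 32 bits.
    static long crc32(BufferedImage img) {
        CRC32 crc = new CRC32();
        crc.update(pixelBytes(img));
        return crc.getValue();
    }
}
```

Either value could then be stored next to the processed result and looked up before sending a newly downloaded image to the processor.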
I understand that checksums and hashes serve different purposes. For the purpose explained above, can I just use CRC32? Would it be sufficient, or do I have to try something other than these two?
Thanks, Abi