I am sending thousands of images from one system to another over FTP. Initially, I'll transfer all the images, but later on I want to send only the images that have changed.
I haven't found a reliable way to detect changed images based on the last-modified timestamp on Windows, so I decided on the following approach:
1.) Generate checksums for all the files and store them somewhere, for example in a database or on the filesystem.
2.) Every time I send files to another system, compare the checksums and send only the files which have different checksums.
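The two steps above can be sketched roughly as follows. This is only a minimal illustration under some assumptions: it uses the JDK's `MessageDigest` instead of Commons Codec to stay dependency-free, it keeps the checksums in an in-memory map rather than a database or file, and the class and method names (`ChecksumManifest`, `snapshot`, `changedSince`) are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper illustrating the two-step checksum approach.
public class ChecksumManifest {

    // Streams the file through SHA-256 and returns the digest as lowercase hex.
    public static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Step 1: map each regular file under root (by relative path) to its checksum.
    public static Map<String, String> snapshot(Path root) throws IOException, NoSuchAlgorithmException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        Map<String, String> checksums = new HashMap<>();
        for (Path file : files) {
            checksums.put(root.relativize(file).toString(), sha256Hex(file));
        }
        return checksums;
    }

    // Step 2: return files that are new or whose checksum differs from the previous snapshot.
    public static List<String> changedSince(Map<String, String> previous, Map<String, String> current) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, String> entry : current.entrySet()) {
            if (!entry.getValue().equals(previous.get(entry.getKey()))) {
                changed.add(entry.getKey());
            }
        }
        Collections.sort(changed);
        return changed;
    }

    // Small demo on a temporary directory; the file names are placeholders.
    public static void main(String[] args) throws Exception {
        Path root = Files.createTempDirectory("images");
        Files.write(root.resolve("a.jpg"), new byte[] {1, 2, 3});
        Files.write(root.resolve("b.jpg"), new byte[] {4, 5, 6});
        Map<String, String> before = snapshot(root);

        // Simulate an update to one image, then take a second snapshot.
        Files.write(root.resolve("b.jpg"), new byte[] {4, 5, 7});
        Map<String, String> after = snapshot(root);

        System.out.println(changedSince(before, after)); // prints [b.jpg]
    }
}
```

In a real transfer job, the previous snapshot would have to be persisted between runs, and deleted files (present in the old snapshot but not the new one) would need separate handling, which this sketch omits.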
To test this approach, I generated checksums (SHA-256 and MD5) for two different images, and the checksums were the same.
Here is the sample code:
package com.test;

import java.io.FileInputStream;
import java.io.IOException;
import org.apache.commons.codec.digest.DigestUtils;

public class TestHash {
    public static void main(String[] args) throws IOException {
        String checksumSHA256 = DigestUtils.sha256Hex(new FileInputStream("monkey_11.jpg"));
        System.out.println("checksumSHA256 : " + checksumSHA256);
        String checksumMD5 = DigestUtils.md5Hex(new FileInputStream("monkey_11.jpg"));
        System.out.println("checksumMD5 : " + checksumMD5);
        String checksumSHA256_1 = DigestUtils.sha256Hex(new FileInputStream("monkey.jpg"));
        System.out.println("checksumSHA256 : " + checksumSHA256_1);
        String checksumMD5_1 = DigestUtils.md5Hex(new FileInputStream("monkey.jpg"));
        System.out.println("checksumMD5 : " + checksumMD5_1);
    }
}
Why are the checksums the same? Is there another way to identify updated images?