
I am sending thousands of images from one system to another over FTP. Initially I'll transfer all of the images, but afterwards I want to send only the images that have changed.

I haven't found a reliable way to detect changed images from the last-modified timestamp on Windows, so I decided on the following approach:

1.) Generate checksums for all the files and store them somewhere, for example in a database or on the filesystem.

2.) Every time I send files to another system, compare the checksums and send only the files which have different checksums.
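For illustration, the two steps could be sketched roughly as below. This is only a minimal sketch under some assumptions: it uses the JDK's `MessageDigest` instead of `DigestUtils` to stay dependency-free, `ChangedFileFinder` and its method names are hypothetical, files are keyed by name within a single flat directory, and persisting the checksum map (database or file) is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of any library.
final class ChangedFileFinder {

    // Hex-encoded SHA-256 of the file's bytes, read in chunks.
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Step 1: map each regular file in dir (keyed by file name) to its checksum.
    static Map<String, String> snapshot(Path dir) throws IOException, NoSuchAlgorithmException {
        Map<String, String> checksums = new HashMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    checksums.put(file.getFileName().toString(), sha256Hex(file));
                }
            }
        }
        return checksums;
    }

    // Step 2: names that are new, or whose checksum differs from the stored one.
    static List<String> changed(Map<String, String> previous, Map<String, String> current) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : current.entrySet()) {
            if (!entry.getValue().equals(previous.get(entry.getKey()))) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

The idea would be to call `snapshot` before each transfer, send whatever `changed` returns against the previously stored map, and then store the new map for the next run.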

To test this, I generated checksums (SHA-256 and MD5) for two different images, and the checksums were the same.

Following is the sample code:

package com.test;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.commons.codec.digest.DigestUtils;

public class TestHash {
    public static void main(String[] args) throws IOException {
        // Hash each file with SHA-256 and MD5, closing every stream after use.
        for (String name : new String[] {"monkey_11.jpg", "monkey.jpg"}) {
            try (InputStream in = new FileInputStream(name)) {
                System.out.println("checksumSHA256 (" + name + ") : " + DigestUtils.sha256Hex(in));
            }
            try (InputStream in = new FileInputStream(name)) {
                System.out.println("checksumMD5 (" + name + ") : " + DigestUtils.md5Hex(in));
            }
        }
    }
}

Why are the checksums the same? Is there another way to identify updated images?

miserable
  • If the checksums are the same, the files are the same. There's no chance you've found two different images with the same SHA-256 and MD5 hashes. If you think you have, upload the images so we can check them ourselves. – John Kugelman Oct 16 '18 at 02:48
  • By the way, it sounds like you're reinventing [Rsync](https://en.wikipedia.org/wiki/Rsync) and its [alternatives](https://serverfault.com/questions/24622/how-to-use-rsync-over-ftp). – John Kugelman Oct 16 '18 at 02:48
  • Yes, the files are the same. I thought the file name was also taken into account when generating the checksum. :) – miserable Oct 16 '18 at 03:38
  • Regarding Rsync: we will be using MFTP, as that's what our company uses for other transfers as well. – miserable Oct 16 '18 at 03:40
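
The point settled in the comments, that a checksum depends only on the file's bytes and not on its name, can be checked with a small sketch like the one below. It assumes the JDK's `MessageDigest` rather than `DigestUtils`, and `NameIndependentHash`, `sameDigest`, and the temporary files it creates are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class; the files are created in a temp directory.
final class NameIndependentHash {

    // True when both files produce the same SHA-256 digest.
    static boolean sameDigest(Path a, Path b) throws IOException, NoSuchAlgorithmException {
        MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
        // digest(byte[]) hashes the input and resets the instance for reuse.
        byte[] first = sha256.digest(Files.readAllBytes(a));
        byte[] second = sha256.digest(Files.readAllBytes(b));
        return MessageDigest.isEqual(first, second);
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("hash-demo");
        Path original = Files.write(dir.resolve("monkey.jpg"), new byte[] {10, 20, 30});
        // Same bytes stored under a different name.
        Path copy = Files.copy(original, dir.resolve("monkey_11.jpg"));
        System.out.println(sameDigest(original, copy));
    }
}
```

Reading the whole file with `Files.readAllBytes` keeps the sketch short; for large images a streaming read would likely be the better choice.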
