I have a customer's broken application that I am trying to fix, but the original code has been lost, so I am trying to reverse-engineer it. It involves downloading 140 blocks of data, 768 bytes per block, to a device. The data contains a 16-byte "checksum" that presumably covers all 140 blocks, or possibly some small subset (there is no way to know).

If I change as little as 1 bit in the data, all 16 bytes of the "checksum" change. What I'm looking for are ideas about how this "checksum" might be calculated.

Example:

In block 24 at offset 0x116 I change two bytes from 0xe001 to 0xe101, and the "checksum" data changes from:

53 20 5a 20 3e f5 38 72 eb d7 f4 3c d9 0a 3f c5

to this:

7f fe ad 1f cc c3 1e 3c 22 0a bf 6a 6d 03 ad 97

I can experiment if I have some clue how they might be calculating this "checksum".

Looking for any idea to get me started.

bolov
Sinbad
  • Most hashing algorithms do exactly this, and there's an infinite number of them :( Without a series of inputs->outputs, I doubt we can be of assistance. You could go down a list of standard checksum algorithms, and hope to stumble across the matching configuration, but you'd probably have better luck reverse engineering the binary. – Mooing Duck Sep 29 '18 at 00:33
  • 128-bit checksum -- md5? That used to be pretty common though is considered weak now. – Chris Dodd Sep 29 '18 at 00:34
  • First check the commonly used hashes (MD5, SHA, there's a lot of them). If none matches, then you'll need to find the hashing code in the exe, and identify/decipher its algorithm. – geza Sep 29 '18 at 00:36
  • https://en.wikipedia.org/wiki/List_of_hash_functions – Chris Dodd Sep 29 '18 at 00:37
  • If it is a good checksum, you will never figure out what it is from examining samples unless it is one of the standard algorithms. And even if it is a standard algorithm, if it has a salt or nonce, you will not figure out what it is. A better approach would be to examine the machine instructions (preferably disassembled to assembly, of course). – Eric Postpischil Sep 29 '18 at 00:43
  • If there are 140 different hashes, then they are probably calculated independently for each block. – phuclv Sep 29 '18 at 02:44

1 Answer

Partial answer:

"Presumably covers all 140 blocks or possibly some small subset (no way to know)"

"If I change as little as 1 bit in the data, the entire 16 bytes of "checksum" changes."

Iterate through each bit of the data, modifying each in turn, and see if the checksum changes. Then you can figure out which subset (if any) is being hashed.
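The bit-flipping probe could be sketched roughly as below. This is only an illustration: `checksum_of` stands for whatever mechanism you have for obtaining the 16-byte value for a given image (a hypothetical callable, since the question does not say how the value is produced), and `demo_checksum` is a made-up stand-in that hashes block 1 only, so the sketch can run on its own. The block size and count are taken from the question.

```python
import hashlib
from typing import Callable, Iterable, List, Tuple

BLOCK_SIZE = 768   # bytes per block, from the question
NUM_BLOCKS = 140   # number of blocks, from the question


def sensitive_bits(data: bytes,
                   checksum_of: Callable[[bytes], bytes],
                   offsets: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (byte offset, bit index) pairs whose flip changes the checksum."""
    baseline = checksum_of(data)
    work = bytearray(data)
    hits = []
    for off in offsets:
        for bit in range(8):
            work[off] ^= 1 << bit              # flip one bit
            if checksum_of(bytes(work)) != baseline:
                hits.append((off, bit))
            work[off] ^= 1 << bit              # restore the original bit
    return hits


def demo_checksum(data: bytes) -> bytes:
    """Stand-in oracle (assumption for the demo): MD5 over block 1 only."""
    return hashlib.md5(data[BLOCK_SIZE:2 * BLOCK_SIZE]).digest()


if __name__ == "__main__":
    image = bytes(NUM_BLOCKS * BLOCK_SIZE)
    # Coarse pass: probe only the first byte of every block.
    hits = sensitive_bits(image, demo_checksum,
                          range(0, len(image), BLOCK_SIZE))
    print(sorted({off // BLOCK_SIZE for off, _ in hits}))
```

A coarse pass over one byte per block keeps the number of trials small; once the affected blocks are known, rerunning with `range(start, stop)` over just those blocks narrows down the exact byte range.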

Once you know that, you will be able to come up with known input/output pairs. You can plug a known input into a tool that generates outputs using well-known hash algorithms (for example, this one: http://www.toolsvoid.com/multi-hash-generator - just the first hit I got from Google, not necessarily the most comprehensive).

EDIT: As per comments on the question, this will not work if there is a salt. But at least it will quickly rule out the simplest cases.
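As a local alternative to an online tool, a known input can be run through a handful of standard algorithms with Python's `hashlib`. The sketch below is only a starting point under some assumptions: the candidate list is my own guess at plausible 128-bit outputs (MD5, plus truncated or short-digest variants of other hashes), `matching_hashes` is a hypothetical helper name, and the payload is assumed to be exactly the bytes the probe identified, with the 16 checksum bytes themselves excluded.

```python
import hashlib
from typing import Callable, Dict, List

# Candidate ways of producing 16 bytes from a payload (an illustrative guess,
# not an exhaustive list).
CANDIDATES: Dict[str, Callable[[bytes], bytes]] = {
    "md5": lambda d: hashlib.md5(d).digest(),
    "sha1[:16]": lambda d: hashlib.sha1(d).digest()[:16],
    "sha256[:16]": lambda d: hashlib.sha256(d).digest()[:16],
    "blake2b-128": lambda d: hashlib.blake2b(d, digest_size=16).digest(),
    "blake2s-128": lambda d: hashlib.blake2s(d, digest_size=16).digest(),
}

# The "before" checksum quoted in the question.
OBSERVED = bytes.fromhex("53 20 5a 20 3e f5 38 72 eb d7 f4 3c d9 0a 3f c5")


def matching_hashes(payload: bytes, observed: bytes) -> List[str]:
    """Return the names of candidates whose output equals the observed bytes."""
    return [name for name, fn in CANDIDATES.items() if fn(payload) == observed]


if __name__ == "__main__":
    # Self-check with a payload whose MD5 is known by construction.
    sample = b"known input"
    print(matching_hashes(sample, hashlib.md5(sample).digest()))
    # With real data: matching_hashes(covered_bytes, OBSERVED)
```

An empty result does not rule out a standard algorithm: a salt, a different byte order, or a payload boundary that is off by a few bytes would all produce a mismatch.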

Community
JBentley