
I have a set of data protected by 16-bit checksums that I need to correct. The checksum locations are known, but the exact ranges they are calculated over and the exact algorithm used to calculate them are not. The checksums are 16 bits wide and stored LSB first. I suspect it's some sort of 16-bit CRC, but I have not been able to find the code that actually calculates the checksums.

Example:

00    4E00FFFF26EC14091E00A01830393630  
10    30313131313030393030363030313030  
20    30303131313030393030363030313030  
30    30303131313030393030363030313030  
40    3030FFFF225E363436304D313037**0CE0**  
50    64000000000000008080808080800000  
60    00000000**BE6E**FC01E001EB0013010500  

Checksums are stored at 4E and 64. I don't know whether they are calculated starting from the offset in the first word at the beginning of each data section, starting after that word, or over the whole range. I have tried a number of common CRC algorithms and polynomials with no luck. There are no references or specifications available for this application.

Here is another data section with different CRCs for comparison's sake.

00    4E00FFFF26C014091600A01030393132  
10    30313131313030393030313230313030  
20    30303131313030393030313230313030  
30    30303131313030393030313230313030  
40    3030FFFF225E343231324F313044**8348**  
50    64000000000000008080808080800000  
60    00000000**72F8**E001EB00130105000E01  

My question is, can anyone identify the algorithm? Is there any way to calculate the CRC polynomial and other factors from the data and the CRC?

Thanks!

Edit:

A search of my disassembly for the common CRC16 polynomial 0xA001 revealed this function:

34F86 ; =============== S U B R O U T I N E =======================================
34F86
34F86
34F86 Possible_Checksum:                    ; CODE XREF: MEM_EXT_4:00034FEEP
34F86                                         ; MEM_EXT_4:0003503AP ...
34F86                 mov     [-r0], r9       ; Move Word
34F88                 mov     r4, r12         ; Move Word
34F8A                 mov     r5, r13         ; Move Word
34F8C                 shr     r4, #14         ; Shift Right
34F8E                 shl     r5, #2          ; Shift Left
34F90                 or      r5, r4          ; Logical OR
34F92                 mov     r4, r12         ; Move Word
34F94                 mov     DPP0, r5        ; Move Word
34F98                 and     r4, #3FFFh      ; Logical AND
34F9C                 movb    rl3, [r4]       ; Move Byte
34F9E                 mov     DPP0, #4        ; Move Word
34FA2                 movbz   r9, rl3         ; Move Byte Zero Extend
34FA4                 mov     r15, #0         ; Move Word
34FA6
34FA6 loc_34FA6:                              ; CODE XREF: MEM_EXT_4:00034FC8j
34FA6                 mov     r4, [r14]       ; Move Word
34FA8                 xor     r4, r9          ; Logical Exclusive OR
34FAA                 and     r4, #1          ; Logical AND
34FAC                 jmpr    cc_Z, loc_34FBA ; Relative Conditional Jump
34FAE                 mov     r4, [r14]       ; Move Word
34FB0                 shr     r4, #1          ; Shift Right
34FB2                 xor     r4, #0A001h     ; Logical Exclusive OR
34FB6                 mov     [r14], r4       ; Move Word
34FB8                 jmpr    cc_UC, loc_34FC0 ; Relative Conditional Jump
34FBA ; ---------------------------------------------------------------------------
34FBA
34FBA loc_34FBA:                              ; CODE XREF: MEM_EXT_4:00034FACj
34FBA                 mov     r4, [r14]       ; Move Word
34FBC                 shr     r4, #1          ; Shift Right
34FBE                 mov     [r14], r4       ; Move Word
34FC0
34FC0 loc_34FC0:                       
Boann
mattbarn
  • What is the context of this? Homework? Reverse engineering, which devices? – starblue Dec 30 '08 at 19:38
  • Reverse engineering. It's a microcontroller with a Siemens SAB80C166W. An automotive engine controller, if it matters. – mattbarn Dec 30 '08 at 19:41
  • Are you able to provide the executable that performed the checksum? – codelogic Dec 30 '08 at 20:00
  • What's the application, then? – starblue Dec 30 '08 at 20:09
  • @codelogic Well, the executable is a 256kB binary file. I could provide it if it would help. I have begun disassembling it, but have not been able to find out how/where the checksums are calculated. – mattbarn Dec 30 '08 at 20:15
  • @starblue The application is an engine controller (aka ECU or DME); do I need to be more specific? The firmware calculates checksums on the calibration data stored in the flash memory to detect errors (and people like me trying to modify it). – mattbarn Dec 30 '08 at 20:19
  • Well, it doesn't seem to be an 8-bit sum, 16-bit sum, 16-bit XOR or CRC-CCITT. That's all I'm willing to try! ;) – Judge Maygarden Dec 30 '08 at 20:32
  • If the purpose were just to detect errors I'd suppose the standard CRC-16 (polynomial 0xa001). – starblue Dec 30 '08 at 20:37
  • How do you know that those locations are the checksums? Just curious. – e.James Dec 30 '08 at 20:50
  • Here is the binary: http://mthreemfive.com/Data/mattprogfull.bin It is compiled for a Siemens SAB80C166W and the data section in question starts at 0x10000. I am about to edit the original post with another piece of data relating to the polynomial 0xA001. Thank you all! – mattbarn Dec 30 '08 at 20:54
  • @eJames: The pattern fits the entire data section, without exception. It also helps to know that everything else I'm looking at is NOT a checksum (thanks to disassembly and prior experience). – mattbarn Dec 30 '08 at 20:56

2 Answers


The code you posted from loc_34FA6 down is basically the following:

unsigned short
crc16_update(unsigned short crc, unsigned char nextByte)
{
    crc ^= nextByte;

    for (int i = 0; i < 8; ++i) {
        if (crc & 1)
            crc = (crc >> 1) ^ 0xA001;
        else
            crc = (crc >> 1);
    }

    return crc;
}

This is a CRC-16 with the 0xA001 polynomial (the bit-reversed form of 0x8005). Once you find out the range of data to which the CRC-16 applies, initialize the CRC to 0xFFFF and then call this function for each byte in the sequence, passing the return value back in as the CRC for the next byte. The value returned after the last byte is your final CRC.
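
As a rough sketch of that procedure, the per-byte routine can be wrapped in a loop over a candidate range and compared against the stored value. The helper names `crc16_range` and `find_crc_start` are made up for illustration, and the assumptions that the checksum covers the bytes immediately before its own location and is stored LSB first come from the question, not from the firmware:

```c
#include <stddef.h>
#include <stdint.h>

/* One byte of the reflected CRC-16 (polynomial 0xA001), same logic as above. */
static uint16_t crc16_update(uint16_t crc, uint8_t next_byte)
{
    crc ^= next_byte;
    for (int i = 0; i < 8; ++i)
        crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0xA001) : (uint16_t)(crc >> 1);
    return crc;
}

/* Hypothetical helper: CRC over data[0..len), starting from `init`. */
static uint16_t crc16_range(const uint8_t *data, size_t len, uint16_t init)
{
    uint16_t crc = init;
    for (size_t i = 0; i < len; ++i)
        crc = crc16_update(crc, data[i]);
    return crc;
}

/* Hypothetical search: assume the checksum at crc_off (stored LSB first) covers
 * data[start..crc_off) for some unknown start, and return the first start that
 * reproduces the stored value, or -1 if none does. */
static long find_crc_start(const uint8_t *data, size_t crc_off, uint16_t init)
{
    uint16_t stored = (uint16_t)(data[crc_off] | (data[crc_off + 1] << 8));
    for (size_t start = 0; start < crc_off; ++start)
        if (crc16_range(data + start, crc_off - start, init) == stored)
            return (long)start;
    return -1;
}
```

It is worth trying both 0xFFFF and 0x0000 as `init`, since the posted disassembly does not show the initial value. A 16-bit checksum matches by chance about once in 65536 tries, so any hit should be confirmed against the second data section.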

I'm not sure what the prologue is doing...
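
Reading the prologue instruction by instruction, it looks like address arithmetic rather than part of the CRC itself. This is a tentative sketch, assuming r13 and r12 hold the high and low words of a byte address and that DPP0 selects a 16 KB data page. The struct and function names are invented for illustration:

```c
#include <stdint.h>

/* Hypothetical representation of a paged address: a 16 KB page number
 * (the value written to DPP0) and a 14-bit offset within that page. */
typedef struct {
    uint16_t page;
    uint16_t offset;
} paged_addr;

/* Mirrors the prologue's register arithmetic, under the assumption that
 * r13:r12 form a linear byte address (not confirmed by the source). */
static paged_addr split_address(uint16_t r13, uint16_t r12)
{
    paged_addr p;
    /* shr r4,#14 ; shl r5,#2 ; or r5,r4 ; mov DPP0,r5 */
    p.page = (uint16_t)((r13 << 2) | (r12 >> 14));
    /* and r4,#3FFFh ; movb rl3,[r4] */
    p.offset = (uint16_t)(r12 & 0x3FFF);
    return p;
}
```

Under that reading, the routine fetches a single byte from the given address, restores DPP0 to 4, and then feeds the byte into the CRC word that r14 points to.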

Judge Maygarden
  • Thank you very much! I am trying combinations of ranges now. I'll refocus on the disassembly also, since you've told me that piece is what I thought it was. I believe the first section of the code that I posted is moving the DPP0 (data page pointer) to the beginning of the checksummed range. – mattbarn Dec 30 '08 at 22:18
  • Maybe I'm missing something, but it looks like this section only processes one byte. The return (rets) occurs after iterating over 8 bits in one byte. – Judge Maygarden Dec 30 '08 at 22:33
  • It definitely looks like you're right. I'll need to move up the xref chain from here to see how this function gets called. – mattbarn Dec 30 '08 at 22:56
  • The prologue sets up the data page pointer (DPP0) in order to access that initial byte. r13/r12 are used to pass a 30-bit pointer, which probably means that this particular byte is located in another device (an external EEPROM would make sense) within the controller – e.James Dec 30 '08 at 23:14
  • Thank you, James. This particular CPU does not have any internal ROM so it's loading all of this data and code off of the flash chip. – mattbarn Dec 30 '08 at 23:26

More generally, part of the concept of a CRC is that when you compute the CRC of some data file and then append the CRC to the end, you get a file whose CRC is some value that depends on the length of the file, but not its contents. (For some CRC algorithms, it doesn't even depend on the file length.)

So, if you suspect the app you're trying to reverse-engineer is using, say, CRC16, and you have a program that computes CRC16, and you have multiple samples of the same length, just compute the CRC16 of those data files (including the stored checksum). If it comes back with the same value every time (for files of the same length), then they must contain a CRC checksum using the same width and polynomial.
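
A minimal sketch of that test for the plain reflected CRC-16 (polynomial 0xA001, no final XOR), with the checksum appended low byte first as in the question. The helper name `crc16_residue` is made up for illustration; for this particular variant the constant works out to zero, whatever the initial value:

```c
#include <stddef.h>
#include <stdint.h>

/* One byte of the reflected CRC-16 (polynomial 0xA001). */
static uint16_t crc16_update(uint16_t crc, uint8_t next_byte)
{
    crc ^= next_byte;
    for (int i = 0; i < 8; ++i)
        crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0xA001) : (uint16_t)(crc >> 1);
    return crc;
}

/* Hypothetical helper: run the CRC over a record that already includes its
 * stored checksum (the last two bytes, low byte first). If the record really
 * uses this CRC over this range, the result is 0 regardless of the data. */
static uint16_t crc16_residue(const uint8_t *record, size_t len_with_crc, uint16_t init)
{
    uint16_t crc = init;
    for (size_t i = 0; i < len_with_crc; ++i)
        crc = crc16_update(crc, record[i]);
    return crc;
}
```

If several candidate ranges all give the same nonzero result instead, that hints at a variant with a final XOR or a different byte order for the stored value.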

For example, I once had to reverse engineer some files where the developer thought he was being clever by modifying the CRC32 algorithm with two changed constants. I didn't have to find the object code that verified the checksum, disassemble it, and then figure it out the hard way. This simple test nailed it.

Die in Sente