CRC calculation: Polynomial division with bytewise message XOR-ing?

Question

Trying to understand what is meant in this bit of pseudocode from "Code fragment 3" in the following Wikipedia page:

function crc(byte array string[1..len], int len) {
    remainderPolynomial := 0
    // A popular variant complements remainderPolynomial here; see § Preset to −1 below
    for i from 1 to len {
        remainderPolynomial := remainderPolynomial xor polynomialForm(string[i]) * x^(n−8)
        for j from 1 to 8 {    // Assuming 8 bits per byte
            if coefficient of x^(n−1) of remainderPolynomial = 1 {
                remainderPolynomial := (remainderPolynomial * x) xor generatorPolynomial
            } else {
                remainderPolynomial := (remainderPolynomial * x)
            }
        }
    }
    // A popular variant complements remainderPolynomial here; see § Post-invert below
    return remainderPolynomial
}

The article states:

In software, it is convenient to note that while one may delay the xor of each bit until the very last moment, it is also possible to do it earlier. It is usually convenient to perform the xor a byte at a time, even in a bit-at-a-time implementation like this:

OK, that sounds great. I need to implement CRC-16-IBM on a microcontroller and would prefer not to use the lookup table method (despite performance disadvantages). It would be interesting to explore this possibility of byte-wise xor calculation, rather than bit-by-bit. Yet, I don't quite understand what they mean.

While the code will be written in assembler, I am using Python to explore ideas and understand all the details. Here's CRC-16 code I wrote that, as far as I can tell, returns correct values:

# Flip the bits of an integer
# It will only flip the lower <bits> bits
def flip(x:int, bits:int=16):
    result = 0
    for i in range(bits):  # Reflect reg
        result <<= 1
        temp = x & (0x0001 << i)
        if temp:
            result |= 0x0001
    return result

def test_flip():
    n = 0b1100_1000

    bits = 8
    print(f"\nin: {n:0{bits}b}")
    print(f"out:{flip(n, bits):0{bits}b}")

    bits = 16
    print(f"\nin: {n:0{bits}b}")
    print(f"out:{flip(n,bits):0{bits}b}")

# test_flip()


def crc16(data:bytearray,
            poly:int=0x8005,
            init:int=0x0000,
            ref_in:bool=True,
            ref_out:bool=True,
            xor_out:int=0x0000,
            debug:bool=False,
            disable_poly_xor:bool=False):
    reg = init                        # Initial CRC value
    for byte in data:                 # For each byte..
        if not ref_in:                # Reflect the input if necessary
            byte = flip(byte, 8)

        for i in range(8):            # Process all 8 bits one at a time
            reg <<= 1                 # Shift register left by one bit

            reg |= byte & 0x01      # Shift each data bit into the register, one at a time,
                                    # LSB first, unless input is NOT reflected

            if not disable_poly_xor:
                if reg & 0x010000:    # If a 1 shifts out of the 16 bit register,
                    reg ^= poly       # xor with polynomial

            byte >>= 1                # Prepare next bit to shift into register

            if debug:
                reg_str = f"{reg:032b}"
                print(f"{reg_str[-17:-16]} {reg_str[-16:-12]} {reg_str[-12:-8]} {reg_str[-8:-4]} {reg_str[-4:]} < {byte:08b}")

        if debug: print()
        reg &= 0xFFFF                 # This isn't a 16 bit hardware register, get rid of the excess

    if ref_out:
        return flip(reg, 16) ^ xor_out
    else:
        return reg ^ xor_out


# Checking implementations
check_data = bytearray(b"123456789\x00\x00")
check_result_should_be = 0xBB3D

print(f"\ndata: {[hex(n) for n in check_data]}")
print(f"expected CRC: {check_result_should_be: 04x}")

print()

print(f"crc16: {crc16(check_data, ref_in=False, ref_out=False): 04x}")
print(f"crc16: {crc16(check_data, ref_in=False, ref_out=True): 04x}")
print(f"crc16: {crc16(check_data, ref_in=True,  ref_out=False): 04x}")
print(f"crc16: {crc16(check_data, ref_in=True,  ref_out=True): 04x}")

I followed this reference document in order to produce the implementation you see above:

https://zlib.net/crc_v3.txt

I have looked through many CRC implementations in a range of languages (including many here on SO). I have not seen a single one that appends zeros to the data, as the paper seems to indicate is necessary. My test data above is bytearray(b"123456789"), yet I padded it with 16 bits of zeros. Without that I do not get the right CRC output values.

Per that paper, the CRC-16-IBM version I am implementing should output 0xBB3D, which is exactly what I get when the ref_in and ref_out are set to True.

Which brings me to the Wikipedia bytewise xor pseudocode. Here's how I interpret the pseudocode:

def crc16_1(data:bytearray,
            poly:int=0x8005,
            init:int=0x0000,
            ref_in:bool=True,
            ref_out:bool=True,
            xor_out:int=0x0000,
            debug:bool=False,
            disable_poly_xor:bool=False):
    reg = init                        # Initial CRC value
    for byte in data:                 # For each byte..
        if not ref_in:                # Reflect the input if necessary
            byte = flip(byte, 8)

        reg ^= byte

        for i in range(8):            # Process all 8 bits one at a time
            if not disable_poly_xor:
                if reg & 0x8000:    # If a 1 is going to shift out of the 16 bit register,
                    reg <<= 1
                    reg ^= poly     # xor with polynomial
                else:
                    reg <<= 1       # Shift register left by one bit

            if debug:
                reg_str = f"{reg:032b}"
                print(f"{reg_str[-17:-16]} {reg_str[-16:-12]} {reg_str[-12:-8]} {reg_str[-8:-4]} {reg_str[-4:]} < {byte:08b}")

        if debug: print()
        reg &= 0xFFFF                 # This isn't a 16 bit hardware register, get rid of the excess

    if ref_out:
        return flip(reg, 16) ^ xor_out
    else:
        return reg ^ xor_out

Adding this to my test:

# Checking implementations
check_data = bytearray(b"123456789\x00\x00")
check_result_should_be = 0xBB3D

print(f"\ndata: {[hex(n) for n in check_data]}")
print(f"expected CRC: {check_result_should_be: 04x}")

print()

print(f"crc16: {crc16(check_data, ref_in=False, ref_out=False): 04x}")
print(f"crc16: {crc16(check_data, ref_in=False, ref_out=True): 04x}")
print(f"crc16: {crc16(check_data, ref_in=True,  ref_out=False): 04x}")
print(f"crc16: {crc16(check_data, ref_in=True,  ref_out=True): 04x}")

print()

print(f"crc16_1: {crc16_1(check_data, ref_in=False, ref_out=False): 04x}")
print(f"crc16_1: {crc16_1(check_data, ref_in=False, ref_out=True): 04x}")
print(f"crc16_1: {crc16_1(check_data, ref_in=True,  ref_out=False): 04x}")
print(f"crc16_1: {crc16_1(check_data, ref_in=True,  ref_out=True): 04x}")

I get:

data: ['0x31', '0x32', '0x33', '0x34', '0x35', '0x36', '0x37', '0x38', '0x39', '0x0', '0x0']
expected CRC:  bb3d

crc16:  fee8
crc16:  177f
crc16:  bcdd
crc16:  bb3d

crc16_1:  5e8b
crc16_1:  d17a
crc16_1:  6a07
crc16_1:  e056

Which isn't correct.

I hope someone can help me understand how to get from their pseudocode to real code that can pass the simple test for this or any other CRC configuration. The definition for the CRC algorithm I am trying to implement is (also from the linked paper):

Name   : "CRC-16"
Width  : 16
Poly   : 8005
Init   : 0000
RefIn  : True
RefOut : True
XorOut : 0000
Check  : BB3D

The check value at the bottom is what I am using to ensure my code is working correctly.

I also took a look at the MODBUS CRC implementation, per page 40 of this document:

https://www.modbus.org/docs/Modbus_over_serial_line_V1_02.pdf

Here's my implementation:

def crc16_2(data:bytearray,
            poly:int=0x8005,
            init:int=0x0000,
            ref_in:bool=True,
            ref_out:bool=True,
            xor_out:int=0x0000,
            debug:bool=False,
            disable_poly_xor:bool=False):
    reg = init                        # Initial CRC value
    for byte in data:                 # For each byte..
        if not ref_in:                # Reflect the input if necessary
            byte = flip(byte, 8)

        reg ^= byte

        for i in range(8):            # Process all 8 bits one at a time
            if not disable_poly_xor:
                if reg & 0x0001:    # If a 1 is going to shift out of the 16 bit register,
                    reg >>= 1
                    reg ^= poly     # xor with polynomial
                else:
                    reg >>= 1       # Shift register right by one bit

            if debug:
                reg_str = f"{reg:032b}"
                print(f"{reg_str[-17:-16]} {reg_str[-16:-12]} {reg_str[-12:-8]} {reg_str[-8:-4]} {reg_str[-4:]} < {byte:08b}")

        if debug: print()
        reg &= 0xFFFF                 # This isn't a 16 bit hardware register, get rid of the excess

    if ref_out:
        return flip(reg, 16) ^ xor_out
    else:
        return reg ^ xor_out

Note that they flip the polynomial and shift right, rather than left. They do xor one byte at a time, which is interesting.

Once again, no configuration produces the expected CRC value:

data: ['0x31', '0x32', '0x33', '0x34', '0x35', '0x36', '0x37', '0x38', '0x39', '0x0', '0x0']
expected CRC:  bb3d

crc16:  fee8 < correct
crc16:  177f < correct
crc16:  bcdd < correct
crc16:  bb3d < correct

crc16_1:  5e8b
crc16_1:  d17a
crc16_1:  6a07
crc16_1:  e056

crc16_2:  295
crc16_2:  a940
crc16_2:  44ad
crc16_2:  b522

EDIT:

This is a really nice resource contributed by Mark Adler:

https://github.com/madler/crcany

Note: I found this tool useful: http://www.lokker.net/Java/crc/CRCcalculation2.htm — martin's, Jun 10 '22 at 23:13

rcgldr · Accepted Answer · 2022-06-12T05:04:43.290


I only have Python 2.7. This seems to work.

    def flip(x=0, bits=16):
        result = 0
        for i in range(bits):  # Reflect reg
            result <<= 1
            temp = x & (0x0001 << i)
            if temp:
                result |= 0x0001
        return result
    
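    def crc16(data,
                poly=0x8005,
                init=0x0000,
                ref_in=True,
                ref_out=True,
                xor_out=0x0000):
        reg = init
        for byte in data:
            if ref_in:                    # Reflected input: reverse the bits so the MSB-first loop sees the LSB first
                byte = flip(byte, 8)
            reg ^= byte << 8              # XOR the byte into the top 8 bits of the register
            for i in range(8):
                if reg & 0x8000:          # A 1 is about to shift out of the register
                    reg = (reg << 1) ^ poly
                else:
                    reg = (reg << 1)
                reg &= 0xffff             # Keep the register at 16 bits
        if ref_out:
            return flip(reg, 16) ^ xor_out
        else:
            return reg ^ xor_out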
    
    data = [0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39]
    
    print('0x%04x' % crc16(data, 0x8005, 0x0000, True, True, 0x0000))

Since the goal here is a reflected CRC16, this is a simpler right-shifting CRC function that eliminates the need to do bit flips. This means the rightmost bits of the data and of the CRC register are the logical most significant bits. The polynomial is also reflected from 0x8005 to 0xa001.

def crc16r(data,
            poly=0xa001,
            init=0x0000,
            xor_out=0x0000):
    reg = init
    for byte in data:
        reg ^= byte                   # XOR the byte into the low 8 bits of the register
        for i in range(8):
            if reg & 0x0001:          # A 1 is about to shift out on the right
                reg = (reg >> 1) ^ poly
            else:
                reg = (reg >> 1)
    return reg ^ xor_out

data = [0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39]

print('0x%04x' % crc16r(data, 0xa001, 0x0000, 0x0000))
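The reflected constant used above does not need to be memorized. As a minimal sketch (the helper name `reflect` is made up for this illustration and is not part of either answer), reversing the 16 bits of 0x8005 should yield the 0xa001 used by the right-shifting version:

```python
def reflect(value: int, width: int) -> int:
    """Reverse the lowest `width` bits of `value` (hypothetical helper)."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)  # move the current low bit to the other end
        value >>= 1
    return result


# 0x8005 = 1000 0000 0000 0101, reversed = 1010 0000 0000 0001
print(hex(reflect(0x8005, 16)))  # 0xa001
```

The same helper applied to a byte (`width=8`) does what `flip(byte, 8)` does in the left-shifting version, which is why reflecting the polynomial once can replace reflecting every input byte and the final register.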

edited Jun 12 '22 at 05:04

answered Jun 11 '22 at 00:00

rcgldr


I just tried this. Based on the paper I linked to and the online tool, the validated return value for that data sequence is `0xbb3d`. Your code and all of my attempts to produce this using byte-wise xor do not produce this result. – martin's Jun 11 '22 at 00:43
I just realized what happened. I tested again, on three different versions of Python and, you are right. The result is correct...so long as we omit the 16 bits of 0 padding at the end of the message. That's an interesting aspect of this. All of my tests have appended two 0x00 bytes to the end of the test array. This is what the papers and explanations I have read all say you should do. And this is how I tested your code. How did you come up with shifting the byte to the left 8 bits before the xor to reg at the start? I don't think that's what the description of the byte-wise xor says. – martin's Jun 11 '22 at 07:03
If the sequence was to cycle the CRC register one step, then XOR the next input bit with the least significant bit of the CRC register, then zero padding would be needed. However, a typical CRC calculation XOR's the next bit of the input data with the most significant bit of the CRC register, then cycle the CRC register one step. This eliminates the need to pad with zeroes. The optimization here is to XOR the next byte of input data with the most significant byte of the CRC register, then cycle the CRC register 8 steps (or use a lookup table to cycle the CRC register 8 steps). – rcgldr Jun 11 '22 at 16:28
Continuing, the initial value for the CRC register is to be XOR'ed with the leading bits of the input data, for CRC16, the first 16 bits of input data, since the code is expected to use a typical CRC implementation, (XOR input bit to most significant bit of CRC register, then cycle one step). – rcgldr Jun 11 '22 at 21:40
Thanks for the explanation. The part I missed in the Wikipedia article (now that I look at it again at the start of a 20 hour day, rather than the end of it) is that they do actually tell you to multiply by 256. – martin's Jun 11 '22 at 22:07
@martin's - I updated my answer by adding a right shifting implementation which eliminates the need to bit flip variables. – rcgldr Jun 12 '22 at 05:06
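The padding point made in the comments above can be checked directly. The sketch below assumes the non-reflected case (poly 0x8005, init 0x0000, no bit flips), and the function names are invented for this comparison. One function shifts message bits in at the low end and therefore needs 16 trailing zero bits; the other XORs each byte into the top of the register and needs none:

```python
POLY = 0x8005


def crc16_shift_in(data: bytes) -> int:
    """Long division: bits enter at the low end, so 16 zero bits are appended."""
    reg = 0
    for byte in data + b"\x00\x00":
        for i in range(7, -1, -1):                 # MSB first
            reg = (reg << 1) | ((byte >> i) & 1)   # shift the next message bit in
            if reg & 0x10000:                      # a 1 left the 16-bit register
                reg ^= 0x10000 | POLY              # clear it and apply the polynomial
    return reg


def crc16_xor_top(data: bytes) -> int:
    """XOR each byte into the top 8 bits, then cycle 8 times; no padding."""
    reg = 0
    for byte in data:
        reg ^= byte << 8
        for _ in range(8):
            reg = ((reg << 1) ^ POLY) if reg & 0x8000 else (reg << 1)
            reg &= 0xFFFF
    return reg


print(hex(crc16_shift_in(b"123456789")))  # 0xfee8
print(hex(crc16_xor_top(b"123456789")))   # 0xfee8
```

Both agree with the `fee8` that the question's `crc16` prints for `ref_in=False, ref_out=False` on the padded data. Note that the equivalence shown here relies on a zero initial value; as the comment above describes, a nonzero init is defined relative to the XOR-into-the-top form.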

Mark Adler · Answer 2 · 2022-06-11T05:47:41.363


It's all very well explained in the paper you linked. I'm not sure you read the whole thing, since it explains why and how you don't need to pad with zeros in the implementation. It also explains why and how exclusive-oring the entire byte into the CRC works, instead of a bit at a time.

Anyway, your problem is that you're not reflecting the polynomial for a reflected CRC. 0x8005 becomes 0xa001.

def crc16(data):
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xa001 if crc & 1 else crc >> 1
    return crc

print(hex(crc16(b'123456789')))

prints:

0xbb3d
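Since the question also asks about other CRC configurations, here is a hedged sketch of the same loop with the initial value and final XOR exposed as parameters. The function name and parameter names are assumptions for illustration. The expected values in the comments are the check values that the CRC catalogue linked in the comments lists for CRC-16/ARC and CRC-16/MODBUS, both of which use the reflected polynomial 0xa001:

```python
def crc16_reflected(data: bytes, init: int = 0x0000, xor_out: int = 0x0000) -> int:
    """Bit-at-a-time reflected CRC-16 with poly 0x8005 (reflected: 0xA001)."""
    crc = init
    for byte in data:
        crc ^= byte                      # XOR the whole byte into the low 8 bits
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc ^ xor_out


print(hex(crc16_reflected(b"123456789")))               # CRC-16/ARC:    0xbb3d
print(hex(crc16_reflected(b"123456789", init=0xFFFF)))  # CRC-16/MODBUS: 0x4b37
```

Only variants with RefIn and RefOut both true fit this shape; a non-reflected variant would use the left-shifting form instead.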

edited Jun 11 '22 at 05:47

answered Jun 11 '22 at 05:40

Mark Adler


I've been working 20 hour days for several weeks now. Time to take a short break. Yes, I did read the entire thing. I guess coffee can't compensate for not actually being awake. Thanks. – martin's Jun 11 '22 at 22:09
Is that not the CRC you are looking for? If so, you can see it can be calculated without all that flipping and multiplying by 256. – Mark Adler Jun 12 '22 at 00:52
The CRC function I wrote has all that flipping code because it is modeled after situations encountered in hardware-based applications. For example, UARTs transmit the LSB first. If you were to build a circuit to compute CRC of such a bit stream you would receive flipped bytes. There are a number of excellent presentations on this. The part I missed was also providing for flipping the polynomial and rotating in register in the other direction. In my survey I discovered a lot of CRC code that, in my opinion, simply does not work. – martin's Jun 12 '22 at 06:07
The paper I linked to is an excellent resource. At the end it tends to focus on table-driven CRC computation. I happen to be implementing this in assembler on an embedded system, hence the need to experiment with various permutations in Python to determine what might be best. – martin's Jun 12 '22 at 06:09
Here's another interesting CRC resource: https://reveng.sourceforge.io/crc-catalogue/ – martin's Jun 12 '22 at 06:11
If you look at the parameters that define the various CRC variants you will see `refin` and `refout`, among other parameters. You'll also find the check value. In order to construct such a CRC generator you have to reflect (flip) the byte, register or both, depending on the definition. I am not an expert, by any means. This deep-ish dive into CRC generation has been very interesting. I'd like to understand how to choose a polynomial. That's far more complex than might appear on first inspection. – martin's Jun 12 '22 at 06:17

Professor Koopman of CMU has an interesting page where he computed optimal polynomials based on various Hamming distances. https://users.ece.cmu.edu/~koopman/crc/ – martin's Jun 12 '22 at 06:17
Lastly, professor Koopman's CRC and Checksum webinar...more than I'll ever know: https://betterembsw.blogspot.com/2013/11/crc-webinar.html – martin's Jun 12 '22 at 06:22
You may also find this useful: https://github.com/madler/crcany – Mark Adler Jun 12 '22 at 06:43
Oh, this is brilliant. Thanks for posting it. I will edit the question to add this reference at the bottom, just to make sure others in this journey can take advantage of this resource. Looking-up CRC implementations has been a bit of a frustrating experience in that there's a lot of code out there that looks good but is incorrect. The combination of the CRC catalogue, the Ross Williams paper and this github repository is probably the best all-around resource for anyone wanting to learn about CRC. – martin's Jun 12 '22 at 18:32
