Calculating an 8-bit CRC with the C preprocessor?

Question

I'm writing code for a tiny 8-bit microcontroller with only a few bytes of RAM. It has a simple job which is to transmit 7 16-bit words, then the CRC of those words. The values of the words are chosen at compile time. The CRC specifically is "remainder of division of word 0 to word 6 as unsigned number divided by the polynomial x^8+x²+x+1 (initial value 0xFF)."

Is it possible to calculate the CRC of those bytes at compile time using the C preprocessor?

#define CALC_CRC(a,b,c,d,e,f,g)    /* what goes here? */

#define W0    0x6301
#define W1    0x12AF
#define W2    0x7753
#define W3    0x0007
#define W4    0x0007
#define W5    0x5621
#define W6    0x5422
#define CRC   CALC_CRC(W0, W1, W2, W3, W4, W5, W6)

http://codegolf.stackexchange.com/questions/3268/compute-the-crc32-table-at-compile-time — Xophmeister, Feb 21 '12 at 14:58
If speed is much more important to you than non-volatile memory (flash), then you can have all the results pre-calculated and stored in a constant lookup table. The CRC polynomial you describe is known as "CRC-8-CCITT". I don't know the optimal algorithm for that one, I'd suggest searching the web. — Lundin, Feb 22 '12 at 07:47

score 3 · Answer 1 · answered Feb 24 '12 at 17:08

It is possible to design a macro which will perform a CRC calculation at compile time. Something like

 // Choosing names to be short and hopefully unique.
 #define cZX((n),b,v) (((n) & (1 << b)) ? v : 0)
 #define cZY((n),b, w,x,y,z) (cZX((n),b,w)^CzX((n),b+1,x)^CzX((n),b+2,y)^cZX((n),b+3,z))
 #define CRC(n) (cZY((n),0,cZ0,cZ1,cZ2,cZ3)^cZY((n),4,cZ4,cZ5,cZ6,cZ7))

should probably work, and will be very efficient if (n) can be evaluated as a compile-time constant; it will simply evaluate to a constant itself. On the other hand, if n is an expression, that expression will end up getting recomputed eight times. Even if n is a simple variable, the resulting code will likely be significantly larger than the fastest non-table-based way of writing it, and may be slower than the most compact way of writing it.

BTW, one thing I'd really like to see in the C and C++ standard would be a means of specifying overloads which would be used for functions declared inline only if particular parameters could be evaluated as compile-time constants. The semantics would be such that there would be no 'guarantee' that any such overload would be used in every case where a compiler might be able to determine a value, but there would be a guarantee that (1) no such overload would be used in any case where a "compile-time-const" parameter would have to be evaluated at runtime, and (2) any parameter which is considered a constant in one such overload will be considered a constant in any functions invoked from it. There are a lot of cases where a function could written to evaluate to a compile-time constant if its parameter is constant, but where run-time evaluation would be absolutely horrible. For example:

#define bit_reverse_byte(n) ( (((n) & 128)>>7)|(((n) & 64)>>5)|(((n) & 32)>>3)|(((n) & 16)>>1)|
  (((n) & 8)<<1)|(((n) & 4)<<3)|(((n) & 2)<<5)|(((n) & 1)<<7) )
#define bit_reverse_word(n) (bit_reverse_byte((n) >> 8) | (bit_reverse_byte(n) << 8))

A simple rendering of a non-looped single-byte bit-reverse function in C on the PIC would be about 17-19 single-cycle instructions; a word bit-reverse would be 34, or about 10 plus a byte-reverse function (which would execute twice). Optimal assembly code would be about 15 single-cycle instructions for byte reverse or 17 for word-reverse. Computing bit_reverse_byte(b) for some byte variable b would take many dozens of instructions totalling many dozens of cycles. Computing bit_reverse_word(w) for some 16-bit wordw` would probably take hundreds of instructions taking hundreds or thousands of cycles to execute. It would be really nice if one could mark a function to be expanded inline using something like the above formulation in the scenario where it would expand to a total of four instructions (basically just loading the result) but use a function call in scenarios where inline expansion would be heinous.

+1 clever. However, I think it would be easier to understand code (perhaps written in normal C or Python) that runs on my desktop computer and prints out a "pre-calculated table" in a ".h" source file that is later #included in the C code that will run on the microcontroller. Something like ["using Python to generate C"](http://stackoverflow.com/questions/4000678/using-python-to-generate-a-c-string-literal-of-json) or ["pycrc"](http://programmers.stackexchange.com/questions/96211/what-is-a-faster-alternative-to-a-crc/96374#96374) — David Cary, May 01 '13 at 19:18

score 1 · Answer 2 · edited Mar 19 '19 at 00:37

1

The simplest checksum algorithm is the so-called longitudinal parity check, which breaks the data into "words" with a fixed number n of bits, and then computes the exclusive or of all those words. The result is appended to the message as an extra word.

To check the integrity of a message, the receiver computes the exclusive or of all its words, including the checksum; if the result is not a word with n zeros, the receiver knows that a transmission error occurred.

(souce: wiki)

In your example:

#define CALC_LRC(a,b,c,d,e,f) ((a)^(b)^(c)^(d)^(e)^(f))

edited Mar 19 '19 at 00:37

bryc

12,710
6
41
61

answered Feb 21 '12 at 15:03

vulkanino

9,074
7
44
71

This isn't a cyclic redundancy check, this is just a party check without a polynomial. It has a probability of 50% at failing to detect single bit errors. – Lundin Feb 21 '12 at 15:08
I agree, but it is the simplest one, as I said. With this checksum, any transmission error that flips a single bit of the message, or an odd number of bits, will be detected as an incorrect checksum. However, an error that affects two bits will not be detected if those bits lie at the same position in two distinct words. If the affected bits are independently chosen at random, the probability of a two-bit error being undetected is 1/n. – vulkanino Feb 21 '12 at 15:09
I should have mentioned that it's not just any checksum, I need a specific one. – Rocketmagnet Feb 21 '12 at 15:24

Brian McFarland · Answer 3 · 2012-02-24T15:27:25.827

Disclaimer: this is not really a direct answer, but rather a series of questions and suggestions that are too long for a comment.

First Question: Do you have control over both ends of the protocol, e.g. can you choose the checksum algorithm by means of either yourself or a coworker controlling the code on the other end?

If YES to question #1:

You need to evaluate why you need the checksum, what checksum is appropriate, and the consequences of receiving a corrupt message with a valid checksum (which factors into both the what & why).

What is your transmission medium, protocol, bitrate, etc? Are you expecting/observing bit errors? So for example, with SPI or I2C from one chip to another on the same board, if you have bit errors, it's probably the HW engineers fault or you need to slow the clock rate, or both. A checksum can't hurt, but shouldn't really be necessary. On the other hand, with an infrared signal in a noisy environment, and you'll have a much higher probability of error.

Consequences of a bad message is always the most important question here. So if you're writing the controller for digital room thermometer and sending a message to update the display 10x a second, one bad value ever 1000 messages has very little if any real harm. No checksum or a weak checksum should be fine.

If these 6 bytes fire a missile, set the position of a robotic scalpel, or cause the transfer of money, you better be damn sure you have the right checksum, and may even want to look into a cryptographic hash (which may require more RAM than you have).

For in-between stuff, with noticeable detriment to performance/satisfaction with the product, but no real harm, its your call. For example, a TV that occasionally changes the volume instead of the channel could annoy the hell out of customers--more so than simply dropping the command if a good CRC detects an error, but if you're in the business of making cheap/knock-off TVs that might be OK if it gets product to market faster.

So what checksum do you need?

If either or both ends have HW support for a checksum built into the peripheral (fairly common in SPI for example), that might be a wise choice. Then it becomes more or less free to calculate.

An LRC, as suggested by vulkanino's answer, is the simplest algorithm.

Wikipedia has some decent info on how/why to choose a polynomial if you really need a CRC: http://en.wikipedia.org/wiki/Cyclic_redundancy_check

If NO to question #1:

What CRC algorithm/polynomial does the other end require? That's what you're stuck with, but telling us might get you a better/more complete answer.

Thoughts on implementation:

Most of the algorithms are pretty light-weight in terms of RAM/registers, requiring only a couple extra bytes. In general, a function will result in better, cleaner, more readable, debugger-friendly code.

You should think of the macro solution as an optimization trick, and like all optimization tricks, jumping to them to early can be a waste of development time and a cause of more problems than it's worth.

Using a macro also has some strange implications you may not have considered yet:
You are aware that the preprocessor can only do the calculation if all the bytes in a message are fixed at compile time, right? If you have a variable in there, the compiler has to generate code. Without a function, that code will be inlined every time it's used (yes, that could mean lots of ROM usage). If all the bytes are variable, that code might be worse than just writing the function in C. Or with a good compiler, it might be better. Tough to say for certain. On the other hand, if a different number of bytes are variable depending on the message being sent, you might end up with several versions of the code, each optimized for that particular usage.

NO to question 1. I have added the polynomial to my question. — Rocketmagnet, Feb 21 '12 at 20:14
I am aware that all bytes have to be known at compile time. That's why my example code had them all defined. — Rocketmagnet, Feb 21 '12 at 20:15
@Rocketmagnet: I would first see if you can come up with a working algorithm with a look-table as a shortcut for the bitwise operations. Calculate the lookup table with a PC program and store it in a preprocessor variable (i.e. macro). Then unroll 'outer' loop that does the lookup on each word into something like roughly like this: `#define CALC_CRC(a,b, c) LUT[ c ^ LUT[ b ^ LUT[ a ^ FF ] ] ]` — Brian McFarland, Feb 21 '12 at 21:09
"SPI or I2C... /--/ ... if you have bit errors, it's probably the HW engineers fault". There is just so many ways they can get copper wrong. So on the contrary, if you have bit errors on SPI or I2C it is most likely the SW engineers fault! Is my experience, misconfiguration problems or poorly defined/standardized communication settings makes up around 99% of all SPI/I2C errors. — Lundin, Feb 22 '12 at 07:33
@Lundin: I meant if you have observed SPI/I2C bit error on the wire due to a corrupted signal, then it's *probably* time to redesign either the schematic, layout, enclosure, or shielding. Real transmission error from noise is going to be rare. So on well made HW, a simple XOR or no checksum is probably fine on messages this small. Sure a CRC-8 always superior, but sometimes MD5 is too. In any field of engineering, it's important to recognize the minimal but adequate solution. As I mentioned its also essential to assess the impact of a bad message getting through. — Brian McFarland, Feb 22 '12 at 15:40
Incidentally, many embedded-system C compilers don't use stacks for auto-variables or parameter passing. Instead, they statically allocate variables such that routines whose variables are live simultaneously will store their auto-variables and parameters in different addresses, while those whose variables are not live simultaneously may be given overlapping addresses. — supercat, Feb 23 '12 at 23:55
@supercat, Good point. I guess that especially makes sense on an arch w/ a HW stack for the program counter. Revised appropriately. — Brian McFarland, Feb 24 '12 at 15:37

Calculating an 8-bit CRC with the C preprocessor?

3 Answers3