CRC calculation reduction

Question

I have a question about the math and programming behind CRC calculations: I want to avoid recomputing the full CRC of a block when only a small portion of it changes.

My problem is the following: I have a 1K block made of 4-byte structures, each one representing a data field. The block has a CRC16 at the end, computed over the full 1K. When I change a single 4-byte structure, I have to recompute the CRC of the full block, but I'm looking for a more efficient solution. Something where:

1. I take the full 1K block current CRC16
2. I compute something on the old 4-byte block
3. I "subtract" something obtained at step 2 from the full 1K CRC16
4. I compute something on the new 4-byte block
5. I "add" something obtained at step 4 to the result obtained at step 3

To summarize, I am thinking about something like this:

CRC(new-full) = [CRC(old-full) - CRC(block-old) + CRC(block-new)]

But I'm missing the math behind it and how to obtain this result, ideally as a "general formula".

Thanks in advance.

I do not think that it is doable. The CRC calculation depends on previous results; it is not simple adding and subtracting. – 0___________ Jan 26 '19 at 12:01

Mark Adler · Accepted Answer · 2019-02-05T04:33:22.133

Take your initial 1024-byte block A and your new 1024-byte block B. Exclusive-or them to get block C. Since you only changed four bytes, C will be a bunch of zeros, four bytes which are the exclusive-or of the previous and new four bytes, and a bunch more zeros.

Now compute the CRC-16 of block C, but without any pre- or post-processing. We will call that CRC-16'. (I would need to see the specific CRC-16 you're using to see what that processing is, if anything.) Exclusive-or the CRC-16 of block A with the CRC-16' of block C, and you now have the CRC-16 of block B.
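The identity in the previous paragraph can be checked with a few lines of C. This is only a minimal sketch, assuming the CRC-16/DNP settings that come up later in the thread (reflected polynomial 0xa6bc, initial value 0, final exclusive-or with 0xffff). The names crc16dnp_raw, crc16dnp and crc16dnp_apply_delta are made up for illustration, and the bitwise loop stands in for whatever CRC routine or hardware you actually use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// CRC-16' for CRC-16/DNP: reflected polynomial 0xa6bc, no final
// exclusive-or. Pass crc = 0 to start a new calculation.
uint16_t crc16dnp_raw(uint16_t crc, const uint8_t *data, size_t len) {
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (uint16_t)((crc & 1) ? (crc >> 1) ^ 0xa6bc : crc >> 1);
    }
    return crc;
}

// Full CRC-16/DNP: initial value 0, final exclusive-or with 0xffff.
uint16_t crc16dnp(const uint8_t *data, size_t len) {
    return (uint16_t)(crc16dnp_raw(0, data, len) ^ 0xffff);
}

// CRC of block B, given the CRC of block A and the block C = A xor B.
// The 0xffff is already part of crc_a, so C goes through the raw routine.
uint16_t crc16dnp_apply_delta(uint16_t crc_a, const uint8_t *c, size_t len) {
    return (uint16_t)(crc_a ^ crc16dnp_raw(0, c, len));
}
```

Calling crc16dnp_apply_delta(crc16dnp(a, n), c, n) should give the same value as crc16dnp(b, n) whenever c[i] == (a[i] ^ b[i]) for every byte.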

At first glance, this may not seem like much of a gain compared to just computing the CRC of block B. However, there are tricks to rapidly computing the CRC of a bunch of zeros. First off, the zeros preceding the four bytes that were changed give a CRC-16' of zero, regardless of how many zeros there are. So you just start computing the CRC-16' with the exclusive-or of the previous and new four bytes.

Now you continue to compute the CRC-16' on the remaining n zeros after the changed bytes. Normally it takes O(n) time to compute a CRC on n bytes. However, if you know that they are all zeros (or all some constant value), then it can be computed in O(log n) time. You can see an example of how this is done in zlib's crc32_combine() routine, and apply that to your CRC.

Given your CRC-16/DNP parameters, the zeros() routine below will apply the requested number of zero bytes to the CRC in O(log n) time.

#include <stddef.h>
#include <stdint.h>

// Return a(x) multiplied by b(x) modulo p(x), where p(x) is the CRC
// polynomial, reflected. For speed, this requires that a not be zero.
uint16_t multmodp(uint16_t a, uint16_t b) {
    uint16_t m = (uint16_t)1 << 15;
    uint16_t p = 0;
    for (;;) {
        if (a & m) {
            p ^= b;
            if ((a & (m - 1)) == 0)
                break;
        }
        m >>= 1;
        b = b & 1 ? (b >> 1) ^ 0xa6bc : b >> 1;
    }
    return p;
}

// Table of x^2^n modulo p(x).
uint16_t const x2n_table[] = {
    0x4000, 0x2000, 0x0800, 0x0080, 0xa6bc, 0x55a7, 0xfc4f, 0x1f78,
    0xa31f, 0x78c1, 0xbe76, 0xac8f, 0xb26b, 0x3370, 0xb090
};

// Return x^(n*2^k) modulo p(x).
uint16_t x2nmodp(size_t n, unsigned k) {
    k %= 15;
    uint16_t p = (uint16_t)1 << 15;
    for (;;) {
        if (n & 1)
            p = multmodp(x2n_table[k], p);
        n >>= 1;
        if (n == 0)
            break;
        if (++k == 15)
            k = 0;
    }
    return p;
}

// Apply n zero bytes to crc.
uint16_t zeros(uint16_t crc, size_t n) {
    return multmodp(x2nmodp(n, 3), crc);
}
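As a rough illustration of how a zero-byte routine with this signature could be plugged into a field update, here is a sketch. Everything in it is an assumption made for the example: the names zeros_fn, zeros_naive and crc16dnp_update_field are hypothetical, the field is taken to be 4 bytes as in the question, and the bitwise crc16dnp_raw stands in for your real CRC routine. To keep the block self-contained it ships a slow O(n) zeros_naive; the zeros() routine above has the same signature and could be passed in its place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIELD_LEN 4  // size of one data field, as in the question

// Signature shared by zeros() and the slow reference below.
typedef uint16_t (*zeros_fn)(uint16_t crc, size_t n);

// O(n) reference: apply n zero bytes to a CRC-16' register, bit by bit.
uint16_t zeros_naive(uint16_t crc, size_t n) {
    for (size_t i = 0; i < 8 * n; i++)
        crc = (uint16_t)((crc & 1) ? (crc >> 1) ^ 0xa6bc : crc >> 1);
    return crc;
}

// CRC-16' for CRC-16/DNP: reflected polynomial 0xa6bc, no final
// exclusive-or. Pass crc = 0 to start a new calculation.
uint16_t crc16dnp_raw(uint16_t crc, const uint8_t *data, size_t len) {
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (uint16_t)((crc & 1) ? (crc >> 1) ^ 0xa6bc : crc >> 1);
    }
    return crc;
}

// Return the block CRC after one field changes from old_field to new_field.
//   block_crc    stored CRC-16/DNP of the block before the change
//   bytes_after  number of CRC-covered bytes that follow the field
//   apply_zeros  routine that applies n zero bytes to a CRC-16' value
uint16_t crc16dnp_update_field(uint16_t block_crc,
                               const uint8_t *old_field,
                               const uint8_t *new_field,
                               size_t bytes_after, zeros_fn apply_zeros) {
    assert(old_field != NULL && new_field != NULL && apply_zeros != NULL);
    uint8_t delta[FIELD_LEN];
    for (size_t i = 0; i < FIELD_LEN; i++)
        delta[i] = (uint8_t)(old_field[i] ^ new_field[i]);
    // Leading zeros contribute nothing, so start at the changed bytes.
    uint16_t crc = crc16dnp_raw(0, delta, FIELD_LEN);
    return (uint16_t)(block_crc ^ apply_zeros(crc, bytes_after));
}
```

For a field at byte offset off in a block with 1024 covered bytes, bytes_after would be 1024 - off - 4.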

I have just a bunch of questions about your code. How am I supposed to integrate this with existing HAL_CRC call? I have also noticed that the CRC16DNP of every 0 sequence (any length) is always 0xFFFF. — PeppeA82, Feb 06 '19 at 07:49
I don't know what "HAL_CRC" is. CRC-16' is simply your CRC without the exclusive-or with `0xffff` at the end. (Or adding another exclusive-or with `0xffff` after you get a result to cancel the other one.) Your observation is noted in my answer, which is that the CRC-16' of a sequence of zeros is zero. — Mark Adler, Feb 06 '19 at 14:41

score 2 · Answer 2 · answered Jan 26 '19 at 13:35

CRC actually makes this an easy thing to do.

When looking into this, I'm sure you've started to read that CRCs are calculated with polynomials over GF(2), and probably skipped over that part to the immediately useful information. Well, it sounds like it's probably time for you to go back over that stuff and reread it a few times so you can really understand it.

But anyway...

Because of the way CRCs are calculated, they have a property that, given two blocks A and B, CRC(A xor B) = CRC(A) xor CRC(B)

So the first simplification you can make is that you just need to calculate the CRC of the changed bits. You could actually precalculate the CRCs of each bit in the block, so that when you change a bit you can just xor its CRC into the block's CRC.
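A small sketch of that per-bit table, under the assumption that the CRC is CRC-16/DNP (reflected polynomial 0xa6bc, initial value 0, final exclusive-or with 0xffff) as mentioned in the comments. BLOCK_LEN, build_bit_table and crc16dnp_flip_bit are hypothetical names, and the block is kept at 32 bytes for the example; a 1024-byte block would need 8192 two-byte entries, which is 16 KiB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_LEN 32  // kept small for the example

// CRC without the final exclusive-or: reflected polynomial 0xa6bc.
// Pass crc = 0 to start a new calculation.
uint16_t crc16dnp_raw(uint16_t crc, const uint8_t *data, size_t len) {
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (uint16_t)((crc & 1) ? (crc >> 1) ^ 0xa6bc : crc >> 1);
    }
    return crc;
}

// bit_crc[8 * i + j] = raw CRC of an all-zero block in which only
// bit j of byte i is set.
void build_bit_table(uint16_t bit_crc[8 * BLOCK_LEN]) {
    uint8_t block[BLOCK_LEN] = {0};
    for (size_t i = 0; i < BLOCK_LEN; i++) {
        for (unsigned j = 0; j < 8; j++) {
            block[i] = (uint8_t)(1u << j);
            bit_crc[8 * i + j] = crc16dnp_raw(0, block, BLOCK_LEN);
        }
        block[i] = 0;
    }
}

// Block CRC after flipping bit j of byte i: one table lookup and one xor.
uint16_t crc16dnp_flip_bit(uint16_t block_crc, const uint16_t *bit_crc,
                           size_t i, unsigned j) {
    assert(bit_crc != NULL && i < BLOCK_LEN && j < 8);
    return (uint16_t)(block_crc ^ bit_crc[8 * i + j]);
}
```

Changing several bits means calling crc16dnp_flip_bit once per changed bit, in any order.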

CRCs also have the property that CRC(A * B) = CRC(A * CRC(B)), where * is polynomial multiplication over GF(2). If your CRC routine stuffs the block with zeros at the end, then don't do that when computing CRC(B).

This lets you get away with a smaller precalculated table. "Polynomial multiplication over GF(2)" is binary convolution, so multiplying by 1000 is the same as shifting by 3 bits. With this rule, you can precalculate the CRC of the offset of each field. Then just multiply (convolve) the changed bits by the offset CRC (calculated without zero stuffing), calculate the CRC of those 8 bytes, and xor them into the block CRC.
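One way the per-field table might look in C is sketched below. It is a variation on the description above rather than a literal transcription: it reduces the changed bytes to a 16-bit raw CRC first and then multiplies that by the precalculated offset factor modulo the polynomial, which should give the same result. CRC-16/DNP is assumed (reflected polynomial 0xa6bc, initial value 0, final exclusive-or with 0xffff), and all names (gf2_mulmod, build_offset_factors, crc16dnp_update_field) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIELD_LEN 4
#define FIELD_COUNT 256  // 1024-byte block of 4-byte fields

// CRC without the final exclusive-or: reflected polynomial 0xa6bc.
uint16_t crc16dnp_raw(uint16_t crc, const uint8_t *data, size_t len) {
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (uint16_t)((crc & 1) ? (crc >> 1) ^ 0xa6bc : crc >> 1);
    }
    return crc;
}

// a(x) * b(x) mod p(x) over GF(2), reflected bit order: bit 15 holds the
// coefficient of x^0, bit 0 the coefficient of x^15.
uint16_t gf2_mulmod(uint16_t a, uint16_t b) {
    uint16_t product = 0;
    for (uint16_t mask = 0x8000; mask != 0; mask >>= 1) {
        if (a & mask)
            product ^= b;
        b = (uint16_t)((b & 1) ? (b >> 1) ^ 0xa6bc : b >> 1);  // b *= x
    }
    return product;
}

// factor[i] = x^(8 * number of bytes after field i) mod p(x).
void build_offset_factors(uint16_t factor[FIELD_COUNT]) {
    static const uint8_t zero_field[FIELD_LEN] = {0};
    uint16_t f = 0x8000;  // the polynomial 1: nothing follows the last field
    for (size_t i = FIELD_COUNT; i-- > 0;) {
        factor[i] = f;
        f = crc16dnp_raw(f, zero_field, FIELD_LEN);  // one more field behind
    }
}

// Block CRC after field number `field` changes from old_field to new_field.
uint16_t crc16dnp_update_field(uint16_t block_crc, const uint16_t *factor,
                               size_t field, const uint8_t *old_field,
                               const uint8_t *new_field) {
    assert(factor != NULL && field < FIELD_COUNT);
    uint8_t delta[FIELD_LEN];
    for (size_t i = 0; i < FIELD_LEN; i++)
        delta[i] = (uint8_t)(old_field[i] ^ new_field[i]);
    uint16_t d = crc16dnp_raw(0, delta, FIELD_LEN);
    return (uint16_t)(block_crc ^ gf2_mulmod(d, factor[field]));
}
```

The table costs 512 bytes here, and each update is one short CRC over 4 bytes plus one 16-step multiply.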

Matt Timmermans

Thanks for your answer but I still have some doubts about implementation. I have just to XOR new data with old data, compute a crc over the full length of the block and then XOR old CRC with new "small CRC"? – PeppeA82 Jan 26 '19 at 15:11
Your initial claim on A and B is not quite correct for CRC's with pre and post-processing, which most have. – Mark Adler Jan 26 '19 at 18:25
@MarkAdler (with a short bow) Yes, that's true. If your only available CRC primitive includes those steps (and you don't want to write your own), then you'd need to add a prefix that resets the initial state in order to compute a delta, and undo any post-processing at the end. I'm not sure I know how to explain that in an accessible way. I'll think about it. Maybe the OP could indicate what he's working with. – Matt Timmermans Jan 26 '19 at 19:08
@MarkAdler I'm working on a proprietary file format in which each file section has a CRC16. I'm computing CRC16 using STM32 dedicated hardware. – PeppeA82 Jan 26 '19 at 20:04
@PeppeA82 With what settings? – Mark Adler Jan 27 '19 at 00:26
@MarkAdler CRC settings are Reflected Input (Inverted Input), Reflected Output (Inverted Output), XOR Output with 0XFFFF, Generating Polynomial = 0x3D65 CRC-16-DNP – PeppeA82 Feb 04 '19 at 14:02

score 2 · Answer 3 · answered Jan 26 '19 at 13:37

The CRC is the remainder of dividing the long integer formed by the input stream by the short integer corresponding to the polynomial, say p (using carryless arithmetic over GF(2)).

If you change some bits in the middle, this amounts to a perturbation of the dividend by n 2^k where n has the length of the perturbed section and k is the number of bits that follow.

Hence, you need to compute the perturbation of the remainder, (n 2^k) mod p. You can address this using

(n 2^k) mod p = (n mod p) (2^k mod p)

The first factor is just the CRC16 of n. The other factor can be obtained efficiently in O(log k) operations by the power algorithm based on repeated squaring.
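A sketch of the power algorithm in C follows, under some assumptions of mine: CRC-16/DNP as given in the comments (reflected polynomial 0xa6bc, initial value 0, final exclusive-or with 0xffff), "2" read as the polynomial x, all products taken carryless and reduced modulo p, and n taken as old data xor new data. The names gf2_mulmod, gf2_xpow and crc16dnp_update are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// CRC without the final exclusive-or: reflected polynomial 0xa6bc.
uint16_t crc16dnp_raw(uint16_t crc, const uint8_t *data, size_t len) {
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (uint16_t)((crc & 1) ? (crc >> 1) ^ 0xa6bc : crc >> 1);
    }
    return crc;
}

// Carryless a(x) * b(x) mod p(x), reflected bit order: bit 15 holds the
// coefficient of x^0, bit 0 the coefficient of x^15.
uint16_t gf2_mulmod(uint16_t a, uint16_t b) {
    uint16_t product = 0;
    for (uint16_t mask = 0x8000; mask != 0; mask >>= 1) {
        if (a & mask)
            product ^= b;
        b = (uint16_t)((b & 1) ? (b >> 1) ^ 0xa6bc : b >> 1);  // b *= x
    }
    return product;
}

// x^k mod p(x) by square-and-multiply, O(log k) multiplications.
uint16_t gf2_xpow(size_t k) {
    uint16_t result = 0x8000;  // the polynomial 1
    uint16_t base = 0x4000;    // the polynomial x
    while (k != 0) {
        if (k & 1)
            result = gf2_mulmod(result, base);
        base = gf2_mulmod(base, base);
        k >>= 1;
    }
    return result;
}

// Block CRC after len bytes change, with bytes_after covered bytes
// following them. delta holds old xor new for the changed bytes.
uint16_t crc16dnp_update(uint16_t block_crc, const uint8_t *delta,
                         size_t len, size_t bytes_after) {
    assert(delta != NULL);
    uint16_t n_mod_p = crc16dnp_raw(0, delta, len);
    uint16_t shift = gf2_xpow(8 * bytes_after);
    return (uint16_t)(block_crc ^ gf2_mulmod(n_mod_p, shift));
}
```

If the position of the change is fixed, gf2_xpow(8 * bytes_after) is a constant that can be computed once and stored.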

It should be noted that n = (original data) xor (new data). Also for clarification: (n 2^k) mod p = ((n mod p) (2^k mod p)) mod p, and that result is xor'ed with the original CRC. If k is a fixed value, then (2^k mod p) is a constant and can be pre-computed. This assumes that the CRC is not post-complemented. If the CRC is post-complemented, it needs to be complemented, updated, and complemented back. – rcgldr Jan 26 '19 at 15:23
I forgot to mention that a carryless multiply is needed for (n mod p) (2^k mod p). If k is constant, this could be done using four 256 byte tables. – rcgldr Jan 26 '19 at 21:36

score 0 · Answer 4 · answered Jan 26 '19 at 12:03

A CRC depends on the CRC calculated over the preceding data. So the only optimization is to logically split the data into N segments and store the computed CRC state for each segment.

Then, when e.g. modifying segment 6 (of 0..9), get the CRC state after segment 5 and continue calculating the CRC, beginning with segment 6 and ending with segment 9.

Anyway, CRC calculations are very fast, so consider whether it is worth it.
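The checkpoint scheme described here could look roughly like the following. It is a sketch under assumptions: CRC-16/DNP (reflected polynomial 0xa6bc, initial value 0, final exclusive-or with 0xffff), ten equal segments, and hypothetical names crc_checkpoints and crc16dnp_resume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEG_COUNT 10

// CRC without the final exclusive-or: reflected polynomial 0xa6bc.
uint16_t crc16dnp_raw(uint16_t crc, const uint8_t *data, size_t len) {
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (uint16_t)((crc & 1) ? (crc >> 1) ^ 0xa6bc : crc >> 1);
    }
    return crc;
}

// state[i] is the CRC register before segment i; state[SEG_COUNT] is the
// register after the last segment.
typedef struct {
    uint16_t state[SEG_COUNT + 1];
    size_t seg_len;  // bytes per segment
} crc_checkpoints;

// Recompute the CRC from segment first_seg to the end, refreshing the
// stored states on the way. Use first_seg = 0 to build everything.
uint16_t crc16dnp_resume(crc_checkpoints *cp, const uint8_t *data,
                         size_t first_seg) {
    assert(cp != NULL && data != NULL && first_seg < SEG_COUNT);
    if (first_seg == 0)
        cp->state[0] = 0;  // initial register value for CRC-16/DNP
    for (size_t s = first_seg; s < SEG_COUNT; s++)
        cp->state[s + 1] = crc16dnp_raw(cp->state[s],
                                        data + s * cp->seg_len, cp->seg_len);
    return (uint16_t)(cp->state[SEG_COUNT] ^ 0xffff);
}
```

After a change inside segment 6, crc16dnp_resume(&cp, data, 6) only walks segments 6 through 9, so the cost still grows with the distance from the change to the end of the block.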

Wiimm

It is a comment not the answer (BTW is a copy of my one) – 0___________ Jan 26 '19 at 12:06
No, that is not the only optimization. – Mark Adler Jan 26 '19 at 18:28
