I'm working on receiving binary data from sensors for the first time. The data is base64-encoded; I need to decode it, validate it, and then save it to the database. One step of the validation is a CRC-16 check.

Each payload I receive comes with a CRC code, and I have a function that is supposed to calculate the CRC-16 itself. What I want to know is: is it enough to pass the decoded data to the CRC-16 calculation function and compare the result to zero, treating a non-zero result as corrupted data?

If everything goes fine, I need to unpack the binary data and loop over the result to get the sensors' values, such as battery and air_temperature, at the offsets given in the manufacturer's documentation, and then save the data to the DB as usual.

The problem is: I get non-zero values when I apply the crc16Calc function to a valid dataset.

Can that be because the CRC is added to the beginning of the data string, not to the end? I mean the structure of the payload is <CRC code><Original code>, not the opposite!

My code is:

public static $crc16_tbl = [
    0x0000, 0xC0C1, 0xC181, 0x0140, 0xC301, 0x03C0, 0x0280, 0xC241,
    0xC601, 0x06C0, 0x0780, 0xC741, 0x0500, 0xC5C1, 0xC481, 0x0440,
    0xCC01, 0x0CC0, 0x0D80, 0xCD41, 0x0F00, 0xCFC1, 0xCE81, 0x0E40,
    0x0A00, 0xCAC1, 0xCB81, 0x0B40, 0xC901, 0x09C0, 0x0880, 0xC841,
    0xD801, 0x18C0, 0x1980, 0xD941, 0x1B00, 0xDBC1, 0xDA81, 0x1A40,
    0x1E00, 0xDEC1, 0xDF81, 0x1F40, 0xDD01, 0x1DC0, 0x1C80, 0xDC41,
    0x1400, 0xD4C1, 0xD581, 0x1540, 0xD701, 0x17C0, 0x1680, 0xD641,
    0xD201, 0x12C0, 0x1380, 0xD341, 0x1100, 0xD1C1, 0xD081, 0x1040,
    0xF001, 0x30C0, 0x3180, 0xF141, 0x3300, 0xF3C1, 0xF281, 0x3240,
    0x3600, 0xF6C1, 0xF781, 0x3740, 0xF501, 0x35C0, 0x3480, 0xF441,
    0x3C00, 0xFCC1, 0xFD81, 0x3D40, 0xFF01, 0x3FC0, 0x3E80, 0xFE41,
    0xFA01, 0x3AC0, 0x3B80, 0xFB41, 0x3900, 0xF9C1, 0xF881, 0x3840,
    0x2800, 0xE8C1, 0xE981, 0x2940, 0xEB01, 0x2BC0, 0x2A80, 0xEA41,
    0xEE01, 0x2EC0, 0x2F80, 0xEF41, 0x2D00, 0xEDC1, 0xEC81, 0x2C40,
    0xE401, 0x24C0, 0x2580, 0xE541, 0x2700, 0xE7C1, 0xE681, 0x2640,
    0x2200, 0xE2C1, 0xE381, 0x2340, 0xE101, 0x21C0, 0x2080, 0xE041,
    0xA001, 0x60C0, 0x6180, 0xA141, 0x6300, 0xA3C1, 0xA281, 0x6240,
    0x6600, 0xA6C1, 0xA781, 0x6740, 0xA501, 0x65C0, 0x6480, 0xA441,
    0x6C00, 0xACC1, 0xAD81, 0x6D40, 0xAF01, 0x6FC0, 0x6E80, 0xAE41,
    0xAA01, 0x6AC0, 0x6B80, 0xAB41, 0x6900, 0xA9C1, 0xA881, 0x6840,
    0x7800, 0xB8C1, 0xB981, 0x7940, 0xBB01, 0x7BC0, 0x7A80, 0xBA41,
    0xBE01, 0x7EC0, 0x7F80, 0xBF41, 0x7D00, 0xBDC1, 0xBC81, 0x7C40,
    0xB401, 0x74C0, 0x7580, 0xB541, 0x7700, 0xB7C1, 0xB681, 0x7640,
    0x7200, 0xB2C1, 0xB381, 0x7340, 0xB101, 0x71C0, 0x7080, 0xB041,
    0x5000, 0x90C1, 0x9181, 0x5140, 0x9301, 0x53C0, 0x5280, 0x9241,
    0x9601, 0x56C0, 0x5780, 0x9741, 0x5500, 0x95C1, 0x9481, 0x5440,
    0x9C01, 0x5CC0, 0x5D80, 0x9D41, 0x5F00, 0x9FC1, 0x9E81, 0x5E40,
    0x5A00, 0x9AC1, 0x9B81, 0x5B40, 0x9901, 0x59C0, 0x5880, 0x9841,
    0x8801, 0x48C0, 0x4980, 0x8941, 0x4B00, 0x8BC1, 0x8A81, 0x4A40,
    0x4E00, 0x8EC1, 0x8F81, 0x4F40, 0x8D01, 0x4DC0, 0x4C80, 0x8C41,
    0x4400, 0x84C1, 0x8581, 0x4540, 0x8701, 0x47C0, 0x4680, 0x8641,
    0x8201, 0x42C0, 0x4380, 0x8341, 0x4100, 0x81C1, 0x8081, 0x4040
];


// $crc is an integer between 0 and 0xFFFF
// $dataByte is an integer between 0 and 0xFF
// The result is an integer between 0 and 0xFFFF
function addCRC($crc, $dataByte)
{
    $index = ($crc & 0xFF) ^ $dataByte;
    $crc16int = self::$crc16_tbl[$index];
    return ($crc >> 8) ^ $crc16int;
}

// $buffer is a string containing the binary data
// The result is an integer between 0 and 0xFFFF
function crc16Calc($buffer)
{
    $crc16 = 0;
    $length = strlen($buffer);
    for ($i = 0; $i < $length; $i++) {
        // Use ord() to go from a length-1 string to an integer between 0 and 0xFF
        $dataByte = ord($buffer[$i]);
        $crc16 = $this->addCRC($crc16, $dataByte);
    }
    return $crc16;
}

public function store(Request $request)
{
    // 1. Decode the data from the base64 string and check CRC validity.
    $content = file($request->file('data'));
    Storage::disk('local')->put('examples.bin', '');
    $file_handler = fopen('C:\laragon\www\medium-clone\storage\app\examples.bin', 'w+');
    foreach ($content as $line) {
        $decoded_data = base64_decode($line);
        // check for CRC validity
        print ($this->crc16Calc($decoded_data)). '<br />'; // this gives a different non-zero number each time
        if($this->crc16Calc($decoded_data) != 0)
            return "Invalid Data";
        //else
        fwrite($file_handler, $decoded_data);
    }
    fclose($file_handler);
}

Edit: Below is the base64-encoded data, containing 20 payloads. The image below explains the structure of the payload; all multi-byte binary fields are ordered little endian.

otykgAFuAGUAAEwBQAMfCqMI6g3zA+UDBQR8AXEBiQEyAiQCPQKh/nb+SwBKAAA=
WVOWgAFuAGUAAEwBQAMOCgAA6g1nAVsBcAEuAi0CMgJLAUgBTgFK/kX+IgAiAAA=
g5v5gAFvAGcAAPAAQAMRCs0IxiWrA54DsgMzAycDQQObAI0ApwCFAnYCFAATAA8=
z/5qgAFvAGcAABkBQAPuCSMJLh+uAqgCtALoA+gD6APY/9j/2P+uAqgCAAAAAA8=
XoVTgAFvAGcAAPgAQAMDCr8JZiq0Aa0BvAGhAkIC3gL+ANAARAGG/7n+GgAWAAA=
SI5CgAFvAGcAAPgAQAPvCQAAWirJAMEA0AD8ALgATwHvAcEBFQKu+U/4NAAvAAU=
RxA9gAFvAGcAAA8BQAMRCrgJUCVbAkwCcgLNAoQCCQPjALIAIAGBAOD/GQAUAAA=
T+s1gAFvAGcAAPgAQAP0CQAATioEAfsADQHgAL4AIgEMAucBIgJe+bL4OAA0AAU=
H+EqgAFvAGcAAPgAQAP8CQAAQip0AXIBdgH0AswB6AOjAND/jgG0/1P9EgAAAAU=
CLUbgAFvAGcAAPgAQAMDCgAAJirIAa8B5AHoA+gD6APT/9L/0//IAa8BAAAAAA8=
3nAQgAFvAGcAAPgAQAMFCq4IHCqtAKUAswAyACoAQgBgAlQCZwKx8gfyQQBAAAA=
fDsKgAFvAGgAADEBQAMvCtYJOfgmAxsDNgM+AzADVwOKAHIAmAAZAgQCEQAPAA8=
YD4pgAFvAGgAADEBQAP2CQAAOfiCAXABlgHbA84D6APf/9D/7f9wAV4BAAAAAA8=
hCW9gAFvAGcAAOkAQAMgCjoAbh6xALEAswC9A7IDxQP7//L/BgB1AGUAAgACAA8=
HRv7gAFvAGcAAL4BQAP5CQAASBPCBbgFzAXoA+gD6APw/+//8P/CBbgFAAAAAA8=
lZPRgAFvAGcAANcAQAMqCnoJTiAoAhwCOALvAuICCAPGALEA0QCTAG4AFQAUAAA=
9AfcgAFvAGcAAE4BQAMdCgAAAMBUCEcIYwi1Aa8BuwHJAr4C1QJQA0oDjgCMAAA=
KHT7gAFuAGUAADwBQAMrCv0ItA9EADQAVADoA+gD6APK/8r/yv9DADQAAAAAAA8=
fcjsgAFvAGcAAK0BQAMdCqMJtg1OA0EDWwOHA3QDpANCACUAVwC6AqcCCgAHAA8=
LHArgAFvAGcAAJwBQAMLCsQJpBXhANAAAgHoA+gD6APO/83/zv/hANAAAAAAAA8=

[Image: payload structure. All multi-byte binary fields are ordered little endian.]

I also tried moving the two CRC bytes from the start to the end of the string and then calculating the result. It gave 0xB9AE, which is non-zero. The function itself performs the calculation correctly; I compared its result with an online CRC-16 calculator.

$new_string = mb_strcut($decoded_data,2,46).mb_strcut($decoded_data,0,2);
print $new_string;
print 'crc1: '.$this->crc16Calc($new_string).'  ';

Furqan S. Mahmoud
  • There are protocols that do what you describe, but to tell you how to use this particular one we would have to know a bit more about the protocol you are using. – Robert Harvey Jul 25 '20 at 20:10
  • My assumption would be that you have to split the data into the main payload and the checksum, then pass the payload (_without_ the checksum!) into `crc16Calc` and then check if the returned value equals the checksum you got in the data (not zero). So if you got `XXAABB` and know that the checksum is at the start, you'd verify that `crc16Calc('AABB') == 'XX'` - symbolically speaking. But to know in detail how it is supposed to work, we'd need some documentation about the protocol that is used here! – CherryDT Jul 25 '20 at 20:17
  • @CherryDT I updated the question, please take a look. – Furqan S. Mahmoud Jul 26 '20 at 12:53
  • You can't use `mb_strcut` for this. While it uses byte offsets, it treats the data as multi-byte characters, not as raw bytes. Use `substr`. Your example data behaves exactly as expected, giving a CRC of zero if the two bytes at the start are moved to the end. – Mark Adler Jul 26 '20 at 15:33
  • Also, while `substr($decoded_data,2,46)` will work, you really mean `substr($decoded_data,2,45)`, since your messages are 47 bytes in length. – Mark Adler Jul 26 '20 at 15:47
  • @MarkAdler Thanks, I will update that and will let you know. – Furqan S. Mahmoud Jul 26 '20 at 16:01

1 Answer

Just save the first two bytes, calculate the CRC on the rest, and then compare that to what you saved. That's far and away the most straightforward and most verifiable approach. Not to mention it is a smidge faster than what you want to do (even if you could), since you avoid calculating the CRC on two bytes that you don't need to.
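A minimal PHP sketch of this save-and-compare approach. It assumes the two CRC bytes sit at the start of the message in little-endian order and that the checksum is CRC-16/ARC, which the table in the question appears to implement. `crc16Arc` and `hasValidCrcPrefix` are hypothetical names, and the bitwise loop stands in for the table-driven `crc16Calc` only to keep the example short.

```php
<?php
// Bitwise CRC-16/ARC: reflected polynomial 0xA001, initial value 0, no final XOR.
// Assumption: this produces the same values as the question's table-driven crc16Calc().
function crc16Arc(string $data): int
{
    $crc = 0;
    $length = strlen($data);
    for ($i = 0; $i < $length; $i++) {
        $crc ^= ord($data[$i]);
        for ($bit = 0; $bit < 8; $bit++) {
            // Shift right one bit; apply the polynomial when the bit shifted out was 1.
            $crc = ($crc & 1) ? (($crc >> 1) ^ 0xA001) : ($crc >> 1);
        }
    }
    return $crc;
}

// Hypothetical helper: true when the first two bytes, read as a little-endian
// 16-bit integer, equal the CRC of the remaining bytes.
function hasValidCrcPrefix(string $message): bool
{
    if (strlen($message) < 3) {
        return false; // too short to hold a CRC plus at least one data byte
    }
    // 'v' = unsigned 16-bit, little-endian; unpack() returns a 1-indexed array.
    $expected = unpack('v', substr($message, 0, 2))[1];
    return crc16Arc(substr($message, 2)) === $expected;
}
```

In the question's `store()` loop this check would take the place of the compare-to-zero test, for example `if (!hasValidCrcPrefix($decoded_data)) { return "Invalid Data"; }`.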

You will get a zero on the whole thing only if the sender appended the CRC bytes to the end, and they appended them in little-endian order.

If you simply cannot resist the siren's call of that lovely algebraic property of the CRC, then move the two bytes to the end, in the proper byte order, and compute the CRC on the message plus CRC. The result should be zero.
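The zero-residue variant can be sketched as follows, under the same assumptions (CRC-16/ARC, CRC bytes stored little-endian at the start of the message). `crc16ArcBitwise` and `crcResidueIsZero` are hypothetical names. It uses `substr`, which works on raw bytes, to move the two bytes to the end.

```php
<?php
// Bitwise CRC-16/ARC: reflected polynomial 0xA001, initial value 0, no final XOR.
function crc16ArcBitwise(string $data): int
{
    $crc = 0;
    $length = strlen($data);
    for ($i = 0; $i < $length; $i++) {
        $crc ^= ord($data[$i]);
        for ($bit = 0; $bit < 8; $bit++) {
            $crc = ($crc & 1) ? (($crc >> 1) ^ 0xA001) : ($crc >> 1);
        }
    }
    return $crc;
}

// Hypothetical helper: move the two leading CRC bytes to the end, keeping their
// order, and check that the CRC over data plus CRC comes out as zero.
function crcResidueIsZero(string $message): bool
{
    $rotated = substr($message, 2) . substr($message, 0, 2);
    return crc16ArcBitwise($rotated) === 0;
}

// "123456789" is the standard check string; its CRC-16/ARC is 0xBB3D.
$body = '123456789';
var_dump(crcResidueIsZero(pack('v', 0xBB3D) . $body)); // little-endian CRC bytes
var_dump(crcResidueIsZero(pack('n', 0xBB3D) . $body)); // big-endian CRC bytes
```

The first call prints `bool(true)` and the second `bool(false)`: the residue is zero only when the appended CRC bytes are in little-endian order.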

Update for example data added to question:

The first two bytes of each 47-byte message are the expected CRC-16/ARC value, stored in little-endian order. The approaches given above work as stated, using the 45 bytes after the CRC as the data. The PHP CRC code is correct.

Mark Adler
  • Aren't most CRCs appended in big-endian order? – rcgldr Jul 26 '20 at 06:38
  • No. In zip and gzip it's little endian; in png it's big endian. Depends on the whims of the format designer. – Mark Adler Jul 26 '20 at 07:01
  • @MarkAdler I moved the first two bytes to the end and computed the CRC on the whole message, but I got a non-zero number! `$new_string = mb_strcut($decoded_data,2,46).mb_strcut($decoded_data,0,2); print $new_string; print 'crc1: '.$this->crc16Calc($new_string).' ';` – Furqan S. Mahmoud Jul 26 '20 at 10:52
  • And the `crc16Calc` function is giving correct values; I compared its result with the online CRC-16 calculator at http://www.sunshine2k.de/coding/javascript/crc/crc_js.html – Furqan S. Mahmoud Jul 26 '20 at 10:53
  • @MarkAdler I appreciate your help a lot, thanks for every single line you wrote to help me solve this. – Furqan S. Mahmoud Jul 26 '20 at 16:06
  • @MarkAdler - I deleted my prior comments. Brain fade on my part, as I was thinking CRC output with bits reversed, not bytes reversed (little endian instead of big endian). – rcgldr Jul 26 '20 at 19:29