I am writing a Wireshark protocol dissector in Lua. The protocol it parses contains a CRC16 checksum, and the dissector should check whether the CRC is correct.

I have found a crc16 implementation written in C, along with Lua wrapper code, here. I have successfully compiled and run it (e.g. crc16.compute("test")). The problem is that it expects a string as input. From Wireshark, I get a buffer that seems to be of Lua type userdata. So when I do

crc16.compute(buffer(5, 19))

Lua complains bad argument #1 to compute (string expected, got userdata).

compute() in the crc16 implementation looks like this:

static int compute(lua_State *L)
{
    const char *data;
    size_t len = 0;
    unsigned short r, crc = 0;

    data = luaL_checklstring(L, 1, &len);

    for ( ; len > 0; len--)
    {
        r = (unsigned short)(crc >> 8);
        crc <<= 8;
        crc ^= crc_table[r ^ *data];
        data ++;
    }

    lua_pushinteger(L, crc);
    return 1;
}

It seems luaL_checklstring fails. So I guess I would either need to convert the input into a Lua string, which I am not sure would work, as not all bytes of my input are necessarily printable characters, or I would need to adjust the above code so that it accepts input of type userdata. I found lua_touserdata(), but this seems to return something like a pointer, so I would need a second argument for the length, right?

I don't necessarily need to use this implementation. Any crc16 implementation for lua that accepts userdata would perfectly solve the problem.

lhf
Christopher K.
  • Perhaps wireshark provides a method to convert the buffer to a Lua string? Perhaps `__tostring`? If so, you can use `lua_tostring` instead of `luaL_checklstring`. – lhf Apr 04 '17 at 16:28
  • If I do `uint8_t * data = (uint8_t *) lua_tostring(L,1); lua_pushinteger(L, data[0]);` then wireshark crashes. If I do `uint8_t * data = (uint8_t *) lua_tolstring(L,1, &len); lua_pushinteger(L, len);` then I get `0` length. – Christopher K. Apr 05 '17 at 09:24
  • Also, my data might contain multiple bytes that are zero, and lua strings are terminated at the first zero, right? Wouldn't tostring terminate at the first zero? – Christopher K. Apr 05 '17 at 09:40
  • Strings in Lua can have embedded zero bytes. Use `lua_tolstring` to get the buffer and the length. – lhf Apr 05 '17 at 11:29
  • Thank you for your tips. I now got a working solution. With `tostring(buffer(5, 19):bytes())` you can get a string representation of the byte array. It is an ASCII representation of the hex encoded data, i.e. 2 ascii characters per byte. So in C you need to decode the ascii again before computing the crc. I will post the complete solution later. – Christopher K. Apr 05 '17 at 12:45
  • Or just send `buffer(5, 19):bytes()` to your C function. – lhf Apr 05 '17 at 12:49
  • As a quick test, could you try: crc16.compute(buffer(5, 19):bytes()) – Christopher Maynard Apr 07 '17 at 16:46
  • I have crc checking routines implemented in Lua, and while they work, they're not particularly fast. It would be much better to have Wireshark support the necessary bindings in order to improve performance. I would encourage you to submit the code to Wireshark so everyone could benefit from it ... once the code is proven to work that is. – Christopher Maynard Apr 07 '17 at 16:53

2 Answers

The buffer that you get from wireshark can be used as a ByteArray like this:

byte_array = buffer(5, 19):bytes()

ByteArray has a __tostring metamethod that converts the bytes into a string of hex digits. So you can call the crc function like this:

crc16.compute(tostring(byte_array))

'A string of hex digits' means that an input byte with the bits 11111111 turns into the ASCII string FF. The ASCII string FF is 01000110 01000110 in bits, or 46 46 in hex. So what you get in C is not the original byte array. You need to decode the ASCII representation back into the original bytes before computing the CRC; otherwise you will get a different CRC. First, this function converts a single character c containing one ASCII hex digit back into the value it represents:

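#include <ctype.h>

/* Convert one ASCII hex digit to its value (0-15); returns 0 for any other character. */
static char ascii2char(char c) {
    c = (char) tolower((unsigned char) c);
    if(c >= '0' && c <= '9')
        return c - '0';
    else if(c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return 0;
}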

Now in the compute function we loop through the string representation, always combining two characters into one byte.

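/* Requires <stdint.h>, <stdlib.h>, and the Lua headers <lua.h> and <lauxlib.h>. */
int compute(lua_State *L) {
    size_t len;
    /* Raises a Lua error if argument 1 is not a string, instead of returning NULL. */
    const char * str = luaL_checklstring(L, 1, &len);
    size_t nbytes = len/2;   /* two hex characters per byte */
    uint8_t * data = (uint8_t *) malloc(nbytes);
    if(data == NULL && nbytes > 0)
        return luaL_error(L, "out of memory");

    for(size_t n=0; n<nbytes; n++) {
        data[n] = (uint8_t) (ascii2char(str[2*n]) << 4);
        data[n] |= (uint8_t) ascii2char(str[2*n+1]);
    }

    crc16_t crc = crc16_init();
    crc = crc16_update(crc, data, nbytes);
    crc = crc16_finalize(crc);

    lua_pushinteger(L, crc);
    free(data);
    return 1;
}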

In this example, I used the crc functions crc16_init, crc16_update and crc16_finalize generated using pycrc, not the crc implementation linked in the question. The catch is that you need to use the same polynomial, initial value, and other parameters as were used when the CRC was generated. Pycrc allows you to generate crc functions as needed. My packets also contain a crc32. Pycrc can generate code for crc32 as well, so it works the same way for crc32.
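To illustrate why the parameters matter, here is a minimal sketch of a bitwise CRC-16 in which the polynomial and initial value are arguments. The function name crc16_msb_first is made up for this example, and it assumes an MSB-first CRC with no reflection and no final XOR; it is not the code pycrc generates, and your protocol may use different settings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise, MSB-first CRC-16 with no input/output reflection and no final XOR.
 * poly and init are the parameters that must match what the sender used.
 * (Hypothetical helper for illustration, not part of pycrc or Wireshark.) */
static uint16_t crc16_msb_first(const uint8_t *data, size_t len,
                                uint16_t poly, uint16_t init)
{
    uint16_t crc = init;
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        /* Feed the next byte into the top 8 bits of the register. */
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u)
                crc = (uint16_t)((crc << 1) ^ poly);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

With the same polynomial 0x1021 over the input "123456789", an initial value of 0xFFFF gives 0x29B1 while an initial value of 0x0000 gives 0x31C3, so a dissector that guesses the wrong parameters will flag every packet as corrupt.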

Christopher K.
Christopher K. outlines a mostly correct answer, but converting the hex values back into bytes seemed like hard work, which got me looking further, as I was searching for something like this myself.

The trick that was missed is that, as well as calling the function with buffer:bytes(), you can also call

buffer:raw()

This provides exactly what is needed: a plain Lua string (type LUA_TSTRING) holding the raw bytes, which can be processed directly without the ASCII conversions that would, I imagine, add significantly to the load in the C code.
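As a rough sketch of the C side under this approach: the wrapper only needs luaL_checklstring, because the length it returns covers embedded zero bytes. The names crc16_from_raw and WITH_LUA_BINDING are invented for this example, and the CRC parameters (polynomial 0x1021, initial value 0x0000) are an assumption to be replaced with your protocol's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC-16 over a raw byte string of explicit length.
 * Assumed parameters: poly 0x1021, init 0x0000, no reflection, no final XOR. */
static uint16_t crc16_from_raw(const char *raw, size_t len)
{
    uint16_t crc = 0x0000;
    assert(raw != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        /* Go through unsigned char so bytes >= 0x80 are not sign-extended. */
        crc ^= (uint16_t)((uint16_t)(unsigned char)raw[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
    }
    return crc;
}

#ifdef WITH_LUA_BINDING  /* hypothetical macro: define it when building against Lua */
#include <lua.h>
#include <lauxlib.h>

/* Lua entry point: takes the raw string and returns the CRC as an integer. */
static int compute(lua_State *L)
{
    size_t len = 0;
    const char *raw = luaL_checklstring(L, 1, &len);  /* len counts zero bytes too */
    lua_pushinteger(L, crc16_from_raw(raw, len));
    return 1;
}
#endif
```

On the Lua side the call would then look like crc16.compute(buffer(5, 19):raw()), assuming a Wireshark version whose TvbRange provides raw().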

David