I'm thinking of using Ragel to generate a lexer for NMEA GPS data in an embedded system. I would have an arbitrary-sized buffer into which I'd read blocks of data from a UART, and for each read I'd pass that data into the lexer.

I'd like to be able to extract particular fields, but the problem is that I have no guarantee that an entire field is present in a block of data. Any field might be split across two reads, so setting pointers to the start and end of the field might leave the start pointer at the end of the previous (now overwritten) buffer, and the end pointer before it.

One solution that springs to mind is to use a '$' action on each field to push the characters one-by-one into another bit of memory (probably a struct field). Is that the best approach?

Isvara
  • think you want to use an extensible string wrapper so that you don't overflow things (for example what happens when you try and store ',' when you're already at the limit of `wptr` allocation). should be as simple as realloc when length reaches allocation capacity – amdixon Nov 27 '13 at 04:47
  • @amdixon I'm avoiding the use of malloc wherever possible (so far I don't use it at all). None of the fields will overflow, though -- they're all big enough. (I don't store ',' anywhere there, btw.) – Isvara Nov 27 '13 at 05:13

2 Answers

For what it's worth, I ended up with this:

%%{
    machine nmea;

    action store { *wptr = fc; }
    action append { *wptr++ = fc; }
    action term { *wptr++ = 0; }

    integer = digit+;
    float = digit+ '.' digit+;

    rmc = '$GPRMC,'
        float ','
        [AV] >{ wptr = &loc.valid; } $store ','
        float? >{ wptr = loc.lat; } $append %term ','
        [NS]? >{ wptr = &loc.ns; } $store ','
        float? >{ wptr = loc.lng; } $append %term ','
        [EW]? >{ wptr = &loc.ew; } $store
        print*
        '\n' >{ printf("%c, %s, %c, %s, %c\n", loc.valid, loc.lat, loc.ns, loc.lng, loc.ew); }
    ;

    main := any* rmc;
}%%
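
Because each character is copied as it is seen, a field that straddles two UART reads still ends up whole in the destination. With Ragel this relies on the state variable `cs` and the write pointer `wptr` living outside the function that runs once per block, so that they persist between calls. The plain C sketch below is a hand-written stand-in for the generated machine, not Ragel output; `field_copier`, its functions, and the 16-byte buffer are hypothetical names and sizes chosen only to illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the generated lexer: captures one
 * comma-separated field from a stream that arrives in blocks. */
struct field_copier {
    unsigned target;  /* index of the field to capture (0 = "$GPRMC") */
    unsigned index;   /* field the scanner is currently inside */
    size_t len;       /* characters stored so far */
    char out[16];     /* capture buffer, kept NUL-terminated; size is an assumption */
};

static void field_copier_init(struct field_copier *s, unsigned target)
{
    assert(s != NULL);
    memset(s, 0, sizeof *s);
    s->target = target;
}

/* Feed one block. All state lives in *s, so a field may start in one
 * call and finish in the next. */
static void field_copier_feed(struct field_copier *s, const char *p, size_t n)
{
    assert(s != NULL && p != NULL);
    for (const char *pe = p + n; p != pe; ++p) {
        if (*p == ',') {
            s->index++;                       /* field boundary */
        } else if (s->index == s->target &&
                   s->len + 1 < sizeof s->out) {
            s->out[s->len++] = *p;            /* copy, never point into the block */
            s->out[s->len] = '\0';
        }
    }
}
```

Feeding `"$GPRMC,123519.00,A,48"` and then `"07.038,N,011"` with `target` set to 3 leaves `"4807.038"` in `out`, even though the latitude was split across the two blocks.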
Isvara

You might want to add overflow protection to your code to avoid undefined behaviour on malicious or otherwise malformed input:

char buf[1024], *wptr = buf, *wmax = buf + sizeof(buf) - 2;

action append { if (wptr < wmax) *wptr++ = fc; }
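
When the characters go into several fixed-size struct fields rather than one buffer, the same guard can be applied per field by carrying an end pointer alongside the write pointer. The following is a minimal sketch under that assumption; `bounded_writer` and the `bw_*` functions are hypothetical helpers, not part of Ragel or of the code above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a write pointer paired with its limit. */
struct bounded_writer {
    char *ptr;  /* next write position */
    char *end;  /* last byte of the buffer, reserved for the terminator */
};

/* Call from the field's entering action, e.g. bw_begin(&w, loc.lat, sizeof loc.lat). */
static void bw_begin(struct bounded_writer *w, char *buf, size_t size)
{
    assert(w != NULL && buf != NULL && size > 0);
    w->ptr = buf;
    w->end = buf + size - 1;
}

/* Call from the per-character action; silently drops characters that do not fit. */
static void bw_append(struct bounded_writer *w, char c)
{
    if (w->ptr < w->end)
        *w->ptr++ = c;
}

/* Call from the leaving action; always in bounds because one byte was reserved. */
static void bw_term(struct bounded_writer *w)
{
    *w->ptr = '\0';
}
```

Dropping the excess characters keeps the parse going; whether truncation or rejecting the whole sentence is the better policy depends on the application.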
ArtemGr
  • You're right in general, but in this case it's a tightly controlled system with no chance for malicious or unexpected input. – Isvara Apr 28 '14 at 14:52
  • Thanks for the example, BTW. (AFAIK, it would've been better to post it as an *answer* so that people would know the question was answered). – ArtemGr Apr 28 '14 at 15:27
  • Okay, I moved it to an answer. – Isvara Apr 29 '14 at 02:49