6

I was originally parsing a file line by line using fgets().

Now I have changed things so that the entire file is already in a buffer. I would still like to read that buffer line by line for parsing purposes. Is there a function designed for this, or do I need to write a loop that scans for 0x0A chars myself?

roschach
Steven Lu
  • 2
    if you're using C++, have a look at [istringstream](http://www.cplusplus.com/reference/iostream/istringstream/) which allows you to 'read from a string in memory'. – Andre Holzner Sep 17 '11 at 15:49
  • 2
    You've tagged this `c` and `c++` ... which is it? And how is your buffer defined? – Brian Roach Sep 17 '11 at 15:49
  • Decide whether you want to write C code or C++ code. Then come back and ask how to do that. I'm not gonna waste my time on writing a C++ solution for you when you might dismiss it as "not C enough". – sbi Sep 17 '11 at 15:51
  • I am using C++ but I prefer to use the C functions like `fopen` when it is straightforward to implement using them. – Steven Lu Sep 17 '11 at 18:21
  • If you're writing a lexer, why not write it in `flex`? It's blazingly fast and gets it right every time. It will give you both C and C++ outputs, and you can write callbacks to your heart's content for either, invoked whenever a token is received. – Ahmed Masud Mar 21 '19 at 05:25

5 Answers

7

memchr (with a little bit of your own wrapper code, ending with memcpy) is the exact equivalent. Like fgets, it takes a maximum length to process (this should be the minimum of the remaining input buffer size and the size of your output buffer), and it scans until it hits the desired character (here '\n') or runs out of input/output space.
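As a rough sketch of the wrapper described above: the name mem_gets and its signature are made up for illustration (it is not a standard function), and it assumes the caller tracks how many bytes of input remain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a standard function): copy the next line from
 * src[0..src_len) into dst (capacity dst_size, at least 1), keeping the
 * '\n' as fgets does, and null-terminate the result.
 * Returns the number of input bytes consumed; 0 means no more input
 * (or a dst with room for the terminator only). */
static size_t mem_gets(char *dst, size_t dst_size, const char *src, size_t src_len)
{
    assert(dst != NULL && dst_size > 0 && src != NULL);

    /* Scan no further than the remaining input or the room left in dst. */
    size_t max = dst_size - 1;
    if (max > src_len)
        max = src_len;

    const char *nl = memchr(src, '\n', max);
    size_t n = nl ? (size_t)(nl - src) + 1 : max;

    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

Call it in a loop, advancing src and shrinking src_len by the return value until it returns 0. A line longer than dst comes back in pieces, just as with fgets.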

Note that for data already in a buffer in memory, though, you might want to skip the step of copying to a separate output buffer, unless you need to null-terminate the output without modifying the input. Beginner C programmers often make the mistake of thinking they need null termination, when it would really suffice to change some of their interfaces to take a (pointer, length) pair, which lets you pass and process substrings without copying them. For instance, you can pass them to printf using: printf("%.*s", (int)length, start);
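The no-copy approach could look something like this. It is only a sketch: next_line and print_lines are hypothetical names, and the (int) cast assumes no single line is longer than INT_MAX bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: locate the next line in buf[0..len) without copying.
 * Stores the line length (excluding the '\n') in *line_len and returns the
 * number of bytes to advance by; returns 0 when len is 0. */
static size_t next_line(const char *buf, size_t len, size_t *line_len)
{
    assert(buf != NULL && line_len != NULL);
    if (len == 0)
        return 0;
    const char *nl = memchr(buf, '\n', len);
    *line_len = nl ? (size_t)(nl - buf) : len;
    return nl ? *line_len + 1 : len;
}

/* Example consumer: prints every line of a buffer and returns the count. */
static size_t print_lines(const char *buf, size_t len)
{
    size_t count = 0, line_len, step;
    while ((step = next_line(buf, len, &line_len)) != 0) {
        printf("%.*s\n", (int)line_len, buf);  /* (pointer, length) pair */
        buf += step;
        len -= step;
        count++;
    }
    return count;
}
```

Nothing is copied and the input is never written to, so this also works on read-only data such as a memory-mapped file.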

R.. GitHub STOP HELPING ICE
  • Thanks for pointing out `memchr`. I have never used it but I have used `strrchr` before and this is quite convenient. Thanks – Steven Lu Sep 17 '11 at 18:23
2

You could use the sscanf function for this. If you actually need a whole line, something like this should do the trick:

sscanf(your_buffer, "%50[^\n]", line);

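(This will read at most 50 chars of a line, so line needs room for 51 bytes including the 0 terminator. As always, be careful with the lengths. And check the return value of sscanf: it is 0 for an empty line, because %[ has to match at least one character, and line is then left untouched.)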

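You can use pointer arithmetic to move along your buffer: add the length of the line you just read (strlen(line), or a count captured with %n) plus 1 for the newline. sscanf itself returns the number of converted items, not the length.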
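Putting the sscanf idea into a loop-friendly shape might look like the following. This is a sketch under a few assumptions: the buffer is null-terminated, scan_line is a made-up helper name, and %n is used to find out how many characters were consumed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one line (at most 50 chars, matching the width
 * in the format string) from the null-terminated string at *pos into line,
 * which must hold at least 51 bytes, and advance *pos past it.
 * Returns 1 if a line was read, 0 at end of input. */
static int scan_line(const char **pos, char *line)
{
    assert(pos != NULL && *pos != NULL && line != NULL);
    const char *p = *pos;
    int consumed = 0;

    if (*p == '\0')
        return 0;

    /* %[^\n] fails on an empty line, so treat a 0 return as "empty". */
    if (sscanf(p, "%50[^\n]%n", line, &consumed) == 1)
        p += consumed;
    else
        line[0] = '\0';

    if (*p == '\n')   /* step over the newline unless the line was truncated */
        p++;
    *pos = p;
    return 1;
}
```

A line longer than 50 chars is returned in several pieces, because the newline is only skipped once it is actually reached.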

Mat
0

For C, this should work, I think:

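#include <string.h>

// Like fgets(), but reads from a null-terminated string instead of a FILE *.
// str (in/out): the unconsumed portion of the input string
static char *sgets(char *buf, int n, const char **str)
{
    const char *s = *str;
    const char *lf = strchr(s, '\n');
    // Length of the next line, including its '\n' if there is one.
    size_t len = (lf == NULL) ? strlen(s) : (size_t)(lf - s) + 1;

    if (n <= 0 || len == 0)
        return NULL;            // no room in buf, or input exhausted
    if (len > (size_t)n - 1)
        len = (size_t)n - 1;    // truncate; the next call returns the rest

    memcpy(buf, s, len);
    buf[len] = 0;
    *str += len;
    return buf;
}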
John Lindgren
0

There is sscanf which may or may not work for you.

Joshua
0

If you're looking for C functions, strtok() and strsep() will both split a string on a set of delimiter characters. Note that strsep() is a BSD/glibc extension rather than standard C.

  • I know for sure that `strtok` will modify the underlying string, which may not really be suited to the task. What about `strsep`? – Matthieu M. Sep 17 '11 at 16:27
  • 1
    Yes, both functions modify the string in place. You'll need to cook something up with `strchr` / `memcpy` if you don't want that. –  Sep 17 '11 at 16:42
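To make the in-place modification concrete, here is a small sketch using strtok. The helper name count_lines_strtok is hypothetical, and the buffer must be writable and null-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the lines strtok() reports in a writable,
 * null-terminated buffer. The buffer is modified along the way: each '\n'
 * that ends a token is overwritten with '\0'. */
static size_t count_lines_strtok(char *text)
{
    assert(text != NULL);
    size_t count = 0;
    for (char *line = strtok(text, "\n"); line != NULL; line = strtok(NULL, "\n"))
        count++;
    return count;
}
```

Besides overwriting the newlines, strtok treats a run of delimiters as a single separator, so empty lines are silently skipped. For "one\n\ntwo\n" it reports two lines, which matters if your parser needs accurate line numbers.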