2

I'm trying to parse a string of the form xxxxxx(xxxxx) using sscanf as follows:

sscanf(command, "%s(%s)", part1, part2)

but it seems that sscanf does not support this format; as a result, part1 ends up containing the whole string.

If anyone has experience with this, please share.

Thank you

Jonathan Leffler
user1011346

3 Answers

7

Converting your code into a program:

#include <stdio.h>

int main(void)
{
    char part1[32];
    char part2[32];
    char command[32] = "xxxxx(yyyy)";
    int n;

    if ((n = sscanf(command, "%s(%s)", part1, part2)) != 2)
        printf("Problem! n = %d\n", n);
    else
        printf("Part1 = <<%s>>; Part2 = <<%s>>\n", part1, part2);
    return 0;
}

When run, it produces 'Problem! n = 1'.

This is because the first %s conversion specifier skips leading white space and then scans for 'non white-space' characters up to the next white space character (or, in this case, end of string).

You would need to use (negated) character classes or scansets to get the result you want:

#include <stdio.h>

int main(void)
{
    char part1[32];
    char part2[32];
    char command[32] = "xxxxx(yyyy)";
    int n;

    if ((n = sscanf(command, "%31[^(](%31[^)])", part1, part2)) != 2)
        printf("Problem! n = %d\n", n);
    else
        printf("Part1 = <<%s>>; Part2 = <<%s>>\n", part1, part2);
    return 0;
}

This produces:

Part1 = <<xxxxx>>; Part2 = <<yyyy>>

Note the 31's in the format; they prevent overflows.


I'm wondering how %31 works. Does it work like %s and prevent overflow, or does it just prevent overflow?

With the given data, these two lines are equivalent and both safe enough:

    if ((n = sscanf(command, "%31[^(](%31[^)])", part1, part2)) != 2)
    if ((n = sscanf(command, "%[^(](%[^)])", part1, part2)) != 2)

The %[...] notation is a conversion specification; so is %31[...].

The C standard says:

Each conversion specification is introduced by the character %. After the %, the following appear in sequence:

  • An optional assignment-suppressing character *.
  • An optional decimal integer greater than zero that specifies the maximum field width (in characters).
  • An optional length modifier that specifies the size of the receiving object.
  • A conversion specifier character that specifies the type of conversion to be applied.

The 31 is an example of the (optional) maximum field width. The [...] part is a scanset, which could perhaps be regarded as a special case of the s conversion specifier. The %s conversion specifier is approximately equivalent to %[^ \t\n].

The 31 is one less than the size of the array; the field width does not count the terminating null that sscanf appends. Since part1 and part2 are each an array of 32 char, the %31[^(] and %31[^)] conversion specifications prevent buffer overflows. If there were more than 31 characters before the (, you'd get a return value of 1 because of a mismatch on the literal open parenthesis. Similarly, the second string would be limited to 31 characters, but you'd not easily be able to tell whether the ) was in the correct place or not.

Jonathan Leffler
  • Thank you very much. But I'm wondering how %31 works. Does it work like %s and prevent overflow, or does it just prevent overflow? sscanf reads a string by default. – user1011346 Sep 13 '12 at 04:25
2

If you know exactly how long the parts of your "command" are, then the simplest option is:

sscanf(command, "%6s(%5s)", part1, part2);

This assumes that 'part1' is always 6 characters long and 'part2' is always 5 characters long (as in your code sample).

sirgeorge
1

Try this instead:

#include <stdio.h>

int main(void)
{
  char str1[20];
  char str2[20];
  /* The 19s cap each field at 19 characters so the 20-byte arrays cannot overflow. */
  sscanf("Hello(World!)", "%19[^(](%19[^)])", str1, str2);
  printf("str1=\"%s\", str2=\"%s\"\n", str1, str2);
  return 0;
}

Output (ideone):

str1="Hello", str2="World!"
Alexey Frunze