
How to use a scanf width specifier of 0?
1) unrestricted width (as seen with Cygwin gcc version 4.5.3)
2) UB
3) something else?

My application (not shown) dynamically forms the width specifier as part of a larger format string for scanf(). Rarely, it creates a "%0s" in the middle of the format string. In this context, the destination string for that %0s has just 1 byte of room, enough for scanf() to store only a \0, so behavior #1 above overruns the buffer.

Note: The following test cases use constant formats.

#include <stdio.h>
#include <string.h>  /* memset() */

/* Scan Src with Format into a small buffer and report the result. */
void scanf_test(const char *Src, const char *Format) {
  char Dest[10];
  int NumFields;
  memset(Dest, '\0', sizeof(Dest));  /* clear the whole buffer */
  NumFields = sscanf(Src, Format, Dest);
  printf("scanf:%d Src:'%s' Format:'%s' Dest:'%s'\n", NumFields, Src, Format, Dest);
}

int main(void) {
  scanf_test("1234" , "%s");
  scanf_test("1234" , "%2s");
  scanf_test("1234" , "%1s");
  scanf_test("1234" , "%0s");  /* the case in question */
  return 0;
}

Output:

scanf:1 Src:'1234' Format:'%s' Dest:'1234'  
scanf:1 Src:'1234' Format:'%2s' Dest:'12'  
scanf:1 Src:'1234' Format:'%1s' Dest:'1'  
scanf:1 Src:'1234' Format:'%0s' Dest:'1234' 

My question is about the last line. It seems that a 0 width results in no width limitation rather than a width of 0. If this is either correct behavior or UB, I'll have to handle the zero-width situation another way. Or are there other scanf() formats to consider?

MOHAMED
chux - Reinstate Monica
  • It looks like `%0s` is evaluated as `%s`; the 0 is ignored. But I could not confirm that. I do not know whether any standard mentions this behaviour. – MOHAMED May 29 '13 at 16:12
  • There is a `*` format spec to suppress assignment. Would that help? –  May 29 '13 at 16:39
  • @mf_ UB stands for undefined behavior. – chux - Reinstate Monica May 29 '13 at 17:38
  • @Arkadiy The `*` format spec to suppress assignment is a novel idea. It would prevent the 0 width destination string from receiving data, but `scanf()` would unfortunately still consume the input string, leaving nothing for the next format specifier, whose address would not be in the right place. – chux - Reinstate Monica May 29 '13 at 18:12

2 Answers


The maximum field width specifier must be non-zero. C99, 7.19.6.2:

The format shall be a multibyte character sequence, beginning and ending in its initial shift state. The format is composed of zero or more directives: one or more white-space characters, an ordinary multibyte character (neither % nor a white-space character), or a conversion specification. Each conversion specification is introduced by the character %. After the %, the following appear in sequence:
— An optional assignment-suppressing character *.
— An optional nonzero decimal integer that specifies the maximum field width (in characters).
— An optional length modifier that specifies the size of the receiving object.
— A conversion specifier character that specifies the type of conversion to be applied.

So, if you use 0, the behavior is undefined.

jxh

This came from 7.21.6.2 of n1570.pdf (C11 standard draft):

After the %, the following appear in sequence:

— An optional assignment-suppressing character *.

— An optional decimal integer greater than zero that specifies the maximum field width (in characters).

...

It's undefined behaviour, because the C standard states that your maximum field width must be greater than zero.

An input item is defined as the longest sequence of input characters which does not exceed any specified field width and ...

What is it you wish to achieve by reading a field of width 0 and assigning it as a string (empty string) into Dest? Which actual problem are you trying to solve? It seems clearer to just assign directly: *Dest = '\0';.

autistic
  • *undefined behaviour The format for `scanf()` in this application is dynamically generated depending on the space available for the destination string, along with other fields. The program does not know ahead of time that the width needs to be 0. Thus using `*Dest = '\0';` does not work in the general case, but can be used in handling the exceptional width = 0 case. – chux - Reinstate Monica May 29 '13 at 17:07
  • @chux Well, my program sorts a deck of cards... but what purpose does it serve? Why is it needed? Which actual problem are you trying to solve? I'm not asking you to describe your solution; I'm asking you to describe the problem. – autistic May 29 '13 at 17:29
  • The application used text files to control the operation of production programs testing a wide variety of products and the products' functionality. In parsing configuration files, there can exist other configurations that dictate the maximum text length expected for certain fields. It is these other configuration files that were allowed to specify a width of 0, and that 0 was passed on to the general parser. When a 0 width was passed, only 0+1 bytes were allocated to receive the data. As described, too many bytes were saved. In the end, configuring for a width of 0 is now verboten. – chux - Reinstate Monica May 29 '13 at 17:54
  • user315052's C99 reference and your C11 draft reference (**available online**) are turning out to be the best part of these answers. I can look up these niche C issues directly. RTFM - Read the _fantastic_ manual! – chux - Reinstate Monica May 29 '13 at 19:49