The following code runs into an infinite loop. I know that using the %d format instead of %zu to read a size_t with scanf is wrong. But why is the condition TRUE? How?

  size_t c;
  scanf("%d", &c);
 
  for (size_t i = 0; i < c; i++)
    printf("%d\n", i);

If I change the type of i to int, the problem is solved. But why?

Another example:

  size_t c;
  size_t b = 10;

  scanf("%d", &c);
  printf("%s\n", b < c ? "TRUE" : "FALSE");
  printf("c: %p\n", c);
  printf("b: %p\n", b);

Output:

$ ./a.out
100
TRUE
c: 0x7f6400000064
b: 0xa

How can I understand this problem and the mechanism behind it? Please help me figure out what to search for.

Compiler details:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/12.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-bootstrap --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-werror --with-build-config=bootstrap-lto --enable-link-serialization=1
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.1.0 (GCC) 

OS:

archlinux
Linux developer 5.18.6-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 22 Jun 2022 18:10:56 +0000 x86_64 GNU/Linux
alirezaarzehgar
  • 5
    `%d` is for reading `int` and only `int`. Using incorrect type causes undefined behaviour – M.M Sep 25 '22 at 11:37
  • 1
    To explain what seems to actually be happening: `size_t` is typically a 64-bit unsigned type (on 64-bit systems at least), while `int` is usually 32 bits. That means `scanf("%d", &c)` will only write to 32 bits of the variable `c`, leaving the other 32 bits uninitialized and with an indeterminate value. – Some programmer dude Sep 25 '22 at 11:41
  • @M.M OK, I know this code causes undefined behavior. But why? How can this code affect the comparison of two variables? After printing `c`, we can see its value is correct. But why? – alirezaarzehgar Sep 25 '22 at 11:42
  • 1
    Also note that with e.g. `printf("c: %p\n", c)` you again use mismatching format specifier and argument type. If you want to print a `size_t` as hexadecimal, use `%zx`. – Some programmer dude Sep 25 '22 at 11:42
  • 1
    You say that "[a]fter printing c, we can see its value is correct." In fact it's *not* correct: the output says that `c` is `0x7f6400000064`, while it should be `0x64`. Those leading digits are part of the value, and they are explained by the 32/64-bit mismatch I described above. – Some programmer dude Sep 25 '22 at 11:44
  • What should I search for to find the cause of this behavior? – alirezaarzehgar Sep 25 '22 at 11:48
  • 4
    @alirezaarzehgar read the language standard. It says that `%d` is for `int` only. The cause of the problem is that your program does not comply with the rules of the language, so you cannot expect any particular behaviour – M.M Sep 25 '22 at 11:55
  • 2
    Enable compiler warnings. It should've told you about this. – HolyBlackCat Sep 25 '22 at 12:23

2 Answers

1

Your code has undefined behavior because %d expects a pointer to int, not a pointer to size_t, which has a different representation in memory:

  • on most 64-bit systems, size_t has 64 bits and int only 32 bits, so size_t c; scanf("%d", &c); at best only modifies half of the stored value of c, namely the low-order word on little-endian systems. Since c is uninitialized, the high-order word can have any value, and almost any nonzero value there makes c greater than b.

The above explanation is tentative: the behavior is undefined, so something else could happen. Use %zu to read and convert values of type size_t and enable all warnings (gcc -Wall -Wextra or clang -Weverything) to let the compiler detect such mistakes:

#include <stdio.h>

int main(void) {
    size_t c;
    size_t b = 10;

    if (scanf("%zu", &c) == 1) {
        printf("%s\n", b < c ? "TRUE" : "FALSE");
        printf("c: 0x%zx\n", c);
        printf("b: 0x%zx\n", b);
    }
    return 0;
}
chqrlie
1

The behavior of using %d to read a size_t item is undefined: neither the compiler nor the runtime environment is required to handle the situation in any particular way. The result can quite literally be anything. Your code could crash outright, you could get garbled input or output, or your code could work as expected, and each of those outcomes would be considered equally "correct" as far as the language is concerned.

Most likely the problem is that size_t is larger than an int, so reading an input with %d only affects the lower sizeof (int) bytes, but the upper bytes are unaffected. Since you don't initialize c, it contains some indeterminate, most likely non-zero value. If c is 8 bytes wide and had an initial value of 0xFFFFFFFFFFFFFFFF, then after a scanf with %d and an input of 100, it would (likely) have a value of 0xFFFFFFFF00000064.

scanf doesn't know that c is a size_t - it only knows that you told it to read an int value into the first sizeof (int) bytes of the first argument.

Use the right conversion specifiers for both input and output, always. Use %zu for reading and writing size_t values, use %d for int values, use %p for pointer values, etc.

John Bode
  • Thanks! How can I learn C at this level of detail? Is there any reference or book for C that covers these tips? – alirezaarzehgar Sep 25 '22 at 16:40
  • 1
    [This answer](https://stackoverflow.com/questions/562303/the-definitive-c-book-guide-and-list/562377#562377) has links to a number of references, some good, some out of date, some awful. But be aware, a lot of my knowledge is just gained through experience; some of this stuff just takes time before it really makes sense. I think I had been writing C for more than 10 years or so before I really *understood* how declaration syntax worked. – John Bode Sep 25 '22 at 18:59