2

Here is the C code:

char **s;
s[334]=strdup("test");
printf("%s\n",s[334]);

I know that strdup allocates the copy of "test", but the slot s[334], where the pointer to that copy is stored, was never allocated. Nevertheless, this code works like a charm.

Amine Hajyoussef
  • Specifically, s is allocated on the stack, and so it might actually contain a legal pointer, depending on what you did in other functions before. – cha0site Jan 14 '12 at 11:19
  • There are no other functions; main contains only these three lines. – Amine Hajyoussef Jan 14 '12 at 11:21
  • What signature did you give main? `int main()`, or `int main(int argc, char ** argv)`? – cha0site Jan 14 '12 at 11:22
  • The signature used is `int main()`, compiled using gcc. – Amine Hajyoussef Jan 14 '12 at 11:24
  • OK, you're right, there is something _weird_ going on here... – cha0site Jan 14 '12 at 11:33
  • Who knows what you are overwriting... `s` points at some "random" memory location (really, it contains whatever happened to be the variable's contents in previous function calls). If you are lucky, that is illegal as the address of a `char *`, or the `char *` points somewhere that can't be written to, and KA-BOOM; if you are unlucky, it will write somewhere innocuous; if you are *extremely* unlucky it will overwrite something critical with garbage, and your program will silently malfunction a few days later. You have three guesses as to what Murphy's law will choose... – vonbrand Jan 20 '13 at 18:35

5 Answers

5

Your code exhibits undefined behavior. That does not mean it will crash. All it means is that you can't predict anything about what will happen.

A crash is rather likely, but not guaranteed at all, in this case.

Mat
3
  1. You don't always get a segmentation fault if you access uninitialized memory.

  2. You do access uninitialized memory here.

Igor
  • It's not unallocated memory, it's uninitialized memory, specifically, `s`. Whatever `s` happens to point at might as well be allocated memory. – cha0site Jan 14 '12 at 11:21
  • Where "unallocated memory" can mean memory that happens to be allocated to something else. – Keith Thompson Jan 14 '12 at 11:22
3

"Undefined behaviour" doesn't mean you'll get a segfault, it means you might get a segfault. A conforming implementation might also decide to display ASCII art of a puppy.

You might like to check this code with a tool like Valgrind.

cha0site
  • Valgrind doesn't find this error, though. It's weirder than I thought. I think I'll open another question. – cha0site Jan 14 '12 at 11:34
  • This is also dependent on optimization levels... This is some interesting undefined behaviour, as `s` is always `0`. – cha0site Jan 14 '12 at 11:37
  • It works as long as the index (334 here) is below 4098 (maybe that is the limit of the address space). – Amine Hajyoussef Jan 14 '12 at 11:40
  • I think the OP should file a bug against the compiler because he doesn't get that ASCII puppy. – Hot Licks Jan 14 '12 at 14:29
  • If `s` is zero, then `s[334]`, depending on memory layout, is the address of some variable (or even executable code), which could just happen to contain a valid pointer to a writable memory area, which gets overwritten with junk. To find out exactly what would happen you'd need to know the memory layout of that program, and the contents of said memory. An interesting project for a slow weekend's afternoon, perhaps. (This kind of analysis is what you see when a vulnerability's exploit is explained). – vonbrand Jan 20 '13 at 18:49
2

I get a segfault without optimisations, but when compiled with optimisations, gcc doesn't bother with the s at all, it's eliminated as dead code.

gcc -Os -S:

.cfi_startproc
subq    $8, %rsp
.cfi_def_cfa_offset 16
movl    $.LC0, %edi     # .LC0 is where "test" is at
call    strdup
addq    $8, %rsp
.cfi_def_cfa_offset 8
movq    %rax, %rdi
jmp     puts
.cfi_endproc

gcc -S -O (same for -O2, -O3):

.LFB23:
    .cfi_startproc
    subq    $8, %rsp
    .cfi_def_cfa_offset 16
    movl    $5, %edi
    call    malloc
    movq    %rax, %rdi
    testq   %rax, %rax
    je      .L2
    movl    $1953719668, (%rax)
    movb    $0, 4(%rax)
.L2:
    call    puts
    addq    $8, %rsp
    .cfi_def_cfa_offset 8
    ret
    .cfi_endproc
Daniel Fischer
  • I'm not very familiar with assembly language. Can you explain the behavior of gcc when compiling with optimization? – Amine Hajyoussef Jan 14 '12 at 13:10
  • I'm not very familiar with assembly either, but as far as I can tell, -Os does: subtract 8 from stack pointer, move address of string into register, strdup string, add 8 back to stack pointer, move pointer obtained from strdup to another register, jump to puts to output string. -O1/2/3 does: subtract 8 from stack pointer, move literal 5 to register %edi, malloc (5 bytes), move obtained pointer to %rdi, check for NULL, move literal "test" (as a 4-byte little-endian integer) to the allocated memory, move the 0-terminator at the end, call puts, add 8 back to stack pointer. – Daniel Fischer Jan 14 '12 at 13:39
  • The optimizer transforms the program into `puts(strdup("test"));` – Bo Persson Jan 14 '12 at 14:03
  • @BoPersson Yes, thanks for confirmation. However, I don't understand why it bothers with the `strdup()` at all. It should be able to make it `puts("test");`, shouldn't it? – Daniel Fischer Jan 14 '12 at 14:12
  • @Daniel - Yes, but that is perhaps not common enough to have been put into the optimizer? `printf("%s\n"` **is** common. – Bo Persson Jan 14 '12 at 14:20
  • So if I use `LD_PRELOAD` to override the `printf` function and then run a program compiled with an optimization that turns `printf` into `puts`, I will be in for a surprise? – Mattias Wadman Jan 15 '12 at 23:33
  • @Mattias Give it a try, I'm too lazy to check. I wouldn't be surprised either way. – Daniel Fischer Jan 15 '12 at 23:43
  • Looks like I would be surprised, `puts` symbol shows up in `nm` http://www.ciselant.de/projects/gcc_printf/gcc_printf.html You learn something every day :) – Mattias Wadman Jan 15 '12 at 23:49
2

The compiler is too smart for us! It knows that printf("%s\n", some_string) is exactly the same as puts(some_string), so it can simplify

char **s;
s[334]=strdup("test");
printf("%s\n",s[334]);

into

char **s;
s[334]=strdup("test");
puts(s[334]);

and then (assuming no UB) that is again equivalent to

puts(strdup("test"));

So, by chance the segmentation fault didn't happen (this time).

Bo Persson
  • The compiler smarts don't make any difference here. Besides, the assignment to `s[334]` can't just be eliminated. – vonbrand Jan 20 '13 at 18:37
  • @vonbrand - If you look at the assembly in Daniel Fischer's answer, that is exactly what it does. – Bo Persson Jan 20 '13 at 19:47