Seemingly invalid memory access not reported by CMBC

Question

I have following code snippet.

#include<stdio.h>
#include<stdlib.h>
int main()
{
        char *c = malloc(1);
        printf("%p\n", c);
        c = c + 20;

        printf("%p\n", c);
        printf("%d\n", *c);

        free(c - 20);
        return 0;
}

In this code, I'm allocating 1 byte of memory to a pointer. Then I'm accessing a memory location which 20 unit after allocated memory. When I dereference that pointer, I'm expected to get a Memory Access Violation error or Segmentation Fault or something like that. I'm not getting any such error.

Let us assume this is a case of undefined behaviour. So I tried to verify this program using CBMC, a well known model checker using following command.

cbmc test01.c --pointer-check

CBMC is reporting that program is safe. Is it an issue with CBMC or am I missing something?

It'd be interesting to see what `g++` with -`fsanitize=address` says about this. I can't see any way that this would be "OK"; perhaps this is a CBMC bug? — OMGtechy, Mar 31 '16 at 09:44
If the access was still within the heap allocation you won't get a Memory Access violation. C does not hold your hand with array indexing. — Weather Vane, Mar 31 '16 at 09:44
`I'm expected to get a Memory Access Violation error. I'm not getting that error.` What make you think there is protected memory at `c + 20`? Use `(void *)c` for `%p` — David Ranieri, Mar 31 '16 at 09:45
BTW it's a bad idea to alter a pointer value that will be later required to `free` it, better would be `printf("%d\n", *(c+20));` or `printf("%d\n", c[20]);` — Weather Vane, Mar 31 '16 at 09:47
I would guess the "well known model checker", is not as good as popular. — ikrabbe, Mar 31 '16 at 09:52
The problem is: this is undefined behavior, but you expect a certain behavior. There's thousand of similar nonsense experiments all over SO. Just stop doing things like this, there is nothing to learn from it except knowing it is wrong. — Lundin, Mar 31 '16 at 11:12
@niyasc, bit of a late follow up, but I've just been experimenting with CBMC 5.8 and I'm getting `pointer outside dynamic object bounds in *c` as expected. I sometimes see other strange behaviour from CBMC, but your example works for me as expected. Running with `--bounds-check --pointer-check --trace` gives more information — Sam Mason, Mar 18 '18 at 11:52

axiac · Answer 1 · 2016-03-31T10:32:03.183

As you said in your question, the statement printf("%d\n", *c); exposes undefined behaviour. Being undefined, any expectation you might have against it is false. This includes getting a specific error or any error at all.

The C runtime library doesn't check the way your program accesses the memory. If it would do it, the program would run much, much slower. Regarding the heap, the C runtime library does some basic checks; for example, it puts a certain value at address 0 during the program startup and checks if the value is still there when the program ends. If the value changed then a null pointer was de-referenced for writing and it can warn you about this.

After c = c + 20;, c most probably points to a block of memory that belongs to your program. It can be a free area on the heap, it can be inside the data structures used by the heap manager to handle the heap, but there are big chances that it is still on the same memory page.

If you have (bad) luck and c + 20 lands outside the memory page that stores c then an exceptional situation happens that is handled by the operating system. It terminates the program and displays an error message similar with the one you listed in the question (the idea is the same, the words and the presentation differ on each OS).

Update

Allocating memory is not some kind of magic. The program starts with a block of memory (called "heap") that is assigned by the OS to the program for this purpose.

The C runtime library contains code that manages the heap. This code uses a small part of this memory for its bookkeeping. A common implementation uses double-linked lists, the payload of each node from the list being a block of memory "allocated" by the program using the functions declared in <memory.h> (malloc(), calloc() etc). When a call to malloc() happens, this code runs, creates a new node in the list and returns the address of the node's payload (an address inside the heap).

The program uses this pointer as it wants. Your program is free to write at c-1, for example. In fact, inside malloc() it actually wrote information there. After malloc() returns c, your code can also write at c-1. From the OS point of view, there is no difference between these two write operations. And because C is not a managed or interpreted language, there is no code included in your program that watches what the code you wrote does or hold its hand to not write in the wrong place.

If you write at c-1 there is a big chance that you corrupt the data structures used by the heap manager. Nothing wrong happens immediately. No error message is displayed and your program continues to run apparently normal. But on the next call to a function that handles the heap (be it a memory allocation or a release) the program will start wreaking havoc. The heap data structures being corrupt anything could happen.

Regarding CBMC, I don't know how it works. Maybe it cannot detect such situations. Or maybe it reports your program as being safe because it doesn't write at c after it was incremented.

Such *undefined* behaviour is expected. My specific question is why `CBMC` is not reporting this problem. — niyasc, Mar 31 '16 at 09:53
@niyasc you could follow the link on the CMBC page, he gives an email address. — Weather Vane, Mar 31 '16 at 10:05
@niyasc When you expect something, that something is not undefined anymore. I updated the answer with a brief explanation of how the heap works and why reading from `c+20` doesn't trigger any error message. — axiac, Mar 31 '16 at 10:33

score 0 · Answer 2 · answered Mar 31 '16 at 11:12

BTW. gcc -fsanitize=address does compile this file without warning, but when you run the code you will get a message from the address sanitzer

=================================================================
==24198==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000f004 at pc 0x400921 bp 0x7ffe1e66b900 sp 0x7ffe1e66b8f0
READ of size 1 at 0x60200000f004 thread T0
    #0 0x400920 in main (/home/ingo/test/c/sanitize_address+0x400920)
    #1 0x7fdd9ddca7af in __libc_start_main (/lib64/libc.so.6+0x207af)
    #2 0x4007d8 in _start (/home/ingo/test/c/sanitize_address+0x4007d8)

0x60200000f004 is located 19 bytes to the right of 1-byte region [0x60200000eff0,0x60200000eff1)
allocated by thread T0 here:
    #0 0x7fdd9e19c7b7 in malloc (/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.3/libasan.so.1+0x577b7)
    #1 0x4008b7 in main (/home/ingo/test/c/sanitize_address+0x4008b7)
    #2 0x7fdd9ddca7af in __libc_start_main (/lib64/libc.so.6+0x207af)

SUMMARY: AddressSanitizer: heap-buffer-overflow ??:0 main
Shadow bytes around the buggy address:
  0x0c047fff9db0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff9dc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff9dd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff9de0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff9df0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa 01 fa
=>0x0c047fff9e00:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff9e10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff9e20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff9e30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff9e40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff9e50: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Contiguous container OOB:fc
  ASan internal:           fe
==24198==ABORTING

such an output can be quite helpfull to find such leaks and overruns.

But it is not a simple task for a compiler to find such errors that occur during runtime.

Seemingly invalid memory access not reported by CMBC

2 Answers2