2

I have been trying to reason about a loop over a dinamically-allocated array with a non-deterministic size, but I haven't been able to convince Frama-C/Eva of the\validity of memory that is returned by malloc.

After boiling it down to:

#include <stdio.h>
#include <stdlib.h>
#include "__fc_builtin.h"
int main(void) {
  int i,sz; char*arr;
  sz = Frama_C_interval(1,2);
  arr = (char*) malloc(sz*sizeof(char));
  arr[0]='a';
  arr[sz-1]='b';
}

when run in Frama-C 24.0 Chromium/amd64 with opts frama-c -eva -eva-no-alloc-returns-null, Eva fails to prove that \valid(arr+sz-1).

I am quite intrigued about why this is the case as everything seems ok in the intermediate values. When replacing sz for a concrete size (1 or 2 for instance) the proof goes trivially.

hpacheco
  • 235
  • 1
  • 8
  • 1
    I'd suggest leaving the `#includes` in your example, otherwise people trying to just copy-paste it in Frama-C will have different results. In particular, I'd keep `#include ` (for `malloc`) and `#include "__fc_builtin.h"` (for `Frama_C_interval`). – anol Feb 15 '22 at 09:41

1 Answers1

2

The problem comes from the fact that Eva's abstract domains do not keep track of the relation that exists between sz and the size of the allocated block at which arr points to. Basically, with -eva-no-alloc-returns, Eva knows that arr is a block of 1 or 2 elements, thus the assignment to arr[0] succeeds. On the other hand, all it knows is that sz-1 is between 0 and 1, but not that sz-1 is equal to 1 only if arr has two elements, hence the warning on the last line.

In your particular example, it is possible to preserve two separated states, one for sz==1 and one for sz==2:

#include <stdlib.h>
#include "__fc_builtin.h"
int main(void) {
  int i,sz; char*arr;
  sz = Frama_C_interval(1, 2);
  //@ split sz;
  Frama_C_show_each_1(sz);
  //@ eva_allocate fresh;
  arr = (char *)malloc(sz * sizeof(char));
  Frama_C_show_each_2(arr,sz);
  arr[0]='a';
  arr[sz - 1] = 'b';
  //@ merge sz;
}

(analyze it with frama-c -eva -eva-no-alloc-returns-null test.c)

More information can be found in the Eva manual, but here are the main points:

  • //@ split sz; instructs Eva to separate states according to the possible values of sz (of course, this is doable only if sz has a few possible values, like here)
  • Frama_C_show_each_1(sz) let Eva displays the value it has computed for sz at this point. It allows us to check that it indeed propagates to separate states (we'll have two messages in Eva's log)
  • //@ eva_allocate fresh; indicates that each call to malloc must result in a new fresh base. Note that if used in a loop, this may trigger non-termination, as the abstract state keeps growing. Here, it is needed to have two distinct base for the two states that Eva is propagating.
  • Frama_C_show_each_2(arr,sz) allows us to confirm that we have indeed two different base address for the two possible values of sz

Here is what you should see during the analysis:

...
[eva] test.c:7: Frama_C_show_each_1: {2}
[eva] test.c:7: Frama_C_show_each_1: {1}
[eva] test.c:9: allocating variable __malloc_main_l9
[eva] test.c:9: allocating variable __malloc_main_l9_0
[eva] test.c:10: Frama_C_show_each_2: {{ &__malloc_main_l9 }}, {2}
[eva] test.c:10: Frama_C_show_each_2: {{ &__malloc_main_l9_0 }}, {1}
...
Virgile
  • 9,724
  • 18
  • 42
  • I see, then Eva can only keep track that the malloc-ed block `arr` and `sz` have the same size if the size is a constant value (modulo splitting). I was originally actually trying to prove a more general result for a possibly large non-deterministic positive int, like `Frama_C_interval(1, INT_MAX)` . Could I prove such a goal? Since the size of the malloc-ed block is internal to the C semantics, I can't see any straightforward annotations that could help. – hpacheco Feb 15 '22 at 14:28
  • 1
    @hpacheco, unfortunately you can't. For that, you'd need to have an abstract domain in Eva that would take care of keeping the relation between the size of the malloc'ed block (i.e. `__malloc_main_l9`) and `sz`. There exists relational domains in Eva, notably `symbolic_locations` and `equality`, and it would be perfectly possible to build another one to track the size of allocated blocks, but it hasn't happened yet (and it's pretty complicated to do in the general case, e.g. if you have to deal with `malloc(*ptr_to_sz)` and the domain must understand where `ptr_to_sz` might point to). – Virgile Feb 15 '22 at 17:37
  • Thanks for the quick reply. I can image how inferring such relations can quickly become complicated for general pointer arithmetic. My hope was on some sort of dirty hack/assumption; at least splitting offers sort of a half-way-compromise for demo purposes. – hpacheco Feb 15 '22 at 23:57