11

Even after reading quite a bit about the strict-aliasing rules, I am still confused. As far as I understand them, it is impossible to implement a sane memory allocator that follows these rules, because malloc could never reuse freed memory: the memory could be used to store a different type at each allocation.

Clearly this cannot be right. What am I missing? How do you implement an allocator (or a memory pool) that follows strict-aliasing?

Thanks.

Edit: Let me clarify my question with a stupid simple example:

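#include <stdlib.h>

// s == 0 frees the pool
void *my_custom_allocator(size_t s) {
    static void *pool = NULL;   // a static initializer must be a constant in C
    static int in_use = 0;
    if( s == 0 ) {              // handle the release request before the in_use check
        in_use = 0;
        return NULL;
    }
    if( pool == NULL ) pool = malloc(1000);  // allocate the pool on first use
    if( pool == NULL || in_use || s > 1000 ) return NULL;
    in_use = 1;
    return pool;
}

int main(void) {
    int *i = my_custom_allocator(sizeof(int));
    //use int
    my_custom_allocator(0);
    float *f = my_custom_allocator(sizeof(float)); //not allowed...
    return 0;
}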
curiousguy
Sebastian Ende
  • How is paxdiablo's answer an answer? – curiousguy Nov 01 '11 at 04:25
  • @curiousguy, I agree! This is a fascinating issue. The standard talks about " .. explicitly deallocated" (in paxdiablo's quote from the standard). But, if `free` has a *monopoly* on "deallocating", then that means `free` is very special. The code in the question does *not* call `free`, therefore this special behaviour is not available. Therefore, the question is: does "deallocation" occur in the questioner's code - i.e. is the "pseudo-deallocation" in the questioner's code sufficient to bring the object lifetime to an end and to allow a new type to be written to the existing location? – Aaron McDaid Jul 21 '15 at 09:56
  • @AaronMcDaid: The idea that `free()` should have a monopoly on such behavior is particularly absurd on freestanding implementations which aren't required to support that function. – supercat Jan 08 '17 at 22:37
  • The idea of the `free` (`delete` for C++) monopoly seems relatively new. It is a very silly idea that was never floated in these committees before. – curiousguy Jan 09 '17 at 17:59

4 Answers

11

I don't think you're right. Even the strictest of strict aliasing rules would only count when the memory is actually allocated for a purpose. Once an allocated block has been released back to the heap with free, there should be no references to it and it can be given out again by malloc.

And the void* returned by malloc is not subject to the strict aliasing rule since the standard explicitly states that a void pointer can be cast into any other sort of pointer (and back again). C99 section 7.20.3 states:

The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).


In terms of your update (the example) where you don't actually return the memory back to the heap, I think your confusion arises because allocated objects are treated specially. If you refer to 6.5/6 of C99, you see:

The effective type of an object for an access to its stored value is the declared type of the object, if any (footnote 75: Allocated objects have no declared type).

Re-read that footnote, it's important.

If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.

If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.

For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

In other words, the allocated block contents will become the type of the data item that you put in there.

If you put a float in there, you should only access it as a float (or compatible type). If you put in an int, you should only process it as an int (or compatible type).

The one thing you shouldn't do is to put a specific type of variable into that memory and then try to treat it as a different type - one reason for this being that objects are allowed to have trap representations (which cause undefined behaviour) and these representations may occur due to treating the same object as different types.

So, if you were to store an int in there before the deallocation in your code, then reallocate it as a float pointer, you should not try to use the float until you've actually put one in there. Up until that point, the effective type of the allocated memory is not yet float.
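A minimal sketch of that rule, assuming the storage comes from malloc (so it has no declared type). The helper name `reuse_as_float` is made up for illustration and is not part of the question's code:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: stores an int in a malloc'd block, then reuses the
   same block for a float. Returns the float read back, or -1.0f if the
   allocation fails. */
float reuse_as_float(int first, float second) {
    size_t size = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *block = malloc(size);
    if (block == NULL) return -1.0f;

    int *i = block;
    *i = first;             /* the store gives the storage effective type int */
    assert(*i == first);    /* reading it back as an int is fine */

    float *f = block;
    /* Reading *f at this point would be undefined: the storage holds an int. */
    *f = second;            /* the store changes the effective type to float */
    float result = *f;      /* reading it as a float is now fine */

    free(block);
    return result;
}
```

The block is never read through a type other than the one most recently stored, which is the discipline the quoted 6.5/6 text asks for.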

Community
paxdiablo
  • But doesn't that mean the compiler must have special knowledge of the free-call so that a pointer [after a free and a new malloc] might alias to the formerly freed pointer? – Sebastian Ende Oct 07 '11 at 13:19
  • @Sebastian, I've updated the answer with the relevant standards info. Allocated memory is treated specially. – paxdiablo Oct 07 '11 at 13:29
  • @Sebastian: accessing anything through the formerly-freed pointer is UB anyway, so the compiler is free to assume what it likes in terms of aliasing. For example, `int *ptr = malloc(sizeof(int)); *ptr = 1; free(ptr); long *ptr2 = malloc(sizeof(long)); *ptr2 = 2; if (ptr == ptr2) *ptr;` has UB in the case where the second allocation just so happens to have equal address to the first. So the compiler doesn't have to track the fact that `ptr` has been freed, it can safely just continue to apply strict aliasing rules, and assume that it is not aliased. – Steve Jessop Oct 07 '11 at 16:08
  • That said, as long as the compiler treats `malloc` and `free` as calls to unknown code in another TU, then none of the usual optimizations that rely on strict aliasing are going to be applied anyway, since for all the compiler knows `ptr` is aliased somewhere else in the program, and `malloc` and `free` modify its contents. – Steve Jessop Oct 07 '11 at 16:11
  • "_since objects are allowed to have trap representations_" No. This issue has **nothing** to do with trap representation. Even if you know that there isn't a trap representation, you are still not allowed to break the aliasing rules. – curiousguy Oct 31 '11 at 16:25
  • curiousguy, I stated "... *one* reason" was to do with trap representations, I didn't say it was the *only* reason. For example, allowing aliasing will remove quite a bit of power for the compiler to optimise, but the trap representations are also a limit on treating memory content as a different type to what was stored there. – paxdiablo Jul 22 '15 at 04:45
3

I post this answer to test my understanding of strict aliasing:

Strict aliasing matters only when actual reads and writes happen. Just as using multiple members of different types of a union simultaneously is undefined behavior, the same applies to pointers as well: you cannot use pointers of different types to access the same memory, for the same reason you cannot do it with a union.

If you consider only one of the pointers as live, then it's not a problem.

  • So if you write through an int* and read through an int*, it is OK.
  • If you write using an int* and read through a float*, it is bad.
  • If you write using an int* and later you write again using float*, then read it out using a float*, then it's OK.
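When the bytes of one type really do need to be read as another, copying them with memcpy into an object of the target type sidesteps the bad second case. This is a sketch only: it assumes that float and uint32_t have the same size and that float uses the IEEE 754 binary32 format, neither of which the C standard guarantees, and `float_bits` is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the object representation of a float as an
   unsigned 32-bit integer without reading the float through a uint32_t*. */
uint32_t float_bits(float value) {
    uint32_t bits;                          /* declared type is uint32_t */
    assert(sizeof bits == sizeof value);    /* assumption: equal sizes */
    memcpy(&bits, &value, sizeof bits);     /* byte-wise copy, no aliasing */
    return bits;                            /* read through its own type */
}
```

Because `bits` has a declared type, the copy does not change its effective type, and it is only ever read as a uint32_t.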

In a non-trivial allocator you have a large buffer, which you typically hold through a char*. You then do some pointer arithmetic to calculate the address you want to hand out, and dereference it through the allocator's header structs. It doesn't matter which pointers you use for the pointer arithmetic; only the pointer through which you dereference the area matters. Since an allocator always does that via its header struct, it won't trigger undefined behavior.
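A rough sketch of that layout, assuming C11 (for `max_align_t`) and a backing buffer obtained from malloc. All names here (`block_header`, `pool`, `pool_alloc`, and so on) are invented for illustration, and the pool only grows; it has no per-block free:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header placed in front of every block handed out. */
typedef union block_header {
    size_t size;         /* requested payload size in bytes */
    max_align_t align;   /* pads the header so payloads stay aligned */
} block_header;

typedef struct pool {
    unsigned char *base; /* backing storage; only used for address arithmetic */
    size_t capacity;
    size_t used;         /* always a multiple of sizeof(block_header) */
} pool;

int pool_init(pool *p, size_t capacity) {
    p->base = malloc(capacity);
    p->capacity = capacity;
    p->used = 0;
    return p->base != NULL;
}

void *pool_alloc(pool *p, size_t size) {
    const size_t unit = sizeof(block_header);
    if (size > SIZE_MAX - unit) return NULL;            /* avoid overflow */
    size_t payload = (size + unit - 1) / unit * unit;   /* round up to unit */
    size_t left = p->capacity - p->used;
    if (left < unit || left - unit < payload) return NULL;

    /* Arithmetic happens on unsigned char*, the access on the header type. */
    block_header *h = (block_header *)(p->base + p->used);
    h->size = size;
    p->used += unit + payload;
    return h + 1;        /* payload starts right after the header */
}

size_t pool_block_size(const void *ptr) {
    const block_header *h = (const block_header *)ptr - 1;
    return h->size;
}

void pool_destroy(pool *p) {
    free(p->base);
    p->base = NULL;
    p->capacity = p->used = 0;
}
```

The allocator itself only ever reads and writes the header bytes through `block_header`; the payload bytes are touched only by the caller, through whatever type the caller stores there.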

Calmarius
  • There's some other nastiness too. Even if "int" and "long" have the same representation, using memcpy to copy data from a "long" to allocated storage and then reading the memcpy'd memory as an "int" will yield Undefined Behavior. – supercat Nov 28 '15 at 21:46
  • @supercat `long` is a 64 bit integer on 64 bit Linux while `int` is 32 bit. – Calmarius Nov 28 '15 at 21:58
  • Substitute "long" and "long long" then. My point is that using memcpy to move data between malloc-allocated arrays of different types is UB even when the types have the same representation. – supercat Nov 29 '15 at 04:23
  • Union aliasing is allowed in C – M.M Apr 12 '20 at 23:21
3

Standard C does not define any efficient means by which a user-written memory allocator can safely take a region of memory that has been used as one type and make it safely available as another. Structures in C are guaranteed not to have trap representations--a guarantee which would have little purpose if it didn't make it safe to copy structures with fields containing Indeterminate Value.

The difficulty is that given a structure and function like:

struct someStruct { unsigned char count; unsigned char dat[7]; };
void useStruct(struct someStruct s); // Pass by value

it should be possible to invoke it like:

struct someStruct *p = malloc(sizeof *p);
p->count = 1;
p->dat[0] = 42;
useStruct(*p);

without having to write all of the fields of the allocated structure first. Although malloc will guarantee that the allocation block it returns may be used by any type, there is no way for user-written memory-management functions to enable such reuse of storage without either clearing it in bytewise fashion (using a loop or memset) or else using free() and malloc() to recycle the storage.
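A sketch of the bytewise-clearing option, following this answer's reading of the standard rather than any settled interpretation. `recycle_as_someStruct` is a hypothetical helper, and the caller is assumed to pass storage that is large enough and suitably aligned:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct someStruct { unsigned char count; unsigned char dat[7]; };

/* Hypothetical helper: takes storage that previously held some other type
   and hands it out again as a someStruct. Clearing every byte first means
   no field holds a leftover or indeterminate value when the structure is
   later copied by value. */
struct someStruct *recycle_as_someStruct(void *storage) {
    memset(storage, 0, sizeof(struct someStruct));
    return storage;
}
```

The cost is a write to every byte of the block on each reuse, which is the inefficiency the answer is pointing at.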

supercat
0

Within the allocator itself, only refer to your memory buffers as (void *). When it is optimized, the strict-aliasing optimizations shouldn't be applied by the compiler (because that module has no idea what types are stored there). When that object gets linked into the rest of the system, it should be left well enough alone.

Hope this helps!

Woodrow Douglass
  • Things could still break, especially with fixed-size-chunk allocators, if whole-program optimization can see into them but is willfully blind to pointer type conversions. – supercat Sep 26 '17 at 17:46