20

For one reason or another, I want to hand-roll a zeroing version of malloc(). To minimize algorithmic complexity, I want to write:

void * my_calloc(size_t size)
{
    return memset(malloc(size), 0, size);
}

Is this well-defined when size == 0? It is fine to call malloc() with a zero size, but that allows it to return a null pointer. Will the subsequent invocation of memset be OK, or is this undefined behaviour and I need to add a conditional if (size)?

I would very much want to avoid redundant conditional checks!

Assume for the moment that malloc() doesn't fail. In reality there'll be a hand-rolled version of malloc() there, too, which will terminate on failure.

Something like this:

void * my_malloc(size_t size)
{
    void * const p = malloc(size);
    if (p || 0 == size) return p;
    terminate();
}

Matthew Slattery
Kerrek SB
  • AFAIK `memset` isn't required to check for NULL, so if the `malloc` fails you'll zero `size` bytes starting from address 0. – Praetorian Dec 21 '11 at 22:08
  • @Praetorian: Sorry, I added this later: Assume that `malloc()` never fails. The question is only if `size` can be `0`. – Kerrek SB Dec 21 '11 at 22:09

3 Answers

10

Here is the glibc declaration:

extern void *memset (void *__s, int __c, size_t __n) __THROW __nonnull ((1));

The __nonnull shows that it expects the pointer to be non-null.
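As a rough illustration of what that annotation means for callers, here is a minimal sketch using the GCC/Clang `nonnull` function attribute directly. The names `fill_zero` and `fill_zero_or_skip` are hypothetical and not part of glibc; the point is only that a function declared this way puts the burden of the null check on the caller.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical function, annotated the same way as the glibc declaration
   of memset: parameter 1 is promised to be non-null (GCC/Clang extension). */
__attribute__((nonnull(1)))
void *fill_zero(void *s, size_t n)
{
    return memset(s, 0, n);
}

/* Hypothetical caller-side wrapper: the null check lives here, before the
   annotated function is reached, so the promise above is never broken. */
void *fill_zero_or_skip(void *s, size_t n)
{
    return s ? fill_zero(s, n) : NULL;
}
```

With such an attribute in place, a compiler is allowed to warn about a literal null argument and to assume the pointer is non-null when optimizing, which is why relying on a null pointer being tolerated is fragile even when `n` is 0.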

Pubby
  • Which corroborates my idea that it looks like it is **unspecified**, and in the case of glibc, **undefined** as well (the prototype just makes doubly sure that the compiler warns about the results being _undefined_ when calling with a null pointer) – sehe Dec 22 '11 at 07:41
8

Here's what the C99 standard says about this:

7.1.4 "Use of library functions"

If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after promotion) not expected by a function with variable number of arguments, the behavior is undefined.

7.21.1 "String function conventions" (remember that memset() is in string.h)

Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. Unless explicitly stated otherwise in the description of a particular function in this subclause, pointer arguments on such a call shall still have valid values, as described in 7.1.4.

7.21.6.1 "The memset function"

The memset function copies the value of c (converted to an unsigned char) into each of the first n characters of the object pointed to by s.

So strictly speaking, since the standard specifies that s must point to an object, passing in a null pointer would be UB. Add the check (the cost compared to the malloc() will be vanishingly small). On the other hand, if you know the malloc() cannot fail (because you have a custom one that terminates), then obviously you don't need to perform the check before calling memset().
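A minimal sketch of the checked version, assuming plain `malloc()` underneath; the name `my_calloc_checked` is made up for illustration. Testing the pointer rather than `size` covers both the case where `malloc(0)` returns a null pointer and the case where a non-terminating `malloc()` fails:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical guarded variant of the question's my_calloc(). */
void *my_calloc_checked(size_t size)
{
    void *p = malloc(size);

    /* Only call memset() when there is a valid pointer to pass to it.
       A null result (from malloc(0) or from a failed allocation) is
       returned to the caller untouched. */
    if (p != NULL)
        memset(p, 0, size);

    return p;
}
```

If the underlying allocator terminates on genuine failure, as the question's `my_malloc()` does, the only way to reach the skipped branch is `size == 0`, so the test costs one predictable branch per call.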

Community
Michael Burr
  • But in my interpretation, `malloc(0)` can also never fail, so I'm not guaranteed that I have a valid pointer. Are you saying that it *is* definitely UB to call `memset` with an invalid pointer, even if the size is 0? – Kerrek SB Jan 09 '12 at 22:15
  • @Kerrek: my reading of the standard is that it's UB if you pass in an invalid pointer - including `NULL` - to `memset()`, even if the size is `0`. I imagine that in most implementations it'll work as one would want (`memset()` being a nop if `size == 0`), but that's just a happy coincidence of UB 'working'. Also, I think that making a performance argument for not performing the check is weak - zeroing the memory block will dominate the performance for the use case of this function that should be far more likely by orders of magnitude: when `size > 0`. – Michael Burr Jan 10 '12 at 08:52
  • @Kerrek: I'm a little confused about your comment regarding whether or not the pointer is valid - if you can assume `malloc(0)` cannot fail, then you're OK to call `memset()` with a zero size on the pointer returned by `malloc(0)`. – Michael Burr Jan 10 '12 at 08:57
  • Sorry, I was unclear. I say that `malloc(0)` cannot fail because it is acceptable for it to return a null pointer, since I can `free` a null pointer, and thus a result of `NULL` is not a failure. So what could happen is that I say `memset(malloc(n), 0, n)`, and when `n = 0`, the *inner* function works correctly, but it might still (correctly) return a null pointer. So what you're saying is that this composition is *not* correct because the outer function must never receive a null pointer? – Kerrek SB Jan 10 '12 at 13:51
7

Edit Re:

I added this later: Assume that malloc() never fails. The question is only if size can be 0

I see. So you only want things to be secure if the pointer is null and the size is 0.

Referring to the POSIX docs

No, it is not specified that it should be safe to call memset with a null pointer (if you called it with zero count or size... that'd be even more 'interesting', but also not specified).

Nothing about it is even mentioned in the 'informative' sections.

Note that the first link mentions

The functionality described on this reference page is aligned with the ISO C standard. Any conflict between the requirements described here and the ISO C standard is unintentional. This volume of IEEE Std 1003.1-2001 defers to the ISO C standard

Update: I can confirm that the ISO C99 standard (n1256.pdf) is as brief as the POSIX docs, and the C++11 spec just refers to the ANSI C standard for memset and friends. N1256 states:

The memset function copies the value of c (converted to an unsigned char) into each of the first n characters of the object pointed to by s.

and says nothing about the situation where s is null (but note that a null pointer does not point to an object).

Toby Speight
sehe
  • I've browsed standards for a bit before posting :-) I must have missed it... The whole point of my question is the situation `size == 0`. – Kerrek SB Dec 21 '11 at 22:10
  • Well, since it's a library function, I want to a) avoid unnecessary conditionals, and b) be prepared to deal with any user input, even `0`. If I can achieve both purely by virtue of the C library guarantees, that'd be far preferable over adding my own cumbersome checks. – Kerrek SB Dec 21 '11 at 22:17
  • I can confirm that the ISO C99 standard (n1256.pdf) is as brief as the POSIX docs **and** the C++11 spec just refers to the ANSI C standard for `memset` and friends. – sehe Dec 21 '11 at 22:17
  • ... so my hand-rolled `malloc`-function will be like this: `void * p = malloc(size); if (p || 0 == size) return p; terminate();` – Kerrek SB Dec 21 '11 at 22:18
  • Ok, I've stricken my query as to 'why' you care. Fair arguments given. – sehe Dec 21 '11 at 22:18
  • Huh.. Oh you fixed the comment. Good. Looks to make sense, though I'm not sure about the wanting to terminate. Why can't your malloc return 0 if the standard library can? – sehe Dec 21 '11 at 22:20
  • It will happily return NULL, but *only if* `size == 0`. If `size != 0` and `p == NULL`, then we have a genuine allocation error, so we terminate. – Kerrek SB Dec 21 '11 at 22:21
  • So in conclusion... `memset(NULL, 0, 0)` is or is not UB? – Kerrek SB Dec 22 '11 at 01:03
  • @KerrekSB To save a conditional, you are paying with a function call. That's a bad tradeoff, no, let's say: you are **prematurely optimizing**, and at a level that is actually better left to the compiler, which knows better what's fast and what's not for your arch. – jørgensen Dec 22 '11 at 01:42
  • @jørgensen: How so? Most times I will make the function call anyway! – Kerrek SB Dec 22 '11 at 02:18
  • @jørgensen? I'm positive the compiler won't meddle with that. (Now, the library _might_ have optimization hints as `attributes`/`pragmas`). @KerrekSB I'd say it is unspecified, so at best implementation defined. Neither you nor I can find the specification! – sehe Dec 22 '11 at 07:40