realloc but only first few bytes is meaningful

Question

Assume I have used ptr = malloc(old_size); to allocate a memory block with old_size bytes. Only the first header_size bytes is meaningful. I'm going to increase the size to new_size.

new_size is greater than old_size and old_size is greater than header_size.

before:

/- - - - - - - old_size - - - - - - - \
+===============+---------------------+
 \-header_size-/

after:

/- - - - - - - - - - - - - - - new_size - - - - - - - - - - - - - - - - - - -\
+===============+------------------------------------------------------------+
\- header_size-/

I don't care what is stored after ptr + header_size because I'll read some data to there.

method 1: go straight to `new_size`

ptr = realloc(ptr, new_size);

method 2: shrink to `header_size` and grow to `new_size`

ptr = realloc(ptr, header_size);
ptr = realloc(ptr, new_size);

method 3: allocate a new memory block and copy the first `header_size` bytes

void *newptr = malloc(new_size);
memcpy(newptr, ptr, header_size);
free(ptr);
ptr = newptr;

Which is faster?

@nos I'll write some data beyond `header_size` after reallocation so I don't care what is in there. — lqs, Nov 06 '12 at 09:14

paxdiablo · Answer 1 · 2012-11-06T09:28:42.580

Neither malloc (for the whole block) nor realloc (for the space beyond the size of the old block when increasing the size) guarantee what the memory you receive will contain so, if you want those excess bytes set to zero (for example), you'll have to do it yourself with something like:

// ptr contains current block.
void *saveptr = ptr;
ptr = realloc (ptr, new_size);
if (ptr == NULL) {
    // do something intelligent like recover saveptr and exit.
}
memset (ptr + header_size, 0, new_size - header_size);

However, since you've stated that you don't care about the content beyond the header, the fastest is almost certainly a single realloc since that's likely to be optimised under the covers.

Calling it twice for contraction and expansion, or calling malloc-new/memcpy/free-old is very unlikely to be as efficient though, as with all optimisations, you should measure, don't guess!

Keep in mind that realloc doesn't necessarily have to copy your memory at all. If the expansion can be done in place, then an intelligent heap manager will just increase the size of the block without copying anything, such as:

+-----------+   ^        +-----------+ <- At same address,
| Old block |   | Need   | New block |      no copying
|           |   | this   |           |      involved.
+-----------+   | much   |           |
| Free      |   | now.   |           |
|           |   v        +-----------+
|           |            | Free      |
|           |            |           |
+-----------+            +-----------+

Huh? Of course `realloc()` guarantees that the new memory will contain as much as will fit from the old block. — unwind, Nov 06 '12 at 09:04
@unwind: I meant beyond the old boundary, not the memory in the old block. I'll clarify. — paxdiablo, Nov 06 '12 at 09:05
@lqs, since you've stated in a comment that you don't care what the non-header contains, I've simplified my answer to suit. — paxdiablo, Nov 06 '12 at 09:21

Steve Jessop · Accepted Answer · 2012-11-06T11:27:56.173

It almost certainly depends on the values of old_size, new_size and header_size, and also it depends on the implementation. You'd have to pick some values and measure.

1) is probably best in the case where header_size == old_size-1 && old_size == new_size-1, since it gives you the best chance of the single realloc being basically a no-op. (2) should be only very slightly slower in that case (2 almost-no-ops being marginally slower than 1).

3) is probably best in the case where header_size == 1 && old_size == 1024*1024 && new_size == 2048*1024, because the realloc would have to move the allocation, but you avoid copying 1MB of data you don't care about. (2) should be only very slightly slower in that case.

2) is probably best when header_size is much smaller than old_size, and new_size is in a range where it's reasonably likely that the realloc will relocate, but also reasonably likely that it won't. Then you can't predict which of (1) and (3) it is that will be very slightly faster than (2).

In analyzing (2), I have assumed that realloc downwards is approximately free and returns the same pointer. This is not guaranteed. I can think of two things that can mess you up:

realloc downwards copies to a new allocation
realloc downwards splits the buffer to create a new chunk of free memory, but then when you realloc back up again the allocator doesn't merge that new free chunk straight back onto your buffer again in order to return without copying.

Either of those could make (2) significantly more expensive than (1). So it's an implementation detail whether or not (2) is a good way of hedging your bets between the advantages of (1) (sometimes avoids copying anything) and the advantages of (3) (sometimes avoids copying too much).

Btw, this kind of idle speculation about performance is more effective in order to tentatively explain your observations, than it is to tentatively predict what observations we would make in the unlikely event that we actually cared enough about performance to test it.

Furthermore, I suspect that for large allocations, the implementation might be able to do even a relocating realloc without copying anything, by re-mapping the memory to a new address. In which case they would all be fast. I haven't looked into whether implementations actually do that, though.

score 1 · Answer 3 · answered Nov 06 '12 at 09:04

1

That probably depends on what the sizes are and if copying is needed.

Method 1 will copy everything contained in the old block - but if you don't do that too often, you won't notice.

Method 2 will only copy what you need to keep, as you discard everything else beforehand.

Method 3 will copy unconditionally, while the others only copy if the memory block cannot be resized where it is.

Personally, I would prefer method 2 if you do this quite often, or method 1 if you do it more seldom. Respectively, I would profile which of these will be faster.

answered Nov 06 '12 at 09:04

glglgl

89,107
13
149
217

Method 2 is the only one that *both* avoids a copy entirely, if there is free space after the block, *and* otherwise copies only the minimum memory. Method 1 is the most efficient in the best case, but possibly also the the least efficient in the worst case. It does depend now much data we're talking about whether the extra function overheads are worthwhile. – ams Nov 06 '12 at 09:36
@ams Are there actually `realloc` implementations that consider free blocks after the block about to be `realloc`'ed? It seems unlikely to me. – pmr Nov 06 '12 at 09:42
1

@pmr: even if they don't try to merge blocks, in the case where `new_size - old_size` is small there might be free space at the end of the block that can be used to satisfy the new size, due to `old_size` having been rounded up. – Steve Jessop Nov 06 '12 at 09:53
1

@pmr Why wouldn't the implementation extend an allocation into following free space, if there is some? Sure, somebody has to implement it that way, but there's a lot of smart people out there! In reality though, you can't even assume that *reducing* the size of the allocation won't result in a copy. – ams Nov 06 '12 at 11:26
1

@pmr At least some implementations of `malloc` use `mmap` for allocations greater than one page (typically 4KiB), so extending the size of the allocation is almost a no-op. It depends how crowded the address space is, but on a 64-bit machine that's probably not a problem. – ams Nov 06 '12 at 11:45

realloc but only first few bytes is meaningful

method 1: go straight to `new_size`

method 2: shrink to `header_size` and grow to `new_size`

method 3: allocate a new memory block and copy the first `header_size` bytes

3 Answers3

Linked

realloc but only first few bytes is meaningful

method 1: go straight to new_size

method 2: shrink to header_size and grow to new_size

method 3: allocate a new memory block and copy the first header_size bytes

3 Answers3

Linked

method 1: go straight to `new_size`

method 2: shrink to `header_size` and grow to `new_size`

method 3: allocate a new memory block and copy the first `header_size` bytes