
For a serializing system, I need to allocate buffers to write data into. The size needed is not known in advance, so the basic pattern is to malloc N bytes and use realloc if more is needed. The size of N would be large enough to accommodate most objects, making reallocation rare.

This made me think that there is probably an optimal initial amount of bytes that malloc can satisfy more easily than others. I'm guessing somewhere close to pagesize, although not necessarily exactly if malloc needs some room for housekeeping.

Now, I'm sure it is a useless optimization, and if it really mattered I could use a pool, but I'm curious: I can't be the first programmer to think "give me whatever chunk of bytes is easiest to allocate" as a starting size. Is there a way to determine this?

Any answer for this that specifically applies to modern GCC/G++ and/or linux will be accepted.

porgarmingduod
  • optimization in itself is not useless but maybe you should wait with optimizing until there is a concrete need for it? or do you have too much time on your hands? :-) – AndersK Apr 07 '11 at 01:16
  • @Anders: If I spent hours researching this myself, that could be wasting time. Seeing if something I am curious about can be answered on SO seems time efficient to me :) – porgarmingduod Apr 07 '11 at 01:23
  • 1
    @Anders K: it is my experience that a majority of posters with questions about optimization issues need input about what to do rather than what not to do. – Olof Forshell Apr 25 '11 at 15:29

3 Answers

2

From reading this wiki page it seems that the answer would vary wildly depending on the malloc implementation you're using and on the OS. The bit on OpenBSD's malloc is particularly interesting. It sounds like you'll want to look at mmap too, but at a guess I'd say allocations of the default page size (4096 bytes?) would be the optimised case.

Adam
  • Actually, I'd be interested to know the results of an experiment done on this. Maybe try allocating 4095 bytes x times and see what your timing averages look like, and then doing the same experiment with 4096 bytes and then 4097 bytes. You'd want to ensure that your pagesize is indeed set to 4kb, and probably want to restart your computer before each test. Any takers? :-) – Adam Apr 07 '11 at 02:45
1

My suggestion would be to find an appropriate malloc/realloc/free source base and implement your own "malloc_first" alongside the others in the same source module (using the same memory structures). It would simply allocate and return the first available block greater than or equal to a passed minimum_bytes parameter; if 0 is passed, you'd get the first free block, period.

An appropriate declaration could be

void *malloc_first (size_t minimum_bytes, size_t *actual_bytes);

How doable such an undertaking would be, I don't know. I suggest you attempt it on Linux, where all the source code is available.
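A full version of this would have to live inside the allocator, as the answer says. Short of that, on glibc the declared interface can at least be approximated as a wrapper that reports how many bytes the allocator actually handed back (this is my own illustrative sketch, not the in-allocator implementation being proposed):

```c
#include <malloc.h>   /* malloc_usable_size() -- glibc extension */
#include <stdlib.h>

/* Approximation of the proposed interface: request minimum_bytes
 * and report the usable size of whatever block came back. The
 * "first free block" semantics would require access to the
 * allocator's internal free lists; this only exposes the slack
 * glibc already gives you on top of the request. */
void *malloc_first(size_t minimum_bytes, size_t *actual_bytes)
{
    void *p = malloc(minimum_bytes ? minimum_bytes : 1);
    if (p && actual_bytes)
        *actual_bytes = malloc_usable_size(p);
    return p;
}
```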

Olof Forshell
  • Yes, I guess. But at this point I am probably much better off simply using memory pools (for which expert implementations are already available) and a simple allocation policy on top of that. But your general suggestion of "take control yourself" is probably spot on. – porgarmingduod May 04 '11 at 14:33
-1

The way it's done in similar cases is for the first malloc to allocate some significant but not too large chunk, which would suit most cases (as you described), and every subsequent realloc call to double the requested size.

So, if at first you allocate 100, next time you'll realloc to 200, then 400, 800 and so on. That way, the chance of needing yet another reallocation drops each time you grow.

If memory serves me right, that's how std::vector behaves.

after edit

The optimal initial allocation size is the one that covers most of your cases on one side without being too wasteful on the other. If your average case is 50 but sizes can spike to 500, you'll want to allocate 50 initially and then double, triple, or multiply by 10 on each realloc, so that you reach 500 within 1-3 reallocs and any further reallocs are unlikely and infrequent. So it depends on your usage patterns, basically.

littleadv
  • I am very aware of this approach. It does not address the question at all, which is entirely concerned with determining the __initial size__. – porgarmingduod Apr 07 '11 at 01:15
  • Yeah, added two cents for that, but you didn't mention why you're looking for optimization, so I'm guessing it's because you don't want to waste memory. Is there anything else you wanted to optimize? – littleadv Apr 07 '11 at 01:19
  • The question is about whether it is possible to know what medium size __malloc can most efficiently deal with__. The usage pattern, i.e. the size I actually need, is not part of the question, except that I hint that `pagesize` would probably be fine. – porgarmingduod Apr 07 '11 at 01:21