157

I'm reading a book on memory as a programming concept. In one of the later chapters, the author makes heavy use of the word arena, but never defines it. I've searched for the meaning of the word and how it relates to memory, and found nothing. Here are a few contexts in which the author uses the term:

"The next example of serialization incorporates a strategy called memory allocation from a specific arena."

"...this is useful when dealing with memory leaks or when allocating from a specific arena."

"...if we want to deallocate the memory then we will deallocate the whole arena."

The author uses the term over 100 times in one chapter. The only definition in the glossary is:

allocation from arena - Technique of allocating an arena first and then managing the allocation/deallocation within the arena by the program itself (rather than by the process memory manager); used for compaction and serialization of complex data structures and objects, or for managing memory in safety-critical and/or fault-tolerant systems.

Can anyone define arena for me given these contexts?

pro-gramer
Nocturno

5 Answers

174

An arena is just a large, contiguous piece of memory that you allocate once and then use to manage memory manually by handing out parts of that memory. For example:

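#include <stdlib.h>

char *arena;          /* set once at startup: arena = malloc(HUGE_NUMBER); */
size_t current = 0;   /* offset of the next unused byte */

/* No bounds or alignment checks: a sketch of the idea only. */
void *my_malloc(size_t n) { current += n; return arena + current - n; }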

The point is that you get full control over how the memory allocation works. The only thing outside your control is the single library call for the initial allocation.

One popular use case is where each arena is only used to allocate memory blocks of one single, fixed size. In that case, you can write very efficient reclamation algorithms. Another use case is to have one arena per "task", and when you're done with the task, you can free the entire arena in one go and don't need to worry about tracking individual deallocations.
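The "one arena per task" idea can be sketched a little more fully. This is a minimal sketch under stated assumptions, not a production allocator: the names `Arena`, `arena_init`, `arena_alloc` and `arena_destroy` are made up for illustration, the arena is single-threaded, and the requested alignment must be a power of two no larger than what `malloc` itself guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not a standard API. */
typedef struct {
    unsigned char *base;  /* start of the one big block */
    size_t size;          /* total bytes in the block */
    size_t used;          /* bytes handed out so far */
} Arena;

/* The single library call: grab the whole block up front.
   Returns 1 on success, 0 on failure. */
int arena_init(Arena *a, size_t size) {
    a->base = malloc(size);
    a->size = a->base ? size : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Hand out n bytes whose offset is a multiple of align (a power of two),
   or NULL when the arena cannot satisfy the request. */
void *arena_alloc(Arena *a, size_t n, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (a->used + align - 1) & ~(align - 1);  /* round up */
    if (start > a->size || n > a->size - start) return NULL;
    a->used = start + n;
    return a->base + start;
}

/* Release every allocation made from the arena in one go. */
void arena_destroy(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}
```

A task would call `arena_init` once, make any number of `arena_alloc` calls without ever freeing them individually, and finish with a single `arena_destroy`.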

Each of those techniques is very specialized and generally only comes in handy if you know exactly what you're doing and why the normal library allocation is not good enough. Note that a good memory allocator will already do lots of magic itself, and you need a decent amount of evidence that that's not good enough before you start handling memory yourself.

Kerrek SB
  • 48
    It's a good answer, but please consider deleting or amending the last paragraph. You really don't need any evidence at all. Any time you know *how* you're going to use memory, you know more than a "good" general-purpose allocator, and if you use this knowledge your custom allocator will always win. Allocators are not magic. An arena is useful if you have a lot of items that all die at the same, well-defined point in time. That's pretty much all you need to know. It's not rocket-science. – Andreas Haferburg Apr 02 '16 at 05:35
  • 25
    @AndreasHaferburg: The memory allocator from the standard library automatically has a massive advantage over custom-writing your own, namely that you don't have to write/test/debug/maintain etc. Even if you're certain with no evidence that you can improve performance by managing your own allocation, you still need good evidence before deciding that this improvement is worth the tradeoff. – ruakh Oct 01 '16 at 19:03
  • 37
    @ruakh I just don't like this cargo-cult mentality thing that is repeated a million times everywhere as "wisdom". "The gods of C++ gave it to us, so we have to use it." And my favorite: "It's magic." No. It's not magic. It's just an algorithm that is so simple that even a computer can run it. In my book that's pretty far from magic. My guess: You underestimate how much of an impact memory allocation can have on performance, and overestimate how complicated arenas are. Whether performance is more important than developer time is a business decision that is a bit pointless to discuss on SO. – Andreas Haferburg Oct 02 '16 at 11:32
  • 9
    @AndreasHaferburg: Sure, tcmalloc uses some particular algorithm, and the idea behind it is easy enough to explain, but the implementation is still complex and non-trivial. Most importantly, it requires platform-specific knowledge to get the memory ordering right. I use "magic" for things that can either not be written portably by the user at all (like an efficient mutex, or tcmalloc, or the type name of a lambda), or only with extreme heroics (like std::function); I don't mean it as "cannot be understood". – Kerrek SB Oct 02 '16 at 12:49
  • 18
    @AndreasHaferburg: And my final advice is not so much saying that it's in principle hard to "know better than the default", but rather that the cost of maintaining a custom solution is high (somebody has to write it, document it, get it right, and someone else has to fix the bugs, and everybody has to review and reverify the original assumptions as usage spreads), and that you need evidence to justify that cost. – Kerrek SB Oct 02 '16 at 12:50
  • 6
    I would still say that it is important to collect evidence that your implementation is needed. It's easy to set off on a massive excursion only to find the default implementation handles your supposed special case just as well as your most efficient specialisation. – Keldon Alleyne Dec 29 '16 at 23:43
  • @jdk1.0: What is an "addition/return pointer", and why do you believe that that's the entirety of what a memory allocator does? – ruakh Oct 21 '20 at 18:38
  • One use case for arena allocation is if you need time guarantees, e.g. for real-time embedded use. But that is definitely knowing that std::malloc is insufficient. However I must agree that you need to be careful. E.g. when you start allocating odd numbers of bytes, you need to look at having proper alignment. Something that's not seen in this example. What about multi-threaded use? You can hide this in a good implementation, and if you break something this way, you can easily fix it everywhere. But there's always more complexity than what's obvious from a toy example. It's not magic though. – CodeMonkey Sep 29 '21 at 08:18
15

I'll go with this one as a possible answer.

Memory Arena (also known as break space)--the area where dynamic runtime memory is stored. The memory arena consists of the heap and unused memory. The heap is where all user-allocated memory is located. The heap grows up from a lower memory address to a higher memory address.

I'll add Wikipedia's synonyms: region, zone, arena, area, or memory context.

Basically, it's memory you get from the OS and divvy out yourself, and which can then be freed all at once. The advantage is that repeated small calls to malloc() can be costly (every memory allocation has a performance cost: the time it takes to allocate the memory in your program's logical address space and the time it takes to assign that address space to physical memory), whereas if you know roughly how much you'll need, you can get yourself one big chunk of memory and hand it out to your variables as you need it.

Mike
13

Think of it as a synonym for 'heap'. Ordinarily, your process only has one heap/arena, and all memory allocation happens from there.

But sometimes you have a situation where you would like to group a series of allocations together (e.g. for performance, to avoid fragmentation, etc.). In that case, it's better to allocate a new heap/arena, and then for any allocation, you can decide which heap to allocate from.

For example, you might have a particle system where lots of objects of the same size are being frequently allocated and deallocated. To avoid fragmenting memory, you could allocate each particle from a heap which is only used for those particles, and all other allocations would come from the default heap.
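A dedicated particle heap of that kind might look roughly like the following. It is only a sketch: the `Particle` fields, the capacity of 1024 and the function names are assumptions made for illustration, and the pool is not thread-safe. Because every slot has the same size, a freed slot can always be reused by the next allocation, so the pool itself never fragments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical particle type and pool size, chosen only for illustration. */
typedef struct { float x, y, vx, vy; } Particle;

/* A slot holds a live particle or, while free, a link to the next free slot. */
typedef union Slot {
    Particle particle;
    union Slot *next_free;
} Slot;

#define POOL_CAPACITY 1024

static Slot pool[POOL_CAPACITY];   /* the arena dedicated to particles */
static Slot *free_list = NULL;     /* slots returned by particle_free */
static size_t fresh = 0;           /* number of slots handed out at least once */

/* O(1): reuse a freed slot if there is one, else take a fresh one. */
Particle *particle_alloc(void) {
    if (free_list != NULL) {
        Slot *s = free_list;
        free_list = s->next_free;
        return &s->particle;
    }
    if (fresh < POOL_CAPACITY) return &pool[fresh++].particle;
    return NULL;  /* pool exhausted */
}

/* O(1): push the slot back on the free list.
   p must have come from particle_alloc. */
void particle_free(Particle *p) {
    Slot *s = (Slot *)p;   /* a union and its members share one address */
    s->next_free = free_list;
    free_list = s;
}
```

All other allocations in the program would keep using the default heap; only particles go through `particle_alloc` and `particle_free`.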

Adam Rosenfield
6

From http://www.bozemanpass.com/info/linux/malloc/Linux_Heap_Contention.html:

The libc.so.x shared library contains the glibc component and the heap code resides inside it. The current implementation of the heap uses multiple independent sub-heaps called arenas. Each arena has its own mutex for concurrency protection. Thus if there are sufficient arenas within a process' heap, and a mechanism to distribute the threads' heap accesses evenly between them, then the potential for contention for the mutexes should be minimal. It turns out that this works well for allocations. In malloc(), a test is made to see if the mutex for current target arena for the current thread is free (trylock). If so then the arena is now locked and the allocation proceeds. If the mutex is busy then each remaining arena is tried in turn and used if the mutex is not busy. In the event that no arena can be locked without blocking, a fresh new arena is created. This arena by definition is not already locked, so the allocation can now proceed without blocking. Lastly, the ID of the arena last used by a thread is retained in thread local storage, and subsequently used as the first arena to try when malloc() is next called by that thread. Therefore all calls to malloc() will proceed without blocking.

You can also refer to this link:

http://www.codeproject.com/Articles/44850/Arena-Allocator-DTOR-and-Embedded-Preallocated-Buf

jscs
Rahul Tripathi
  • 4
    FYI when posting links you should post a summary so that if the linked article goes away your post is still useful. – stonemetal Oct 10 '12 at 17:48
  • 5
    This seems to be a copy-paste from http://www.bozemanpass.com/info/linux/malloc/Linux_Heap_Contention.html Please credit your sources when you use them verbatim. – jscs Oct 11 '12 at 07:27
1

Sharing what I learned about this issue. (The answer is based on glibc malloc.)

Arena and heap are two different data structures for memory management, and they work at different levels: the arena is the higher-level one.

Arena: A structure that is shared among one or more threads and contains references to one or more heaps. By default, each process has at least one arena, the main arena, which is created by the main thread. A multi-threaded program will have multiple arenas (called thread arenas). But there is no one-to-one mapping between threads and arenas, since there is an upper limit on the number of arenas, as below:

For 32 bit systems:
     Number of arena = 2 * number of cores.
For 64 bit systems:
     Number of arena = 8 * number of cores.

A thread arena can contain multiple heaps, but the main arena has only one.

Heap: A contiguous region of memory that is subdivided into chunks to be allocated. Each heap belongs to exactly one arena.
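The arena limit quoted above is simple arithmetic, and can be written out as a tiny helper. The function name `arena_limit` and its `bits` parameter are assumptions made for illustration; this is not a glibc function, just the rule from this answer expressed in code.

```c
#include <assert.h>

/* Hypothetical helper mirroring the rule quoted above:
   2 arenas per core on 32-bit systems, 8 per core on 64-bit systems. */
long arena_limit(long cores, int bits) {
    assert(bits == 32 || bits == 64);
    return cores * (bits == 32 ? 2 : 8);
}
```

So a 4-core 64-bit machine would top out at 32 arenas by default, which is why a program with more threads than that ends up with several threads sharing one arena.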


Chris Bao