Memory leaks - the horror of every programmer?

Question

I'm programming a game engine in C++, which also has Lua support.

My biggest horror: Memory leaks.

It's not that my game is already infested with them; rather, I'm afraid of them popping out of the ground like mushrooms when development is in a late phase and the project is huge and complex.

I'm afraid of them because they seem extremely hard to spot for me. Especially in sophisticated systems. If my engine is almost finished, the game runs and the memory gets eaten away, what will I do? Where will I start searching?

Is my fear of memory leaks justified?
How can one find out where a memory leak lies?
Aren't there good tools which help in finding the source of memory leaks today?

Meh, the real horror is heap corruption. Leaks are easy to diagnose with a debug allocator. Heap corruption is liable to be your next stop when you try to solve the leaks. — Hans Passant, Feb 06 '11 at 16:02
"If my engine is almost finished, the game runs and the memory gets eaten away, what will I do?" - if your engine is almost finished and it doesn't give the right answers, what will you do? You'll wish that you'd tested it earlier. Same with memory leaks - start testing for them now. — Steve Jessop, Feb 06 '11 at 18:20

James Armstrong · Answer 1 · 2011-02-06T15:40:31.410

37

How can one find out where a memory leak lies?

Valgrind

edited Feb 06 '11 at 15:40

answered Feb 06 '11 at 15:35


another +1 it really is a fantastic tool for this type of problem – Sam Miller Feb 06 '11 at 16:01

score 17 · Answer 2 · answered Feb 06 '11 at 16:10

Raw pointers are only one potential cause of memory leaks.

Even if you use smart pointers, like shared_ptr, you can get a leak if you have a cycle - the cure for this is to use a weak_ptr somewhere to break the cycle. Using smart pointers is not a cure-all for memory leaks.
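A minimal sketch of breaking such a cycle, assuming a hypothetical `Node` type where the parent owns its child and the child only needs a back reference. The names `Node`, `alive`, and `build_and_drop_tree` are made up for illustration; only `std::shared_ptr` and `std::weak_ptr` come from the standard library.

```cpp
#include <cassert>
#include <memory>

// Hypothetical scene-graph node, used only for illustration.
struct Node {
    std::shared_ptr<Node> child;   // the parent owns its child
    std::weak_ptr<Node> parent;    // the back reference does not own the parent
    static int alive;              // counts constructed-but-not-destroyed nodes
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Builds a two-node tree with a back reference, lets it go out of scope,
// and returns how many nodes are still alive afterwards.
int build_and_drop_tree() {
    {
        auto parent = std::make_shared<Node>();
        parent->child = std::make_shared<Node>();
        parent->child->parent = parent;  // weak: does not bump the owner count
        assert(Node::alive == 2);
    }
    return Node::alive;
}
```

If `parent` were declared as a `std::shared_ptr<Node>` instead, the two nodes would keep each other alive and the function would report two surviving nodes.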

You can also forget a virtual destructor in a base class, and get leaks that way.
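A small sketch of the safe version, assuming hypothetical `Entity`, `Player`, and `Buffer` types. Deleting a derived object through a base pointer whose destructor is not virtual is undefined behaviour, and in practice the derived members are often never cleaned up, so the example only shows the corrected form.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource with an instance counter, for illustration only.
struct Buffer {
    static int live;
    Buffer() { ++live; }
    ~Buffer() { --live; }
};
int Buffer::live = 0;

struct Entity {
    virtual ~Entity() = default;  // virtual: derived destructors run on delete
};

struct Player : Entity {
    Buffer inventory;             // must be destroyed together with the Player
};

// Destroys a Player through an Entity pointer and returns the number of
// Buffer objects that are still alive afterwards.
int destroy_through_base() {
    std::unique_ptr<Entity> e(new Player);
    assert(Buffer::live == 1);
    e.reset();                    // calls ~Player via the virtual destructor
    return Buffer::live;
}
```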

Even if there are no problems with new-ed objects not being deleted, a long-running process can grow (and appear to leak) because of address space fragmentation.

Tools like valgrind are very, very useful for finding leaks, but they won't always tell you where the fix should be (e.g. in the case of cycles or objects holding onto smart pointers).

Chris Card

Using proper tools isn't a replacement for proper design, indeed. But IMX, designs that actually require cycles at all are surprisingly rare. And yes, a tool can't really tell you how to fix a problem properly. If it could that would be a major breakthrough in AI ;) – Karl Knechtel Feb 06 '11 at 17:33
upvoted for the fragmentation reference. On a long running system that will bite you long after you track and get rid of all memory leaks. – shoosh Feb 06 '11 at 21:56

score 12 · Answer 3 · answered Feb 06 '11 at 15:37

A well defined "object lifetime" model is needed. Anytime you do a "new", you need to think about

1. Who owns this heap object? I.e., who is responsible for maintaining this pointer and allowing other "clients" to reference it?
2. Who is responsible for deleting this object? This is normally #1, but not necessarily.
3. Are there any clients of this object whose lifetime is longer than that of this object? If that's true, and they are actually STORING this heap pointer, they will dereference memory that no longer "exists". You may need to add some notification mechanisms or redesign your "object lifetime" model.

A lot of times you fix memory leaks, but then run into problem #3. That's why it's best to have thought out your object lifetime model before you write too much code.
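One possible way to make problem #3 detectable, sketched under the assumption that the owner holds a `std::shared_ptr` and long-lived clients hold only a `std::weak_ptr`. The types `Texture`, `TextureCache`, and `Sprite` are hypothetical; this is one option among several, not the only lifetime model.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Texture { int id; };  // hypothetical resource

// The single owner: decides when the texture is created and destroyed.
class TextureCache {
public:
    std::weak_ptr<Texture> load(int id) {
        texture_ = std::make_shared<Texture>(Texture{id});
        return texture_;          // clients receive a non-owning handle
    }
    void unload() { texture_.reset(); }
private:
    std::shared_ptr<Texture> texture_;
};

// A client that may outlive the texture it refers to.
class Sprite {
public:
    explicit Sprite(std::weak_ptr<Texture> t) : texture_(std::move(t)) {}
    // Returns the texture id, or `fallback` if the owner already freed it.
    int texture_id_or(int fallback) const {
        if (auto t = texture_.lock()) return t->id;
        return fallback;
    }
private:
    std::weak_ptr<Texture> texture_;
};
```

After `TextureCache::unload()`, a `Sprite` sees an expired handle instead of dereferencing freed memory.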

score 11 · Answer 4 · answered Feb 06 '11 at 15:30

You fight memory leaks by never using raw pointers. If you have code using raw pointers, refactor.
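As a rough illustration of such a refactoring, assume a hypothetical `load_mesh` function that has an early return. With a raw `new`, every exit path would need its own `delete`; with `std::unique_ptr` the cleanup happens on all of them. `Mesh` and its counter exist only for this sketch.

```cpp
#include <cassert>
#include <memory>

// Hypothetical asset type with an instance counter, for illustration only.
struct Mesh {
    static int live;
    bool valid = false;
    Mesh() { ++live; }
    ~Mesh() { --live; }
};
int Mesh::live = 0;

// Returns false on the early-exit path. The unique_ptr releases the Mesh on
// every return, so no path can forget to delete it.
bool load_mesh(bool fail_early) {
    std::unique_ptr<Mesh> mesh(new Mesh);
    if (fail_early) {
        return false;             // mesh is destroyed here
    }
    mesh->valid = true;
    return mesh->valid;           // and here as well
}
```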

fredoverflow

And if you have to use raw pointers, take advantage of [RAII](http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization) and stash them in their own objects. – In silico Feb 06 '11 at 15:38
Raw pointers are only one potential cause of memory leaks. Even if you use smart pointers (like shared_ptr) you can get a leak if you have a cycle. You can also forget a virtual destructor in a base class, and get leaks that way. Even if there are no problems with new-ed objects not being deleted, a long-running process can grow (and appear to leak) because of address space fragmentation. – Chris Card Feb 06 '11 at 15:53
@Chris: Great comment, can you refactor that into an answer? :) – fredoverflow Feb 06 '11 at 15:59
I think this answer assumes that the only use for raw pointers is for holding the address of dynamically allocated memory. Am I wrong? How do you keep re-assignable/non-owning references to other objects? – Benjamin Lindley Feb 06 '11 at 16:51
I agree with PigBen. Raw pointers are only a problem when object lifetime/ownership is associated with them. FredOverflow's point is still very applicable. Memory cycles are much easier to spot than misused pointers. – deft_code Feb 07 '11 at 06:40

score 7 · Answer 5 · 2011-02-08T20:14:23.423

Is my fear of memory leaks justified?

Short answer is yes. Long answer is yes, but there are techniques for reducing them and generally making it easier. The most important of these, in my opinion, is not to use new/delete lightly and to design your program to reduce or eliminate as much dynamic allocation as you can. Instead of allocating memory, try the following and only allocate your own memory when none of these methods work for you (roughly in order of what you should prefer):

Allocate on the stack
Use standard containers (C++ Standard Library and/or Boost), eg instead of writing your own linked list, use std::list
Related to the above, store objects by value if you can (ie, if copy construction isn't too expensive), or at least references (with references you do not need to worry about null) to stack allocated objects
Pass references to stack objects where possible
When you do need to allocate your own memory, try to use RAII to allocate in the constructor and free it again in the destructor. Make sure this object is allocated on the stack.
When you need to use pointers to manually allocated data, use smart pointers to automatically free objects that are no longer used
Clearly define what objects own what memory. Try to limit your class hierarchy to as few resource owners as possible and let them deal with allocating and releasing objects for you
If you need more general dynamic memory access, write a resource manager which takes ownership of objects. A handle system as described here is also useful because it allows you to "garbage collect" memory that is no longer needed. Handles could also be tagged with what subsystem owns what memory, so you can dump the state of the system's memory usage, eg, "subsystem A owns 32% of allocated memory"
Overload operators new and delete for your allocated classes so that you can maintain additional metadata about who/what/where/when allocated what
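The last item can be sketched roughly as follows, assuming a hypothetical `Particle` class that only counts live instances. A real engine would probably also record file, line, or subsystem; that part is left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class with class-level operator new/delete that keep a count
// of live heap instances.
class Particle {
public:
    static void* operator new(std::size_t size) {
        void* p = ::operator new(size);  // throws std::bad_alloc on failure
        ++live_count_;                   // only counted once allocation succeeded
        return p;
    }
    static void operator delete(void* p) noexcept {
        if (p) --live_count_;
        ::operator delete(p);
    }
    static std::size_t live_count() { return live_count_; }

    float x = 0.0f;
    float y = 0.0f;

private:
    static std::size_t live_count_;      // not thread-safe; a sketch only
};
std::size_t Particle::live_count_ = 0;
```

Checking `Particle::live_count()` at the end of a level (or in a unit test) shows whether any instance was allocated but never deleted.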

How can one find out where a memory leak lies?

Valgrind's suite of tools: Memcheck, Cachegrind, Callgrind, Massif, Helgrind...

You could also try compiling with electric fence (-lefence for gcc) or your compiler's equivalent. You can also try Intel's suite of tools, especially if you are writing multithreaded code or performance sensitive code (eg, Parallel Studio), though they are expensive.

Aren't there good tools which help in finding the source of memory leaks today?

Sure there are. See above.

Now, since you are writing a game, I will share some of my own game development related thoughts/experiences:

Preallocate as much as you can (ideally everything), at the start of each level. Then during a level, you do not need to worry about memory leaks. Keep pointers to everything you allocated and at the start of the next level, free it all, since there should be no way for dangling pointers to exist when the next level loads its own clean set of data.
Use stack allocators (either using the actual stack or creating your own in the heap) to manage allocations within a level or frame. When you need memory, take it from the top of the stack. Then, when that level or frame is completed, you can simply clear the stack (if you store only POD types, this is fast and simple: just reset the stack pointer). I use this to allocate messages for my messaging system in my own game engine: during a frame, messages (which are fixed-size POD types in my engine) are allocated off a stack-like memory pool (whose memory is preallocated). All events are simply taken from the stack. At the end of the frame, I "swap" the stack for a second one (so that event handlers can also send events) and call each handler on the events. Finally, I simply reset the pointer. This makes allocation of messages very fast and makes it impossible to leak memory.
Use a resource system (with handles) for all resources which you can't manage through memory pools or stack allocators: buffers, level data, images, audio. My resource system, for example, also supports streaming resources in the background. Handles are created immediately but marked as "not ready" until the resource has finished streaming.
Design your data, not your code (ie, design the data structures first). Isolate memory allocation where possible.
Try to keep similar data together. Not only will this make managing its lifetime easier, but it may also improve your engine's performance due to better cache utilization (eg, all positions of characters are stored in a container of positions, all positions are updated/processed together, etc).
Finally, if you can program in as pure a functional style as possible, rather than relying on OOP too much, you can simplify a number of problems: memory management is easier because the parts of your code that can mutate data are limited. Allocation happens before the functions are called, deallocation when they are finished (a dataflow pipeline of function calls). Secondly, if you deal with pure-functional code working on immutable data, multicore programming will be greatly simplified. Double win. Eg, in my engine, game objects' data is processed by purely functional code which takes, as input, the current game state and returns, as output, the next frame's game state. This makes it very easy to trace which parts of the code can allocate or free memory and generally trace the lifetime of objects. I can also process the game objects in parallel because of this.
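The per-frame stack allocator described in the list above could look roughly like this. It is a sketch under my own assumptions, not the author's engine code: `FrameAllocator`, `Message`, and `run_one_frame` are hypothetical names, only trivially destructible (POD-like) types are stored, and alignments larger than `alignof(std::max_align_t)` are not supported.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Bump ("stack") allocator over one preallocated buffer. Individual frees do
// not exist; everything is released at once by reset().
class FrameAllocator {
public:
    explicit FrameAllocator(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity) {}

    // `align` must be a power of two no larger than alignof(std::max_align_t).
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        const std::size_t start = (offset_ + align - 1) & ~(align - 1);
        if (start > capacity_ || size > capacity_ - start) {
            return nullptr;       // out of frame memory
        }
        offset_ = start + size;
        return buffer_.get() + start;
    }

    void reset() { offset_ = 0; } // valid only for trivially destructible data
    std::size_t used() const { return offset_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t offset_ = 0;
};

// Hypothetical fixed-size POD message.
struct Message {
    int type;
    float payload[3];
};

// Allocates one message for this frame, then releases the whole frame at once.
// Returns the number of bytes the frame used before the reset.
std::size_t run_one_frame(FrameAllocator& frame) {
    void* slot = frame.allocate(sizeof(Message), alignof(Message));
    assert(slot != nullptr);
    Message* msg = new (slot) Message{1, {0.0f, 0.0f, 0.0f}};
    (void)msg;                    // handlers would consume the message here
    const std::size_t used = frame.used();
    frame.reset();                // nothing can leak past the end of the frame
    return used;
}
```

Because nothing is freed individually, there is no per-message bookkeeping that could be forgotten; the cost is that all data allocated this way must be dead by the time `reset()` is called.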

Hope that helped.

score 6 · Answer 6 · answered Feb 06 '11 at 22:05

AFAIK Valgrind is Linux-only.
For Windows you have tools like BoundsChecker and Purify.
If you're using Visual Studio, the C Runtime library (CRT) also provides a surprisingly simple and useful tool for finding memory leaks out of the box. Read about _CrtDumpMemoryLeaks and its related functions and macros.
It basically allows you to get an indexed dump of the memory leaks when the process exits and then allows you to set a breakpoint at the point where the leaking memory was allocated, so you can see exactly when it happened. This is in contrast to most other tools that only give you post-mortem analysis without a way to reproduce the events that led to the memory leak.
Using these little gems from day one gives you a relative peace of mind that you're in good shape.
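A hedged sketch of how the CRT debug heap is typically switched on. It only does something in a Visual C++ debug build; on other compilers the guarded part is compiled out. The helper name `enable_crt_leak_report` and the constant `kCrtLeakCheckAvailable` are made up for this example, and the allocation number in the commented-out `_CrtSetBreakAlloc` call is a placeholder you would take from a previous leak dump.

```cpp
#if defined(_MSC_VER) && defined(_DEBUG)
#define _CRTDBG_MAP_ALLOC   // lets the report include file/line for malloc calls
#include <stdlib.h>
#include <crtdbg.h>
constexpr bool kCrtLeakCheckAvailable = true;
#else
constexpr bool kCrtLeakCheckAvailable = false;
#endif
#include <cassert>

// Call once at program start. Returns true if leak reporting was enabled.
bool enable_crt_leak_report() {
#if defined(_MSC_VER) && defined(_DEBUG)
    // Track allocations and dump anything still allocated at process exit.
    _CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF);
    // Once a dump shows a leak as allocation number {N}, break on it:
    // _CrtSetBreakAlloc(N);
    return true;
#else
    return false;
#endif
}
```

With `_CRTDBG_LEAK_CHECK_DF` set, the dump is produced automatically at exit, so an explicit `_CrtDumpMemoryLeaks()` call is only needed if you want a report at some earlier point.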

+1 on this response. _CrtDumpMemoryLeaks and friends is a fine start to weeding out memory leaks. Otherwise, the workaround is to write macros that instrument calls to malloc/free and use your own data structure to keep track of allocations that haven't been released. — selbie, Feb 07 '11 at 05:27

score 4 · Answer 7 · answered Feb 06 '11 at 17:31

Is my fear of memory leaks justified?

If you are writing code that contains them, then absolutely.

How can one find out where a memory leak lies?

By analyzing the code.

Aren't there good tools which help in finding the source of memory leaks today?

Yes, but it's still not easy.

The good news is, it's completely unnecessary with proper design. An ounce of prevention is worth a ton of cure: the best strategy for dealing with memory leaks is to write the code in a way that guarantees that there aren't any.

score 3 · Answer 8 · answered Feb 06 '11 at 18:31

At the risk of sounding like the smug jerk that I probably am, consider using any programming language developed after 1979, which don't have problems with memory leaks, heap corruption, stack corruption, or even memory management. (Anyone who says something like "I need my program to be fast" has probably never heard of Donald Knuth.)

Michael Lorton

The latest C++ dates to 2003. The upcoming version, which is actually already supported on most major platforms, includes smart pointers in the standard library. – shoosh Feb 06 '11 at 22:07
2003. Your defense of the language is that its most recent update was eight years ago and that one of its most serious shortcomings will be addressed quite soon? Really? Shoosh, seriously, when I started using C++ it was an exciting, innovative language, Reagan was president, it was Morning in America, and everybody was listening to Wham!, but those days are gone. Today, the Gipper turns 100, Blackeyed Peas is singing at the Superbowl, and it's time to switch to a garbage-collected, strongly-typed language. – Michael Lorton Feb 06 '11 at 22:23
@Malvolio: have you _actually_ read Knuth? The weird MIX stuff? He's the guy that actually insists on using a language that was outdated in 1979, let alone today. As for the "shortcoming addressed", Boost dates back to the previous century and addressed the problem. The standard merely standardizes. – MSalters Feb 07 '11 at 10:08
Saying "there exists a solution to memory leaks" is like saying "black plague doesn't kill *everyone*." You can avoid memory leaks in any language, so long as every module and library you use is perfect and is used perfectly. What C++ needed to do -- and do back when the first Bush was president -- was standardize *and universalize* on a solution that made memory leaks impossible. Yes, Knuth is a weird guy (he has his secretary print out his email so he can read it), but that doesn't mean he's wrong about premature optimization. – Michael Lorton Feb 07 '11 at 17:12
For the record, half of Knuth's book _Literate Programming_ was about why he liked to use GOTO even though Dijkstra said not to. – Max Lybbert Feb 07 '11 at 18:07
C++ has had features to manage resources since the beginning. The standard library may not have a good smart pointer, but C++ programmers know where to get one ( http://www.boost.org/ ). You can even get a garbage collector ( http://www.hpl.hp.com/personal/Hans_Boehm/gc/ ). C++ recognizes that memory is only one resource that needs managing, and the mechanisms to manage memory work well for managing other resources, which isn't true of other languages ( http://mail.openjdk.java.net/pipermail/coin-dev/2009-February/000011.html "2/3 of the uses of the close method in the JDK itself are wrong!"). – Max Lybbert Feb 07 '11 at 18:21
"the mechanisms to manage memory work well for managing other resources" -- that just isn't true. It's extraordinarily rare that a non-memory resource allocated by one module would be managed by another; for that matter, a non-memory resource allocation of any sort is vanishingly rare compared to memory allocation. "half of Knuth's book Literate Programming was about why he liked to use GOTO". Is that true? Knuth is so weird. – Michael Lorton Feb 07 '11 at 19:19
"It's extraordinary rare that a non-memory resource allocated by one module would be managed by another": personally I don't like to have memory allocated by one module deallocated by another. It's hard to identify ownership of the object and it's possible to introduce bugs when the modules are compiled at different times, with different compilers, or with different settings ( http://blogs.msdn.com/b/oldnewthing/archive/2006/09/15/755966.aspx ). At the very least, module A may tell module B that module A no longer needs a resource, but module B should be the one to clean up. – Max Lybbert Feb 07 '11 at 20:13
When I use RAII and ownership semantics to manage resources, including memory, I don't generally transfer ownership (and responsibility for cleanup) all around my program. Most of the time, the resource being managed only exists inside a well-defined scope, and `try`-`finally` would work if `try`-`finally` weren't fundamentally broken (again, at some point Sun's programmers implementing the JDK -- who should have known Java pretty well -- got `try`-`finally` wrong in 2/3 of the cases; a mechanism that experienced programmers get wrong 2/3 of the time is not a well-designed mechanism). – Max Lybbert Feb 07 '11 at 20:18
@Max -- close() is misused in 2/3 of all cases, finally is misused in 2/3 of all cases, addition is misused in 2/3 of all cases ... Maybe what's wrong is your definition of "misused". – Michael Lorton Feb 07 '11 at 22:13
I'm not the one who defined "misused." The statement comes from a JDK proposal that was eventually accepted ( http://mail.openjdk.java.net/pipermail/coin-dev/2009-February/000011.html ). Although you are correct that it was `close()` that was misused, not `finally` (of course `close()` was most likely called from `finally` blocks). – Max Lybbert Feb 07 '11 at 23:22
The underlying point is Stroustrup realized memory management was a special case of resource management. Other languages solved the memory management problem with a garbage collector, realized that they hadn't solved resource management, and gave that problem to a crack-addled monkey. C#, Python and Java are all running away from `finally` now, with mechanisms that look suspiciously like what Stroustrup came up with years ago. – Max Lybbert Feb 07 '11 at 23:24
@Max -- `finally` is far from perfect, but the example cited is just plain stupid. Whoever wrote the code didn't read the manual. I don't know what languages other than Python are planning -- but much as I like Python's solution (the `with` context manager), it wouldn't work at all for memory. – Michael Lorton Feb 08 '11 at 08:17
I'm here to tell you it does work just fine for memory. Or, rather, it's based on a design that works just fine for memory. I'm not saying "it would possibly work," I'm saying programs on your computer right now use RAII to successfully avoid memory leaks. And they also use it to avoid file handle leaks, GUI handle leaks, network socket leaks, etc. One happens to be the Python interpreter (yes, it's written in C, but it uses the same technique). The problem with garbage collection is that it solves resource management of one resource and leaves the programmer on his own for all others. – Max Lybbert Feb 08 '11 at 09:57
The second chapter of _Literate Programming_ is "Structured Programming with go to Statements" and takes up a third of the book. This chapter also includes the statement, "It seems clear that languages somewhat different from those in existence today would enhance the preparation of structured programs. We will perhaps eventually be writing only small modules which are identified by name as they are used to build larger ones, so that devices like indentation, rather than delimiters, might become feasible for expressing local structure in the source language." – Max Lybbert Feb 08 '11 at 10:01

score 2 · Answer 9 · answered Feb 06 '11 at 15:46

Memory leaks are not too scary - but they can be detrimental to a program's performance (so get rid of them!).

Don't be afraid of memory leaks. Think about what they are - memory which hasn't been deleted but to which all access is removed (so until execution ends, the system doesn't know it can re-allocate that memory).
You can find memory leaks "by-hand" so to speak by going through your objects and being sure they're deleted in the proper place (and only once! otherwise other errors will come about). Tools such as Valgrind can help spot where the error may be.
As someone mentioned before (and I mentioned above), Valgrind is a great tool for finding memory leaks. Run it like this:

valgrind --leak-check=full -v ./YOUR_EXECUTABLE

That will give you a full leak check and verbose (-v) output on how memory is being used in your program.

Regards,
Dennis M.

score 2 · Answer 10 · answered Feb 06 '11 at 23:34

At least for the Lua part you can use your own memory allocator and track all allocations and freeing and so spot any memory leaks.
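A minimal sketch of such a tracking allocator. The function has the shape of Lua's `lua_Alloc` callback (user data, block pointer, old size, new size), so it should be usable with `lua_newstate(tracking_lua_alloc, &stats)`; the Lua headers are deliberately left out so the snippet stands alone. `LuaMemStats` and `tracking_lua_alloc` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical statistics block, passed to the allocator as its user data.
struct LuaMemStats {
    std::size_t live_bytes = 0;
    std::size_t live_blocks = 0;
};

// Same parameter list as a lua_Alloc callback:
//   nsize == 0      -> free the block and return a null pointer
//   ptr == nullptr  -> allocate a new block (osize is not an old size then)
//   otherwise       -> resize the block from osize to nsize
void* tracking_lua_alloc(void* ud, void* ptr,
                         std::size_t osize, std::size_t nsize) {
    LuaMemStats* stats = static_cast<LuaMemStats*>(ud);
    const std::size_t old_size = ptr ? osize : 0;

    if (nsize == 0) {
        if (ptr) {
            stats->live_bytes -= old_size;
            --stats->live_blocks;
        }
        std::free(ptr);
        return nullptr;
    }

    void* result = std::realloc(ptr, nsize);
    if (!result) {
        return nullptr;           // the old block is still valid, stats unchanged
    }
    if (!ptr) ++stats->live_blocks;
    stats->live_bytes += nsize;
    stats->live_bytes -= old_size;
    return result;
}
```

After closing the Lua state, `live_bytes` and `live_blocks` should both be back at zero; anything else points at memory the Lua side never released.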

lhf

score 1 · Answer 11 · answered Feb 06 '11 at 22:20

There are various techniques to track memory leaks.

The simplest is to use a macro and a specific allocator that stores the function that made each allocation. That way, you can track each allocation and see which ones aren't deleted when they should be. Then you can start writing unit tests and assert that memory has been freed.

If you use pre-compiled containers all the time, this won't work as all allocations will be in containers. Then your options are:

Use a thread-local value to identify the subsystem or class id (in debug builds) that's running, so that your allocator can detect who is allocating memory. You could even use a stack to track memory usage hierarchically in your engine.
Actually retrieve the call stack and store that, if your compiler has sufficient support.
Use memory pools for the subsystems, and measure if their size increases disproportionately. (This is also an (admittedly poor) workaround for leaky memory, since you could free the entire pool at once, thus freeing leaked memory too, if you're able.)
On Windows, there are some macros that track memory allocation by source line automatically under debug builds.

There are probably more options than that. Testing and the use of a custom global new/delete override (that can be queried) should prove useful, if your design permits it.
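The thread-local tagging idea might be sketched like this. Everything here is a hypothetical illustration: the `Subsystem` ids, `ScopedSubsystem`, and the `tracked_alloc`/`tracked_free` pair are invented names, and the per-subsystem counters are plain globals, so a real multithreaded engine would need atomics or per-thread counters.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical subsystem ids.
enum Subsystem : std::size_t { kRender, kAudio, kScript, kSubsystemCount };

thread_local Subsystem g_current_subsystem = kRender;
std::array<std::size_t, kSubsystemCount> g_live_bytes{};  // not thread-safe

// RAII tag: allocations made while this object lives are billed to `s`.
class ScopedSubsystem {
public:
    explicit ScopedSubsystem(Subsystem s) : previous_(g_current_subsystem) {
        g_current_subsystem = s;
    }
    ~ScopedSubsystem() { g_current_subsystem = previous_; }
    ScopedSubsystem(const ScopedSubsystem&) = delete;
    ScopedSubsystem& operator=(const ScopedSubsystem&) = delete;
private:
    Subsystem previous_;
};

// Header stored in front of every block; the alignment keeps user data aligned.
struct alignas(std::max_align_t) BlockHeader {
    Subsystem owner;
    std::size_t size;
};

void* tracked_alloc(std::size_t size) {
    void* raw = std::malloc(sizeof(BlockHeader) + size);
    if (!raw) return nullptr;
    BlockHeader* header = static_cast<BlockHeader*>(raw);
    header->owner = g_current_subsystem;
    header->size = size;
    g_live_bytes[header->owner] += size;
    return header + 1;            // user memory starts right after the header
}

void tracked_free(void* ptr) {
    if (!ptr) return;
    BlockHeader* header = static_cast<BlockHeader*>(ptr) - 1;
    g_live_bytes[header->owner] -= header->size;
    std::free(header);
}
```

Dumping `g_live_bytes` at the end of a level gives the kind of "which subsystem still holds memory" report the list describes; the same functions could back a global `operator new`/`operator delete` override.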

Also, see the Electronic Arts STL C++ paper for some discussion on what needs to be done in STL/C++ to support proper game development. (It's probably a bit more hardcore than your engine, but it certainly contains many nuggets of inspiration and ingenuity.)

score 1 · Answer 12 · answered Mar 05 '11 at 12:57

How can one find out where a memory leak lies?

Visual Leak Detector for Visual C++ 2008/2010

KindDragon
