Memory leak with TBB 4.2 with high multi-threading

Question

I'm not sure if this is a false positive or a real problem. I have tried and failed to find anyone else who's faced/fixed this yet.

I'm using TBB version 4.2 (I can't upgrade due to legacy issues), and I observe a memory leak in my application when TBB concurrent containers are used under a heavy load of TBB parallel work.

A simple snippet that reproduces the issue:

#include "tbb/tbb.h"

void testFunc()
{
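    tbb::concurrent_unordered_map<int,int> t;
    // Outer loop: 99 parallel iterations, each starting a nested parallel_for.
    tbb::parallel_for(1,100,[&](int p)
    {
        // Inner loop: concurrent inserts of keys 1..9999999 into the shared map.
        tbb::parallel_for(1,10000000,[&](int n)
        {
            t.insert(tbb::concurrent_unordered_map<int,int>::value_type(n,n));
        });
    });
    // clear() is not thread-safe; it is safe here because both loops have finished.
    t.clear();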
}
int main()
{
    testFunc();
    return 0;
}

This leaks about 500MB every time testFunc() is executed.

To reiterate, you don't notice this when running only about 1,000 to 100,000 parallel iterations at a time. At around 1 million you begin to notice chunks of memory being lost.

My questions:

Is this an old issue that's been fixed in the later TBB Versions?
Has it gone unnoticed because people don't usually use concurrent containers that are this big, or that are accessed and allocated under such heavy threading loads?
Does anyone know a way to work around this?

My application deals with huge amounts of data, so I cannot do away with multi-threading, nor can I know in advance the container sizes my operations need.

Any help would be appreciated! Thanks!

Edit:

I'm on version 4.2.2014.601. I can't seem to find which update this is from.

I have checked the changes list and noticed an entry in version 4.3 update 6 saying they fixed a race condition that consumes a lot of memory under high parallel load. I'm not sure whether this is a manifestation of the same problem, and I'm also unsure how to verify it.

I only used Visual Leak Detector while running the program, and I'm pretty sure it's not a case of threads finishing later, as I can see the chunk of memory still in use (in Windows Task Manager) until I explicitly close my application.

My system has 4 cores with 4 threads, and this happens with concurrent_unordered_set as well. The application uses TBB concurrent containers and parallel_for extensively, and I don't notice a leak when the number of iterations is on the order of hundreds of thousands. It becomes visible once you hit a million.

I have posted the question on the TBB forum as well. For those who may be facing a similar issue - https://software.intel.com/en-us/forums/intel-threading-building-blocks/topic/597165

I will post their response here as well once they've decided if it's a user error or a real issue ;)

score 1 · Answer 1 · edited May 23 '17 at 10:27

TBB's CHANGES file says nothing about concurrent_unordered_map or a memory leak being fixed since the last update release (5) of TBB 4.2. But if you are not on the latest update of TBB 4.2, there were memory leaks fixed on Intel Xeon Phi (MIC) and for parallel_reduce (not _for).

You have not described your system (e.g. the number of threads) or the tools used, so I have to guess. But if you are running your reproducer with the latest update of TBB 4.2, or on a regular CPU and with parallel_for only, it might be a problem either in TBB or in your method of memory leak detection.

Examples of the latter are memory being cached by the TBB scheduler or the TBB memory allocator, and the moment at which the worker threads terminate relative to the moment the leak detection runs, as described in this answer.

Once you have verified your detection approach, if you are still not satisfied with the memory consumption, please ask for help on the TBB forum. Just note that trading memory consumption for improved performance is a traditional deal in parallel programming.
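One rough way to tell cached memory from a real leak is to run the same workload several times in one process and compare the memory footprint after each run: cached memory should level off after the first run or two, while a leak keeps climbing. The sketch below is only an illustration of that check. The names sampleAfterEachRun and keepsGrowing are hypothetical, the probe is whatever platform-specific call reports your process's memory use (not shown), and the growth rule is a simple heuristic, not a TBB facility.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: runs `workload` `runs` times and records what `probe`
// returns after each run. `probe` stands for any platform-specific function
// that reports the process's current memory use in bytes.
std::vector<std::size_t> sampleAfterEachRun(const std::function<void()>& workload,
                                            const std::function<std::size_t()>& probe,
                                            int runs)
{
    assert(runs > 0);
    std::vector<std::size_t> samples;
    for (int i = 0; i < runs; ++i) {
        workload();
        samples.push_back(probe());
    }
    return samples;
}

// Heuristic (an assumption for illustration): treat the first run as warm-up,
// then report growth only if every later run ends more than `slack` bytes
// above the run before it.
bool keepsGrowing(const std::vector<std::size_t>& samples, std::size_t slack)
{
    if (samples.size() < 3) return false;  // too few runs to judge
    for (std::size_t i = 2; i < samples.size(); ++i) {
        if (samples[i] <= samples[i - 1] + slack) return false;
    }
    return true;
}
```

With the reproducer from the question, one would pass testFunc as the workload. If roughly 500MB were added after every call, keepsGrowing would return true. If the footprint stayed flat after the first call, the memory is more likely being held by a cache than leaked.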

Thanks for your clear response Anton, let me attempt addressing your questions. — divSivasankaran, Oct 23 '15 at 02:29
