I'm not sure if this is a false positive or a real problem. I have tried and failed to find anyone else who's faced/fixed this yet.
I'm using TBB version 4.2 (I can't upgrade due to legacy issues), and I observe a memory leak in my application when TBB concurrent containers are used together with heavily loaded TBB parallel loops.
A simple snippet that reproduces the issue:

    #include "tbb/tbb.h"

    void testFunc()
    {
        tbb::concurrent_unordered_map<int,int> t;
        // Outer loop: every iteration repeats the same inner workload.
        tbb::parallel_for(1,100,[&](int p)
        {
            // Inner loop: insert keys 1..9999999 into the shared map.
            tbb::parallel_for(1,10000000,[&](int n)
            {
                t.insert(tbb::concurrent_unordered_map<int,int>::value_type(n,n));
            });
        });
        t.clear();
    }

    int main()
    {
        testFunc();
        return 0;
    }

This leaks about 500 MB on every call to testFunc().
To reiterate, the leak isn't noticeable when only about 1,000 to 100,000 parallel iterations run at a time. Around 1 million is where chunks of lost memory start to become noticeable.
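For a sense of scale, here is a small sketch in plain standard C++ (no TBB) of what the loop bounds in the snippet imply. It relies on tbb::parallel_for(first, last, f) covering the half-open range [first, last). The names rangeSize, Workload and estimateWorkload are made up for this illustration, and the byte figure counts only the key/value payload, not node or bucket overhead.

```cpp
#include <cassert>
#include <utility>

// Number of indices visited by a loop over the half-open range [first, last).
long long rangeSize(long long first, long long last)
{
    assert(first <= last);
    return last - first;
}

// Hypothetical summary of the work done by the nested loops in testFunc().
struct Workload
{
    long long insertCalls;   // total calls to insert()
    long long uniqueKeys;    // distinct keys that can end up in the map
    long long payloadBytes;  // key/value payload only, no container overhead
};

Workload estimateWorkload(long long outerFirst, long long outerLast,
                          long long innerFirst, long long innerLast)
{
    Workload w;
    // Every outer iteration re-inserts the same set of inner keys.
    w.uniqueKeys = rangeSize(innerFirst, innerLast);
    w.insertCalls = rangeSize(outerFirst, outerLast) * w.uniqueKeys;
    w.payloadBytes = w.uniqueKeys *
        static_cast<long long>(sizeof(std::pair<const int, int>));
    return w;
}
```

With the bounds from the snippet, estimateWorkload(1, 100, 1, 10000000) gives 99 x 9,999,999 = 989,999,901 insert calls but only 9,999,999 distinct keys, so almost all of the inserts target keys that are already in the map.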
My questions:
- Is this an old issue that has been fixed in later TBB versions?
- Has it gone unnoticed because people rarely use concurrent containers that grow this large, or that are accessed and allocated under such a high parallel load?
- Does anyone know a way to work around this?
My application deals with data so large that I can neither do away with multi-threading nor know in advance how big the containers need to be for my operations.
Any help would be appreciated! Thanks!
Edit:
I'm on version 4.2.2014.601. I can't seem to find which update this is from.
I have checked the change list and noticed an entry for version 4.3 Update 6 that says they fixed a race condition that consumes a lot of memory under high parallel load. I'm not sure whether this is a manifestation of the same problem, and I'm also unsure how to verify that.
I have only used Visual Leak Detector while running the program, and I'm pretty sure it's not a case of threads finishing late, since I can see the chunk of memory staying in use until I explicitly close my application (observed in Windows Task Manager).
My system has 4 cores with 4 threads, and this happens with concurrent_unordered_set as well. The application uses TBB concurrent containers and parallel_for extensively, and I don't notice a leak when the number of iterations is on the order of hundreds of thousands. It becomes visible once you hit a million.
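One rough way to double-check the observation would be to call testFunc() several times in the same process, note the memory figure after each call, and see whether it keeps climbing or levels off. Below is a minimal sketch of that comparison. keepsGrowing is a hypothetical helper, and the samples are assumed to be collected by hand (for example from Task Manager) or through whatever OS API is available. Steady growth would be consistent with a leak, while a plateau would suggest the memory is being held and reused rather than lost.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if each memory sample exceeds the previous one by at least
// minGrowthBytes, i.e. usage kept climbing across every repeated run.
// samples[i] is the memory reading taken after the (i+1)-th call to testFunc().
bool keepsGrowing(const std::vector<std::size_t>& samples,
                  std::size_t minGrowthBytes)
{
    assert(samples.size() >= 2);
    for (std::size_t i = 1; i < samples.size(); ++i)
    {
        if (samples[i] < samples[i - 1] + minGrowthBytes)
        {
            return false;  // growth stalled or memory went down
        }
    }
    return true;
}
```

The threshold is there so that small fluctuations between runs are not counted as growth; what value to pick is a judgment call.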
I have also posted the question on the TBB forum. For those who may be facing a similar issue: https://software.intel.com/en-us/forums/intel-threading-building-blocks/topic/597165
I will post their response here as well once they've decided if it's a user error or a real issue ;)