We have a problem using HeapCreate()/HeapAlloc() for big allocations (> 512K)
We are developing a C++ server application that performs 'image processing' operations concurrently on a few images at a time. It should run for a long time without restarting.
Our processing model is quite specific. The server starts, performs some analysis to determine the maximum number of concurrent images the given hardware configuration can handle with stable operation and best performance, quickly reaches that maximum load, and then runs at roughly the same high load most of the time, depending on the input queue.
That means we allocate all the required memory at the start, and the total amount of memory should not grow (if everything is fine). Our pain is fragmentation. Incoming images vary in size from 400K to possibly 50M, and processing each one leads to correspondingly large OpenCV allocations (proportional to the image size). Processing scenarios (and the related allocations) vary with the specifics of each image, alloc/free activity is very intensive, and after some time we run into fragmentation. Some local optimizations were developed, with negligible improvement. In practice we hit out-of-memory/fragmentation effects after approximately 50,000-70,000 images, which is not acceptable. The current solution is restarting the server, which is far from ideal.
Our initial, naive proposal to solve the problem was:
- We have our own custom heap that initially commits all the required memory.
- All 'big' OpenCV allocations (and ONLY those) are redirected to this heap (a sketch follows this list).
- The moment fragmentation arrives, we stop accepting new input and let all running jobs finish.
- That means all image-related allocations are released.
- We check the heap and clean it if required (because of memory leaks, for example).
- Now we have a completely empty heap and can start from scratch, then open the input again.
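To make the idea concrete, here is a minimal sketch of the wrapper we have in mind, assuming the 'big' allocations can be routed through a pair of alloc/free callbacks. How exactly the callbacks get hooked into OpenCV is omitted, and the 250M budget is just a placeholder:

```cpp
#include <windows.h>

// Private heap that holds only the "big" image buffers.
static HANDLE g_bigHeap = nullptr;

// Placeholder budget: commit the whole working set up front, let the heap grow if needed.
static const SIZE_T kBigHeapCommit = 250u * 1024 * 1024;

bool InitBigHeap()
{
    // dwMaximumSize == 0 -> growable heap; dwInitialSize -> committed immediately.
    g_bigHeap = HeapCreate(0, kBigHeapCommit, 0);
    return g_bigHeap != nullptr;
}

// These two functions are what we would plug into the OpenCV allocation hook
// (hooking mechanism intentionally left out -- it is not the problem here).
void* BigAlloc(size_t size)
{
    return HeapAlloc(g_bigHeap, 0, size);
}

void BigFree(void* ptr)
{
    if (ptr) HeapFree(g_bigHeap, 0, ptr);
}

// The "start from scratch" step: once all jobs have finished, drop the whole
// heap and recreate it, so leaked blocks and fragmentation disappear together.
void ResetBigHeap()
{
    if (g_bigHeap) HeapDestroy(g_bigHeap);
    g_bigHeap = HeapCreate(0, kBigHeapCommit, 0);
}
```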
A simple proof-of-concept project quickly revealed the following:
- A heap created with HeapCreate(), initially committing 250M, grows by 10M each time I call HeapAlloc() on it! Strange, isn't it?
- As HeapWalk() showed, the committed memory was not reserved as one contiguous block, but as a list of more than 500 chunks of 512K each. So none of them was suitable for my 10M request, and the heap had to commit additional memory. (The sketch below reproduces the experiment.)
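For reference, the experiment was essentially the fragment below: create the heap with 250M committed, allocate 10M blocks from it, and after each allocation walk the heap to see how the committed memory is organized. The loop count is arbitrary; the figures quoted above are simply what we observed on our machine.

```cpp
#include <windows.h>
#include <cstdio>

int main()
{
    const SIZE_T initialCommit = 250u * 1024 * 1024;  // 250M committed at creation
    const SIZE_T requestSize   = 10u  * 1024 * 1024;  // 10M per HeapAlloc()

    HANDLE heap = HeapCreate(0, initialCommit, 0);    // growable private heap
    if (!heap)
    {
        std::printf("HeapCreate failed: %lu\n", GetLastError());
        return 1;
    }

    for (int i = 0; i < 10; ++i)
    {
        void* p = HeapAlloc(heap, 0, requestSize);

        // Walk the heap: count regions / uncommitted ranges and sum the committed size.
        SIZE_T regions = 0, uncommitted = 0, committed = 0;
        PROCESS_HEAP_ENTRY entry = {};                // lpData == NULL starts the walk
        while (HeapWalk(heap, &entry))
        {
            if (entry.wFlags & PROCESS_HEAP_REGION)
            {
                ++regions;
                committed += entry.Region.dwCommittedSize;
            }
            else if (entry.wFlags & PROCESS_HEAP_UNCOMMITTED_RANGE)
            {
                ++uncommitted;
            }
        }
        std::printf("alloc %2d at %p: regions=%zu, committed=%zuM, uncommitted ranges=%zu\n",
                    i, p, regions, committed >> 20, uncommitted);
    }

    HeapDestroy(heap);
    return 0;
}
```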
It seems the Win32 custom heap is optimized for small allocations only, and I was unable to find a way to use it for my needs :( VirtualAlloc() looks like a solution, but it is a very low-level API, and using it means developing my own memory-management system, which feels like reinventing the wheel.
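For completeness, the VirtualAlloc() route would start from something like the fragment below: one contiguous region reserved and committed up front. Everything after that (carving out blocks, tracking free ranges, coalescing on free) would have to be written by hand, which is exactly the wheel I would rather not reinvent:

```cpp
#include <windows.h>

// One contiguous 250M region, reserved and committed in a single call.
// All block bookkeeping on top of it would be our own code.
const SIZE_T kPoolSize = 250u * 1024 * 1024;

void* CreatePool()
{
    return VirtualAlloc(nullptr, kPoolSize, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
}

void DestroyPool(void* base)
{
    if (base) VirtualFree(base, 0, MEM_RELEASE);  // size must be 0 with MEM_RELEASE
}
```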
I want to believe some standard approach exists and I just cannot find it. Any help or pointers to relevant resources would be much appreciated.