
This is a follow-up to an earlier question.

From that discussion, the mmc code (https://github.com/fangq/mmc) appears to be fine: memory is properly released when running on Intel CPUs and AMD GPUs. On NVIDIA GPUs, however, valgrind reports a significant memory leak, and my own test agrees. After every cycle of creating and releasing a GPU context, memory usage keeps increasing.

You can see this result in the memory profiling report below (blue line).

[image: memory profiling report]

Here are the commands to reproduce the issue (they must be run on a machine with an NVIDIA GPU):

git clone https://github.com/fangq/mmc.git
cd mmc/src
sed -i -e 's/mmc_init_from_cmd/for(int i=0;i<5;i++){\nmmc_init_from_cmd/g' mmc.c
sed -i -e 's/return/getchar();}\nreturn/g' mmc.c
make clean
make all
cd ../examples/validation
../../src/bin/mmc -f cube2.inp -G 1 -s cube2 -n 1e4 -b 0 -D TP -M G -F bin

Run ../../src/bin/mmc -L to list the available GPUs, and use -G # to specify which GPU to use.

As you will see, the simulation repeats 5 times, pausing after each run until you press Enter. You can start a memory monitor, such as the top command on Linux, and watch the allocated memory increase after each repetition.
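As an alternative to watching top by hand, the resident memory of the process can be logged from a second terminal. This is a minimal sketch, assuming Linux with /proc mounted; `rss_kb` and `watch_rss` are hypothetical helper names, not part of mmc, and the sketch only tracks host-side resident memory (the same figure top reports as RES).

```shell
#!/bin/sh
# Hypothetical helpers (not part of mmc); assumes Linux with /proc mounted.

# Print the resident set size (VmRSS, in KiB) of the process with the given PID.
rss_kb() {
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

# Print a timestamped RSS sample at a fixed interval until the process exits.
# Arguments: PID, optional interval in seconds (default 1).
watch_rss() {
    pid=$1
    interval=${2:-1}
    while [ -d "/proc/$pid" ]; do
        printf '%s %s KiB\n' "$(date +%T)" "$(rss_kb "$pid")"
        sleep "$interval"
    done
}

# Example, run from a second terminal while the patched mmc is waiting for Enter:
#   watch_rss "$(pgrep -n mmc)"
```

If the leak is present, the logged value should step up after each of the 5 repetitions instead of returning to its earlier level.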

I googled and found multiple previous reports of OpenCL memory leaks, but I did not find a solution. I would like to know if there is any trick to force the NVIDIA OpenCL driver to clean up memory after each run. I am asking because mmc has a MATLAB/Octave mex function that can be called multiple times within one session, and this issue could lead to large memory usage after many calls.
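For comparison, the sketch below runs each repetition as a separate process, so that the operating system reclaims all of the process's memory when it exits. It is only a baseline under that assumption, not a fix: it uses the unpatched binary, `run_isolated` is a hypothetical helper name, and the approach does not carry over directly to the mex function, which runs inside the long-lived MATLAB/Octave process.

```shell
#!/bin/sh
# Hypothetical helper (not part of mmc).
# Run a command N times, each time as a fresh process, stopping at the first
# failure. Arguments: repetition count, then the command and its arguments.
run_isolated() {
    n=$1
    shift
    i=1
    while [ "$i" -le "$n" ]; do
        "$@" || return 1
        i=$((i + 1))
    done
}

# Example, from mmc/examples/validation with the unpatched binary:
#   run_isolated 5 ../../src/bin/mmc -f cube2.inp -G 1 -s cube2 -n 1e4 -b 0 -D TP -M G -F bin
```

Because every run starts from a new process, memory usage between runs should not accumulate in this setup, which is the behavior I would like to get from repeated calls within a single process.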

talonmies
FangQ
  • I know you are looking for a solution, but have you tried to work around the issue by using a `libOpenCL.so` from a source other than CUDA, such as the `ocl-icd-libopencl1:amd64` package on Ubuntu? Also, have you tried [POCL](https://launchpad.net/ubuntu/+source/pocl)? It should support NVIDIA GPUs too. Working around the issue might be your only bet... – doqtor Apr 12 '20 at 10:22

0 Answers