Why does MXNet's GPU version use more memory than the CPU version?

Question

I built a very simple network with MXNet (two fully connected layers of dimension 512).

By switching between ctx = mx.cpu() and ctx = mx.gpu(0), I ran the same code on both the CPU and the GPU.

The GPU version uses much more memory than the CPU version. (I checked this with 'top' rather than 'nvidia-smi', so I am looking at host RAM, not GPU memory.)

This seems strange: the GPU version already has memory on the GPU, so why does it still need more host memory?

(In the screenshot, line 1 is the CPU program and line 2 is the GPU program.)

score 0 · Answer 1 · answered Sep 01 '17 at 07:13


It may be due to a difference in how long each process had been running. In your screenshot, the CPU process shows 5:48.85 while the GPU process shows 9:11.20, so the GPU training had been running roughly 1.6 times as long, which could be the reason.

answered Sep 01 '17 at 07:13

Hagay Lupesko


Thomas · Answer 2 · 2018-03-07T01:21:47.717

When running on GPU, you load a number of lower-level libraries (CUDA, cuDNN, etc.), and these are loaded into host RAM first. If your network is very small, as in your case, the overhead of loading those libraries into RAM will be higher than the cost of storing the network weights in RAM.

For any larger network, the weights held in RAM when running on CPU will take significantly more memory than the libraries loaded for the GPU version.
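To put rough numbers on this, here is a back-of-the-envelope estimate of the weight memory of a fully connected network. It is a sketch under assumptions: the question does not state the input width, so 784 is a made-up value, the "larger" network is purely hypothetical, and parameters are assumed to be float32 (4 bytes each). Gradients and optimizer state are ignored.

```python
def dense_param_count(in_units, out_units):
    """Parameters of one fully connected layer: weight matrix plus bias."""
    return in_units * out_units + out_units


def mlp_weight_bytes(layer_sizes, bytes_per_param=4):
    """Bytes needed for the weights of an MLP, given its layer widths.

    layer_sizes lists the input width followed by each layer's output
    width; bytes_per_param=4 assumes float32 parameters.
    """
    pairs = zip(layer_sizes, layer_sizes[1:])
    total_params = sum(dense_param_count(a, b) for a, b in pairs)
    return total_params * bytes_per_param


MIB = 1024 * 1024

# Two 512-unit layers; the input width of 784 is an assumption.
small = mlp_weight_bytes([784, 512, 512])

# Hypothetical larger network: four 4096-unit layers on a 4096-wide input.
large = mlp_weight_bytes([4096, 4096, 4096, 4096, 4096])

print("small network: %.2f MiB" % (small / MIB))
print("large network: %.2f MiB" % (large / MIB))
```

Under these assumptions the small network needs only about 2.5 MiB for its weights, while the larger one needs about 256 MiB, which is the scale at which weight storage starts to dominate.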
