Theano import occasionally fails after a program is interrupted

Question

I am implementing some deep learning algorithms with Theano. After I stop a program that uses Theano, the following error occasionally appears when I try to import Theano again:

    >>> import theano
    ERROR (theano.sandbox.cuda): ERROR: Not using GPU. Initialisation of device gpu failed:
    initCnmem: cnmemInit call failed! Reason=CNMEM_STATUS_OUT_OF_MEMORY. numdev=1

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/jjhu/.local/lib/python2.7/site-packages/theano/__init__.py", line 118, in <module>
        theano.sandbox.cuda.tests.test_driver.test_nvidia_driver1()
      File "/home/jjhu/.local/lib/python2.7/site-packages/theano/sandbox/cuda/tests/test_driver.py", line 40, in test_nvidia_driver1
        if not numpy.allclose(f(), a.sum()):
      File "/home/jjhu/.local/lib/python2.7/site-packages/theano/compile/function_module.py", line 875, in __call__
        storage_map=getattr(self.fn, 'storage_map', None))
      File "/home/jjhu/.local/lib/python2.7/site-packages/theano/gof/link.py", line 317, in raise_with_op
        reraise(exc_type, exc_value, exc_trace)
      File "/home/jjhu/.local/lib/python2.7/site-packages/theano/compile/function_module.py", line 862, in __call__
        self.fn() if output_subset is None else\
    RuntimeError: Cuda error: kernel_reduce_ccontig_node_4894639462a290346189bb38dab7bb7e_0: out of memory. (grid: 1 x 1; block: 256 x 1 x 1)

    Apply node that caused the error: GpuCAReduce{add}{1}(<CudaNdarrayType(float32, vector)>)
    Toposort index: 0
    Inputs types: [CudaNdarrayType(float32, vector)]
    Inputs shapes: [(10000,)]
    Inputs strides: [(1,)]
    Inputs values: ['not shown']
    Outputs clients: [[HostFromGpu(GpuCAReduce{add}{1}.0)]]

    HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
    HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

I have looked for solutions. One suggestion was to remove the compilation folder with `rm -rf ~/.theano`. I also checked that the owner of `~/.theano` is not root, and I tried the following settings in my `~/.theanorc`. None of these worked for me.

    [global]
    floatX = float32
    device = cpu
    optimizer=fast_run

    [lib]
    cnmem = 0.1

    [cuda]
    root = /usr/local/cuda
The only thing that works is to reboot the machine or log out, which is very inconvenient. I don't know what causes this problem. Can anyone suggest a solution?
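
One way to narrow this down is to override the configuration for a single interpreter session instead of editing the config file. The sketch below assumes Theano's documented behaviour that the `THEANO_FLAGS` environment variable takes precedence over `~/.theanorc` and is read when `theano` is imported. The helper `build_theano_flags` is hypothetical, not part of Theano.

```python
import os


def build_theano_flags(**flags):
    """Join keyword settings into a THEANO_FLAGS string.

    Section options use a double underscore in place of the dot,
    so lib__cnmem=0.1 becomes "lib.cnmem=0.1".
    Hypothetical helper for illustration; not part of Theano.
    """
    parts = []
    for name, value in sorted(flags.items()):
        parts.append("{0}={1}".format(name.replace("__", "."), value))
    return ",".join(parts)


# Force the CPU for this process only. This has to happen before
# theano is imported, because the flags are read at import time.
os.environ["THEANO_FLAGS"] = build_theano_flags(device="cpu", floatX="float32")

# import theano  # uncomment on a machine where Theano is installed
```

If the import succeeds with `device=cpu` but still fails with `device=gpu`, that would point at the state of the GPU memory rather than at the Theano installation or the compilation folder.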

Well, the error suggests that the GPU is out of memory. Are you sure that is not the case? Which GPU are you using? What is the output of `nvidia-smi` (if applicable)? Anyway, this error shouldn't happen with `device=cpu` in `~/.theanorc`. Does it happen when you start Python as `THEANO_FLAGS=device=cpu python`? – sygi, Nov 10 '16 at 12:11
Thanks for your reply. I have a GTX 1080 (8 GB) installed in my machine, and `device=gpu` in my `~/.theanorc`. I don't start Python with `THEANO_FLAGS=device=cpu`. This annoying problem happens even when I just import the Theano library, before I use the GPU for any computation. As I described above, it happens after I stop running programs that use Theano. I suspect some memory is not released. The GPU memory only returns to normal when I restart my machine. Do you know how to solve it? – Jun, Nov 12 '16 at 00:15
Have you taken a look at the `nvidia-smi` output? It will tell you how much memory is in use at a given moment. – sygi, Nov 12 '16 at 09:39
Sorry for the late response. Yes, I used `nvidia-smi` to check, but it shows that my GPU usage is only around 30%. – Jun, Nov 16 '16 at 17:05
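
The overall figure reported by `nvidia-smi` can be broken down per process, which helps to check whether an interrupted Python process is still holding GPU memory. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH and supports the `--query-compute-apps` option. The function names are hypothetical.

```python
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' CSV rows into dicts."""
    apps = []
    for line in csv_text.strip().splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        pid, name, used_mib = fields
        # used_mib is kept as a string because nvidia-smi may report
        # a non-numeric placeholder on some setups.
        apps.append({"pid": int(pid), "name": name, "used_mib": used_mib})
    return apps


def list_gpu_processes():
    """Ask nvidia-smi which processes currently hold GPU memory."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-compute-apps=pid,process_name,used_memory",
        "--format=csv,noheader,nounits",
    ])
    return parse_compute_apps(out.decode("utf-8"))
```

Calling `list_gpu_processes()` right after a failed import would list any process that still owns GPU memory. If a PID from an interrupted run shows up there, ending that process (for example with `kill`) may release the memory without a reboot. This is a guess at the cause and is not confirmed anywhere in the thread.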
