I am implementing some deep learning algorithms using Theano. After I stop some programs that use Theano, the following error occasionally appears when I try to import theano again.
>>> import theano
ERROR (theano.sandbox.cuda): ERROR: Not using GPU. Initialisation of device gpu failed:
initCnmem: cnmemInit call failed! Reason=CNMEM_STATUS_OUT_OF_MEMORY. numdev=1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/jjhu/.local/lib/python2.7/site-packages/theano/__init__.py", line 118, in <module>
theano.sandbox.cuda.tests.test_driver.test_nvidia_driver1()
File "/home/jjhu/.local/lib/python2.7/site-packages/theano/sandbox/cuda/tests/test_driver.py", line 40, in test_nvidia_driver1
if not numpy.allclose(f(), a.sum()):
File "/home/jjhu/.local/lib/python2.7/site-packages/theano/compile/function_module.py", line 875, in __call__
storage_map=getattr(self.fn, 'storage_map', None))
File "/home/jjhu/.local/lib/python2.7/site-packages/theano/gof/link.py", line 317, in raise_with_op
reraise(exc_type, exc_value, exc_trace)
File "/home/jjhu/.local/lib/python2.7/site-packages/theano/compile/function_module.py", line 862, in __call__
self.fn() if output_subset is None else\
RuntimeError: Cuda error: kernel_reduce_ccontig_node_4894639462a290346189bb38dab7bb7e_0: out of memory. (grid: 1 x 1; block: 256 x 1 x 1)
Apply node that caused the error: GpuCAReduce{add}{1}(<CudaNdarrayType(float32, vector)>)
Toposort index: 0
Inputs types: [CudaNdarrayType(float32, vector)]
Inputs shapes: [(10000,)]
Inputs strides: [(1,)]
Inputs values: ['not shown']
Outputs clients: [[HostFromGpu(GpuCAReduce{add}{1}.0)]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
I searched for solutions. One suggestion was to remove the compilation folder with rm -rf ~/.theano. I also checked that ~/.theano is not owned by the root user, and I tried setting my ~/.theanorc as follows. Neither of these worked for me.
[global]
floatX = float32
device = cpu
optimizer=fast_run
[lib]
cnmem = 0.1
[cuda]
root = /usr/local/cuda
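The ownership check mentioned above could be done along these lines. This is only a minimal sketch: it assumes the compilation folder is at ~/.theano (Theano's usual default), and describe_owner is a hypothetical helper name, not part of Theano.

```python
import os
import pwd


def describe_owner(path):
    """Return (owner_name, is_root, is_current_user) for an existing path."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # The uid has no entry in the password database; fall back to the number.
        owner = str(st.st_uid)
    return owner, st.st_uid == 0, st.st_uid == os.getuid()


if __name__ == "__main__":
    # Assumed default location of the compilation folder.
    compile_dir = os.path.expanduser("~/.theano")
    if os.path.isdir(compile_dir):
        owner, is_root, is_me = describe_owner(compile_dir)
        print("%s is owned by %s (root: %s, current user: %s)"
              % (compile_dir, owner, is_root, is_me))
    else:
        print("%s does not exist" % compile_dir)
```

If the folder turns out to be owned by root (for example after running a script with sudo), a normal user may not be able to write to it, which is why this check is often suggested.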
The only thing that works is rebooting the machine or logging out, which is very inconvenient. I don't know what causes this problem. Can anyone suggest a solution?