
I'm trying to build a machine learning model in Python 3, but when I run my code I get the following error with CUDA 10.0/cuDNN 7.5.0. Can someone help me with this?

GPU: RTX 2080

I'm using Keras 2.2.4 and tf-nightly-gpu (1.14.1.dev20190510).

Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

Full error: tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

Here is my code:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(50, 50, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Flatten())
# sigmoid (not softmax) for a single-unit binary output
model.add(Dense(1, activation='sigmoid'))

model.summary()

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
# x, y and n_batch are defined earlier in my script
model.fit(x, y, epochs=1, batch_size=n_batch)

I also get this OOM error:

OOM when allocating tensor with shape[24946,32,48,48] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

007fred

3 Answers


Using TensorFlow 2.0, CUDA 10.0 and cuDNN 7.5, the following worked for me:

# Allocate GPU memory on demand instead of reserving it all at startup
gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(gpus[0], True)

Some other answers (such as venergiac's below) use outdated TensorFlow 1.x syntax. If you are using a recent TensorFlow, you'll need the code given here.

If you get the following error:

Physical devices cannot be modified after being initialized

then the problem will be resolved by putting the gpus = tf.config ... lines directly below where you import tensorflow, i.e.

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(gpus[0], True)
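
If you have more than one GPU, a slight variant along the same lines (a sketch, not part of the original answer) enables growth on every detected device instead of only the first:

import tensorflow as tf

# Enable on-demand memory allocation on every GPU, not just gpus[0]
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

Either way, set_memory_growth must run before the GPUs are initialized, which is why it belongs right after the import.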
Roko Mijic

There are two possible solutions.

Problem allocating GPU memory

Add the following code (TensorFlow 1.x syntax):

import tensorflow as tf

# TF 1.x: let the process take at most 50% of GPU memory, growing on demand
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
config = tf.ConfigProto(gpu_options=gpu_options)
config.gpu_options.allow_growth = True
session = tf.Session(config=config)
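
If you are on TensorFlow 2.x but still want this answer's approach, the same options are available through the compatibility module (a sketch under that assumption):

import tensorflow as tf

# TF 2.x equivalent of the snippet above, via tf.compat.v1
gpu_options = tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=0.5)
config = tf.compat.v1.ConfigProto(gpu_options=gpu_options)
config.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=config)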

See also this related issue.

Problem with your NVIDIA driver

As posted there, you need to upgrade your NVIDIA driver (the ODE branch).

Please check the NVIDIA documentation for the required driver version.
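
To see which CUDA and cuDNN versions your TensorFlow build was compiled against (and compare them with the driver that nvidia-smi reports), a quick check is the sketch below. Note that tf.sysconfig.get_build_info() only exists in newer TensorFlow 2.x releases:

import tensorflow as tf

# Only available in newer TF 2.x; keys may be absent on CPU-only builds
build = tf.sysconfig.get_build_info()
print(build.get('cuda_version'))   # CUDA version TF was compiled against
print(build.get('cudnn_version'))  # cuDNN version TF was compiled against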

venergiac
  • Hey, I got this error (OP_REQUIRES failed at conv_ops.cc:484 : Resource exhausted: OOM when allocating tensor with shape[24946,32,48,48] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc) after I added your code. – 007fred May 10 '19 at 14:25
  • Did you test with a smaller NN? – venergiac May 10 '19 at 15:48
  • Sorry, what do you mean by NN? – 007fred May 10 '19 at 16:13
  • My RTX 2080 works with LSTM, but it does not work with Conv2D. – 007fred May 10 '19 at 16:31
  • OK, that is similar to my case. I updated the driver to the latest version, "430". – venergiac May 12 '19 at 07:43
  • I'm on Ubuntu Server 16.04, not Windows. – 007fred May 12 '19 at 08:10
  • Please add this info to the issue. Did you try updating the driver to the latest version? https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html – venergiac May 12 '19 at 09:05
  • I'm on CUDA 10.0; I have read that TensorFlow will not work with CUDA 10.1. My driver version is 418.39 on Ubuntu. – 007fred May 12 '19 at 09:12
  • Me too! I'm on CUDA 10.0 with the 430 driver. On Linux, >= 410.48 is required for CUDA, but for cuDNN I do not know. Please report the output of nvidia-smi. – venergiac May 12 '19 at 09:16
  • Please refer to "One of the following supported CUDA versions and NVIDIA graphics driver: NVIDIA graphics driver R410 or newer for CUDA 10.0" (https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html), so it seems that you are OK. And it works on LSTM but not on Conv2D, same as my case. – venergiac May 12 '19 at 09:18

Roko's answer should work if you're using TensorFlow 2.0.

If you want to set an exact memory limit (e.g. 1024 MB or 2 GB), there's another way to restrict your GPU memory usage.

Use this code:

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  try:
    # Expose the first GPU as a single virtual device capped at 1024 MB
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
  except RuntimeError as e:
    # Virtual devices must be configured before the GPUs are initialized
    print(e)

This code limits the first GPU's memory usage to 1024 MB. Just change the index into gpus and the memory_limit value as you want; see the sketch below.
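
For example, a sketch (assuming a second GPU exists; the 2048 MB figure is arbitrary) that caps GPU 1 at 2 GB instead:

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if len(gpus) > 1:
  try:
    # Cap the second GPU (index 1) at 2 GB
    tf.config.experimental.set_virtual_device_configuration(
        gpus[1],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=2048)])
  except RuntimeError as e:
    print(e)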

starriet