CUDA/cuDNN error after upgrading nvidia drivers from 384.90 to 384.111

Question

This morning I updated the nvidia drivers on Mint 18.3 Sylvia (based on Ubuntu xenial 16.04) through the standard update procedure (Update Manager). Since then I get this error when running tensorflow 1.4.1:

2018-01-10 13:48:39.161422: E tensorflow/stream_executor/cuda/cuda_dnn.cc:385] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2018-01-10 13:48:39.161456: E tensorflow/stream_executor/cuda/cuda_dnn.cc:352] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
2018-01-10 13:48:39.161466: F tensorflow/core/kernels/conv_ops.cc:667] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms)

I'm using CUDA 8.0 and cuDNN 7.0.

Why does this happen, and how can I fix it?

How is this a programming question ("Help, `apt-get update` broke my linux!")? — talonmies, Jan 10 '18 at 15:19
Well, CUDA and tensorflow are usually used to program, not to read email. I got this exception while running a project I'm working on and I thought it was a good idea to share the solution here. There are several questions about cuda/cudnn installation problems around. — lorenzo, Jan 10 '18 at 17:41

lorenzo · Answer 1 · 2018-01-24T09:09:02.727

After investigating for a while, I noticed two broken symlinks in the /usr/lib/nvidia-384 folder that were still pointing to the 384.90 files.

So I updated the two links like this, running the commands from inside that folder (root privileges are probably needed there):

ln -sf libnvidia-ptxjitcompiler.so.384.111 libnvidia-ptxjitcompiler.so.1
ln -sf libnvidia-wfb.so.384.111 libnvidia-wfb.so.1

and now it works perfectly.
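To spot links like these without checking each one by hand, something along the following lines may help. It is only a sketch: it assumes GNU find (for -xtype and -printf), and list_dangling_links is a helper name made up for this example, not part of any nvidia or CUDA tooling.

```shell
# Hypothetical helper: list symlinks directly inside a directory whose
# targets no longer exist, together with the target each one points to.
# Assumes GNU find (-xtype and -printf are GNU extensions).
list_dangling_links() {
    find "$1" -maxdepth 1 -xtype l -printf '%p -> %l\n'
}
```

Called as list_dangling_links /usr/lib/nvidia-384, it should print one line per broken link, showing the link and the missing file it still refers to.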

BTW, I ran into a similar problem when upgrading from one major driver version to another (for example from 372 to 384) and forgetting to update LD_LIBRARY_PATH in my scripts.
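A quick way to catch that second case is to check whether every directory listed in LD_LIBRARY_PATH still exists. The snippet below is a minimal sketch; missing_ld_path_entries is a hypothetical helper name, and it only reports directories that are gone, not directories that exist but hold the wrong driver version.

```shell
# Hypothetical helper: print every entry of a colon-separated path list
# that does not exist as a directory. Empty entries are skipped.
missing_ld_path_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r dir; do
        [ -n "$dir" ] || continue
        [ -d "$dir" ] || printf 'missing: %s\n' "$dir"
    done
}
```

Running missing_ld_path_entries "$LD_LIBRARY_PATH" after a major driver upgrade would flag an entry such as an old /usr/lib/nvidia-372 directory if it has been removed.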
