
I'm trying to profile my TensorFlow application. Training runs fine, but when I open the Profile tab in TensorBoard I get `Failed to load libcupti (is it installed and accessible?)`.

My configuration is:

  • Windows 10
  • Python 3.9.7
  • Tensorflow 2.6.0
  • CUDA Toolkit 11.2
  • cuDNN 8.1.1 (installed by copying the files as described here)
  • Visual Studio Professional 2019

CUDA_PATH is C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2

My Path-Variable contains:

  • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin
  • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\libnvvp
  • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\extras\CUPTI\lib64
  • C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\include
  • C:\Program Files\NVIDIA Corporation\Nsight Systems 2020.4.3\target-windows-x64
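To check which CUPTI DLLs these PATH entries actually expose, a small script along the following lines may help. This is only a sketch: `find_cupti_dlls` is a hypothetical helper name, and it assumes the libraries follow the `cupti64_*.dll` naming seen in this thread.

```python
import os
from pathlib import Path


def find_cupti_dlls(search_path=None):
    """Return all cupti64_*.dll files found in the directories of a PATH string.

    search_path defaults to the PATH environment variable of the current process.
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    found = []
    for entry in search_path.split(os.pathsep):
        directory = Path(entry)
        # Skip empty entries and entries that do not point to a directory.
        if not entry or not directory.is_dir():
            continue
        found.extend(sorted(directory.glob("cupti64_*.dll")))
    return found


# Usage: print every CUPTI DLL visible through PATH.
# for dll in find_cupti_dlls():
#     print(dll)
```

If the file name TensorFlow asks for in its log is not in the printed list, the profiler will not be able to load CUPTI from these directories.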

conda list (only relevant packages):

cudatoolkit               11.3.1               h59b6b97_2
cudnn                     8.2.1                cuda11.3_0
tensorboard               2.6.0                      py_1
tensorboard-data-server   0.6.0            py39haa95532_0
tensorboard-plugin-profile 2.5.0                    pypi_0    pypi
tensorboard-plugin-wit    1.6.0                      py_0
tensorflow                2.6.0           gpu_py39he88c5ba_0
tensorflow-base           2.6.0           gpu_py39hb3da07e_0
tensorflow-datasets       4.5.2                    pypi_0    pypi
tensorflow-estimator      2.6.0              pyh7b7c402_0
tensorflow-gpu            2.6.0                h17022bd_0
tensorflow-metadata       1.6.0                    pypi_0    pypi

I am surprised that Anaconda installed CUDA Toolkit 11.3 and cuDNN 8.2.1. According to the GPU configurations list, these should be 11.2 and 8.1. Could this be the problem?

Or does anyone have an idea how to solve this problem?

talonmies
Ozelot
  • See this similar issue https://stackoverflow.com/questions/56860180/tensorflow-cuda-cupti-error-cupti-could-not-be-loaded-or-symbol-could-not-be –  Feb 15 '22 at 01:41

1 Answer


Hidden in the log output of Jupyter I found this error message: `Could not load dynamic library 'cupti64_113.dll': dlerror: cupti64_113.dll not found`. (The log appears in the terminal where Jupyter is running.)

With this error message and that GitHub issue I was able to solve the problem: I copied cupti64_2020.3.0.dll in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\extras\CUPTI\lib64 and renamed the copy to cupti64_113.dll. The profiler now works.
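The copy-and-rename step can also be scripted. The following is a minimal sketch, not a definitive fix: `alias_cupti_dll` is a hypothetical helper, and the file names in the usage comment are the ones from this particular setup. On another installation, take the target name from the `Could not load dynamic library` message in your own log.

```python
import shutil
from pathlib import Path


def alias_cupti_dll(cupti_dir, source_name, target_name):
    """Copy source_name to target_name inside cupti_dir and return the target path.

    The original DLL is kept, and an existing target is never overwritten.
    """
    cupti_dir = Path(cupti_dir)
    source = cupti_dir / source_name
    target = cupti_dir / target_name
    if not source.is_file():
        raise FileNotFoundError(f"{source} does not exist")
    if target.exists():
        return target  # leave an existing DLL untouched
    shutil.copy2(source, target)
    return target


# Usage with the paths and names from this answer (assumed, adjust to your system):
# alias_cupti_dll(
#     r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\extras\CUPTI\lib64",
#     "cupti64_2020.3.0.dll",
#     "cupti64_113.dll",
# )
```

Writing under C:\Program Files usually requires running the script from an elevated (administrator) prompt.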

Hacker1337
Ozelot
  • I have the same problem. I would like to determine what the name of the .dll file should be for my installation. Where did you find the `log output of jupyter`? – Roland Feb 06 '23 at 02:20
  • Are you talking about the log console: `view->show log console`? I tried copying the file to cupti64_113.dll but it didn't fix the problem. – Roland Feb 06 '23 at 02:29