
When I run a training script with the TensorBoard profiler enabled to monitor my training time, I get this error:

2021-10-28 16:52:40.613220: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcupti.so.11.2'; dlerror: libcupti.so.11.2: cannot open shared object file: No such file or directory

My code is:

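from tensorflow.keras.callbacks import TensorBoard

# Profile batches 2 through 500; `logs` is the log directory, defined elsewhere.
tboard_callback = TensorBoard(log_dir=logs,
                              histogram_freq=1,
                              profile_batch='2,500')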

However, everything else works: training starts, and I can load the profile into TensorBoard. What is strange is that if I turn off the profiler, like this:

tboard_callback = TensorBoard(log_dir=logs,
                              histogram_freq=1,
                              profile_batch=0)

there is no such error. I then tried creating a symbolic link from

/usr/local/cuda-11.2/extras/CUPTI/lib64

to

/usr/lib

When I ran with the profiler enabled again, I got a different error:

2021-10-28 16:50:15.144301: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcupti.so.11.2'; dlerror: /lib/libcupti.so.11.2: invalid ELF header
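An "invalid ELF header" message from the dynamic loader generally means the file it opened does not begin with the ELF magic bytes, so one way to narrow this down is to check what the path in the message actually resolves to. The snippet below is only a diagnostic sketch, not a fix: `describe_library` is a hypothetical helper name (it is not part of TensorFlow or CUDA), and the path is copied from the message above.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def describe_library(path):
    """Return (resolved_path, is_elf) for a candidate shared library.

    Hypothetical helper for illustration only.
    """
    resolved = os.path.realpath(path)  # follow any symbolic links
    if not os.path.isfile(resolved):
        # A directory or a dangling link cannot be loaded as a library.
        return resolved, False
    with open(resolved, "rb") as f:
        return resolved, f.read(4) == ELF_MAGIC


if __name__ == "__main__":
    # Path taken from the second error message above.
    print(describe_library("/lib/libcupti.so.11.2"))
```

If the second value is `False`, the link points at something other than a real shared object (for example a directory or a non-library file), which would be consistent with the loader's complaint.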

Can anyone explain why? I am very confused and would appreciate any help.

  • Hi! This seems to be a GPU-related issue. Could you try again after adding the CUDA bin path to your environment variables? References: https://github.com/tensorflow/tensorflow/issues/43193 https://stackoverflow.com/questions/65933271/could-not-load-dynamic-library-libcupti-so-11-0-dlerror-libcupti-so-11-0-ca –  Oct 29 '21 at 12:21
