When I run a script and use the TensorBoard profiler to monitor my training time, I get the following error:
2021-10-28 16:52:40.613220: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcupti.so.11.2'; dlerror: libcupti.so.11.2: cannot open shared object file: No such file or directory
My code is:
    tboard_callback = TensorBoard(log_dir=logs,
                                  histogram_freq=1,
                                  profile_batch='2,500')
However, everything else works fine and training starts. I can also load the profile into TensorBoard. The strange thing is that if I turn off the profiler, like this:
    tboard_callback = TensorBoard(log_dir=logs,
                                  histogram_freq=1,
                                  profile_batch=0)
there is no such error. I then tried making a symbolic link from
/usr/local/cuda-11.2/extras/CUPTI/lib64
to
/usr/lib
I then ran the profiler again, and it gave me this error:
2021-10-28 16:50:15.144301: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcupti.so.11.2'; dlerror: /lib/libcupti.so.11.2: invalid ELF header
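For reference, here is a minimal sketch of how one might check what the path in that second message actually resolves to, and whether the file there begins with the ELF magic bytes that every real shared object starts with. The helper name `describe_library` is made up for illustration and is not part of TensorFlow or CUDA. The check only reads the first four bytes, so it does not prove the library is loadable.

```python
import os

# Every valid ELF file (including shared objects) starts with these 4 bytes.
ELF_MAGIC = b"\x7fELF"


def describe_library(path):
    """Hypothetical helper: follow symlinks and classify what `path` points to.

    Returns (resolved_path, kind), where kind is one of
    'missing', 'directory', 'elf' or 'not-elf'.
    """
    resolved = os.path.realpath(path)  # follow any chain of symlinks
    if not os.path.exists(resolved):
        return resolved, "missing"
    if os.path.isdir(resolved):
        return resolved, "directory"
    with open(resolved, "rb") as f:
        magic = f.read(4)
    kind = "elf" if magic == ELF_MAGIC else "not-elf"
    return resolved, kind


if __name__ == "__main__":
    # Path taken from the dlerror message above.
    print(describe_library("/lib/libcupti.so.11.2"))
```

A result of 'directory' or 'not-elf' would suggest that the link points at something other than the library file itself.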
Can anyone explain why? I am very confused and would appreciate any help!