
I get the message in the subject when I try to run a program I developed with OpenACC through Nvidia's nvprof profiler like this:

nvprof ./SFS 4

If I run nvprof with -o [output_file] the warning message doesn't appear, but the output file is not created. What could be wrong here?

LD_LIBRARY_PATH is set in my .bashrc to /opt/nvidia/hpc_sdk/Linux_x86_64/20.7/cuda/11.0/lib64/ because that is where I found these files (they have "cupti" and "inj" in their names, so I assumed they are the ones needed):

lrwxrwxrwx 1 root root      19 Aug  4 05:27 libaccinj64.so -> libaccinj64.so.11.0
lrwxrwxrwx 1 root root      23 Aug  4 05:27 libaccinj64.so.11.0 -> libaccinj64.so.11.0.194
...
lrwxrwxrwx 1 root root      16 Aug  4 05:27 libcupti.so -> libcupti.so.11.0
lrwxrwxrwx 1 root root      20 Aug  4 05:27 libcupti.so.11.0 -> libcupti.so.2020.1.0
...
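A quick way to double-check which LD_LIBRARY_PATH entries actually contain these libraries is a small shell helper. This is only a sketch: the function name `find_in_ld_path` is made up for illustration, and it only looks for files whose names start with the given prefix.

```shell
# Print every LD_LIBRARY_PATH entry that holds a file starting with the given prefix.
# find_in_ld_path is a hypothetical helper name, not part of any NVIDIA tool.
find_in_ld_path() {
    prefix=$1
    old_ifs=$IFS
    IFS=:
    # The path list is split on ':' once, when the loop starts.
    for dir in $LD_LIBRARY_PATH; do
        IFS=$old_ifs
        [ -n "$dir" ] || continue
        for f in "$dir"/"$prefix"*; do
            # An unmatched glob stays literal, so -e filters it out.
            if [ -e "$f" ]; then
                echo "$dir"
                break
            fi
        done
    done
    IFS=$old_ifs
}
```

Calling `find_in_ld_path libcupti` and `find_in_ld_path libaccinj64` should each print the lib64 directory above if the variable is picked up by the current shell.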

I am on an Ubuntu 18.04 workstation with an Nvidia GeForce RTX 2070 and have CUDA version 11 installed.

nvidia-smi command gives me this:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.66       Driver Version: 450.66       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 2070    Off  | 00000000:02:00.0  On |                  N/A |
| 30%   40C    P2    58W / 185W |    693MiB /  7981MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

The compilers I have (nvidia and portland) are from the latest Nvidia HPC-SDK, version 20.7-0.

I compile my programs with the -acc -Minfo=accel options. I am not sure how I should set -ta=, or whether it is needed at all.

P.S. I am also not sure whether my code uses the GPU at all, with or without nvprof, although I did set ACC_DEVICE_TYPE to nvidia.
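One possible way to see whether kernels are launched on the GPU is to have the OpenACC runtime report them. The sketch below assumes the runtime honours the PGI_ACC_NOTIFY environment variable (and its NVCOMPILER_ACC_NOTIFY counterpart in the HPC SDK); please check the compiler documentation for your version. The wrapper name `run_with_acc_notify` is made up for illustration.

```shell
# Run any command with OpenACC runtime notifications switched on.
# run_with_acc_notify is a hypothetical helper name.
# ASSUMPTION: the runtime reads PGI_ACC_NOTIFY / NVCOMPILER_ACC_NOTIFY,
# and the value 1 asks it to report kernel launches.
run_with_acc_notify() {
    PGI_ACC_NOTIFY=1 NVCOMPILER_ACC_NOTIFY=1 "$@"
}

# Example (not executed here): run_with_acc_notify ./SFS 4
```

If the program runs and no launch messages appear, that would suggest the binary contains no accelerator code.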

Any advice would be very welcome.

Cheers

Bojan Niceno

1 Answer


Which nvprof are you using? The one that ships with NV HPC 20.7 or your own install?

This looks very similar to an issue reported yesterday on the NVIDIA DevTalk user forums:

https://forums.developer.nvidia.com/t/new-20-7-version-where-is-the-detail-release-bugfix/146168/4

Granted, this was for Nsight-systems, but it may be the same issue. It appears to be a problem with the 2020.3 version of the profilers, which is the version we ship with the NV HPC 20.7 SDK. As I note there, the Nsight-Systems 2020.4 release should have this fixed, so the workaround would be to download and install 2020.4 or to use a prior release.

https://developer.nvidia.com/nsight-systems

There does seem to be a temporary issue with the Nsight-systems download, which will hopefully be corrected before you see this note.

Also, nvprof is in the process of being deprecated, so you should consider moving to Nsight-systems and Nsight-compute.

https://developer.nvidia.com/blog/migrating-nvidia-nsight-tools-nvvp-nvprof/

Mat Colgrove
  • Thanks Mat. In the meantime I realized what the issue was. I was compiling and running a version of the code without OpenACC directives (I used the wrong branch), and that seems to have confused nvprof (I used the nvprof that ships with NV HPC 20.7). Once nvprof was running, I wanted to try the NV Visual Profiler. To my understanding it comes with the CUDA Development Toolkit. I installed version 11.1, but now my codes don't run any more. Compilation is still fine, as before, but when I run I get the message: `Failing in Thread:0 call to cuInit returned error 804: Other` – Bojan Niceno Sep 24 '20 at 16:28
  • Does the code run without the profiler? Per https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__TYPES.html, error 804 means "This error indicates that the system was upgraded to run with forward compatibility but the visible hardware detected by CUDA does not support this configuration.", which I interpret to mean that your device can't run a CUDA 11.0 built application on a CUDA 11.1 driver. Though I'm not sure about that, since it's outside my area of expertise. This document might be helpful: https://docs.nvidia.com/deploy/cuda-compatibility/index.html – Mat Colgrove Sep 24 '20 at 17:02
  • FYI, I just tested running and profiling some code on a system here using the CUDA 11.1 driver (455.18), and it worked fine for me. Though, it has a Tesla P100 in it and is headless, so I didn't run the Visual Profiler. – Mat Colgrove Sep 24 '20 at 17:18
  • Hi Mat. I resolved the issue from my first comment. After installing CUDA Dev. Toolkit 11.1, my Nvidia driver (version 450.66; CUDA Version: 11.0) somehow got corrupted. Anyhow, removing CUDA Dev. Toolkit (CUDA 11.1) and reinstalling the Nvidia driver (CUDA 11.0) resolved the problem from my first comment (`Failing in Thread:0 call to cuInit returned error 804`) – Bojan Niceno Sep 25 '20 at 06:44
  • And indeed, your compatibility table was quite helpful: my current driver (which is the latest available for Ubuntu) isn't compatible with CUDA 11.1. – Bojan Niceno Sep 25 '20 at 06:48
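
The driver check discussed in these comments can be sketched as a small version comparison in shell. The installed driver version comes from the nvidia-smi output in the question; the minimum driver for CUDA 11.1 used here (455.23) is an assumption that should be confirmed against the compatibility document linked above. The helper name `version_ge` is made up for illustration and relies on GNU `sort -V`.

```shell
# Succeed (exit 0) when version $1 is greater than or equal to version $2.
# version_ge is a hypothetical helper; it relies on GNU sort's -V ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

installed_driver=450.66   # from the nvidia-smi output in the question
required_driver=455.23    # ASSUMPTION: minimum Linux driver for CUDA 11.1

if version_ge "$installed_driver" "$required_driver"; then
    echo "driver $installed_driver is new enough"
else
    echo "driver $installed_driver is older than $required_driver"
fi
```

On a live system the installed version could instead be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.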