Questions tagged [nvidia]

For programming questions specifically related to Nvidia hardware. N.B. Questions about system configuration are usually off-topic here!

Nvidia is an American global technology company based in Santa Clara, California, best known for its graphics processors (GPUs).

More about Nvidia at http://en.wikipedia.org/wiki/Nvidia
Nvidia website at http://www.nvidia.com/content/global/global.php

3668 questions
19
votes
4 answers

Why aren't there bank conflicts in global memory for CUDA/OpenCL?

One thing I haven't figured out, and Google isn't helping, is why it is possible to have bank conflicts with shared memory but not with global memory. Can there be bank conflicts with registers? UPDATE: Wow, I really appreciate the two answers from…
smuggledPancakes
  • 9,881
  • 20
  • 74
  • 113
19
votes
1 answer

CUDA atomicAdd for doubles definition error

In previous versions of CUDA, atomicAdd was not implemented for doubles, so it is common to implement this like here. With the new CUDA 8 RC, I run into trouble when I try to compile my code, which includes such a function. I guess this is due to…
kalj
  • 1,432
  • 2
  • 13
  • 30
19
votes
3 answers

NVidia CUDA toolkit 7.5.27 failing to install on OS X

Downloading the CUDA toolkit DMG works, but the installer fails with a cryptic "package manifest parsing error" after selecting packages. Running the installer from the command line using the binary inside fails in a similar manner. The log…
rdadolf
  • 1,218
  • 11
  • 20
18
votes
3 answers

GPU shared memory size is very small - what can I do about it?

The size of the shared memory ("local memory" in OpenCL terms) is only 16 KiB on most of today's NVIDIA GPUs. I have an application in which I need to create an array that has 10,000 integers, so the amount of memory I will need to fit 10,000 integers…
rana
  • 181
  • 1
  • 1
  • 3
18
votes
1 answer

How do you measure peak memory bandwidth in OpenGL?

Just to get an idea of what kind of speeds I should be expecting, I have been trying to benchmark transfers between global memory and shaders, rather than relying on GPU spec sheets. However, I can't get close to the theoretical maximum. In fact I'm…
jozxyqk
  • 16,424
  • 12
  • 91
  • 180
17
votes
3 answers

running nvidia-docker on Windows 10 + WSL2

I saw several Q&As on this topic and tried both approaches. Any advice on how to proceed with either route is appreciated. Running nvidia-docker from within WSL2: I followed the NVIDIA docs and this tutorial. Everything installs and the docker command runs…
Dima Lituiev
  • 12,544
  • 10
  • 41
  • 58
17
votes
1 answer

Programmatically selecting integrated graphics in nVidia Optimus

There are many questions and answers about how to select the NVIDIA discrete adapter at runtime on Windows. The easiest way is to export an NvOptimusEnablement variable like this: extern "C" _declspec(dllexport) DWORD NvOptimusEnablement =…
Anton K
  • 4,658
  • 2
  • 47
  • 60
17
votes
1 answer

__forceinline__ effect at CUDA C __device__ functions

There is a lot of advice on when to use inline functions and when to avoid them in regular C coding. What is the effect of __forceinline__ on CUDA C __device__ functions? Where should they be used, and where should they be avoided?
Farzad
  • 3,288
  • 2
  • 29
  • 53
16
votes
2 answers

Unsupported gpu architecture compute_30 on a CUDA 5 capable gpu

I'm currently trying to compile Darknet on the latest CUDA toolkit which is version 11.1. I have a GPU capable of running CUDA version 5 which is a GeForce 940M. However, while rebuilding darknet using the latest CUDA toolkit, it said nvcc fatal …
3MP The Rook
  • 173
  • 1
  • 2
  • 5
16
votes
5 answers

Error: NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver

NVIDIA-SMI is throwing this error: "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running." I purged NVIDIA and installed it again following steps…
16
votes
2 answers

Why does my OpenCL kernel fail on the nVidia driver, but not Intel (possible driver bug)?

I originally wrote an OpenCL program to calculate very large Hermitian matrices, where the kernel calculates a single pair of entries in the matrix (the upper triangular portion, and its lower triangular complement). Very early on, I found a very…
stix
  • 1,140
  • 13
  • 36
16
votes
1 answer

Limiting register usage in CUDA: __launch_bounds__ vs maxrregcount

From the NVIDIA CUDA C Programming Guide: Register usage can be controlled using the maxrregcount compiler option or launch bounds as described in Launch Bounds. From my understanding (and correct me if I'm wrong), while -maxrregcount limits…
Kelsius
  • 433
  • 2
  • 5
  • 19
16
votes
3 answers

Fatal error: cuda.h: No such file or directory

I successfully installed CUDA 8.0 on my PC and I can see its files by running the following commands in my Ubuntu 16.10: $ sudo find / -name nvcc /usr/local/cuda-8.0/bin/nvcc $ sudo find / -name…
mad
  • 2,677
  • 8
  • 35
  • 78
16
votes
1 answer

Does nvidia-smi give instantaneous information or an average over the interval?

When I use nvidia-smi -l 60, for example, I wonder whether: the information given is a snapshot taken each time it runs, every 60 seconds; or the information given is an average over the interval between that time and that time +/- 60 seconds. Do you know the answer…
Vincent Rossignol
  • 215
  • 1
  • 2
  • 8
16
votes
5 answers

Could not insert 'nvidia_352': No such device

I am trying to run caffe on Linux Ubuntu. After installation, I run caffe in GPU mode and the error is: I0910 13:28:13.606891 10629 caffe.cpp:296] Use GPU with device ID 0 modprobe: ERROR: could not insert 'nvidia_352': No such device F0910…
batuman
  • 7,066
  • 26
  • 107
  • 229