Questions tagged [nvidia]

For programming questions specifically related to Nvidia hardware. N.B. Questions about system configuration are usually off-topic here!

Nvidia is an American global technology company based in Santa Clara, California, best known for its graphics processing units (GPUs).

More about Nvidia: http://en.wikipedia.org/wiki/Nvidia
Nvidia website: http://www.nvidia.com/content/global/global.php

3668 questions
14 votes • 3 answers

How can I use the GPU to accelerate ffmpeg filter processing?

According to NVIDIA's developer website, you can use the GPU to speed up the rendering of ffmpeg filters: Create high-performance end-to-end hardware-accelerated video processing, 1:N encoding and 1:N transcoding pipeline using built-in filters in…
Zedd W • 165 • 1 • 1 • 9

14 votes • 2 answers

What do G and C types mean in nvidia-smi?

I have an open issue because I thought that my CUDA code wasn't running on my GPU (here). I thought that because I get a C in the Type field of my process when I use nvidia-smi, but I see that my GPU-Util grows when I run my code, so now I don't know…
ipgvl • 161 • 1 • 1 • 5

14 votes • 2 answers

Which NVIDIA cuDNN release type for TensorFlow on Ubuntu 16.04

According to the TensorFlow 1.5 installation instructions for Ubuntu 16.04, you need to install cuDNN 7.0, but they don't mention exactly what should be installed: cuDNN v7.0. For details, see NVIDIA's documentation. Ensure that you create the…
traveh • 2,700 • 3 • 27 • 44

14 votes • 2 answers

How to interrupt or cancel a CUDA kernel from host code

I am working with CUDA and I am trying to stop my kernel's work (i.e. terminate all running threads) once a certain if block is hit. How can I do that? I am really stuck here.
14 votes • 4 answers

Where is the ./configure of TensorFlow and how do I enable GPU support?

When installing TensorFlow on my Ubuntu machine, I would like to use the GPU with CUDA. But I am stopped at this step in the official tutorial: Where exactly is this ./configure? Or where is the root of my source tree? My TensorFlow is located here…
fluency03 • 2,637 • 7 • 32 • 62

14 votes • 4 answers

Creating arrays in an Nvidia CUDA kernel

Hi, I just wanted to know whether it is possible to do the following inside an Nvidia CUDA kernel: __global__ void compute(long *c1, long size, ...) { ... long d[1000]; ... } or the following: __global__ void compute(long *c1, long size,…
kl. • 361 • 3 • 7 • 15
14 votes • 2 answers

Compile OpenCL with MinGW and the Nvidia SDK

Is it possible to compile OpenCL using MinGW and the Nvidia SDK? I'm aware that it's not officially supported, but that just doesn't make sense. Aren't the libraries provided as statically linked libraries? I mean, once compiled with whatever compiler…
omarzouk • 933 • 10 • 23

14 votes • 1 answer

How does CUDA constant memory allocation work?

I'd like to get some insight into how constant memory is allocated (using CUDA 4.2). I know that the total available constant memory is 64 KB. But when is this memory actually allocated on the device? Does this limit apply to each kernel, CUDA context…
hthms • 853 • 1 • 10 • 25

13 votes • 2 answers

CUDA out of memory when there is plenty available

I'm having trouble using PyTorch with CUDA. Sometimes it works fine; other times it tells me RuntimeError: CUDA out of memory. However, I am confused, because nvidia-smi shows that my card's used memory is 563 MiB / 6144 MiB, which…
Jeff Chen • 600 • 5 • 13

13 votes • 4 answers

Is it possible to run Java3D applications on Nvidia 3D Vision hardware?

Is it possible to run a Java3D application on Nvidia 3D Vision hardware? I've got an existing Java3D application that can run in stereoscopic 3D. In the past, I've always run the application on Quadro cards using the OpenGL renderer and quad…
JohnnyO • 3,018 • 18 • 30

13 votes • 2 answers

Pytorch CUDA error: no kernel image is available for execution on the device on RTX 3090 with cuda 11.1

If I run the following: import torch import sys print('A', sys.version) print('B', torch.__version__) print('C', torch.cuda.is_available()) print('D', torch.backends.cudnn.enabled) device = torch.device('cuda') print('E',…
Benjamin Crouzier • 40,265 • 44 • 171 • 236

13 votes • 2 answers

AWS EC2 instance losing GPU support after reboot

Rebooting an instance on Tuesday, I first ran into the problem of losing GPU support on an AWS p2.xlarge machine with the Ubuntu Deep Learning AMI. I have tested it three times over two days, and a colleague had the same problem, so I guess it is an AWS…
Simon • 398 • 2 • 11

13 votes • 1 answer

OpenMP 4.0 in GCC: offload to nVidia GPU

TL;DR - Does GCC (trunk) already support OpenMP 4.0 offloading to nVidia GPU? If so, what am I doing wrong? (description below). I'm running Ubuntu 14.04.2 LTS. I have checked out the most recent GCC trunk (dated 25 Mar 2015). I have installed the…
Marc Andreson • 3,405 • 5 • 35 • 51

13 votes • 1 answer

Why do we need cudaDeviceSynchronize() in kernels with device-side printf?

__global__ void helloCUDA(float f) { printf("Hello thread %d, f=%f\n", threadIdx.x, f); } int main() { helloCUDA<<<1, 5>>>(1.2345f); cudaDeviceSynchronize(); return 0; } Why is cudaDeviceSynchronize() used in so many places, for example…
gpuguy • 4,607 • 17 • 67 • 125
13 votes • 6 answers

How do I use atomicMax on floating-point values in CUDA?

I have used atomicMax() to find the maximum value in a CUDA kernel: __global__ void global_max(float* values, float* gl_max) { int i = threadIdx.x + blockDim.x * blockIdx.x; float val = values[i]; atomicMax(gl_max, val); } It is throwing…
Alvin • 940 • 2 • 13 • 27