Questions tagged [gpu]

Acronym for "Graphics Processing Unit". For programming traditional graphical applications, see the tag entry for "graphics programming". For general-purpose programming using GPUs, see the tag entry for "gpgpu". For specific GPU programming technologies, see the popular tag entries for "opencl", "cuda" and "thrust".


More information on GPU at http://en.wikipedia.org/wiki/Graphics_processing_unit

8854 questions
44
votes
5 answers

Can/Should I run this code of a statistical application on a GPU?

I'm working on a statistical application containing approximately 10 to 30 million floating point values in an array. Several methods perform different but independent calculations on the array in nested loops, for example: Dictionary
Mike
  • 1,992
  • 4
  • 31
  • 42
43
votes
5 answers

How can I use GPU on Google Colab after exceeding usage limit?

I'm using Google Colab's free version to run my TensorFlow code. After about 12 hours, it gives an error message "You cannot currently connect to a GPU due to usage limits in Colab." I tried factory resetting the runtime to use the GPU again but…
Mert Ege
  • 551
  • 1
  • 4
  • 5
42
votes
1 answer

Differences between VexCL, Thrust, and Boost.Compute

With just a cursory understanding of these libraries, they look to be very similar. I know that VexCL and Boost.Compute use OpenCL as a backend (although the v1.0 release of VexCL also supports CUDA as a backend) and Thrust uses CUDA. Aside from the…
Sean Lynch
  • 2,852
  • 4
  • 32
  • 46
41
votes
2 answers

CPU SIMD vs GPU SIMD?

GPUs use the SIMD paradigm, that is, the same portion of code is executed in parallel and applied to various elements of a data set. However, CPUs also use SIMD and provide instruction-level parallelism. For example, as far as I know,…
Carmellose
  • 4,815
  • 10
  • 38
  • 56
41
votes
4 answers

How does CUDA assign device IDs to GPUs?

When a computer has multiple CUDA-capable GPUs, each GPU is assigned a device ID. By default, CUDA kernels execute on device ID 0. You can use cudaSetDevice(int device) to select a different device. Let's say I have two GPUs in my machine: a GTX 480…
solvingPuzzles
  • 8,541
  • 16
  • 69
  • 112
40
votes
2 answers

What is actually a Queue family in Vulkan?

I am currently learning Vulkan, and right now I am just taking apart each command and inspecting the structures to try to understand what they mean. At the moment I am analyzing queue families, for which I have the following…
Makogan
  • 8,208
  • 7
  • 44
  • 112
37
votes
5 answers

Setting up Visual Studio Intellisense for CUDA kernel calls

I've just started CUDA programming and it's going quite nicely, my GPUs are recognized and everything. I've partially set up Intellisense in Visual Studio using this extremely helpful guide here:…
sj755
  • 3,944
  • 14
  • 59
  • 79
37
votes
6 answers

How to use GPU for mathematics

I am looking at utilising the GPU for crunching some equations but cannot figure out how I can access it from C#. I know that the XNA and DirectX frameworks allow you to use shaders in order to access the GPU, but how would I go about accessing it…
Neil Knight
  • 47,437
  • 25
  • 129
  • 188
36
votes
3 answers

Cannot dlopen some GPU libraries. Skipping registering GPU devices

TensorFlow is only using the CPU and won't use the GPU. I assume it's because it expects CUDA 10.0 and it finds 10.2. I had installed 10.2 but have purged it and installed 10.0. I'm running Ubuntu 19.10, AMD Ryzen 2700 CPU, RTX 2080 S. I have…
dev
  • 651
  • 1
  • 6
  • 14
35
votes
1 answer

WKWebView crashes in acceleratedAnimationDidStart

I'm having a crash occur on a client's app and, other than a look at the WTFCrash, I'm not getting much use out of the stack trace. I am using a WKWebView instance to show a web page that has some CSS-based animations and a video. The issue occurs on…
TheGeoff
  • 3,860
  • 2
  • 23
  • 23
34
votes
10 answers

Fastest SVM implementation usable in Python

I'm building some predictive models in Python and have been using scikit-learn's SVM implementation. It's been really great, easy to use, and relatively fast. Unfortunately, I'm beginning to become constrained by my runtime. I run an RBF SVM on a…
tomas
  • 665
  • 1
  • 10
  • 14
34
votes
2 answers

When is CUDA's __shared__ memory useful?

Can someone please help me with a very simple example of how to use shared memory? The example included in the CUDA C Programming Guide seems cluttered by irrelevant details. For example, if I copy a large array to the device global memory and want…
Tudor
  • 61,523
  • 12
  • 102
  • 142
34
votes
4 answers

printf inside CUDA __global__ function

I am currently writing a matrix multiplication on a GPU and would like to debug my code, but since I cannot use printf inside a device function, is there something else I can do to see what is going on inside that function? This is my current…
Jose Vega
  • 10,128
  • 7
  • 40
  • 57
33
votes
4 answers

Can OpenMP be used for GPUs?

I've been searching the web but I'm still very confused about this topic. Can anyone explain this more clearly? I come from an Aerospace Engineering background (not from a Computer Science one), so when I read online about OpenMP/CUDA/etc. and…
André Almeida
  • 379
  • 1
  • 4
  • 11
32
votes
2 answers

How to use AMD GPU for fastai/pytorch?

I'm using a laptop which has Intel Corporation HD Graphics 5500 (rev 09) and an AMD Radeon R5 M255 graphics card. Does anyone know how to set it up for deep learning, specifically fastai/PyTorch?
Mohanned ElSayed
  • 321
  • 1
  • 3
  • 3