Questions tagged [gpgpu]

GPGPU stands for "General-Purpose computing on Graphics Processing Units", the practice of using a GPU for computation beyond graphics rendering. The two biggest manufacturers of GPUs are NVIDIA and AMD, although Intel has recently been moving in this direction with the integrated graphics in its Haswell processors. There are two popular frameworks for GPGPU: NVIDIA's CUDA, which is supported only on NVIDIA hardware, and OpenCL, developed by the Khronos Group. Khronos is a consortium whose members include AMD, NVIDIA, Intel, Apple and others, but NVIDIA supports the OpenCL standard only half-heartedly, so the rivalry among GPU manufacturers is partly mirrored in a rivalry between programming frameworks.

The appeal of using GPUs for non-graphics tasks stems largely from the parallel processing capabilities of modern graphics cards. Some cards can run thousands of threads at once, each processing similar data, at very high throughput.

In the past, CPUs emulated multithreading by interleaving processing tasks on a single core. Over time, CPUs gained multiple cores, each able to run multiple threads. Modern video cards carry GPUs that host far more threads (or streams) than most CPUs, tightly integrated with extremely fast memory. This large increase in concurrently executing threads is made possible by SIMD (Single Instruction, Multiple Data), a technique in which one instruction is applied to many data elements at once. The result is an environment well suited to heavy computational loads that can be parallelized. SIMD also marks one of the main differences between GPUs and CPUs: each does best what it was designed for.
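
The SIMD idea can be sketched on the CPU in plain Python. This is only an emulation under stated assumptions, not GPU code: the names `saxpy_kernel` and `launch` are hypothetical, and the loop stands in for hardware lanes that a real GPU would run in parallel.

```python
# Hypothetical CPU-side emulation of the SIMD model; not real GPU code.

def saxpy_kernel(i, a, x, y, out):
    """The single 'instruction stream': every lane runs this same body,
    each on its own element selected by the lane index i."""
    out[i] = a * x[i] + y[i]


def launch(kernel, n, *args):
    """Emulate launching n lanes. A GPU would execute the lanes in parallel;
    this loop runs them one after another, which gives the same result
    here because no lane reads another lane's output."""
    for i in range(n):
        kernel(i, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

launch(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0, 48.0]
```

The point of the sketch is the shape of the program: the work is written once, per element, and independence between elements is what lets the hardware spread it across many lanes.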

More information at http://en.wikipedia.org/wiki/GPGPU

2243 questions

3 answers

CUDA: How many concurrent threads in total?

I have a GeForce GTX 580, and I want to make a statement about the total number of threads that can (ideally) actually be run in parallel, to compare with 2 or 4 multi-core CPUs. deviceQuery gives me the following possibly relevant information:…

cuda gpgpu

asked Jun 27 '11 at 08:58

Eskil

2 answers

Running more than one CUDA applications on one GPU

The CUDA documentation does not specify how many CUDA processes can share one GPU. For example, if I launch more than one CUDA program as the same user with only one GPU card installed in the system, what is the effect? Will it guarantee the correctness of…

cuda gpu gpgpu nvidia

asked Jul 27 '15 at 00:55

cache

3 answers

CUDA model - what is warp size?

What's the relationship between maximum work group size and warp size? Let’s say my device has 240 CUDA streaming processors (SP) and returns the following information - CL_DEVICE_MAX_COMPUTE_UNITS: 30 CL_DEVICE_MAX_WORK_ITEM_SIZES: 512 / 512 /…

cuda gpgpu

asked Aug 31 '10 at 06:52

r00kie

4 answers

CUDA Driver API vs. CUDA runtime

When writing CUDA applications, you can either work at the driver level or at the runtime level as illustrated on this image (The libraries are CUFFT and CUBLAS for advanced math): (source: tomshw.it) I assume the tradeoff between the two are…

c# c++ cuda gpgpu cuda.net

asked Oct 28 '08 at 11:03

Morten Christiansen

1 answer

Choosing between GeForce or Quadro GPUs to do machine learning via TensorFlow

Is there any noticeable difference in TensorFlow performance if using Quadro GPUs vs GeForce GPUs? e.g. does it use double precision operations or something else that would cause a drop in GeForce cards? I am about to buy a GPU for TensorFlow, and…

machine-learning gpu gpgpu tensorflow

asked Jan 11 '16 at 05:57

user2771184

4 answers

How does CUDA assign device IDs to GPUs?

When a computer has multiple CUDA-capable GPUs, each GPU is assigned a device ID. By default, CUDA kernels execute on device ID 0. You can use cudaSetDevice(int device) to select a different device. Let's say I have two GPUs in my machine: a GTX 480…

cuda gpu gpgpu nvidia

asked Dec 08 '12 at 20:42

solvingPuzzles

3 answers

GPGPU vs. Multicore?

What are the key practical differences between GPGPU and regular multicore/multithreaded CPU programming, from the programmer's perspective? Specifically: What types of problems are better suited to regular multicore and what types are better…

multithreading performance multicore gpgpu parallel-processing

asked May 07 '11 at 04:45

dsimcha

2 answers

Should I unify two similar kernels with an 'if' statement, risking performance loss?

I have 2 very similar kernel functions, in the sense that the code is nearly the same, but with a slight difference. Currently I have 2 options: (1) write 2 different methods (but very similar ones); (2) write a single kernel and put the code blocks…

c++ c optimization cuda gpgpu

asked May 30 '11 at 17:45

lina

4 answers

What is the current status of C++ AMP

I am working on high performance code in C++ and have been using both CUDA and OpenCL and more recently C++AMP, which I like very much. I am however a little worried that it is not being developed and extended and will die out. What leads me to this…

c++ c++11 gpgpu c++-amp

asked Jan 23 '16 at 21:48

JoeTaicoon

2 answers

OpenCL vs OpenMP performance

Have there been any studies comparing OpenCL to OpenMP performance? Specifically I am interested in the overhead cost of launching threads with OpenCL, e.g., if one were to decompose the domain into a very large number of individual work items (each…

opencl gpgpu

asked Aug 31 '11 at 20:46

Robert

8 answers

How to use OpenCL on Android?

For platform independence (desktop, cloud, mobile, ...) it would be great to use OpenCL for GPGPU development when speed does matter. I know Google pushes RenderScript as an alternative, but it seems to only be available for Android and is…

android opengl-es opencl gpgpu renderscript

asked Jan 25 '12 at 15:33

Rodja

8 answers

CUDA apps time out & fail after several seconds - how to work around this?

I've noticed that CUDA applications tend to have a rough maximum run-time of 5-15 seconds before they will fail and exit out. I realize it's ideal to not have CUDA application run that long but assuming that it is the correct choice to use CUDA and…

cuda timeout gpgpu gpu

asked Jan 30 '09 at 23:29

rck

4 answers

Python real time image classification problems with Neural Networks

I'm attempting to use caffe and python to do real-time image classification. I'm using OpenCV to stream from my webcam in one process, and in a separate process, using caffe to perform image classification on the frames pulled from the webcam. Then I'm…

python multiprocessing deep-learning caffe gpgpu

asked Sep 16 '16 at 01:46

user3543300

7 answers

How to obtain OpenCL SDK?

I was perusing the http://www.khronos.org/ web site and only found headers for OpenCL (not OpenGL, which I don't care about). How can I obtain the OpenCL SDK?

sdk gpu gpgpu opencl

asked Jul 27 '09 at 21:27

Roman Kagan

1 answer

Integer calculations on GPU

For my work it's particularly interesting to do integer calculations, which obviously are not what GPUs were made for. My question is: Do modern GPUs support efficient integer operations? I realize this should be easy to figure out for myself, but I…

performance optimization integer gpgpu

asked Dec 06 '10 at 09:25

gspr
