Questions tagged [gpgpu]

GPGPU is an acronym for the field of computer science known as "General-Purpose computing on the Graphics Processing Unit (GPU)". The two biggest manufacturers of GPUs are NVIDIA and AMD, although Intel has also been moving in this direction with the integrated graphics in its Haswell-generation processors. There are two popular frameworks for GPGPU: NVIDIA's CUDA, which is supported only on NVIDIA's own hardware, and OpenCL, developed by the Khronos Group, a consortium that includes AMD, NVIDIA, Intel, Apple and others. The OpenCL standard is only half-heartedly supported by NVIDIA, so the rivalry among GPU manufacturers is partly mirrored in a rivalry between the programming frameworks.

The attractiveness of using GPUs for other tasks largely stems from the parallel processing capabilities of many modern graphics cards: some cards contain thousands of stream processors operating on similar data at very high rates.

In the past, CPUs emulated multithreading and multiple data streams by interleaving processing tasks on a single core. Over time, we gained multiple cores, each running multiple threads. Modern video cards integrate one or more GPUs with extremely fast memory and host far more concurrent threads or streams than most CPUs. This huge number of threads in flight is achieved through SIMD (Single Instruction, Multiple Data): the same instruction is applied to many data elements at once. This makes the GPU uniquely suited to heavy computational loads that can be parallelized, and it also marks one of the main differences between GPUs and CPUs: each does best what it was designed for.
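The SIMD idea above can be sketched with a tiny host-side example. Here numpy's vectorized arithmetic stands in for a GPU kernel; this illustrates the one-instruction-many-elements programming model, not actual GPU execution, and the function names are made up for the illustration:

```python
import numpy as np

def saxpy_scalar(a, x, y):
    # CPU-style scalar loop: one element processed per iteration.
    out = [0.0] * len(x)
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out

def saxpy_simd(a, x, y):
    # SIMD-style: a single expression applied to every element at once.
    # On a GPU, each element would map to one thread of the same kernel.
    return a * x + y

x = np.arange(4, dtype=np.float32)   # [0, 1, 2, 3]
y = np.ones(4, dtype=np.float32)     # [1, 1, 1, 1]
print(saxpy_simd(2.0, x, y))         # [1. 3. 5. 7.]
```

Both functions compute the same SAXPY result; the difference is that the second expresses the computation as one operation over all the data, which is exactly the shape of work a GPU executes efficiently.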

More information at http://en.wikipedia.org/wiki/GPGPU

2243 questions
8
votes
5 answers

Why do we need a GPU for deep learning?

As the question already suggests, I am new to deep learning. I know that the learning process of the model will be slow without a GPU. If I am willing to wait, will it be OK if I use a CPU only?
Kanu
  • 91
  • 6
8
votes
7 answers

OpenCL FFT lib for GPUs?

Is there any general FFT library available for running on the GPU using OpenCL? As far as I know, Apple's sample code for a power-of-two OpenCL FFT is the only such code available. Does any such library exist for non-power-of-two transform…
Neo
  • 157
  • 1
  • 4
8
votes
3 answers

Speedup GPU vs CPU for matrix operations

I am wondering how much GPU computing would help me speed up my simulations. The critical part of my code is matrix multiplication. Basically, the code looks like the following Python code, with matrices of order 1000 and long for loops. import numpy…
physicsGuy
  • 3,437
  • 3
  • 27
  • 35
8
votes
2 answers

Does the cuDNN library work with all NVIDIA graphics cards?

I am studying the use of the cuDNN library in my project, but my NVIDIA graphics card is a little bit old. I searched the net to find whether cuDNN works with all graphics cards, but didn't find an answer, even on their main page. Which NVIDIA graphics cards are compatible with…
ProEns08
  • 1,856
  • 2
  • 22
  • 38
8
votes
3 answers

Does GLSL utilize SLI? Does OpenCL? What is better, GLSL or OpenCL for multiple GPUs?

To what extent does OpenGL's GLSL utilize SLI setups? Is it utilized at all at the point of execution, or only for final rendering? Similarly, I know that OpenCL is alien to SLI, but assuming one has several GPUs, how does it compare to GLSL in…
j riv
  • 3,593
  • 6
  • 39
  • 54
8
votes
1 answer

Syntax for functions other than vertex|fragment|kernel in a Metal shader file

I'm porting some basic OpenCL code to a Metal compute shader and got stuck pretty early when attempting to convert the miscellaneous helper functions. For example, when including something like the following function in a .metal file, Xcode (7.1) gives me a…
Jaysen Marais
  • 3,956
  • 28
  • 44
8
votes
1 answer

How to make the most of SIMD in OpenCL?

In the optimization guide of Beignet, an open source implementation of OpenCL targeting Intel GPUs, it says: "Work group size should be larger than 16 and be multiple of 16. As two possible SIMD lanes on Gen are 8 or 16." To not waste SIMD lanes, we need to…
user3528438
  • 2,737
  • 2
  • 23
  • 42
8
votes
2 answers

How to list CUDA devices in Windows 7 using cmd?

How can I display a list of the available CUDA devices in Windows 7 using the command line? Do I need to install additional software to do this?
mrgloom
  • 20,061
  • 36
  • 171
  • 301
8
votes
1 answer

Differences between clBLAS and ViennaCL?

Looking at the OpenCL libraries out there, I am trying to get a complete grasp of each one. One library in particular is clBLAS. Its website states that it implements BLAS levels 1, 2, and 3. That is great, but ViennaCL also has BLAS…
cdeterman
  • 19,630
  • 7
  • 76
  • 100
8
votes
6 answers

What's the most trivial function that would benefit from being computed on a GPU?

I'm just starting out learning OpenCL. I'm trying to get a feel for what performance gains to expect when moving functions/algorithms to the GPU. The most basic kernel given in most tutorials is a kernel that takes two arrays of numbers and sums…
hanDerPeder
  • 397
  • 2
  • 12
8
votes
1 answer

Does NVidia support OpenCL SPIR?

I am wondering whether NVIDIA supports a SPIR backend. If yes, I couldn't find any documentation or sample example about it. If not, is there any way to run a SPIR backend on NVIDIA GPUs? Thanks in advance.
grypp
  • 405
  • 2
  • 15
8
votes
1 answer

Does NVIDIA RDMA GPUDirect always operate only on physical addresses (in the physical address space of the CPU)?

As we know: http://en.wikipedia.org/wiki/IOMMU#Advantages Peripheral memory paging can be supported by an IOMMU. A peripheral using the PCI-SIG PCIe Address Translation Services (ATS) Page Request Interface (PRI) extension can detect and signal…
Alex
  • 12,578
  • 15
  • 99
  • 195
8
votes
1 answer

CUDA: What is the threads per multiprocessor and threads per block distinction?

We have a workstation with two Nvidia Quadro FX 5800 cards installed. Running the deviceQuery CUDA sample reveals that the maximum threads per multiprocessor (SM) is 1024, while the maximum threads per block is 512. Given that only one block can be…
James Paul Turner
  • 791
  • 3
  • 8
  • 23
8
votes
1 answer

Performance issues: Single CPU core vs Single CUDA core

I wanted to compare the speed of a single Intel CPU core with the speed of a single NVIDIA GPU core (i.e. a single CUDA core, a single thread). I implemented the following naive 2D image convolution algorithm: void convolution_cpu(uint8_t* res,…
AstrOne
  • 3,569
  • 7
  • 32
  • 54
8
votes
3 answers

Efficient bucket-sort on GPU

For a current OpenCL GPGPU project, I need to sort elements in an array according to a key with 64 possible values. I need the final array to have all elements with the same key contiguous. It's sufficient to have an associative array…
leemes
  • 44,967
  • 21
  • 135
  • 183