Questions tagged [gpgpu]

GPGPU is an acronym for the field of computer science known as "General Purpose computing on the Graphics Processing Unit (GPU)"

The two biggest manufacturers of GPUs are NVIDIA and AMD, although Intel has recently been moving in this direction with its Haswell APUs. There are two popular frameworks for GPGPU: NVIDIA's CUDA, which is supported only on NVIDIA hardware, and OpenCL, developed by the Khronos Group, a consortium that includes AMD, NVIDIA, Intel, Apple and others. The OpenCL standard is only half-heartedly supported by NVIDIA, so the rivalry among GPU manufacturers is partly mirrored in a rivalry between the programming frameworks.

The attractiveness of using GPUs for other tasks largely stems from the parallel processing capabilities of modern graphics cards. Some cards contain thousands of stream processors that apply the same operations to different data elements at very high throughput.

In the past, CPUs emulated multiple threads and data streams by interleaving processing tasks on a single core. Over time, CPUs gained multiple cores, each capable of running several threads. Modern graphics cards integrate a large number of processing units together with very fast memory, and keep far more threads in flight than a typical CPU. This huge number of concurrently executing threads is made possible by the SIMD (Single Instruction, Multiple Data) execution model, which NVIDIA's variant calls SIMT (Single Instruction, Multiple Threads). This makes the GPU uniquely suited to heavy computational loads that can be parallelised, and the difference in execution model is one of the main distinctions between GPUs and CPUs: each does best what it was designed for.
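As a concrete illustration of this data-parallel model, here is a minimal CUDA sketch (the kernel name, array size and launch configuration are arbitrary choices for the example): every thread executes the same instruction stream, each on a different array element.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Every thread runs the same code on a different element (SIMD/SIMT style).
    __global__ void scaleAdd(const float* x, const float* y, float* out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;   // global thread index
        if (i < n)                                       // guard the tail
            out[i] = 2.0f * x[i] + y[i];
    }

    int main()
    {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);

        float *x, *y, *out;                 // unified memory keeps the sketch short
        cudaMallocManaged(&x, bytes);
        cudaMallocManaged(&y, bytes);
        cudaMallocManaged(&out, bytes);
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        int block = 256;                    // threads per block
        int grid  = (n + block - 1) / block;
        scaleAdd<<<grid, block>>>(x, y, out, n);
        cudaDeviceSynchronize();

        printf("out[0] = %f\n", out[0]);    // expect 4.0
        cudaFree(x); cudaFree(y); cudaFree(out);
        return 0;
    }

The launch spawns roughly a million threads; the hardware schedules them in groups (warps) that execute the same instruction in lockstep, which is exactly the SIMD/SIMT behaviour described above.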

More information at http://en.wikipedia.org/wiki/GPGPU

2243 questions
1 vote · 2 answers
OpenCL version of cudaMemcpyToSymbol & optimization
Can someone tell me the OpenCL version of cudaMemcpyToSymbol for copying a __constant buffer to the device and getting it back to the host? Or will the usual clEnqueueWriteBuffer(...) do the job? Could not find much help in the forum. Actually a few lines of demo will suffice. …
gpuguy · 4,607
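For context, this is a minimal sketch of the CUDA pattern the question above wants to translate (the variable and kernel names are made up for the example). On the OpenCL side the usual approach is an ordinary cl_mem buffer written with clEnqueueWriteBuffer / read with clEnqueueReadBuffer and declared __constant in the kernel signature; there is no direct symbol-based copy.

    #include <cstdio>
    #include <cuda_runtime.h>

    // CUDA side: a __constant__ array is filled from the host with
    // cudaMemcpyToSymbol and read back with cudaMemcpyFromSymbol.
    __constant__ float coeffs[4];

    __global__ void useCoeffs(float* out)
    {
        int i = threadIdx.x;
        if (i < 4)
            out[i] = coeffs[i] * 10.0f;    // every thread reads constant memory
    }

    int main()
    {
        float host[4] = {1.f, 2.f, 3.f, 4.f};
        cudaMemcpyToSymbol(coeffs, host, sizeof(host));     // host -> __constant__

        float* d_out;
        cudaMalloc(&d_out, sizeof(host));
        useCoeffs<<<1, 4>>>(d_out);

        float back[4];
        cudaMemcpyFromSymbol(back, coeffs, sizeof(back));   // __constant__ -> host
        cudaMemcpy(host, d_out, sizeof(host), cudaMemcpyDeviceToHost);
        printf("coeffs[2] read back = %f, out[2] = %f\n", back[2], host[2]);

        cudaFree(d_out);
        return 0;
    }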
1 vote · 1 answer
clGetDeviceIDs fails in OpenCL with error code -30
The output of the following program on my machine with an ATI FirePro V8750 is: "Couldn't find any devices: No error" (this happens at the first call to clGetDeviceIDs). The error code returned is -30. What does that mean? I am not able to…
gpuguy · 4,607
1 vote · 1 answer
Copying array from RAM to GPU and from GPU to RAM
I'm trying to introduce some CUDA optimizations in one of my projects. But I think I'm doing something wrong here. I want to implement a simple matrix-vector multiplication (result = matrix * vector). But when I want to copy the result back to the…
alfa · 3,058
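The excerpt above is cut off, so the following is only a generic sketch of the copy-in / compute / copy-back pattern such a matrix-vector multiplication usually follows; the sizes, names and the one-thread-per-row kernel are assumptions, not the asker's code. The copy-back step at the end is where the destination/source order and the cudaMemcpyDeviceToHost flag are easy to get wrong.

    #include <cstdio>
    #include <vector>
    #include <cuda_runtime.h>

    // One thread per output row: result[r] = dot(matrix row r, vector).
    __global__ void matVec(const float* m, const float* v, float* result,
                           int rows, int cols)
    {
        int r = blockIdx.x * blockDim.x + threadIdx.x;
        if (r < rows) {
            float acc = 0.0f;
            for (int c = 0; c < cols; ++c)
                acc += m[r * cols + c] * v[c];
            result[r] = acc;
        }
    }

    int main()
    {
        const int rows = 512, cols = 512;
        std::vector<float> h_m(rows * cols, 1.0f), h_v(cols, 2.0f), h_r(rows);

        float *d_m, *d_v, *d_r;
        cudaMalloc(&d_m, h_m.size() * sizeof(float));
        cudaMalloc(&d_v, h_v.size() * sizeof(float));
        cudaMalloc(&d_r, h_r.size() * sizeof(float));

        // RAM -> GPU
        cudaMemcpy(d_m, h_m.data(), h_m.size() * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(d_v, h_v.data(), h_v.size() * sizeof(float), cudaMemcpyHostToDevice);

        matVec<<<(rows + 255) / 256, 256>>>(d_m, d_v, d_r, rows, cols);

        // GPU -> RAM: destination is the host pointer, source is the device
        // pointer, and the kind is cudaMemcpyDeviceToHost.
        cudaError_t err = cudaMemcpy(h_r.data(), d_r, h_r.size() * sizeof(float),
                                     cudaMemcpyDeviceToHost);
        printf("status: %s, result[0] = %f\n", cudaGetErrorString(err), h_r[0]); // expect 1024

        cudaFree(d_m); cudaFree(d_v); cudaFree(d_r);
        return 0;
    }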
1 vote · 1 answer
Maximum (shared memory per block) / (threads per block) in CUDA with 100% MP load
I'm trying to process an array of big structures with CUDA 2.0 (NVIDIA 590). I'd like to use shared memory for it. I've experimented with the CUDA occupancy calculator, trying to allocate the maximum shared memory per thread, so that each thread can process…
mirror2image · 290
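The structure-processing kernel itself isn't shown above, so here is only a generic sketch of the dynamic shared-memory pattern that the occupancy calculator reasons about: the bytes of shared memory per block are passed as the third launch parameter and scale with the threads-per-block choice, which is exactly the trade-off against occupancy the question is probing.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Dynamic shared memory: size is the third <<<...>>> parameter, so
    // "shared memory per block" follows the threads-per-block choice.
    __global__ void blockSum(const float* in, float* out, int n)
    {
        extern __shared__ float tile[];               // one float per thread
        int tid = threadIdx.x;
        int i   = blockIdx.x * blockDim.x + tid;
        tile[tid] = (i < n) ? in[i] : 0.0f;
        __syncthreads();

        // simple tree reduction within the block (block size must be a power of two)
        for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
            if (tid < stride)
                tile[tid] += tile[tid + stride];
            __syncthreads();
        }
        if (tid == 0)
            out[blockIdx.x] = tile[0];
    }

    int main()
    {
        const int n = 1 << 16, block = 256, grid = (n + block - 1) / block;
        size_t smemPerBlock = block * sizeof(float);  // bytes of shared memory per block

        float *d_in, *d_out;
        cudaMalloc(&d_in, n * sizeof(float));
        cudaMalloc(&d_out, grid * sizeof(float));
        cudaMemset(d_in, 0, n * sizeof(float));

        blockSum<<<grid, block, smemPerBlock>>>(d_in, d_out, n);
        cudaDeviceSynchronize();
        printf("launched %d blocks with %zu bytes of shared memory each\n",
               grid, smemPerBlock);

        cudaFree(d_in); cudaFree(d_out);
        return 0;
    }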
1 vote · 1 answer
What is the difference between the OpenCL functions length() and fast_length()?
On page three of this OpenCL reference sheet (broken link) there are two built-in vector length functions with identical parameters: length() and fast_length(). What is the difference between these functions? I gather from the name one is 'faster'…
sebf · 2,831
1 vote · 1 answer
CUDA on integrated GPU + external device
I have a Dell desktop PC which has an integrated GPU. If I add one more GPU over PCIe, will I be able to run CUDA? Probably yes. The integrated GPU has its own driver (i915) and I am not sure what will happen with the NVIDIA driver (for the second GPU)…
amanda · 394
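In typical setups the integrated GPU simply is not a CUDA device, so it does not appear to the CUDA runtime at all; only the added NVIDIA card is enumerated once its driver is installed. A small sketch like the following (standard runtime API calls, assuming the NVIDIA driver and toolkit are present) is a quick way to check what the runtime actually sees.

    #include <cstdio>
    #include <cuda_runtime.h>

    // List the CUDA-capable devices visible to the runtime. A non-NVIDIA
    // integrated GPU (e.g. one driven by i915) will not be listed.
    int main()
    {
        int count = 0;
        cudaError_t err = cudaGetDeviceCount(&count);
        if (err != cudaSuccess) {
            printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
            return 1;
        }
        for (int d = 0; d < count; ++d) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, d);
            printf("device %d: %s, compute capability %d.%d\n",
                   d, prop.name, prop.major, prop.minor);
        }
        return 0;
    }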
1 vote · 1 answer
Passing GPUArray to feval
I have the following kernel: __global__ void func( float * arr, int N ) { int rtid = blockDim.x * blockIdx.x + threadIdx.x; if( rtid < N ) { float* row = (float*)((char*)arr + rtid*N*sizeof(float) ); for (int c = 1; c…
VIHARRI PLV · 587
1 vote · 0 answers
Profiler shows OpenCL not using all available registers
Here is a copy of the occupancy analysis of my kernel from the NVIDIA Compute Visual Profiler: Kernel details: Grid size: 300 x 1, Block size: 224 x 1 x 1. Register Ratio = 0.75 ( 24576 / 32768 ) [48 registers per thread]. Shared Memory Ratio =…
altair211 · 97
0 votes · 2 answers
Advanced Encryption Standard on GPU using CUDA
I am a CUDA developer assisting undergrad students in implementing AES on the GPU. They don't have much knowledge of cryptography, and this is also the first time I am working on it. I have a few questions, if anyone could answer them. How do we…
Bilal · 25
0 votes · 2 answers
OpenCL - waste of host computing power
I am new to OpenCL. Please tell me whether the host CPU can be used only for allocating memory for the device, or whether it can also be used as an OpenCL device (because after the allocation is done, the host CPU will be idle).
0 votes · 2 answers
Information on current GPU Architectures
I have decided that my bachelor's thesis will be about general-purpose GPU computing and which problems are more suitable for this than others. I am also trying to find out if there are any major differences between the current GPU architectures that…
vichle · 2,499
0 votes · 1 answer
OpenHMPP in GCC
The gist of the question is: Do you know any projects that aim to bring OpenHMPP support to GCC? I could also possibly live with affordable commercial compilers, but it's very unlikely, because I prefer Linux, and I would like the compiler to…
enobayram · 4,650
0 votes · 1 answer
"cast" GL_R8 to GL_BGRA
I'm doing some GPGPU programming with OpenGL. I want to be able to write all my data to one-dimensional textures with the format GL_R8, so that I can basically treat it as an std::array object. Then during rendering I would like to be able to set…
ronag · 49,529
0 votes · 2 answers
Trouble with CUDA Memory Allocation and Access
I am working on learning CUDA right now. I have some basic experience with MPI so I figured I'd start with some really simple vector operations. I am trying to write a parallelized dot product thing. I am either having trouble allocating/writing…
Joe · 320
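Since the excerpt above breaks off at the allocation step, here is only a generic sketch of the allocate / copy / launch / copy-back sequence for a dot product (a pairwise multiply followed by a host-side sum, which keeps the memory traffic easy to check); none of the names or sizes are the asker's.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Each thread multiplies one pair of elements; the partial products are
    // summed on the host. (A real dot product would reduce on the device.)
    __global__ void pairwiseMul(const float* a, const float* b, float* prod, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            prod[i] = a[i] * b[i];
    }

    int main()
    {
        const int n = 1024;
        float h_a[n], h_b[n], h_prod[n];
        for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 3.0f; }

        float *d_a, *d_b, *d_prod;
        // Common pitfall: cudaMalloc takes the ADDRESS of the device pointer.
        cudaMalloc(&d_a, n * sizeof(float));
        cudaMalloc(&d_b, n * sizeof(float));
        cudaMalloc(&d_prod, n * sizeof(float));

        cudaMemcpy(d_a, h_a, n * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(d_b, h_b, n * sizeof(float), cudaMemcpyHostToDevice);

        pairwiseMul<<<(n + 255) / 256, 256>>>(d_a, d_b, d_prod, n);
        cudaMemcpy(h_prod, d_prod, n * sizeof(float), cudaMemcpyDeviceToHost);

        float dot = 0.0f;
        for (int i = 0; i < n; ++i) dot += h_prod[i];
        printf("dot = %f (expected %f)\n", dot, 3.0f * n);

        cudaFree(d_a); cudaFree(d_b); cudaFree(d_prod);
        return 0;
    }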
0 votes · 1 answer
Advice needed regarding GPGPU library
I am writing an application and eventually it comes to a well-parallelisable part: two-dimensional float initialData and result arrays; for each cell (a, b) in the result array: for each cell (i, j) in initialData: result(a, b) +=…
user380041
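The loop body in the excerpt above is cut off, so the following is only a sketch of how such a nested accumulation typically maps onto a GPU: one thread per result cell, each looping over the whole input. The contribution() function is a placeholder I made up, not the asker's formula, and a library (OpenCL, Thrust, ArrayFire, etc.) would organise the same computation in a similar way.

    #include <cstdio>
    #include <cuda_runtime.h>

    // PLACEHOLDER: the real per-cell contribution is not shown in the excerpt.
    __device__ float contribution(float value, int a, int b, int i, int j)
    {
        int di = a > i ? a - i : i - a;
        int dj = b > j ? b - j : j - b;
        return value / (1.0f + di + dj);   // made-up weighting for illustration
    }

    // One thread per result cell (a, b); it loops over all of initialData.
    __global__ void accumulate(const float* initialData, float* result,
                               int inW, int inH, int outW, int outH)
    {
        int a = blockIdx.x * blockDim.x + threadIdx.x;
        int b = blockIdx.y * blockDim.y + threadIdx.y;
        if (a >= outW || b >= outH) return;

        float acc = 0.0f;
        for (int j = 0; j < inH; ++j)
            for (int i = 0; i < inW; ++i)
                acc += contribution(initialData[j * inW + i], a, b, i, j);
        result[b * outW + a] = acc;
    }

    int main()
    {
        const int inW = 64, inH = 64, outW = 64, outH = 64;
        float *d_in, *d_out;
        cudaMalloc(&d_in,  inW * inH * sizeof(float));
        cudaMalloc(&d_out, outW * outH * sizeof(float));
        cudaMemset(d_in, 0, inW * inH * sizeof(float));

        dim3 block(16, 16);
        dim3 grid((outW + block.x - 1) / block.x, (outH + block.y - 1) / block.y);
        accumulate<<<grid, block>>>(d_in, d_out, inW, inH, outW, outH);
        cudaDeviceSynchronize();
        printf("kernel finished: %s\n", cudaGetErrorString(cudaGetLastError()));

        cudaFree(d_in); cudaFree(d_out);
        return 0;
    }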