Questions tagged [gpgpu]

GPGPU is an acronym for the field of computer science known as "General-Purpose computing on the Graphics Processing Unit (GPU)". The two biggest manufacturers of GPUs are NVIDIA and AMD, although Intel has also been moving in this direction, for example with the integrated graphics in its Haswell processors. There are two popular frameworks for GPGPU: NVIDIA's CUDA, which is supported only on NVIDIA's own hardware, and OpenCL, developed by the Khronos Group, a consortium that includes AMD, NVIDIA, Intel, Apple, and others. However, NVIDIA supports the OpenCL standard only half-heartedly, so the rivalry among GPU manufacturers is partially mirrored in a rivalry between the programming frameworks.

The attractiveness of using GPUs for other tasks largely stems from the parallel processing capabilities of modern graphics cards: some cards contain thousands of stream processors operating on similar data at very high rates.

In the past, CPUs first emulated multiple data streams by interleaving processing tasks; over time, we gained multiple cores, each running multiple threads. Modern video cards house GPUs that host far more concurrent threads than CPUs do, integrated with extremely fast memory. This huge number of threads in flight is achieved through SIMD (Single Instruction, Multiple Data): many threads execute the same instruction on different data elements. This makes the GPU an environment uniquely suited to heavy computational loads that can be parallelized. It also marks one of the main differences between GPUs and CPUs: each does best what it was designed for.

More information at http://en.wikipedia.org/wiki/GPGPU

2243 questions
1
vote
1 answer

CUDA Toolkit 4.1/4.2: nvcc Crashes with an Access Violation

I am developing a CUDA application for a GTX 580 with Visual Studio 2010 Professional on 64-bit Windows 7. My project builds fine with CUDA Toolkit 4.0, but nvcc crashes when I choose CUDA Toolkit 4.1 or 4.2, with the following error: 1> Stack dump: …
meriken2ch
  • 409
  • 5
  • 15
1
vote
1 answer

Which is better: atomic contention between threads of a single warp, or between threads of different warps?

Which is better: atomic contention (concurrency) between threads of a single warp, or between threads of different warps in one block? I think that when accessing shared memory it is better when threads of one warp are competing with each…
Alex
  • 12,578
  • 15
  • 99
  • 195
1
vote
1 answer

About floating-point operations

Recently, I have been writing a program (an FDTD computation) using the CUDA development environment. The OS is Windows Server 2008, the graphics card is a Tesla C2070, and the compiler is VS2010. This program calculates using single- and double-precision floating point. I was…
오승택
  • 45
  • 1
  • 5
1
vote
1 answer

Java: Cast or reference multidimensional array into single dimensional array

I have a program written in Java which involves a massive amount of multidimensional arrays. I am trying to parallelize it using JOCL (OpenCL), but the multidimensional arrays have to be converted to single-dimensional arrays before being passed to OpenCL.…
aaronqli
  • 790
  • 9
  • 26
1
vote
1 answer

clSetKernelArg changed arg_value from 16 to 140733193388048?

I'm delving into OpenCL by making a Matrix dot product implementation. I'm having a problem with getting my kernels to return the same values as my host. I have made an encapsulation function that allocates device memory, sets parameters to a…
user1509669
  • 233
  • 2
  • 7
1
vote
0 answers

Read/Write the registers on a GPU

Is it possible to read/write from/to the registers on the GPU using OpenCL? I am using an NVIDIA GeForce 9400 GT graphics card. I tried googling, but there is not much information out there. Can someone tell me if this is possible and, if yes, how?
Nike
  • 455
  • 1
  • 5
  • 16
1
vote
0 answers

Understanding Registers in OpenCL

I am a little confused regarding the usage of registers internally by OpenCL kernels. I am using -cl-nv-verbose to capture the register usage for my kernel. At the moment, my kernel is recording ptxas info: Used 4 registers for some code in the…
Omar Khan
  • 68
  • 6
1
vote
2 answers

How do I get started with CUDA development on Ubuntu 9.04?

How do I get started with CUDA development on Ubuntu 9.04? Are there any prebuilt binaries? Are the default accelerated drivers sufficient? My thought is to actually work with OpenCL, but that seems to be hard to do right now, so I thought that I…
Per Arneng
  • 2,100
  • 5
  • 21
  • 32
1
vote
3 answers

Shift vector in thrust

I'm looking at a project involving online (streaming) data. I want to work with a sliding window of that data. For example, say that I want to hold 10 values in my vector. When value 11 comes in, I want to drop value 1, shift everything over, and…
Noah
  • 567
  • 8
  • 19
1
vote
1 answer

Optimizing a threaded simultaneous check

I have a device function that checks a byte array using threads, each thread checking a different byte in the array for a certain value and returns bool true or false. How can I efficiently decide if all the checks have returned true or otherwise?
gamerx
  • 579
  • 5
  • 16
1
vote
2 answers

Perfect hashing for OpenCL

I have a set (static, known in compile time) of about 2 million values, 20 bytes each. What I need is a fast O(1) way to check if a given value is in this set. It seems that perfect hash function with a bit array is ideal for this, but I can't find…
aplavin
  • 2,199
  • 5
  • 32
  • 53
1
vote
2 answers

cudaStreamDestroy() does not synchronize/block?

I'm using CUDA 4.2 on a Quadro NVS 295 on a Win7 x64 machine. From the CUDA C Programming Manual I read this: "...Streams are released by calling cudaStreamDestroy(). for (int i = 0; i < 2; ++i) cudaStreamDestroy(stream[i]); cudaStreamDestroy()…
ACRay
  • 13
  • 1
  • 4
1
vote
1 answer

OpenCL producing QNaN on NVidia hardware

I'm programming in OpenCL using the C++ bindings. I have a problem where on NVidia hardware, my OpenCL code is spontaneously producing very large numbers, and then (on the next run) a "1.#QNaN". My code is pretty much a simple physics simulation…
Chaosed0
  • 949
  • 1
  • 10
  • 20
1
vote
3 answers

How to use shared memory between kernel launches in CUDA?

I want to use values in shared memory over multiple launches of the same kernel. Can I do that?
Amin
  • 371
  • 1
  • 2
  • 7
1
vote
1 answer

Image cross-correlation with Matlab GPGPU, indexing into 3d array

The problem I'm encountering is writing code such that the built-in features of Matlab's GPU programming will correctly divide data for parallel execution. Specifically, I'm sending N 'particle' images to the GPU's memory, organized in a 3-d array…
ejmunson
  • 11
  • 1