Questions tagged [opencl]

OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors.

This tag refers to OpenCL (Open Computing Language) by the Khronos Group. It is the first open, royalty-free standard for cross-platform, parallel programming of modern processors found in personal computers, servers, and handheld/embedded devices. With OpenCL, applications can run computations in parallel, which can greatly improve the speed and responsiveness of a wide range of software, from gaming and entertainment to scientific and medical applications.

OpenCL consists of an API and a C99-like language; the implementation for each device is provider-specific. Implementation providers include NVIDIA, Intel, and AMD.

Questions about OpenCL can be asked here; include the vendor/provider and architecture details. Bug reports should be discussed in the respective vendor forums: NVIDIA Forums, Intel Forums, AMD Forums.

5705 questions
13 votes, 4 answers

What should I use instead of cl::KernelFunctor?

I'm following some tutorials on OpenCL and they mention a type called cl::KernelFunctor. However, that type isn't found and when I looked at the headers of the AMD APP SDK, I saw that the declaration of the cl::KernelFunctor class is commented…
Theodoros Chatzigiannakis
  • 28,773
  • 8
  • 68
  • 104
13 votes, 4 answers

Measuring execution time of OpenCL kernels

I have the following loop that measures the time of my kernels: double elapsed = 0; cl_ulong time_start, time_end; for (unsigned i = 0; i < NUMBER_OF_ITERATIONS; ++i) { err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global, NULL, 0, NULL,…
user1096294
  • 829
  • 2
  • 10
  • 19
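The arithmetic behind this kind of measurement can be sketched in plain C. This is only a sketch under stated assumptions: it assumes the `time_start` and `time_end` values come from `clGetEventProfilingInfo` with `CL_PROFILING_COMMAND_START` and `CL_PROFILING_COMMAND_END` on a queue created with `CL_QUEUE_PROFILING_ENABLE` (the excerpt is truncated before that point), and it uses a `uint64_t` typedef as a stand-in for `cl_ulong` so it compiles without the OpenCL headers. The names `timestamp_ns`, `kernel_elapsed_ms`, and `mean_elapsed_ms` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for cl_ulong (assumption): OpenCL profiling timestamps are
   64-bit unsigned device-time counters expressed in nanoseconds. */
typedef uint64_t timestamp_ns;

/* Elapsed device time of a single kernel launch, in milliseconds. */
double kernel_elapsed_ms(timestamp_ns time_start, timestamp_ns time_end)
{
    assert(time_end >= time_start);   /* the counters should not run backwards */
    return (double)(time_end - time_start) / 1.0e6;
}

/* Mean elapsed time over n launches, in milliseconds.
   starts[i] and ends[i] are the timestamps recorded for launch i. */
double mean_elapsed_ms(const timestamp_ns *starts,
                       const timestamp_ns *ends,
                       size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += kernel_elapsed_ms(starts[i], ends[i]);
    return n ? total / (double)n : 0.0;
}
```

Subtracting the integer timestamps before converting to `double` keeps the difference exact even when the raw counter values are large.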
13 votes, 2 answers

Using R's GPU packages on Amazon

I spent almost a whole day trying to get this running and finally decided to come to SO because there are going to be people here who have tried this =) I would like to get an Amazon-EC2 GPU machine running with rpud (or another R GPU package),…
James Tobin
  • 3,070
  • 19
  • 35
12 votes, 1 answer

Buffer object and image buffer object in OpenCL

What is the difference between a buffer object and an image buffer object in OpenCL? It is evident that an image buffer is faster, but to what extent? Where should each be used?
Megharaj
  • 1,589
  • 2
  • 20
  • 32
12 votes, 3 answers

Does AMD's OpenCL offer something similar to CUDA's GPUDirect?

NVIDIA offers GPUDirect to reduce memory transfer overheads. I'm wondering if there is a similar concept for AMD/ATI? Specifically: 1) Do AMD GPUs avoid the second memory transfer when interfacing with network cards, as described here. In case…
arrayfire
  • 1,744
  • 12
  • 19
12 votes, 6 answers

Disassemble an OpenCL kernel?

I'm not sure if it's possible. I want to study OpenCL in depth, so I was wondering if there is a tool to disassemble a compiled OpenCL kernel. For a normal x86 executable, I can use objdump to get a disassembly view. Is there a similar tool for…
Patrick
  • 4,186
  • 9
  • 32
  • 45
12 votes, 2 answers

OpenCL synchronization between work-groups

Is it possible to synchronize OpenCL work-groups? For example, I have 100 work-groups, each with only one work-item (don't ask me why, this is an example), and I need to put a barrier in every work-item that ensures that all work-groups will…
pierre tautou
  • 807
  • 2
  • 20
  • 37
12 votes, 1 answer

Why do we need SPIR-V?

I have been reading about heterogeneous computing and came across SPIR-V. There I found the following: SPIR-V is the first open standard, cross-API intermediate language for natively representing parallel compute and graphics. From this image I…
Dimitar Hristov
  • 123
  • 1
  • 7
12 votes, 1 answer

OpenCL When to use global, private, local, constant address spaces

I'm trying to learn OpenCL but I'm having a hard time deciding which address spaces to use, as I only find assembled resources declaring what these address spaces are, but not why they exist or when to use them. The resources are at least too…
Safron
  • 802
  • 1
  • 11
  • 23
12 votes, 2 answers

When to use cudaHostRegister() and cudaHostAlloc()? What is the meaning of "pinned or page-locked" memory? What are the equivalents in OpenCL?

I am new to these NVIDIA APIs and some expressions are not clear to me. I was wondering if somebody can help me understand when and how to use these CUDA commands in a simple way. To be more precise: Studying how it is possible to…
Leos313
  • 5,152
  • 6
  • 40
  • 69
12 votes, 1 answer

How to determine max size of images I can safely pass to/from OpenCL kernel?

I'm developing an OpenCL 1.2 application that deals with large imagery. At the moment, the image I'm testing with is 16507x21244 pixels. My kernel is run in a loop that operates on chunks of the image. The kernel takes 32bpp (rgba) chunks of the…
mio iwakura
  • 301
  • 2
  • 11
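One way to reason about chunk sizes for a case like this is sketched below in plain C. It is a sketch under assumptions, not a definitive answer: it assumes the image is split into full-width row strips, that the limits were read beforehand with `clGetDeviceInfo` (`CL_DEVICE_IMAGE2D_MAX_WIDTH`, `CL_DEVICE_IMAGE2D_MAX_HEIGHT`, `CL_DEVICE_MAX_MEM_ALLOC_SIZE`), and that the maximum allocation size is treated as a conservative bound on one chunk. The struct `device_limits` and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for limits a host program would query with
   clGetDeviceInfo; the values are supplied by the caller here. */
typedef struct {
    size_t   max_width;        /* CL_DEVICE_IMAGE2D_MAX_WIDTH  */
    size_t   max_height;       /* CL_DEVICE_IMAGE2D_MAX_HEIGHT */
    uint64_t max_alloc_bytes;  /* CL_DEVICE_MAX_MEM_ALLOC_SIZE */
} device_limits;

/* Largest number of full-width rows that fits in one chunk.
   Returns 0 when even a single row exceeds the device limits,
   in which case the image must also be split horizontally. */
size_t max_rows_per_chunk(const device_limits *lim,
                          size_t width, size_t bytes_per_pixel)
{
    assert(width > 0 && bytes_per_pixel > 0);
    if (width > lim->max_width)
        return 0;
    uint64_t row_bytes = (uint64_t)width * bytes_per_pixel;
    uint64_t rows = lim->max_alloc_bytes / row_bytes;
    if (rows > lim->max_height)
        rows = lim->max_height;
    return (size_t)rows;
}

/* Number of row strips needed to cover the whole image (ceiling division). */
size_t chunk_count(size_t height, size_t rows_per_chunk)
{
    assert(rows_per_chunk > 0);
    return (height + rows_per_chunk - 1) / rows_per_chunk;
}
```

With the 16507x21244 image from the question at 4 bytes per pixel, a device reporting a maximum 2D image width below 16507 would make `max_rows_per_chunk` return 0, which signals that row strips alone are not enough.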
12 votes, 3 answers

Can I use Julia to program my GPU & CPU?

My system has a graphics card. I do not play games. I want to program some high-performance computing stuff for fun. Can I use Julia to leverage my hardware?
suryakrupa
  • 3,852
  • 1
  • 25
  • 34
12 votes, 2 answers

How to set up OpenCL on an AMD video card with the open-source driver?

I have read this link (https://wiki.debian.org/ru/AtiHowTo) and decided to set up OpenCL. The r600g driver still has to load proprietary microcode into the GPU to enable hardware acceleration. This firmware is usually included in the kernel, but…
user2743980
  • 121
  • 2
  • 2
  • 5
12 votes, 4 answers

Getting started with PyOpenCL

I have recently discovered the power of GP-GPU (general purpose graphics processing unit) and want to take advantage of it to perform 'heavy' scientific and math calculations (that otherwise require big CPU clusters) on a single machine. I know that…
mariotoss
  • 414
  • 3
  • 7
  • 17
12 votes, 2 answers

Are there any good third-party libraries built on top of OpenCL yet?

I'm thinking in particular of processing primitives, things like FFT, convolution, correlation, matrix mathematics, any kind of machine vision primitives. I haven't been able to find anything along these lines. Does anyone know of any good projects…
gct
  • 14,100
  • 15
  • 68
  • 107