Questions tagged [opencl]

OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors.

This tag refers to OpenCL (Open Computing Language) by the Khronos Group. It is the first open, royalty-free standard for cross-platform, parallel programming of modern processors found in personal computers, servers, and handheld/embedded devices. Using OpenCL, one can run computations in parallel, greatly improving the speed and responsiveness of a wide spectrum of applications, from gaming and entertainment to scientific and medical software.

OpenCL is an API and a C99-like language; for each device, implementations are provider-specific. Some of the OpenCL implementation providers are:

Questions about OpenCL can be asked here, along with the vendor/provider and architecture details. Bug reports should be discussed in the respective vendor forums: NVIDIA Forums, Intel Forums, AMD Forums.
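To make the "C99-like language" part of the description concrete, here is a minimal sketch of the execution model in plain C99. It is not real OpenCL code: `vec_add_item` and `vec_add_emulated` are hypothetical names, and the loop only emulates on the host what a device would do when it runs one kernel instance per work-item. In actual OpenCL C the function would be declared `__kernel` and would obtain its index from `get_global_id(0)`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an OpenCL kernel body. In real OpenCL C this
   would be declared __kernel and would call get_global_id(0) itself; here
   the index is passed in explicitly so the sketch compiles as plain C99. */
static void vec_add_item(size_t gid, const float *a, const float *b, float *out)
{
    out[gid] = a[gid] + b[gid];
}

/* Host-side emulation of a 1-D NDRange launch: run the "kernel" once per
   work-item. A real device may run work-items in parallel and in any
   order; this loop is strictly sequential. */
static void vec_add_emulated(const float *a, const float *b, float *out,
                             size_t global_size)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t gid = 0; gid < global_size; ++gid) {
        vec_add_item(gid, a, b, out);
    }
}
```

Calling `vec_add_emulated(a, b, out, n)` on three arrays of length `n` fills `out` with the element-wise sums. A real program would instead build the kernel and enqueue it through the vendor's OpenCL implementation.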

Books

5705 questions
15
votes
4 answers

How to compile OpenCL on Ubuntu?

Question: Which headers and drivers are needed, and where would I get them, for compiling OpenCL on Ubuntu using gcc/g++? Info: for a while now I've been stumbling around trying to figure out how to install OpenCL on my desktop and if…
Narcolapser
  • 5,895
  • 15
  • 46
  • 56
15
votes
4 answers

OpenCL for Python

I'm looking for a good OpenCL wrapper/library for Python, with good documentation. I tried searching, but couldn't find one good enough.
obenjiro
  • 3,665
  • 7
  • 44
  • 82
15
votes
2 answers

Can I run CUDA or OpenCL on Intel Iris?

I have a mid-2014 MacBook Pro with Intel Iris graphics, an Intel Core i5 processor, and 16 GB of RAM. I am planning to learn some ray-traced 3D, but I am not sure whether my laptop can render fast without any NVIDIA hardware. So I would appreciate it if someone…
Fudoli
  • 199
  • 1
  • 1
  • 4
15
votes
2 answers

HPC programming language relying on implicit vectorization

Are there programming languages or language extensions that rely on implicit vectorization? I would need something that makes aggressive assumptions to generate good DLP/vectorized code for SSE4.1, AVX, AVX2 (with or without FMA3/4) in single/double…
diegor
  • 542
  • 2
  • 15
15
votes
1 answer

Different ways to make kernel

In this tutorial there are two methods to run the kernel, and another one mentioned in the comments: 1. cl::KernelFunctor…
otisonoza
  • 1,334
  • 2
  • 14
  • 32
15
votes
1 answer

OpenCL: Work items, Processing elements, NDRange

My classmates and I are being confronted with OpenCL for the first time. As expected, we ran into some issues. Below I summarized the issues we had and the answers we found. However, we're not sure that we got it all right, so it would be great if…
typeduke
  • 6,494
  • 6
  • 25
  • 34
15
votes
4 answers

Using Delphi to take advantage of GPGPU technology?

GPGPU is the principle of using the parallel processors on video cards for massive increases in performance. Does anyone have any ideas about using GPGPU in Delphi, using either OpenCL or CUDA? CUDA was/is NVIDIA only, but they have also adopted…
TallGuy
  • 151
  • 1
  • 3
15
votes
2 answers

How to represent scientific notation in C

How do I represent extremely large or small numbers in C with a certain number of significant figures? For example, if I want to do calculations on 1.54334E-34, how could I do this? Also, is this applicable to OpenCL code?
user1876508
  • 12,864
  • 21
  • 68
  • 105
15
votes
1 answer

How to use async_work_group_copy in OpenCL?

I would like to understand how to correctly use the async_work_group_copy() call in OpenCL. Let's have a look at a simplified example: __kernel void test(__global float *x) { __local xcopy[GROUP_SIZE]; int globalid = get_global_id(0); int…
SDwarfs
  • 3,189
  • 5
  • 31
  • 53
15
votes
2 answers

Keep getting CL_INVALID_KERNEL_ARGS on NVIDIA GPU

I'm using OpenCL on an NVIDIA GPU and I keep getting CL_INVALID_KERNEL_ARGS when I try to execute a kernel. I've stripped it down to a very simple program: __kernel void foo(int a, __write_only image2d_t bar) { int2 coords = {0,…
Trevor
  • 1,369
  • 2
  • 13
  • 28
15
votes
3 answers

When will OpenCL 1.2 for NVIDIA hardware be available?

I would have asked this question on the NVIDIA developer forum, but since it's still down, maybe someone here can tell me something. Does anybody know if there is already OpenCL 1.2 support in NVIDIA's driver? If not, is it coming soon? I don't have a…
lochotzke
  • 181
  • 1
  • 4
15
votes
2 answers

Confusion on CUDA/OpenCL and C++ AMP

I read that Microsoft is working closely with NVIDIA to improve AMP performance. But my question is: is AMP a CUDA replacement by Microsoft? Or does AMP use CUDA drivers when an NVIDIA CUDA video card is available? Is AMP an OpenCL substitute? I'm still…
Marco A.
  • 43,032
  • 26
  • 132
  • 246
14
votes
5 answers

Are OpenCL work items executed in parallel?

I know that work items are grouped into work groups, and you cannot synchronize outside of a work group. Does it mean that work items are executed in parallel? If so, is it possible/efficient to make 1 work group with 128 work items?
K0n57an71n
  • 367
  • 1
  • 4
  • 11
14
votes
1 answer

How to draw OpenCL calculated pixels to the screen with OpenGL?

I want to do some calculated pixel art with OpenCL and show it directly on the display without a CPU round trip. I could use the interoperability of OpenCL with OpenGL, write to the texture banks of the GPU, and display the texture with OpenGL.…
RobotRock
  • 4,211
  • 6
  • 46
  • 86
14
votes
1 answer

Branch predication on GPU

I have a question about branch predication in GPUs. As far as I know, GPUs use predication for branches. For example, I have code like this: if (C) A else B. So if A takes 40 cycles and B takes 50 cycles to finish execution, if assuming…
Zk1001
  • 2,033
  • 4
  • 19
  • 36