Questions tagged [opencl]

OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors.

This tag refers to OpenCL (Open Computing Language) by the Khronos Group. It is the first open, royalty-free standard for cross-platform, parallel programming of modern processors found in personal computers, servers and handheld/embedded devices. Using OpenCL, one can run computations in parallel, greatly improving the speed and responsiveness of a wide spectrum of applications, from gaming and entertainment to scientific and medical software.

OpenCL is an API and a C99-like language; for each device, implementations are provider-specific. Some of the OpenCL implementation providers are:

Questions about OpenCL can be asked here, along with the vendor/provider and architecture details. Bug reports should be discussed in the respective vendor forums: NVIDIA Forums, Intel Forums, AMD Forums.

Books

5705 questions
2
votes
2 answers

How to check the throughput and latency in Altera OpenCL

In the Altera design example, I tried vector add, but I can't get the throughput and latency of the kernel from the compilation results. I read Altera's programming guide, which mentions using profile.mon. Is it possible to use -march=emulator --profile…
dev_55
  • 21
  • 3
2
votes
2 answers

OpenCL with ARM NEON (without Mali GPU) available?

I am working on a customized SoC with an ARM Cortex-A9. It supports NEON but does not have a Mali GPU. With this system, can I use OpenCL with NEON? I found the OpenCL SDK for Mali on the ARM website (http://malideveloper.arm.com/resources/sdks/mali-opencl-sdk/), but there…
soongk
  • 259
  • 3
  • 17
2
votes
0 answers

Slow copying from T-API Umat to memory buffer

I am using the OpenCV T-API to get my OpenCV stuff executed on the GPU, if available. I have a function that gets a memory buffer; I read that into a Mat, convert it to a cv::UMat and perform my changes. This works pretty well; the processing speeds up…
BT9
  • 87
  • 1
  • 11
2
votes
1 answer

OpenCL internal error on build with Python

I keep getting an internal error despite successful builds of the OpenCL code below. The error message is not very helpful, as it does not point out the line or column. Can anybody spot this? The kernel code (for the OpenCL program) was able to…
ssn
  • 439
  • 5
  • 14
2
votes
1 answer

OpenCL: clSetKernelArg returns CL_INVALID_ARG_SIZE

I'm a newbie at OpenCL. I'm trying to pass 5 arguments into a kernel: an input buffer, an output buffer, an integer, and 2 local arrays the size of the input buffer. //Create input/output cl_mem objects cl_mem inputBuffer = clCreateBuffer(context,…
user3760657
  • 397
  • 4
  • 16
2
votes
1 answer

OpenCL kernel with generic data type

When writing code for kernels, is it possible to specify a generic data type so that copying the kernel for each used data type is not necessary? Currently I'm using preprocessor macros to define the whole function with various data types: #define…
Quxflux
  • 3,133
  • 2
  • 26
  • 46
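For the generic-data-type question above, the preprocessor approach the asker describes can be sketched as follows. This is a minimal illustration in plain host-side C rather than OpenCL C, so it can be compiled without an OpenCL runtime; the names `DEFINE_SCALE`, `scale_float` and `scale_int` are hypothetical and do not come from the question.

```c
#include <stddef.h>

/* Hypothetical macro: stamps out one scaling function per element type.
 * In an OpenCL C kernel the loop would typically be replaced by one
 * work-item per element (indexed with get_global_id(0)); that variant
 * is not shown here. */
#define DEFINE_SCALE(T)                                    \
    static void scale_##T(T *data, size_t n, T factor)     \
    {                                                      \
        for (size_t i = 0; i < n; ++i)                     \
            data[i] *= factor;                             \
    }

/* One instantiation per type that is actually needed. */
DEFINE_SCALE(float)
DEFINE_SCALE(int)
```

Calling `scale_float(buf, n, 2.0f)` doubles every element of a `float` array, and `scale_int` does the same for `int`. In OpenCL itself, a similar effect is commonly obtained by passing a definition such as `-D T=float` in the options string of `clBuildProgram` and building the same source once per type.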
2
votes
0 answers

Processing N images/frames using OpenCL

I have an OpenCL kernel that applies some filter to a grayscale 1920x1080 image, and I would like to apply the same filter to N different images. To fully utilise the GPU, what is the best practice in such a case that achieves the highest frames per…
mmain
  • 333
  • 3
  • 19
2
votes
1 answer

How to share global variables (arrays) in an OpenCL kernel with several user defined functions

I'm having a problem in my OpenCL kernel. I am trying to do Runge-Kutta 4 integration. I already implemented it in an OpenGL compute shader and it works, and now I want to implement it in OpenCL. I think my issue has to do with not knowing how to…
sleon
  • 51
  • 7
2
votes
0 answers

OpenCL acceleration with gographics/imagick

I have been putting enormous effort into trying to get GPU acceleration for my ImageMagick resize operations using Go and its imagick library: https://gopkg.in/gographics/imagick.v2/imagick I have built ImageMagick with OpenCL support and OpenCL…
N. Saarela
  • 21
  • 3
2
votes
2 answers

clSetKernelArg returns with CL_INVALID_MEM_OBJECT

I'm totally green with OpenCL. I'm trying to get a sample on Intel's website to work, but I cannot. This is the sample. I'm getting the error CL_INVALID_MEM_OBJECT when trying to pass an integer argument to clSetKernelArg like so: err =…
user2415010
2
votes
0 answers

Speed of API calls

I installed ImageMagick following the instructions from "Why is ImageMagick with OpenCL slower than OpenMP?". I need to resize an image to 4 different sizes and have to crop them. I wrote a C++ program that calls image.resize("500x500!"),…
Vanns
  • 159
  • 7
2
votes
1 answer

OpenCL multiple in-order command queues vs single out-of-order one

I have a number of jobs to execute. Each job consists of a buffer write, a kernel execution and a buffer read, and those operations must of course be executed in order. The various jobs are, however, independent and can therefore be executed…
Shepard
  • 801
  • 3
  • 9
  • 17
2
votes
1 answer

Dynamic allocation in shared memory in OpenCL on NVIDIA

I'm following the example here to create a variable-length local memory array. The kernel signature is something like this: __kernel void foo(__global float4* ex_buffer, int ex_int, __local void *local_var) Then…
tavr
  • 81
  • 1
  • 6
2
votes
1 answer

OpenCL - Most efficient way to split byte into an 8-component-vector

I'm building a simulation of the Ising Model in OpenCL which means that my data consists of a bunch of states which can either be up/1 or down/-1. To save memory bandwidth 8 of these states are encoded into a single byte (up=1, down=0). Now in one…
Gigo
  • 3,188
  • 3
  • 29
  • 40
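The encoding described in the Ising model question above (8 states packed into one byte, up=1, down=0) can be unpacked with a shift and mask per bit. The sketch below is plain host-side C, not OpenCL C, and makes no claim to be the most efficient approach; the function name `unpack_spins` and the least-significant-bit-first ordering are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: expands one packed byte into 8 spin values.
 * A set bit (up) becomes +1 and a clear bit (down) becomes -1.
 * Assumption: bit i of `packed` maps to out[i], least significant first. */
static void unpack_spins(uint8_t packed, int8_t out[8])
{
    for (int i = 0; i < 8; ++i) {
        int bit = (packed >> i) & 1;      /* 0 or 1 */
        out[i] = (int8_t)(2 * bit - 1);   /* 0 -> -1, 1 -> +1 */
    }
}
```

The mapping `2 * bit - 1` avoids a branch per element. In a kernel, the eight results could be held in an 8-component vector type such as `char8` instead of an array.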
2
votes
1 answer

OpenCL Choosing Optimal Device for Throughput

I am working with Cloo, an OpenCL C# library, and I was wondering how I can best determine which device to use for my kernels at runtime. What I really want to know is how many cores I have (compute units * cores per compute unit) on GPUs. How do I…
guitar80
  • 716
  • 6
  • 19