0

I'm looking for a fast implementation of scan (prefix sum) in OpenCL. The best one I have found is in the Nvidia SDK, but it's old (2010). Does anyone know of any other implementation of scan in OpenCL?

Shewartz

2 Answers

1

There are several open-source implementations of the scan operation in OpenCL:

  • CLOGS, a library for higher-level operations on top of the OpenCL C++ API.
  • Boost.Compute, a C++ GPU Computing Library for OpenCL.
  • VexCL, a C++ vector expression template library for OpenCL/CUDA.
  • Bolt, a C++ template library optimized for GPUs.

The author of CLOGS wrote a paper comparing the performance of the scan (and sort) operations in these implementations.
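For reference, here is a minimal CPU sketch of the three-phase structure that GPU scans commonly use (scan each block, scan the block totals, add the offsets back). It is not taken from any of the libraries above; the function name blocked_inclusive_scan and the block_size parameter (standing in for the work-group size) are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of CLOGS, Boost.Compute, VexCL or Bolt.
// Computes an inclusive prefix sum in three phases; block_size plays the
// role of the work-group size on a GPU.
std::vector<int> blocked_inclusive_scan(const std::vector<int>& in,
                                        std::size_t block_size) {
    assert(block_size > 0);
    const std::size_t n = in.size();
    const std::size_t num_blocks = (n + block_size - 1) / block_size;
    std::vector<int> out(n);
    std::vector<int> block_offsets(num_blocks);

    // Phase 1: scan every block independently and record its total.
    for (std::size_t b = 0; b < num_blocks; ++b) {
        const std::size_t end = std::min(n, (b + 1) * block_size);
        int running = 0;
        for (std::size_t i = b * block_size; i < end; ++i) {
            running += in[i];
            out[i] = running;
        }
        block_offsets[b] = running;
    }

    // Phase 2: exclusive scan of the block totals yields each block's offset.
    int offset = 0;
    for (std::size_t b = 0; b < num_blocks; ++b) {
        const int total = block_offsets[b];
        block_offsets[b] = offset;
        offset += total;
    }

    // Phase 3: add each block's offset to all of its elements.
    for (std::size_t b = 0; b < num_blocks; ++b) {
        const std::size_t end = std::min(n, (b + 1) * block_size);
        for (std::size_t i = b * block_size; i < end; ++i) {
            out[i] += block_offsets[b];
        }
    }
    return out;
}
```

On a GPU, phases 1 and 3 would run in parallel with one work-group per block. A sequential reference like this is mainly useful for checking the output of whichever library you pick.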

ddemidov
0

If your device supports OpenCL 2.0, use the built-in work-group scan functions (such as work_group_scan_inclusive_add) for that.

https://stackoverflow.com/a/32394920/4877550

http://developer.amd.com/community/blog/2014/11/17/opencl-2-0-device-enqueue/
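A minimal sketch of what that could look like, assuming the program is built with -cl-std=CL2.0. The kernel name group_scan, the constant kGroupScanKernelSource and the helper emulate_group_scan are made up for illustration; only work_group_scan_inclusive_add is the OpenCL 2.0 built-in. The built-in scans within a single work-group only, so the CPU helper emulates that per-group result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// OpenCL C 2.0 kernel source (illustrative name). Every work-item in the
// work-group must reach the call to work_group_scan_inclusive_add.
const char* kGroupScanKernelSource = R"CLC(
__kernel void group_scan(__global const int* in, __global int* out) {
    size_t gid = get_global_id(0);
    out[gid] = work_group_scan_inclusive_add(in[gid]);
}
)CLC";

// Hypothetical CPU emulation of the kernel's result for a 1D launch:
// the running sum restarts at every work-group boundary.
std::vector<int> emulate_group_scan(const std::vector<int>& in,
                                    std::size_t local_size) {
    assert(local_size > 0);
    std::vector<int> out(in.size());
    for (std::size_t start = 0; start < in.size(); start += local_size) {
        const std::size_t end = std::min(in.size(), start + local_size);
        int running = 0;
        for (std::size_t i = start; i < end; ++i) {
            running += in[i];
            out[i] = running;
        }
    }
    return out;
}
```

Because the result is per work-group, a scan over a whole buffer still needs a step that combines the group totals, for example a second pass over the per-group sums.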

eclipse0922