I am aware of:

https://github.com/lsegal/barracuda

Which hasn't been updated since 01/11

And

http://rubyforge.org/projects/ruby-opencl/

Which hasn't been updated since 03/10.

Are these projects dead? Or have they simply not changed because they're functioning, and OpenCL/Ruby haven't changed since then? Is anybody using these projects? Any luck?

If not, can you recommend another OpenCL gem for Ruby? Or how is this sort of call usually done? Just call raw C from Ruby?

Nakilon
Abraham P

3 Answers

You can try opencl_ruby_ffi. It's actively developed (by a colleague of mine) and works well with OpenCL version 1.2. OpenCL 2.0 support should also be available soon.

sudo gem install opencl_ruby_ffi

On the Khronos forum you can find a quick example that shows how it works:

require 'opencl_ruby_ffi'

# select the first platform/device available
# improve it if you have multiple GPUs on your machine
platform = OpenCL::platforms.first
device = platform.devices.first

# prepare the source of GPU kernel
# this is not Ruby but OpenCL C
source = <<EOF
__kernel void addition(float2 alpha, __global const float *x, __global float *y) {
  size_t ig = get_global_id(0);
  y[ig] = (alpha.s0 + alpha.s1 + x[ig])*0.3333333333333333333f;
}
EOF

# configure OpenCL environment, refer to OCL API if necessary
context = OpenCL::create_context(device)
queue = context.create_command_queue(device, :properties => OpenCL::CommandQueue::PROFILING_ENABLE)

# create and compile the OpenCL C source code
prog = context.create_program_with_source(source)
prog.build

# allocate CPU (=RAM) buffers and 
# fill the input one with random values
a_in = NArray.sfloat(65536).random(1.0)
a_out = NArray.sfloat(65536)

# allocate GPU buffers matching the CPU ones
b_in = context.create_buffer(a_in.size * a_in.element_size, :flags => OpenCL::Mem::COPY_HOST_PTR, :host_ptr => a_in)
b_out = context.create_buffer(a_out.size * a_out.element_size)

# create a constant pair of floats
f = OpenCL::Float2::new(3.0,2.0)

# run kernel 'addition' over 65536 work-items, in work-groups of 128
event = prog.addition(queue, [65536], f, b_in, b_out, 
                      :local_work_size => [128])
# Or, if you want to be more OpenCL-like:
# k = prog.create_kernel("addition")
# k.set_arg(0, f)
# k.set_arg(1, b_in)
# k.set_arg(2, b_out)
# event = queue.enqueue_NDrange_kernel(k, [65536],:local_work_size => [128])

# tell OCL to transfer the content of GPU buffer b_out
# to the CPU memory (a_out), but only after `event` (= kernel execution)
# has completed
queue.enqueue_read_buffer(b_out, a_out, :event_wait_list => [event])

# wait for everything in the command queue to finish
queue.finish
# now a_out contains the result of the addition performed on the GPU

# add some cleanup here ...

# verify that the computation went well
diff = (a_in - a_out*3.0)
65536.times { |i|
  raise "Computation error #{i} : #{diff[i]+f.s0+f.s1}" if (diff[i]+f.s0+f.s1).abs > 0.00001
}
puts "Success!"
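
To make it clearer what the kernel computes and what the final check verifies, here is a CPU-only sketch in plain Ruby. It does not use opencl_ruby_ffi or a GPU at all: a loop stands in for `get_global_id(0)`, plain Arrays stand in for the NArray buffers, and the variable names (`n`, `local_size`, `alpha`, `x`, `y`) are illustrative assumptions. It also divides by 3.0 in double precision rather than multiplying by the single-precision constant, so values will differ slightly from the GPU result.

```ruby
# CPU-only sketch of the 'addition' kernel; no OpenCL or GPU involved.
n          = 65_536      # global work size used in the example
local_size = 128         # local work size used in the example
alpha      = [3.0, 2.0]  # plays the role of the float2 argument f

x = Array.new(n) { rand }  # input values in [0, 1), like a_in
y = Array.new(n, 0.0)      # output values, like a_out

# On the GPU each work-item handles one index; here a loop visits them all.
n.times do |ig|
  y[ig] = (alpha[0] + alpha[1] + x[ig]) / 3.0
end

# The launch is split into n / local_size work-groups.
work_groups = n / local_size

# Size of each buffer in bytes: single-precision floats take 4 bytes.
buffer_bytes = n * 4

# Same identity the original script checks:
# x - 3*y + alpha.s0 + alpha.s1 should be close to zero for every element.
max_error = n.times.map { |i| (x[i] - y[i] * 3.0 + alpha[0] + alpha[1]).abs }.max

puts "work-groups: #{work_groups}, bytes per buffer: #{buffer_bytes}"
puts "max error: #{max_error}"
```

In the OpenCL version, the values you would read out are the elements of `a_out` after `queue.finish`, which correspond to `y` here.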
Kevin
  • Is it possible to elaborate a bit more about what is happening here? Where can I read out the actual values that were added, etc.? – Automatico Feb 01 '15 at 17:13
  • I've commented the source code. The result of the addition is retrieved from GPU memory by the `queue.enqueue_read_buffer` operation. There are plenty of (C) OpenCL tutorials available on the web, and it should be fairly easy to translate them to Ruby once you get the gist of the API. – Kevin Feb 01 '15 at 17:41
You may want to package whatever C functionality you would like as a gem. This is pretty straightforward, and this way you can wrap all your C logic in a specific namespace that you can reuse in other projects.

http://guides.rubygems.org/c-extensions/
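
As a rough illustration of what "calling raw C from Ruby" can look like without compiling an extension, here is a minimal sketch using Fiddle from Ruby's standard library. This is a different technique from the gem-packaged C extension described above. It assumes a Unix-like system where `Fiddle.dlopen(nil)` exposes the C library's symbols, and it calls `strlen` purely as a stand-in for a real C function.

```ruby
require 'fiddle'

# Open a handle to the running process; on most Unix-like systems this
# gives access to the symbols of the C library (an assumption of this sketch).
libc = Fiddle.dlopen(nil)

# Describe the C signature: size_t strlen(const char *s);
strlen = Fiddle::Function.new(
  libc['strlen'],
  [Fiddle::TYPE_VOIDP],
  Fiddle::TYPE_SIZE_T
)

# Ruby strings are passed as pointers to their bytes.
len = strlen.call("OpenCL")
puts "strlen returned #{len}"
```

A compiled extension remains the better fit when there is substantial C logic to wrap; this style mainly suits calling a handful of existing library functions.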

jmontross
If you want to do high-speed calculations on a GPU, Cumo / NArray is a good choice. Cumo has the same interface as NArray, although it uses CUDA rather than OpenCL.

https://github.com/sonots/cumo

kojix2