
How do I use OpenCL (for GPU compute) with Hadoop?

My data set resides in HDFS. I need to compute 5 metrics, 2 of which are compute-intensive, so I want to compute those 2 metrics on the GPU using OpenCL and the remaining 3 with Java MapReduce code on Hadoop.

How can I pass data from HDFS to the GPU? Or, how can my OpenCL code access data from HDFS?

How can I trigger OpenCL code from my Java MapReduce code?

It would be great if someone could share some sample code.

  • As isti_spl mentions below, there are a number of options for accessing the GPU from Java (APARAPI, RootBeer, JOCL, or just plain old JNI), though they each come with their own idiosyncrasies. I've recently published work on using Hadoop with GPUs, and would be very interested in learning about the metrics you are computing as evaluation for my work. There has been other previous work on different MapReduce frameworks on GPUs, though most/all do not integrate with Hadoop/HDFS. I may also be able to help with accelerating your Hadoop jobs if you contact me at jmaxg3@gmail.com. – agrippa May 13 '13 at 16:45

1 Answer


One can use JogAmp's JOCL to invoke OpenCL from Java; it is basically a wrapper over the native OpenCL libraries. You first need to read the data using the Java/Hadoop libraries, transfer it into CLBuffers (Java objects wrapping the buffers used to communicate with OpenCL), copy it to the GPU, invoke the kernel, and copy the results back from the GPU into your buffers. Check the JOCL examples; a sketch of that round trip follows below.
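
A minimal sketch of the JOCL round trip (the kernel source, class name, and sizes here are placeholders for illustration, not taken from the question):

```java
import java.nio.FloatBuffer;

import com.jogamp.opencl.CLBuffer;
import com.jogamp.opencl.CLCommandQueue;
import com.jogamp.opencl.CLContext;
import com.jogamp.opencl.CLKernel;
import com.jogamp.opencl.CLMemory.Mem;
import com.jogamp.opencl.CLProgram;

public class JoclSquares {

    // Trivial OpenCL C kernel: squares each input element.
    private static final String SRC =
        "kernel void square(global const float* in, global float* out, int n) {" +
        "  int i = get_global_id(0);" +
        "  if (i < n) out[i] = in[i] * in[i];" +
        "}";

    public static float[] square(float[] data) {
        CLContext context = CLContext.create();            // default platform, all devices
        try {
            CLCommandQueue queue = context.getMaxFlopsDevice().createCommandQueue();
            CLProgram program = context.createProgram(SRC).build();
            CLKernel kernel = program.createCLKernel("square");

            int n = data.length;
            CLBuffer<FloatBuffer> in  = context.createFloatBuffer(n, Mem.READ_ONLY);
            CLBuffer<FloatBuffer> out = context.createFloatBuffer(n, Mem.WRITE_ONLY);
            in.getBuffer().put(data).rewind();              // fill the host-side buffer

            kernel.putArgs(in, out).putArg(n);
            queue.putWriteBuffer(in, false)                 // host -> device
                 .put1DRangeKernel(kernel, 0, n, 0)         // 0 = let the driver pick the work-group size
                 .putReadBuffer(out, true);                 // device -> host, blocking

            float[] result = new float[n];
            out.getBuffer().get(result);
            return result;
        } finally {
            context.release();                              // frees all OpenCL resources
        }
    }
}
```

A mapper or reducer would hand its buffered values to square() (or whatever the real metric kernel is) and write the result back out through the usual Hadoop context.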

Another alternative is the Aparapi library. There the data-processing kernel is a plain Java method (with some restrictions), and the framework translates the Java bytecode to OpenCL, so the OpenCL part is hidden from the programmer. Of course, not everything can be translated from Java to OpenCL; check their examples.
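
For comparison, a sketch of the same computation with Aparapi (the import below is the older com.amd.aparapi package name; newer releases use com.aparapi):

```java
import com.amd.aparapi.Kernel;
import com.amd.aparapi.Range;

public class AparapiSquares {

    public static float[] square(final float[] in) {
        final float[] out = new float[in.length];

        // run() is translated from Java bytecode to an OpenCL kernel at runtime;
        // if translation or GPU execution fails, Aparapi falls back to a Java thread pool.
        Kernel kernel = new Kernel() {
            @Override
            public void run() {
                int i = getGlobalId();
                out[i] = in[i] * in[i];
            }
        };

        kernel.execute(Range.create(in.length));
        kernel.dispose();
        return out;
    }
}
```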

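To tie either variant into a Hadoop job, one possible pattern (purely illustrative, not from the libraries themselves; gpuMetric() is a hypothetical stand-in for the JOCL or Aparapi code above) is to buffer records in map() and offload the whole batch to the GPU in cleanup():

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class GpuMetricMapper extends Mapper<LongWritable, Text, Text, FloatWritable> {

    private final List<Float> samples = new ArrayList<Float>();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // HDFS records arrive here as text; parse and buffer them on the heap.
        samples.add(Float.parseFloat(value.toString().trim()));
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        float[] input = new float[samples.size()];
        for (int i = 0; i < input.length; i++) {
            input[i] = samples.get(i);
        }
        // Run the OpenCL kernel once per mapper over the whole buffered batch.
        float result = gpuMetric(input);
        context.write(new Text("metric1"), new FloatWritable(result));
    }

    private float gpuMetric(float[] input) {
        // Placeholder: call into JOCL or an Aparapi Kernel here (see the sketches above).
        float acc = 0f;
        for (float v : input) {
            acc += v * v;
        }
        return acc;
    }
}
```
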
isti_spl