To get a rough feeling for how much OpenCL is going to help me, I ran a matrix-matrix multiplication test, since this kind of basic linear algebra will be my primary use. The code I used can be found here: http://vasanthexperiments.wordpress.com/2011/11/20/aparapi-java-matrix-multiplication-example/. (The test computes the product of two 1024x1024 matrices.)
I was quite disappointed by the results: the GPU speedup over serial computation on the CPU was only marginal (less than 2x), and when I made Aparapi use the CPU (where it runs in parallel), the CPU was even faster than the GPU.
During execution, the graphics card is under full load, so I don't think there are any communication issues.
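For reference, the serial CPU baseline I'm comparing against looks roughly like the sketch below. This is not the code from the linked post; the class name `MatMulBaseline` and the helper `flops` are made up for illustration, and the flat row-major layout is an assumption. It also prints the achieved GFLOP/s (an n x n product costs about 2·n³ floating-point operations), which makes it easier to compare against the numbers quoted for a given card.

```java
import java.util.Random;

// Hypothetical serial baseline, not the code from the linked example.
class MatMulBaseline {

    // Multiply two n x n matrices stored row-major in flat arrays.
    // The i-k-j loop order keeps the inner loop walking memory contiguously.
    static float[] multiply(float[] a, float[] b, int n) {
        float[] c = new float[n * n];
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < n; k++) {
                float aik = a[i * n + k];
                for (int j = 0; j < n; j++) {
                    c[i * n + j] += aik * b[k * n + j];
                }
            }
        }
        return c;
    }

    // Operation count for an n x n product: n^3 multiplies plus n^3 adds.
    static long flops(int n) {
        return 2L * n * n * n;
    }

    public static void main(String[] args) {
        int n = 1024;
        Random rnd = new Random(42);
        float[] a = new float[n * n];
        float[] b = new float[n * n];
        for (int i = 0; i < n * n; i++) {
            a[i] = rnd.nextFloat();
            b[i] = rnd.nextFloat();
        }

        long start = System.nanoTime();
        float[] c = multiply(a, b, n);
        double seconds = (System.nanoTime() - start) / 1e9;

        // Printing c[0] keeps the JIT from discarding the computation.
        System.out.printf("n=%d: %.3f s, %.2f GFLOP/s (c[0]=%.3f)%n",
                n, seconds, flops(n) / seconds / 1e9, c[0]);
    }
}
```

A single timed run like this still includes JIT warm-up, so the number is only a rough guide.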
My hardware configuration:
Intel Core i7-2670QM
AMD Radeon HD 7610M
16 GB RAM
Since I'm completely new to GPGPUs, I don't know what to expect.
1. Is it likely that my setup is somehow screwed? If so, where should I look?
2. Or am I simply expecting too much from an entry-level graphics card? If so, how do different graphics card models scale with this kind of problem? Which specs should I look for if I wanted faster hardware?
EDIT:
OK, so I just reran the program with a 10x10 matrix.
Unsurprisingly, the CPU needed less than 1 ms.
However, the GPU needed more than 1600 ms, so there is definitely something wrong with either Aparapi, OpenCL, or my hardware (the drivers should be up to date). Does anyone have an idea where I should look?
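One thing I have not ruled out yet is that the 1600 ms is a one-time startup cost rather than the cost of the multiplication itself. I'm only guessing at possible causes (JIT warm-up, OpenCL initialization, kernel compilation); none of them are confirmed. A minimal sketch for checking this is to time several consecutive runs separately and compare the first run with the rest. The class `WarmupTiming` and its methods are hypothetical names, and the `Runnable` below is a CPU placeholder; in the real test it would wrap the `kernel.execute(...)` call from the linked example.

```java
import java.util.Arrays;

// Hypothetical timing harness: separates the first run from later runs.
class WarmupTiming {

    // Run the task `runs` times and record each run's duration in nanoseconds.
    static long[] timeRuns(Runnable task, int runs) {
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - t0;
        }
        return nanos;
    }

    // Median of all runs except the first (requires at least two runs).
    static long medianAfterFirst(long[] nanos) {
        long[] rest = Arrays.copyOfRange(nanos, 1, nanos.length);
        Arrays.sort(rest);
        return rest[rest.length / 2];
    }

    public static void main(String[] args) {
        int n = 10;
        float[] a = new float[n * n];
        float[] b = new float[n * n];
        float[] c = new float[n * n];
        Arrays.fill(a, 1f);
        Arrays.fill(b, 2f);

        // Placeholder workload: a serial 10x10 product on the CPU.
        // Assumption: in the real test this would be the Aparapi kernel call.
        Runnable task = () -> {
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    float sum = 0f;
                    for (int k = 0; k < n; k++) {
                        sum += a[i * n + k] * b[k * n + j];
                    }
                    c[i * n + j] = sum;
                }
            }
        };

        long[] nanos = timeRuns(task, 10);
        System.out.printf("first run: %.3f ms%n", nanos[0] / 1e6);
        System.out.printf("median of later runs: %.3f ms%n",
                medianAfterFirst(nanos) / 1e6);
    }
}
```

If the first run accounts for almost all of the time and later runs are fast, the problem is fixed overhead rather than a broken setup; if every run takes around 1600 ms, something else is going on.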