Having stumbled over the forum thread "dot product faster on cpu than on gpu using OpenCL", I was reminded again that there are cases which look like they're made for OpenCL*, but where using OpenCL does not give us any gain. For example, I have a kmeans implementation using pyopencl which is several times faster than simple Python code, but still several times slower than the scipy function for kmeans.
So how do you decide when to use OpenCL?
- What graphics card do you need? How much 'better than the CPU' does the graphics card have to be? Is a Quadro FX 580 vs. an i7 860 enough?
- How big does the problem have to be? Do you need millions of multiplications to gain something, or are a few hundred enough?
- How much optimization of even a 'simple' algorithm like kmeans or the dot product is necessary to make OpenCL worthwhile?
Or is it one of these triangle cases, where you can only (or have to) choose two of the three corners to make it work?
          problem size
               /\
              /  \
             /    \
            /      \
           /________\
    GPU/CPU          optimization
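To make the 'problem size' and 'GPU/CPU' corners a bit more concrete, here is a rough break-even sketch in Python. It is only a toy model, not a measurement: I am assuming that the CPU time grows linearly with the number of items, and that the GPU time is a fixed overhead (kernel launch, setup) plus a per-item cost for both the computation and the host-to-device transfer. The function names and all the numbers in the usage note are made up for illustration.

```python
def cpu_time(n, cpu_rate):
    """Seconds for the CPU to process n items at cpu_rate items/second."""
    return n / cpu_rate


def gpu_time(n, gpu_rate, transfer_rate, overhead):
    """Seconds for the GPU under the assumed model:
    fixed overhead + per-item compute + per-item transfer."""
    return overhead + n / gpu_rate + n / transfer_rate


def break_even_size(cpu_rate, gpu_rate, transfer_rate, overhead):
    """Problem size at which GPU and CPU take equally long.

    Returns None if, in this model, the GPU never catches up because
    its per-item cost (compute + transfer) is not below the CPU's.
    """
    cpu_cost_per_item = 1.0 / cpu_rate
    gpu_cost_per_item = 1.0 / gpu_rate + 1.0 / transfer_rate
    if gpu_cost_per_item >= cpu_cost_per_item:
        return None
    # Solve n * cpu_cost = overhead + n * gpu_cost for n.
    return overhead / (cpu_cost_per_item - gpu_cost_per_item)
```

With hypothetical rates such as `break_even_size(1e9, 1e10, 2e9, 1e-4)` the model gives a break-even of roughly 250,000 items; if the transfer rate drops to `5e8` it returns `None`, i.e. the GPU never wins no matter how large the problem gets. That second case is my guess at what happens with something like a plain dot product, where each item is touched only once and the copy dominates, but real numbers would have to come from timing the actual hardware.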
I know that the language I used in the title and the questions is a bit too bold. I'll change it if I can think of more suitable wording.
Thanks.
* simple matrix operations like the dot product, kmeans or matrix multiplication