I am trying to solve an entropy problem on a GPU using simplex optimisation. Because each simplex iteration depends on the result of the previous one, I do not see a way to make my algorithm parallel.
However, while researching PyOpenCL and NumbaPro, I found that OpenCL supports a programming model called SIMD (single instruction, multiple data). Does NumbaPro offer the same?
So far I have tried jit, autojit and vectorize on parts of the code, but I saw no performance improvement.
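To make the question concrete, here is a minimal sketch of the pattern I mean. It is not my actual code. It assumes the objective is a Shannon entropy summed over a probability vector, and the names `_entropy_term`, `entropy_term` and `entropy` are hypothetical. The idea is that the simplex iterations stay sequential, while the element-wise work inside one objective evaluation is the data-parallel part. It uses the open-source Numba `vectorize` decorator and falls back to NumPy's `np.vectorize` if Numba is not installed.

```python
import math
import numpy as np

try:
    from numba import vectorize  # compiles a scalar function into a NumPy ufunc
except ImportError:
    vectorize = None  # Numba missing: fall back to np.vectorize below


def _entropy_term(p):
    # Contribution -p*log(p) of one probability; taken as 0 when p == 0.
    return -p * math.log(p) if p > 0.0 else 0.0


if vectorize is not None:
    # Same scalar kernel applied to every element (the SIMD-style part).
    entropy_term = vectorize(["float64(float64)"])(_entropy_term)
else:
    # Plain Python loop under the hood; only keeps the sketch runnable.
    entropy_term = np.vectorize(_entropy_term, otypes=[np.float64])


def entropy(p):
    # One objective evaluation: element-wise kernel followed by a reduction.
    return float(entropy_term(p).sum())


if __name__ == "__main__":
    probabilities = np.array([0.25, 0.25, 0.25, 0.25])
    print(entropy(probabilities))
```

As far as I understand, Numba's `vectorize` also accepts a `target` argument (for example `target='cuda'`), but for small arrays the transfer and launch overhead may well outweigh any gain, which could explain why I see no improvement.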