I have a MATLAB simulation which updates an array:
Array = zeros(1,1000);
as follows:
for j = 1:100000
    Array = Array + rand(1,1000);   % semicolon added, or each iteration prints the array
end
My question is the following: the loop is sequential, so the iterations cannot be parallelized over j, but the different slots of the array are updated independently of one another. Naturally, MATLAB performs array operations such as this one in parallel, using all the cores of the CPU.
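For instance, one way to make that independence explicit is to sum whole blocks of random rows at once, so each pass through the loop is a single multithreaded array operation (a sketch; blockSize is a placeholder I picked, and it must divide 100000 evenly):

Array = zeros(1,1000);
blockSize = 1000;                                   % placeholder block size
for j = 1:100000/blockSize
    Array = Array + sum(rand(blockSize,1000), 1);   % one vectorized update per block
end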
I wish to get the same calculation performed on my NVIDIA GPU, in order to speed it up (utilizing its much larger number of cores).
The problem is that naively doing this:
tic
Array = gpuArray(zeros(1,1000));
for j = 1:100000
    Array = Array + gpuArray(rand(1,1000));
end
toc
results in the calculation time being 8 times longer!
a. What am I doing wrong?
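My own guess is that calling gpuArray(rand(1,1000)) inside the loop generates each random vector on the CPU and then copies it to the GPU, so the timing is dominated by 100000 host-to-device transfers. If that is right, then something like the following, which uses the 'gpuArray' type option that zeros and rand accept in the Parallel Computing Toolbox, should keep everything on the device (a sketch, untested on my machine):

tic
Array = zeros(1,1000,'gpuArray');             % allocate directly on the GPU
for j = 1:100000
    Array = Array + rand(1,1000,'gpuArray');  % generate the randoms on the GPU, no transfer
end
Array = gather(Array);                        % single device-to-host copy at the end
toc

(As I understand it, GPU calls run asynchronously, so the gather before toc also forces the GPU to finish and keeps the timing honest; alternatively wait(gpuDevice) could be called before toc.)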
Update: b. Can someone perhaps provide a different simple example for which GPU computing is beneficial? My aim is to understand how I can utilize it in MATLAB for very "heavy" stochastic simulations (multiple linear operations on large arrays and matrices).
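For concreteness, the kind of workload I have in mind looks roughly like this (a sketch; the matrix size N and the step count are placeholders):

N = 4000;
A = rand(N, 'gpuArray');                     % large N-by-N random matrix on the GPU
x = rand(N, 1, 'gpuArray');
for step = 1:100
    x = A*x + 0.01*randn(N, 1, 'gpuArray');  % dense linear update plus noise, all on the GPU
    x = x / norm(x);                         % normalize to keep the values bounded
end
result = gather(x);                          % copy back to the host once at the end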