I have a snippet of code that takes a vector A and creates a new vector C by adding A to left-shifted and right-shifted copies of itself (this is based on a centered finite difference formula). For example, if A = [1 2 3 4 5], then C = [0 6 9 12 0]; the end entries are zero because those elements don't have neighbours on both sides.
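To make the definition concrete, here is a minimal sketch of the example above for a single pass, with no timing. It assumes the edge entries are simply left at zero, as described; the variable names match the rest of the snippet.

```matlab
% Three-point sum: each interior entry is left neighbour + centre + right neighbour.
A = [1 2 3 4 5];
N = length(A);
C = zeros(1, N);                            % edge entries stay zero
C(2:N-1) = A(1:N-2) + A(2:N-1) + A(3:N);    % interior entries only
assert(isequal(C, [0 6 9 12 0]));           % matches the example in the text
```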
I wrote two versions: one loops over C element by element, and the other processes the whole array at once. I measure the "vectorised" version to be around 50 times slower than the loop version. I was expecting the vectorised version to be faster, but that does not seem to be the case. What am I missing?
A = [1 2 3 4 5];
N = length(A);
C1 = zeros([1, N]);
C2 = zeros([1, N]);
%%% Loop version, element by element %%%
tic
for ntimes = 1:1e6
    for m = 2:(N-1)
        C1(m) = A(m+1) + A(m-1) + A(m);
    end
end
t1 = toc;
disp(['Time taken for loop version = ',num2str(t1)])
%%% Vector version, process entire array at once %%%
tic
for ntimes = 1:1e6
    C2(2:(N-1)) = A(3:N) + A(1:(N-2)) + A(2:(N-1));
end
t2 = toc;
disp(['Time taken for vector version = ',num2str(t2)])