How to speed up this for loop containing kronecker product?

Question

I want to speed up the following for loop.

% Use random matrices for testing. 
% Elapsed time of the following code is around 19 seconds. 
% So, when N is large, it's very slow.

n= 800; N =100; k =10;
x = rand(n,N); S = rand(k,N); H = 0;
for i = 1: size(x,2)
  X = x(:,i)*x(:,i)' ;
  DW = diag( S(:,i) ) - S(:,i)*S(:,i)' ;  
  H = H + kron(X,DW);
end

My attempt:

kron(X, DW) = kron(x(:,i)*x(:,i)' ,diag(S(:,i))) - 
              kron(x(:,i)*x(:,i)', S(:,i)*S(:,i)');

We can use and to rewrite the above equation.

kron(x(:,i)*x(:,i)',diag(S(:,i))) = 
  kron(x(:,i), sqrt( diag(S(:,i))))* 
  kron(x(:,i)*x(:,i)',diag(S(:,i)))' ;

(since S is nonnegative then we can take sqrt )

kron(x(:,i)*x(:,i)', S(:,i)*S(:,i)') = 
  kron( x(:,i), S(:,i))*
  kron( x(:,i), S(:,i))'

Therefore, we only need to compute kron(x(:,i), sqrt( diag(S(:,i)))) and kron(x(:,i), S(:,i)).

Here are the codes:

x = rand(n,N); S = rand(k,N);
H1= 0; K_D= zeros(n*k, k*1, N); K_S = zeros(n*k,N); 

%K_D records kron( x(:,i), sqrt (diag(S(:,i)) ) ), K_S records kron(x(:,i), S(:,i));

for i = 1:N
      D_half = sqrt( diag(S(:,i)));
      K_D(:,:,i) = kron( x(:,i),D_half);
      K_S(:,i) =  reshape (S(:,i)*x(:,i)',[],1);
end

K_D = reshape(K_D,n*k,[]);
H = K_D*K_D' - K_S*K_S';

The new codes save much time. Elapsed time is around 1 second. But I still want to speed up it.

Can someone help me speed up the above code (my attempt)? Or do someone have a new idea/method to speed up my original problem? Thank you very much!

Probably you might have considered already. Do you really need double precision? making all `rand` and `zeros` single precision initialization with `'single'` as last argument should cut the computation time in half. — Praveen, Apr 09 '21 at 13:24
If your matrix size is larger than this N~800, try using `sparse` and triangular sparse matrices with `tril`. You might see performance improvement for symmetric and diagonal matrices computation. For this matrix size it is not worth it I think. — Praveen, Apr 09 '21 at 13:29

rahnema1 · Answer 1 · 2021-04-14T06:26:18.120

The expression H1 = A(:) .* B(:).' produces the same result as H = kron(A,B) except that the order of elements are changed. So the expected result can be computed without using kron:

X = reshape(reshape(x, n, 1, N) .* reshape(x, 1, n, N), [], N);
S1 = -reshape(reshape(S, k, 1, N) .* reshape(S, 1, k, N), [], N);
S1(1:k+1:end, :) = reshape(S1 (1:k+1:end, :), size(S)) + S;
S1 = reshape(S1,  [], N);
H1 = X * S1.';
H = reshape(permute(reshape(permute(reshape(H1,[],k,k),[3 2 1]),k,n*k,[]),[2,1,3]),n*k,n*k);

If you interested in the result of the summation regardless of the order of the elements computing H1 is sufficient and it results in more than 3X speed-up comparing to your second method. However you can use reshape and permute to recover the order of elements that brings nearly 0.3X speed-up.

If H is computed in a loop and its size is constant across loops you can use a pre-computed index for re-arrangement of the elements:

idx = reshape(permute(reshape(permute(reshape(1:(n*k)^2,[],k,k),[3 2 1]),k,n*k,[]),[2,1,3]),n*k,n*k);
for ....
    X = ...
    ....
    H = H1(idx);
end

Thank you very much! I ran your code. It seemed that the H in your code was different from the H that I got using my code. But the sum of all the entries are the same. — learner, Apr 14 '21 at 02:10
@learner I updated the answer. It requires two permute operations. — rahnema1, Apr 14 '21 at 06:15

How to speed up this for loop containing kronecker product?

1 Answers1