
I have the following code, which needs to calculate the exponential of a 3D matrix. It works fine, but I wonder if there is any way to make it faster. (It currently needs more than 2 seconds; I am trying to make it run in under half a second if possible.)

Result = zeros(200,50,300);
for i=1:30
    delta = i*randn(200,50,300);
    X = exp(1i*2*pi*delta);
    Result = Result + X;
end

Any help would be appreciated. Thanks in advance.

Alithewise
  • Did you profile the code? In that case, how much time did the `exp` take compared to the other lines? Maybe the problem is not in the `exp`. – mpaskov Feb 06 '17 at 20:09
  • I doubt this will be any faster with vectorization. The memory consumption alone is going to be quite big. For calculations involving large memory, `for` loops are better. – rayryeng Feb 06 '17 at 20:11
  • In your code, `i` is not the same as `1i`. I hope you already know that! – Sardar Usama Feb 06 '17 at 20:12
  • Random number generation is almost certainly the most expensive part of this calculation. – horchler Feb 06 '17 at 21:36

1 Answer


First of all, I don't think that this sequence of computations can be made a lot faster. The following might be a tiny bit faster, but certainly not by the amount you're looking for:

dim    = [200,50,300]; % given
N      = prod(dim);    % total number of samples
M      = 30;           % number of iterations in for-loop
phase  = bsxfun(@times, randn(N,M), [1:M]); % scaled phases
Result = reshape(sum(exp(1i*2*pi*phase),2), dim); % sum of exp and reshape

EDIT 1 Start:

As @horchler pointed out in the comments below, this method is actually slower than the original method on R2016b. He also suggested using a faster random number generation scheme, which I tried and which gave a significant improvement. A similar improvement can also be had by eliminating some of the temporary variables.

s = RandStream('dsfmt19937','Seed',1); % faster generator than the default 'mt19937ar'
Result = zeros(200,50,300);            % preallocate the accumulator
for i=1:30
    Result = Result + exp(1i*2*pi*i*randn(s,200,50,300)); % no temporaries
end

My timing results for various stages of optimization, on R2016b, are as follows:

  • Original code: 4.3 s
  • My original proposal: 4.5 s
  • Original code with faster random number generation (`'dsfmt19937'`): 3.8 s
  • Above with temporary variables removed: 3.2 s

Other random number generation schemes, such as `'shr3cong'`, can also be tried in an effort to speed things up further; a rough timing comparison is sketched below.
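For reference, here is a rough timing harness built on the loop above that compares the default generator with the two alternatives; the stream names come from MATLAB's RandStream documentation, and the exact numbers will of course vary with machine and release:

% Rough comparison of random-stream generators (timings are machine-dependent)
generators = {'mt19937ar', 'dsfmt19937', 'shr3cong'};
for g = 1:numel(generators)
    s = RandStream(generators{g}, 'Seed', 1);
    tic;
    Result = zeros(200,50,300);
    for i = 1:30
        Result = Result + exp(1i*2*pi*i*randn(s,200,50,300));
    end
    fprintf('%-12s %.2f s\n', generators{g}, toc);
end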

EDIT 1 End

A different approach: Let me suggest a way of generating the desired output that is computed quite differently, but has the same statistical properties as your result.

Result = sqrt(15) * (randn(200,50,300) + 1i*randn(200,50,300));

Let's try to justify this now. Firstly, we can argue that since the output is a sum of 30 random processes, by the central limit theorem (CLT) the output will be approximately Gaussian distributed. The CLT only applies in a loose sense here, since 30 is not infinite and the processes are not identically distributed, but as we will see shortly it is still a very good approximation. Moreover, the real and imaginary terms are also independent, due to the sum over 30 independent complex numbers. I am not going to try to prove this here, but we will do some statistical checks.

Once we settle on independent Gaussian distributions, the analysis becomes much simpler. A Gaussian distribution is fully specified by just two parameters: its mean and its variance. Let's estimate them individually:

Mean: Since the phases are randomly distributed and cover a region much larger than 2*pi, the means for both the real and imaginary terms are 0.

Variance: The variance of the sine/cosine of a widely spread random phase is 0.5, so the variance of the sum of 30 such independent sines/cosines is 15. This is the reason for the sqrt(15) term in the formula.
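Both claims are easy to check numerically. Here is a small sanity-check sketch (the sample size of 1e6 is arbitrary):

% Sanity check: a phase whose spread covers one or more full cycles has
% mean(cos) ~ 0 and var(cos) ~ 0.5 (the same holds for sin)
phi = randn(1e6,1);                                 % matches the narrowest term, i = 1
fprintf('mean(cos) = %.4g\n', mean(cos(2*pi*phi))); % ~ 0
fprintf('var(cos)  = %.4g\n', var(cos(2*pi*phi)));  % ~ 0.5
% Summing 30 independent such terms gives a variance of about 30*0.5 = 15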

Statistical analysis: To verify that all the above assumptions and approximations are reasonable, let's perform some statistical analysis.

Firstly, let's check the distribution:

figure;
xGrid = (-15 : 0.1 : 15);
histogram(real(Result(:)), xGrid, 'Normalization','pdf', 'EdgeColor', 'None');
hold on;
plot(xGrid, normpdf(xGrid, 0, sqrt(15)), 'r', 'LineWidth', 2);
legend({'Simulated histogram', 'Gaussian pdf'});
title('Distribution of the real term');

[Figure: simulated histogram of the real part of Result, overlaid with the Gaussian pdf with mean 0 and variance 15]

The histogram of the imaginary term (not shown here) looks identical. This test supports the assumption of a Gaussian distribution with zero mean and variance 15.
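As an optional quantitative check (this assumes the Statistics and Machine Learning Toolbox, which normpdf above already requires), a Kolmogorov-Smirnov test can supplement the visual comparison. With roughly three million samples the test is extremely sensitive, so the size of the KS statistic is more informative than the accept/reject flag:

% KS test of the real part against N(0,15)
x = real(Result(:)) / sqrt(15);   % standardize so the reference is N(0,1)
[h, p, ksstat] = kstest(x);       % h = 0 means "not rejected" at the 5% level
fprintf('KS statistic = %.4g, p-value = %.3g\n', ksstat, p);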

Finally, let's check the independence between the real and imaginary terms.

covar = cov(real(Result(:)), imag(Result(:)));
disp(covar);
%   14.9968    0.0036
%    0.0036   14.9936    

Two things can be noticed: (1) as stated earlier, the variance of the real and imaginary terms is about 15 each; (2) the covariance between the real and imaginary terms is much smaller than the individual variances. This supports the claim that they are independent.

aksadv
  • +1 Interesting answer. In R2016b, your first approach getting rid of the `for` loop is about 20% slower. Vectorization doesn't always win. – horchler Feb 07 '17 at 22:58
  • Also consider using a different [base random number generation scheme](https://www.mathworks.com/help/matlab/ref/randstream.list.html), for more speed, e.g., `'dsfmt19937'` or `'shr3cong'`: `s = RandStream('dsfmt19937','Seed',1);` `randn(s,200,50,300)`. – horchler Feb 07 '17 at 23:03
  • @horchler: I have incorporated your suggestions in my answer. Thanks! – aksadv Feb 08 '17 at 08:44
  • Great. Just FYI, `'dsfmt19937'` is definitely not "simpler" than the default `'mt19937ar'` scheme; it's just optimized for double precision and SIMD hardware. The `'shr3cong'` scheme is actually simpler and even faster. – horchler Feb 08 '17 at 17:12
  • @horchler: Thanks for pointing that out. I've changed "simpler" to "faster" and added that other RN generation schemes can speed up the code further. – aksadv Feb 08 '17 at 17:53