
I wanted to compute a finite difference of a function in MATLAB, i.e. the change in the function value under a perturbation of a single coordinate. In other words,

f(x + eps * e_i) - f(x)

is what I want to compute. Note that it's very similar to the first-order numerical partial derivative (a forward difference in this case):

(f(x + eps * e_i) - f(x)) / eps
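For illustration, the quantity above can be sketched per coordinate like this (a Python/NumPy sketch of the idea, not the MATLAB code below; `forward_differences` is my own hypothetical name):

```python
import numpy as np

def forward_differences(f, x, eps):
    """Return f(x + eps*e_i) - f(x) for every coordinate i of x."""
    fx = f(x)                          # reference value, computed once
    diffs = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        e = np.zeros_like(x, dtype=float)
        e.flat[i] = eps                # perturb only coordinate i
        diffs.flat[i] = f(x + e) - fx
    return diffs

# Example: for f(x) = sum(x^2), the forward difference in coordinate i
# is exactly 2*x_i*eps + eps^2.
x = np.array([1.0, 2.0, 3.0])
d = forward_differences(lambda v: np.sum(v ** 2), x, 1e-3)
```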

Currently I am using for-loops to compute it, but it seems that MATLAB is much slower than I thought. I am doing it as follows:

function [ dU ] = numerical_gradient(W,f,eps)
%compute gradient or finite difference update numerically
[D1, D2] = size(W);
dU = zeros(D1, D2);
for d1=1:D1
    for d2=1:D2
        e = zeros([D1,D2]);
        e(d1,d2) = eps;
        f_e1 = f(W+e);
        f_e2 = f(W-e);
        %numerical_derivative = (f_e1 - f_e2)/(2*eps);
        %dU(d1,d2) = numerical_derivative
        numerical_difference = f_e1 - f_e2;
        dU(d1,d2) = numerical_difference;
    end
end

It seems really difficult to vectorize the above code, because numerical differences follow the definition of the gradient and partial derivatives:

df_dW = [ ..., df_dWi, ...]

where df_dWi holds the other coordinates fixed and only tracks the change in the variable Wi. Thus, I can't just perturb all the coordinates at once.
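A small Python sketch (my own illustration, not part of the question) of why perturbing all coordinates at once fails for a coupled function such as f(x) = (x_1 + x_2)^2:

```python
import numpy as np

f = lambda x: np.sum(x) ** 2            # coupled: coordinates interact
x = np.array([1.0, 2.0])
eps = 0.1

# Correct per-coordinate forward differences: one perturbation at a time.
per_coord = np.array([f(x + eps * np.eye(2)[i]) - f(x) for i in range(2)])

# Tempting shortcut: perturb every coordinate at once. This yields a single
# scalar, and it is not even the sum of the per-coordinate differences,
# because of the cross term between the coordinates.
all_at_once = f(x + eps) - f(x)

# per_coord is approximately [0.61, 0.61]; all_at_once is approximately
# 1.24, while per_coord.sum() is approximately 1.22.
```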

Is there a better way to do this? My intuition tells me that the best way is to implement this not in MATLAB but in some other language, say C, and then have MATLAB call that library. Is that true? Does that mean the best solution is some MATLAB library that does this for me?


I did see:

https://www.mathworks.com/matlabcentral/answers/332414-what-is-the-quickest-way-to-find-a-gradient-or-finite-difference-in-matlab-of-a-real-function-in-hig

but unfortunately, it computes exact derivatives, which isn't what I am looking for. I am explicitly looking for differences, or a "bad approximation" to the gradient.


Since it seems this code is not easy to vectorize (in fact my intuition tells me it's not possible to do so), my only other idea is to implement this finite difference function in C and then have MATLAB call the C function. Is this a good idea? Anyone know how to do this?

I did try reading the following: https://www.mathworks.com/help/matlab/matlab_external/standalone-example.html

but it was too difficult for me to understand because I have no idea what a MEX file is, whether I need to have an arrayProduct.c file as well as a mex.h file, whether I also need a MATLAB file, etc. If there were just a way to download a working example with all the functions they suggest there, plus some instructions to compile it, that would be super helpful. But just reading the html/article like that, it's impossible for me to infer what they want me to do.


For the sake of completeness, it seems Reddit has some comments in its discussion of this:

https://www.reddit.com/r/matlab/comments/623m7i/how_does_one_compute_a_single_finite_differences/

Charlie Parker

2 Answers


Here is a more efficient way of doing so:

function [ vNumericalGrad ] = CalcNumericalGradient( hInputFunc, vInputPoint, epsVal )

numElmnts       = size(vInputPoint, 1);
vNumericalGrad  = zeros([numElmnts, 1]);

refVal = hInputFunc(vInputPoint);

for ii = 1:numElmnts
    % Set the perturbation vector
    refInVal = vInputPoint(ii);
    vInputPoint(ii) = refInVal + epsVal;

    % Compute Numerical Gradient
    vNumericalGrad(ii) = (hInputFunc(vInputPoint) - refVal) / epsVal;

    % Reset the perturbation vector
    vInputPoint(ii) = refInVal;
end

end

This code allocates less memory.

The above code's performance will be totally dominated by the speed of hInputFunc().

The small tricks compared to the original code are:

  • No memory reallocation of e on each iteration.
  • Instead of the vector addition W + e, there are 2 scalar assignments to the array.
  • Decreasing the calls to hInputFunc() by half by computing the reference value outside the loop (this only works for forward / backward differences).
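The same in-place trick can be sketched in Python/NumPy for illustration (assumed names; the answer's MATLAB above is the reference):

```python
import numpy as np

def numerical_gradient(f, x, eps):
    """Forward-difference gradient with no per-iteration array allocation."""
    grad = np.zeros_like(x, dtype=float)
    ref_val = f(x)                     # one reference call, reused every step
    for i in range(x.size):
        old = x.flat[i]
        x.flat[i] = old + eps          # two scalar writes instead of
        grad.flat[i] = (f(x) - ref_val) / eps
        x.flat[i] = old                # ...building a temporary x + e
    return grad

x = np.array([1.0, 2.0, 3.0])
g = numerical_gradient(lambda v: np.sum(v ** 2), x, 1e-6)
# g approximates the exact gradient 2*x = [2, 4, 6] up to O(eps).
```

Note that the input array is mutated and restored in place, which is exactly what avoids the per-iteration allocation.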

Probably this will be very close to C code, unless you can code the function which computes the value (hInputFunc) more efficiently in C.

A full implementation can be found in the StackOverflow Q44984132 repository (it was posted in StackOverflow Q44984132).
See CalcFunGrad( vX, hObjFun, difMode, epsVal ).

Royi
  • I don't understand, why is this code faster than what I have? It ran in 1 second vs 3 seconds for my code (also I had to transpose everything into row vectors for things to work; that might be a particular detail of my code though) – Charlie Parker Mar 29 '17 at 16:44
  • 1
    @CharlieParker, I added the reasons why it is faster. – Royi Mar 29 '17 at 16:50
  • also don't forget that because you called `refVal = hInputFunc(vInputPoint);` at the beginning of the code **once**, you decreased by half the number of times I need to call `hInputFunc`, which might be another reason for the speed-up I see in the code. – Charlie Parker Mar 29 '17 at 19:06
  • 1
    You're right. I added it. Anyhow, I think it is as good as it gets in MATLAB. – Royi Mar 29 '17 at 20:36
  • I am very curious to see what an implementation in C would look like and benchmark it against what you have. – Charlie Parker Mar 29 '17 at 21:25
  • 1
    The difference between C and MATLAB in this case will be in the function you define not in the calculation of the Gradient. If you can make in C the input function much faster than its MATLAB version you'll gain. Otherwise, you won't. – Royi Mar 30 '17 at 05:30
  • in the case where `f` is well vectorized, would you expect an implementation in C to help? For example, in one case I have `f` to be `f = @(x) -prod( (x > edge_start1) .* ( (edge_start1+edge_length1) >= x) ) + -prod( (x > edge_start2) .* ( (edge_start2+edge_length2) >= x) );`. Basically a bunch of boolean comparisons in a vector, then multiplying them together coordinate by coordinate. What do you think of this vs a C implementation? In the end I guess the question boils down to MATLAB's implementation of vectorized code vs mine? Is that what it boils down to? – Charlie Parker Mar 30 '17 at 16:25
  • 1
    @CharlieParker, Actually for this kind of code (Element Wise) if you know how to write C code with SSE / AVX + OpenMP you'll gain nicely. The reason is MATLAB is creating many temporal arrays. They improve it in the latest versions but I still think you can gain nicely. – Royi Mar 30 '17 at 20:07

A way better approach (numerically more stable, no issue of choosing the perturbation hyperparameter, accurate up to machine precision) is to use algorithmic/automatic differentiation. For this you need the MATLAB Deep Learning Toolbox. Then you can use dlgradient to compute the gradient. Below you find the source code corresponding to your example.

Most importantly, you can examine the error and observe that the deviation of the automatic approach from the analytical solution is indeed machine precision, while for the finite difference approach (I chose second-order central differences) the error is orders of magnitude higher. For 100 points and a range of [-10, 10] these errors are somewhat tolerable, but if you play a bit with Rand_Max and n_points you will observe that the errors become larger and larger.

Error of algorithmic / automatic diff. is: 1.4755528111219851e-14
Error of finite difference diff. is: 1.9999999999348703e-01 for perturbation 1.0000000000000001e-01
Error of finite difference diff. is: 1.9999999632850161e-03 for perturbation 1.0000000000000000e-02
Error of finite difference diff. is: 1.9999905867860374e-05 for perturbation 1.0000000000000000e-03
Error of finite difference diff. is: 1.9664569947425062e-07 for perturbation 1.0000000000000000e-04
Error of finite difference diff. is: 1.0537897883625319e-07 for perturbation 1.0000000000000001e-05
Error of finite difference diff. is: 1.5469326944467290e-06 for perturbation 9.9999999999999995e-07
Error of finite difference diff. is: 1.3322061696937969e-05 for perturbation 9.9999999999999995e-08
Error of finite difference diff. is: 1.7059535957436630e-04 for perturbation 1.0000000000000000e-08
Error of finite difference diff. is: 4.9702408787320664e-04 for perturbation 1.0000000000000001e-09
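The pattern in the table (error first shrinking with the perturbation, then growing again once round-off dominates) can be reproduced with a tiny self-contained check; here is a hypothetical Python sketch using f(x) = x^3, whose exact derivative at x = 1 is 3:

```python
# Central-difference error for f(x) = x**3 at x0 = 1; exact derivative is 3.
# Analytically, the central difference equals 3 + eps**2, so the error is
# eps**2 until floating-point round-off (~1e-16 / eps) takes over.
f = lambda x: x ** 3
x0, exact = 1.0, 3.0

errors = []
for k in range(1, 10):                 # eps = 1e-1, 1e-2, ..., 1e-9
    eps = 10.0 ** (-k)
    approx = (f(x0 + eps) - f(x0 - eps)) / (2 * eps)
    errors.append(abs(approx - exact))
```

The list decreases roughly 100x per decade of eps at first, then turns around near eps ~ 1e-5, matching the table above.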

Source Code:

f2.m

function y = f2(x)

  x1 = x(:, 1);
  x2 = x(:, 2);
  x3 = x(:, 3);
  
  y = x1.^2 + 2*x2.^2 + 2*x3.^3 + 2*x1.*x2 + 2*x2.*x3;

f2_grad_analytic.m:

function grad = f2_grad_analytic(x)
  x1   = x(:, 1);
  x2   = x(:, 2);
  x3   = x(:, 3);
  
  grad(:, 1) = 2*x1 + 2*x2;
  grad(:, 2) = 4*x2 + 2*x1 + 2 * x3;
  grad(:, 3) = 6*x3.^2 + 2*x2;

f2_grad_AD.m:

function grad = f2_grad_AD(x)
  x1 = x(:, 1);
  x2 = x(:, 2);
  x3 = x(:, 3);
  
  y = x1.^2 + 2*x2.^2 + 2*x3.^3 + 2*x1.*x2 + 2*x2.*x3;
  grad = dlgradient(y, x);

CalcNumericalGradient.m:

function NumericalGrad = CalcNumericalGradient(InputPoints, eps)

% (Central, second order accurate FD)
NumericalGrad = zeros(size(InputPoints) );
for i = 1:size(InputPoints, 2)
  perturb             = zeros(size(InputPoints));
  perturb(:, i)       = eps;
  NumericalGrad(:, i) = (f2(InputPoints + perturb) - f2(InputPoints - perturb)) / (2 * eps);
end

main.m:

clear;
close all;
clc;

n_points = 100;
Rand_Max = 20;
x_test_FD = rand(n_points, 3) * Rand_Max - Rand_Max/2;

% Calculate analytical solution
grad_analytic = f2_grad_analytic(x_test_FD);

grad_AD = zeros(n_points, 3);
for i = 1:n_points
  x_test_dl    = dlarray(x_test_FD(i,:) );
  grad_AD(i,:) = dlfeval(@f2_grad_AD, x_test_dl);
end
Err_AD = norm(grad_AD - grad_analytic);
fprintf("Error of algorithmic / automatic diff. is: %.16e\n", Err_AD);

eps_range = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8, 1e-9];
for i = 1:length(eps_range)
  eps = eps_range(i);
  grad_FD = CalcNumericalGradient(x_test_FD, eps);
  Err_FD = norm(grad_FD - grad_analytic);
  fprintf("Error of finite difference diff. is: %.16e for perturbation %.16e\n", Err_FD, eps);
end
Dan Doe