3

MATLAB has two methods to solve a nonlinear equation:

  • fzero: solves a single nonlinear equation
  • fsolve: solves a system of nonlinear equations

Therefore, one can use the following methods to solve a system of n nonlinear independent equations:

  1. Use a loop to solve the equations separately using fzero
  2. Use a loop to solve the equations separately using fsolve
  3. Use fsolve to solve them together

My intuition would be that:

  • A loop method is faster than a single system for large n as complexity (gradient calculation) is 0(n^2)
  • A loop may be slower for small n as a loop has a high overhead in MATLAB and there may be some constant startup time
  • fzero is faster than fsolve as it is specifically made for a single nonlinear equation.

Question: What is the fastest method to a system of n nonlinear independent equations? Are there other methods? Which options should be used to speed up the process?

Example problem: (may be used to benchmark different answers)

x_i^2 = 1

with i between 1 and n

Related threads

m7913d
  • 10,244
  • 7
  • 28
  • 56
  • 1
    There is a vectorized version of fzero on the file exchange: https://www.mathworks.com/matlabcentral/fileexchange/28150-bisection-method-root-finding In my experience it beats independent fsolve (using the 'JacoPattern' = speye(n) option) by a order of magnitude. – Bob Dec 13 '18 at 12:34
  • NOTE TO REVIEWERS: Added example problem, to be able to better compare different answers. Note that Bob already pointed out an alternative method, which deserves an own answer with benchmark. – m7913d Aug 19 '20 at 12:53
  • 1
    I added benchmarks to my full answer below. Unfortunately it got deleted by the mods, so I am not sure if anyone else can see it. Should I repost it? – Bob Jan 29 '21 at 15:55
  • In case my answer below isn't visible to others, here is a summary: using the same benchmark problem as in the other answer, my timings for `ns = 10` are: `0.0027s` via loop vs `0.0049s` via indep fsolve vs `5.0978e-05s` via the vectorized code. For `ns = 1e5` I get: `28.7574s` via loop vs `7.7601s` via indep fsolve vs `0.0013s` via the vectorized code. – Bob Jan 29 '21 at 16:03
  • 1
    @Bob, I agree to repost the answer. I, the OP, do not see your answer. So, almost nobody will. Note that I added the example/benchmark problem explicitly to my question. After this action, the question was reopened (after being closed for needs more focus). However, the preferred solution may be to flag your answer for moderation attention. – m7913d Jan 29 '21 at 20:21
  • I just reposted, hopefully the mods won't delete again – Bob Jan 29 '21 at 20:54

2 Answers2

6

The best approach to evaluate the performance of some method is to write a benchmark. Four cases are considered:

  1. loop fzero: uses a loop to solve the equations separately using fzero
  2. loop fsolve: uses a loop to solve the equations separately using fsolve
  3. default fsolve: solves the equations together as one system of equations
  4. independent fsolve: same as default fsolve, but specifies that the equations are independent

f = @(x) x.^2-1; % the set of non-linear equations
ns = 1:1:100; % the sizes for which the benchmark is performed
options=optimset('Display','off'); % disable displaying

figure
hold on
plot(ns, loopFSolve(f, ns, options), 'DisplayName', 'loop fsolve')
plot(ns, loopFZero(f, ns, options), 'DisplayName', 'loop fzero')
plot(ns, defaultFSsolve(f, ns, options), 'DisplayName', 'default fsolve')
plot(ns, independentFSolve(f, ns, options), 'DisplayName', 'independent fsolve')

legend ('Location', 'northwest')

function t = loopFZero(f, ns, options)
  t1 = timeit(@() fzero(f, rand(1), options));
  t = ns * t1;
end

function t = loopFSolve(f, ns, options)
  t1 = timeit(@() fsolve(f, rand(1), options));
  t = ns * t1;
end

function t = defaultFSsolve(f, ns, options)
  t = zeros(size(ns));
  for i=1:length(ns)
    n = ns(i);
    un = rand(n, 1);
    t(i) = timeit(@() fsolve(f, un, options));
  end
end

function t = independentFSolve(f, ns, options)
  t = zeros(size(ns));
  for i=1:length(ns)
    n = ns(i);
    un = rand(n, 1);
    options.Algorithm = 'trust-region-reflective';
    options.JacobPattern = speye(n);
    options.PrecondBandWidth = 0;

    t(i) = timeit(@() fsolve(f, un, options));
  end
end

Results

All the figures show the computation time for the complete system in function of n, the number of equations.

The first two figures plot n up to 1000, with an interval of 100. The last two figures plot n up to 100 with an interval of 1. For each, the second plot is the same as the first one but without loop fzero as it is much slower than the others.

enter image description here enter image description here enter image description here enter image description here Conclusion

  1. loop fsolve: do not use it, startup time is too high
  2. loop fzero: you can use it for small n (fastest method for n < ~20)
  3. default fsolve: you can use it for relative small n (fastest method for ~20 < n < ~50, but difference with 2 and 3 is relative small).
  4. independent fsolve: you should use it for large n (fastest method for ~50 < n)

In general, you should use independent fsolve, only for small n loop fzero may be used instead, i.e. use fsolve with the following options:

options.Algorithm = 'trust-region-reflective';
options.JacobPattern = speye(n);
options.PrecondBandWidth = 0;

Lazy people may just use the default fsolve as it has a reasonable performance for a moderate number of equations (n < ~200)

Important remark

Bob pointed out that a vectorised version of fzero may be multiple orders of magnitude faster. According to Bob's test results, his method is the fastest method for every size of the problem, i.e. large and small n.

Remarks

  • Note that the time complexity of default fsolve is O(n^2) while the others are O(n).
  • Note that fzero and fsolve may behave differently in some boundary cases, f.e fzero will not found a solution for x^2 as it searches the location of a sign change.
m7913d
  • 10,244
  • 7
  • 28
  • 56
1

I'm adding this answer to elaborate on my comment above. In my experience by far the fastest method is to use a vectorized version of fzero that is available on the file exchange: Link to vectorized bisection code

Here is a couple of benchmarks comparing it's performance to (i) looped fzero and (ii) independent fsolve.

f = @(x) x.^2-1; %the set of non-linear equations
ns = 1e5; %size of the problem

% method 1: looped fzero
t = timeit(@() fzero(f, rand(1)));
loopFZero = t*ns

% method 2: independent fsolve
options=optimset('Display','off'); % disable displaying
options.Algorithm = 'trust-region-reflective';
options.JacobPattern = speye(ns);
options.PrecondBandWidth = 0;
indepFSolve = timeit(@() fsolve(f, rand(ns,1), options))

% method 3: vectorized bisection, available here:
% https://www.mathworks.com/matlabcentral/fileexchange/28150-bisection-method-root-finding
vecBisection = timeit(@() bisection(f, zeros(ns,1), 2))

Results

%---------------
% ns = 10

loopFZero =

    0.0027


indepFSolve =

    0.0049


vecBisection =

   5.0978e-05

%---------------
% ns = 1e5

loopFZero =

   28.7574


indepFSolve =

    7.7601


vecBisection =

    0.0013
Bob
  • 428
  • 3
  • 11
  • @mods: This is a re-post of my earlier answer that got deleted for not satisfying the standards of the community. I am re-posting because it was explicitly requested by the OP (see comments to their question above), and because I still believe that my approach here is orders of magnitude more efficient than the other answer and hopefully of use to others. – Bob Jan 29 '21 at 20:59
  • @mods: Note that the question now contains an explicit benchmark problem, which allows for a clear comparison between the different solutions. – m7913d Jan 30 '21 at 11:44
  • 1
    I accepted your answer as it is clearly to best method (so far). I hope it will prevent your answer from being deleted. – m7913d Jan 30 '21 at 11:55