
The following is Python code that aims to calculate the numerical derivative of a given function f.

  1. Version one (Solution)

    x[ix] += h                      # increment by h
    fxh = f(x)                      # evaluate f(x + h)
    x[ix] -= 2 * h                  # now at x - h
    fxnh = f(x)                     # evaluate f(x - h)
    x[ix] += h                      # restore the original value
    numgrad = (fxh - fxnh) / 2 / h  # central difference
    
  2. Version two (my version)

    fx = f(x)                   # evaluate f(x)
    x[ix] += h                  # increment by h
    fxh = f(x)                  # evaluate f(x + h)
    x[ix] -= h                  # restore the original value
    numgrad = (fxh - fx) / h    # one-sided (forward) difference
    

It has been shown that version one gives better accuracy. Could anyone explain why that is the case, and what the difference is between the two calculations?

UPDATE: I didn't realise at first that this is a mathematical problem; I thought it was a problem relating to the effects of floating-point accuracy. As suggested by MSeifert, I do agree that floating-point noise matters: a result of small magnitude is more susceptible when exposed to that noise.
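
For reference, here is a self-contained sketch of how both snippets are typically used (the wrapper functions and their names are my own; it assumes x is a 1-D NumPy float array and f maps it to a scalar, which is how x[ix] is used above):

    import numpy as np

    def numgrad_central(f, x, h=1e-5):
        # version one: central difference per coordinate
        grad = np.zeros_like(x)
        for ix in range(x.size):
            x[ix] += h              # increment by h
            fxh = f(x)              # evaluate f(x + h)
            x[ix] -= 2 * h          # now at x - h
            fxnh = f(x)             # evaluate f(x - h)
            x[ix] += h              # restore the original value
            grad[ix] = (fxh - fxnh) / (2 * h)
        return grad

    def numgrad_forward(f, x, h=1e-5):
        # version two: one-sided (forward) difference per coordinate
        grad = np.zeros_like(x)
        for ix in range(x.size):
            fx = f(x)               # evaluate f(x)
            x[ix] += h              # increment by h
            fxh = f(x)              # evaluate f(x + h)
            x[ix] -= h              # restore the original value
            grad[ix] = (fxh - fx) / h
        return grad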

GabrielChu
  • Welcome to CS.SE! Generally speaking, code is off-topic here, and anything Python-specific is off-topic here. Coding questions can often be asked on Stack Overflow; if you'd prefer to move your question over there, click 'flag' to flag this for moderator attention and ask the mods to migrate it. Alternatively, if this isn't Python-specific, please replace the code with mathematics or pseudocode that will be understandable to even people who don't know Python. (e.g.: I know some Python but I have no idea what `x[ix]` means in this context.) – D.W. Dec 03 '16 at 05:07
  • Note that the statement that the first version gives better accuracy is not in general true. There are situations where a one-sided approximation is *mathematically* favourable (that is, independent of your computer's floating point arithmetic). See, e.g., [upwind schemes](https://en.wikipedia.org/wiki/Upwind_scheme). For the explanation why the first version is more accurate *most times*, make yourself familiar with the *order* of [finite differences](https://en.wikipedia.org/wiki/Finite_difference). – Phillip Dec 19 '16 at 10:35

2 Answers


This is not a Python problem but a purely algorithmic one. Assuming that the function f has nice properties, you can look at its Taylor series expansion:

f(x+h) = f(x) + h f'(x) + h²/2 f''(x) + h³/6 f'''(x) + o(h³)

It follows that your second form (version two, the one-sided difference) has the error:

((f(x+h) - f(x)) / h) - f'(x) = h/2 f''(x) + o(h)

that is, an error on the order of h.

If you use the first form (version one, the central difference), you also need the expansion f(x-h) = f(x) - h f'(x) + h²/2 f''(x) - h³/6 f'''(x) + o(h³); subtracting it from the expansion above gives:

((f(x+h) - f(x-h)) / (2h)) - f'(x) = h²/6 f'''(x) + o(h²)

The terms in h have cancelled, and the error is on the order of h².

Of course it only makes sense if the required derivatives exist...
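
To see both orders in practice, here is a small sketch of my own (not part of the question's code) that compares the two formulas on f = sin, whose exact derivative is cos:

import math

def one_sided(f, x, h):
    return (f(x + h) - f(x)) / h

def central(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.0
for h in (1e-2, 1e-3, 1e-4):
    err_one = abs(one_sided(math.sin, x, h) - math.cos(x))
    err_two = abs(central(math.sin, x, h) - math.cos(x))
    print(h, err_one, err_two)

# the one-sided error shrinks roughly 10x per step (order h),
# the central error shrinks roughly 100x per step (order h²)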

Serge Ballesta

Your version is "one-sided": you compare f(x+h) with f(x), while the reference solution is "two-sided": it compares f(x+h) with f(x-h).

It would be good to know what

It has shown version one gives a better accuracy

means, because that statement is far too general.

But I think I have an example which might be appropriate here:

def double_sided_derivative(f, x, h):
    x_l, x_h = x - h, x + h
    return (f(x_h) - f(x_l)) / 2 / h

def one_sided_derivative(f, x, h):
    x_h = x + h
    return (f(x_h) - f(x)) / h

h = 1e-8

def f(x):
    return 1e-6 * x

# difference to the real derivative:
double_sided_derivative(f, 10, h) - 1e-6, one_sided_derivative(f, 10, h) - 1e-6
# (6.715496481486314e-14, 1.5185825954029317e-13)

Note that the double-sided result is closer to the expected value. Subtracting two nearly equal function values can lead to catastrophic cancellation, and then you may end up with a result that is mostly governed by floating-point noise. This effect is amplified further because that tiny difference is then divided by a really small number.
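
As a rough sketch of that effect (my own example, using exp, whose derivative is exp itself): when h gets very small, the truncation error stops mattering and cancellation noise takes over for both formulas:

import math

def forward(f, x, h):
    return (f(x + h) - f(x)) / h

def central(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.0
for h in (1e-4, 1e-8, 1e-12):
    print(h,
          abs(forward(math.exp, x, h) - math.exp(x)),
          abs(central(math.exp, x, h) - math.exp(x)))

# at h = 1e-12 both errors are dominated by cancellation noise and the
# central difference loses its truncation advantage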

By using both sides you increase (depending on your function!) the difference and thus push back the point at which cancellation occurs. But in my opinion the biggest advantage is that you take the slope on both sides into account (averaging it somewhat). For example:

h = 1e-6

def f(x):
    return 4 + x + 5 * x**2

def fdx(x):
    return 1 + 10 * x

x = 10

double_sided_derivative(f, x, h) - fdx(x), one_sided_derivative(f, x, h) - fdx(x)
# (-2.7626811061054468e-08, 4.974594048690051e-06)

The double-sided result is much closer to the true value (by two orders of magnitude) than the one-sided approximation.

MSeifert
  • I'd say the biggest advantage is that the first version's error is in `O(h²)` while the other one's is in `O(h)`. – Phillip Dec 19 '16 at 10:39
  • @Phillip I'd say it depends on the function. I was mostly focussing on the floating-point aspect (it was tagged with floating-point-accuracy). Or is `O(h**2)` a result of the floating point arithmetic? – MSeifert Dec 19 '16 at 10:53
  • It is a mathematical result that holds for all functions where the first and second derivatives exist, and it is a statement about the mathematical objects, not about their representation in a computer. – Phillip Dec 19 '16 at 11:39
  • Yes, I know that. I was just confused because your original comment was ambiguous; _error_ could mean anything in this context. Thanks for clarifying. – MSeifert Dec 19 '16 at 11:41