I have the following problem. I have a function f defined in Python using NumPy functions. The function is smooth and integrable on positive reals. I want to construct the double antiderivative of the function (assuming that both the value and the slope of the antiderivative at 0 are 0) so that I can evaluate it on any positive real smaller than 100.
Definition of antiderivative of f at x: integrate f(s) with s from 0 to x

Definition of double antiderivative of f at x: integrate (integrate f(t) with t from 0 to s) with s from 0 to x
The actual form of f is not important, so I will use a simple one for convenience. But please note that even though my example has a known closed form, my actual function does not.
import numpy as np
f = lambda x: np.exp(-x)*x
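For this particular example the closed form is known (my actual function will not have one), so I can use it as a reference when checking accuracy:

F1 = lambda x: 1 - np.exp(-x)*(x + 1)      # exact antiderivative of f
F2 = lambda x: x - 2 + np.exp(-x)*(x + 2)  # exact double antiderivative of f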
My solution is to construct the antiderivative as an array using naive numerical integration:
N = 10000
delta = 100/N
xs = np.linspace(0,100,N+1)
vs = f(xs)
avs = np.cumsum(vs)*delta
aavs = np.cumsum(avs)*delta
This of course works, but it gives me arrays instead of functions. This is not a big problem, though, as I can interpolate aavs using a spline to get a function and get rid of the arrays.
from scipy.interpolate import UnivariateSpline
aaf = UnivariateSpline(xs, aavs)
The function aaf is approximately the double antiderivative of f.
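Using the exact reference F2 from above, I can measure how accurate this construction is:

err = np.max(np.abs(aaf(xs) - F2(xs)))
print(err)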
The problem is that, even though it works, there is quite a bit of overhead before I can get my function, and precision is expensive.
My other idea was to interpolate f by a spline and take the antiderivative of that; however, this introduces numerical errors that are too big for what I want to use the function for.
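For reference, this is roughly what I mean (a minimal sketch using an interpolating spline; s=0 turns off smoothing):

from scipy.interpolate import UnivariateSpline
fspl = UnivariateSpline(xs, vs, s=0)   # interpolate f itself
aaf2 = fspl.antiderivative(n=2)        # spline representing the double antiderivative
# since xs starts at 0, the value and slope of aaf2 at 0 should both come out as 0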
Is there any better way to do that? By better I mean faster without sacrificing accuracy.
Edit: What I hope is possible is to use some kind of Fourier transform to avoid integrating twice. I hope that there is some convenient transform of vs that allows me to multiply the values component-wise with xs and transform back to get the double antiderivative. I played with this a bit, but I got lost.
Edit: I figured out that using the trapezoidal rule instead of a naive sum increases the accuracy quite a bit. Using Simpson's rule should increase the accuracy further, but it's somewhat fiddly to do with NumPy arrays.
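For concreteness, this is what I mean; cumulative_trapezoid is a drop-in replacement for the cumsum above (recent SciPy versions also have cumulative_simpson, which I have not tried):

from scipy.integrate import cumulative_trapezoid

avs = cumulative_trapezoid(vs, xs, initial=0.0)    # antiderivative values, pinned to 0 at x=0
aavs = cumulative_trapezoid(avs, xs, initial=0.0)  # double antiderivative values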
Edit: As @user202729 rightly complains, this seems off. The reason it seems off is that I have skipped some details. I explain here why what I say makes sense, but it does not affect my question.
My actual goal is not to find the double antiderivative of f, but to find a transformation of it. I have skipped that because I think it only confuses the matter.
The function f decays exponentially as x approaches 0 or infinity. I am minimizing the numerical error in the integration by starting the sum from 0 and going up to approximately the peak of f. This ensures that the relative error is approximately constant. Then I start from the opposite direction, from some very big x, and go back to the peak. Then I do the same for the antiderivative values.
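Roughly, the summation order looks like this (a simplified sketch with the trapezoidal rule, not my actual code; it assumes the peak of f lies inside the grid):

from scipy.integrate import cumulative_trapezoid

peak = int(np.argmax(vs))
# forward accumulation: the small values near 0 are added while the running total is still small
left_avs = cumulative_trapezoid(vs[:peak + 1], xs[:peak + 1], initial=0.0)
# backward accumulation: start at the far tail so the smallest values are summed first
tail = cumulative_trapezoid(vs[peak:][::-1], dx=delta, initial=0.0)[::-1]
# tail[m] is the mass to the right of xs[peak + m]; stitch the two halves together
right_avs = left_avs[-1] + (tail[0] - tail)
avs = np.concatenate([left_avs, right_avs[1:]])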
Then I transform the aavs by another function which is sensitive to numerical errors. Then I find the region where the errors are big (the values oscillate violently) and drop these values. Finally I approximate what I believe are good values by a spline.
Now if I use a spline to approximate f, it introduces an absolute error which is the dominant term on a rather large interval. This gets "integrated" twice and ends up being a rather large relative error in aavs. Then once I transform aavs, I find that the 'good region' has shrunk considerably.
EDIT: The actual form of f is something I'm still looking into. However, it is going to be a generalisation of the lognormal distribution. Right now I am playing with the following family.
I start by defining a generalization of the normal distribution:
from scipy import special

def pdf_n(params, center=0.0, slope=8):
    scale, min, diff = params
    if diff > 0:
        r = min
        l = min + diff
    else:
        r = min - diff
        l = min

    def retfun(m):
        x = (m - center)/scale
        # exponent moves smoothly from l (left tail) to r (right tail)
        E = special.expit(slope*x)*(r - l) + l
        return np.exp(-np.power(1 + x*x, E)/2)

    return np.vectorize(retfun)
It may not be obvious what is happening here, but the result is quite simple. The function decays as exp(-x^(2l)) on the left and as exp(-x^(2r)) on the right. For min=1 and diff=0, this is the normal distribution. Note that this is not normalized. Then I define
g = pdf_n(params)
f = np.vectorize(lambda x: g(np.log(x))/x/area)
where area is the normalization constant.
Note that this is not the actual code I use. I stripped it down to the bare minimum.
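For completeness, the normalization constant can be computed numerically, e.g. along these lines (again a simplification, and the choice of params is just an example):

from scipy import integrate

params = (1.0, 1.0, 0.0)                  # example (scale, min, diff)
g = pdf_n(params)
unnormalized = lambda x: g(np.log(x))/x
area, _ = integrate.quad(unnormalized, 0, np.inf)
f = np.vectorize(lambda x: g(np.log(x))/x/area)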