1

I want to use scipy.integrate for some numerical calculations. I just ran a little example to try it and ran across some unexpected behavior.

I made some clean code to demonstrate the problem. I use a very simple exponential distribution to test.

Here's my code:

import numpy as np
import sys
import scipy as sc
from scipy import integrate


print(sys.version)
print(np.version.version)
print(sc.version.version)
print()

r1 = integrate.quad(lambda x: sc.exp(-x), 0, 10)
r2 = integrate.quad(lambda x: sc.exp(-x), 0, 100000)
r3 = integrate.quad(lambda x: sc.exp(-x), 0, np.inf)

print(r1)
print(r2)
print(r3)

print()
r4 = integrate.quad(lambda x: sc.exp(-x), 0, 10000)
print(r4)

The output is

3.7.2 (default, Jan  2 2019, 17:07:39) [MSC v.1915 64 bit (AMD64)]
1.15.4
1.1.0

r1 (0.9999546000702375, 2.8326146575791917e-14)
r2 (2.0614532085314573e-45, 4.098798466247153e-45)
r3 (1.0000000000000002, 5.842606996763696e-11)

r4 (1.0, 1.6059202674761255e-14)

I expect all the output to be always approximately one. But in r2 i get an incredibly small value. Strangely, when integrating to infinity (r3), or a very small border (r1), the problem doesn't show up. Also, by decreasing the limit by one order of magnitude (r4) I also get a perfect result.

Does anyone know why this problem appears in scipy? I would call this a bug, but maybe I violated some restrictions? How do I know in advance to prevent wrong results in my applied problems?

Thank you in advance

Output for full_output:

r2 (2.0614532085314573e-45, 4.098798466247153e-45, {'neval': 63, 'last': 2, 'iord': array([      1,       2,       3,       4,       5, 6357060, 6357108,
       4259932, 6357102, 7274595, 6553710, 3342433, 7077980, 6422633,
       7536732, 7602281, 2949221, 6357104, 7012451, 6750305, 7536741,
       7536732, 6881379, 7929968, 7274588, 7602288, 7143529, 7995497,
       6029413, 7209055, 7077998, 3014771, 7340131, 3604531, 7798829,
       7209065, 6357087, 6553709, 3407926, 7340078, 6553721, 3276846,
       5046318, 7209057, 6684777, 7536741,     116, 6619136, 7602291,
             0], dtype=int32), 'alist': array([0.00000000e+000, 5.00000000e+004, 0.00000000e+000, 0.00000000e+000,
       6.88436472e-272, 3.80218509e-136, 2.65902947e-068, 2.20016853e-034,
       1.04474528e-019, 3.09734336e-016, 9.03970673e-019, 8.23342652e-316,
       8.23342968e-316, 8.23343284e-316, 8.23343601e-316, 8.23343917e-316,
       8.23344233e-316, 8.23344549e-316, 8.23344865e-316, 8.23345182e-316,
       8.23345498e-316, 8.23345814e-316, 8.23346130e-316, 8.23346446e-316,
       8.23346763e-316, 8.23347079e-316, 8.23347395e-316, 8.23347711e-316,
       8.23348027e-316, 8.23348344e-316, 8.23348660e-316, 8.23348976e-316,
       8.23349292e-316, 8.23349608e-316, 8.23349925e-316, 8.23350241e-316,
       8.23350557e-316, 8.23350873e-316, 8.23351189e-316, 8.23351506e-316,
       8.23351822e-316, 8.23352138e-316, 8.23352454e-316, 8.23352770e-316,
       8.23353087e-316, 8.23353403e-316, 8.23353719e-316, 8.23354035e-316,
       8.23354351e-316, 8.23354668e-316]), 'blist': array([5.00000000e+004, 1.00000000e+005, 0.00000000e+000, 0.00000000e+000,
       6.88436472e-272, 3.80218509e-136, 2.65902947e-068, 2.20016853e-034,
       1.04474528e-019, 3.09734336e-016, 9.03970673e-019, 1.20736675e+285,
       1.05117823e-153, 1.05132391e-153, 1.05146958e-153, 3.79823888e-258,
       1.61465766e+184, 3.11517960e+161, 4.26137323e+257, 6.01346953e-154,
       6.01366349e-154, 1.19632546e-153, 3.64465882e-086, 1.31100174e-259,
       1.20679441e-153, 1.20679327e-153, 3.24245662e-086, 3.64465882e-086,
       6.01357764e-154, 1.20679441e-153, 5.75105581e+072, 2.20791354e+214,
       1.27734658e-152, 5.29444423e+160, 6.19633416e+223, 2.25563599e-153,
       8.21947530e+223, 6.09892510e-013, 1.06097757e-153, 2.86747940e-110,
       6.06154135e-154, 6.06445477e-154, 6.96312298e-077, 3.00226946e-067,
       6.03810921e-154, 1.30421760e-076, 1.21438942e-067, 4.61448322e-072,
       8.51221910e-053, 3.73237334e+069]), 'rlist': array([2.06145321e-045, 0.00000000e+000, 6.73898103e+149, 3.51023756e+151,
       4.50937881e-292, 9.43293441e-314, 4.65203811e+151, 6.99386802e-283,
       3.53886392e-308, 1.33360313e+241, 1.15420781e+171, 9.30281767e+242,
       1.17364463e+214, 3.12671297e+185, 2.85341794e-313, 8.18432962e-085,
       6.45840689e+170, 4.42638830e-239, 9.78681729e+199, 3.38460675e+125,
       3.11732880e+150, 9.78747303e+199, 2.27948172e-191, 1.04972250e+214,
       4.77402433e+180, 1.12985581e+277, 3.16464606e-307, 1.33360315e+241,
       1.76252970e-310, 1.02318154e-012, 1.15549302e-313, 1.03539814e-308,
       1.33360293e+241, 5.67421675e-311, 5.00120719e-162, 6.46048250e-313,
       1.68400738e-019, 1.10811151e-302, 1.66468912e-312, 1.09403545e-303,
       1.27613271e-303, 7.10020498e-270, 4.99875566e-111, 9.11927054e-304,
       9.11571045e-304, 9.11749048e-304, 9.11571042e-304, 9.60205653e+303,
       5.43239349e-312, 1.79972786e-304]), 'elist': array([4.09879847e-045, 0.00000000e+000, 6.47287707e+170, 5.98178835e-154,
       1.69375668e+190, 4.44389806e+252, 1.12297399e+219, 1.87673453e-152,
       7.20706153e+159, 1.27826731e-152, 2.43812981e-152, 5.52716101e+228,
       6.01346953e-154, 1.57761457e+214, 7.19938459e+252, 3.94357072e+180,
       3.44210870e+175, 3.62478142e+228, 1.23732543e-259, 3.53810655e+155,
       4.81222029e+233, 1.06843264e-258, 9.15000112e+199, 4.26614628e+180,
       3.53387914e+246, 2.35509149e+251, 1.69375944e+190, 1.57762309e+214,
       6.19634286e+223, 8.95533289e-106, 5.98148090e-154, 1.17914189e+195,
       5.42869734e+213, 6.72794695e+199, 5.30383390e+180, 1.02188594e-152,
       2.16452413e+233, 7.50052033e+247, 6.98907523e+096, 7.69843824e+218,
       3.23097122e+174, 9.84214185e-154, 1.36723829e+161, 1.19346501e+243,
       1.94670285e+227, 2.21366476e+214, 8.95533289e-106, 8.75378213e+247,
       1.87673453e-152, 2.50722129e-310])})

´´´

julian
  • 35
  • 7
  • Could you please try to use the keyword argument `full_output=1` for `r2`. This will add a dictionary to the output with some extra information regarding the computation see [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.quad.html). My first guess would be that you hit an "Antisweetspot" for the computation. – Chgad Aug 13 '19 at 12:36
  • 1
    @Chgad I added the output above – julian Aug 13 '19 at 13:00
  • The results of the partitioned subintervals are with in the array of the key `rlist` The first interval chosen by the integration algorithm is 0.0 to 5.0e+4 with a result of 2.06e-45 and an absoulute error of twice that much (see first element of `elist`). This can not be correct at all. Comparing further intervalls reveals more such unrealistic results. I suggest playing around with the upper limit of the integral. Maybe chose something like 99900? – Chgad Aug 13 '19 at 13:12
  • FYI: This is a FAQ regarding `quad`. There are many related questions like this on SO; here are just three that I could find easily because I provided answers: https://stackoverflow.com/questions/29179778/scipy-quad-uses-only-1-subdivision-and-gives-wrong-result, https://stackoverflow.com/questions/47193045/why-does-integrate-quadlambda-x-xexp-x2-2-sqrt2pi-0-0-100000-give-0, https://stackoverflow.com/questions/40536467/scipy-integrate-quad-not-returning-the-expected-values – Warren Weckesser Aug 13 '19 at 15:29

2 Answers2

3

It is not a bug, it has to do with the numerical precision of the integration, and the fact that you are integrating a function that is (almost) 0 in most of the interval. From the docs:

Be aware that pulse shapes and other sharp features as compared to the size of the integration interval may not be integrated correctly using this method.

Based on your output, the function is using only two (last=2) intervals, evaluating values for rlist=(2.06145321e-045, 0.00000000e+000, ..) on each (see the docs for more detail on the output)

You can add points to the interval to force the routine to use points closer to the left limit.

a = quad(lambda x: np.exp(-x), 0, 1e9, points=np.logspace(-10,3,10))
print(a)
(0.9999999999999997, 2.247900608926337e-09)

Adding to the explanation (thanks to @norok2): Note that points is a sequence of break points in the bounded integration interval where local difficulties of the integrand may occur (e.g., singularities, discontinuities). In this case, I'm not using it to point out discontinuities, but rather to force quad to perform more integration steps near the left boundary, using a log-spaced interval since a I have an exponential function (this is of course arbitrary and for this function, since I know its shape).

Tarifazo
  • 4,118
  • 1
  • 9
  • 22
  • Thank you a lot for your well-explained answer. I wonder why the +infinity integration works so much better. Does it do logarithmic-interval evaluation automatically in that case? – julian Aug 13 '19 at 13:13
  • The 10000 limit gives a good result. Is there any rule of thumb what limit to use for good precision and little risk for things like above to happen? – julian Aug 13 '19 at 13:15
  • @Mstain You could probably explain the `points` argument a bit better. `points` is *A sequence of break points in the bounded integration interval where local difficulties of the integrand may occur (e.g., singularities, discontinuities). The sequence does not have to be sorted.* So, you do not need that many intervals and the way to choose them is to make sure to capture *difficulties* within. For example, in this case, `points=(b, np.inf)` would be just as effective as long as `np.exp(-b)` is small *enough*. There is no simple way to know this beforehand for arbitrary functions. – norok2 Aug 13 '19 at 13:36
  • (the more values you have in `points`, the higher the computational burden). An alternative method would be to perform the sampling yourself and then use `scipy.integrate.trapz()` or similar for the numerical integration. – norok2 Aug 13 '19 at 13:41
  • Exactly. See also [this post](https://stackoverflow.com/questions/34877147/limits-of-quad-integration-in-scipy-with-np-inf) – Tarifazo Aug 13 '19 at 14:03
0

There is no need to convert your integral into one over a (very large) interval. There is a specific integration scheme for integrals of the form

,

namely Gauss-Laguerre quadrature. It's also included in quadpy (a project of mine). Simply try out

import numpy
import quadpy

scheme = quadpy.e1r.gauss_laguerre(1)

val = scheme.integrate(lambda x: numpy.ones(x.shape[1:]))

print(val)
1.0
Nico Schlömer
  • 53,797
  • 27
  • 201
  • 249