0

I want to integrate a Gaussian function over a very large interval. I chose spicy.integrate.quad function for the integration. The function seems to work only when I select a small enough interval. When I use the codes below,

from scipy.integrate import quad
from math import pi, exp, sqrt

def func(x, mean, sigma):
    return 1/(sqrt(2*pi)*sigma) * exp(-1/2*((x-mean)/sigma)**2) 

print(quad(func, 0, 1e+31, args=(1e+29, 1e+28))[0]) # case 1
print(quad(func, 0, 1e+32, args=(1e+29, 1e+28))[0]) # case 2
print(quad(func, 0, 1e+33, args=(1e+29, 1e+28))[0]) # case 3
print(quad(func, 1e+25, 1e+33, args=(1e+29, 1e+28))[0]) # case 4

then the followings are printed.

1.0
1.0000000000000004
0.0
0.0

To obtain a reasonable result, I had to try and change the lower/upper bounds of the integral several times and empirically determine it to [0, 1e+32]. This seems risky to me, as when the mean and sigma of the gaussian function changes, then I always have to try different bounds.

Is there a clear way to integrate the function from 0 to 1e+50 without bothering with bounds? If not, how do you expect from beginning which bounds would give non-zero value?

Nownuri
  • 689
  • 2
  • 7
  • 19

2 Answers2

1

In short, you can't.

On this long interval, the region where the gaussian is non-zero is tiny, and the adaptive procedure which works under the hood of integrate.quad fails to see it. And so would pretty much any adaptive routine, unless by chance.

ev-br
  • 24,968
  • 9
  • 65
  • 78
0

Notice,

enter image description here

and the CDF of a normal random variable is known as ϕ(x) as it can not be expressed by an elementary function. So take ϕ((b-m)/s) - ϕ((a-m)/s). Also note that ϕ(x) = 1/2(1 + erf(x/sqrt(2))) so you need not call .quad to actually perform an integration and may have better luck with erf from scipy.

from scipy.special import erf

def prob(mu, sigma, a, b):
    phi = lambda x: 1/2*(1 + erf((x - mu)/(sigma*np.sqrt(2))))
    return phi(b) - phi(a)

This may give more accurate results (it does than the above)

>>> print(prob(0, 1e+31, 0, 1e+50))
0.5
>>> print(prob(0, 1e+32, 1e+28, 1e+29))
0.000359047985937333
>>> print(prob(0, 1e+33, 1e+28, 1e+29))
3.5904805169684195e-05
>>> print(prob(1e+25, 1e+33, 1e+28, 1e+29))
3.590480516979522e-05

and avoid the intense floating point error you are experiencing. However, the regions you integrate are so small in area that you may still see 0.

modesitt
  • 7,052
  • 2
  • 34
  • 64