
I face the problem of estimating a joint probability for independent variables in a simple setup. Currently I have an array of 100 random variables and I would like to obtain their joint probability without running into the underflow problem. Any ideas how to achieve this in NumPy, if possible?

If not, could someone please explain the role of the NumPy routine `logaddexp`, as I thought it might help in this situation?

Alex Riley
JustInTime

1 Answer


`logaddexp` lets you expand the range of representable values (at the cost of some precision) by storing and processing their logarithms instead.

import numpy as np

e1, e2 = np.log(p1), np.log(p2)    # convert p1, p2 to log form
e1e2 = e1 + e2                     # e1e2 is now np.log(p1 * p2)
e3 = np.logaddexp(e1, e2)          # e3 = np.log(p1 + p2)

You just need to go through your code changing `**` to `*`, `*` to `+`, and `+` to `np.logaddexp`, and convert back with `np.exp` at the end.
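For the product case in the question, a minimal sketch (the array of 100 equal probabilities is made-up data for illustration):

```python
import numpy as np

p = np.full(100, 1e-4)          # 100 small probabilities (illustrative values)

naive = np.prod(p)              # direct product: 1e-400 underflows to 0.0
log_joint = np.sum(np.log(p))   # joint probability in log form: finite, ~-921

# only exponentiate back if the result is representable; here it is not,
# so keep working with log_joint instead
```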

A normal 64-bit double-precision float has smallest positive normal value 2.2E-308; storing logs instead gives you an effective smallest positive value of about e^(-1.8E308), i.e. roughly 1E-(7.8E307).
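To see the range extension in action, here is a small sketch adding two probabilities that individually underflow as plain doubles (the log values are arbitrary illustrative numbers):

```python
import numpy as np

log_p1, log_p2 = -1000.0, -1001.0       # p1 and p2 themselves underflow to 0.0
log_sum = np.logaddexp(log_p1, log_p2)  # log(p1 + p2), computed stably in log space
# log_sum is about -999.69, still finite, even though np.exp(-1000.0) == 0.0
```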

ecatmur
  • 152,476
  • 27
  • 293
  • 366
  • But e3 != e1e2, which is my target. Maybe I did not get your point on this; if my array is x=[0.01,0.02,0.03], could you explain how np.logaddexp would help in getting p(X)? – JustInTime Aug 13 '12 at 14:53
  • @JustInTime e1e2 is equivalent to p1 * p2; e3 is equivalent to p1 + p2. How are you computing p(X)? – ecatmur Aug 13 '12 at 15:02
  • It's the joint probability of independent variables, which means p(x) = 0.01*0.02*0.03. Imagine that I have an array of 100 small probabilities; this will lead to underflow. Hope I made the point clear. – JustInTime Aug 13 '12 at 15:13
  • 1
    @JustInTime In that case `log(p(x)) = log(0.01) + log(0.02) + log(0.03)`. – ecatmur Aug 13 '12 at 15:20
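That final suggestion for the example array can be sketched as:

```python
import numpy as np

x = np.array([0.01, 0.02, 0.03])
log_px = np.sum(np.log(x))  # log(p(x)) = log(0.01) + log(0.02) + log(0.03)
px = np.exp(log_px)         # safe to exponentiate here; p(x) is about 6e-06
```

For only three moderate values `np.prod(x)` would also work; the log-domain sum is what keeps the computation stable when the array grows to 100 entries.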