4

I'm trying to write a program to calculate probabilty mass function of a poisson distribution, P(x=n) with parameter lambda, using this formula: ( (e^-lambda)*(lambda^n))/n!

This approach works well when I use small lambda and small numbers, but if I want to calculate for example P(x=30) with lambda 20 the result is 4.68903e+006 which is wrong.

I think the problem is for calculating n!. I implemented a function to calculate factorial value and used unsigned long long data type for the result of factorial calculation, but the problem is that the amount of 30! is equal to 265,252,859,812,191,058,636,308,480,000,000 and the maximum number available for unsigned long long is 18,446,744,073,709,551,615, which is less than 30!.

What should I do to handle this issue? Is there any other way or any function to calculate this prabability in c++?

data type

JohnIdol
  • 48,899
  • 61
  • 158
  • 242
starrr
  • 1,013
  • 1
  • 17
  • 48
  • 4
    Why the downvote? What is wrong with this question? – Cheiron May 10 '15 at 21:23
  • 1
    @Cheiron: Hard to find bugs in code we can't see. The SO guidelines are fairly explicit on that point. – MSalters May 10 '15 at 21:26
  • 1
    @MSalters Questions don't have to include code to be on topic on SO. In fact, many of the highest ranking questions don't have code. – isarandi May 10 '15 at 22:55
  • @isarandi: I know, I follow Meta.SO. But just look at the start of the third paragraph: _I think the problem is ..._. That's why we do want code if it exists. – MSalters May 11 '15 at 07:32
  • 1
    The SO guidelines are also specific on the fact that you should comment what is wrong when downvoting a post. – Cheiron May 12 '15 at 07:41
  • The Wikipedia article on [Poisson distribution](https://en.wikipedia.org/wiki/Poisson_distribution) gives the recommendation to implement its ***probability mass function*** (not to be confused with a random variable function) in terms of the `std::lgamma` function in order to avoid the issues with large numbers. – rwong Oct 26 '15 at 23:50
  • If approximations are ok, you can use a Gaussian distribution when n is > 15 or so (or whenever n! would overflow). – E L Dec 13 '18 at 17:30

3 Answers3

3

If one only needs to generate random values from a Poisson distribution, and otherwise does not need to know its probability mass function, the most direct way is to use the predefined distribution which is part of C++11. A similar implementation can also be found in Boost.

rwong
  • 6,062
  • 1
  • 23
  • 51
MSalters
  • 173,980
  • 10
  • 155
  • 350
  • 1
    I'm not aware of any way to actually extract the value of the pmf from a `std::poisson_distribution`. Care to elaborate? – T.C. May 10 '15 at 23:45
  • @T.C. : I think you may have a point - there's no defined mapping of the underlying uniform range to the poisson distributed range. I'm going to leave the answer here in case someone does know how to extract a PMF from a given C++ distribution. – MSalters May 11 '15 at 20:56
  • I don't think it's possible in the general case. In many cases, you don't need to compute the PMF/PDF/CDF in order to generate the number. For instance, `normal_distribution` might use the Box-Muller transform. – T.C. May 12 '15 at 22:54
3

Try a little maths

 (lambda^n))/n!

Is that not

 (lambda/n) * (lambda/(n-1) * ...

and those numbers will be manageable with doubles instead of computing very large nunbers

Ed Heal
  • 59,252
  • 17
  • 87
  • 127
  • This is a poor implementation. Evaluating `lambda^n` is O(log N), evaluating this is O(N). The proper way to implement this (in C++98 before `` distributions) is via [Stirling's approximation](http://en.wikipedia.org/wiki/Stirling's_approximation) which is also O(log N). (An even better approach first calculates `log(( (e^-lambda)*(lambda^n))/n!)` as that is quite trivial) – MSalters May 10 '15 at 21:38
  • It is accurate and not an approximation – Ed Heal May 10 '15 at 21:40
  • It will be an approximation within three terms; computers are binary so `lambda/n` or `lambda/n-1` or `lambda/n-2` is trying to divide by a factor of three. – MSalters May 10 '15 at 21:42
  • Is O(n) when n=30 really all that bad in this case? This answer is simple. The only change I might make is to do this: double total = pow(lamda, n); for (int i = 2; i <= n; i++) { total /= i; } – E L Dec 13 '18 at 17:22
3

One workaround to deal with large n's is calculating the distribution in the log domain:

X = ((e^-lambda)*(lambda^n))/n!
ln X = -lambda + n*ln(lambda) - Sum (ln(n))
return e^X
Paulo Mendes
  • 697
  • 5
  • 16