How to approximate a Poisson distribution $P(n,lambda)$

Question

I have to implement an algorithm to evaluate a sum of poisson functions, each one with multiplying constants:

Required Sum

Where C(k) are positive constants<1, cut is a cutoff because in principle the sum should take infinite numbers of k, and lambda is a number that may vary in my case from 20 to 100. I've tried a straight forward implementation in my code:

    #include<quadmath.h>
   ... //Definitions of lambda and c[k]...
    long double sum=0;
    for(int k=0;k<cutoff;++k)
    {
     sum=sum + c[k] powq(lambda,k)/tgamma(k+1)* (1.0/expq(lambda));
    }

But I am not quite satisfied. I've searched on "Numerical recipes" for a good approach to evalutation of a poisson distribution, but I didn't find anything about that. Are there better ways to do this?

Edit: to be clear, I'm looking for the most precise way to approximate the probaility of large events, given a poisson distribution, without computing awkward (lambda^k)/k! Factors!

Well, since my cutoff should be at least 4 sigma, when lambda is like 50, following my code the processor must calculate something like 50^75/75!, And I fear that this could be a source of signficiative errors... — Francesco Di Lauro, Feb 28 '17 at 10:09

StoryTeller - Unslander Monica · Accepted Answer · 2017-02-28T17:35:51.683

1

Well, a simple improvement will be to calculate by hand, and cache lambda^k and (k+1)!, since their value in the previous iteration can be used to quickly calculate the respective ones in the current iteration, with an O(1) calculation.

Also, since 1.0/exp(lambda) is a constant, you should calculate it once in advance ¹

#include<quadmath.h>
... //Definitions of lambda and c[k]...

const long double e_coeff   = 1.0 / expq(lambda);
long double inv_k_factorial = 1.0l;
long double lambda_pow_k    = 1.0l;
long double sum             = 0.0l;
for(int k=0; k < cutoff; ++k)
{
 lambda_pow_k *= lambda;
 inv_k_factorial /= (k + 1);
 sum += (c[k] * lambda_pow_k * inv_k_factorial);
} 
sum *= e_coeff;

So now the three function calls and their respective overhead are completely gone from your loop.

Now, I've attempted to use the same data types as you did when writing your question. Since your comment indicates that lambda is greater than 1.0 (no relative error growth due to a quickly diminishing lambda_pow_k, I believe) Any significance lost here would depend on the limits of long double, which is either good or bad, depending on your concrete needs.

Compilers are clever nowadays. So it could be optimized like that any way, but I think it's best to leave less obvious optimizations to the optimizer. Your code shouldn't suffer in performance even when handed to a non-optimizing compiler.

edited Feb 28 '17 at 17:35

answered Feb 28 '17 at 10:05

StoryTeller - Unslander Monica

165,132
21
377
458

In the comments, he stated that he fears possible inaccuracies arising through the big values. Could you address this. I don't think this should be an issue with long double, but it would expand your answer. Well done nonetheless.+1 – Kami Kaze Feb 28 '17 at 10:18
@KamiKaze - Added a paragraph. But I could have some wrong assumptions there, so any corrections will be appreciated. – StoryTeller - Unslander Monica Feb 28 '17 at 10:23
Yes, that's very clever! Is this the more efficient way to implement a summation of poisson distributions? – Francesco Di Lauro Feb 28 '17 at 10:28
@FrancescoDiLauro - I can't tell you for sure. This seems to me like it should provide some improvement in terms of execution speed (the usual trick of trading some memory for time). You'd have to verify the results have as much significance as your original loop. – StoryTeller - Unslander Monica Feb 28 '17 at 10:31
With lambda certainly bigger than 1 (you said 50),I would think as long as x! and lambda^x don't hit the bounds of the possible number space of the variable type you should be fine there. Multiplication and division are quite accurate the "problem" comes in the addition of the values the first and last values might be "overshadowed" by your middleground(50+-(~25)) as these values are very large and the numbers at the ends are very small in comparison they might be lost in the addition. Then again it won't matter much to the final sum as these numbers are also finite and decreasing. @FrancescoD – Kami Kaze Feb 28 '17 at 12:49
@Kami Kaze the problem is that 50^75 will hit the bounds of a long int, so I am searching for some good approximation. The problem is that with my numbers I can't use a normal approximation nor a binomial one... I am searching for the more precise algo possible... – Francesco Di Lauro Feb 28 '17 at 16:38
@FrancescoDiLauro - Well, then flat out avoiding the computation of `(k+1)!` and calculating each term incrementally is far less likely overflow a long integer. You should be in the clear regarding that. – StoryTeller - Unslander Monica Feb 28 '17 at 16:40
Yes, so this method is both faster and more precise? – Francesco Di Lauro Feb 28 '17 at 16:56
@FrancescoDiLauro - Almost. You can improve numerical stabillity by keeping a running inverse of the factorial instead. That way you won't overflow a long – StoryTeller - Unslander Monica Feb 28 '17 at 17:31
@FrancescoDiLauro 50^75 is something*10^127 double can hold up to something*10^308 so you have plenty at least 100 more iterations on the power. But keeping a running value that gets multiplied by lambda/k should make this a non issue – Kami Kaze Mar 01 '17 at 07:56

score 0 · Answer 2 · answered Feb 28 '17 at 15:47

0

Since the Poisson probabilities obey the simple recurrence

P(k,lam) = lam/k * P(k-1,lam)

one possibility would be to use something like Horner's rule for polynomials. That is:

Sum{k|C[k]*P(k,lam)} = exp(-lam)*(C[0]+(lam/1)*(C[1]+(lam/2)*(..))..)

or

P = c[cut]
for k=cut-1 .. 0
  P = P*lam/(k+1) + C[k]
P *= exp(-lam)

answered Feb 28 '17 at 15:47

dmuir

4,211
2
14
12

How is it different than my answer? Besides dropping Horner's name, which is nice. – StoryTeller - Unslander Monica Feb 28 '17 at 16:29
Two questions: Why horner rule is better than the recurrence rule? Is this algo more precise than the straight forward implementation of poisson distribution, following the definition? – Francesco Di Lauro Feb 28 '17 at 16:30
In general Horner's rule is both more efficient and more accurate than other methods. It's worth looking up. – dmuir Feb 28 '17 at 16:57
@StoryTeller Your code does 3 multiplications per loop iteration (though you could reduce that to 2 by combining inv_k_factorial and lambda_pow_j); mine does 1 – dmuir Mar 01 '17 at 09:28
@dmuir, just to be sure, if my sum starts at k=1, the loop just stops when k reaches 1, right? so it's something like for(k=cutoff-1; k>1; k--) – Francesco Di Lauro Mar 01 '17 at 10:56
@FrancescoDiLauro No, k should get sown to 0 (so that eg c[0] is used( – dmuir Mar 01 '17 at 16:46

How to approximate a Poisson distribution $P(n,lambda)$

2 Answers2