How to deal with the log of a sum of more than two super small probabilities

Question

For some evil reason I need to calculate the log of the sum of 500 super small probabilities, each term computed by

dmvnorm(X[,i], mean=rep(0,3), sigma=diag(3))

Sometimes the codes above return 0 due to underflow, but using logarithms will be fine.

dmvnorm(X[,i], mean=rep(0,3), sigma=diag(3), log=TRUE)

I know I can handle two terms mathematically: log(p1 + p2) = log(p2) + log(1 + p1/p2). But Can we generalize this approach to more terms? Anyone with more experience on this?

BTW, I wrote a recursive function to compute this. Mathematically, it works. But I don't think this is practical.

MESSY <- function (pv) 
{
  N <- length(pv)
  if (N==1)
    {return(pv[1])}
  else
    {w <- pv[N]
     pv <- pv[1:N-1]-w
     return(w + log(1 + exp(MESSY(pv))))
    }
}

Because some p are super small and I can only use w=log(p), we have log(exp(w1)+exp(w2)) = w2 + log(1+exp(w1-w2)) and log(exp(w1)+exp(w2)+exp(w3)) = w3 + log(1 + exp(w1-w3) + exp(w2-w3)) for two and three terms.

Can you show a small reproducible example. Also, if it is three terms, what would be the formula — akrun, Jan 14 '20 at 20:19
Is `log(p2) + log(1 + p1/p2)` actually more accurate than `log(p1 + p2)`? Still seems like `p1/p2` would be problematic. — Gregor Thomas, Jan 14 '20 at 20:26
This might help: https://statmodeling.stat.columbia.edu/2016/06/11/log-sum-of-exponentials/ — Kent Johnson, Jan 14 '20 at 20:51
@Gregor--reinstateMonica-- You're right. It could be problematic. But some of the probability terms are too small, so I have to do it some other ways. — Paw in Data, Jan 14 '20 at 21:46
@KentJohnson Thanks a lot! That might be the most elegant solution! I can't find the function `log_sum_exp( )` or `logSumExp( )`, though. Do you know which package they're from? — Paw in Data, Jan 14 '20 at 21:53
I think they're in C source code underlying R, but not exported ... https://github.com/wch/r-source/blob/5a156a0865362bb8381dcd69ac335f5174a4f60c/src/nmath/pgamma.c#L219-L230 . Translating to R, `logspace_add <- function(logx,logy) { max(logx,logy) + log1p(exp(-abs(logx - logy))) }` (this is for two elements, would have to think about how to generalize it ...) — Ben Bolker, Jan 15 '20 at 15:51
@akash87 It's for the computation of Minimum-Description-Length scores, if that's what you're asking? — Paw in Data, Jan 21 '20 at 20:22

Ben Bolker · Answer 1 · 2020-01-24T18:55:26.020

This function is translated from the internal logspace_add function in the R source code here

logspace_add <- function(logx,logy) { 
    pmax(logx,logy) + log1p(exp(-abs(logx - logy))) 
}

Not necessarily the most efficient, but you should be able to do this for >2 elements by using Reduce():

logspace_add_mult <- function(x) {
    Reduce(logspace_add, x)
}

Quick test (using values that are large enough not to underflow, so that we can compare the results of the regular and logspace calculations).

x <- c(1e-4,1e-5,1e-6)
logspace_add_mult(log(x))
## [1] -9.10598
log(sum(x))
## [1] -9.10598

As far as I know, this is more or less the standard approach to logspace addition. The advantage of using something other than this home-grown implementation would be (1) code maturity and testing and (2) speed (at least for the logspace_add_mult version; I doubt there would be much advantage of a native-C (or whatever) implementation of logspace_add). The Brobdingnag package uses similar representations:

library(Brobdingnag)
brob(log(x))
## [1] +exp(-9.2103) +exp(-11.513) +exp(-13.816)
sum(brob(log(x)))
## [1] +exp(-9.106)
log(as.numeric(sum(brob(log(x)))))
## [1] -9.10598

In Python, numpy has logaddexp, but this only works pairwise: you can use functools.reduce() to generalize it as above.

import numpy as np
import functools as f
x = np.array([1e-4,1e-5,1e-6])
f.reduce(np.logaddexp,np.log(x))

This is probably a little faster than Reduce() in R.

Thank you very much! This approach works, and it's so much faster. Although I got some unexpected results, so I suspect that underflow still happens for some terms only I don't get to see that. But I'm not sure. Would you recommend MATLAB or Python for such calculation? — Paw in Data, Jan 24 '20 at 14:41
I doubt there would be very much difference in the numerical stability. — Ben Bolker, Jan 24 '20 at 18:56

How to deal with the log of a sum of more than two super small probabilities

1 Answers1