0

I'm writing a function for Gaussian mixture models with spherical covariance structures--ie $\Sigma_k = \sigma_k^2 I$. This particular function is similar to the mclust package with identifier VII.

http://en.wikipedia.org/wiki/Mixture_model

Anyways, the problem I'm having is running into infinite values for the weight matrix. Definition: Let W be an n x m matrix where n = 1, ..., n (number of obs) and m = 1, ..., m (number of mixtues). Each element of W (ie w_ij) can essentially be defined as a specific form of:

w_im = \frac{a / b * exp(c)}{\sum_i=1^m [a_i / b_i * exp(c_i)]}

Computing this numerically is giving me infinite values. So I'm trying to use the log-identity log(x+y) = log(x) + log(1 + y/x). But the issue is that it's not as simple as log(x+y) but rather log(\sum_i=1^m [a_i / b_i * exp(c_i)]).

Here's some code define:

  1. n_im = a / b * exp(c) ;
  2. d_.m = \sum_i=1^m [a_i / b_i * exp(c_i)] ; and
  3. c_mat[i,j] as the value of the exponent for the [i,j]th term.

    n_mat[, i] <- log(a[i]) - log(b[i]) - c[,i] # numerator of w_im
    internal_vec1[i] <- (a[i] * b[1])/ (a[1] * b[i]) # an internal for the step below
    c_mat2 <- cbind(rep(1, n), c_mat[,1] - c_mat[,-1]) # since e^a / e^b = e^(a-b)
    for (i in 1:n) {
          d_vec[i] <- n_mat[i,1] + log(sum(internal_vec1 * exp(c_mat2[i,))) 
    } ## still getting infinite values
    

I'm trying to define the problem as briefly as possible. the entire function is obviously much larger than this. But, since the problem I'm running into is specifically dealing with infinite (and 1/infinity) values, I'm hoping this snippet is sufficient. Anyone with a coding trick here?

Carl Witthoft
  • 20,573
  • 9
  • 43
  • 73
alexwhitworth
  • 4,839
  • 5
  • 32
  • 59
  • 2
    Post the R code (rather than pseudo-Latex) AND full error message. And do not forget that log(0) will throw an error. – IRTFM May 27 '14 at 05:46
  • I'm not having a problem with the rest of the code, just this one segment. I think the full code would just complicate things.... that said, I can post it if no one has any ideas otherwise. There is no error message. Instead, I'm getting convergence to equivalent clusters instead of to multiple different clusters. – alexwhitworth May 27 '14 at 15:52
  • Debugging 101 rules would be: make a small test case where you know what the correct answer would be and then step through it to see where it goes wrong. At least for my understanding of the SO rules of engagement, this is still insufficiently expressed at the hard code level and perhaps something that needs to be posted at a Math or Statistics forum. – IRTFM May 27 '14 at 20:56

1 Answers1

0

Here is the solution!! (I've spent way too damn long on this)

**The first function log_plus() solves the simple problem where you want log(\sum_{i=1)^n x_i)

**The second function log_plus2() solves the more complicated problem described above where you want log(\sum_{i=1}^n [a_i / b_i * exp(c_i)])

log_plus <- function(xvec) {
  m <- length(xvec)
  x <- log(xvec[1])
  for (j in 2:m) {
    sum_j <- sum(xvec[1:j-1])
    x <- x + log(1 + xvec[j]/sum_j)
  }
  return(x)
}

log_plus2 <- function(a, b, c) {
  # assumes intended input of form sum(a/b * e^c)
  if ((length(a) != length(b)) || (length(a) != length(c))) {
    stop("Input equal length vectors")
  }
  if (!(all(c > 0) || all(c < 0))) {
    stop("All values of c must be either > 0 or  < 0.")
  }
  m <- length(a)
  # initilialize log sum
  x <- log(a[1]) - log(b[1]) + c[1]
  # aggregate / loop log sum
  for (j in 2:m) { 
    # build denominator
    b2 <- b[1:j-1]
    for (i in 1:j-1) {
      d1 <- 0
      c2 <- c[1:i]
      if (all(c2 > 0)) {
        c_min <- min(c2[1:j-1])
        c2 <- c2 - c_min 
      } else if (all(c2 < 0)) {
        c_min <- max(c2[1:j-1])
        c2 <- c2 - c_min
      }
      d1 <- d1 + a[i] * prod(b2[-i]) * exp(c2[i])
    }
    den <- b[j] * (d1)
    num <- a[j] * prod(b[1:j-1]) * exp(c[j] - c_min)
    x <- x + log(1 + num / den)
  }
  return(x)
}
alexwhitworth
  • 4,839
  • 5
  • 32
  • 59