0

I'm working on implementing a Gaussian Mixture Model (GMM) for three-way data (i.e., a set of matrices) in R. The GMM is being estimated using the Expectation-Maximization (EM) algorithm. However, I'm encountering an issue during the Expectation (E) step of the algorithm.

In the E step, I'm calculating the posterior probabilities (responsibilities) for each data matrix belonging to each component of the mixture model. These are calculated as the product of the matrix-normal density of the data matrix given the parameters of the component and the mixing weight of the component, divided by the sum of these products over all components.

The issue is that when I sum the posterior probabilities over all components for each data matrix (using colSums(comp.post)), I'm not getting a result of 1 for all data matrices, as I would expect.

Here's the relevant portion of the code:

#' Expectation Step of the EM Algorithm for Gaussian Mixture Model
#' with three-way data (a 3D array where each matrix is a data point).
#' Calculate the posterior probabilities (soft labels) that each component
#' has to each data matrix using matrix-normal distribution.
#' 
#' @param Y Three-dimensional array containing the data, where each matrix represents a data point.
#' @param M Array containing the mean matrix of each component, where each matrix represents a component.
#' @param Phi List containing the row covariance matrix of each component.
#' @param Omega List containing the column covariance matrix of each component.
#' @param alpha Vector containing the mixing weights  of each component.
#' @return Named list containing the loglik and posterior.df for each data matrix

  e_step <- function(Y, M, Phi, Omega, alpha) {
  
  # Number of components in the mixture
  n_clusters <- length(alpha)
  
  # Number of matrices in the dataset
  n_matrices <- dim(Y)[3]
  
  # Calculate the log of the matrix-normal density for each matrix in Y for each component
  # and add the log of the mixing weight of the component
  log_comp.prod <- array(dim = c(n_clusters, n_matrices))
  for (i in 1:n_clusters) {
    for (j in 1:n_matrices) {
      log_comp.prod[i, j] <- matrixNormal::dmatnorm(Y[,,j], M[,,i], Phi[,,i], Omega[,,i]) + log(alpha[i])
    }
  }
  
  # Subtract the max to avoid overflow when exponentiating
  log_comp.prod <- log_comp.prod - max(log_comp.prod)
  
  # Calculate the log of the total density for each matrix in Y
  log_sum.of.comps <- log(colSums(alpha * exp(log_comp.prod)))
  
  # Subtract the log of the total density from the log of the density for each component
  # to get the log of the posterior probabilities (responsibilities)
  log_comp.post <- log_comp.prod - log_sum.of.comps
  
  # Exponentiate to get back to the original scale
  comp.post <- exp(log_comp.post)
  
  # Calculate the log-likelihood as the sum of the log of the total densities
  loglik <- sum(log_sum.of.comps)
  
  return(list("loglik" = loglik, "posterior.df" = comp.post))
}

Here what I get:

colSums(comp.post)
 [1] 3.739240e+01 6.498986e-02 9.844537e+00 3.475485e+00 9.047767e+03 2.196267e-01
 [7] 2.079427e+01 1.165800e+02 1.405744e-01 1.353819e+01 2.433372e+00 2.051357e+00
[13] 4.472772e+00 3.597247e-02 3.210629e-01 8.761967e-01 4.396359e+01 3.265571e+02
[19] 1.247715e+02 3.616610e-02 4.361902e-01 2.035783e-02 5.585075e+01 3.328536e+00
[25] 5.880054e+00 2.166311e-01 5.388875e+02 3.931191e+01 1.642435e+00 5.129309e-01

I've tried different ways to compute the matrix-normal density, including using the matrixNormal::dmatnorm() function with log = FALSE and log = TRUE, and writing a custom function to compute the density. However, none of these approaches have resolved the issue.

I use this formula for my calculations: enter image description here

I'm not sure what's causing this issue or how to fix it. Any insights or suggestions would be greatly appreciated. Thank you!

0 Answers0