R: Maximum Likelihood Estimation of a exponential mixture using optim

Question

I'm trying to get the parameters w, lambda_1, lambda_2 and p from a mixture bi-exponential model, using a loglikelihood function and the optim function in R. The model is the following

Here is the code

biexpLL <- function(theta, y) {
  # define parameters
  w <- theta[1]
  lambda_1 <- theta[2]
  a <- theta[3]
  lambda_2 <- theta[4]
  # likelihood function with dexp
  l <- w * dexp((y - a), rate = 1/lambda_1) + (1 - w) * dexp((y - a), rate = 1/lambda_2)
  
  - sum(log(l))
}
# Generate some fake data
w <- 0.7
n <- 500
lambda_1 <- 2
lambda_2 <- 0.2
set.seed(45)
biexp_data <- (w * rexp(n, 1/lambda_1) + (1 - w) * rexp(n, 1/lambda_2)) 
# Optimization
optim(par = c(0.5,0.1,0.001,0.2),
      fn=biexpLL,
      y=biexp_data)
#$par
#[1] -94789220.4     16582.9   -333331.7 134744336.2

The parameters are very different from the used in the fake data! What I'm doing wrong?

This may not be the direct cause of the problem, but it seems you are generating a weighted average of two exponential distributions. But usually mixture models choose one of them with the probability `w`. — Kota Mori, Aug 11 '20 at 06:47
@KotaMori thank you. I'm not sure I'm following you. Did you mean I just need to estimate `lambda_1` and `w`? But how to get `lambda_2` from that? — jealcalat, Aug 11 '20 at 07:30
No, I mean that your code for generating the fake data may not be correct. See my answer below. — Kota Mori, Aug 11 '20 at 07:35

Kota Mori · Accepted Answer · 2020-08-11T08:21:49.537

The original code is prone to warnings and errors since the parameters may go to invalid values easily. For example, we need w in [0, 1] and lambda > 0. Also, if a is larger than a data point, then the density becomes zero, hence infinite log likelihood.

The code below uses some tricks to handle these cases.

w is converted to the range [0, 1] by the logistic function
lambda are converted to positive values by the exponential function.
Added tiny value to the likelihood to deal with cases of zero likelihood.

Also, the data generation process has been changed so that samples are generated from one of the exponential distributions with the given probability w.

Finally, increased the sample size since the result was not stable with n=500.

biexpLL <- function(theta, y) {
  # define parameters
  w <- 1/(1+exp(-theta[1]))
  lambda_1 <- exp(theta[2])
  a <- theta[3]
  lambda_2 <- exp(theta[4])
  # likelihood function with dexp
  l <- w * dexp((y - a), rate = 1/lambda_1) + (1 - w) * dexp((y - a), rate = 1/lambda_2)
  - sum(log(l + 1e-9))
}
# Generate some fake data
w <- 0.7
n <- 5000
lambda_1 <- 2
lambda_2 <- 0.2
set.seed(45)
n1 <- round(n*w)
n2 <- n - n1
biexp_data <- c(rexp(n1, rate=1/lambda_1),
                rexp(n2, rate=1/lambda_2)) 
# Optimization
o <- optim(par=c(0.5,0.1,0.001,0.2),
           fn=biexpLL,
           y=biexp_data)

1/(1+exp(-o$par[1]))
exp(o$par[2])
o$par[3]
exp(o$par[4])

On my environment I obtained the below.
The result seems reasonably close to the simulation parameters (note that two lambda values are swapped).

> 1/(1+exp(-o$par[1]))
[1] 0.3458264
> exp(o$par[2])
[1] 0.1877655
> o$par[3]
[1] 3.738172e-05
> exp(o$par[4])
[1] 2.231844

Notice that for mixture models of this kind, people often use the EM algorithm for optimizing the likelihood instead of the direct optimization as this. You may want to have a look at it as well.

score 0 · Answer 2 · answered Oct 08 '22 at 01:11

I have been able to get the parameters with the R package DEoptim :

library(DEoptim)

biexpLL <- function(theta, y) 
{
  w <- theta[1]
  lambda_1 <- theta[2]
  lambda_2 <- theta[3]
  l <- w * dexp(y, rate = 1 / lambda_1) + (1 - w) * dexp(y, rate = 1 / lambda_2)
  log_Lik <- -sum(log(l))
  
  if(is.infinite(log_Lik))
  {
    return(10 ^ 30)
    
  }else
  {
    return(log_Lik)
  }
}

w <- 0.7
n <- 500
lambda_1 <- 2
lambda_2 <- 0.2
set.seed(45)
indicator <- rbinom(n = 500, size = 1, prob = w)
biexp_data <- (indicator * rexp(n, 1 / lambda_1) + (1 - indicator) * rexp(n,  1 / lambda_2)) 

obj_DEoptim <- DEoptim(fn = biexpLL, lower = c(0, 0, 0), upper = c(1, 1000, 1000), control = list(itermax = 1000, parallelType = 1), y = biexp_data)
obj_DEoptim$optim$bestmem

 par1      par2      par3 
0.7079678 2.2906098 0.2026040

R: Maximum Likelihood Estimation of a exponential mixture using optim

2 Answers2