How to fit a finite mixture of Dirichlet distributions

Question

I have a compositional sample and I would like to fit a finite mixture of Dirichlet distributions. To be more precise, consider the following example:

library(gtools)
set.seed(1)
PROB = c(0.25, 0.15, 0.60)
ALPHA = list(
  c(1,1,1),
  c(2,1,1),
  c(1,1,20)
)
size = 500

N = sapply(1:3, function(i, z) sum(z == i),
           sample(1:3, size, prob = PROB, replace = TRUE))

X = do.call('rbind', 
            sapply(1:3, function(i, N) 
              rdirichlet(N[i], ALPHA[[i]]), N))[sample(1:size),]

X contains sample generated from a mixture of Dirichlet distributions defined in the 3-part simplex. The first Dirichlet component of this mixture has parameter (1,1,1), the second component has parameter (2,1,1) and the third (1,1,20). The mixture probabilities are 0.25, 0.15, 0.60. I would like to retrieve these parameters from the sample.

How would you find this parameters?

What about starting with, say, `nls` and see if you'll get in troubles? — Severin Pappadeux, May 24 '16 at 22:46
I've never used function `nls`. For what I've read, it is used to fit a model to a response variable. Am I wrong? If so, what is the response variable in my sample? — marc1s, May 25 '16 at 07:55
You have to start somewhere. If you have a sample, make a histogram and convert sample to approximation of PDF. Or use Q-Q. Or some likelihood. Take a look at https://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf — Severin Pappadeux, May 26 '16 at 00:32
But Dirichlet is a multivariate distribution, it seems that the examples available in this link are univariate examples. — marc1s, May 26 '16 at 12:34
Ok, I see. I found one link (never used it myself, cannot say anything wrt quality/applicability): Dirichlet Regression Data/Model, https://cran.r-project.org/web/packages/DirichletReg/DirichletReg.pdf — Severin Pappadeux, May 26 '16 at 18:56
Thanks Severin I didn't know this package and I found it very interesting. But the package is intended to fit a regression model where the outcome follows a Dirichlet distribution. In my case, instead of fitting a regression model (where the labels of the components are known) I would like to fit a mixture model (where the labels of the components are hidden) ...I don't know if I explain myself — marc1s, May 27 '16 at 06:39
Yeah, I understand what you'r saying wrt mixture model. Have no package/code handy, sorry. You might want to take a look at DirichletReg and grab the code and build something with that. Good luck... — Severin Pappadeux, May 27 '16 at 17:08
@Marc1s - If I understand you correctly, you have a data set, let us call it `X1`, the data was generated from a mixture of Dirichlet distributions defined in the 3-part simplex. Thus, your goal is to reverse engineer the probabilities from the data you have been given. i.e. From you example you wish to determine `PROB = c(0.25, 0.15, 0.60)` for your actual dataset. Correct? — Technophobe01, Jun 04 '16 at 01:58
Yes @Technophobe01! And also to recover the parameters from the Dirichlet distributions: `c(1,1,1)` for the first one, `c(2,1,1)` for the second and `c(1,1,20)` for the third. — marc1s, Jun 04 '16 at 08:21

Jarle Tufto · Accepted Answer · 2016-06-05T22:02:10.593

Reparameterizing in terms of theta1=log(p1/p3), theta2=log(p2/p3) and logs of all 9 alpha parameters, and then maximising the log likelihood using optim() with method="BFGS" seems to work if using initial values sufficiently close to the parameter values used to simulate the data. At least, all eigenvalues of the Hessian are negative, and small changes in initial values leads to the the same optimum.

repar <- function(theta) {
  p <- exp(theta[1])
  p[2] <- exp(theta[2])
  p[3] <- 1
  p <- p/sum(p)
  alpha <- matrix(exp(theta[3:11]),3,3,byrow=TRUE)
  list(p=p,alpha=alpha)
}
logL <- function(theta,x) {
  par <- repar(theta)
  p <- par$p
  alpha <- par$alpha
  terms <- 0
  for (i in 1:length(p)) {
    terms <- terms + p[i]*ddirichlet(x,alpha[i,])
  }
  -sum(log(terms))
}
start <- c(log(c(.25,.15)/.6), log(c(1,1,1, 2,1,1, 1,1,20)))
fit <- optim(start,logL,x=X,hessian=TRUE,method="BFGS")
repar(fit$par)
eigen(fit$hessian)$val
fit2 <- optim(start+rnorm(11,sd=.2),logL,x=X,hessian=TRUE,method="BFGS")
repar(fit2$par)

It seems to work well, perfect. I didn't use `optim` function before. Nice and very useful function! Thank you! — marc1s, Jun 06 '16 at 21:29
Ok, that's good. I suspect that this sort of likellihood function may have several maxima, however, beyond the 3!=6 maxima corresponding to just interchanging the components of the mixture. So to reassure yourself that you have found the global maximum you may want to run the optimization from a large set of random initial starting values. — Jarle Tufto, Jun 07 '16 at 06:35

How to fit a finite mixture of Dirichlet distributions

1 Answers1