Suppose that in the M-step of the EM algorithm the update for some parameter has a matrix in the denominator (that is, a matrix that must be inverted), and that matrix is not invertible, so we use its pseudo-inverse instead. Would the log-likelihood still be guaranteed to increase at every iteration?

I can't give a specific case, because I made this question up. If you really need one, take the EM algorithm on the Wikipedia page, in the filtering and smoothing part: suppose the denominators are matrices and their sum is not invertible. What happens to the log-likelihood then? Does it still always increase?

Alex Riley
Lazar

1 Answer

For any particular case, I suggest that you work through the proof of the EM algorithm, such as https://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm#Proof_of_correctness, in that setting. In general, I would expect that as long as your M step increases the value it is maximising (the expected complete-data log likelihood under the current parameters), the EM pass as a whole will not decrease the log likelihood, even if the M step doesn't, for example, find the absolute maximum at each pass.
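As a rough illustration only, here is a minimal sketch of an EM loop whose M step uses a pseudo-inverse. Everything in it is an assumption made for the example rather than the model from the question: a two-component mixture of linear regressions with a fixed noise variance, a design matrix with a duplicated column so that the weighted normal-equation matrix is singular, and the hypothetical name `em_mixture_regression`. In this least-squares setting the pseudo-inverse returns the minimum-norm solution of the normal equations, which is still a maximiser of the M-step objective, so the recorded log likelihood should not go down.

```python
import numpy as np

def em_mixture_regression(X, y, n_iter=30, sigma2=1.0, seed=0):
    """EM for a 2-component mixture of linear regressions (noise variance fixed).

    The M step solves the weighted normal equations with a pseudo-inverse,
    so it still works when X.T @ R @ X is singular.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    K = 2
    beta = rng.normal(size=(K, d))   # regression coefficients per component
    pi = np.full(K, 1.0 / K)         # mixing weights
    history = []
    for _ in range(n_iter):
        # E step: log(pi_k * N(y_i | x_i . beta_k, sigma2)) for every i, k
        resid = y[:, None] - X @ beta.T
        log_joint = (np.log(pi)
                     - 0.5 * np.log(2 * np.pi * sigma2)
                     - 0.5 * resid**2 / sigma2)
        log_norm = np.logaddexp.reduce(log_joint, axis=1)
        history.append(log_norm.sum())          # log likelihood at current parameters
        resp = np.exp(log_joint - log_norm[:, None])
        # M step: closed-form weights, pseudo-inverse for the coefficients
        pi = resp.mean(axis=0)
        for k in range(K):
            A = X.T @ (resp[:, k, None] * X)    # singular when X is rank-deficient
            b = X.T @ (resp[:, k] * y)
            beta[k] = np.linalg.pinv(A, rcond=1e-10) @ b
    return beta, pi, np.array(history)

# Synthetic data (assumed for the demo): the last two columns of X are identical.
rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x, x])
z = rng.integers(0, 2, size=n)
y = np.where(z == 0, 1.0 + 2.0 * x, -1.0 - 2.0 * x) + rng.normal(size=n)

beta, pi, ll = em_mixture_regression(X, y)
print("rank of X:", np.linalg.matrix_rank(X))
print("smallest step in log likelihood:", np.diff(ll).min())
```

This says nothing about M steps where the pseudo-inverse update is not a maximiser of the expected log likelihood. For those you would have to check directly that each update does not lower that objective.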

I would still worry that this non-invertible matrix means you have entered some special region of the solution set, though. The Expectation step works out the expected log likelihood under the current parameters, so some special parameter values, especially zero, mean that all of the possibilities considered in the Maximization step share those special values. Sometimes, once a parameter goes to zero, the EM algorithm can never change its mind and move that parameter away from zero. So it might be the case that once you get a non-invertible matrix, all further EM steps from that position will also have non-invertible matrices, in which case you might find that the EM algorithm gets stuck in a local optimum very quickly, before it has done much optimising.
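The "can never move away from zero" behaviour is easy to see in a toy case. The sketch below is an assumed example, not the model from the question: a one-dimensional Gaussian mixture with fixed means and unit variances, where EM updates only the mixing weights (the function name `em_mixing_weights` is hypothetical). A weight that starts at exactly zero gets zero responsibility in every E step, so the M step keeps returning zero for it.

```python
import numpy as np

def em_mixing_weights(x, means, pi, n_iter=50):
    """EM updates for the mixing weights of a 1-D Gaussian mixture.

    Means are fixed and all variances are 1; only the weights are re-estimated.
    """
    pi = np.asarray(pi, dtype=float)
    means = np.asarray(means, dtype=float)
    # Component densities N(x_i | mean_k, 1), shape (n, K); they never change.
    dens = np.exp(-0.5 * (x[:, None] - means[None, :]) ** 2) / np.sqrt(2 * np.pi)
    for _ in range(n_iter):
        joint = pi * dens                                 # E step: pi_k * density
        resp = joint / joint.sum(axis=1, keepdims=True)   # responsibilities
        pi = resp.mean(axis=0)                            # M step: average responsibility
    return pi

# Synthetic data (assumed for the demo): three equally sized clusters.
rng = np.random.default_rng(0)
means = [-4.0, 0.0, 4.0]
x = np.concatenate([rng.normal(m, 1.0, size=100) for m in means])

stuck = em_mixing_weights(x, means, [0.5, 0.5, 0.0])   # third weight starts at zero
free = em_mixing_weights(x, means, [0.4, 0.4, 0.2])    # third weight starts positive
print("started at zero: ", stuck)
print("started positive:", free)
```

In the first run the third weight stays at exactly zero even though a third of the data sits near that component's mean. In the second run the same weight is free to move towards the value the data supports.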

mcdowella
  • Thanks for your reply. Yes, you are right, the design matrices are sparse. I understand that it may get stuck in a local optimum rapidly, but the likelihood is still increasing, whether quickly or not, isn't it? My point is more this: if I use the pseudo-inverse in the maximization equation, or a greedy approach such as finding each row that maximizes the likelihood and then combining the rows into a matrix, would that still lead to the ML estimate? I suppose not. So these are not the ML estimate, though they may be close to it. Would that lead to a likelihood that does not increase? – Lazar Feb 13 '17 at 13:52