
I have an exercise, where I am given 5 data points:

x1 = 10, x2 = 7, x3 = 1, x4 = 15, x5 = 8 generated independently.

For the first part, I am told that they follow a Poisson distribution with parameter theta, and I am asked to find the maximum likelihood estimate for theta.

I calculated the argmax over theta of ln P(x1,x2,x3,x4,x5 | theta) and got a result of

theta = 41/5 = 8.2.

For the second part I am asked the same thing, but this time I am told that they follow an Exponential distribution with parameter theta.

I did the same calculation and got a result of

theta = 5/41 ≈ 0.122.
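As a sanity check on both estimates, here is a minimal Python sketch. It assumes the Exponential is parameterized by its rate, so the density is theta * exp(-theta * x); the variable names are only illustrative.

```python
# Data points from the exercise.
data = [10, 7, 1, 15, 8]

n = len(data)
total = sum(data)

# Poisson MLE: the sample mean.
theta_poisson = total / n

# Exponential MLE (rate parameterization, an assumption): n / sum(x),
# i.e. the reciprocal of the sample mean.
theta_exponential = n / total

print(theta_poisson, theta_exponential)
```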

Now I am asked which of these two distributions (Poisson or Exponential) is more likely to have generated the 5 points (x1,x2,x3,x4,x5).

Basically I need to find out which of these two distributions has the higher probability of having generated the 5 points, based (I believe) on the theta that I calculated for each.

But I can't figure out the form of the 2 probabilities that I need to find. Is it the MAP probability, P(theta | x1,x2,x3,x4,x5)? If yes, I can use the Bayes formula to get

P(x1,x2,x3,x4,x5 | theta) * P(theta) / P(x1,x2,x3,x4,x5). But what is P(theta) and P(x1,x2,x3,x4,x5) ?

Any ideas?

Eman Yalpsid
  • Do I need to compute P(data | theta) first? – Eman Yalpsid Jan 23 '17 at 17:33
  • Sounds like a contrived [model selection](https://en.wikipedia.org/wiki/Model_selection) problem, so there are many possible approaches. Given that you've just computed the MLE, [AIC](https://en.wikipedia.org/wiki/Akaike_information_criterion) would be one of them. – Stefan Zobel Jan 23 '17 at 20:56

1 Answer

You are asked which of the two models is more probable, so you need a prior over the two distributions. Since you know nothing about them, and there are just two, let's assume each prior is 1/2. Then you have:

P(distr = x | data) = P(data | distr = x) P(distr = x) / P(data)

thus

P(distr = exp | data) > P(distr = poiss | data) <-> 
P(data | distr = exp) > P(data | distr = poiss)

and all you have to do is compare these two likelihoods, each evaluated at the MLE of theta that you have already computed.
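A minimal sketch of that comparison in Python, working with log-likelihoods for numerical convenience. It assumes the Exponential is parameterized by its rate, and the function names are hypothetical. Keep in mind that the Poisson term is a probability mass while the Exponential term is a density, so this is a rough comparison rather than a strict one.

```python
import math

data = [10, 7, 1, 15, 8]
n, total = len(data), sum(data)

def poisson_loglik(theta, xs):
    # Sum over points of log(theta^x * exp(-theta) / x!).
    # math.lgamma(x + 1) equals log(x!).
    return sum(x * math.log(theta) - theta - math.lgamma(x + 1) for x in xs)

def exponential_loglik(rate, xs):
    # Sum over points of log(rate * exp(-rate * x)).
    return sum(math.log(rate) - rate * x for x in xs)

# Evaluate each log-likelihood at its own MLE.
ll_poisson = poisson_loglik(total / n, data)
ll_exponential = exponential_loglik(n / total, data)

print(ll_poisson, ll_exponential)
print("exponential" if ll_exponential > ll_poisson else "poisson")
```

On these five points the Exponential log-likelihood comes out higher (roughly -15.5 against roughly -16.9), so under equal priors the Exponential would be preferred.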

P(data) does not matter because it is the same in both cases. P(distr = x) we assumed to be equal, so it does not matter either. In general, people modify P(distr = x) in various ways to take into account the "complexity" of the distribution (this is what criteria such as AIC do: they assume some heuristic mapping from the parametrization of the distribution to its prior probability).
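For illustration, here is a hedged sketch of the AIC version, using the standard definition AIC = 2k - 2 ln(L), where k is the number of fitted parameters and L is the maximized likelihood. The rate parameterization of the Exponential and the helper name `aic` are assumptions.

```python
import math

data = [10, 7, 1, 15, 8]
n, total = len(data), sum(data)

theta_p = total / n  # Poisson MLE (sample mean)
theta_e = n / total  # Exponential MLE (rate parameterization, an assumption)

# Maximized log-likelihoods at the respective MLEs.
ll_poisson = sum(x * math.log(theta_p) - theta_p - math.lgamma(x + 1) for x in data)
ll_exponential = sum(math.log(theta_e) - theta_e * x for x in data)

def aic(loglik, k):
    # AIC = 2k - 2 ln(L_max); the lower value is preferred.
    return 2 * k - 2 * loglik

# Both models have exactly one free parameter.
aic_poisson = aic(ll_poisson, k=1)
aic_exponential = aic(ll_exponential, k=1)

print(aic_poisson, aic_exponential)
```

Because both models have a single parameter here, the complexity penalty is identical and AIC ranks them exactly as the plain likelihood comparison does.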

lejlot