
For context, here is my aim before the question. I'm trying to model corners and goals for soccer match prediction. To do that, I'm trying a discretized normal distribution and, as the title suggests, a Poisson distribution. Assume I have collected the data and that results from more than 3 months ago are not relevant, so I end up with a vector like this one.

a=[6,3,12,4,7,8,6,8,9]

I know that I can use scipy to return the probability of a particular outcome, for example 6 corners, like this:

>>> scipy.stats.distributions.poisson.pmf(6, mean)

I guess I could add up the values for all smaller numbers to get the probability of 6 or fewer corners, but isn't there a way to calculate the probability for a whole range of numbers, in this case from 0 to 6?

Besides, if anyone versed in mathematics wants to share a better-suited distribution or procedure, be my guest. I know that fitting a distribution properly would need far more data than 10 matches, but that is what I have to work with. As an afterthought, I'm considering using the median instead of the mean if the mean is too skewed by outliers, so you can share your opinions on that too.

Thanks in advance

puppet

1 Answer


For a given discrete distribution, the pmf function calculates the probability of a particular value, and the cdf function calculates the probability of any value less than or equal to the given value:

>>> poisson = scipy.stats.distributions.poisson
>>> poisson.pmf(6, 11.5)
0.032543780632085614
>>> poisson.pmf([0,1,2,3,4,5,6], 11.5)
array([  1.01300936e-05,   1.16496076e-04,   6.69852439e-04,
         2.56776768e-03,   7.38233209e-03,   1.69793638e-02,
         3.25437806e-02])
>>> sum(poisson.pmf([0,1,2,3,4,5,6], 11.5))
0.060269722823413086
>>> poisson.cdf(6, 11.5)
0.060269722823413183
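Applied to the sample vector from the question, this could look like the sketch below. It assumes the sample mean of `a` is used as the Poisson rate (the sample mean is the maximum-likelihood estimate of that rate); the variable names `p_at_most_6` and `p_more_than_6` are only illustrative.

```python
import numpy as np
from scipy.stats import poisson

# Corner counts from the question's sample
a = [6, 3, 12, 4, 7, 8, 6, 8, 9]

# Assumption: use the sample mean as the Poisson rate
mean = np.mean(a)

p_at_most_6 = poisson.cdf(6, mean)   # P(X <= 6)
p_more_than_6 = poisson.sf(6, mean)  # P(X > 6), i.e. 1 - cdf

print(mean, p_at_most_6, p_more_than_6)
```

`sf` is the survival function, so it gives the "more than 6 corners" side directly without subtracting from 1 by hand.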

If you want the probability of a range of values that doesn't start with zero, you can subtract cdfs, so the probability of 3 <= X <= 6 for X a Poisson variable with mean 11.5 is:

>>> poisson.cdf(6, 11.5) - poisson.cdf(2, 11.5)
0.059473244214220844
>>> sum(poisson.pmf([3,4,5,6], 11.5))
0.059473244214220747
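The cdf subtraction above can be wrapped in a small helper. This is only a sketch: `poisson_range_prob` is a hypothetical name, not part of scipy, and it assumes integer bounds with `lo <= hi`.

```python
from scipy.stats import poisson

def poisson_range_prob(lo, hi, mu):
    """Return P(lo <= X <= hi) for X ~ Poisson(mu), with integer lo <= hi."""
    # cdf(lo - 1) is 0 when lo == 0, so the range 0..hi reduces to cdf(hi)
    return poisson.cdf(hi, mu) - poisson.cdf(lo - 1, mu)

p_3_to_6 = poisson_range_prob(3, 6, 11.5)
p_0_to_6 = poisson_range_prob(0, 6, 11.5)
print(p_3_to_6, p_0_to_6)
```

The two printed values should agree with the `3 <= X <= 6` and `X <= 6` results shown in the sessions above, up to floating-point rounding.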
K. A. Buhr