Python mode function gives error for real-valued vector: No unique mode; found 2 equally common values

Question

Why can't statistics.mode find the mode of a sample drawn from a normal (and therefore unimodal) distribution, when it works fine for vectors containing integers?

import numpy as np
from numpy.random import rand,randn
import statistics as st

y = randn(20)
print(st.mode(y))

This raises the following error:

StatisticsError: no unique mode; found 20 equally common values

Could you provide the code that produces an error/doesn't produce the desired result? — Andre, Nov 16 '20 at 16:26
it is what's given. the error comes up for every generated `y` — develarist, Nov 16 '20 at 16:37

Lawhatre · Answer 1 · 2020-11-16T17:56:44.533

1

That's because the sample has no mode. The number of unique elements in y equals the total number of elements in y, so every value occurs exactly once and, by definition, no mode exists.

np.size(np.unique(y)) - np.size(y)

>>> 0

That the mode doesn't exist can also be verified by looking at the histogram of the values (flat in the present case). Peaks in this graph represent modes, and since we can't find a peak, there is no mode.
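
The uniqueness argument above can be sketched with a plain frequency count. This is only an illustration: it uses the standard-library `random.gauss` as a stand-in for `randn` to avoid third-party dependencies, and the helper name `value_counts_summary` and the seed are assumptions made for the example, not part of the original answer.

```python
import random
from collections import Counter


def value_counts_summary(sample):
    """Return (number of distinct values, highest count of any single value)."""
    counts = Counter(sample)
    return len(counts), max(counts.values())


# Stand-in for randn(20): 20 draws from a standard normal distribution.
rng = random.Random(0)  # fixed seed, chosen only so the run is repeatable
y = [rng.gauss(0.0, 1.0) for _ in range(20)]

distinct, highest = value_counts_summary(y)
# When distinct == len(y) and highest == 1, every value is tied for
# "most common", which is the situation the error message describes.
print(distinct, len(y), highest)
```

When the highest count is 1, all values are equally common, so there is no single most frequent value to report.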

Edit: If you really want to find the mode, then:

1. Draw enough samples from the distribution so that the sample reflects the original pdf.
2. Adjust the precision (I have rounded to 1 decimal place). Consequently, the mode will have an error range that matches the chosen precision.

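import numpy as np
from numpy.random import randn
import statistics as st

# Draw a large sample so that the rounded values repeat often enough
y = randn(10000000)
# Round to 1 decimal place, then convert to a list for statistics.mode
st.mode(list(np.round(y, 1)))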

This gives

>>> 0.0

The following is the histogram (note that you now also get a peak at 0.0).

edited Nov 16 '20 at 17:56

answered Nov 16 '20 at 17:29

Lawhatre


A normal distribution always has a mode though so the program is wrong obviously, even the histogram. what is the work around – develarist Nov 16 '20 at 17:33
The output of the `statistics` library is 100% correct. You can't compute the mode in your case. – Lawhatre Nov 16 '20 at 17:35
Although you are sampling from a normal distribution, the sample size matters! The present sample is too small to represent the normal distribution and looks more like a uniform distribution. And that's exactly why you are getting the error from `statistics`. – Lawhatre Nov 16 '20 at 17:39
Check out the edit @develarist, this will solve your problem. – Lawhatre Nov 16 '20 at 17:49
I've raised the sample size to 2000000 before and the error is still there – develarist Nov 16 '20 at 17:58
Can you please copy the code I provided and then run it? It's working fine. – Lawhatre Nov 16 '20 at 18:01
what is the `round` for, and what if it is not included – develarist Nov 17 '20 at 13:50
Can you please change the number of decimal places used for rounding, ranging from 1 to, say, 6? For each setting, observe the peak in the histogram and also calculate the mode. You will then understand its purpose. – Lawhatre Nov 17 '20 at 15:40
Actually, as the number of decimal places grows, the histogram will gradually change from normal-looking to uniform. `round` sets the precision at which the numbers are measured. – Lawhatre Nov 17 '20 at 15:43
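
The experiment suggested in the comments above (rounding to 1 through 6 decimal places and watching the peak) could be sketched as follows. This is a dependency-free illustration under stated assumptions: `random.gauss` stands in for `randn`, the built-in `round` stands in for `np.round`, and the helper name `rounded_mode`, the seed, and the sample size of 100,000 are all choices made for the example.

```python
import random
from collections import Counter


def rounded_mode(sample, decimals):
    """Round every value, then return (most common rounded value, its count)."""
    counts = Counter(round(v, decimals) for v in sample)
    return counts.most_common(1)[0]


rng = random.Random(0)  # fixed seed, chosen only so the run is repeatable
y = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

# Coarse rounding groups many draws into the same value, so a clear peak
# appears; fine rounding leaves almost every value on its own.
results = {}
for decimals in range(1, 7):
    results[decimals] = rounded_mode(y, decimals)
    print(decimals, results[decimals])
```

The count attached to the most common value should shrink as the number of decimal places grows, which is the flattening of the histogram described in the comment.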

score 1 · Answer 2 · answered Nov 16 '20 at 17:30


randn returns a NumPy ndarray (a third-party type) rather than a built-in Python sequence such as a list. The statistics module was not written with NumPy in mind, so unexpected behaviour can occur.

A solution could be converting y to a list (i.e. st.mode(list(y))).

answered Nov 16 '20 at 17:30

honno


Completely disagree with the reasoning!! – Lawhatre Nov 16 '20 at 17:39
@Ragnar Ah so when I reproduced OP's example, I got a number which wasn't even in `y`, and so thought that was their problem (and converting to `list` solved it). If it's something else then ignore me. – honno Nov 16 '20 at 17:43
Ragnar updated his answer using your `list` idea though – develarist Nov 16 '20 at 17:59
Dunno what Ragnar is up to, but try running your example and change the last line, i.e. `print(st.mode(y))` -> `print(st.mode(list(y)))`. Also note that `y` will not have a mode or mean of exactly `0`, as it is just generating numbers that will have modes or means that _tend towards_ `0`. – honno Nov 16 '20 at 18:04
I don't understand why someone would say your idea doesn't work and takes it for their own – develarist Nov 16 '20 at 18:06
Ragnar's example isn't doing what I suggest (why `round` or change the sample size?). – honno Nov 16 '20 at 18:07
@develarist Ah so what's the specific problem? Ragnar edited your question to say you get a `StatisticsError` raised. Is that true? If so I can't help, because I can't reproduce that problem. I tried this on Python 3.8. – honno Nov 17 '20 at 14:04
the error was also in the question title. i have the same python. why don't you get the error – develarist Nov 17 '20 at 14:07
@develarist Hmm, could you print the output of `y`? – honno Nov 17 '20 at 15:38
`y` is a random number generator, so it's stochastic – develarist Nov 17 '20 at 16:14
@develarist That's the thing, I'm guessing something is wrong with your Python environment so that the "OS-level RNG" being used by `randn` is faulty. Just `print(y)` to see the contents of `y`. If you mean there's a different value for `y` every time, then yeah, I know that should be the case. – honno Nov 17 '20 at 17:39
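
One possible explanation for the mismatch in this thread, offered as an assumption rather than something either commenter confirmed: the Python documentation states that from version 3.8 `statistics.mode` returns the first mode encountered for multimodal data, whereas earlier versions raised `StatisticsError`. A minimal sketch that behaves sensibly on either side of that change is below; `random.gauss` stands in for `randn`, and the helper name `describe_mode` is made up for the example.

```python
import random
import statistics as st
import sys


def describe_mode(sample):
    """Report the mode, or the error text if the running Python refuses."""
    try:
        return "mode: {}".format(st.mode(sample))
    except st.StatisticsError as exc:
        # Reached on Python versions that reject data with tied counts.
        return "no unique mode: {}".format(exc)


rng = random.Random(0)  # fixed seed, chosen only so the run is repeatable
y = [rng.gauss(0.0, 1.0) for _ in range(20)]

print(sys.version_info[:2])
outcome = describe_mode(y)
print(outcome)
```

Printing the interpreter version next to the outcome makes it easy to compare results between two machines.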
