I am trying to get the most dominant (i.e. most frequent) value of a column, so I tried the following code:
df['currency'].value_counts(normalize=True)
which gives me, e.g.
USD 0.800000
CAD 0.100000
EUR 0.050000
GBP 0.050000
Now, the edge cases look like
USD 0.500000
CAD 0.500000
or
USD 0.333333
CAD 0.333333
CNY 0.333333
or
USD 0.400000
CAD 0.400000
CNY 0.100000
EUR 0.100000
and so on, where the highest frequency is shared by all of the values or by some of them.
I am trying to detect such edge cases. What is the best way to do that?
In other words, I am trying to find the single most dominant value in the series/column. The frequency returned by df['currency'].value_counts().max()
is not necessarily unique, since the values given by df['currency'].value_counts()
could all be the same, or several of them could share the maximum. Hence df['currency'].value_counts().idxmax()
returns only one of the tied values, which is not necessarily the only value with the highest frequency in the column.
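To make the edge case concrete, here is a minimal sketch of one possible way to detect a tie. The helper name `most_frequent` and the sample data are made up for illustration. It compares raw counts rather than normalized frequencies, so no floating-point comparison is involved.

```python
import pandas as pd


def most_frequent(series):
    """Hypothetical helper: return (values tied for the top count, is_tie)."""
    counts = series.value_counts()         # raw counts, NaN dropped by default
    top = counts[counts == counts.max()]   # every value that reaches the maximum
    return sorted(top.index), len(top) > 1


# Sample data (assumed): USD and CAD each appear twice, CNY and EUR once.
df = pd.DataFrame({"currency": ["USD", "USD", "CAD", "CAD", "CNY", "EUR"]})

values, is_tie = most_frequent(df["currency"])
print(values, is_tie)  # ['CAD', 'USD'] True
```

If only the tied values are needed, df['currency'].mode() should also return all of them, since it reports every mode rather than just the first.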