Frequency Table by range in pandas

Question

I have two arrays in Python with random numbers:

import numpy as np

vn = np.random.normal(20, 5, 500)
vu = np.random.uniform(17, 25, 500)

I'm trying to create a frequency table with pandas that counts the occurrences by range, but I really have no idea how to do it. An example input and output would look like:

input:

vn: [2,3,6,6,7,8,9,9,10,7]
vu: [1,1,2,3,6,7,7,7,8,9]

output:

Range     count_vn     count_vu
(0, 5]        2            4
(5, 10]       8            6

MaxU - stand with Ukraine · Accepted Answer · 2017-08-22T07:26:38.863

8

IIUC:

In [228]: df.apply(lambda x: pd.cut(x, bins=[0,5,10]).value_counts()).add_prefix('count_')
Out[228]:
         count_vn  count_vu
(5, 10]         8         6
(0, 5]          2         4

or a nicer solution provided by @ayhan:

In [26]: df.apply(pd.Series.value_counts, bins=[0,5,10])
Out[26]:
               vn  vu
(5.0, 10.0]     8   6
(-0.001, 5.0]   2   4

somehow it produced "strange" bins...
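For reference, here is a minimal self-contained sketch of the first approach using the sample data from the question. The `sort_index()` call is my addition, to list the ranges in bin order rather than by count. As far as I can tell, the "strange" `(-0.001, 5.0]` bin appears because `value_counts(bins=...)` calls `pd.cut` with `include_lowest=True`, which nudges the first edge slightly to the left so that the lowest edge itself is included.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "vn": [2, 3, 6, 6, 7, 8, 9, 9, 10, 7],
    "vu": [1, 1, 2, 3, 6, 7, 7, 7, 8, 9],
})
bins = [0, 5, 10]

# pd.cut builds right-closed intervals (0, 5] and (5, 10];
# value_counts counts the rows per interval for each column.
counts = df.apply(lambda s: pd.cut(s, bins=bins).value_counts())

# sort_index() orders the rows by interval instead of by count.
counts = counts.sort_index().add_prefix("count_")
print(counts)

# Passing include_lowest=True to pd.cut widens the first interval a little,
# which should match the bins produced by value_counts(bins=...).
print(pd.cut(df["vn"], bins=bins, include_lowest=True).cat.categories)
```

If a value of exactly 0 has to be counted, the `include_lowest=True` variant is the one to use; otherwise plain `pd.cut` gives the cleaner labels.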

edited Aug 22 '17 at 07:26

answered Aug 21 '17 at 22:40


Hey, in this case the output always shows 5, as if it were counting the occurrences of the index in the range instead of the actual values in vu and vn. Thank you – Colanta Aug 21 '17 at 22:47
@Colanta, I've updated my post - is that what you want? – MaxU - stand with Ukraine Aug 21 '17 at 23:08
Ah! That's it! :) My solution looks way too convoluted next to this – Vaishali Aug 21 '17 at 23:10
@MaxU Indeed, thank you very much. I only need to find out how to order the ranges, because after (1.5, 2] comes (10, 10.5], but I've marked it as solved. Thank you very much – Colanta Aug 21 '17 at 23:18
value_counts also accepts a `bins` argument, so you can do things like `df.apply(pd.Series.value_counts, bins=[0,5,10, 15, 20])` – ayhan Aug 21 '17 at 23:37
@ayhan, it looks much nicer, thank you! Somehow it produces "strange" bins for me... – MaxU - stand with Ukraine Aug 22 '17 at 07:27
Note: Use pd.cut() when you need to segment and sort data values into bins. https://pandas.pydata.org/docs/reference/api/pandas.cut.html – masaya Jun 19 '22 at 15:17

score 1 · Answer 2 · answered Aug 21 '17 at 23:09

You can try pd.cut with groupby and then concat the dataframes.

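import numpy as np
import pandas as pd

vn = [2, 3, 6, 6, 7, 8, 9, 9, 10, 7]
vu = [1, 1, 2, 3, 6, 7, 7, 7, 8, 9]
df = pd.DataFrame({'vn': vn, 'vu': vu})
# Bin edges every 5 units up to the overall maximum: [0, 5, 10]
bins = np.arange(0, df.stack().max() + 1, 5)
# Count each column per bin, then join the two results side by side
pd.concat([df.groupby(pd.cut(df.vn, bins=bins)).vn.count(),
           df.groupby(pd.cut(df.vu, bins=bins)).vu.count()], axis=1)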

You get

         vn  vu
(0, 5]    2   4
(5, 10]   8   6

There might be a way to do it directly without concat, but I couldn't come up with one.
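One possible way to skip the concat, sketched here on random arrays like the ones in the question, is to apply `pd.cut` plus `value_counts` column by column. The bin width of 5, the seed, and the names `rng`, `lo`, `hi` and `freq` are my own assumptions for illustration, not something from the original post.

```python
import numpy as np
import pandas as pd

# Random data shaped like the question's arrays; the seed is arbitrary.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "vn": rng.normal(20, 5, 500),
    "vu": rng.uniform(17, 25, 500),
})

# Bin edges every 5 units, rounded outward to cover the whole data range.
lo = np.floor(df.min().min() / 5) * 5
hi = np.ceil(df.max().max() / 5) * 5
bins = np.arange(lo, hi + 5, 5)

# Bin each column with the same edges and count; the per-column results
# share the same interval index, so apply() lines them up without concat.
freq = df.apply(lambda s: pd.cut(s, bins=bins).value_counts()).sort_index()
print(freq)
```

Empty ranges show up with a count of 0, because the categorical result of `pd.cut` keeps every bin as a category.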
