I have two arrays in Python with random numbers:

import numpy as np

vn = np.random.normal(20, 5, 500)
vu = np.random.uniform(17, 25, 500)

I'm trying to create a frequency table with pandas that counts the occurrences in each range, but I have no idea how to do it. An example input and output would look like:

input:

vn: [2,3,6,6,7,8,9,9,10,7]
vu: [1,1,2,3,6,7,7,7,8,9]

output:

Range     count_vn     count_vu
(0, 5]        2            4
(5, 10]       8            6
Christopher Moore
Colanta

2 Answers

IIUC:

In [228]: df.apply(lambda x: pd.cut(x, bins=[0,5,10]).value_counts()).add_prefix('count_')
Out[228]:
         count_vn  count_vu
(5, 10]         8         6
(0, 5]          2         4

or a nicer solution provided by @ayhan:

In [26]: df.apply(pd.Series.value_counts, bins=[0,5,10])
Out[26]:
               vn  vu
(5.0, 10.0]     8   6
(-0.001, 5.0]   2   4

somehow it produced "strange" bins...
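For completeness, here is a self-contained sketch of both variants, assuming `df` is built from the example lists in the question (the answer above does not show how `df` was created). As far as I know, the "strange" left edge appears because `value_counts(bins=...)` bins with `include_lowest=True`, which nudges the first edge slightly below 0 so that a value of exactly 0 would still be counted:

```python
import pandas as pd

# Assumption: df holds the example data from the question.
vn = [2, 3, 6, 6, 7, 8, 9, 9, 10, 7]
vu = [1, 1, 2, 3, 6, 7, 7, 7, 8, 9]
df = pd.DataFrame({'vn': vn, 'vu': vu})

bins = [0, 5, 10]

# Variant 1: cut each column into right-closed intervals, then count.
# sort_index() keeps the rows in bin order rather than count order.
counts = df.apply(
    lambda col: pd.cut(col, bins=bins).value_counts().sort_index()
).add_prefix('count_')

# Variant 2: let value_counts do the binning itself.
counts_vc = df.apply(pd.Series.value_counts, bins=bins).sort_index()

# Passing include_lowest=True to pd.cut gives the same counts as variant 2.
counts_lowest = pd.cut(df['vn'], bins=bins, include_lowest=True).value_counts().sort_index()

print(counts)
print(counts_vc)
```

If the exact `(0, 5]` labels matter, the `pd.cut` variant is probably the safer choice.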

MaxU - stand with Ukraine

You can try pd.cut with groupby and then concat the dataframes.

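import numpy as np
import pandas as pd

vn = [2, 3, 6, 6, 7, 8, 9, 9, 10, 7]
vu = [1, 1, 2, 3, 6, 7, 7, 7, 8, 9]
df = pd.DataFrame({'vn': vn, 'vu': vu})
# Bin edges 0, 5, 10, ... up to the largest value in either column
bins = np.arange(0, df.stack().max() + 1, 5)
# No backslash is needed for line continuation inside the brackets
pd.concat([df.groupby(pd.cut(df.vn, bins=bins)).vn.count(),
           df.groupby(pd.cut(df.vu, bins=bins)).vu.count()], axis=1)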

You get

         vn  vu
(0, 5]    2   4
(5, 10]   8   6

There might be a way to do it directly without concat, but I couldn't come up with one.
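One possible way to skip the concat is to apply `pd.cut` column by column. The sketch below tries this on random data like the arrays in the question. The seed, the bin width of 5, and the names `rng`, `width`, `lo` and `hi` are my own assumptions for illustration, not something from the question:

```python
import numpy as np
import pandas as pd

# Assumption: a seeded generator so the run is repeatable.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'vn': rng.normal(20, 5, 500),
    'vu': rng.uniform(17, 25, 500),
})

# Assumption: bins of width 5 that cover the full range of both columns.
width = 5
lo = np.floor(df.min().min() / width) * width
hi = np.ceil(df.max().max() / width) * width
bins = np.arange(lo, hi + width, width)

# Cut each column with the shared bins and count per bin, no concat needed.
# include_lowest=True guards against a value sitting exactly on the first edge.
counts = df.apply(
    lambda col: pd.cut(col, bins=bins, include_lowest=True).value_counts().sort_index()
).add_prefix('count_')

print(counts)
```

Because both columns share the same bin edges, the per-column counts line up on one index, and each column should sum to 500.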

Vaishali