2

I want to resample a dataframe by date and creating a column based on the aggregate sum of city df:

+-----------------+-------------------+------------+
| booking_date    |  Cities           |  province  | 
+-----------------+-------------------+------------+
|  15-12-17       |  Kota Depok       | Jawa Barat |    
|  15-12-17       |  Bogor            | Jawa Barat |      
|  15-12-17       |  Kota Depok       | Jawa Barat |     
|  15-12-17       |  Kota Bandung     | Jawa Barat |    
|  15-12-17       |  Kota Bandung     | Jawa Barat |   
+-----------------+-------------------+------------+

The output may looks like this:

df:

+-----------------+-------------------+------------+------------+
| booking_date    |  Cities           |  province  |  Count     | 
+-----------------+-------------------+------------+------------+
|  15-12-17       |  Kota Depok       | Jawa Barat |  2         |
|  15-12-17       |  Bogor            | Jawa Barat |  1         |
|  15-12-17       |  Kota Bandung     | Jawa Barat |  2         | 
+-----------------+-------------------+------------+------------+

How to achieve it?

ebuzz168
  • 1,134
  • 2
  • 17
  • 39

2 Answers2

2

Use GroupBy.size with Series.reset_index and name parameter:

df = df.groupby(['booking_date','Cities','province']).size().reset_index(name='Count')
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • dupe of [this](https://stackoverflow.com/questions/17945247/pandas-groupby-get-size-of-a-group-knowing-its-id-from-grouper-group-info0)? – Erfan Apr 10 '20 at 11:59
1

The first solution that came to my mind was the same as @jezrael's one. However, another could be the one combining pandas.DataFrame.assign(), pandas.Series.map(), pandas.Series.value_counts() and pandas.DataFrame.drop_duplicates().

The code is following.

>>> df = df\
...     .assign(Count = df['Cities'].map(df['Cities'].value_counts()))\
...     .drop_duplicates()
>>> print(df)
  booking_date        Cities    province  Count
0     15-12-17    Kota Depok  Jawa Barat      2
1     15-12-17         Bogor  Jawa Barat      1
3     15-12-17  Kota Bandung  Jawa Barat      2
Jaroslav Bezděk
  • 6,967
  • 6
  • 29
  • 46