This is probably a naive question, but I am new to pandas.

I have a dataset with many date entries, some of them repeated. I want to create a new DataFrame with two columns, Date and Count.

My dataset looks like this :

Date
01/01/2020
01/10/2019
01/11/2019
01/12/2019
02/01/2020
02/10/2019
02/11/2019
02/12/2019
03/01/2020
03/10/2019
03/11/2019
03/12/2019
04/01/2020
04/10/2019
04/11/2019
04/12/2019
05/01/2020
05/10/2019
05/11/2019
05/12/2019
06/01/2020

I have tried :

import pandas as pd
count = df.groupby(['Date']).count()

But it returns only the grouped dates, not the counts.

The expected output is:

Date        Count

Since dates are repeated in the dataset, every row for a given date should carry the same count. For example, if 01/01/2020 has 5 entries, the count should be 5 in all 5 of those rows.

Can anyone help?

Thanks

vp7
  • look at transform: `df['Count']=df.groupby('Date').transform('count')` – anky Oct 17 '19 at 05:08
  • `.groupby('Date').size()` will return series with count – N.Moudgil Oct 17 '19 at 05:18
  • @anky_91 Thanks for your response, however I am getting this `Traceback (most recent call last): File "", line 1, in df1['Count']=df1.groupby('date').transform('count') KeyError: 'date'` – vp7 Oct 17 '19 at 05:21
  • It's a key error, you are using `date` as column but I think it should be `Date` caps `D` – N.Moudgil Oct 17 '19 at 05:24
  • @N.Moudgil I tried .size() but it creates the count column with NaN values. Not sure why. Can you please help? @anky_91 .transform('Count') is not working; it fails with `TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed` – vp7 Oct 17 '19 at 05:30
  • @vp7 yeah, my bad, use the column name before transform: `df.groupby('Date')['Date'].transform('count')` – anky Oct 17 '19 at 05:36
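
Putting the comments above together, here is a minimal sketch of both suggestions. The sample values are made up for illustration (the question does not show which dates repeat), and `per_date` is just a name chosen here. The NaN values seen with `.size()` are most likely an index-alignment effect: `size()` returns a Series indexed by date, while `df` has a default integer index, so assigning it directly to a column matches nothing.

```python
import pandas as pd

# Hypothetical sample data: a 'Date' column with a few repeated values.
df = pd.DataFrame({"Date": ["01/01/2020", "01/01/2020", "01/10/2019",
                            "01/01/2020", "01/10/2019", "02/01/2020"]})

# Select the column before transform: the result is a Series aligned with
# the original rows, so each row gets the size of its own Date group.
df["Count"] = df.groupby("Date")["Date"].transform("count")

# For one row per distinct date instead, take the group sizes and turn
# the date index back into a regular column.
per_date = df.groupby("Date").size().reset_index(name="Count")

print(df)
print(per_date)
```

The first form matches the expected output in the question (the count repeated on every row); the second is the compact two-column summary.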
