0

I have been trying to get the frequency of each ID per day over a period of time. I have the following dataframe:

data1 = pd.DataFrame({
'Date_Time': [
'2010-01-01', '2010-01-01', 
'2010-04-02', '2010-04-01',
'2011-01-01', '2011-01-01', 
'2013-01-01', '2014-01-01', 
'2014-01-01', '2015-01-01', 
'2016-01-01', '2011-01-01'], 
'ID': [1, 1, 1, 1, 2, 2, 3, 4, 4, 5, 6, 6]
})

So I would like to get the frequency of each ID per day given that there are many days in which the same ID exists. I tried the following approach which worked partly and am still strugling with getting the it right. Here is the code which I have used:

for dt in set(data1['Date_Time']):  
  for id in df['ID']:
    length = len(data1[data1['Date_Time']==dt])
  data1.loc[data1['Date_Time']==dt, 'new'] = length

The final result should be looking something like this

Here are the assumed results

Shaido
  • 27,497
  • 23
  • 70
  • 73
Dave Will
  • 95
  • 7

1 Answers1

0

You can use df.groupby() with transform:

In [94]: data1['ID_freq_per_day'] = data1.groupby(['Date_Time', 'ID'])['ID'].transform('size')

In [95]: data1
Out[95]: 
     Date_Time  ID  ID_freq_per_day
0   2010-01-01   1                2
1   2010-01-01   1                2
2   2010-04-02   1                1
3   2010-04-01   1                1
4   2011-01-01   2                2
5   2011-01-01   2                2
6   2013-01-01   3                1
7   2014-01-01   4                2
8   2014-01-01   4                2
9   2015-01-01   5                1
10  2016-01-01   6                1
11  2011-01-01   6                1
Mayank Porwal
  • 33,470
  • 8
  • 37
  • 58