I have the DataFrame below. I want to add a column that stores, for each row, how many times that row's date occurs in the DataFrame. Thanks.

import pandas as pd
import numpy as np
df2 = pd.DataFrame({
    'date': [20130101,20130101, 20130105, 20130105, 20130107, 20130108],
    'price': [25, 16.3, 23.5, 27, 40, 8],
})
jpp
Hong

2 Answers

Try:

df2['Occur'] = df2.groupby('date')['date'].transform(pd.Series.value_counts)
print(df2)

Or:

df2['Occur'] = df2['date'].apply(df2['date'].tolist().count)
print(df2)

Both produce:

       date  price  Occur
0  20130101   25.0      2
1  20130101   16.3      2
2  20130105   23.5      2
3  20130105   27.0      2
4  20130107   40.0      1
5  20130108    8.0      1
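
The second variant rescans the whole list once per row, so its cost grows quadratically with the number of rows. A minimal sketch of a single-pass lookup that should give the same column, assuming the same `df2` as in the question. Note that `Series.map` combined with `value_counts()` is a swapped-in technique, not one of the two snippets above:

```python
import pandas as pd

df2 = pd.DataFrame({
    'date': [20130101, 20130101, 20130105, 20130105, 20130107, 20130108],
    'price': [25, 16.3, 23.5, 27, 40, 8],
})

# value_counts() builds a date -> count Series once;
# map() then looks up each row's date in that Series.
counts = df2['date'].value_counts()
df2['Occur'] = df2['date'].map(counts)

print(df2)
```

Because the counts are computed once and then looked up, this avoids calling `list.count` for every row.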
U13-Forward

Using GroupBy + transform with size:

df2['date_count'] = df2.groupby('date')['date'].transform('size')

print(df2)

       date  price  date_count
0  20130101   25.0           2
1  20130101   16.3           2
2  20130105   23.5           2
3  20130105   27.0           2
4  20130107   40.0           1
5  20130108    8.0           1
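
To see why `transform` is needed rather than a plain aggregation: `groupby(...).size()` returns one entry per distinct date, while `transform('size')` broadcasts each group's size back onto the original rows, so the result lines up with the DataFrame's index and can be assigned as a column. A small sketch on the question's data (the names `per_group` and `per_row` are illustrative only):

```python
import pandas as pd

df2 = pd.DataFrame({
    'date': [20130101, 20130101, 20130105, 20130105, 20130107, 20130108],
    'price': [25, 16.3, 23.5, 27, 40, 8],
})

grouped = df2.groupby('date')['date']

# Aggregation: one value per distinct date, indexed by date.
per_group = grouped.size()

# Transform: one value per original row, indexed like df2.
per_row = grouped.transform('size')

print(per_group)
print(per_row)
```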
jpp