
I have:

import pandas as pd

df = pd.DataFrame({"A": [[55218],[55218],[55218],[55222]], "B": [[0],[0],[2],[1]]})

For each value in "A" (e.g. 55218), I want to count how often each value of "B" (0, 1 or 2) occurs and get the relative frequency back.

My expected output is:

df_new = pd.DataFrame({"A": [[55218],[55218],[55218],[55222]],"B": [[0], [0], [2], [1]],"Count": [[2], [2], [1], [1]], "rel_frequ": [[0.67], [0.67], [0.33], [1]] })
Mars

1 Answer


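Use GroupBy.transform with 'size' to count the rows in each (A, B) group, then divide that column by the frequency of each A value, obtained with Series.value_counts and aligned back to the rows with Series.map (this assumes the cells hold scalars, as in the output below, rather than single-element lists):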

df['Count'] = df.groupby(['A','B'])['A'].transform('size')
df['rel_frequ'] = df['Count'].div(df['A'].map(df['A'].value_counts()))
print(df)
       A  B  Count  rel_frequ
0  55218  0      2   0.666667
1  55218  0      2   0.666667
2  55218  2      1   0.333333
3  55222  1      1   1.000000
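
The DataFrame in the question stores each value as a single-element list, and lists are not hashable, so grouping on those columns directly is likely to raise a TypeError. The following is a minimal sketch of the same approach that first unwraps the lists; the name `flat` is hypothetical, and it assumes every cell holds exactly one element.

```python
import pandas as pd

# Data from the question: every cell is a single-element list.
df = pd.DataFrame({"A": [[55218], [55218], [55218], [55222]],
                   "B": [[0], [0], [2], [1]]})

# Assumption: each list has exactly one element, so take element 0
# to obtain plain scalars that groupby can hash.
flat = pd.DataFrame({
    "A": df["A"].map(lambda cell: cell[0]),
    "B": df["B"].map(lambda cell: cell[0]),
})

# Number of rows sharing the same (A, B) combination.
flat["Count"] = flat.groupby(["A", "B"])["A"].transform("size")

# Divide by the number of rows sharing the same A to get the relative frequency.
flat["rel_frequ"] = flat["Count"] / flat.groupby("A")["A"].transform("size")

print(flat)
```

Here the per-A total is computed with a second `transform("size")` instead of `value_counts` plus `map`; both give the same denominator. If the rounded values from the question are wanted, `flat["rel_frequ"].round(2)` could be applied afterwards.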
jezrael