Count frequency based on two columns without group by

Question

I have a dataset with 3 columns: Category, Country, and Count (which is always 1 - and is pretty useless, actually).

What I want to achieve is something like the yellow column here:

img 1: how I want and what I want

I could do a simple group by in Python, but that's not what I want: I want to preserve the individual rows of the data, unlike the image below, which groups them:

what I did and I don't want (group by)

I just want to get the frequency based on both columns without grouping the rows. Any ideas? I thought about iterating with for loops, but I couldn't get it to work. I'm kind of a beginner in Python, so your help is deeply appreciated.

Please always try to write out your data instead of posting an image so we don't have to work extra to help you. See [How to ask](https://stackoverflow.com/help/how-to-ask) — Juan C, Nov 20 '19 at 14:00
hmmm, groupby is used incorrectly, use [this](https://stackoverflow.com/q/37189878) — jezrael, Nov 20 '19 at 14:01

score 0 · Accepted Answer · answered Nov 20 '19 at 16:07

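It seems like you want to use transform here. Unlike a plain groupby aggregation, transform returns a result aligned with the original rows, so it creates a new column in your dataframe holding each row's group-level summary statistic without collapsing any rows.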

import pandas as pd
df = pd.DataFrame({'category_cluster' : ['Assault', 'Assault', 'Assault', 'Assault', 'Assault', 'Assault', 'Assault'],
                   'Country': ['Egypt', 'India', 'India', 'Mexico', 'Mexico', 'Mexico', 'Morocco'],
                   'Count' : [1, 1, 1, 1, 1, 1, 1]})

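# Sum 'Count' within each (category_cluster, Country) group and broadcast
# the group total back onto every row of that group.
df['new_column'] = df.groupby(['category_cluster', 'Country'])['Count'].transform('sum')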
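
Since the question notes that the Count column is always 1, here is a minimal sketch of a variant that skips it and counts rows per group with transform('size') instead. The column name frequency is just an illustrative choice, not something from the question.

```python
import pandas as pd

# Same sample data as above, but without the helper 'Count' column.
df = pd.DataFrame({
    'category_cluster': ['Assault'] * 7,
    'Country': ['Egypt', 'India', 'India', 'Mexico', 'Mexico', 'Mexico', 'Morocco'],
})

# 'size' counts the rows in each (category_cluster, Country) group;
# transform broadcasts that count back to every row of the group.
# 'frequency' is a hypothetical column name chosen for this example.
df['frequency'] = (
    df.groupby(['category_cluster', 'Country'])['Country'].transform('size')
)

print(df)
```

The dataframe still has all seven original rows; each row simply carries the number of rows that share its category and country.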
