1

I would like to know if there is a way I can insert values into a new column of my dataframe based on using some sort of code which is similar to the df.groupby(['Col1','Col2']).agg({'Col2':'count'}) function.

My Df looks something like this:

ID Customer Connection Product_ID Count
10 AMXX     Instant    AAB        NaN 
11 JKXX     Slow       AAB        NaN
12 LKXX     Slow       HJA        NaN
13 AMXX     Instant    AAB        NaN
14 RFXX     Slow       WRQ        NaN
15 RFXX     Instant    WRQ        NaN

df['Count'] is the new empty column I have created where I would like to store the counts of how many times each customer uses the product that has been recorded per row in the 'Product_ID' column. Instead of doing a group-by, I was hoping to use the same df and only fill in the 'Count' column.

I would like the df to look something like this:

ID Customer Connection Product_ID Count
10 AMXX     Instant    AAB        2
11 JKXX     Slow       AAB        1
12 LKXX     Slow       HJA        1
13 AMXX     Instant    AAB        2
14 RFXX     Slow       WRQ        2
15 RFXX     Instant    WRQ        2

Would anyone happen to know how I can possibly do this? Thank you :)

R Sem
  • 241
  • 1
  • 7
  • 19
  • Can you explain ID 13, row where `product_ID` is changed from `AWR` to `AAB`? – harvpan Aug 27 '19 at 15:34
  • 1
    `df.groupby('Customer')['Product_ID'].transform('size')` – user3483203 Aug 27 '19 at 15:35
  • 2
    `df.groupby('Customer')['Product_ID'].transform('count')` – Erfan Aug 27 '19 at 15:35
  • 1
    Possible duplicate of [How do I create a new column from the output of pandas groupby().sum()?](https://stackoverflow.com/questions/30244952/how-do-i-create-a-new-column-from-the-output-of-pandas-groupby-sum) – Erfan Aug 27 '19 at 15:36
  • @harvpan Oh Apologies, that was an error. I will correct this now! – R Sem Aug 27 '19 at 15:36
  • @Erfan My question is different to the post you linked :) – R Sem Aug 27 '19 at 15:39
  • @RSem did the code in comments work? – harvpan Aug 27 '19 at 15:50
  • @harvpan Yesss! Thank you for your help everyone :) – R Sem Aug 27 '19 at 15:58
  • @guys, one question, you gave the answer as a comment, and I knew the answer but I didn't write my answer because it is already in the comment. Then someone else answered the question. Why you added a comment here not ana answer? :) – J.K Aug 27 '19 at 16:01
  • @J.K, if someone contribute as comment or in answer section as per his ease that doesn't matter, but if you place an answer which is producible and solve the purpose then it benefits the posterity , i placed the answer as i was already simulated the dataFrame and sought to put it there, which also helps the question to be marked as Answered and displaced it from the Unanswered queue. However, i clearly stated as i borrowed the idea which i was thinking the same. – Karn Kumar Aug 27 '19 at 16:14
  • 1
    @pygo, good to know :) – J.K Aug 27 '19 at 16:38

2 Answers2

1

Try:

df['Count'] = df.groupby('Customer')['Product_ID'].transform('count')
Joe
  • 879
  • 2
  • 6
  • 15
0

Just borrowing the same idea as given in the comment section:

You can use df.groupby and assign values back to a new column called like new_count which should suffice to get the desired..

You can use count or size with transform both should suffice the purpose.

>>> df.assign(new_count = df.groupby('Customer')['Product_ID'].transform('count')).drop('Count', axis=1)
   ID Customer Connection Product_ID  new_count
0  10     AMXX    Instant        AAB          2
1  11     JKXX       Slow        AAB          1
2  12     LKXX       Slow        HJA          1
3  13     AMXX    Instant        AAB          2
4  14     RFXX       Slow        WRQ          2
5  15     RFXX    Instant        WRQ          2

Note: In above example assign will help to create a new column for you, which gives you freedom to keep the existing cols and add new column yielded from the groupby , however if you want overwrite the existing count column then you can use below.. but better to have added new call in case you want to preserver all existing dataset.

df = df.assign(Count = df.groupby('Customer')['Product_ID'].transform('count'))

OR

df['Count'] = df.groupby('Customer')['Product_ID'].transform('count')`
Karn Kumar
  • 8,518
  • 3
  • 27
  • 53