Given this data:

import pandas as pd

df = pd.DataFrame({'src_ip':['192.168.1.1','192.168.1.2','192.168.1.3','192.168.1.2'],
                   'dst_ip':['192.168.1.12','192.168.1.12','192.168.1.99','192.168.1.99'],
                   'count':[10,20,5,1]})

        src_ip        dst_ip  count
0  192.168.1.1  192.168.1.12     10
1  192.168.1.2  192.168.1.12     20
2  192.168.1.3  192.168.1.99      5
3  192.168.1.2  192.168.1.99      1

I'd like to add two new columns that contain the total count for each IP address, i.e.:

        src_ip        dst_ip  count  src_ip_count   dst_ip_count
0  192.168.1.1  192.168.1.12     10            10             30
1  192.168.1.2  192.168.1.12     20            21             30
2  192.168.1.3  192.168.1.99      5             5              6
3  192.168.1.2  192.168.1.99      1            21              6

I can get these totals as values with

     df.groupby('src_ip')['count'].sum()
     df.groupby('dst_ip')['count'].sum()

But how do I apply these back to the original DataFrame, or what would be an idiomatic way to get the desired output shown above?

binary01
If I am not mistaken you will need [`transform`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.transform.html). – sophocles Jun 27 '22 at 09:15
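
The `transform` suggestion in the comment could look roughly like the sketch below. It assumes the totals should come from the existing `count` column, and it uses `groupby(...).transform('sum')`, which returns a result aligned with the original rows so it can be assigned directly as a new column. The column names `src_ip_count` and `dst_ip_count` are taken from the desired output in the question.

```python
import pandas as pd

df = pd.DataFrame({'src_ip': ['192.168.1.1', '192.168.1.2', '192.168.1.3', '192.168.1.2'],
                   'dst_ip': ['192.168.1.12', '192.168.1.12', '192.168.1.99', '192.168.1.99'],
                   'count': [10, 20, 5, 1]})

# transform('sum') broadcasts each group's total back onto every row of that group,
# so the result has the same index and length as df.
df['src_ip_count'] = df.groupby('src_ip')['count'].transform('sum')
df['dst_ip_count'] = df.groupby('dst_ip')['count'].transform('sum')

print(df)
```

Unlike `.sum()` on the grouped column, which yields one value per IP address, `transform` keeps one value per original row, so no merge or join back onto the DataFrame should be needed.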
