1

I am having trouble analysing origin-destination values in a pandas dataframe which contains origin/destination columns and a count column of the frequency of these. I want to transform this into a dataframe with the count of how many are leaving and entering:

Initial:

    Origin  Destination  Count

    A       B            7
    A       C            1
    B       A            1
    B       C            4
    C       A            3
    C       B            10

For example this simplified dataframe has 7 leaving from A to B and 1 from A to C so overall leaving place A would be 8, and entering place A would be 4 (B - A is 1, C - A is 3) etc. The new dataframe would look something like this.

Goal:

   Place   Entering  Leaving

   A       4         8
   B       17        5
   C       5         13

I have tried several techniques such as .groupby() but have not yet created my intended dataframe. How can I handle the repeated values in the origin/destination columns and assign them to a new dataframe with aggregated values of just the count of leaving and entering?

Thank you!

Cœur
  • 37,241
  • 25
  • 195
  • 267
Geo_Py
  • 13
  • 2

2 Answers2

1

Use double groupby + concat:

a = df.groupby('Destination')['Count'].sum()
b = df.groupby('Origin')['Count'].sum()

df = pd.concat([a,b], axis=1, keys=('Entering','Leaving')).rename_axis('Place').reset_index()
print (df)
  Place  Entering  Leaving
0     A         4        8
1     B        17        5
2     C         5       13
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
0

pivot_table then do sum

df=pd.pivot_table(df,index='Origin',columns='Destination',values='Count',aggfunc=sum)
pd.concat([df.sum(0),df.sum(1)],1)

Out[428]: 
      0     1
A   4.0   8.0
B  17.0   5.0
C   5.0  13.0
BENY
  • 317,841
  • 20
  • 164
  • 234