I have this dataframe:
data = pd.DataFrame({
'Date': ['2000-01','2000-01', '2000-01','2000-02','2000-03', '2000-03', '2099-01'],
'Value': [10, 30, 2, 3, 40, 50, 999],
'Name': ['Peter', 'Peter', 'Jessica','Jessica', 'Will', 'Will', 'Peter']
})
Date Value Name
0 2000-01 10 Peter
1 2000-01 30 Peter
2 2000-01 2 Jessica
3 2000-02 3 Jessica
4 2000-02 40 Will
5 2000-03 50 Will
6 2099-01 999 Peter
I want to sum the values from Value
based on Date
and Name
such that the results will be below. Here you can see that for Peter
and dates 2000-01
, there are two values of 10
and 30
, where the sum is 40
, such that it is added in the first row of Sum
. In contrast, Jessica
only have one value per month (as can be seen in 2000-01
and 2000-02
with values 2
and 3
), such that those two rows will not be summed.
Date Value Name Sum
0 2000-01 10 Peter 40
1 2000-01 30 Peter 40
2 2000-01 2 Jessica 2
3 2000-02 3 Jessica 3
4 2000-02 40 Will 40
5 2000-03 50 Will 50
6 2099-01 999 Peter 999
How can I sum values based on two criteria as seen above? I only know of pd.pivot_table()
, but that will add new columns to the right based on unique values in a column rather downwards.