**My goals: ** filter an inner dataframe based on the row (then the values) of an outer dataframe
Let's consider three dataframes:
- first a list of shop with detail about the shop:
Shop = pd.DataFrame({
'Shop_number': ['SH10','SH20', 'SH30','SH40' ,'SH50','SH60','SH70'],
'Shop': ['Chisinau','Milan', 'Paris','Madrid', 'Lyon','London', 'foo']
})
Shop
- then a dataframe containing the sales records:
Sales = pd.DataFrame({
'Shop_number': ['SH20', 'SH30', 'SH50','SH70'],
'Amount':[200,400,50,1000],
'Type' : 'Actual'
})
Sales
Which gives:
then a dataframe containing the budget sales records:
Budget = pd.DataFrame({
'Shop_number': ['SH10','SH20', 'SH30','SH40' ,'SH50','SH60','SH70'],
'Amount':[100,150,200,300,40,500,150],
'Type' : 'Budget'
})
print(Budget)
Let's summarize data as is:
base=pd.concat([Sales,Budget])
base
Then what I try to achieve is express Sales and Budget based on Shop table as:
(Shop
.assign(
Budget = lambda x :
base.loc[(base['Shop_number']=='SH20') & (base['Type']=='Budget')].loc[:,'Amount'].sum(),
Actual = lambda x :
base.loc[(base['Shop_number']=='SH20') & (base['Type']=='Actual')].loc[:,'Amount'].sum(),
var = lambda x :
x.Budget-x.Actual
)
)
Then it work as intended:
But when I try to make it dynamic then it fails:
(Shop
.assign(
Budget = lambda x :
base.loc[(base['Shop_number']==x.Shop_number) & (base['Type']=='Budget')].loc[:,'Amount'].sum(),
Actual = lambda x :
base.loc[(base['Shop_number']==x.Shop_number) & (base['Type']=='Actual')].loc[:,'Amount'].sum(),
var = lambda x :
x.Budget-x.Actual
)
)
and returns a trace back:
ValueError: Can only compare identically-labeled Series objects
My question is:
How would you do to make it work? Would you handle the purpose of this calculation like this or on another more optimal way?
I try what is in the description then it failed. Of course I tried some make around but it failed as well can't find the solution