Use stats.trim_mean ass aggfunc in Python Pandas Pivot

Question

I'm trying to create a pivot with pandas where the aggregation function is truncated mean that takes 10% of each end of data.

    pivot = pd.pivot_table(data[(data['New_category_ID'] > 0)&(data['YYYY'] == 2016)], index = 'New_category',  values=['GrossRevenue_GBP','MOVC_GBP','PM_GBP'],aggfunc=stats.trim_mean(0.1))

I'm getting this following error:

TypeError: trim_mean() missing 1 required positional argument: 'proportiontocut'

stats.trim_mean() takes two arguments one being the data the other being the % of data to remove from each end of the data. In this case I'm specifying 10%.

How do I get around the need to specify the column/dataframe it should take the trimmed mean of given I am trying to create pivot table?

Would really appreciate some help

@ayhan - thank you so much. It runs perfectly, however when I validate with Excel it seems that the trim mean from Excel is different from that given from Python. Need to decide now which one is correct... If you make it an answer I will approve it — D_usv, Aug 09 '17 at 11:53
That's probably a difference caused by how they calculate the percentiles (there are several approaches). — ayhan, Aug 09 '17 at 12:19
Most probably..On further validation, when I say do truncated mean of 0.5 (i.e. the median) in Python and compare this to the median calculated in Excel they are the same. However when I do truncated mean in Excel of 0.5 it gives a different result to the median calculated in Excel. I will go with the Python truncated mean results. Thanks again — D_usv, Aug 09 '17 at 12:49

score 4 · Accepted Answer · answered Aug 09 '17 at 12:19

You can use a lambda function:

pivot = pd.pivot_table(data[(data['New_category_ID'] > 0) & (data['YYYY']==2016)], 
                       index='New_category',
                       values=['GrossRevenue_GBP', 'MOVC_GBP', 'PM_GBP'],
                       aggfunc=lambda x: stats.trim_mean(x, 0.1))

Use stats.trim_mean ass aggfunc in Python Pandas Pivot

1 Answers1