My data is being grouped correctly.
df_RFQ_by_Salesperson = df[
    (df['state'].str.contains('Done'))
][['sales_person_name2',
   'rfq_qty',
   'rfq_qty_CAD_Equiv',
   'state'
]].copy()
df_RFQ_by_Salesperson = df_RFQ_by_Salesperson.groupby('sales_person_name2').agg({'state': 'size','rfq_qty': 'sum', 'rfq_qty_CAD_Equiv': 'sum'})
df_RFQ_by_Salesperson['Percentage'] = df_RFQ_by_Salesperson.rfq_qty_CAD_Equiv / df_RFQ_by_Salesperson.rfq_qty_CAD_Equiv.sum()
df_RFQ_by_Salesperson = df_RFQ_by_Salesperson.rename(columns={'state':'Done Trades'}, level=0) # rename the column header in the groupby
display(df_RFQ_by_Salesperson.sort_values('Percentage',ascending=False))
                    Done Trades      rfq_qty  rfq_qty_CAD_Equiv  Percentage
sales_person_name2
MP                           11  214400000.0       3.045802e+08    0.258089
AC                           22  228800000.0       2.648099e+08    0.224390
YJ                            7  202500000.0       2.490527e+08    0.211038
RW                           18  129000000.0       1.693008e+08    0.143459
AY                          171  118366000.0       1.189635e+08    0.100805
RL                           47   78617000.0       7.342725e+07    0.062219
But when I try to visualize the result with sns.countplot, the column I grouped by is not in the column list, so an error is raised.
display(df_RFQ_by_Salesperson.columns)
Index(['Done Trades', 'rfq_qty', 'rfq_qty_CAD_Equiv', 'Percentage'], dtype='object')
# Visualisation
ax = sns.countplot(
    x='sales_person_name2',
    data=df_RFQ_by_Salesperson,
    # Order by the count
    order=df_RFQ_by_Salesperson['sales_person_name2'].value_counts().index,
    color=plot_colour
)
for label in ax.xaxis.get_ticklabels():
    label.set_rotation(90)
plt.show()
KeyError: 'sales_person_name2'
---> 22 order = df_RFQ_by_Salesperson['sales_person_name2'].value_counts().index,
Is there a way to force Python to include sales_person_name2 in the DataFrame?
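For reference, here is a minimal sketch of what seems to be happening, using a small made-up DataFrame in place of my real df (the values and the names `done`, `grouped` and `flat` are only for illustration). After groupby(...).agg(...), the grouping key becomes the index rather than a column, and reset_index() is one way to turn it back into a regular column:

```python
import pandas as pd

# Hypothetical toy data standing in for the real df (values are made up)
df = pd.DataFrame({
    'sales_person_name2': ['MP', 'AC', 'MP', 'AC', 'YJ'],
    'rfq_qty': [100.0, 200.0, 300.0, 400.0, 500.0],
    'rfq_qty_CAD_Equiv': [110.0, 220.0, 330.0, 440.0, 550.0],
    'state': ['Done', 'Done', 'Done', 'Done', 'Passed'],
})

# Same filter and aggregation as in the question
done = df[df['state'].str.contains('Done')]
grouped = done.groupby('sales_person_name2').agg(
    {'state': 'size', 'rfq_qty': 'sum', 'rfq_qty_CAD_Equiv': 'sum'}
)

# The grouping key is now the index, not a column
print(grouped.index.name)
print('sales_person_name2' in grouped.columns)

# reset_index() moves the index back into an ordinary column
flat = grouped.reset_index()
print(list(flat.columns))
```

With the key restored as a column, `flat` can be passed as `data=` with `x='sales_person_name2'`. Passing `as_index=False` to groupby should have a similar effect. Since each salesperson appears only once after grouping, a bar plot of 'Done Trades' or 'Percentage' may be closer to what I want than a count plot.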