My data is being grouped correctly.

df_RFQ_by_Salesperson = df[
                          (df['state'].str.contains('Done'))
                          ][['sales_person_name2',
                             'rfq_qty',
                             'rfq_qty_CAD_Equiv',
                             'state'
                            ]].copy()
df_RFQ_by_Salesperson = df_RFQ_by_Salesperson.groupby('sales_person_name2').agg({'state': 'size','rfq_qty': 'sum', 'rfq_qty_CAD_Equiv': 'sum'})
df_RFQ_by_Salesperson['Percentage'] = df_RFQ_by_Salesperson.rfq_qty_CAD_Equiv / df_RFQ_by_Salesperson.rfq_qty_CAD_Equiv.sum()
df_RFQ_by_Salesperson = df_RFQ_by_Salesperson.rename(columns={'state':'Done Trades'}, level=0) # rename the column header in the groupby
display(df_RFQ_by_Salesperson.sort_values('Percentage',ascending=False))

sales_person_name2  Done Trades rfq_qty     rfq_qty_CAD_Equiv   Percentage          
MP                       11     214400000.0 3.045802e+08        0.258089
AC                       22     228800000.0 2.648099e+08        0.224390
YJ                       7      202500000.0 2.490527e+08        0.211038
RW                       18     129000000.0 1.693008e+08        0.143459
AY                       171    118366000.0 1.189635e+08        0.100805
RL                       47     78617000.0  7.342725e+07        0.062219

But when I try to visualize the result with sns.countplot, the column I grouped by is no longer in the column list, so an error is raised.

display(df_RFQ_by_Salesperson.columns)

Index(['Done Trades', 'rfq_qty', 'rfq_qty_CAD_Equiv', 'Percentage'], dtype='object')

# # Visualisation 
ax = sns.countplot(
                x='sales_person_name2', 
                data=df_RFQ_by_Salesperson, 
                # Order by the count
                order = df_RFQ_by_Salesperson['sales_person_name2'].value_counts().index,
                color=plot_colour
                 )
for label in ax.xaxis.get_ticklabels():
    label.set_rotation(90)  
plt.show()    

KeyError: 'sales_person_name2'
---> 22   order = df_RFQ_by_Salesperson['sales_person_name2'].value_counts().index,

Is there a way to force pandas to keep sales_person_name2 as a column in the DataFrame?
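
For context, a minimal sketch of what seems to be happening, assuming the usual pandas behaviour: groupby moves the key into the index, which is why it disappears from `.columns`, and `reset_index()` turns it back into a regular column. The small `df` below is made-up stand-in data, not the real RFQ data, and `flat` is just an illustrative name.

```python
import pandas as pd

# Hypothetical stand-in for the original df (made-up values)
df = pd.DataFrame({
    'sales_person_name2': ['MP', 'AC', 'AC', 'YJ', 'MP'],
    'rfq_qty': [100.0, 200.0, 50.0, 300.0, 25.0],
    'rfq_qty_CAD_Equiv': [130.0, 260.0, 65.0, 390.0, 32.5],
    'state': ['Done', 'Done', 'Done', 'Done', 'Passed'],
})

# Same filter and aggregation as in the question
done = df[df['state'].str.contains('Done')]
grouped = done.groupby('sales_person_name2').agg(
    {'state': 'size', 'rfq_qty': 'sum', 'rfq_qty_CAD_Equiv': 'sum'}
)

# The group key now lives in the index, not in the columns
print(grouped.index.name)
print('sales_person_name2' in grouped.columns)

# reset_index() moves the key back into an ordinary column
flat = grouped.reset_index().rename(columns={'state': 'Done Trades'})
print(flat.columns.tolist())
```

With the key restored as a column, `flat['sales_person_name2']` no longer raises a KeyError. Since the data is already aggregated to one row per salesperson, something like `sns.barplot(x='sales_person_name2', y='Done Trades', data=flat)` is probably a better fit than `sns.countplot`, which counts rows itself and would show 1 for every salesperson here.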

Peter Lucas