I have a DataFrame df that contains information about orders, including the 'Webshop_Order', 'Category', 'Level', 'Class', 'USIM', 'Size', 'Color', 'Length', 'Demand_Qty', and 'Return_Qty' columns.
I want to create a pivot table that includes the number of orders for each size and color combination per USIM and Webshop_Order.
I would like to calculate some summary columns that I am able to do:
'multiple_sizes_in_transaction' 'duplicate_sizes_in_transaction'
Here's my current code:
# ...
# Pivot to create table of USIM/ORDER with the number of orders of each size
sizes_per_order = df.pivot_table(
index=['USIM', 'Webshop_Order'],
columns='Size',
values='Demand_Qty',
aggfunc='sum',
fill_value=0,
)
sizes_per_order = sizes_per_order.assign(
multiple_sizes_in_transaction=sizes_per_order.gt(0).sum(axis=1).gt(1),
duplicate_sizes_in_transaction=sizes_per_order.gt(1).any(axis=1),
)
# ...
However, I also want to include the 'Color' parameter in the pivot table and calculate the number of orders with
- multiple colors
- duplicate colors.
so, I decided to update the code
# Pivot to create table of USIM/ORDER with the number of orders of each size and color
sizes_per_order = df.pivot_table(
index=['USIM', 'Webshop_Order'],
columns=['Size', 'Color'],
values='Demand_Qty',
aggfunc='sum',
fill_value=0,
)
sizes_per_order = sizes_per_order.assign(
multiple_sizes_in_transaction=sizes_per_order.gt(0).sum(axis=1).gt(1),
multiple_colors_in_transaction=sizes_per_order.astype(bool).sum(axis=1, level='Color').gt(1).all(axis=1),
duplicate_sizes_in_transaction=sizes_per_order.gt(1).any(axis=1),
duplicate_colors_in_transaction=sizes_per_order.astype(bool).sum(axis=1, level='Color').gt(1).any(axis=1)
)
sizes_per_order
but the output is not visually good, it's difficult to read it well due to a lot of column values.
Could you please guide me on how to modify the code to achieve a nice representation of this?
Thank you in advance for your help!