I have a DataFrame that looks like this:
product units_sold week
sku 5 W01
sku 3 W02
sku 2 W03
sku 4 W04
sku 6 W05
sku 5 W36
sku 3 W38
sku 2 W39
sku 4 W40
In the 'week', w37 is missing, and columns are in a wrong sequence:
I have a list with all needed rows and right sequence:
week_list = ['W36','W37','W38','W39','W40','W01','W02','W03','W04','W05']
desired output is:
product units_sold week
sku 5 W36
sku 0 W37
sku 5 W38
sku 2 W39
sku 4 W40
sku 6 W01
sku 3 W02
sku 2 W03
sku 4 W04
sku 6 W05
Where W37 is on place and 'unit_sold' is 0
Just add one column solution is not suited, as I have quite a big DataFrame, and there is probably can be other missing rows.
I tried with pd.sort_values and pd.categorical:
def sorter(column):
reorder = week_list
cat = pd.Categorical(column, categories=reorder, ordered=True)
return pd.Series(cat)
df.sort_values(by="week", key=sorter)
this helped me to set the right sequence, but 'w37' is still missing, so the problem is not solved.
Is there any way I can update a DataFrame rows values based on this list?