I have a case similar to my other question, "Pandas: map column using a dictionary on multiple columns". This time, however, I do not want to use the max() value of the column "category" directly. Instead, I want to use it indirectly to fill the None values in a fourth column, "category_name". It is the same case as in Question 1, but with an additional column of strings.
import pandas as pd

f = {'company': ['Company1', 'Company1', 'Company1', 'Company1', 'Company2', 'Company2'],
     'product': ['Product A', 'Product A', 'Product F', 'Product A', 'Product F', 'Product F'],
     'category': ['1', 1, '3', '2', 3, '5'],
     'category_name': ['a', None, 'b', 'c', None, 'd']
     }
df = pd.DataFrame(f)
Here the column "category" is always filled and the column "category_name" has some missing values:
    company    product category category_name
0  Company1  Product A        1             a
1  Company1  Product A        1          None
2  Company1  Product F        3             b
3  Company1  Product A        2             c
4  Company2  Product F        3          None
5  Company2  Product F        5             d
Again, I would like to fill the None/NaN entries with values, and the logic I would like to use is: for each combination of the first two columns ("company" and "product"), take the "category_name" of the row with the max value in the column "category".
The desired result would be:
    company    product category category_name
0  Company1  Product A        1             a
1  Company1  Product A        1         **c**
2  Company1  Product F        3             b
3  Company1  Product A        2             c
4  Company2  Product F        3         **d**
5  Company2  Product F        5             d
-> For the combination "Company1" + "Product A", max(category) = 2, so "c" is used for the missing value in row 1 of the column "category_name".
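To make the intended logic concrete, here is a minimal sketch of one way it could be expressed. It rests on two assumptions that are not stated above: that "category" should be compared numerically (the sample mixes strings and integers, so it is converted first), and that only rows which already have a "category_name" can supply the fill value. The helper names `cat_num`, `known`, `lookup` and `fill_name` are made up for illustration.

```python
import pandas as pd

f = {'company': ['Company1', 'Company1', 'Company1', 'Company1', 'Company2', 'Company2'],
     'product': ['Product A', 'Product A', 'Product F', 'Product A', 'Product F', 'Product F'],
     'category': ['1', 1, '3', '2', 3, '5'],
     'category_name': ['a', None, 'b', 'c', None, 'd']}
df = pd.DataFrame(f)

# Assumption: categories are meant to be numbers, so convert the mixed
# str/int values before comparing them.
cat_num = pd.to_numeric(df['category'])

# Assumption: only rows with a known name are candidates for the fill value.
known = df.assign(cat_num=cat_num).dropna(subset=['category_name'])

# For each (company, product) pair, find the row with the largest category
# and keep its category_name as the lookup value for that pair.
idx = known.groupby(['company', 'product'])['cat_num'].idxmax()
lookup = known.loc[idx].set_index(['company', 'product'])['category_name']

# Attach the per-group name to every row, then use it only where the
# original category_name is missing.
fill = df.join(lookup.rename('fill_name'), on=['company', 'product'])['fill_name']
df['category_name'] = df['category_name'].fillna(fill)

print(df)
```

With the sample data this fills row 1 with "c" and row 4 with "d". If a group has no row with a known name at all, the missing values in that group simply stay missing.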
I would highly appreciate help on this as well. Thank you very much.