This is similar to reversing one-hot encoding, except that a single row may have a 1 in more than one column.
I have this:
|col1|col2|
|1   |0   |
|0   |1   |
|1   |1   |
I want this:
|col1|col2|new        |
|1   |0   |'col1'     |
|0   |1   |'col2'     |
|1   |1   |'col1_col2'|
Here is what I tried:
df.idxmax(axis=1)
It only returns the first instance and will not capture rows that have multiple 1s.
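A minimal reproduction of this behavior, assuming the sample table above is loaded into a DataFrame named `df` (the construction below is a reconstruction of that table, not the real data):

```python
import pandas as pd

# Assumed reconstruction of the sample table from the question.
df = pd.DataFrame({"col1": [1, 0, 1], "col2": [0, 1, 1]})

# idxmax returns the label of the first maximum in each row,
# so the tie in the last row resolves to 'col1' only.
first_only = df.idxmax(axis=1)
print(first_only.tolist())  # ['col1', 'col2', 'col1']
```

The last row is the problem case: both columns hold a 1, but only `col1` comes back.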
def get_cat(row):
    temp = []
    for c in df[codes].columns:
        if row[c]==1:
            return c
This does the same thing: it only returns the first column name and misses rows with multiple columns having a 1.
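For reference, one possible direction is to collect every matching column name and join the names, instead of returning at the first match. This is only a sketch under assumptions: `flag_cols` and `get_cats` are hypothetical names, and the sample frame is rebuilt from the table above rather than taken from the real data.

```python
import pandas as pd

# Assumed reconstruction of the sample table from the question.
df = pd.DataFrame({"col1": [1, 0, 1], "col2": [0, 1, 1]})

# Hypothetical list of the indicator columns to inspect.
flag_cols = ["col1", "col2"]

def get_cats(row):
    # Gather all columns holding a 1, then join them with '_'.
    return "_".join(c for c in flag_cols if row[c] == 1)

df["new"] = df[flag_cols].apply(get_cats, axis=1)
print(df["new"].tolist())  # ['col1', 'col2', 'col1_col2']
```

With this approach, a row that has no 1 in any of the listed columns would get an empty string, which may or may not be the desired result.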