My data has the following shape:
Event  Id  Var1  Var2  Var3
1      a   x     w     y
2      a   z     y     w
3      b   x     y     q
I need to create a multi-hot encoded vector for each row of the table, taking into account all the values that appear in Var1, Var2 and Var3. The desired output would be:
Event  Id  x  y  z  w  q
1      a   1  1  0  1  0
2      a   0  1  1  1  0
3      b   1  1  0  0  1
In other words, the output keeps the same number of rows as the initial dataset; each row only gains one indicator column for every unique value found across Var1, Var2 and Var3.
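To pin down exactly what I mean, here is a minimal plain-Python sketch of the transformation. The language, the `rows` list, and the `multi_hot` helper are assumptions made up for illustration only (my real data is a table, not a list of dicts), and I am treating Event and Id as two separate columns.

```python
# Hypothetical rows mirroring the example table; not my real data.
rows = [
    {"Event": 1, "Id": "a", "Var1": "x", "Var2": "w", "Var3": "y"},
    {"Event": 2, "Id": "a", "Var1": "z", "Var2": "y", "Var3": "w"},
    {"Event": 3, "Id": "b", "Var1": "x", "Var2": "y", "Var3": "q"},
]
var_cols = ["Var1", "Var2", "Var3"]


def multi_hot(rows, var_cols):
    """Return (levels, encoded rows) with one 0/1 column per unique value."""
    # Collect the unique values across all variable columns,
    # in order of first appearance.
    levels = []
    for row in rows:
        for col in var_cols:
            if row[col] not in levels:
                levels.append(row[col])

    encoded = []
    for row in rows:
        present = {row[col] for col in var_cols}
        # Keep the identifying columns, drop the original Var columns.
        out = {k: v for k, v in row.items() if k not in var_cols}
        # Add one indicator per level: 1 if the row contains it, else 0.
        for level in levels:
            out[level] = 1 if level in present else 0
        encoded.append(out)
    return levels, encoded


levels, encoded = multi_hot(rows, var_cols)
for row in encoded:
    print(row)
```

The indicator columns come out in order of first appearance (x, w, y, z, q) rather than in the order of the table above, but the 0/1 values in each row are the same.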
I have tried every approach I could think of, but nothing has worked so far.
Any ideas?