The title essentially captures my problem.
I have a dataframe in which multiple columns share the same set of values, such as 0 and 1,
so if I one-hot encode the df, I end up with multiple columns with the same name.
The tedious solution would be to create unique column names manually, but I have 58 categorical columns, so that doesn't seem very efficient.
I'm not sure if this will be helpful, but here is the head() of my dataframe.
x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 ... z217 z218 z219 z220 z221 z222 subject phase state output
0 0 0 1 -300.361218 0.886360 -2.590886 225.001899 0.006204 0.000037 -0.000013 ... 0.005242 0.024971 -1017.620978 -382.850838 -48.275711 -2.040336 A 3 B 0
1 0 0 1 -297.126090 0.622211 -3.960940 220.179017 0.006167 -0.000014 -0.000003 ... 0.001722 0.023595 91.229094 24.802230 1.783950 0.022620 A 3 C 0
2 0 0 1 -236.460253 0.423640 -12.656341 139.453445 0.006276 -0.000028 0.000022 ... -0.010894 -0.036318 -188.232347 -17.474861 -1.005571 -0.021628 A 3 B 0
3 0 0 1 33.411458 2.854415 -1.962432 3.208911 0.009752 -0.000273 -0.000024 ... -0.034184 -0.047734 185.122907 -549.282067 542.193381 -178.049926 A 3 A 0
4 0 0 1 -118.125214 2.009809 -3.291637 34.874176 0.007598 0.000001 -0.000022 ... 0.001963 0.004084 35.207794 -78.143166 57.084208 -13.700212 A 4 C 0
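
One possible way around the name collision, sketched under the assumption that the encoding is done with pandas: when `pd.get_dummies` is given a `columns` list, it prefixes each generated column with the name of its source column, so `x2` and `x3` produce `x2_0`, `x2_1`, `x3_0`, `x3_1` rather than repeated `0` and `1` columns. The toy frame and the `categorical_cols` list below are hypothetical stand-ins, not the real data.

```python
import pandas as pd

# Hypothetical toy frame: column names mirror the head() above, values are made up.
df = pd.DataFrame({
    "x2": [0, 1, 0],
    "x3": [0, 0, 1],
    "subject": ["A", "A", "B"],
    "state": ["B", "C", "A"],
})

# Assumed list of categorical columns; in the real frame this would hold all 58 names.
categorical_cols = ["x2", "x3", "subject", "state"]

# With `columns` given, each dummy column is named <source column>_<value>,
# so identical values in different source columns no longer clash.
encoded = pd.get_dummies(df, columns=categorical_cols, prefix_sep="_", dtype=int)

print(encoded.columns.tolist())
```

Numeric columns that are not listed in `columns` are passed through unchanged, so the same call should also work on a frame that mixes categorical and continuous features.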