Try changing
dataframe.column to dataframe.loc[:, column], and
dataframe[['column']] to dataframe.loc[:, [column]].
For more help, please provide more information, such as: What is enc (show your imports)? What does dataframe look like (show a small example, perhaps with dataframe.head(5))?
Details:
Since column is an input (probably a string), you need to use it correctly when asking for that column from the dataframe object. If you just use dataframe.column, pandas will look for a column actually named 'column'. If you ask for it with dataframe.loc[:, column], pandas will use the string held by the input parameter named column.
With dataframe.loc[:, column] you get a pandas Series, and with dataframe.loc[:, [column]] you get a pandas DataFrame.
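The difference between the three access styles can be seen in a small sketch. The DataFrame contents and the column name 'colour' are made up for illustration; only the access patterns matter.

```python
import pandas as pd

# Hypothetical data; the column names are invented for this example.
dataframe = pd.DataFrame({"colour": ["red", "green", "red"], "size": [1, 2, 3]})
column = "colour"  # the function input: a string holding the column name

# Attribute access looks for a column literally named 'column', which does not exist here.
try:
    dataframe.column
    attribute_lookup_failed = False
except AttributeError:
    attribute_lookup_failed = True

as_series = dataframe.loc[:, column]    # 1-D result: pandas Series
as_frame = dataframe.loc[:, [column]]   # 2-D result: one-column pandas DataFrame

print(attribute_lookup_failed)
print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```

The 2-D form is usually what scikit-learn transformers that work on feature matrices expect, while the 1-D form suits encoders that work on a single label vector.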
The pandas attribute 'columns', used as dataframe.columns (note the 's' at the end), just returns the names of all columns in your dataframe (as a pandas Index), which is probably not what you want here.
TIPS:
Try to name input parameters so that you know what they are.
When developing a function, try setting the input to something static, and iterate on the code until you get the desired output. E.g.:
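input_df = my_df
column_name = 'some_test_column'
# Assumption: enc is a OneHotEncoder and le is a LabelEncoder (sklearn.preprocessing)
if input_df.loc[:, column_name].nunique() > 2 and input_df.loc[:, column_name].dtype == object:
    enc.fit(input_df.loc[:, [column_name]])
    onehot = enc.transform(input_df.loc[:, [column_name]]).toarray()
    # categories_ holds one array per fitted column, so take the first one
    input_df[list(enc.categories_[0])] = onehot
elif input_df.loc[:, column_name].nunique() == 2 and input_df.loc[:, column_name].dtype == object:
    # LabelEncoder expects a 1-D Series, not a one-column DataFrame; keep the result
    input_df[column_name] = le.fit_transform(input_df.loc[:, column_name])
else:
    print('Column cannot be transformed')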
Look up how to use scikit-learn Pipelines with ColumnTransformer. It will make the workflow easier (https://scikit-learn.org/stable/modules/compose.html).
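As a rough idea of what that looks like, here is a minimal sketch of a ColumnTransformer that one-hot encodes a single column and leaves the rest untouched. The data and the column names are hypothetical, and this is not necessarily how your own encoding should be set up.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "colour": ["red", "green", "blue", "red"],
    "size": [1, 2, 3, 4],
})

# One-hot encode the 'colour' column; pass every other column through unchanged.
transformer = ColumnTransformer(
    transformers=[("onehot", OneHotEncoder(handle_unknown="ignore"), ["colour"])],
    remainder="passthrough",
)

encoded = transformer.fit_transform(df)
print(encoded.shape)  # three one-hot columns plus the passthrough 'size' column
```

The transformer can then be used as the first step of a Pipeline, so the same encoding is applied consistently at fit and predict time.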