If you have a dataframe with missing data in multiple columns and want to impute one specific column based on the others, you can impute everything and then keep only the column you want:
from sklearn.impute import KNNImputer
import pandas as pd
imputer = KNNImputer()
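imputed_data = imputer.fit_transform(df) # impute all the missing data (df must be numeric)
# reuse the original column names and index so the assignment below aligns row by row
df_temp = pd.DataFrame(imputed_data, columns=df.columns, index=df.index)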
df['COL_TO_IMPUTE'] = df_temp['COL_TO_IMPUTE'] # update only the desired column
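As a rough, self-contained illustration of the approach above, here is a sketch on a toy dataframe. The helper name `impute_single_column`, the toy column names, and the choice of `n_neighbors=2` are assumptions made for the example, not part of the original answer.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer


def impute_single_column(df, col, n_neighbors=5):
    """Return a copy of df in which only `col` has its NaNs filled by KNNImputer."""
    imputer = KNNImputer(n_neighbors=n_neighbors)
    imputed = imputer.fit_transform(df)  # imputes every column
    # Rebuild with the original labels so the column assignment aligns on the index.
    imputed_df = pd.DataFrame(imputed, columns=df.columns, index=df.index)
    result = df.copy()
    result[col] = imputed_df[col]
    return result


# Toy data (hypothetical): NaNs in two columns and a non-default index.
toy = pd.DataFrame(
    {
        "a": [1.0, 2.0, np.nan, 4.0],
        "b": [10.0, np.nan, 30.0, 40.0],
        "c": [0.1, 0.2, 0.3, 0.4],
    },
    index=["w", "x", "y", "z"],
)

filled = impute_single_column(toy, "a", n_neighbors=2)
print(filled)
```

Only column `a` is filled here; the NaN in column `b` is left as it was, and the input dataframe is not modified.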
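Another method would be to replace all the missing data in the desired column with a sentinel value that does not occur anywhere else in the dataframe, and then tell the imputer that this sentinel marks the missing data. KNNImputer only accepts numeric input, so a character such as # will not work; use a number instead, for example the overall maximum plus one. Note that when missing_values is not NaN, the imputer raises an error if any NaN is left in the other columns, so this only works once those columns are complete:
from sklearn.impute import KNNImputer
import pandas as pd
sentinel = df.max().max() + 1 # a number that does not occur anywhere in df
df['COL_TO_IMPUTE'] = df['COL_TO_IMPUTE'].fillna(sentinel) # replace all missing data in the desired column with the sentinel
imputer = KNNImputer(missing_values=sentinel) # tell the imputer to consider only the sentinel as missing data
imputed_data = imputer.fit_transform(df) # impute the sentinel values; raises ValueError if other columns still contain NaN
df = pd.DataFrame(data=imputed_data, columns=df.columns, index=df.index)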
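To make the sentinel idea concrete, here is a minimal sketch on toy data. It assumes all columns are numeric (KNNImputer does not accept strings) and that the other columns contain no NaN; the column names and `n_neighbors=2` are made up for the example. The last part checks how the imputer reacts when a NaN is left in another column while `missing_values` is set to a non-NaN value.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Toy data (hypothetical): only "target" has missing values.
toy = pd.DataFrame(
    {
        "target": [1.0, np.nan, 3.0, np.nan],
        "x": [0.0, 1.0, 2.0, 3.0],
        "y": [5.0, 6.0, 7.0, 8.0],
    }
)

# Pick a number that cannot already be present in any column.
sentinel = toy.max().max() + 1
marked = toy.copy()
marked["target"] = marked["target"].fillna(sentinel)

imputer = KNNImputer(missing_values=sentinel, n_neighbors=2)
result = pd.DataFrame(
    imputer.fit_transform(marked), columns=marked.columns, index=marked.index
)
print(result)

# With a non-NaN missing_values, a NaN left in another column is rejected.
marked_with_nan = marked.copy()
marked_with_nan.loc[0, "x"] = np.nan
nan_rejected = False
try:
    KNNImputer(missing_values=sentinel).fit_transform(marked_with_nan)
except ValueError as exc:
    nan_rejected = True
    print("ValueError:", exc)
```

Because the sentinel is matched in every column, it has to be absent from the whole dataframe, not just from the column being imputed.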