I want to use KNN to impute categorical features in a sklearn pipeline (multiple categorical features have missing values).
I have done quite a bit of research on existing KNN solutions (fancyimpute, sklearn KNeighborsRegressor). None of them seems to meet both of these requirements:
- work in a sklearn pipeline
- impute categorical features
Some of my questions are (any advice is highly appreciated):
- Is there an existing approach that lets KNN (or any other regressor) impute missing values (categorical in this case) inside a sklearn pipeline?
- The fancyimpute KNN implementation does not seem to use Hamming distance when imputing missing values (which would be ideal for categorical features).
- Is there a fast KNN implementation available, given that KNN is time-consuming when imputing missing values (i.e., predictions for the missing values are run against the whole dataset)?
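
To make the requirements concrete, here is a minimal sketch of the kind of transformer I have in mind. It is not an existing sklearn or fancyimpute API: the class name `KNNCategoricalImputer`, the helper `_as_object_array`, and the toy data are all made up for illustration. It assumes every column has at least one observed value in the training data, uses the Hamming distance over the columns observed in both rows, and fills each missing cell with the most frequent value among the k nearest training rows. It is a brute-force loop, so it does nothing for the speed question.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder


def _as_object_array(X):
    # Hypothetical helper: object array with every missing marker normalised to None.
    arr = pd.DataFrame(X).to_numpy(dtype=object)
    return np.where(pd.isna(arr), None, arr)


class KNNCategoricalImputer(TransformerMixin, BaseEstimator):
    """Illustrative only: mode of the k nearest rows under Hamming distance."""

    def __init__(self, n_neighbors=3):
        self.n_neighbors = n_neighbors

    def fit(self, X, y=None):
        # The training rows act as the donor pool at transform time.
        self.X_fit_ = _as_object_array(X)
        return self

    def transform(self, X):
        X = _as_object_array(X)
        donors = self.X_fit_
        donor_obs = ~pd.isna(donors)
        for i in np.flatnonzero(pd.isna(X).any(axis=1)):
            row_obs = ~pd.isna(X[i])
            # Only columns observed in both the query row and the donor row count.
            shared = donor_obs & row_obs
            n_shared = shared.sum(axis=1)
            n_mismatch = ((donors != X[i]) & shared).sum(axis=1)
            # Hamming distance as a fraction; infinite when no column is comparable.
            dist = np.where(n_shared > 0, n_mismatch / np.maximum(n_shared, 1), np.inf)
            for j in np.flatnonzero(~row_obs):
                # Donors must have column j observed (assumed non-empty).
                candidates = np.flatnonzero(donor_obs[:, j])
                order = np.argsort(dist[candidates], kind="stable")
                nearest = candidates[order[: self.n_neighbors]]
                X[i, j] = pd.Series(donors[nearest, j]).mode().iloc[0]
        return X


# Toy data, invented for this example.
df = pd.DataFrame({
    "color": ["red", "red", "blue", "blue", np.nan],
    "size": ["S", "S", "L", "L", "L"],
    "shape": ["round", "round", "square", np.nan, "square"],
})

imputed = KNNCategoricalImputer(n_neighbors=2).fit_transform(df)

# The imputer can sit in front of an encoder in a regular pipeline.
pipe = make_pipeline(KNNCategoricalImputer(n_neighbors=2),
                     OneHotEncoder(handle_unknown="ignore"))
encoded = pipe.fit_transform(df)
```

Something along these lines works on toy data, but the Python loop over every incomplete row is exactly the slow part I would like to avoid on a real dataset.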