I have to build a pre-processing pipeline dynamically to impute missing values. That is, I want to go through all the columns of a pandas DataFrame (which I don't know beforehand) and impute their missing values.
To impute the missing values I use sklearn.impute.SimpleImputer.
I use a different imputer depending on whether the column is numerical or not, like this:
numerical_imputer = SimpleImputer(strategy='median')
categorical_imputer = SimpleImputer(missing_values=None, strategy='most_frequent')
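For context, here is a minimal sketch of how the per-dtype selection could be wired up without knowing the columns in advance. It assumes missing values are already encoded as np.nan (so it uses SimpleImputer's default missing_values rather than None), and the helper name build_imputer and the toy DataFrame are made up for illustration only:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.impute import SimpleImputer


def build_imputer():
    """Hypothetical helper: route columns to an imputer based on their dtype."""
    return ColumnTransformer(
        transformers=[
            # Numeric columns: fill with the column median.
            ("num", SimpleImputer(strategy="median"),
             make_column_selector(dtype_include=np.number)),
            # Everything else: fill with the most frequent value.
            ("cat", SimpleImputer(strategy="most_frequent"),
             make_column_selector(dtype_exclude=np.number)),
        ]
    )


# Toy data (assumption): both columns use np.nan as the missing marker.
df = pd.DataFrame({
    "age": [20.0, np.nan, 40.0],
    "city": pd.Series(["Rome", np.nan, "Rome"], dtype=object),
})

# The result is a NumPy array with the numeric columns first, then the rest.
result = build_imputer().fit_transform(df)
```

The column selectors are evaluated at fit time, so the same transformer can be reused on frames with different columns. This only works as long as every column uses a missing marker the imputer recognises, which is exactly where I get stuck.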
My problem is that pandas sometimes encodes the missing values as np.nan, sometimes as None, and sometimes as pd.NA, and it's not always the same. If I force a single missing-value encoding, it changes the dtype of the whole column, which is something I don't want to do.
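To illustrate the inconsistency, here is a small sketch with made-up columns (the variable names are only for illustration). The same None input ends up as a different marker depending on the column's dtype, although pd.isna recognises all of them:

```python
import numpy as np
import pandas as pd

# float64 column: None is coerced to np.nan.
float_col = pd.Series([1.0, None, 3.0])
# object column: None is kept as-is.
object_col = pd.Series(["a", None, "b"], dtype=object)
# Nullable integer column: None becomes pd.NA.
nullable_int_col = pd.Series([1, None, 3], dtype="Int64")

# pd.isna flags the missing entry in every case, whatever the marker is.
all_detected = all(
    pd.isna(col).iloc[1]
    for col in (float_col, object_col, nullable_int_col)
)
```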
Is there any way to make this work with any data type and any missing-value encoding that pandas may produce?