Problem
I am using the sklearn.preprocessing.Imputer class to impute NaN values using a mean strategy over the columns, i.e. axis=0. My problem is that some of the data to be imputed has only NaN values in its column, e.g. when there is only a single entry.
import numpy as np
from sklearn.preprocessing import Imputer
data = np.array([[1, 2, np.nan]])
data = Imputer().fit_transform(data)
This gives an output of array([[1., 2.]])
Fair enough: the Imputer obviously cannot compute a mean for a column whose values are all NaN. However, instead of having the column dropped, I would like to fall back to a default value, in my case 0.
Current approach
To solve this problem I first check whether an entire column only contains NaN values, and if so, replace them with my default value 0:
# Loop over all columns in data
for column in data.T:
    # Check if all values in column are NaN
    if all(np.isnan(value) for value in column):
        # Fill the column with default value 0
        column.fill(0)
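For comparison, the same check can be written without an explicit Python loop. The following is only a minimal sketch using plain NumPy; the helper name fill_all_nan_columns and its default parameter are made up for illustration and are not part of sklearn. It would have to run before the Imputer is applied, since the Imputer drops the all-NaN column.

```python
import numpy as np

def fill_all_nan_columns(data, default=0.0):
    """Return a float copy of `data` with every all-NaN column set to `default`.

    Hypothetical helper for illustration; not part of sklearn or NumPy.
    """
    filled = np.array(data, dtype=float)  # copy, so the input stays untouched
    # Boolean mask with one entry per column: True where every value is NaN
    all_nan = np.isnan(filled).all(axis=0)
    # Overwrite only the columns selected by the mask
    filled[:, all_nan] = default
    return filled

data = np.array([[1, 2, np.nan]])
result = fill_all_nan_columns(data)
print(result)  # [[1. 2. 0.]]
```

Columns that contain at least one real value are left alone, so their remaining NaN entries can still be handled by the mean strategy afterwards.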
Question
Is there a more elegant way to impute to a default value if an entire axis only contains NaN values?