(This question can probably be generalized to filtering any Boolean Pandas series, but nothing that I can find on that subject addresses my issue.)
Given this dataframe:
import pandas as pd
df = pd.DataFrame({'a': (1, None, 3), 'b': (4, 5, 6), 'c': (7, 8, None), 'd': (10, 11, 12)})
df
     a  b    c   d
0  1.0  4  7.0  10
1  NaN  5  8.0  11
2  3.0  6  NaN  12
I need to get a list of the column names that contain NaN values (my real dataset has 80+ columns, and for cleaning purposes I only want to focus on the columns with NaN for the time being). This gives me the full Boolean Series, one entry per column:
df.isnull().any()
a True
b False
c True
d False
dtype: bool
Ideally I only want:
a True
c True
I cannot figure out how to do that. A mask is close, but it is applied to the rows:
mask = df.isnull().values
df[mask]
     a  b    c   d
1  NaN  5  8.0  11
2  3.0  6  NaN  12
Is there a way to apply the mask to the column axis instead, or is there a better way to do what I'm looking for?
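For reference, here is a minimal sketch of the kind of result I am after, assuming the Boolean Series from `df.isnull().any()` can be used to index itself, `df.columns`, and the column axis of `df.loc`. The names `has_nan`, `only_true`, `nan_columns`, and `nan_frame` are made up for illustration, and this may not be the most idiomatic approach:

```python
import pandas as pd

df = pd.DataFrame({'a': (1, None, 3), 'b': (4, 5, 6), 'c': (7, 8, None), 'd': (10, 11, 12)})

# Boolean Series indexed by column name: True where the column holds any NaN
has_nan = df.isnull().any()

# Index the Series with itself to keep only the True entries (a and c)
only_true = has_nan[has_nan]

# The same mask applied to the column labels gives a plain list of names
nan_columns = df.columns[has_nan].tolist()  # ['a', 'c']

# Apply the mask along the column axis to keep just those columns
nan_frame = df.loc[:, has_nan]
```

The mask here is one-dimensional and aligned with the columns, so `df.loc[:, has_nan]` filters columns rather than rows.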