
I have a data frame like the one below:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [1, np.nan, 2, 3, 4],
                   'C': [2, 3, np.nan, np.nan, 5],
                   'D': [5, 6, 6, 7, 8],
                   'E': [np.nan, 2, 3, 4, 5]})

   A  B    C    D  E
0  1  1    2    5  NaN
1  2  NaN  3    6  2
2  3  2    NaN  6  3
3  4  3    NaN  7  4
4  5  4    5    8  5

I am using df.dropna(axis=1, inplace=True) to drop the columns that contain NaN. Before dropping them, though, I need to capture the names of the columns that contain NaN values and store them in a list.

For example, the data frame above has NaN in columns B, C and E, so those names should first be stored in a list, ['B', 'C', 'E'], and only then should the columns be dropped.

Any suggestions on how to store those column names in a list?

1 Answer


I think this should do it:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
              'B': [1, np.nan, 2, 3, 4],
              'C': [2, 3, np.nan, np.nan,5],
              'D': [5, 6, 6, 7, 8],
              'E': [np.nan, 2, 3, 4, 5]})

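# Count NaN values per column; the result is indexed by column name.
nan_counts = df.isna().sum()

# Collect the name of every column that has at least one NaN.
listOfColumns = []
for column, count in nan_counts.items():
    if count > 0:
        # append() keeps the name whole; += would split a multi-character
        # name such as 'price' into separate letters.
        listOfColumns.append(column)

print(listOfColumns)  # ['B', 'C', 'E']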
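
The loop can also be written without explicit iteration. The following is a minimal sketch, not the only way to do it: df.isna().any() gives one boolean per column, which can be used to index df.columns. The names has_nan, nan_columns and cleaned are illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [1, np.nan, 2, 3, 4],
                   'C': [2, 3, np.nan, np.nan, 5],
                   'D': [5, 6, 6, 7, 8],
                   'E': [np.nan, 2, 3, 4, 5]})

# Boolean mask per column: True where the column holds at least one NaN.
has_nan = df.isna().any()

# Capture the column names first, then drop those columns.
nan_columns = df.columns[has_nan].tolist()
cleaned = df.drop(columns=nan_columns)

print(nan_columns)            # ['B', 'C', 'E']
print(list(cleaned.columns))  # ['A', 'D']
```

Dropping by the captured names gives the same result here as df.dropna(axis=1), and it makes it obvious that the list and the dropped columns match.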

If you simply want to check the number of null / NaN values in each column, you can run:

df.isna().sum()
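
As a small sketch of how those counts might be used, the result can be filtered to keep only the columns that actually contain missing values. The names counts and nan_counts are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [1, np.nan, 2, 3, 4],
                   'C': [2, 3, np.nan, np.nan, 5],
                   'D': [5, 6, 6, 7, 8],
                   'E': [np.nan, 2, 3, 4, 5]})

# Number of NaN values in each column, indexed by column name.
counts = df.isna().sum()

# Keep only columns with at least one NaN, as a plain dict of Python ints.
nan_counts = {col: int(n) for col, n in counts[counts > 0].items()}

print(nan_counts)  # {'B': 1, 'C': 2, 'E': 1}
```

The keys of nan_counts are the same column names that df.dropna(axis=1) would remove.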
Prateek Jain