I need help. I am trying to clean a very large data frame using pandas. I have 35064 rows and 16 columns. In 20 rows I have np.nan in 4 columns, so I want to delete these 20 rows. I wanted to replace np.nan with 0, and then find the indexes in each of these 4 columns that have the value 0:

indexes_to_drop = df.loc[df['temp'] == 0].index

and after that to do

df.drop(indexes_to_drop,axis=0,inplace=True)

But I forgot that these columns also contain regular 0 values, which I can't drop. Also, I would like to add a for loop, because I have 4 columns. Thank you.

sophocles
Ana

2 Answers

Why not use this?

df = df.dropna()
teepee
  • But I also have NaN in the first 4 columns, which I can't drop because I need to replace those NaN with other values afterwards. – Ana Dec 01 '20 at 18:42

In case there are other columns that contain NaN but you don't want to drop rows based on their values:

df_clean = df.dropna(subset=['column1', 'column2', 'column3', 'column4'])

This will take into account only the 4 columns that you are worried about, and drop rows only based on them.
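As a minimal sketch of how this behaves, the toy frame below uses hypothetical column names (`keep_a`, `temp`, `humidity`) that are not taken from the question's data. It assumes rows should go only when one of the checked columns is NaN, while genuine zeros and NaN in other columns stay untouched. The boolean-mask variant is shown as an alternative that also avoids a for loop over the columns.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 'keep_a' may contain NaN that must survive,
# while NaN in 'temp' or 'humidity' marks a row for removal.
df = pd.DataFrame({
    "keep_a": [1.0, np.nan, 3.0, 4.0],
    "temp": [0.0, 21.5, np.nan, 19.0],      # row 0 holds a genuine zero
    "humidity": [55.0, 0.0, 60.0, np.nan],  # row 1 holds a genuine zero
})

cols_to_check = ["temp", "humidity"]

# Drop a row if ANY checked column is NaN; other columns are ignored.
df_clean = df.dropna(subset=cols_to_check)

# Same result with an explicit boolean mask over the checked columns.
mask = df[cols_to_check].isna().any(axis=1)
df_clean_alt = df.loc[~mask]

print(df_clean)
```

Because the NaN values are never replaced with 0, the real zeros in the checked columns cannot be confused with missing data, and the NaN in `keep_a` remains available for later filling.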

exokamen