12

I want to drop all NaN values in one of my columns, but when I use df.dropna(axis=0, inplace=True) it erases my entire DataFrame. Why is this happening?

I've tried both df.dropna() and df.dropna(axis=0, inplace=True), and neither removes only the NaN values.

I'm binning my data so I can run a Gaussian model, but I can't do that with NaN values. I want to remove them and still have my DataFrame to run the model.

Before and After Data


Piper Ramirez
  • Can you post raw data, your code to recreate your df, and your code that produces the erroneous result. Note to just remove `NaN` from one column you can just do `df['Col'] = df['Col'].dropna()`, what you wrote was to drop rows that contained any `NaN` which would mean that if all your rows contained at least 1 `NaN` then the entire df would be deleted – EdChum Apr 03 '19 at 14:56
  • 6
    Try `df.dropna(how='all', axis=0, inplace=True)`. If you don't use `how='all'`, it will remove every row that has a `NaN` – anky Apr 03 '19 at 14:56
  • 1
    It sounds like you have a NaN in every row in some column. Adding to what anky_91 said, you can also have dropna look at only a subset of columns (or rows). So use `df = df.dropna(subset=["col1_name"])` and it will only drop rows that have NaN values in that column. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html – Declan Apr 03 '19 at 15:01
  • @EdChum I've shared the before and after dataset – Piper Ramirez Apr 03 '19 at 16:00
  • 1
    @pramire1 what are you trying to achieve here? dropna() will drop all rows since one or the other column in each row is a nan – anky Apr 03 '19 at 16:06
  • The `CancellationCode` column seems to be all `NaN`s, so drop it with `drop('CancellationCode', axis=1)`, then call `dropna` – Serg Nov 03 '21 at 03:31

3 Answers

2

Not sure about your case, but I'm sharing the solution that worked in mine:

The ones that didn't work:

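df = df.dropna()  # drops every row containing any NaN; this emptied the df
df = df.dropna(axis=0, inplace=True)  # with inplace=True, dropna returns None, so df becomes None
df.dropna(axis=0, inplace=True)  # same as the first line, but in place; this emptied the df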

The one that worked:

df.dropna(how='all',axis=0, inplace=True) #==> Worked very well...

Thanks to Anky above for his comment.
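To see why the failing attempts behave as they do, here is a minimal sketch on a made-up frame (the column names `a` and `b` are hypothetical, not from the question's data). It assumes, as in the question, that every row has a NaN in at least one column, and it relies on the behaviour of current pandas releases where `dropna(inplace=True)` returns None.

```python
import numpy as np
import pandas as pd

# Hypothetical data: every row has a NaN in at least one column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 2.0, np.nan]})

# Count NaNs per column to see which columns are responsible.
nan_counts = df.isna().sum()

# Without inplace, dropna returns a new frame; here no row survives.
emptied = df.dropna()

# With inplace=True the call returns None, so its result should not be assigned.
returned = df.copy().dropna(axis=0, inplace=True)
```

Checking `df.isna().sum()` first is usually the quickest way to find out which column is causing every row to be dropped.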

HassanSh__3571619
1

By default, dropna uses how='any', which means it deletes every row that contains at least one NaN.

The following, as you found out, deletes only the rows in which every column is NaN:

df.dropna(how='all', inplace=True)

or, more basic:

newDF = df.dropna(how='all')
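
As a rough illustration of the difference, the sketch below compares how='any', how='all', and the subset option mentioned in the comments. The frame and its column names (`x`, `y`, `z`) are invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "y" is NaN in every row, and row 2 is entirely NaN.
df = pd.DataFrame({
    "x": [1.0, 2.0, np.nan, 4.0],
    "y": [np.nan, np.nan, np.nan, np.nan],
    "z": [5.0, np.nan, np.nan, 8.0],
})

# Default behaviour: drop rows with at least one NaN (drops everything here).
any_rows = df.dropna(how="any")

# Drop only rows in which every value is NaN (drops row 2 only).
all_rows = df.dropna(how="all")

# Drop rows that have a NaN in column "x", ignoring the other columns.
subset_rows = df.dropna(subset=["x"])
```

Note that how='all' leaves NaNs in the rows it keeps, so if the model cannot handle any NaN in a particular column, the subset form is probably the closer fit.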
Lorenzo Bassetti
0

For anyone finding this in the future: try changing axis=0 to axis=1, which drops columns that are entirely NaN instead of rows.

df = df.dropna(axis=1, how='all')
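
A small sketch of that idea, assuming a frame where one column is NaN throughout. `CancellationCode` is taken from the comments above, while `Delay`, `Distance`, and the values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "CancellationCode" is NaN in every row.
df = pd.DataFrame({
    "Delay": [10.0, np.nan, 5.0],
    "Distance": [300.0, 450.0, 120.0],
    "CancellationCode": [np.nan, np.nan, np.nan],
})

# Step 1: drop columns that contain nothing but NaN.
no_empty_cols = df.dropna(axis=1, how="all")

# Step 2: drop the remaining rows that still contain a NaN.
cleaned = no_empty_cols.dropna(axis=0, how="any")
```

Dropping the all-NaN column first means the row-wise dropna afterwards only removes rows with genuinely missing values, rather than the whole frame.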
Carlost