49

I have a large dataframe. When it was created, 'None' was used as the value where a number could not be calculated (instead of 'nan').

How can I delete all rows that have 'None' in any of its columns? I thought I could use df.dropna and set the value of na, but I can't seem to be able to.

Thanks

I think this is a good representation of the dataframe:

import pandas as pd
temp = pd.DataFrame(data=[['str1','str2',2,3,5,6,76,8],['str3','str4',2,3,'None',6,76,8]])
jlt199

5 Answers

52

Setup
Borrowed @MaxU's df

import pandas as pd

df = pd.DataFrame([
    [1, 2, 3],
    [4, None, 6],
    [None, 7, 8],
    [9, 10, 11]
], dtype=object)

Solution
You can just use pd.DataFrame.dropna as is

df.dropna()

   0   1   2
0  1   2   3
3  9  10  11

Supposing you have 'None' strings, like in this df

df = pd.DataFrame([
    [1, 2, 3],
    [4, 'None', 6],
    ['None', 7, 8],
    [9, 10, 11]
], dtype=object)

Then combine dropna with mask

df.mask(df.eq('None')).dropna()

   0   1   2
0  1   2   3
3  9  10  11

You can ensure that the entire dataframe is treated as object when you compare by casting it first.

df.mask(df.astype(object).eq('None')).dropna()

   0   1   2
0  1   2   3
3  9  10  11
piRSquared
  • I get the error `TypeError: Could not compare ['None'] with block values` with the strings solution and a dataframe of the same size as before with the first solution – jlt199 Aug 04 '17 at 18:26
36

Thanks for all your help. In the end I was able to get

df = df.replace(to_replace='None', value=np.nan).dropna()

to work. I'm not sure why your suggestions didn't work for me.
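
For completeness, a minimal runnable sketch of that approach, using the temp dataframe from the question:

import numpy as np
import pandas as pd

# the dataframe from the question, where 'None' is stored as a string
temp = pd.DataFrame(data=[['str1', 'str2', 2, 3, 5, 6, 76, 8],
                          ['str3', 'str4', 2, 3, 'None', 6, 76, 8]])

# replace the string 'None' with a real NaN, then drop any row containing NaN
temp = temp.replace(to_replace='None', value=np.nan).dropna()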

jlt199
11

UPDATE:

In [70]: temp[temp.astype(str).ne('None').all(1)]
Out[70]:
      0     1  2  3  4  5   6  7
0  str1  str2  2  3  5  6  76  8

Old answer:

In [35]: x
Out[35]:
      a     b   c
0     1     2   3
1     4  None   6
2  None     7   8
3     9    10  11

In [36]: x = x[~x.astype(str).eq('None').any(1)]

In [37]: x
Out[37]:
   a   b   c
0  1   2   3
3  9  10  11

or a slightly nicer variant from @roganjosh:

In [47]: x = x[x.astype(str).ne('None').all(1)]

In [48]: x
Out[48]:
   a   b   c
0  1   2   3
3  9  10  11
MaxU - stand with Ukraine
2

I'm a bit late to the party, but this is probably the simplest method:

df.dropna(axis=0, how='any')

Parameters: axis=0/'index' or 1/'columns', how='any' or 'all'

axis=0 drops rows (the most common case) and axis=1 drops columns instead. The how parameter determines whether a row/column is dropped when it contains any missing value (how='any') or only when all of its values are missing (how='all').
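
For illustration, here is a minimal sketch of those parameters in action (note that dropna only drops real missing values like None/NaN, not the string 'None'):

import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3],
                   [4, np.nan, 6],
                   [np.nan, np.nan, np.nan]])

df.dropna(axis=0, how='any')  # keeps only row 0 (rows 1 and 2 each contain a NaN)
df.dropna(axis=0, how='all')  # drops only row 2 (all of its values are NaN)
df.dropna(axis=1, how='any')  # same idea applied to columns instead of rows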

Oenomaus
  • NOTE that how='any' is the default value for .dropna(); hence this doesn't work, which is precisely what @jlt199 pointed out in the question. – iamakhilverma Sep 11 '22 at 13:16
0

If 'None' is still not removed, we can do

df = df.replace(to_replace='None', value=np.nan).dropna()

The above solution worked only partially for me: the 'None' was converted to NaN but the rows were not removed (thanks to the answer above, as it helped me move further). So I added one more line of code, which converts the particular column to strings:

df['column'] = df['column'].apply(lambda x : str(x))

This changes the NaN values to the string 'nan'; now remove those rows:

df = df[df['column'] != 'nan']
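
Putting those steps together, a minimal sketch on toy data (the column name 'column' is just a placeholder; on this toy data the last two lines are no-ops, but they show the fallback pattern):

import numpy as np
import pandas as pd

# toy data where 'None' appears as a string in a placeholder column named 'column'
df = pd.DataFrame({'column': ['1.5', 'None', '2.0'], 'other': [1, 2, 3]})

df = df.replace(to_replace='None', value=np.nan).dropna()  # 'None' -> NaN, then drop
df['column'] = df['column'].apply(lambda x: str(x))        # any surviving NaN becomes the string 'nan'
df = df[df['column'] != 'nan']                             # drop those remaining rows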

Raj