
I am selecting the rows that have a non-null value in a column of a pandas DataFrame with the following code. I need to convert this code to pandas.query().

results = rs_gp[rs_gp['Col1'].notnull()]

When I convert to:

results = rs_gp.query('Col1!=None')

it gives me the error:

None is not defined
Tonechas
Rtut

2 Answers


We can use the fact that NaN != NaN:

In [1]: np.nan == np.nan
Out[1]: False

So comparing the column to itself returns only the rows with non-NaN values:

rs_gp.query('Col1 == Col1')

Demo:

In [42]: df = pd.DataFrame({'Col1':['aaa', np.nan, 'bbb', None, '', 'ccc']})

In [43]: df
Out[43]:
   Col1
0   aaa
1   NaN
2   bbb
3  None
4
5   ccc

In [44]: df.query('Col1 == Col1')
Out[44]:
  Col1
0  aaa
2  bbb
4
5  ccc
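
As a quick cross-check, here is a minimal sketch that compares the self-comparison trick with the original notnull() mask. The frame and its values are hypothetical (a small numeric column, not the asker's rs_gp); the reverse comparison 'Col1 != Col1' is included to show that it selects the NaN rows instead:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column name Col1 follows the question.
df = pd.DataFrame({"Col1": [1.0, np.nan, 3.0, np.nan, 5.0]})

# NaN never equals itself, so this keeps only the non-NaN rows.
not_null = df.query("Col1 == Col1")

# The reverse comparison keeps only the NaN rows.
is_null = df.query("Col1 != Col1")

# The boolean-mask version from the question, for comparison.
mask_based = df[df["Col1"].notnull()]

print(not_null.equals(mask_based))
print(list(is_null.index))
```

Both selections keep the original index labels, so the results line up row for row with the mask-based version.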
MaxU - stand with Ukraine

I don't know whether this was added to pandas after the first answer to this question was written, but notnull() and isnull() are now valid options for queries in pandas.

df.query('Col1.isnull()', engine='python')

This returns all rows where the value in Col1 is null.

df.query('Col1.notnull()', engine='python')

Conversely, this query returns every row where the value in Col1 is not null.

In addition, passing engine='python' lets you use pandas methods such as these inside a query.
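
A minimal sketch of both queries, assuming a small hypothetical frame modelled on the demo in the other answer (it is not the asker's rs_gp). It suggests that the query form and the plain boolean mask select the same rows, and that both NaN and None count as null:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; note the empty string is NOT null.
df = pd.DataFrame({"Col1": ["aaa", np.nan, "bbb", None, "", "ccc"]})

# Rows where Col1 holds a value (the empty string counts as a value).
present = df.query("Col1.notnull()", engine="python")

# Rows where Col1 is NaN or None.
missing = df.query("Col1.isnull()", engine="python")

# Equivalent boolean-mask selection from the question.
mask_based = df[df["Col1"].notnull()]

print(list(present.index))
print(list(missing.index))
```

The two query results are complementary: every row of df ends up in exactly one of them.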

Phil