>>> s = pd.Series([float('nan')])
>>> s.any()
False
>>> s.all()
True

Isn't that weird? Documentation on any (Return whether any element is True over requested axis) and all (Return whether all elements are True over requested axis) is similar, but the difference in behavior doesn't seem to make sense to me.

What gives?

Dennis Golomazov

1 Answer


It seems to be an issue with how pandas normally ignores NaN unless told not to:

>>> pd.Series([float('nan')]).any()
False
>>> pd.Series([float('nan')]).all()
True
>>> pd.Series([float('nan')]).any(skipna=False)
True

Note that NaN is truthy:

>>> bool(float('nan'))
True

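Also note that this is consistent with the built-in any and all. Once the NaN is skipped, the Series is effectively empty, and empty iterables return True for all and False for any. There is a relevant question on that topic: http://stackoverflow.com/questions/3275058/reason-for-all-and-any-result-on-empty-lists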
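As a rough sketch of that reasoning in plain Python (this is not how pandas implements it): the `drop_nan` helper below is hypothetical and only stands in for what `skipna=True` is assumed to do, namely discard NaN before the reduction runs.

```python
import math

values = [float("nan")]


def drop_nan(items):
    """Hypothetical helper (not pandas API): discard NaN entries,
    as an assumed stand-in for skipna=True."""
    return [x for x in items if not (isinstance(x, float) and math.isnan(x))]


kept = drop_nan(values)  # nothing survives, so the reductions see an empty list

any_skipped = any(kept)  # any() of an empty iterable is False
all_skipped = all(kept)  # all() of an empty iterable is True

any_raw = any(values)    # NaN itself is truthy, so this is True
all_raw = all(values)    # the only element is truthy, so this is True

print(kept, any_skipped, all_skipped, any_raw, all_raw)
```

With the NaN dropped, the results match the default `any()` and `all()` output in the question. With the NaN kept, `any` is True, which matches the `skipna=False` result.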

Interestingly, the default behavior appears to be inconsistent with the documentation:

skipna : boolean, default True
    Exclude NA/null values. If an entire row/column is NA, the result will be NA

But observe:

>>> pd.Series([float('nan')]).any(skipna=None)
False
>>> pd.Series([float('nan')]).any(skipna=True)
False
>>> pd.Series([float('nan')]).any(skipna=False)
True
juanpa.arrivillaga
  • The documentation says the following about `skipna`: `Exclude NA/null values. If an entire row/column is NA, the result will be NA`. And it doesn't work that way: `pd.Series([float('nan')]).any()` returns `False`, not `NA`. – Dennis Golomazov Oct 14 '16 at 21:43
  • @DennisGolomazov Yes, I was about to post that. – juanpa.arrivillaga Oct 14 '16 at 21:46
  • This just seems to be inconsistent, `skipna` is True by default in both `any` and `all`, so if `all` is True, `any` should be True too, shouldn't it? – Dennis Golomazov Oct 14 '16 at 21:46
  • @DennisGolomazov No, it seems to be consistent with the builtin `any` and `all`. Empty iterables return `True` for `all` and `False` for `any`. See the discussion [here](http://stackoverflow.com/questions/3275058/reason-for-all-and-any-result-on-empty-lists) – juanpa.arrivillaga Oct 14 '16 at 21:49
  • Good call, you might want to add that to the answer itself. – Dennis Golomazov Oct 14 '16 at 21:52
  • @DennisGolomazov went ahead and did that. I wonder if anyone has found this inconsistency with the documentation before? It seems like a sort of arcane corner. In any event, most of the `pandas` api ignores `NaN` by default, so maybe no one has been surprised by it yet... – juanpa.arrivillaga Oct 14 '16 at 21:58
  • Answer accepted, thank you! Yes, generally I find pandas documentation to be poorly organized. I tend to find most of the answers on SO, not in the docs. – Dennis Golomazov Oct 14 '16 at 22:00
  • Heh. Yeah, [this](http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.core.groupby.GroupBy.transform.html) is my favorite example of a very 'helpful' pandas documentation... – juanpa.arrivillaga Oct 14 '16 at 22:04