Pandas Series any() vs all()

Question

>>> import pandas as pd
>>> s = pd.Series([float('nan')])
>>> s.any()
False
>>> s.all()
True

Isn't that weird? The documentation for any (Return whether any element is True over requested axis) and all (Return whether all elements are True over requested axis) is similar, but the difference in behavior doesn't make sense to me.

What gives?

Perhaps it's actually doing *"no elements are false"*, rather than *"all elements are true"*, and `nan` makes it three-valued logic? — jonrsharpe, Oct 14 '16 at 21:33
@jonrsharpe, i guess NaN's will be ignored altogether in this case - check this: `pd.Series([]).any()` and `pd.Series([]).all()` — MaxU - stand with Ukraine, Oct 14 '16 at 21:35
Yes, this is likely an issue with how `pandas` deals with NaN by default. — juanpa.arrivillaga, Oct 14 '16 at 21:36
What i can't understand is why does: `pd.Series([]).all()` give `True`, `all([])` also returns `True` (and i can't understand this as well...) — MaxU - stand with Ukraine, Oct 14 '16 at 21:37
Found this question that delves into it: http://stackoverflow.com/questions/3275058/reason-for-all-and-any-result-on-empty-lists — juanpa.arrivillaga, Oct 14 '16 at 21:42

score 3 · Accepted Answer · edited May 23 '17 at 12:06


It seems to be an issue with how pandas normally ignores NaN unless told not to:

>>> pd.Series([float('nan')]).any()
False
>>> pd.Series([float('nan')]).all()
True
>>> pd.Series([float('nan')]).any(skipna=False)
True
>>>

Note, NaN is truthy, which is why any(skipna=False) returns True:

>>> bool(float('nan'))
True
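As a minimal sketch of the point above (the variable names here are illustrative, not from the original answer): when NaN is kept with skipna=False, both reductions see a single truthy value, so both should come back True.

```python
import pandas as pd

nan = float("nan")
s = pd.Series([nan])

# NaN is a nonzero float, so Python treats it as truthy.
nan_is_truthy = bool(nan)

# With skipna=False the NaN is kept and evaluated like any other value.
any_keep_nan = s.any(skipna=False)
all_keep_nan = s.all(skipna=False)

print(nan_is_truthy, any_keep_nan, all_keep_nan)
```

So the surprising results only show up when the NaN is skipped, which is the default.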

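Also note: this is consistent with the built-in any and all. Once the NaN is skipped, the Series is effectively empty, and empty iterables return True for all and False for any. A relevant question on that topic: http://stackoverflow.com/questions/3275058/reason-for-all-and-any-result-on-empty-lists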
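A rough way to check the empty-iterable explanation, assuming that skipping NaN is equivalent to calling dropna() first (an assumption made for illustration, not a statement about pandas internals):

```python
import pandas as pd

s = pd.Series([float("nan")])

# Dropping the NaN by hand leaves an empty Series.
remaining = s.dropna()

# Default reductions (skipna=True) on the original Series.
any_default = s.any()
all_default = s.all()

# The same reductions on the explicitly emptied Series.
any_empty = remaining.any()
all_empty = remaining.all()

# Built-in any/all on an empty list, for comparison.
print(len(remaining))
print(any_default, any_empty, any([]))
print(all_default, all_empty, all([]))
```

If the explanation holds, each printed row after the first should show three matching values.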

Interestingly, the default behavior appears to be inconsistent with the documentation:

skipna : boolean, default True
    Exclude NA/null values. If an entire row/column is NA, the result will be NA

But observe:

>>> pd.Series([float('nan')]).any(skipna=None)
False
>>> pd.Series([float('nan')]).any(skipna=True)
False
>>> pd.Series([float('nan')]).any(skipna=False)
True
>>>

answered Oct 14 '16 at 21:39

juanpa.arrivillaga


The documentation says the following about `skipna`: `Exclude NA/null values. If an entire row/column is NA, the result will be NA`. And it doesn't work that way: `pd.Series([float('nan')]).any()` returns `False`, not `NA`. – Dennis Golomazov Oct 14 '16 at 21:43
@DennisGolomazov Yes, I was about to post that. – juanpa.arrivillaga Oct 14 '16 at 21:46
This just seems to be inconsistent, `skipna` is True by default in both `any` and `all`, so if `all` is True, `any` should be True too, shouldn't it? – Dennis Golomazov Oct 14 '16 at 21:46
@DennisGolomazov No, it seems to be consistent with the builtin `any` and `all`. Empty iterables return `True` for `all` and `False` for `any`. See the discussion [here](http://stackoverflow.com/questions/3275058/reason-for-all-and-any-result-on-empty-lists) – juanpa.arrivillaga Oct 14 '16 at 21:49
Good call, you might want to add that to the answer itself. – Dennis Golomazov Oct 14 '16 at 21:52
@DennisGolomazov went ahead and did that. I wonder if anyone has found this inconsistency with the documentation before? It seems like a sort of arcane corner. In any event, most of the `pandas` api ignores `NaN` by default, so maybe no one has been surprised by it yet... – juanpa.arrivillaga Oct 14 '16 at 21:58
Answer accepted, thank you! Yes, generally I find pandas documentation to be poorly organized. I tend to find most of the answers on SO, not in the docs. – Dennis Golomazov Oct 14 '16 at 22:00
Heh. Yeah, [this](http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.core.groupby.GroupBy.transform.html) is my favorite example of a very 'helpful' pandas documentation... – juanpa.arrivillaga Oct 14 '16 at 22:04
