I was writing a blog article on what reduce is, how it works, and the functions it hides behind. At a certain point I talk about sum, all, any, max, and min, which are all reductions.

Then I explain that reduce accepts an optional third argument, the initial value, which should be the identity element of the function you are using. I go on to show that some specialised reductions have the identity element baked in:

>>> sum([])
0
>>> import math
>>> math.prod([])
1
>>> all([])
True
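
For comparison, here is a minimal sketch of the same three results obtained from functools.reduce with an explicit initial value. The operator pairings are an illustration rather than how the built-ins are implemented; they assume the usual identity elements (0 for addition, 1 for multiplication, True for logical "and").

```python
from functools import reduce
import operator

# With an empty iterable, reduce simply returns the initial value.
empty_sum = reduce(operator.add, [], 0)
empty_prod = reduce(operator.mul, [], 1)
empty_all = reduce(lambda acc, x: acc and bool(x), [], True)

# Without an initial value, reduce has nothing to return for an
# empty iterable and raises TypeError.
try:
    reduce(operator.add, [])
    raised = False
except TypeError:
    raised = True

print(empty_sum, empty_prod, empty_all, raised)
```

In this sketch the initial value plays the role that the baked-in identity plays in sum, math.prod, and all.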

So, why is it that max and min throw errors when called with empty iterables, when their operations have identity elements?

>>> max([])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: max() arg is an empty sequence

Python can represent float("-inf"), so why isn't this the value returned by max([])? Similarly, why doesn't min([]) return float("inf")?

RGS
  • Because an empty iterable doesn’t have any items, how can its max have _any_ value? An identity isn’t just something that works, it needs to make sense! – Boris the Spider Jun 08 '21 at 21:23
  • @BoristheSpider But `prod([])` nonetheless returns 1. – DYZ Jun 08 '21 at 21:24
  • The default start value for `math.prod` is `1`. If you want `0` then set it: `math.prod([], start=0)`. Unfortunately this will lead to a result of `0` for every list. – Matthias Jun 08 '21 at 21:27
  • An [interesting development](https://stackoverflow.com/questions/36157995/a-safe-max-function-for-empty-lists). – DYZ Jun 08 '21 at 21:27
  • `prod` is supposed to calculate the product of numbers, but a list can contain anything, and `max` can be applied to any list of items as long as they can be compared. Why would an infinite float be a reasonable max value for a list of strings if it happens to be empty? – Thierry Lathuille Jun 08 '21 at 21:29
  • @DYZ returning 1 for `prod` is based on a mathematics convention... think of `n^0` for `n ≠ 0`. – andand Jun 08 '21 at 21:34
  • Seems to me the logical value to return for `max([])` or `min([])` would be `None`. Not sure that's helpful for this discussion, just my opinion. – andand Jun 08 '21 at 21:36
  • Returning `None` would then mean that no meaningful max value can be returned, and that's exactly the kind of situation in which an exception should be raised. There is a generic way to manage exceptions, whereas creating a special case where `max` returns `None` would both prevent the code from failing immediately, allowing invalid values to propagate, and force the rest of the code to test the returned value each time it's used. – Thierry Lathuille Jun 08 '21 at 21:41
  • @ThierryLathuille I completely missed the fact that something like `max("abc", "da")` returns `"da"`, that certainly is the reason! – RGS Jun 08 '21 at 21:55
  • Another good reason: each time I tried to get the max of an empty list, it was because of an error in my code. In such a case, I prefer to get an exception immediately telling me that I tried to get the max of an empty sequence, rather than have an unexpected infinite value used in the rest of the code, causing harder-to-find bugs later on. – Thierry Lathuille Jun 08 '21 at 22:28

1 Answer

Because min() and max() accept values of any type, and those values might not be comparable with floats. For example, strings:

>>> min('a', 'b')
'a'
>>> min(float('-inf'), 'a', 'b')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'str' and 'float'

If you want the behaviour you're talking about, you can use the default parameter:

>>> min([], default=float('inf'))
inf
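
As a rough sketch of how the default parameter covers both the numeric case from the question and non-numeric types, the snippet below picks a fallback per type. The variable names are hypothetical, and the choice of "" as the fallback for strings is an assumption made for illustration: it works for max because the empty string compares less than or equal to every other string.

```python
from functools import reduce

numbers = []
words = []

# default is only consulted when the iterable is empty.
largest_number = max(numbers, default=float("-inf"))
smallest_number = min(numbers, default=float("inf"))

# For strings, "" behaves like an identity for max (illustrative choice).
largest_word = max(words, default="")

# The same idea expressed with reduce and an explicit initial value.
largest_via_reduce = reduce(max, numbers, float("-inf"))

# With a non-empty iterable the default is ignored.
largest_non_empty = max([3, 1], default=float("-inf"))

print(largest_number, smallest_number, repr(largest_word), largest_non_empty)
```

Passing the fallback explicitly keeps the choice of identity with the caller, who knows what type the items have.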

In other words, min() and max() have broader applications than the other functions you mentioned. sum() and math.prod() are primarily used for numbers, so it makes sense to give them numeric start values, although you can change that with their start parameters. Here are some examples with lists, though they're not very idiomatic:

>>> math.prod([2, 3], start=['a'])  # better: ['a'] * math.prod([2, 3])
['a', 'a', 'a', 'a', 'a', 'a']
>>> sum([['a'], ['b']], start=[])  # better: list(itertools.chain(['a'], ['b']))
['a', 'b']
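
The "better" alternatives mentioned in the comments above can be written out as a short sketch. It uses itertools.chain.from_iterable, a small variation on the chain call in the comment, so that the list of lists can be passed as a single argument.

```python
import itertools
import math

# Repeat a list by multiplying it by a plain integer product.
repeated = ["a"] * math.prod([2, 3])

# Flatten a list of lists without relying on sum() and a list start value.
nested = [["a"], ["b"]]
flattened = list(itertools.chain.from_iterable(nested))

print(repeated)
print(flattened)
```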
wjandrea