4

I observed the following with Python 3.8.1 on macOS Mojave 10.14.6, and also with Python 3.7 (or possibly older) on some other platforms. I'm new to computing and don't know how to request an improvement to a language, but I think I've found a strange behaviour of the built-in function map.

As the code next(iter(())) raises StopIteration, I expected to get StopIteration from the following code:

tuple(map(next, [iter(())]))

To my surprise, this silently returned the tuple ()!

So it appears that the unpacking of the map object stopped when StopIteration came from next hitting the "empty" iterator returned by iter(()). However, I don't think the exception was handled correctly: the StopIteration was raised only after the "empty" iterator had been picked from the list (to be hit by next), so it did not come from the list running out of items.

  1. Did I understand the behaviour correctly?
  2. Is this behaviour somehow intended?
  3. Will this be changed in the near future? Otherwise, how can I get the behaviour I expected?

Edit: The behaviour is similar if I unpack the map object in different ways, such as with list, a for loop, unpacking within a list, unpacking for function arguments, set, or dict. So I believe it's map, not tuple, that's wrong.
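
A minimal sketch of the checks described in this edit, assuming Python 3. The helper name fresh_map is introduced here only for illustration; it builds a new map object for each consumer so that every consumer starts from an untouched one.

```python
def fresh_map():
    # Hypothetical helper: a new map object over one "empty" iterator.
    return map(next, [iter(())])

as_list = list(fresh_map())
as_set = set(fresh_map())
as_dict = dict(fresh_map())
unpacked = [*fresh_map()]

looped = []
for item in fresh_map():
    looped.append(item)

# Every consumer silently sees an empty iterable instead of an error.
print(as_list, as_set, as_dict, unpacked, looped)
```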

Edit: Actually, in Python 2 (2.7.10), the "same" code raises StopIteration. I think this is the desirable result (except that map in this case does not return an iterator).

  • 2
    This behavior looks correct, since `map(next, [iter(())])` returns an (empty) map object – Josh Abraham Mar 27 '20 at 01:22
  • 4
    `map` doesn't catch the StopIteration exception. It lets it propagate, which looks like the end of the map. – user2357112 Mar 27 '20 at 01:49
  • 3
    @chepner `map` doesn't catch the `StopIteration`. it bubbles up and then tuple(...) thinks it's the end of the iterable. – wim Mar 27 '20 at 02:03
  • @JoshAbraham: Could you explain why an (empty) map object should be returned without an error? – Takuo Matsuoka Mar 27 '20 at 02:04
  • 1
    Josh is wrong, or at least not communicating clearly. `map(next, [iter(())])` returns something that looks like a normal empty map due to the `StopIteration` propagating out of `next`, but it's not a normal empty map. – user2357112 Mar 27 '20 at 02:07
  • Do you have a real use-case that requires you to ``map`` ``next`` onto an iterable of iterators, or is this just some toy situation? – MisterMiyagi Mar 27 '20 at 07:58
  • @wim Yeah, I noticed that when I commented on one of the answers; I think I missed reloading this page before making that comment and deleting my comment above. – chepner Mar 27 '20 at 11:52
  • @MisterMiyagi In the case of ``next``, I think the job is normally ``zip``'s. However, what we now see is there is always a danger whenever calling the first argument of ``map`` may result in ``StopIteration``. – Takuo Matsuoka Mar 27 '20 at 12:56

3 Answers

5

This isn't a map bug. It's an ugly consequence of Python's decision to rely on exceptions for control flow: actual errors look like normal control flow.

When map calls next on iter(()), next raises StopIteration. This StopIteration propagates out of map.__next__ and into the tuple call. This StopIteration looks like the StopIteration that map.__next__ would normally raise to signal the end of the map, so tuple thinks that the map is simply out of elements.
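
A rough pure-Python model may make this easier to see. It is only a sketch, assuming that a map object over a single iterable behaves like the class below; PyMap is a hypothetical name, and the real map is implemented in C.

```python
class PyMap:
    """Rough model of a map object over one iterable (hypothetical name)."""

    def __init__(self, function, iterable):
        self._function = function
        self._iterator = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration raised here means the input really is exhausted.
        item = next(self._iterator)
        # StopIteration raised here escapes unchanged, so the caller
        # cannot tell it apart from the normal end-of-iteration signal.
        return self._function(item)


result = tuple(PyMap(next, [iter(())]))
print(result)  # ()

doubled = tuple(PyMap(lambda x: 2 * x, [1, 2, 3]))
print(doubled)  # (2, 4, 6)
```

Nothing in __next__ catches anything; the exception from the mapped function simply travels through it into tuple.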

This leads to weirder consequences than what you saw. For example, a map iterator doesn't mark itself exhausted when the mapped function raises an exception, so you can keep iterating over it even afterward:

m = map(next, [iter([]), iter([1])])

print(tuple(m))
print(tuple(m))

Output:

()
(1,)

(The CPython map implementation doesn't actually have a way to mark itself exhausted - it relies on the underlying iterator(s) for that.)

This kind of StopIteration problem was annoying enough that they actually changed generator StopIteration handling to mitigate it. StopIteration used to propagate normally out of a generator, but now, if a StopIteration would propagate out of a generator, it gets replaced with a RuntimeError so it doesn't look like the generator ended normally. This only affects generators, though, not other iterators like map.
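
As a hedged illustration of the generator behaviour, assuming Python 3.7 or later (where the new handling is always active): gen_map below is a hypothetical generator-based stand-in for map, not anything from the standard library.

```python
def gen_map(function, iterable):
    # Hypothetical generator-based stand-in for a one-iterable map.
    for item in iterable:
        yield function(item)


outcome = None
try:
    tuple(gen_map(next, [iter(())]))
except RuntimeError as error:
    # The StopIteration from next() was replaced on its way out of the generator.
    outcome = str(error)

print(outcome)  # generator raised StopIteration
```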

user2357112
  • Well explained (+1). There is a similar "mistaken exception" gotcha with `hasattr`: if some buggy code within a property getter caused an `AttributeError` then `hasattr` does not understand what happened and just reports the property as not there. – wim Mar 27 '20 at 02:18
  • 1
    Thanks for the detailed answer. So the answer to my question (1) is No: it's not ``map``'s problem. And to my question (2), the behaviour as a whole is not intended anyway, is this correct? – Takuo Matsuoka Mar 27 '20 at 02:32
  • Well, I don't think intention can be known. The behaviour is annoying to some people (for the right reason, I think). – Takuo Matsuoka Mar 27 '20 at 02:38
  • I'm still NOT convinced that "(relying) on exceptions for control flow" is a bad idea. Indeed, even with the old (i.e., before PEP 479) treatment of ``StopIteration`` out of a generator, the following function ``map2`` would (although I don't have Python 3 older than 3.5) have worked desirably (though not after PEP 479). – Takuo Matsuoka Mar 27 '20 at 07:33
  • The code: ```def map2(function, iterable): "This is a 2-argument version for simplicity." iterator = iter(iterable) while True: arg = next(iterator) # StopIteration out here would have been propagated. try: yield function(arg) except StopIteration: raise RuntimeError("generator raised StopIteration") ``` – Takuo Matsuoka Mar 27 '20 at 07:33
  • My point was I think the problem is still that ``map`` is treating exceptions irrespective of where they come from. You could see this, if indentation in the code was displayed correctly... (But I've learned using a generator function can be safer now than ``map``.) – Takuo Matsuoka Mar 27 '20 at 07:36
  • @TakuoMatsuoka Note that checking the origin of the ``StopIteration`` would have a serious impact on heavily nested iterators. It penalises the expected situation of correct code for the unexpected situation of someone mishandling internal protocols. – MisterMiyagi Mar 27 '20 at 08:42
  • @MisterMiyagi If you mean something like ``map(f, map(f0, map(f00, ...), map(f01, ...), ...), map(f1, map(f10, ...), map(...), ...), ...)``, then the functions can be composed first to avoid nesting, but yes, if it's left nested, small increases multiply quickly, so the current ``map`` would be for when speed matters more than reliability. Thanks for the observation. – Takuo Matsuoka Mar 27 '20 at 12:57
1
  1. Did I understand the behavior correctly?

Not quite. map takes its first argument, a function, and applies it to every item in some iterable, its second argument, until it catches the StopIteration exception. This is an internal exception raised to tell the function that it has reached the end of the object. If you're manually raising StopIteration, it sees that and stops before it has the chance to process any of the (nonexistent) objects inside the list.

AAM111
  • Thanks for your answer! I think ``map`` should stop iterating only when its second argument stops iterating, but NOT when its first argument raises ``StopIteration`` for whatever reason. That's why I find the current behaviour strange. – Takuo Matsuoka Mar 27 '20 at 02:19
  • @TakuoMatsuoka But how would `map` be able to tell the difference? Are you suggesting to introspect the traceback on the exception instance? That would likely be slow and flakey. – wim Mar 27 '20 at 02:20
  • 2
    @TakuoMatsuoka funny OK, this answer is not worded too well, maybe "catches" should be replaced with "reaches". Actually, map doesn't iterate at all, it just returns an iterator that you have to iterate with something else. Whatever "other thing" which iterates the map is usually what catches that `StopIteration` exception. Does that make sense? – wim Mar 27 '20 at 03:07
  • It may just be the wording, but `map` does not apply the function; it creates an object that saves references to both the function and the iterable, and when you call this object's `__next__` method, it calls the iterable's `__next__` method, applies the function to the result, and returns that result. – chepner Mar 27 '20 at 11:50
  • @wim Right, it's not ``map`` that iterates. The correct expression would be "a map object should stop iterating only when the second argument of ``map`` (which created the map object) stops iterating" or further, "a map object's ``__next__`` method should raise (or let propagate) ``StopIteration`` only when ``StopIteration`` comes from the ``__next__`` method of the second argument of ``map``". Thanks for pointing out the confusion. – Takuo Matsuoka Mar 28 '20 at 04:31
0

I'm the poster of the question, and I would like to summarize here what I have learned and what I think has been left. (I do not plan to post it as a new question.)

In Python, StopIteration coming from the __next__ method of an iterator is treated as a signal that the iterator has reached the end. (Otherwise, it's the signal of an error.) Thus, the __next__ method of an iterator must catch every StopIteration that is not meant to be a signal of the end.

A map object is created by code of the form map(func, *iterables), where func is a function and *iterables stands for a finite sequence of one (as of Python 3.8.1) or more iterables. There are (at least) two kinds of subprocess within the __next__ process of the resulting map object that may raise StopIteration:

  1. Process where the __next__ method of one of the iterables in the sequence *iterables is called.
  2. Process where the argument func is called.

The intention of map, as I understand it from its documentation (or as displayed by help(map)), is that StopIteration coming from a subprocess of kind (2) is NOT the end of the map object. However, the current behaviour of a map object's __next__ is such that its process emits StopIteration in this case. (I haven't checked whether it actually catches StopIteration or not. If it does, then it raises StopIteration again anyway.) This appears to be the cause of the problem I asked about.

In an answer above, user2357112 supports Monica (let me, in a friendly spirit, abbreviate the name as "User Primes") finds this consequence ugly, but answers that it's Python's fault, not map's. Unfortunately, I do not find convincing support for this conclusion in the answer. I suspect fixing map would be better, but some other people seem to disagree with this for performance reasons. I know nothing about the implementation of Python's built-in functions and cannot judge, so this point remains open for me. Nevertheless, User Primes' answer was informative enough that the remaining question is no longer important to me. (Thanks again, user2357112 supports Monica!)

By the way, the code I tried to post in a comment to User Primes' answer is as follows. (I think it would have worked before PEP 479.)

def map2(function, iterable):
    "This is a 2-argument version for simplicity."
    iterator = iter(iterable)
    while True:
        arg = next(iterator) # StopIteration out here would have been propagated.
        try:
            yield function(arg)
        except StopIteration:
            raise RuntimeError("generator raised StopIteration")
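
For comparison, here is a sketch of a variant that should behave as intended on Python 3.7 and later, where PEP 479 is always in effect. The name map2_pep479 and the error message are made up for this illustration. It uses a for loop instead of calling next directly, on the assumption that the only change needed is to keep StopIteration from the input out of the generator body.

```python
def map2_pep479(function, iterable):
    "2-argument generator version intended to work under PEP 479."
    for arg in iterable:  # the loop ends quietly when the input is exhausted
        try:
            result = function(arg)
        except StopIteration as error:
            # The mapped function raised StopIteration: report it as an error.
            raise RuntimeError("mapped function raised StopIteration") from error
        # Yield outside the try block so that exceptions thrown into the
        # generator by its consumer are not mistaken for errors from function.
        yield result


print(tuple(map2_pep479(str, [1, 2])))  # ('1', '2')

try:
    tuple(map2_pep479(next, [iter(())]))
except RuntimeError as error:
    print("RuntimeError:", error)  # RuntimeError: mapped function raised StopIteration
```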

What's below is a slightly different version of map2 (again, a 2-argument version), which might be more convenient (posted in the hope of getting suggestions for improvement!):

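class StopIteration1(RuntimeError):
    # Raised when the mapped function itself raises StopIteration.
    # Its argument counts how many nested map1 objects the error passed through.
    pass

class map1(map):
    def __new__(cls, func, iterable):
        iterator = iter(iterable)
        self = super().__new__(cls, func, iterator)
        def __next__():
            # StopIteration from here means the input is exhausted; let it propagate.
            arg = next(iterator)
            try:
                return func(arg)
            except StopIteration:
                # func raised StopIteration: report an error, not the end.
                raise StopIteration1(0)
            except StopIteration1 as error:
                # The error came from a nested map1: increase the nesting depth.
                raise StopIteration1(error.args[0] + 1)
        # Stored on the instance; the class-level __next__ below delegates to it.
        self.__next__ = __next__
        return self
    def __next__(self):
        return self.__next__()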

# tuple(map1(tuple,
#            [map1(next,
#                  [iter([])])]))
# ---> <module>.StopIteration1: 1