94

I am interested in understanding the new language design of Python 3.x.

I do enjoy, in Python 2.7, the function map:

Python 2.7.12
In[2]: map(lambda x: x+1, [1,2,3])
Out[2]: [2, 3, 4]

However, in Python 3.x things have changed:

Python 3.5.1
In[2]: map(lambda x: x+1, [1,2,3])
Out[2]: <map at 0x4218390>

I understand the how, but I could not find a reference to the why. Why did the language designers make this choice, which, in my opinion, introduces a great deal of pain? Was it to strong-arm developers into sticking with list comprehensions?

IMO, lists can naturally be thought of as Functors, and I have somehow been taught to think this way:

fmap :: (a -> b) -> f a -> f b
NoIdeaHowToFixThis
  • 3
    The rationale should be the same as to why we use generators instead of list comprehensions. By using lazy evaluation we don't need to keep huge things in memory. Check the accepted answer here: http://stackoverflow.com/questions/1303347/getting-a-map-to-return-a-list-in-python-3-x – Moberg Oct 13 '16 at 08:04
  • In C#, "maps" are lazily evaluated. I'd wager it's the same with Python 3's map or generator expressions. This saves memory. – Mateen Ulhaq Oct 13 '16 at 08:04
  • 8
    Could you explain why this brings you "a great deal of pain"? – RemcoGerlich Oct 13 '16 at 08:06
  • And do you seriously prefer that map in 2.7 over `[x+1 for x in [1,2,3]]` ? – RemcoGerlich Oct 13 '16 at 08:08
  • 3
    I think it's because years of usage showed that most common uses of `map` simply iterated over the result. Building a list when you don't need it is inefficient so the devs decided to make `map` lazy. There's a lot to be gained here for performance and not a lot to be lost (If you need a list, just ask for one ... `list(map(...))`). – mgilson Oct 13 '16 at 08:09
  • 3
    Ok, I find it interesting that rather than keeping the Functor pattern and offering a lazy version of List, they somehow made it a decision to force a lazy evaluation of a list whenever it is mapped. I would have preferred to have the right to make my own choice, aka, Generator -> map -> Generator or List -> map -> List (up to me to decide) – NoIdeaHowToFixThis Oct 13 '16 at 08:09
  • 5
    @NoIdeaHowToFixThis, actually it is up to you: if you need the whole list, just convert it to a list, easy as hell – Netwave Oct 13 '16 at 08:10
  • @NoIdeaHowToFixThis: you have that choice, you can use either generator or list expressions, or use `list(map(...))`. – RemcoGerlich Oct 13 '16 at 08:10
  • 1
    Well, yes, of course, I can convert the iterator back to a list, but that pollutes my code a lot (personal opinion): `list(map(..))` instead of `map(..)`, and here is my pain. @RemcoGerlich: well, I did pick a toy example, but there are instances where I feel using map is more convenient than a list comprehension (personal opinion) – NoIdeaHowToFixThis Oct 13 '16 at 08:15
  • Anyway, I think we should not make SO the place for a debate. I think you all have helped me understand the design choices and the strategy. Thanks! – NoIdeaHowToFixThis Oct 13 '16 at 08:17
  • @Chris_Rands: In Python 2 you have generator expression: `(f(x) for x in xs)` – abukaj Oct 13 '16 at 10:22
  • 1
    The whole "laziness" of `map()` is highly debatable, since it neither is subscriptable nor yields same results when iterated twice (try: `m = map(str, [1, 2,3]); print(list(m)); print(list(m))`). – abukaj Oct 13 '16 at 10:27
  • @Chris_Rands I am not claiming there is no list comprehension ;). I forgot to mention `itertools.imap`, which may better serve your purpose. – abukaj Oct 13 '16 at 10:36
  • 1
    @abukaj Well, Python is not Haskell. Python doesn't have referential transparency, so you should **not** expect that *any* expression evaluated twice produces the same result. – Bakuriu Oct 13 '16 at 12:41
  • 1
    @NoIdeaHowToFixThis So you would have required to introduce a whole new lazy data type instead? This still doesn't provide what python3 map does: a lazy data structure still consumes memory when used. python3's map can be iterated over in constant space. This matters if you have something like: `for k in map(func, itertools.count()): if predicate(k): break`. Using a lazy data structure this will consume more and more memory until the OOM kills the process. Python3 map instead uses O(1) memory during that loop. – Bakuriu Oct 13 '16 at 12:46
  • 1
    @Bakuriu - well, if this is the concern, then there should be a Stream or something. Still learning Python. I was wondering why map does what it does. In other programming languages the behavior (memory footprint, laziness, cost of operations, etc.) is part of the design of the data structures, which are then functors. That is: map just maps the data structure, and the return type is what you feed in. – NoIdeaHowToFixThis Oct 13 '16 at 13:11
  • @Bakuriu I am not expecting it. I claim that the output of `map()` in Python 3 is not a lazily evaluated output of `map()` of Python 2. My point is it returns an iterator, not a lazily-evaluated list. Addressing the second comment, in Python 2 you can use `itertools.imap()` for memory-saving purposes on iteration. – abukaj Oct 13 '16 at 13:12
  • @NoIdeaHowToFixThis That's because in those languages `map` is a method of the class/interface. In Python `map` is just a function that accepts one (or more) iterables and produces an iterator. `map` will not work on a tree data structure, only on strictly sequential data. Do not think of Python's `map` as the `Functor` operation. – Bakuriu Oct 13 '16 at 13:16
  • 1
    @abukaj I never stated that the result is lazy version of a list. The result is produced in a lazy way, e.g. on demand, but the result itself is an iterator object and acts as an iterator. The term lazy simply means "computed when it is needed" not that the result should be a real list. – Bakuriu Oct 13 '16 at 13:17
  • @Bakuriu Then we agree:). My comment was a response to claims, that the change was just introducing laziness to `map()`. I use `map()` (2.7) to obtain a "stable" image of a sequence rather than for one-time iteration purposes, so for me the change was much more than making the built-in lazy. – abukaj Oct 13 '16 at 14:35
  • 1
    @NoIdeaHowToFixThis: "the return type is what you feed in" - a design like that wouldn't be able to accept arbitrary iterables. `map(str, xrange(5))` couldn't possibly return an `xrange`, and `map(int, some_file)` couldn't possibly return a `file`. – user2357112 Oct 13 '16 at 16:42
  • 1
    I just ran into this issue, after searching without much luck for the confusing error message `'map' object does not support item assignment`. So here it is for the search engines.... – nealmcb Apr 09 '17 at 21:39
  • If it were going to return a List object then they would've called it List. Why would you _not_ expect something called "Map" to return a "Map" object? – TylerH Jan 30 '23 at 15:08
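
The error message quoted in the comments above can be reproduced with a short sketch. The variable names are illustrative only, and the snippet assumes Python 3, where `map` returns an iterator rather than a sequence:

```python
# Illustrative names; assumes Python 3.
squares = map(lambda x: x * x, [1, 2, 3])

try:
    squares[0] = 10  # a map object is an iterator, not a sequence
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Materialise a list when indexing or item assignment is needed.
squares_list = list(map(lambda x: x * x, [1, 2, 3]))
squares_list[0] = 10
print(squares_list)
```

Wrapping the call in `list()` restores the Python 2 behaviour at the cost of building the whole list up front.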

4 Answers

41

I think the reason why map still exists at all when generator expressions also exist is that it can take multiple iterable arguments, which are all looped over in parallel and passed into the function:

>>> list(map(min, [1,2,3,4], [0,10,0,10]))
[0, 2, 0, 4]

That's slightly easier than using zip:

>>> list(min(x, y) for x, y in zip([1,2,3,4], [0,10,0,10]))

Otherwise, it simply doesn't add anything over generator expressions.
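
As a further sketch of the same point, here is a three-argument call; the input lists are made up for illustration. `pow(base, exp, mod)` is used only because it is a built-in that accepts three arguments:

```python
bases = [2, 3, 4]
exponents = [2, 2, 2]
moduli = [5, 5, 5]

# map() takes one item from each iterable per step: pow(b, e, m)
via_map = list(map(pow, bases, exponents, moduli))

# The equivalent spelling with zip() and a comprehension
via_zip = [pow(b, e, m) for b, e, m in zip(bases, exponents, moduli)]

print(via_map)
print(via_zip)
```

Both spellings are lazy until something consumes them; the `map` form just avoids naming the intermediate tuple elements.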

Kinrany
RemcoGerlich
  • 1
    I think that if we add the desire to stress that list comprehensions are more pythonic and the language designers wanted to stress that, this is the most on-spot answer, I think. @vishes_shell somehow does not focus enough on language design. – NoIdeaHowToFixThis Oct 13 '16 at 08:24
  • 2
    Produces different results in Python 2 and 3 *if the two lists are not of equal length*. Try `c = list(map(max, [1,2,3,4], [0,10,0,10, 99]))` in python 2 and in python 3. – cdarke Oct 13 '16 at 08:38
  • 1
    Here is a reference for the original plan to remove map altogether from python3: http://www.artima.com/weblogs/viewpost.jsp?thread=98196 – Bernhard Oct 13 '16 at 10:11
  • Hmm how odd when I wrap map in list, I get a list of 1 element lists. – awiebe Aug 29 '17 at 11:18
  • 1
    this doesn't really address the *why*. Map returns an iterator so that it can be consumed *lazily* – juanpa.arrivillaga Nov 24 '22 at 21:07
27

Because map returns an iterator, it avoids storing the full list in memory. You can iterate over the results later without putting pressure on memory. You may not even need the full list, only the part of it up to the point where your condition is met.
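
A minimal sketch of stopping early, assuming Python 3. The helper `double` and the list `seen` are hypothetical names used only to record how many inputs were actually processed:

```python
from itertools import count

seen = []  # records every input that was actually processed

def double(n):
    """Hypothetical helper that logs its argument before doubling it."""
    seen.append(n)
    return n * 2

# count() is infinite, so an eager, list-building map could never finish here.
for value in map(double, count()):
    if value > 10:
        break

print(value, len(seen))
```

Only the items needed to reach the condition are computed, and none of the earlier results are kept around.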

You may find these docs useful; iterators are awesome.

An object representing a stream of data. Repeated calls to the iterator’s __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again. Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container.
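
The "multiple iteration passes" exception described in the quoted passage can be seen in a small sketch (variable names are illustrative, Python 3 assumed):

```python
numbers = [1, 2, 3]
mapped = map(str, numbers)

first_pass = list(mapped)   # consumes the iterator
second_pass = list(mapped)  # the iterator is already exhausted

print(first_pass)
print(second_pass)

# A list, by contrast, hands out a fresh iterator for every pass.
list_first = list(iter(numbers))
list_second = list(iter(numbers))
print(list_first == list_second)
```

If the results are needed more than once, store them in a list (or call `map` again) rather than reusing the same map object.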

vishes_shell
14

Guido answers this question here: "since creating a list would just be wasteful".

He also says that when map() is invoked for the side effects of the function, the correct transformation is to use a regular for loop.

Converting map() from Python 2 to 3 might not be as simple as wrapping it in list(). Guido also says:

If the input sequences are not of equal length, map() will stop at the termination of the shortest of the sequences. For full compatibility with map() from Python 2.x, also wrap the sequences in itertools.zip_longest(), e.g.

map(func, *sequences)

becomes

list(map(func, itertools.zip_longest(*sequences)))
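
A sketch of the unequal-length case, assuming Python 3. Because `zip_longest` yields tuples, this sketch unpacks them with `itertools.starmap` so that `func` receives separate arguments; `py2_style_map` and `pair` are hypothetical names introduced for illustration and are not part of the quoted text:

```python
from itertools import starmap, zip_longest

def py2_style_map(func, *sequences):
    """Hypothetical helper approximating Python 2's list-returning map()."""
    # zip_longest pads the shorter inputs with None; starmap unpacks each
    # tuple into separate positional arguments for func.
    return list(starmap(func, zip_longest(*sequences)))

def pair(a, b):
    """Hypothetical two-argument function used to show what func receives."""
    return (a, b)

padded = py2_style_map(pair, [1, 2, 3], ["a", "b"])
truncated = list(map(pair, [1, 2, 3], ["a", "b"]))

print(padded)     # padded to the longest input
print(truncated)  # Python 3 map() stops at the shortest input
```
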
mkrieger1
cdarke
  • 3
    Guido's comment is on _`map()` invoked for the side effects of the function_, not on its use as a functor. – abukaj Oct 13 '16 at 10:16
  • 4
    The transformation with `zip_longest` is wrong. you have to use `itertools.starmap` for it to be equivalent: `list(starmap(func, zip_longest(*sequences)))`. That's because `zip_longest` produces tuples, so the `func` would receive a single `n`-uple argument instead of `n` distinct arguments as is the case when calling `map(func, *sequences)`. – Bakuriu Oct 13 '16 at 12:44
12

In Python 3 many functions (not just map but zip, range and others) return an iterator (or, in the case of range, a lazy sequence object) rather than the full list. You might want an iterator (e.g. to avoid holding the whole list in memory) or you might want a list (e.g. to be able to index).

However, I think the key reason for the change in Python 3 is that while it is trivial to convert an iterator to a list using list(some_iterator) the reverse equivalent iter(some_list) does not achieve the desired outcome because the full list has already been built and held in memory.

For example, in Python 3 list(range(n)) works just fine as there is little cost to building the range object and then converting it to a list. However, in Python 2 iter(range(n)) does not save any memory because the full list is constructed by range() before the iterator is built.

Therefore, in Python 2, separate functions are required to create an iterator rather than a list, such as imap for map (although they're not quite equivalent), xrange for range, izip for zip. By contrast Python 3 just requires a single function as a list() call creates the full list if required.
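
The memory argument can be checked with a rough sketch, assuming Python 3. Note that `sys.getsizeof` reports only the container's own footprint, not the integers it refers to, and the exact numbers vary by interpreter build, so no figures are quoted here:

```python
import sys

n = 10**6

lazy_range = range(n)         # cheap to create: stores start, stop and step
full_list = list(lazy_range)  # materialised only because we asked for it

range_size = sys.getsizeof(lazy_range)
list_size = sys.getsizeof(full_list)
print(range_size, list_size)

# Going the other way saves nothing: the list already exists in memory
# before iter() is called on it.
list_iterator = iter(full_list)
print(next(list_iterator))
```
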

Chris_Rands
  • AFAIK in Python 2.7 functions from `itertools` return iterators too. Also, I would not see iterators as lazy lists, since lists can be iterated multiple times and accessed randomly. – abukaj Oct 13 '16 at 12:44
  • @abukaj ok thanks, I've edited my answer to try to be clearer – Chris_Rands Oct 13 '16 at 13:13
  • @IgorRivin what do you mean? Python 3 `map` objects do have a `next()` method. Python 3 `range` range objects are not strictly iterators I know – Chris_Rands Oct 11 '17 at 07:48
  • @Chris_Rands in my Anaconda distribution python 3.6.2, doing `foo = map(lambda x: x, [1, 2, 3])` returns a map object `foo`. doing `foo.next()` comes back with an error: `'map' object has no attribute 'next'` – Igor Rivin Oct 11 '17 at 14:55
  • @IgorRivin You're not using Python 3 syntax, try `next(foo)` or `foo.__next__()` – Chris_Rands Oct 11 '17 at 15:53
  • @Chris_Rands Thanks! I am not sure how foo.__next__() is an improvement over foo.next(), but whatever works... – Igor Rivin Oct 11 '17 at 17:46
  • 1
    @IgorRivin: Methods beginning and ending with `__` are reserved to Python; without that reservation, you have the problem distinguishing things for which `next` is just a method (they're not really iterators) and things that are iterators. In practice, you should skip the methods and just use the `next()` function (e.g. `next(foo)`), which works properly on every Python version from 2.6 on. It's the same way you use `len(foo)` even though `foo.__len__()` would work just fine; the dunder methods are generally intended *not* to be called directly, but implicitly as part of some other operation. – ShadowRanger Sep 17 '18 at 22:54
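
The spellings discussed in the comments above, side by side (a sketch assuming Python 3; `foo` follows the name used in the comments, the other names are illustrative):

```python
foo = map(lambda x: x, [1, 2, 3])

first = next(foo)        # the recommended spelling
second = foo.__next__()  # works, but is meant to be called implicitly
third = next(foo)

# next() also accepts a default, returned instead of raising StopIteration.
fallback = next(foo, None)

# Python 3 iterators have no plain .next() method.
has_next_method = hasattr(foo, "next")

print(first, second, third, fallback, has_next_method)
```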