I am learning about self-defined iterators and am confused by the example below. My understanding of __iter__ and __next__ so far is:

__iter__ generates an iterator, and __next__ goes through this iterator object, returning its contained values one by one. To me, this means __next__ needs an iterator object to work with. However, consider the following code.

The CircleIterator class does not have __iter__(self). To me, this means the class does not create an iterator, so what is __next__() doing here by itself? There is no iterator object for __next__() to use.

The Circle class has an __iter__ method, which returns an instance of the CircleIterator class. My understanding from reading is that __iter__ needs to return an iterator, but we just said that the CircleIterator class does not generate an iterator. So why do we even put it here?

I also tried running the CircleIterator class alone, with things like CircleIterator('abc', 3).__next__(). This gives me nothing.

class CircleIterator():
    def __init__(self, data, max_times):
        self.data = data
        self.max_times = max_times
        self.index = 0

    def __next__(self):
        if self.index >= self.max_times:
            raise StopIteration
        value = self.data[self.index % len(self.data)]
        self.index += 1
        return value


class Circle():
    def __init__(self, data, max_times):
        self.data = data
        self.max_times = max_times

    def __iter__(self):
        return CircleIterator(self.data, self.max_times)

c = Circle('abc', 5)
print(list(c))
Ted Klein Bergman
MeiNan Zhu
  • An object is considered an iterator if it implements the `__next__()` method. The reason `Circle` returns a new `CircleIterator` instance is to support multiple active iterators. – Henry Tjhia Dec 23 '20 at 17:42
  • @HenryTjhia so it's not `__iter__()` that defines an iterator but `__next__()`? – MeiNan Zhu Dec 23 '20 at 17:51
  • `CircleIterator` is broken: according to the protocol spec, it **must** implement `__iter__`, and that `__iter__` **must** return `self`. – juanpa.arrivillaga Dec 23 '20 at 18:22
  • @juanpa.arrivillaga the code functions; you can simply copy, paste, and run it, and you will see the result. – MeiNan Zhu Dec 23 '20 at 18:25
  • @MeiNanZhu yes, it functions, and yet it does not follow the specification, so it should be considered broken. Try doing: `iter(iter(Circle("abc", 2)))`. – juanpa.arrivillaga Dec 23 '20 at 18:31
  • See https://docs.python.org/3/library/stdtypes.html#iterator-types. – chepner Dec 23 '20 at 18:35
  • I have removed my answer because it contradicts the specification, but I feel this is a grey area. It is *technically* perfectly fine for an iterator not to be an iterable. A quick test revealed no cases where an iterator is *actually* required to be an iterable. It would be nice if an answer could cite cases in which an iterator actually has to be an iterable. – MisterMiyagi Dec 23 '20 at 18:39
  • @MisterMiyagi if iterators don't support `__iter__`, then they cannot be used in `for` statements, for example. – juanpa.arrivillaga Dec 23 '20 at 18:42
  • @MisterMiyagi Just because the compiler doesn't (or isn't able to) check for an appropriate `__iter__` doesn't mean you can simply omit it. The documentation clearly states that `__iter__` must be defined and return the object itself. – chepner Dec 23 '20 at 18:43
  • @juanpa.arrivillaga The `for` statement is defined to work on iterables, not iterators. – MisterMiyagi Dec 23 '20 at 18:46
  • @MisterMiyagi *all iterators are supposed to be iterable*. That is what the specification is telling you. – juanpa.arrivillaga Dec 23 '20 at 18:49
  • @juanpa.arrivillaga Can we please assume for a moment that I have read the spec and am not just being contrarian for lack of understanding it, but am raising practical considerations? There are *a lot* of things that are technically against the spec, or even completely undefined, yet are perfectly accepted in use. Citing a practical case where an iterator, not an iterable, is required and *must* be an iterable would be useful to clear up that grey area. – MisterMiyagi Dec 23 '20 at 18:52
  • @MisterMiyagi how about all the library functions which assume that `iter(iterator) is iterator`? The whole point of itertools is to allow you to work with iterators, and yet, if you don't fulfill that part of the spec, many of those functions won't work. – juanpa.arrivillaga Dec 23 '20 at 18:55

1 Answer

Iterators must implement __iter__, and that __iter__ must return self. This is part of the specification, the reasoning being:

This is required to allow both containers and iterators to be used with the for and in statements.
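
As a minimal sketch of what that means for the `for` statement itself (the `NextOnly` class and the `try_for_loop` helper are hypothetical names made up for this illustration, not part of the question's code): `for` calls `iter()` on whatever it is handed, so an object that only defines `__next__` can be stepped manually with `next()`, but a loop should reject it.

```python
class NextOnly:
    """Hypothetical class: defines __next__ but, like CircleIterator, no __iter__."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0

    def __next__(self):
        if self.count >= self.limit:
            raise StopIteration
        self.count += 1
        return self.count


def try_for_loop(obj):
    """Return the items a for loop collects, or the TypeError message if it fails."""
    collected = []
    try:
        for item in obj:  # the for statement calls iter(obj) first
            collected.append(item)
    except TypeError as exc:
        return str(exc)
    return collected


# Calling next() directly works, because only __next__ is needed for that.
manual_result = next(NextOnly(3))

# Looping fails, because iter() finds no __iter__ on the object.
for_result = try_for_loop(NextOnly(3))

print(manual_result)
print(for_result)
```

The question's code only gets away with this because `list(c)` calls `iter()` on the `Circle` container, never on the `CircleIterator` itself.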

Note, all sorts of library code rests on this assumption. You wouldn't be able to use most of itertools, for example, with your iterator.

In [1]: class BrokenIterator:
   ...:     def __next__(self):
   ...:         return 1
   ...:

In [2]: it = BrokenIterator()

In [3]: import itertools

In [4]: list(itertools.islice(it, 2))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-2f395782184e> in <module>
----> 1 list(itertools.islice(it, 2))

TypeError: 'BrokenIterator' object is not iterable

Versus a non-broken implementation:

In [5]: class Iterator:
   ...:     def __next__(self):
   ...:         return 1
   ...:     def __iter__(self):
   ...:         return self
   ...:

In [6]: it = Iterator()

In [7]: list(itertools.islice(it, 2))
Out[7]: [1, 1]

Of course, there's not much stopping you from using this implementation. Python very often doesn't strictly enforce its documented protocols, for better or for worse.
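
For the classes in the question, one possible fix is sketched below. It assumes the only change needed is adding `__iter__` to `CircleIterator`; everything else is copied from the question. The variable names `first` and `second` are just for illustration.

```python
import itertools


class CircleIterator:
    def __init__(self, data, max_times):
        self.data = data
        self.max_times = max_times
        self.index = 0

    def __iter__(self):
        # The added method: an iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.index >= self.max_times:
            raise StopIteration
        value = self.data[self.index % len(self.data)]
        self.index += 1
        return value


class Circle:
    def __init__(self, data, max_times):
        self.data = data
        self.max_times = max_times

    def __iter__(self):
        # Unchanged: the container hands out a fresh iterator on every call.
        return CircleIterator(self.data, self.max_times)


circle = Circle('abc', 5)

# Two iterators over the same Circle keep independent positions.
first = iter(circle)
second = iter(circle)
first_values = [next(first), next(first)]
second_value = next(second)

# The iterator is now itself iterable, so library code accepts it.
rest_of_first = list(itertools.islice(first, 2))

print(first_values, second_value, rest_of_first)
print(list(circle))
```

Note that `Circle.__iter__` still returns a new object each time, so adding `__iter__` to `CircleIterator` does not get in the way of having several active iterators at once.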

juanpa.arrivillaga
  • While this is true, I would like to say that in the OP's case, the two classes are complementary. To support multiple active iterators, `Circle`'s `__iter__` needs to create a new stateful object for the iterator instead of returning self on each iterator request. The `Circle` instance is the object you are asking for an iterator, not `CircleIterator`. The latter is defined just to hold state data independent of `Circle`. – Henry Tjhia Dec 23 '20 at 18:57
  • @HenryTjhia `Circle` isn't supposed to return `self` from `__iter__`; `CircleIterator` *should*, and that wouldn't prevent `Circle` from supporting two active iterators at all. – juanpa.arrivillaga Dec 23 '20 at 18:59