In Python 3, it is standard practice to make a class both an iterable and an iterator by defining the __iter__ and __next__ methods. But I have trouble wrapping my head around this. Take this example, which creates an iterator that produces only even numbers:

class EvenNumbers:
    
    def __init__(self, max_):
        self.max_ = max_

    def __iter__(self):
        self.n = 0
        return self

    def __next__(self):
        if self.n <= self.max_:  # max_ bounds the counter n, not the value returned
            result = 2 * self.n
            self.n += 1
            return result

        raise StopIteration

instance = EvenNumbers(4)

for entry in instance:
    print(entry)

To my knowledge (correct me if I'm wrong), when I create the loop, an iterator is created by calling something like itr = iter(instance) which internally calls the __iter__ method. This is expected to return an iterator object (which the instance is due to defining __next__ and therefore I can just return self). To get an element from it, next(itr) is called until the exception is raised.
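
The loop described above can be written out by hand. This is only a rough sketch of what `for entry in instance` does, not the interpreter's actual implementation. It reuses the `EvenNumbers` class from the question; the name `collected` is introduced purely for illustration.

```python
class EvenNumbers:
    def __init__(self, max_):
        self.max_ = max_

    def __iter__(self):
        self.n = 0
        return self

    def __next__(self):
        if self.n <= self.max_:
            result = 2 * self.n
            self.n += 1
            return result
        raise StopIteration


instance = EvenNumbers(4)

# Roughly what the for loop does behind the scenes.
itr = iter(instance)          # calls instance.__iter__()
collected = []                # illustrative name, not part of the question
while True:
    try:
        entry = next(itr)     # calls itr.__next__()
    except StopIteration:
        break                 # the for loop swallows this and simply ends
    collected.append(entry)

print(collected)       # [0, 2, 4, 6, 8]
print(itr is instance) # True, because __iter__ returned self
```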

My question here is now: if and how can __iter__ and __next__ be separated, so that the content of the latter function is defined somewhere else? And when could this be useful? I know that I have to change __iter__ so that it returns an iterator.

Btw the idea to do this comes from this site (LINK), which does not state how to implement this.

Martí
DocDriven
  • 2
    Even when you separate them, the one that implements `__next__` also _has_ to implement `__iter__` (returning itself). – L3viathan Aug 28 '18 at 10:59

2 Answers


It sounds like you're confusing iterators and iterables. Iterables have an __iter__ method which returns an iterator. Iterators have a __next__ method which either returns their next value or raises StopIteration. In Python, every iterator is also an iterable (but not vice versa), and iter(iterator) is iterator, so an iterator, itr, should return only itself from its __iter__ method.

From the documentation: "Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted."

In code:

class MyIter:
   def __iter__(self):
       return self

   def __next__(self):
       # actual iterator logic goes here: return the next value,
       # or raise StopIteration when exhausted
       ...
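
The identity iter(iterator) is iterator can be checked on a built-in type. The following is a small sketch using a plain list; the variable names are illustrative only.

```python
numbers = [1, 2, 3]   # a list is an iterable, but not an iterator
it = iter(numbers)    # a list iterator

# An iterator hands back itself when asked for an iterator.
same = iter(it) is it

# An iterable that is not an iterator hands out a new iterator each time.
fresh_each_time = iter(numbers) is not iter(numbers)

# Only the iterator has __next__; both have __iter__.
list_has_next = hasattr(numbers, "__next__")
iterator_has_next = hasattr(it, "__next__")

print(same, fresh_each_time, list_has_next, iterator_has_next)
# True True False True
```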

If you want to make a custom iterator class, the easiest way is to inherit from collections.abc.Iterator, which defines __iter__ as above (it is also a subclass of collections.abc.Iterable). Then all you need is:

import collections.abc

class MyIter(collections.abc.Iterator):
    def __next__(self):
        ...
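
To make that concrete, here is a minimal sketch of a subclass that only supplies __next__. The `Countdown` class and its `start` parameter are hypothetical names chosen for the example; the inherited __iter__ comes from collections.abc.Iterator.

```python
import collections.abc


class Countdown(collections.abc.Iterator):
    """Hypothetical example: yields start, start - 1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
inherits_iter = iter(countdown) is countdown  # __iter__ is inherited
first_pass = list(countdown)
second_pass = list(countdown)                 # already exhausted

print(inherits_iter)  # True
print(first_pass)     # [3, 2, 1]
print(second_pass)    # []
```

The empty second pass shows the usual trade-off: an iterator is consumed once, which is one reason to keep the iterable and the iterator as separate objects.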

There is of course a much easier way to make an iterator, and that's with a generator function:

def fib():
    a = 1
    b = 1
    yield a
    yield b
    while True:
        b, a = a + b, b
        yield b

import itertools

list(itertools.takewhile(lambda x: x < 100, fib()))
# --> [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
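
A generator can also be one way to separate __iter__ from __next__ in the question's class. This is a sketch under the assumption that `max_` keeps its meaning from the question (an upper bound on the counter, not on the value): __iter__ is written as a generator method, so each call returns a new generator object that supplies __next__.

```python
class EvenNumbers:
    def __init__(self, max_):
        self.max_ = max_

    def __iter__(self):
        # Generator method: every call creates a fresh generator,
        # which is itself an iterator with its own position.
        for n in range(self.max_ + 1):
            yield 2 * n


instance = EvenNumbers(4)
first = list(instance)
second = list(instance)   # works again, the instance holds no loop state

print(first)   # [0, 2, 4, 6, 8]
print(second)  # [0, 2, 4, 6, 8]
```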

Just for reference, this is (simplified) code for an abstract iterator and iterable:

from abc import ABC, abstractmethod

class Iterable(ABC):
    @abstractmethod
    def __iter__(self):
        'Returns an instance of Iterator'
        pass

class Iterator(Iterable, ABC):
    @abstractmethod
    def __next__(self):
        'Return the next item from the iterator. When exhausted, raise StopIteration'
        pass

    # overrides Iterable.__iter__
    def __iter__(self):
        return self

    
FHTMitchell
  • Thank you, but I need some clarification: if I define `__iter__` in a class, I tell the interpreter that it is an iterable. When I return self in this method, then I return the instance and not an iterator object, right? But according to the docs this should be an iterator object and this confuses me. – DocDriven Aug 28 '18 at 11:29
  • 3
    If you define `__iter__` then the object is **iterable**. If you define `__next__` the object is an **iterator**. On **iterator** objects, you should set `__iter__` to return the object itself, which as I said, is an **iterator**. **iterators** are **iterables** that when iterated over (e.g. for loop), return themselves. – FHTMitchell Aug 28 '18 at 11:53

I think I have grasped the concept now, even if I do not fully understand the passage from the documentation by @FHTMitchell. I came across an example on how to separate the two methods and wanted to document this.

What I found is a very basic tutorial that clearly distinguishes between the iterable and the iterator (which is the cause of my confusion).

Basically, you define your iterable first as a separate class:

class EvenNumbers:

    def __init__(self, max_):
        self.max = max_

    def __iter__(self):
        self.n = 0
        return EvenNumbersIterator(self)

The __iter__ method only requires an object that has a __next__ method defined. Therefore, you can do this:

class EvenNumbersIterator:

    def __init__(self, source):
        self.source = source       

    def __next__(self):
        if self.source.n <= self.source.max:
            result = 2 * self.source.n
            self.source.n += 1
            return result
        else:
            raise StopIteration

This separates the iterator part from the iterable class. It now makes sense that if I define __next__ within the iterable class, I have to return the reference to the instance itself as it basically does 2 jobs at once.
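
A variation on the split above, offered as a sketch rather than the definitive form: here the iterator keeps its own counter instead of mutating the iterable, and it defines __iter__ returning itself, as the iterator protocol expects. The attribute name `max_` and the nested-loop check are assumptions made for the example.

```python
class EvenNumbers:
    """Iterable: holds the configuration, no loop state."""

    def __init__(self, max_):
        self.max_ = max_

    def __iter__(self):
        # A new, independent iterator for every loop.
        return EvenNumbersIterator(self.max_)


class EvenNumbersIterator:
    """Iterator: holds the position of one traversal."""

    def __init__(self, max_):
        self.max_ = max_
        self.n = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.n > self.max_:
            raise StopIteration
        result = 2 * self.n
        self.n += 1
        return result


evens = EvenNumbers(1)
# Nested loops over the same iterable do not interfere with each other.
pairs = [(a, b) for a in evens for b in evens]
print(pairs)  # [(0, 0), (0, 2), (2, 0), (2, 2)]
```

This is where the separation pays off: with the counter stored on the iterable, the inner loop would reset and exhaust the state the outer loop depends on.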

DocDriven
  • 2
    Your iterator isn't a valid iterator (or at least breaks conventions and could lead to confusing errors) since iterators must also be iterable as discussed in my answer. You either need to inherit from `collections.abc.Iterator` or define `def __iter__(self): return self`. – FHTMitchell Aug 29 '18 at 09:46