You asked,
What is the difference between __getitem__ and __iter__?
What is __getitem__()?
Suppose that arr is a list, for example arr = ["A", "B", "C"].
__getitem__ is the method that is executed when you write:
elem = arr[0]
elem = arr[1]
elem = arr[2]
and so on.
Suppose that we have lyst = [0, 111, 222, 333].
In each row of the table below, the three statements are equivalent in behavior:
+---------------------+------------------------------+--------------+
| `elem = lyst[0]`    | `elem = lyst.__getitem__(0)` | `elem = 0`   |
| `elem = lyst[1]`    | `elem = lyst.__getitem__(1)` | `elem = 111` |
| `elem = lyst[2]`    | `elem = lyst.__getitem__(2)` | `elem = 222` |
| `elem = lyst[3]`    | `elem = lyst.__getitem__(3)` | `elem = 333` |
+---------------------+------------------------------+--------------+
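The key handed to __getitem__ is not always an integer. As a rough sketch (the class name KeyEcho is made up for illustration), the class below simply returns whatever key it receives, so you can see what the square brackets pass along:

```python
class KeyEcho:
    """Hypothetical helper: returns the key that the square brackets pass in."""

    def __getitem__(self, key):
        return key


echo = KeyEcho()

print(echo[0])        # the key is an int
print(echo[1:3])      # the key is a slice object
print(echo["a", 2])   # the key is a tuple

lyst = [0, 111, 222, 333]
# Both spellings go through the same method.
print(lyst[2] == lyst.__getitem__(2))
```

This is also why a real container usually has to decide which kinds of keys it wants to support.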
You can define your own __getitem__ method in any class you write. An example of a custom-made class with a __getitem__ method is shown below:
from copy import deepcopy
from functools import singledispatchmethod
class BaseArray:
    pass
class Array(BaseArray):
    def __init__(self, obj: object):
        # Dispatch on the type of `obj` to build the internal list
        self._lyst = self._init_helper(obj)
        # `lvi` == `last valid index`
        self._lvi = len(self._lyst) - 1
    # Note: `self` is left unannotated on purpose. `register` reads the
    # first annotated parameter to decide which type to dispatch on.
    @singledispatchmethod
    def _init_helper(self, obj: object):
        # Fallback: accept any other iterable
        if hasattr(obj, "__iter__"):
            return list(iter(obj))
        raise NotImplementedError()
    @_init_helper.register
    def _(self, arr: BaseArray):
        return deepcopy(arr._lyst)
    @_init_helper.register
    def _(self, lyst: list):
        return deepcopy(lyst)
    def __getitem__(self, index: int):
        if index > self._lvi:
            # `lvi` == `last valid index`
            raise IndexError("index is too large")
        return self._lyst[index]
    def __setitem__(self, index: int, value: object):
        if index > self._lvi:
            # `lvi` == `last valid index`
            raise IndexError("index is too large")
        self._lyst[index] = value
    def __iter__(self):
        raise NotImplementedError()
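Assignment through square brackets is handled by __setitem__, which receives both the index and the new value. A minimal sketch, using a made-up Recorder class that only logs what it is given:

```python
class Recorder:
    """Hypothetical class that records every assignment made with square brackets."""

    def __init__(self):
        self.log = []

    def __setitem__(self, index, value):
        # `obj[index] = value` arrives here as two separate arguments.
        self.log.append((index, value))


rec = Recorder()
rec[0] = "A"              # same as rec.__setitem__(0, "A")
rec.__setitem__(1, "B")   # the explicit spelling
print(rec.log)
```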
You might or might not care about C++, but in C++ the counterpart of __getitem__ is known as operator[].
Many different languages have something like python's __getitem__ method. If you become comfortable with the inner workings of __getitem__, it will help you write code in other programming languages as well.
// This code is written in C++, not python
#include <stdexcept>
int& Array::operator[](int index)
{
    // In C++ `this` is like the `self` parameter in python
    if (index >= this->size) {
        // The `throw` keyword from C++
        // is known as `raise` in python
        throw std::invalid_argument("index is too large");
    }
    return this->ptr[index];
}
What is __iter__()?
Like __getitem__, __iter__() is a method that you define in a class.
__iter__() is usually called for you by for loops.
Suppose that cookie_jar is a list like the following:
["oatmeal 1", "chocolate chip 1", "oatmeal 2"]
The following two pieces of code are syntactically different, but are semantically equivalent:
+------------------------------+-----------------------------------------+
|                              | it = iter(cookie_jar)                   |
| for cookie in cookie_jar:    | while True:                             |
|     print(cookie)            |     try:                                |
|                              |         cookie = next(it)               |
|                              |     except StopIteration:               |
|                              |         break                           |
|                              |     print(cookie)                       |
+------------------------------+-----------------------------------------+
Also, both of the loops shown above do the same thing as the following:
cookie_jar = ["oatmeal 1", "chocolate chip 1", "oatmeal 2"]
it = cookie_jar.__iter__()
while True:
    try:
        cookie = it.__next__()
    except StopIteration:
        break
    print(cookie)
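To see the protocol one step at a time, here is a small sketch that pulls the cookies out by hand and then checks what happens once the iterator is exhausted:

```python
cookie_jar = ["oatmeal 1", "chocolate chip 1", "oatmeal 2"]

it = iter(cookie_jar)
print(type(it).__name__)   # the iterator is a separate object, not the list

# Three calls to `next` consume the three cookies.
eaten = [next(it), next(it), next(it)]

# A fourth call has nothing left to return, so it raises `StopIteration`.
try:
    next(it)
    finished = False
except StopIteration:
    finished = True

print(eaten, finished)

# `next` also accepts a default that is returned instead of raising.
print(next(it, "jar is empty"))
```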
Many python container classes, such as list and tuple, accept any iterable as input.
That is, you can pass anything as input to __init__ provided that the thing has an __iter__ method.
tuppy_the_tuple = ("A", "B", "C")
lizzy_the_list = ["A", "B", "C"]
steve_the_string = "ABC"
chests = [tuppy_the_tuple, lizzy_the_list, steve_the_string]
for chest in chests:
    larry_the_new_list = list(chest)
    print(larry_the_new_list)
# treasure chests are an example of a "container"
# __________
# /\____;;___\
# | / /
# `. ())oo() .
# |\( ()*^^()^\
# | |---------|
# \ | )) |
# \|_________|
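The same holds for classes you write yourself. As a sketch (Countdown is a made-up class, not something from the standard library), any object with an __iter__ method can be handed to list, tuple, sum, and friends:

```python
class Countdown:
    """Hypothetical iterable that counts down from `start` to 1."""

    def __init__(self, start):
        self._start = start

    def __iter__(self):
        # Borrow the iterator of a `range` instead of writing our own.
        return iter(range(self._start, 0, -1))


as_list = list(Countdown(3))
as_tuple = tuple(Countdown(3))
total = sum(Countdown(3))
print(as_list, as_tuple, total)
```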
An example of how to define your own __iter__ method is shown below:
class BaseCookieJar:
    pass
class CookieJar(BaseCookieJar):
    class CookieIterator:
        def __init__(self, cookie_jar: BaseCookieJar):
            self._cookie_jar = cookie_jar
            self._index = 0
        def __iter__(self):
            return self
        def __next__(self):
            if self._index >= len(self._cookie_jar):
                raise StopIteration()
            cookie = self._cookie_jar[self._index]
            self._index += 1
            return cookie
    # End of Iterator class
    # Resume `CookieJar` class
    def __init__(self, *ignore_me, **ignore_me_too):
        self._lyst = [0, 1, 2, 3, 4]
    def __iter__(self):
        return type(self).CookieIterator(self)
    def __len__(self):
        return len(self._lyst)
    def __getitem__(self, index: int):
        # Needed by `CookieIterator.__next__`, which indexes into the jar
        return self._lyst[index]
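If you do not need a separate iterator class, a generator function is often a shorter way to write __iter__. The sketch below is a variation on the cookie jar, with the assumption that the constructor takes the cookies as an argument:

```python
class CookieJar:
    def __init__(self, cookies):
        # Assumption for this sketch: the cookies are passed in by the caller.
        self._lyst = list(cookies)

    def __iter__(self):
        # Because of `yield`, each call returns a fresh generator (an iterator).
        for cookie in self._lyst:
            yield cookie

    def __len__(self):
        return len(self._lyst)


jar = CookieJar(["oatmeal 1", "chocolate chip 1", "oatmeal 2"])
first_pass = list(jar)
second_pass = list(jar)   # iterating again starts over from the beginning
print(first_pass == second_pass)
```

Python keeps track of the position inside the generator for you, so there is no index to maintain and no StopIteration to raise by hand.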
What happens if you define your own __getitem__() method but do not define a __iter__() method?
Suppose that you write a class with a __getitem__() method, but no __iter__() method.
Your class will not get an __iter__() method of its own. Instead, iter() falls back to the older sequence protocol: it builds an iterator that calls __getitem__() with 0, 1, 2, ... until IndexError is raised. We can emulate that default iterator with the following analog:
IteratorClass = type(
    iter(
        type(
            "TempClass",
            (object,),
            {
                "__getitem__": lambda *args: None,
            }
        )()
    )
)
class ClassChest:
    """
    Something like this happens when you have
        __getitem__()
    but you do not have...
        __iter__()
    """
    # class ChestIterator(IteratorClass):
    # We are not allowed to subclass `iterator`
    # TypeError: type 'iterator' is not an acceptable base type
    class ChestIterator:
        def __init__(self, chest):
            self._chest = chest
            self._idx = 0
        def __next__(self):
            idx = self._idx
            try:
                # same as: gold_coin = self._chest.__getitem__(idx)
                gold_coin = self._chest[idx]
            except IndexError:
                # The default iterator turns `IndexError` into `StopIteration`
                raise StopIteration from None
            self._idx = 1 + self._idx
            return gold_coin
        def __iter__(self):
            return self
    # End of Iterator
    # Resume Class Chest class
    def __iter__(self):
        return type(self).ChestIterator(self)
    def __getitem__(self, idx: int):
        if idx > 4:
            raise IndexError
        return idx
instance_chest = ClassChest()
for shiny_object in instance_chest:
    print("mi treazure is == ", shiny_object)
# The `for` loop above is equivalent to the `while` loop below.
iterator = iter(instance_chest)  # same as: instance_chest.__iter__()
while True:
    try:
        shiny_object = next(iterator)  # same as: iterator.__next__()
    except StopIteration:
        break
    print("mi treazure is == ", shiny_object)
Each of the two loops prints the following to the console:
mi treazure is == 0
mi treazure is == 1
mi treazure is == 2
mi treazure is == 3
mi treazure is == 4
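You can check the fallback directly. In the sketch below (OnlyGetItem is a made-up class), the object has no __iter__ attribute at all, yet iter() still hands back a working iterator built on top of __getitem__:

```python
class OnlyGetItem:
    """Hypothetical class with `__getitem__` and nothing else."""

    def __getitem__(self, idx):
        if idx > 2:
            raise IndexError
        return idx * 10


obj = OnlyGetItem()

has_iter = hasattr(obj, "__iter__")    # no `__iter__` was defined or inherited
it = iter(obj)                         # still works, via `__getitem__`
iterator_type_name = type(it).__name__
items = list(it)

print(has_iter, iterator_type_name, items)
```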
Some notes about your code
You wrote,
class foo:
    def __getitem__(self, *args):
        print(*args)
bar = foo()
for i in bar:
    print(i)
In your code, None is printed because __getitem__ has no return statement, so it returns None.
Both passages of code in the following table are equivalent:
+-------------------------------+-------------------------------+
| no `return` statement         | `return None`                 |
+-------------------------------+-------------------------------+
| def __getitem__(self, *args): | def __getitem__(self, *args): |
|     print(*args)              |     print(*args)              |
|                               |     return None               |
+-------------------------------+-------------------------------+
Let us modify your code a bit:
class Klass:
    def __getitem__(self, idx:int):
        print("Inside __getitem__ `idx` is ", idx)
        return idx
        # WARNING: iter(self).__next__() can cause
        # infinite loops if we do not do one of the following:
        # * raise `StopIteration`
        # * raise `IndexError`
obj = Klass()
for item in obj:
    print("Inside of the `for` loop `item` is:", item)
We will have an infinite loop:
[...]
Inside of the `for` loop `item` is: 41875
Inside __getitem__ `idx` is 41876
Inside of the `for` loop `item` is: 41876
Inside __getitem__ `idx` is 41877
Inside of the `for` loop `item` is: 41877
Inside __getitem__ `idx` is 41878
Inside of the `for` loop `item` is: 41878
We can stop the looping by raising a StopIteration exception (raising IndexError works as well).
class Klass:
    def __getitem__(self, idx:int):
        if idx > 3:
            raise StopIteration()
        print("Inside __getitem__ `idx` is ", idx)
        return idx
obj = Klass()
for item in obj:
    print("Inside of the `for` loop `item` is:", item)
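Raising IndexError is the more conventional signal to send from __getitem__, because it also gives direct indexing such as obj[99] a sensible error. A minimal variant of Klass along those lines (a sketch, not the only way to do it):

```python
class Klass:
    def __getitem__(self, idx: int):
        if idx > 3:
            # Ends iteration and also reports a bad index to direct callers.
            raise IndexError(idx)
        return idx


obj = Klass()

items = list(obj)          # iteration stops at the first `IndexError`
also_indexable = obj[2]    # ordinary indexing still works

try:
    obj[99]
    raised = False
except IndexError:
    raised = True

print(items, also_indexable, raised)
```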