This is more of a conceptual question. I recently saw a piece of code in Python (it worked in 2.7, and it might also have been run in 2.5 as well) in which a for loop used the same name for both the list that was being iterated over and the item in the list, which strikes me as both bad practice and something that should not work at all.

For example:

x = [1,2,3,4,5]
for x in x:
    print x
print x

Yields:

1
2
3
4
5
5

Now, it makes sense to me that the last value printed would be the last value assigned to x from the loop, but I fail to understand why you'd be able to use the same variable name for both your parts of the for loop and have it function as intended. Are they in different scopes? What's going on under the hood that allows something like this to work?

nico
Gustav
  • As an interesting thought experiment: define a function printAndReturn that takes an argument, prints it, and returns it. Then in `for i in printAndReturn [1,2,3,4,5] …`, how many times should `[1,2,3,4,5]` be printed? – Joshua Taylor Jul 11 '14 at 17:13
  • A note on scope, since no one else directly mentioned it: Python has function-level scoping, but nothing like C's block-level scoping. So the inside and the outside of the `for` loop have the same scope. – Izkata Jul 11 '14 at 22:59
  • I corrected the title of the question, as it was a bit misleading. Just because something is bad practice does not mean it does not work. It may just be that it is more prone to error, or difficult to read/maintain, etc. – nico Jul 12 '14 at 10:13
  • Thank you. I completely agree that it was a bad title, I just didn't know what to name it initially. – Gustav Jul 15 '14 at 01:02
  • this works in php also `for ($x as $x)` but is ugly code IMO – chiliNUT Mar 30 '16 at 04:58

6 Answers

68

What does dis tell us:

Python 3.4.1 (default, May 19 2014, 13:10:29)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from dis import dis
>>> dis("""x = [1,2,3,4,5]
... for x in x:
...     print(x)
... print(x)""")

  1           0 LOAD_CONST               0 (1)
              3 LOAD_CONST               1 (2)
              6 LOAD_CONST               2 (3)
              9 LOAD_CONST               3 (4)
             12 LOAD_CONST               4 (5)
             15 BUILD_LIST               5
             18 STORE_NAME               0 (x)

  2          21 SETUP_LOOP              24 (to 48)
             24 LOAD_NAME                0 (x)
             27 GET_ITER
        >>   28 FOR_ITER                16 (to 47)
             31 STORE_NAME               0 (x)

  3          34 LOAD_NAME                1 (print)
             37 LOAD_NAME                0 (x)
             40 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             43 POP_TOP
             44 JUMP_ABSOLUTE           28
        >>   47 POP_BLOCK

  4     >>   48 LOAD_NAME                1 (print)
             51 LOAD_NAME                0 (x)
             54 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             57 POP_TOP
             58 LOAD_CONST               5 (None)
             61 RETURN_VALUE

The key bits are sections 2 and 3 - we load the value out of x (24 LOAD_NAME 0 (x)) and then we get its iterator (27 GET_ITER) and start iterating over it (28 FOR_ITER). Python never goes back to load the iterator again.

Aside: It wouldn't make any sense to do so, since it already has the iterator, and, as Abhijit points out in his answer, Section 7.3 of Python's specification actually requires this behavior.

When the name x gets overwritten to point at each value inside of the list formerly known as x, Python doesn't have any problems finding the iterator, because it never needs to look at the name x again to finish the iteration protocol.
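As a rough illustration of that point (written in Python 3 syntax; the names `original` and `seen` are introduced here purely for the example), keeping a second reference to the list shows that the loop rebinds only the name, while the list object the iterator was created from is left alone:

```python
x = [1, 2, 3, 4, 5]
original = x          # second name for the same list object (illustrative)

seen = []
for x in x:           # the name x is looked up once, before the first iteration
    seen.append(x)    # from here on, x is bound to the current item

print(seen)           # every element was visited
print(x)              # x is now bound to the last item
print(original)       # the list object itself is unchanged
```

The iterator holds its own reference to the list, so it does not matter what the name `x` points to while the loop runs.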

Sean Vieira
  • "Python never goes back to load the iterator again (it wouldn't make any sense to do so, since it already has the iterator)." This describes the behavior that you observe in the disassembly, but it doesn't say whether that *has* to be the case or not; [Abhijit's answer](http://stackoverflow.com/a/24690950/1281433) cites the manual where this is actually specified. – Joshua Taylor Jul 11 '14 at 17:16
43

Using your example code as the core reference

x = [1,2,3,4,5]
for x in x:
    print x
print x

I would like to refer you to section 7.3, "The for statement", in the manual.

Excerpt 1

The expression list is evaluated once; it should yield an iterable object. An iterator is created for the result of the expression_list.

What this means is that your variable x, which is a symbolic name for the list object [1,2,3,4,5], is evaluated once to produce an iterable object. Even if the variable (the symbolic reference) later changes its allegiance, the expression list is not evaluated again, so there is no impact on the iterator that has already been created.

Note

  • Everything in Python is an object, with an identity, attributes and methods.
  • A variable is a symbolic name: a reference to one and only one object at any given instant.
  • A variable can change its allegiance at run time, i.e. it can come to refer to some other object.

Excerpt 2

The suite is then executed once for each item provided by the iterator, in the order of ascending indices.

Here the suite is driven by the iterator, not by the expression list. For each iteration, the iterator is asked for the next item; the original expression list is not consulted again.
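One way to see "evaluated once" in practice is to wrap the expression list in a function that records each call. The sketch below uses Python 3 syntax; `print_and_return`, `calls` and `items` are hypothetical names made up for this illustration:

```python
calls = []

def print_and_return(value):
    """Record the call, print the argument, and hand it back unchanged."""
    calls.append(value)
    print("evaluated:", value)
    return value

x = [1, 2, 3]
items = []
for x in print_and_return(x):   # the expression list is evaluated here, once
    items.append(x)

print(len(calls))   # number of times the expression list was evaluated
print(items)
```

The function is called a single time even though the loop body runs three times, which is the behavior the manual excerpt describes.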

Abhijit
5

It is necessary for it to work this way, if you think about it. The expression for the sequence of a for loop could be anything:

binaryfile = open("file", "rb")
for byte in binaryfile.read(5):
    ...

We can't query the sequence on each pass through the loop, or here we'd end up reading from the next batch of 5 bytes the second time. Naturally Python must in some way store the result of the expression privately before the loop begins.
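A small sketch of the same idea that can be run without a file on disk, assuming Python 3 and using an in-memory `io.BytesIO` stream as a stand-in for the opened file (the ten-byte payload is made up for the example). Note that iterating over a `bytes` object in Python 3 yields integers:

```python
import io

# In-memory stand-in for open("file", "rb"); the contents are illustrative.
binaryfile = io.BytesIO(b"abcdefghij")

collected = []
for byte in binaryfile.read(5):   # read(5) is called once, before the loop starts
    collected.append(byte)

remaining = binaryfile.read()     # whatever the single read(5) did not consume

print(collected)
print(remaining)
```

If `read(5)` were re-evaluated on every pass, the second half of the stream would have been consumed by the loop; instead it is still available afterwards.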


Are they in different scopes?

No. To confirm this you could keep a reference to the original scope dictionary (locals()) and notice that you are in fact using the same variables inside the loop:

x = [1,2,3,4,5]
loc = locals()
for x in x:
    print locals() is loc  # True
    print loc["x"]  # 1
    break

What's going on under the hood that allows something like this to work?

Sean Vieira showed exactly what is going on under the hood, but to describe it in more readable python code, your for loop is essentially equivalent to this while loop:

it = iter(x)
while True:
    try:
        x = it.next()
    except StopIteration:
        break
    print x

This is different from the traditional indexing approach to iteration you would see in older versions of Java, for example:

for (int index = 0; index < x.length; index++) {
    x = x[index];
    ...
 }

This approach would fail when the item variable and the sequence variable are the same, because the sequence x would no longer be available to look up the next index after the first time x was reassigned to the first item.

With the former approach, however, the first line (it = iter(x)) requests an iterator object which is what is actually responsible for providing the next item from then on. The sequence that x originally pointed to no longer needs to be accessed directly.
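The while-loop form shown earlier uses Python 2's `it.next()` method. In Python 3 the same idea is spelled with the built-in `next()`. A minimal sketch, assuming Python 3 (the list `printed` is only there to collect the values for inspection):

```python
x = [1, 2, 3, 4, 5]

it = iter(x)          # the only place the list is reached through the name x
printed = []
while True:
    try:
        x = next(it)  # rebinding x does not disturb the iterator
    except StopIteration:
        break
    printed.append(x)

print(printed)
print(x)              # the last item assigned before StopIteration
```

This is an approximation of what the `for` statement does, not a description of the interpreter's actual implementation.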

nmclean
4

It's the difference between a variable (x) and the object it points to (the list). When the for loop starts, Python grabs an internal reference to the object pointed to by x. It uses the object and not what x happens to reference at any given time.

If you reassign x, the for loop doesn't change. If x points to a mutable object (e.g., a list) and you change that object (e.g., delete an element), the results can be unpredictable.
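To make the contrast concrete, here is a short sketch of the mutation case, assuming CPython's list iterator, which walks the list by index (`nums` and `visited` are illustrative names). Removing items while iterating causes some elements to be skipped:

```python
nums = [1, 2, 3, 4, 5, 6]
visited = []

for n in nums:
    visited.append(n)
    if n % 2 == 0:
        nums.remove(n)   # mutates the very list being iterated over

print(visited)   # 3 and 5 are never visited
print(nums)
```

Each removal shifts the later elements one position to the left, so the element that slides into the current index is passed over on the next step. Rebinding the loop name, by comparison, has no such effect.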

tdelaney
3

Basically, the for loop takes in the list x, stores it as a temporary variable, and then reassigns x to each value in that temporary variable in turn. Thus, after the loop, x is the last value in the list.

>>> x = [1, 2, 3]
>>> [x for x in x]
[1, 2, 3]
>>> x
3
>>> 

Just like in this:

>>> def foo(bar):
...     return bar
... 
>>> x = [1, 2, 3]
>>> for x in foo(x):
...     print x
... 
1
2
3
>>> 

In this example, the list is passed into foo() as bar and returned, so although x is being reassigned, the list object itself still exists and the for loop can iterate over it.
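The list-comprehension session above reflects Python 2, where a list comprehension leaks its loop variable into the enclosing scope. In Python 3 a comprehension has its own scope, whereas a plain `for` statement still rebinds the name. A small sketch, assuming Python 3 (`result` and `x_after_comprehension` are names added for the example):

```python
x = [1, 2, 3]

result = [x for x in x]        # the comprehension's x is local to the comprehension
x_after_comprehension = x      # still the list in Python 3

for x in x:                    # a for statement rebinds the outer name
    pass

print(result)
print(x_after_comprehension)
print(x)                       # the last item of the list
```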

ZenOfPython
  • Actually, in the last example, I don't think `x` is being reassigned. A local variable `bar` is created in `foo` and assigned the value of `x`. `foo` then returns that value in the form of an object that is used in the `for` condition. Thus, the variable `x` never was reassigned in the second example. I agree with the first one though. – Tonio Jul 11 '14 at 04:20
  • @Tonio `x` is still the iteration variable though and thus takes a new value for each loop. After the loop, `x` is equal to `3` in both cases. – Peter Gibson Jul 11 '14 at 04:23
  • @PeterGibson You are absolutely right, it slipped past my attention. – Tonio Jul 11 '14 at 04:25
  • If it were "a new variable" inside the loop, then how come after the loop `x` holds `3` and not `[1,2,3]`? – Joshua Taylor Jul 11 '14 at 17:17
  • @JoshuaTaylor In python the loop index variable is lexically scoped to the block in which the for loop occurred. – HennyH Jul 13 '14 at 00:31
  • @JoshuaTaylor each time the `for` loop finishes, `x` is assigned to the **next** value in the list, not the entire list again. As you can see in the output of the `for` loop, at the end, `print x` yields `3`. – ZenOfPython Jul 13 '14 at 00:39
  • @HennyH and ZenOfPython I understand the scoping of the variable in the loop. My point is that this answer says that the elements of the list are stored in "a new variable". If the `for` loop introduced a new variable, then the value of `x` after the loop would still be the list. The elements of the list are assigned to the *same* variable whose value was previously the list. There's no new variable; there's just one variable. Its value is initially [1,2,3], then 1, then 2, then 3. There's no *new* variable. – Joshua Taylor Jul 13 '14 at 02:00
  • @JoshuaTaylor think of the scope like a dictionary: at first the key `x` is associated with the list `[1,2,3]`, then when the for loop is executed, it performs assignments like `x = next(_x)` (where `_x` is a new reference created by the loop to the *same* list `x` referred to prior to the first iteration of the loop). Now the dictionary is `x = 1`, and this process happens again and again till `_x` is exhausted. The `x` identifier is re-used as the loop index. – HennyH Jul 13 '14 at 23:37
  • @HennyH Yes, I understand how the scoping here works. My point is that the *answer* is mistaken where it says that "the for loop takes in the list x, and then, … **assigns a new variable x** to each value". As you point out, there's only *one* variable. The for loop *doesn't* introduce or create any new variable. – Joshua Taylor Jul 14 '14 at 00:07
  • @JoshuaTaylor Ahhh yes, I would agree with that sentence. – HennyH Jul 14 '14 at 01:07
  • this does not seem to be true with python 3.7.10 where the list comprehension does not reassign `x`: ``` [x for x in x] Out[4]: [1, 2, 3] print(x) [1, 2, 3] ``` – ClementWalter Sep 08 '21 at 12:23
1

x no longer refers to the original list, so there's no confusion. Basically, Python remembers that it's iterating over the original list, but as soon as you start assigning the iteration value (0, 1, 2, etc.) to the name x, that name no longer refers to the list. The name gets reassigned to the iteration value.

In [1]: x = range(5)

In [2]: x
Out[2]: [0, 1, 2, 3, 4]

In [3]: id(x)
Out[3]: 4371091680

In [4]: for x in x:
   ...:     print id(x), x
   ...:     
140470424504688 0
140470424504664 1
140470424504640 2
140470424504616 3
140470424504592 4

In [5]: id(x)
Out[5]: 140470424504592
Noah
  • It doesn't so much make a copy of the range list (as changes to the list would still produce undefined behavior in the iteration). `x` just stops referring to the range list and is instead assigned the new iteration values. The range list still exists intact. If you look at the value of `x` after the loop, it will be `4` – Peter Gibson Jul 11 '14 at 04:19
  • "x no longer refers to the original x" `x` never referred to `x`; `x` referred to a sequence. Then it referred to `1`, then to `2`, etc. – Joshua Taylor Jul 11 '14 at 17:19