What's the purpose of keeping track of blocks in Python bytecode?

The documentation here mentions:

... Per frame, there is a stack of blocks, denoting nested loops, try statements, and such.

But they don't seem necessary to actually perform loops. For instance, playing around in the REPL, I see:

>>> import dis
>>> def foo():
...   while True:
...     print('hi')
... 
>>> for inst in list(dis.get_instructions(foo)): print(inst)
... 
Instruction(opname='SETUP_LOOP', opcode=120, arg=12, argval=14, argrepr='to 14', offset=0, starts_line=2, is_jump_target=False)
Instruction(opname='LOAD_GLOBAL', opcode=116, arg=0, argval='print', argrepr='print', offset=2, starts_line=3, is_jump_target=True)
Instruction(opname='LOAD_CONST', opcode=100, arg=1, argval='hi', argrepr="'hi'", offset=4, starts_line=None, is_jump_target=False)
Instruction(opname='CALL_FUNCTION', opcode=131, arg=1, argval=1, argrepr='', offset=6, starts_line=None, is_jump_target=False)
Instruction(opname='POP_TOP', opcode=1, arg=None, argval=None, argrepr='', offset=8, starts_line=None, is_jump_target=False)
Instruction(opname='JUMP_ABSOLUTE', opcode=113, arg=2, argval=2, argrepr='', offset=10, starts_line=None, is_jump_target=False)
Instruction(opname='POP_BLOCK', opcode=87, arg=None, argval=None, argrepr='', offset=12, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_CONST', opcode=100, arg=0, argval=None, argrepr='None', offset=14, starts_line=None, is_jump_target=True)
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=16, starts_line=None, is_jump_target=False)

The JUMP_ABSOLUTE instruction jumps back to the LOAD_GLOBAL instruction. Just from looking at the instructions, it seems like the SETUP_LOOP and POP_BLOCK opcodes could be no-ops.

From what I understand, Python has no block-scoped variables, so that doesn't seem like it would be the reason either.

math4tots

2 Answers

CPython uses a stack machine model, where temporary values are pushed onto a value stack and popped by instructions that use them. When a loop ends, depending on how it ends, it may have left values on the value stack that are no longer needed.

A frame's block stack records the value stack level at the start of loops and a few other constructs, so the value stack can be restored to the state that the code after the loop (or other construct) expects. POP_BLOCK is one of the instructions that restores the stack to its pre-block-entry state.
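To make the bookkeeping concrete, here is a toy model of the idea in Python. It is only a sketch, not CPython's actual implementation: the ToyFrame class and its setup_block and pop_block methods are hypothetical names invented for illustration, loosely mirroring what SETUP_LOOP and POP_BLOCK do.

```python
# Toy model only: ToyFrame, setup_block, and pop_block are hypothetical
# names for illustration, not part of CPython.
class ToyFrame:
    def __init__(self):
        self.value_stack = []   # temporary values pushed by instructions
        self.block_stack = []   # entries of (kind, value stack depth at entry)

    def setup_block(self, kind):
        # Roughly what SETUP_LOOP does: remember how deep the value
        # stack was when the block was entered.
        self.block_stack.append((kind, len(self.value_stack)))

    def pop_block(self):
        # Roughly what POP_BLOCK does: discard anything pushed since
        # the block was entered, restoring the recorded depth.
        kind, level = self.block_stack.pop()
        del self.value_stack[level:]
        return kind


frame = ToyFrame()
frame.value_stack.append("value from before the loop")
frame.setup_block("loop")
frame.value_stack.extend(["iterator", "leftover temporary"])
frame.pop_block()
print(frame.value_stack)  # ['value from before the loop']
```

Whatever the loop body left behind is dropped, and the value that was on the stack before the loop survives.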

The information in the block stack is very important for exception-handling constructs, since the value stack could be in all sorts of weird states when an exception occurs. It's not as necessary for loops, and I believe a patch going into CPython 3.8 will eliminate block stack entries for loops, instead having the compiler determine the necessary handling statically.

user2357112

The SETUP_LOOP and POP_BLOCK bytecodes in your example function are useless because the loop runs forever, but if you had a break statement inside the loop, the infrastructure they set up in the frame would be used. The compiler would emit a BREAK_LOOP bytecode where the break statement occurred, and at run time the interpreter would use the block information to find the nearest enclosing loop to break out of.
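If you want to check this on your own interpreter, a small sketch along these lines lists the opcode names for a function containing a break. The function find_first_negative is a made-up example. On CPython 3.7 and earlier I would expect SETUP_LOOP and BREAK_LOOP to show up in the list; on 3.8 and later those opcodes were removed, so the output will differ by version.

```python
import dis

def find_first_negative(items):
    # Hypothetical example function: the break needs to leave the loop
    # and clean up the loop's iterator from the value stack.
    for x in items:
        if x < 0:
            break
    return None

# Collect just the opcode names; the exact list depends on the
# CPython version, so no expected output is shown here.
opnames = [inst.opname for inst in dis.get_instructions(find_first_negative)]
print(opnames)
```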

Note that this part of the bytecode is apparently going to change in Python 3.8, so you may not want to invest too much effort into understanding how it currently works. You can read issue 17611 on the Python bug tracker to see how the topic was discussed before being implemented.

Blckknght