I am trying to deepen my understanding of the CPython interpreter by tracing the bytecodes/opcodes as they pass through the interpreter loop in ceval.c for a simple Python program. I am using "bytecode" and "opcode" to mean the same thing.
My Python program is this:
#filename: test.py
x = 2
y = 3
if x < y:
    z = x
elif True:
    z = y
else:
    z = 100
I am using Python 2.7.8 and have built it with debug flags like so:
wget https://www.python.org/ftp/python/2.7.8/Python-2.7.8.tgz # download
tar xvf Python-2.7.8.tgz # extract
cd Python-2.7.8
./configure --with-pydebug # build with debug flag
make -j # parallel make
I'm interested in tracing through the for(;;) loop surrounding the switch statement for the different opcodes in the interpreter loop in ceval.c, starting at line 964.
I added these lines right after the start of that for loop to check whether the interpreter is running my file and, if so, to print the opcode.
964     for (;;) {
965         if (strcmp(filename, "../test.py") == 0) {
966             printf("%d\n", opcode);
967         }
And the output I get is (comments added manually to show the matching opcode #defines from opcode.h):
$ ./python.exe ../test.py | cat -n
1 0 // STOP_CODE
2 90 // HAVE_ARGUMENT
3 90 // HAVE_ARGUMENT
4 101 // LOAD_NAME
5 101 // LOAD_NAME
6 101 // LOAD_NAME
7 90 // STORE_NAME
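To cross-check the manual annotations, the numbers can also be looked up in the interpreter's own opcode table. This is only a minimal sketch: `opcode_names` is a helper name I made up, and since opcode numbering is specific to each Python version, it would need to be run with the same 2.7.8 build to be meaningful. Several #defines in opcode.h can share one value, so a table lookup avoids picking the wrong name by hand.

```python
import dis

def opcode_names(numbers):
    """Pair each numeric opcode with the name the running interpreter uses for it."""
    # dis.opname is a list indexed by opcode number.
    return [(n, dis.opname[n]) for n in numbers]

# Values printed by the instrumented loop in ceval.c (copied from the output above).
observed = [0, 90, 90, 101, 101, 101, 90]

for number, name in opcode_names(observed):
    print("%3d  %s" % (number, name))
```

Running it as `./python.exe lookup.py` (the file name is hypothetical) should use the table that matches the ceval.c being traced.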
I expected 12 opcodes instead of 7, because the bytecode disassembly of the same file shows 12 bytecode instructions.
$ ./python.exe -m dis ../pytests/test.py | sed "/^$/d" | cat -n
 1    1           0 LOAD_CONST               0 (2)
 2                3 STORE_NAME               0 (x)
 3    2           6 LOAD_CONST               1 (3)
 4                9 STORE_NAME               1 (y)
 5    4          12 LOAD_NAME                0 (x)
 6               15 LOAD_NAME                1 (y)
 7               18 COMPARE_OP               0 (<)
 8               21 POP_JUMP_IF_FALSE       33
 9    5          24 LOAD_NAME                0 (x)
10               27 STORE_NAME               2 (z)
11               30 JUMP_FORWARD            21 (to 54)
12    6     >>   33 LOAD_NAME
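For reference, the same kind of listing can be produced from inside Python by compiling the source and passing the code object to `dis.dis`. This is a sketch under the assumption that the source string below matches test.py; the exact instructions and offsets depend on the Python version, so only a 2.7.8 interpreter would be expected to reproduce the listing above.

```python
import dis

# Assumed to mirror the contents of test.py.
SOURCE = """\
x = 2
y = 3
if x < y:
    z = x
elif True:
    z = y
else:
    z = 100
"""

# Compile as a module-level code object, the same kind the interpreter loop evaluates.
code = compile(SOURCE, "test.py", "exec")

# Print one line per bytecode instruction.
dis.dis(code)
```

The code object's `co_filename` is the second argument to `compile`, which is the string a filename check like the one in my ceval.c patch would be compared against.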
Either my mental model of how the CPython interpreter works or my method of logging the opcodes is incorrect, or both. Can you explain why the opcodes printed from ceval.c differ from those shown by the python -m dis module?