I am trying to deepen my understanding of the CPython interpreter by tracing through the different bytecodes/opcodes as they go through the interpreter loop in ceval.c for a simple Python program. I am using bytecode and opcode to mean the same thing.

My Python program is this:

#filename: test.py

x = 2
y = 3

if x < y:
    z = x
elif True:
    z = y
else:
    z = 100

I am using Python 2.7.8 and have built it with debug flags like so:

wget https://www.python.org/ftp/python/2.7.8/Python-2.7.8.tgz   # download
tar xvf Python-2.7.8.tgz                                        # extract
cd Python-2.7.8
./configure --with-pydebug                                      # build with debug flag
make -j                                                         # parallel make

I'm interested in tracing through the for(;;) loop surrounding the switch statement that handles the different opcodes in the interpreter loop in ceval.c, starting at line 964.

I added these lines right after the start of that for loop to check whether the interpreter is running my file and, if so, to print out the opcode.

964     for (;;) {
965         if (strcmp(filename, "../test.py") == 0) {
966             printf("%d\n", opcode);
967         }

And the output I get is (comments added manually to show the matching #define names from opcode.h):

 $ ./python.exe ../test.py | cat -n                                              
 1  0    //  STOP_CODE
 2  90   //  HAVE_ARGUMENT
 3  90   //  HAVE_ARGUMENT
 4  101  //  LOAD_NAME
 5  101  //  LOAD_NAME
 6  101  //  LOAD_NAME
 7  90   //  STORE_NAME

I expected 12 different opcodes instead of 7, because when I get the bytecode disassembly of the same file, there are 12 bytecode commands.

$ ./python.exe -m dis ../pytests/test.py | sed "/^$/d" | cat -n                      
1    1           0 LOAD_CONST               0 (2)
2                3 STORE_NAME               0 (x)
3    2           6 LOAD_CONST               1 (3)
4                9 STORE_NAME               1 (y)
5    4          12 LOAD_NAME                0 (x)
6               15 LOAD_NAME                1 (y)
7               18 COMPARE_OP               0 (<)
8               21 POP_JUMP_IF_FALSE       33
9    5          24 LOAD_NAME                0 (x)
10              27 STORE_NAME               2 (z)
11              30 JUMP_FORWARD            21 (to 54)
12   6     >>   33 LOAD_NAME               

Either my mental model of how the CPython interpreter works or my method of logging the opcodes is incorrect, or both. Can you explain why the opcodes printed from ceval.c differ from the ones listed by the python -m dis module?

gariepy
Idr
  • I think you've wrongly tagged this as "Cython" (a tool that translates a Python-like language to C with a view to faster numerical calculations) when you mean "CPython" (the "standard" Python interpreter). As it stands this question doesn't make too much sense... – DavidW Nov 11 '15 at 07:42
  • But ignoring the fact that the bytecodes you see are different, I think you should expect to get different numbers of bytecodes: the loop in ceval will only show what gets executed (not the parts of "if/else" that get skipped), while dis will give you everything. – DavidW Nov 11 '15 at 07:53
  • Possible duplicate of [What is the best way to sample/profile a PyObjC application?](http://stackoverflow.com/questions/157662/what-is-the-best-way-to-sample-profile-a-pyobjc-application) – Paul Sweatte Aug 31 '16 at 19:26

1 Answer

Your tracing display is faulty: in the normal course of things you never execute STOP_CODE (value 0), which would halt execution. Also, HAVE_ARGUMENT is not an opcode. It is a marker in opcode.h (value 90) meaning that every opcode at or above it takes an argument. In Python 2.7 the opcode with value 90 is STORE_NAME.
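If it helps, the standard opcode module can map a number back to its name, which avoids reading opcode.h by hand. This is only a small sketch: describe is a helper name I made up, and the numbers printed depend on the interpreter you run it with (in Python 2.7 both HAVE_ARGUMENT and STORE_NAME are 90).

```python
import opcode


def describe(value):
    """Return the opcode name for a numeric value in the running interpreter.

    Hypothetical helper for illustration; it just indexes opcode.opname.
    """
    return opcode.opname[value]


# HAVE_ARGUMENT is a threshold, not an instruction: traditionally, opcodes
# whose value is >= HAVE_ARGUMENT are followed by an argument.
print("HAVE_ARGUMENT = %d" % opcode.HAVE_ARGUMENT)

# Look up the names that matter for the question and map them back again.
for name in ("STORE_NAME", "LOAD_NAME", "LOAD_CONST"):
    value = opcode.opmap[name]
    print("%-10s = %3d -> %s" % (name, value, describe(value)))
```

Run with the same ./python.exe you built, the STORE_NAME line should show 90, the value that was labelled HAVE_ARGUMENT in the question's output.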

As for the discrepancy in the counts, that is to be expected in any code that isn't straight-line (basic block) code, and yours isn't. There is a COMPARE_OP (<) followed by a POP_JUMP_IF_FALSE jump, so only the branch that is taken gets executed, whereas dis lists every instruction in the file.
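To make both points concrete, here is a toy model of the loop in Python. It is not the real interpreter, and it rests on my reading of Python 2.7's ceval.c, so treat these as assumptions: the added printf runs before the opcode is fetched (so it shows the previous iteration's value), LOAD_CONST, JUMP_FORWARD and POP_JUMP_IF_FALSE jump to the fast_next_opcode label and skip the top of the for loop, and COMPARE_OP "predicts" a following POP_JUMP_IF_FALSE. The names run, PROGRAM, CONSTS and NAMES are made up for this sketch, and the two instructions at offsets 54 and 57 are the usual module epilogue, not something shown in the truncated listing.

```python
# Opcode numbers as defined in Python 2.7's opcode.h.
RETURN_VALUE, STORE_NAME, LOAD_CONST, LOAD_NAME = 83, 90, 100, 101
COMPARE_OP, JUMP_FORWARD, POP_JUMP_IF_FALSE = 107, 110, 114

# Assumption: these opcodes jump to fast_next_opcode, past the added printf.
FAST = {LOAD_CONST, JUMP_FORWARD, POP_JUMP_IF_FALSE}

CONSTS = (2, 3, None)      # illustrative constant table, not the real co_consts
NAMES = ("x", "y", "z")

# offset -> (opcode, arg), following the dis listing in the question.
# Offsets 33..53 are left out: the listing is truncated there and that
# part is never executed on this path anyway.
PROGRAM = {
    0: (LOAD_CONST, 0), 3: (STORE_NAME, 0),
    6: (LOAD_CONST, 1), 9: (STORE_NAME, 1),
    12: (LOAD_NAME, 0), 15: (LOAD_NAME, 1),
    18: (COMPARE_OP, 0), 21: (POP_JUMP_IF_FALSE, 33),
    24: (LOAD_NAME, 0), 27: (STORE_NAME, 2),
    30: (JUMP_FORWARD, 21),
    54: (LOAD_CONST, 2), 57: (RETURN_VALUE, 0),   # assumed module epilogue
}


def run(program, consts, names):
    """Execute the toy program; return (values the printf would show, names)."""
    stack, env, printed = [], {}, []
    pc = 0
    opcode = 0        # stands in for the uninitialized C variable
    fast = False
    while True:
        if not fast:
            printed.append(opcode)    # top of the for loop: value is stale
        opcode, arg = program[pc]     # the real fetch happens only here
        pc += 3                       # every instruction treated as 3 bytes
        if opcode == LOAD_CONST:
            stack.append(consts[arg])
        elif opcode == STORE_NAME:
            env[names[arg]] = stack.pop()
        elif opcode == LOAD_NAME:
            stack.append(env[names[arg]])
        elif opcode == COMPARE_OP:
            right, left = stack.pop(), stack.pop()
            stack.append(left < right)            # only '<' is modeled
            if program[pc][0] == POP_JUMP_IF_FALSE:
                opcode, arg = program[pc]         # prediction: fall straight in
                pc += 3
        elif opcode == JUMP_FORWARD:
            pc += arg
        elif opcode == RETURN_VALUE:
            return printed, env
        if opcode == POP_JUMP_IF_FALSE and not stack.pop():
            pc = arg
        fast = opcode in FAST


printed, env = run(PROGRAM, CONSTS, NAMES)
print(printed)   # [0, 90, 90, 101, 101, 101, 90]
print(env)
```

Under those assumptions the model prints the same seven values as in the question: a stale 0 first, then only the opcodes that return to the top of the loop (STORE_NAME and LOAD_NAME), each reported one iteration late. Moving the printf to just after the line that fetches the opcode should give one line per executed instruction.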

rocky