I noticed the following odd behaviour when timing enumerate with the default value of the start parameter specified explicitly:
In [23]: %timeit enumerate([1, 2, 3, 4])
The slowest run took 7.18 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 511 ns per loop
In [24]: %timeit enumerate([1, 2, 3, 4], start=0)
The slowest run took 12.45 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 1.22 µs per loop
So, roughly a 2.4x slowdown (511 ns vs. 1.22 µs) for the case where start is specified.
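For anyone who wants to reproduce the measurement outside IPython, here is a minimal sketch using the standard timeit module. The helper name best_time and the repeat/number values are my own choices for illustration, and the absolute numbers will vary with the machine and Python version:

```python
import timeit


def best_time(stmt, repeat=5, number=100000):
    """Return the best observed time per call, in seconds, for stmt."""
    timer = timeit.Timer(stmt)
    # Taking the minimum over several runs reduces noise from other processes.
    return min(timer.repeat(repeat=repeat, number=number)) / number


default_start = best_time("enumerate([1, 2, 3, 4])")
keyword_start = best_time("enumerate([1, 2, 3, 4], start=0)")

print("default start: %.0f ns per call" % (default_start * 1e9))
print("start=0:       %.0f ns per call" % (keyword_start * 1e9))
print("ratio:         %.2f" % (keyword_start / default_start))
```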
The bytecode generated for each case doesn't really indicate anything that would account for such a significant difference in speed. Case in point: after examining the two calls with dis.dis, the only additional instructions are:
18 LOAD_CONST 5 ('start')
21 LOAD_CONST 6 (0)
These, along with CALL_FUNCTION having 1 keyword argument, are the only differences.
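The comparison can be scripted roughly like this. It is only a sketch: the helper name instructions is mine, and the exact opcodes (and how the keyword name is stored) differ between CPython versions, so the printed listing may not match the 3.5 output quoted above:

```python
import dis


def instructions(source):
    """Compile an expression and return (opname, argrepr) for each instruction."""
    code = compile(source, "<example>", "eval")
    return [(ins.opname, ins.argrepr) for ins in dis.get_instructions(code)]


plain = instructions("enumerate([1, 2, 3, 4])")
keyword = instructions("enumerate([1, 2, 3, 4], start=0)")

# Instructions that appear only in the keyword version of the call.
extras = [ins for ins in keyword if ins not in plain]
for opname, argrepr in extras:
    print(opname, argrepr)
```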
I tried tracing through the calls made in CPython's ceval with gdb, and both seem to go through do_call in call_function rather than some other optimization, as far as I could detect.
Now, I understand enumerate just creates an enumerate iterator, so we're dealing with object creation here (right?). I looked in Objects/enumobject.c trying to spot any differences if start was specified. The only thing that (I believe) differs is when start != NULL, in which case the following happens:
if (start != NULL) {
    start = PyNumber_Index(start);
    if (start == NULL) {
        Py_DECREF(en);
        return NULL;
    }
    assert(PyInt_Check(start) || PyLong_Check(start));
    en->en_index = PyInt_AsSsize_t(start);
    if (en->en_index == -1 && PyErr_Occurred()) {
        PyErr_Clear();
        en->en_index = PY_SSIZE_T_MAX;
        en->en_longindex = start;
    } else {
        en->en_longindex = NULL;
        Py_DECREF(start);
    }
}
This doesn't look like something that would introduce a 2x slowdown (I think; I'm not sure).
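One check that might help separate the two suspects (the start handling in enumobject.c versus the cost of passing a keyword argument) is to also time start passed positionally. This is only a sketch of that experiment, with labels and loop counts chosen by me; it doesn't by itself prove where the time goes:

```python
import timeit

NUMBER = 100000  # calls per timing run (arbitrary choice)

statements = [
    ("no start", "enumerate([1, 2, 3, 4])"),
    ("positional start", "enumerate([1, 2, 3, 4], 0)"),
    ("keyword start", "enumerate([1, 2, 3, 4], start=0)"),
]

results = {}
for label, stmt in statements:
    # Best of 5 runs, converted to seconds per call.
    results[label] = min(timeit.repeat(stmt, repeat=5, number=NUMBER)) / NUMBER
    print("%-17s %.0f ns per call" % (label, results[label] * 1e9))
```

If the positional version is close to the no-start version, that would point at keyword argument handling rather than the code in the start != NULL branch.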
The timings above were run on Python 3.5; similar results are present in 2.x too, though.
This is where I'm stuck and can't figure out where to look. This might just be a case of overhead from additional calls in the second case accumulating, but again, I'm not really sure. Does anybody know what might be the reason behind this?