I've been interested in function-call overheads, so I wrote a minimal C extension exporting two functions, nop and starnop, that do more or less nothing. They just pass through their input (the two relevant functions are right at the top; the rest is just tedious boilerplate code):
amanmodule.c:
#include <Python.h>

static PyObject* aman_nop(PyObject *self, PyObject *args)
{
    PyObject *obj;

    if (!PyArg_UnpackTuple(args, "arg", 1, 1, &obj))
        return NULL;
    Py_INCREF(obj);
    return obj;
}

static PyObject* aman_starnop(PyObject *self, PyObject *args)
{
    Py_INCREF(args);
    return args;
}

static PyMethodDef AmanMethods[] = {
    {"nop", (PyCFunction)aman_nop, METH_VARARGS,
     PyDoc_STR("nop(arg) -> arg\n\nReturn arg unchanged.")},
    {"starnop", (PyCFunction)aman_starnop, METH_VARARGS,
     PyDoc_STR("starnop(*args) -> args\n\nReturn tuple of args unchanged")},
    {NULL, NULL}
};

static struct PyModuleDef amanmodule = {
    PyModuleDef_HEAD_INIT,
    "aman",
    "aman - a module about nothing.\n\n"
    "Provides functions 'nop' and 'starnop' which do nothing:\n"
    "nop(arg) -> arg; starnop(*args) -> args\n",
    -1,
    AmanMethods
};

PyMODINIT_FUNC
PyInit_aman(void)
{
    return PyModule_Create(&amanmodule);
}
setup.py:
from setuptools import setup, Extension

setup(name='aman', version='1.0',
      ext_modules=[Extension('aman', ['amanmodule.c'])],
      author='n.n.',
      description="""aman - a module about nothing
Provides functions 'nop' and 'starnop' which do nothing:
nop(arg) -> arg; starnop(*args) -> args
""",
      license='public domain',
      keywords='nop pass-through identity')
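As a quick sanity check of the intended behaviour, here is a minimal sketch of what the two functions should return. The pure-Python fallbacks are hypothetical stand-ins, assumed only so the snippet also runs where the extension has not been built; they are not part of the module.

```python
try:
    from aman import nop, starnop
except ImportError:
    # Hypothetical stand-ins with the same contract, used only if aman isn't installed.
    def nop(arg):
        return arg

    def starnop(*args):
        return args

marker = object()
assert nop(marker) is marker              # nop hands back the very same object
assert starnop(marker, 2) == (marker, 2)  # starnop hands back the argument tuple
assert starnop(1) == (1,)                 # so starnop(1) is a 1-tuple, not 1
```

Note that starnop(1) therefore returns a tuple rather than its argument, which is worth keeping in mind when comparing it with nop.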
Next, I time them against pure Python implementations and a couple of builtins that also do next to nothing:
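import numpy as np
from aman import nop, starnop
from timeit import timeit

# timeit defaults to number=1,000,000, so the total seconds per block
# equal the microseconds per call.
def mnsd(x): return '{:8.6f} \u00b1 {:8.6f} \u00b5s'.format(np.mean(x), np.std(x))

# Pure-Python candidate; note that it evaluates x but returns None.
def pnp(x): x

globals={}
# Bind each candidate to the name 'nop' in the namespace that timeit uses.
for globals['nop'] in (int, bool, (0).__add__, hash, starnop, nop, pnp, lambda x: x):
    print('{:60s}'.format(repr(globals['nop'])),
          mnsd([timeit('nop(1)', globals=globals) for i in range(10)]),
          ' ',
          mnsd([timeit('nop(True)',globals=globals) for i in range(10)]))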
First question: am I doing anything obviously wrong methodology-wise?
Results for 10 blocks of 1,000,000 calls each (mean ± standard deviation of the per-call time; left column is nop(1), right column is nop(True)):
<class 'int'>                                          0.099754 ± 0.003917 µs   0.103933 ± 0.000585 µs
<class 'bool'>                                         0.097711 ± 0.000661 µs   0.094412 ± 0.000612 µs
<method-wrapper '__add__' of int object at 0x8c7000>   0.065146 ± 0.000728 µs   0.064976 ± 0.000605 µs
<built-in function hash>                               0.039546 ± 0.000671 µs   0.039566 ± 0.000452 µs
<built-in function starnop>                            0.056490 ± 0.000873 µs   0.056234 ± 0.000181 µs
<built-in function nop>                                0.060094 ± 0.000799 µs   0.059959 ± 0.000170 µs
<function pnp at 0x7fa31c0512f0>                       0.090452 ± 0.001077 µs   0.098479 ± 0.003314 µs
<function <lambda> at 0x7fa31c051378>                  0.086387 ± 0.000817 µs   0.086536 ± 0.000714 µs
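As a cross-check on the methodology, here is a minimal sketch of an alternative harness. It assumes the best (minimum) of several repeats is a more robust figure than the mean, as the timeit documentation suggests, and it subtracts a no-call baseline to separate loop and name-lookup cost from the call itself. The helper per_call_ns, the identity function and the reduced number of iterations are illustrative choices, and only builtins are timed so that the snippet runs without the extension.

```python
from timeit import repeat

def per_call_ns(stmt, namespace, number=200_000, rounds=5):
    """Best-of-rounds time per execution of stmt, in nanoseconds."""
    best = min(repeat(stmt, globals=namespace, number=number, repeat=rounds))
    return best / number * 1e9

def identity(x):
    # Pure-Python pass-through, for comparison with the builtins.
    return x

candidates = {'hash': hash, 'int': int, 'identity': identity}

# Baseline: the timing loop plus one global name lookup, without any call.
baseline = per_call_ns('f', {'f': None})

results = {name: per_call_ns('f(1)', {'f': func})
           for name, func in candidates.items()}

for name, ns in results.items():
    print('{:10s} {:8.1f} ns/call ({:8.1f} ns above baseline)'.format(
        name, ns, ns - baseline))
```

To include the extension functions, add nop and starnop to the candidates dict; the absolute numbers will of course depend on the machine and the Python build.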
Now my actual question: even though my nops are written in C and do nothing (starnop doesn't even parse its arguments), the builtin hash function is consistently faster. I know that ints are their own hash values in Python, so hash is also a nop here, but it isn't nopper than my nops, so why the speed difference?
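To back up the claim that hash is effectively a nop for the arguments used here, a short check follows. It assumes CPython, where small non-negative ints hash to themselves; the two exceptions shown (-1 and ints at or above sys.hash_info.modulus) do not affect the benchmark inputs 1 and True.

```python
import sys

# Small non-negative ints hash to themselves, and True hashes like 1.
assert hash(1) == 1
assert hash(True) == 1

# Exception 1: -1 is reserved as an error return in the C API,
# so CPython maps hash(-1) to -2.
assert hash(-1) == -2

# Exception 2: large ints are reduced modulo sys.hash_info.modulus.
modulus = sys.hash_info.modulus
assert hash(modulus) == 0
assert hash(modulus + 5) == 5
```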
Update: I completely forgot to mention that I'm on a pretty standard x86_64 machine running Linux, with gcc 4.8.5. I install the extension using python3 setup.py install --user.