0

I would like to use my CPU's built-in instructions from within Numba-compiled functions, but I am having trouble figuring out how to reference them. Take, for example, the popcnt instruction from the SSE4 instruction set: I can confirm my CPU supports it using llvmlite.binding.get_host_cpu_features(), but I have no way of calling the instruction itself.

I need to be able to call these functions (instructions) from within other nopython compiled functions.

Ideally this would be done as close to Python as possible, but in this case speed is more important than readability.

Sam Ragusa

2 Answers

1

You can use Cython to call SSE intrinsics, but you cannot use Numba to do it. Code doing what you want via Cython is here: https://gist.github.com/aldro61/f604a3fa79b3dec5436a and here: https://gist.github.com/craffel/e470421958cad33df550

John Zwinck
  • While I do appreciate this, I do believe it is possible to do it from within Numba. I say that mainly because of these docs: http://llvmlite.pydata.org/en/latest/user-guide/binding/modules.html#llvmlite.binding.ModuleRef – Sam Ragusa May 04 '18 at 08:39
  • @SamRagusa: Well it's hard to prove a negative isn't it? If you strongly believe it is possible, I would encourage you to figure out how and post it here as an answer. – John Zwinck May 06 '18 at 06:22
  • While this may be what bugs me the most, it's far from my current bottleneck. At some point I'll try to circle back around to it, and of course if I end up finding a better (or just interesting) solution I'll update the post accordingly! – Sam Ragusa May 12 '18 at 10:00
1

You can make a small assembly-language DLL and call it through ctypes, which in my experience has no overhead whatsoever when used from Numba nopython code. Alternatively, you can use instruction codes directly, as in this blog post on JIT in Python. The Piston JavaScript assembler might be used to obtain the machine code for a small asm routine. Numba also allows making small functions in LLVM IR, as described in this thread. Of course, llvmlite might be used too.

as691454