I would like to use my CPU's built-in instructions from within Numba-compiled functions, but I am having trouble figuring out how to reference them. One example is the popcnt instruction from the SSE4 instruction set: I can confirm that my CPU supports it using llvmlite.binding.get_host_cpu_features(), but I have no way of calling the instruction itself.
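For reference, the feature check I am doing looks roughly like the sketch below. The helper name host_has_feature is made up for illustration and is not part of llvmlite. The sketch assumes llvmlite is installed and that get_host_cpu_features() returns a dict-like mapping from feature name to bool. It only confirms that the feature is reported; it does not call the instruction.

```python
def host_has_feature(name):
    """Return True/False as reported by LLVM, or None if it cannot be determined.

    host_has_feature is a hypothetical helper, not an llvmlite API.
    """
    try:
        import llvmlite.binding as llvm
        # Dict-like object mapping feature names (e.g. "popcnt") to booleans.
        features = llvm.get_host_cpu_features()
    except (ImportError, RuntimeError):
        # llvmlite is missing, or LLVM could not query the host CPU.
        return None
    # Feature names LLVM does not list for this architecture yield None.
    return features.get(name)


if __name__ == "__main__":
    print("popcnt reported by LLVM:", host_has_feature("popcnt"))
```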
I need to be able to call these functions (instructions) from within other nopython compiled functions.
Ideally this would be done as close to Python as possible, but in this case speed is more important than readability.