As ridiculous as it may sound, the fastest solution using builtins may be to build a string and pass it to `int`, much as the fastest way to count 1-bits in an int is `bin(n).count('1')`. And it's dead simple, too:
```python
def unbitify_byte(src):
    s = ''.join(map(str, src))
    n = int(s, 2)
    return n.to_bytes(len(src)//8, 'big')
```
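For instance, packing sixteen bits gives two bytes (the function is repeated here so the snippet runs standalone):

```python
def unbitify_byte(src):
    s = ''.join(map(str, src))
    n = int(s, 2)
    return n.to_bytes(len(src)//8, 'big')

# 0b11110000 0b00000001 -> b'\xf0\x01'
bits = [1, 1, 1, 1, 0, 0, 0, 0,  0, 0, 0, 0, 0, 0, 0, 1]
print(unbitify_byte(bits))  # b'\xf0\x01'
```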
Equivalent (but marginally more complex) code using `gmpy2` instead of native Python `int` is a bit faster.
And you can extend it to 2-bit values pretty easily:
```python
def unhalfnybblify_byte(src):
    s = ''.join(map(str, src))
    n = int(s, 4)
    return n.to_bytes(len(src)//4, 'big')
```
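As a quick sanity check, four 2-bit values pack into one byte, most-significant value first (function repeated so the snippet runs standalone):

```python
def unhalfnybblify_byte(src):
    s = ''.join(map(str, src))
    n = int(s, 4)
    return n.to_bytes(len(src)//4, 'big')

# Values 3, 2, 1, 0 -> 0b11_10_01_00 == 0xE4
print(unhalfnybblify_byte([3, 2, 1, 0]))  # b'\xe4'
```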
If you want something more flexible, but possibly slower, here's a simple solution using `ctypes`.
If you know C, you can probably see that a struct of 8 single-bit bit-fields would come in handy here. And you can write the equivalent struct type in Python like this:
```python
import ctypes

class Bits(ctypes.Structure):
    _fields_ = [(f'bit{8-i}', ctypes.c_uint, 1) for i in range(8)]
```
And you can construct one of them from 8 ints that are all 0 or 1:
```python
bits = Bits(*src[:8])
```
And you can convert that to a single int by using an ugly cast or a simple union:
```python
class UBits(ctypes.Union):
    _fields_ = [('bits', Bits), ('i', ctypes.c_uint8)]

i = UBits(Bits(*src[:8])).i
```
So now it's just a matter of chunking `src` into groups of 8 in big-endian order:
```python
chunks = (src[i:i+8][::-1] for i in range(0, len(src), 8))
dst = bytearray(UBits(Bits(*chunk)).i for chunk in chunks)
```
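Putting the pieces together into one runnable sketch (this assumes a platform where `ctypes` allocates bit-fields LSB-first, as GCC does on little-endian machines; hence the per-chunk reversal):

```python
import ctypes

class Bits(ctypes.Structure):
    # Declared bit8 first; on LSB-first platforms bit8 lands in bit 0,
    # so each reversed chunk packs big-endian within the byte.
    _fields_ = [(f'bit{8-i}', ctypes.c_uint, 1) for i in range(8)]

class UBits(ctypes.Union):
    _fields_ = [('bits', Bits), ('i', ctypes.c_uint8)]

src = bytearray([1, 1, 1, 1, 0, 0, 0, 0,  0, 0, 0, 0, 0, 0, 0, 1])
chunks = (src[i:i+8][::-1] for i in range(0, len(src), 8))
dst = bytearray(UBits(Bits(*chunk)).i for chunk in chunks)
print(dst)  # bytearray(b'\xf0\x01')
```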
And it should be pretty obvious how to extend this to four 2-bit fields, or two 4-bit fields, or even two 3-bit fields and a 2-bit field, per byte.
However, despite looking like low-level C code, it's probably slower. Still, it might be worth testing to see if it's fast enough for your uses.
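For the 2-bit case, a hypothetical variant might look like this (the `HalfNybbles`/`pair` names are my own, and the same LSB-first bit-field assumption applies):

```python
import ctypes

class HalfNybbles(ctypes.Structure):
    # Four 2-bit fields per byte, declared most-significant first.
    _fields_ = [(f'pair{4-i}', ctypes.c_uint, 2) for i in range(4)]

class UHalfNybbles(ctypes.Union):
    _fields_ = [('pairs', HalfNybbles), ('i', ctypes.c_uint8)]

src = bytearray([3, 2, 1, 0])
chunks = (src[i:i+4][::-1] for i in range(0, len(src), 4))
dst = bytearray(UHalfNybbles(HalfNybbles(*chunk)).i for chunk in chunks)
print(dst)  # bytearray(b'\xe4'), matching unhalfnybblify_byte above
```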
A custom C extension can probably do better. And there are a number of bit-array-type modules on PyPI to try out. But if you want to go down that road, `numpy` is the obvious answer. You can't get any simpler than this:
```python
np.packbits(src)
```
(A `bytearray` works just fine as an "array-like".)
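For instance, feeding it the same sixteen bits as above (assumes `numpy` is installed):

```python
import numpy as np

# packbits packs bits big-endian within each byte by default
src = bytearray([1, 1, 1, 1, 0, 0, 0, 0,  0, 0, 0, 0, 0, 0, 0, 1])
print(bytes(np.packbits(src)))  # b'\xf0\x01'
```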
It's also hard to beat for speed.
For comparison, here are some measurements:

- 60ns/byte + 0.3µs: `np.packbits` on an `array` instead of a `bytearray`
- 60ns/byte + 1.9µs: `np.packbits`
- 440ns/byte + 3.2µs: `for` and bit-twiddling in PyPy instead of CPython
- 570ns/byte + 3.8µs: `int(…, 2).to_bytes(…)` in PyPy instead of CPython
- 610ns/byte + 9.1µs: `bitarray`
- 800ns/byte + 2.9µs: `gmpy.mpz(…)…`
- 1.0µs/byte + 2.8µs: `int(…, 2).to_bytes(…)`
- 2.9µs/byte + 0.2µs: `(UBits(Bits(*chunk)) …)`
- 16.0µs/byte + 0.9µs: `for` and bit-twiddling