I have some code in which I need to calculate a checksum, both before transmission and on received data. A simple timing check showed that a relatively large share of the run time was spent calculating and checking it. With 210,000 packets to deal with, that sort of makes sense.
Reading through a few sites (SO popped up a few times), numpy's bitwise functions were reported to be faster than the native operators (see "Fastest bitwise xor between two multibyte binary data variables").
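That question is about multibyte data, so the comparison may only hold when numpy works on many bytes at once. A minimal sketch of what that could look like for this checksum, assuming the packets can be batched into a single (N, 3) uint8 array; `checksum_scalar`, `checksum_batch` and `packets` are hypothetical names used only for illustration:

```python
import numpy as np


def checksum_scalar(data):
    """Pure-Python checksum of one 3-byte packet, as an int in 0..255."""
    return ~(data[0] ^ data[1] ^ data[2]) & 0xFF


def checksum_batch(packets):
    """Checksums for an (N, 3) uint8 array of packets, one uint8 per row."""
    # XOR the three bytes of each row, then invert the result.
    # Because the dtype is uint8, no extra "& 0xff" mask is needed.
    return np.invert(np.bitwise_xor.reduce(packets, axis=1))


# Assumption: random test data standing in for the real packets.
rng = np.random.default_rng(0)
packets = rng.integers(0, 256, size=(210_000, 3), dtype=np.uint8)
sums = checksum_batch(packets)
```

Here a single numpy call covers all 210,000 rows, so the per-call overhead is paid once rather than once per packet.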
Equally, I have looked into lru_cache, since I can take advantage of the fact that, the majority of the time, very similar requests are made.
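To check whether the cache is actually being hit, the wrapped function exposes `cache_info()`. A small sketch, assuming the packet is passed as a hashable 3-tuple of ints; `cached_checksum` is a hypothetical name, and the random inputs only mimic the kind of requests described above:

```python
from functools import lru_cache
import random


@lru_cache(maxsize=128)
def cached_checksum(data):
    """Checksum of a hashable 3-tuple of ints, returned as a single byte."""
    return (~(data[0] ^ data[1] ^ data[2]) & 0xFF).to_bytes(1, byteorder="big")


random.seed(0)
for _ in range(10_000):
    # At most 128 distinct keys, so they all fit in the cache.
    cached_checksum((random.randint(0, 127), 0, 0))

info = cached_checksum.cache_info()
print(info)  # named tuple with hits, misses, maxsize and currsize
```

With at most 128 distinct keys and `maxsize=128`, nothing should be evicted, so almost every call after the first few hundred ought to be a hit.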
I tried both approaches in the test script below and got some odd results.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from functools import lru_cache
from timeit import Timer
import random

import numpy as np
from numpy import bitwise_and, invert, bitwise_xor


def checksum1(data):
    '''Checksum is the bytes XOR'ed together, then bit inverted - truncated to a single byte'''
    # Native Python integer operators.
    return (~(data[0] ^ data[1] ^ data[2]) & 0xff).to_bytes(1, byteorder='big', signed=False)


@lru_cache(maxsize=128)
def checksum2(data):
    '''Checksum is the bytes XOR'ed together, then bit inverted - truncated to a single byte'''
    # Same as checksum1, with results cached per input tuple.
    return (~(data[0] ^ data[1] ^ data[2]) & 0xff).to_bytes(1, byteorder='big', signed=False)


def checksum3(data):
    '''Checksum is the bytes XOR'ed together, then bit inverted - truncated to a single byte'''
    # numpy bitwise functions applied to scalar values.
    return bitwise_and(invert(bitwise_xor(bitwise_xor(data[0], data[1]), data[2])), 255).astype(np.uint8).tobytes()


@lru_cache(maxsize=128)
def checksum4(data):
    '''Checksum is the bytes XOR'ed together, then bit inverted - truncated to a single byte'''
    # Same as checksum3, with results cached per input tuple.
    return bitwise_and(invert(bitwise_xor(bitwise_xor(data[0], data[1]), data[2])), 255).astype(np.uint8).tobytes()


if __name__ == "__main__":
    # Time each variant with timeit's default of 1,000,000 calls.
    # Note that random.randint() runs inside every timed statement.
    #T = Timer('test()',"from __main__ import test")
    T = Timer('checksum1((random.randint(0,127),0,0))', "import random;from __main__ import checksum1")
    print(T.timeit())
    T = Timer('checksum2((random.randint(0,127),0,0))', "import random;from __main__ import checksum2")
    print(T.timeit())
    T = Timer('checksum3((random.randint(0,127),0,0))', "import random;from __main__ import checksum3")
    print(T.timeit())
    T = Timer('checksum4((random.randint(0,127),0,0))', "import random;from __main__ import checksum4")
    print(T.timeit())

py test.py
4.10519769108277
6.260751157558025
10.463237500651697
6.182100842095494
The four timings are for checksum1 to checksum4, in that order. This implies that the numpy method is slow, and that accessing the lru_cache adds more overhead than it saves.
Am I doing something wrong, or is this correct?
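One thing I am unsure about is how much of each figure comes from `random.randint` and the tuple construction rather than from the checksum itself. A rough sketch of how the two could be separated, assuming a pre-built pool of inputs is representative; `inputs`, `run_all` and `baseline` are hypothetical helpers, not part of the script above:

```python
import random
from timeit import timeit


def checksum(data):
    """Native-operator checksum, identical in logic to checksum1."""
    return (~(data[0] ^ data[1] ^ data[2]) & 0xFF).to_bytes(1, byteorder="big")


# Build the inputs once, so no random numbers are generated while timing.
inputs = [(random.randint(0, 127), 0, 0) for _ in range(1000)]


def run_all():
    """Call the checksum on every pre-built input."""
    for item in inputs:
        checksum(item)


def baseline():
    """Loop over the same inputs without calling the checksum."""
    for item in inputs:
        pass


n = 200
total = timeit(run_all, number=n)
overhead = timeit(baseline, number=n)
print(f"checksum + loop: {total:.4f}s, loop only: {overhead:.4f}s")
```

The difference between the two figures is a rough estimate of the time spent in the checksum calls alone.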