I have some code in which I need to calculate a checksum, both before transmission and on received data. A simple timing check showed that a relatively large share of the run time was spent calculating and checking it. With 210,000 packets to deal with, that sort of makes sense.

Reading through a few sites (SO popped up a few times), numpy's bitwise functions were reported to be faster than native Python (see "Fastest bitwise xor between two multibyte binary data variables").

I have also looked into lru_cache, since I can take advantage of the fact that, most of the time, very similar requests are made.

I have tried this and got some odd results.

#!/usr/bin/env python
#-*- coding: utf-8 -*-

from functools import lru_cache
from numpy import bitwise_and, invert, bitwise_xor
import numpy as np
from timeit import Timer
import random

def checksum1(data):
    '''Checksum is the bytes XOR'ed together, then bit inverted - truncated to a chr'''
    return (~(data[0] ^ data[1] ^ data[2]) & 0xff).to_bytes(1,byteorder='big',signed=False)


@lru_cache(maxsize=128)
def checksum2(data):
    '''Checksum is the bytes XOR'ed together, then bit inverted - truncated to a chr'''
    return (~(data[0] ^ data[1] ^ data[2]) & 0xff).to_bytes(1,byteorder='big',signed=False)


def checksum3(data):
    '''Checksum is the bytes XOR'ed together, then bit inverted - truncated to a chr'''
    return  bitwise_and(invert(bitwise_xor(bitwise_xor(data[0],data[1]),data[2])),255).astype(np.uint8).tobytes()


@lru_cache(maxsize=128)
def checksum4(data):
    '''Checksum is the bytes XOR'ed together, then bit inverted - truncated to a chr'''
    return  bitwise_and(invert(bitwise_xor(bitwise_xor(data[0],data[1]),data[2])),255).astype(np.uint8).tobytes()


if __name__ == "__main__":
    #T = Timer('test()',"from __main__ import test")
    T = Timer('checksum1((random.randint(0,127),0,0))',"import random;from __main__ import checksum1")
    print(T.timeit())
    T = Timer('checksum2((random.randint(0,127),0,0))',"import random;from __main__ import checksum2")
    print(T.timeit())
    T = Timer('checksum3((random.randint(0,127),0,0))',"import random;from __main__ import checksum3")
    print(T.timeit())
    T = Timer('checksum4((random.randint(0,127),0,0))',"import random;from __main__ import checksum4")
    print(T.timeit())

py test.py

4.10519769108277

6.260751157558025

10.463237500651697

6.182100842095494

This implies that the numpy method is slow, and that going through the lru_cache adds more overhead than it saves.

Am I doing something wrong or is this correct?
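One way to check both points is to take the `random.randint` call out of the timed statement and to ask the cache how often it was actually hit. The following is only a sketch under assumptions: the names `checksum_plain`, `checksum_cached`, `packets` and `run` are illustrative, and the inputs are pre-generated 3-tuples like the ones in the benchmark above. `cache_info()` is the standard introspection method that `functools.lru_cache` adds to the wrapped function.

```python
import random
from functools import lru_cache
from timeit import timeit


def checksum_plain(data):
    """XOR the three bytes, invert, and keep the low 8 bits."""
    return (~(data[0] ^ data[1] ^ data[2]) & 0xFF).to_bytes(1, byteorder="big")


@lru_cache(maxsize=128)
def checksum_cached(data):
    """Same calculation, memoized on the (hashable) input tuple."""
    return (~(data[0] ^ data[1] ^ data[2]) & 0xFF).to_bytes(1, byteorder="big")


# Assumed workload: build the inputs once, outside the timed code,
# so the cost of random.randint is not part of the measurement.
random.seed(0)
packets = [(random.randint(0, 127), 0, 0) for _ in range(10_000)]


def run(func):
    """Apply func to every pre-generated packet."""
    for packet in packets:
        func(packet)


plain_seconds = timeit(lambda: run(checksum_plain), number=10)
cached_seconds = timeit(lambda: run(checksum_cached), number=10)
info = checksum_cached.cache_info()

print(f"plain:  {plain_seconds:.3f} s")
print(f"cached: {cached_seconds:.3f} s")
print(info)  # hits, misses, maxsize, currsize
```

If the hit count is high and the cached version is still not faster, the likely explanation is that the checksum itself is cheaper than hashing the tuple and looking it up, which would be consistent with the timings above.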

Naib
  • In general, I recommend generating just one random integer, to be sure this doesn't affect your result. – nox Feb 01 '16 at 10:41
  • Numpy beats Python performance when you can let it do operations with a single function call that would normally require loops. You only pass single-valued arguments to numpy, so it likely loses performance because they need to be converted to arrays first. – MB-F Feb 01 '16 at 10:43
  • Thanks. I used a random value so that the lru_cache would do some work. Setting it to a fixed value yielded faster times (no random call) but a similar difference: raw py wins (1.69), then np+lru (3.98), py+lru (4.06), raw np (8.40). Looks like what I had was already quite good. Thanks – Naib Feb 01 '16 at 10:45
  • @kazemakase is correct - to properly leverage the efficiency of numpy functions you need to apply them to whole arrays rather than individual elements. If `data` were a numpy array then there would be no noticeable performance difference between `np.bitwise_and(data, b)` and `data & b`, since the array's `.__and__` method (which `&` invokes) dispatches to `np.bitwise_and` (the same goes for `^`, `~` etc). – ali_m Feb 01 '16 at 20:24
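
The whole-array approach described in the comments could look roughly like the sketch below. It assumes the packets can be collected into one array with a row per packet and three payload bytes per row; the `packets` array here is randomly generated stand-in data, and the variable names are illustrative rather than taken from the question.

```python
import numpy as np

# Assumed batch layout: one row per packet, three payload bytes per row.
rng = np.random.default_rng(0)
packets = rng.integers(0, 256, size=(210_000, 3), dtype=np.uint8)

# XOR the bytes of each row in a single call, then invert.
# Because the dtype is uint8, the result is already truncated to 8 bits.
checksums = ~np.bitwise_xor.reduce(packets, axis=1)

# The operator form computes the same thing column by column.
checksums_ops = ~(packets[:, 0] ^ packets[:, 1] ^ packets[:, 2])

# Spot-check the first row against the scalar formula from the question.
a, b, c = (int(x) for x in packets[0])
expected_first = ~(a ^ b ^ c) & 0xFF

print(checksums[:5], checksums.dtype)
print(int(checksums[0]) == expected_first)
```

Here the per-call overhead of numpy is paid once for the whole batch instead of once per packet, which is the situation the comments describe as the one where numpy tends to pay off.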
