Count changing bits in numpy array

Question

I'm taking my first steps with Python 3, so I'm not sure how to solve the following task. I'd like to count how often each bit in a numpy array changes over time. My array looks like this:

first column: timestamp; second column: ID; third to last column: byte8,...,byte2, byte1, byte0 (8 bit per byte)

[[0.009469 144 '00001001' ... '10011000' '00000000' '00000000']
 [0.01947 144 '00001000' ... '10011000' '00000000' '00000001']
 [0.029468 144 '00001001' ... '10011000' '00000000' '00000011']
 ...
 [0.015825 1428 '11000000' ... '01101101' '00000000' '00000001']
 [0.115823 1428 '11000000' ... '01101100' '00000000' '00000000']
 [0.063492 1680 '01000000' ... '00000000' '00000000' '00000000']]

The task is to count the bit changes for every ID over the time. The result should look like this (timestamp could be ignored):

one row for every ID containing:

first column: ID; second to column #65 (number of changes bit64, number of changes bit63, ... number of changes bit1, number of changes bit0)

So in this short example, there should be a result array with 3 rows (ID144, ID1428 and ID1680) and 65 columns.

Do you know how to achieve this?

Wish I had time to provide a full answer, but make sure you check out `difflib.ndiff` in Python's standard library, and read the table into a Pandas data frame so you can use `groupby` to loop over groups based on ID. That way you can isolate each group as a separate dataframe and pass consecutive bit-strings in tuples to `ndiff` to see which bit has changed at every step — D_Serg, Dec 09 '18 at 21:42
If you have the bit-strings as numbers, you can use XOR and popcount to compute the Hamming distance between any two. — Davis Herring, Dec 09 '18 at 21:54

score 0 · Accepted Answer · answered Dec 10 '18 at 09:00

The first step is definitely to remove the "timestamp" and "ID" columns and to make sure the remaining data is not of type string. I don't think you can have a numpy array that looks like your example (except with a compound dtype, which makes things complicated). For "ID", you should separate the rows for each "ID" into different arrays, e.g.:

a = yourArray[yourArray[:, 1] == 144]   # column 1 holds the ID
b = yourArray[yourArray[:, 1] == 1428]
c = yourArray[yourArray[:, 1] == 1680]
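As a minimal sketch of the "not of type string" step: assuming the byte columns really are bit-strings such as '00001001', they can be parsed into a `uint8` array before splitting by ID. The `rows` list below is a hypothetical, shortened stand-in for the table in the question (only two byte columns), and the names `ids` and `data` are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for the question's table:
# timestamp, ID, then the byte columns as bit-strings.
rows = [
    [0.009469, 144, "00001001", "00000000"],
    [0.01947, 144, "00001000", "00000001"],
    [0.015825, 1428, "11000000", "00000001"],
]

# Keep the IDs separately and drop the timestamp.
ids = np.array([r[1] for r in rows])

# int(s, 2) parses a bit-string as a base-2 number; store the bytes as uint8.
data = np.array([[int(s, 2) for s in r[2:]] for r in rows], dtype=np.uint8)

# Select the rows that belong to one ID with a boolean mask.
a = data[ids == 144]
print(a)
```

With the data held as `uint8`, the per-ID arrays can be passed straight to `np.unpackbits`.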

I'm going to make some random data here since I don't have your data:

import numpy as np
a = np.random.randint(0, 256, (16, 8), 'B')  # 16 rows of 8 bytes, dtype uint8

a should look like:

array([[ 46,  74,  78,  41,  46, 173, 188, 157],
       [164, 199, 135, 162, 101, 203,  86, 236],
       [145,  32,  40, 165,  47, 211, 187,   7],
       [ 90,  89,  98,  61, 248, 249, 210, 245],
       [169, 116,  43,   6,  74, 171, 103,  62],
       [168, 214,  13, 173,  71, 195,  69,   8],
       [ 33,   1,  38, 115,   1, 111, 251,  90],
       [233, 232, 247, 118, 111,  83, 180, 163],
       [130,  86, 253, 177, 218, 125, 173, 137],
       [227,   7, 241, 181,  86, 109,  21,  59],
       [ 24, 204,  53,  46, 172, 161, 248, 217],
       [132, 122,  37, 184, 165,  59,  10,  40],
       [ 85, 228,   6, 114, 155, 225, 128,  42],
       [229,   7,  61,  76,  31, 221, 102, 188],
       [127,  51, 185,  70,  17, 138, 179,  57],
       [120, 118, 115, 131, 188,  53,  80, 208]], dtype=uint8)

After that, you can simply:

abs(np.diff(np.unpackbits(a, 1).view('b'), axis=0)).sum(0)

to get the number of changes in row direction corresponding to each bit:

array([ 7,  9,  7,  7,  9, 12, 10,  6,  7,  8,  8,  7,  7,  6,  7,  9,  8,
        7, 11,  9,  8,  7,  5,  7,  7,  9,  6,  9,  8,  7,  9,  7,  6, 10,
        8, 12,  5,  5,  5,  9,  7,  9,  8, 12,  9,  8,  5,  5,  5,  8, 10,
       10,  7,  6,  7,  8,  7,  8,  5,  5, 11,  7,  6,  8])
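To see what the one-liner does, here is a small sketch on a hand-made array with a single byte per row (the values are invented for illustration). `np.unpackbits` expands each byte into eight 0/1 values with the most significant bit first, `view('b')` reinterprets them as signed `int8` so that a 1 to 0 transition becomes -1 rather than wrapping around to 255, and `abs(...).sum(0)` counts the transitions per bit. The XOR variant mentioned in the comments should give the same counts.

```python
import numpy as np

# Three consecutive samples of one byte: 00000001 -> 00000011 -> 00000010
a = np.array([[0b00000001], [0b00000011], [0b00000010]], dtype=np.uint8)

# Expand every byte into 8 bits (most significant bit first): shape (3, 8)
bits = np.unpackbits(a, axis=1)

# Difference between consecutive rows is -1, 0 or +1 per bit; count the non-zeros.
changes = abs(np.diff(bits.view('b'), axis=0)).sum(0)

# Alternative: XOR consecutive rows, then count the set bits per position.
xor_changes = np.unpackbits(a[1:] ^ a[:-1], axis=1).sum(0)

print(changes)      # bit 1 flips once, bit 0 flips once
print(xor_changes)
```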

This is a shape (64,) array corresponding to ID=144. To make the result (3, 64), concat three results like:

np.array((aResult, bResult, cResult))
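If there are many IDs, the per-ID steps can be wrapped in a loop instead of being written out by hand. The following is only a sketch under some assumptions: `count_bit_changes`, `ids` and `data` are hypothetical names, `data` is a `uint8` array with one row per message, and the rows of each ID are already in time order. It puts the ID in the first column, as asked for in the question.

```python
import numpy as np

def count_bit_changes(ids, data):
    """Return one row per ID: the ID followed by the change count of every bit."""
    unique_ids = np.unique(ids)  # sorted unique IDs
    n_bits = 8 * data.shape[1]
    result = np.zeros((len(unique_ids), 1 + n_bits), dtype=np.int64)
    for row, id_ in enumerate(unique_ids):
        group = data[ids == id_]  # all rows of this ID, in original order
        bits = np.unpackbits(group, axis=1).view('b')
        result[row, 0] = id_
        # An ID with a single row has no consecutive pairs, so its counts stay 0.
        result[row, 1:] = abs(np.diff(bits, axis=0)).sum(0)
    return result

# Invented example data with two bytes per row.
ids = np.array([144, 144, 144, 1428, 1428, 1680])
data = np.array([[9, 0], [8, 1], [9, 3], [192, 1], [192, 0], [64, 0]], dtype=np.uint8)
result = count_bit_changes(ids, data)
print(result)
```

For the 8-byte rows in the question, this would produce the requested shape of one row per ID and 65 columns.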

Thank you very much for your response, that helps a lot ;) — D. Scharf, Dec 10 '18 at 21:18
