
I'm taking my first steps with Python 3, so I'm not sure how to solve the following task. I'd like to count how often each bit in a NumPy array changes over time. My array looks like this:

first column: timestamp; second column: ID; third to last column: byte8,...,byte2, byte1, byte0 (8 bit per byte)

[[0.009469 144 '00001001' ... '10011000' '00000000' '00000000']
 [0.01947 144 '00001000' ... '10011000' '00000000' '00000001']
 [0.029468 144 '00001001' ... '10011000' '00000000' '00000011']
 ...
 [0.015825 1428 '11000000' ... '01101101' '00000000' '00000001']
 [0.115823 1428 '11000000' ... '01101100' '00000000' '00000000']
 [0.063492 1680 '01000000' ... '00000000' '00000000' '00000000']]

The task is to count the bit changes for every ID over time. The result should look like this (the timestamp can be ignored):

one row for every ID containing:

first column: ID; second to column #65 (number of changes bit64, number of changes bit63, ... number of changes bit1, number of changes bit0)

So in this short example, the result should be an array with 3 rows (ID 144, ID 1428 and ID 1680) and 65 columns.

Do you know how to achieve this?

D. Scharf
  • Wish I had time to provide a full answer, but make sure you check out `difflib.ndiff` from Python's standard library, and read the table into a Pandas data frame so you can use `groupby` to loop over groups based on ID. That way you can isolate each group as a separate data frame and pass consecutive bit-strings as tuples to `ndiff` to see which bit has changed at every step – D_Serg Dec 09 '18 at 21:42
  • If you have the bit-strings as numbers, you can use XOR and popcount to compute the Hamming distance between any two. – Davis Herring Dec 09 '18 at 21:54
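
The exclusive-or plus population count idea from the last comment can be sketched in plain Python as follows. This is only an illustration: the two values are taken from the first byte of the first two example rows, and the helper name `hamming_distance` is made up here.

```python
def hamming_distance(x: int, y: int) -> int:
    """Number of bit positions in which x and y differ."""
    changed = x ^ y                  # a 1 marks every bit that differs
    return bin(changed).count("1")   # popcount: count the 1 bits


# '00001001' and '00001000' differ only in the lowest bit.
print(hamming_distance(0b00001001, 0b00001000))
```

Note that this gives the total number of changed bits between two values, not a per-bit count, so on its own it does not yet produce the per-column result asked for.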

1 Answer


The first step is definitely to remove the "timestamp" and "ID" columns and to make sure the remaining data is numeric rather than of type string. I don't think you can have a NumPy array that looks like your example (except with a compound dtype, which makes things complicated). For "ID", you should separate the rows of each "ID" into their own array, e.g.:

# column 1 holds the ID, so build the row mask from yourArray[:, 1]
a = yourArray[yourArray[:, 1] == 144]
b = yourArray[yourArray[:, 1] == 1428]
c = yourArray[yourArray[:, 1] == 1680]
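
As a rough sketch of that preparation step, assuming the data arrives as an object array with the timestamp in column 0, the ID in column 1 and the bit-strings after that, the strings can be parsed into `uint8` values before splitting by ID. The sample rows and the names `rows`, `ids` and `payload` are hypothetical and shortened to two bytes per row.

```python
import numpy as np

# Hypothetical rows in the question's layout: timestamp, ID, bit-strings.
rows = np.array([
    [0.009469, 144, '00001001', '00000000'],
    [0.01947, 144, '00001000', '00000001'],
    [0.015825, 1428, '11000000', '00000001'],
], dtype=object)

# Keep the IDs separately as integers.
ids = rows[:, 1].astype(np.int64)

# Parse every '0'/'1' string as a base-2 number and store it as one byte.
payload = np.array([[int(s, 2) for s in r] for r in rows[:, 2:]],
                   dtype=np.uint8)

# Rows belonging to ID 144, now purely numeric.
a = payload[ids == 144]
print(a)
```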

I'm going to make some random data here since I don't have your data:

import numpy as np

# 16 rows of 8 random bytes each, stored as unsigned 8-bit integers
a = np.random.randint(0, 256, (16, 8), dtype=np.uint8)

a should look something like this (the values are random, so yours will differ):

array([[ 46,  74,  78,  41,  46, 173, 188, 157],
       [164, 199, 135, 162, 101, 203,  86, 236],
       [145,  32,  40, 165,  47, 211, 187,   7],
       [ 90,  89,  98,  61, 248, 249, 210, 245],
       [169, 116,  43,   6,  74, 171, 103,  62],
       [168, 214,  13, 173,  71, 195,  69,   8],
       [ 33,   1,  38, 115,   1, 111, 251,  90],
       [233, 232, 247, 118, 111,  83, 180, 163],
       [130,  86, 253, 177, 218, 125, 173, 137],
       [227,   7, 241, 181,  86, 109,  21,  59],
       [ 24, 204,  53,  46, 172, 161, 248, 217],
       [132, 122,  37, 184, 165,  59,  10,  40],
       [ 85, 228,   6, 114, 155, 225, 128,  42],
       [229,   7,  61,  76,  31, 221, 102, 188],
       [127,  51, 185,  70,  17, 138, 179,  57],
       [120, 118, 115, 131, 188,  53,  80, 208]], dtype=uint8)

After that, you can simply run:

np.abs(np.diff(np.unpackbits(a, axis=1).view('b'), axis=0)).sum(axis=0)

to get the number of changes from one row to the next for each bit:

array([ 7,  9,  7,  7,  9, 12, 10,  6,  7,  8,  8,  7,  7,  6,  7,  9,  8,
        7, 11,  9,  8,  7,  5,  7,  7,  9,  6,  9,  8,  7,  9,  7,  6, 10,
        8, 12,  5,  5,  5,  9,  7,  9,  8, 12,  9,  8,  5,  5,  5,  8, 10,
       10,  7,  6,  7,  8,  7,  8,  5,  5, 11,  7,  6,  8])

This is an array of shape (64,) corresponding to ID=144. To get a result of shape (3, 64), stack the three per-ID results like this:

np.array((aResult, bResult, cResult))
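
Putting the steps together, a minimal sketch that loops over all IDs and prepends the ID as the first column might look like this. It assumes the IDs and the numeric bytes are already held in two separate arrays and that the rows of each ID are in timestamp order. The function name `count_bit_changes` and the sample data (two bytes per row instead of eight) are made up for illustration.

```python
import numpy as np


def count_bit_changes(ids, payload):
    """Return one row per ID: the ID followed by the change count per bit."""
    out = []
    for uid in np.unique(ids):
        group = payload[ids == uid]                 # rows of this ID, in order
        bits = np.unpackbits(group, axis=1).astype(np.int8)
        # A bit changed wherever consecutive rows differ (diff is -1 or +1).
        changes = np.abs(np.diff(bits, axis=0)).sum(axis=0)
        out.append(np.concatenate(([uid], changes)))
    return np.array(out)


ids = np.array([144, 144, 144, 1428, 1428, 1680])
payload = np.array([[0b00001001, 0b00000000],
                    [0b00001000, 0b00000001],
                    [0b00001001, 0b00000011],
                    [0b11000000, 0b00000001],
                    [0b11000000, 0b00000000],
                    [0b01000000, 0b00000000]], dtype=np.uint8)

result = count_bit_changes(ids, payload)
print(result.shape)
print(result)
```

`np.unpackbits` emits the most significant bit of each byte first, so the columns after the ID run from the highest bit of the first byte down to the lowest bit of the last byte. An ID with a single row, like 1680 here, simply gets a row of zeros.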
ZisIsNotZis