I have some C++ code that manipulates binary data for a proprietary storage system. As part of it, I have this function (variable names changed for clarity):

void get_data(unsigned char* data) {
    data[0] = (~data[0] ^ 0x55);
}

Here, data is read straight from a binary file.

I am porting this to Python. I assumed the logic needed no changes, since Python has the same bitwise operators, so I copied the code directly:

data = bytearray(fd.read(2048))
data[0] = (~data[0] ^ 0x55)

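This raises ValueError: byte must be in range(0, 256). The first byte is 34, and in Python ~34 evaluates to -35, because Python integers are signed and not limited to 8 bits. The XOR with 0x55 then gives -120, which a bytearray will not accept.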
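The failure can be reproduced without the file. This is a minimal sketch that assumes only what is stated above, namely that the first byte is 34; the variable names are illustrative.

```python
# Python ints are signed and arbitrary-precision, so ~ never wraps at 8 bits.
value = 34                      # 0x22, the first byte from the file

inverted = ~value               # -35
result = inverted ^ 0x55        # -120, still negative

# In C++ the expression is also computed as a (negative) int after integer
# promotion; it only lands in 0..255 because storing it into an
# unsigned char truncates it. A bytearray refuses instead of truncating.
buf = bytearray([value])
try:
    buf[0] = result
    error = None
except ValueError as exc:
    error = str(exc)

print(inverted, result, error)
```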

I'm not sure of the proper way to handle this. In C++, an unsigned char can ONLY hold 0-255. Python has no such fixed-width type, so do I somehow treat the value as an unsigned int, or is there a more appropriate way to handle it?

One idea was to take every byte and:

byte_value = byte_value & 0xff
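Applied to the line from the question, the masking idea might look like the sketch below. The names `decode_byte`, `DECODE_TABLE`, and `decode_buffer` are hypothetical, and the whole-buffer variant is an assumption: the question only shows `data[0]` being transformed.

```python
def decode_byte(value: int) -> int:
    """Invert the byte, XOR with 0x55, and keep only the low 8 bits."""
    return (~value ^ 0x55) & 0xFF


# Precompute all 256 results once so a whole buffer can be mapped in one call.
DECODE_TABLE = bytes(decode_byte(b) for b in range(256))


def decode_buffer(data: bytearray) -> bytearray:
    """Return a new bytearray with every byte decoded (assumed use case)."""
    return data.translate(DECODE_TABLE)


data = bytearray([34, 0, 255])
data[0] = decode_byte(data[0])      # same effect as the C++ line, for one byte
print(data[0])                      # 136, i.e. 0x88
```

For 8-bit values, inverting and then XORing with 0x55 is the same as XORing with 0xAA, so `value ^ 0xAA` would avoid the negative intermediate entirely.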

Another person suggested I use the ctypes.c_ubyte type. Then I remembered two things I'd experimented with in the past: struct, to pull the numbers out and store them properly, or the bitstring module, to read and manipulate the data that way.
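For comparison, here is a rough sketch of how two of those standard-library options behave with the negative intermediate value. It relies on `ctypes` integer types doing no overflow checking and on `struct` rejecting out-of-range values for the `"B"` format; `bitstring` is a third-party module and is not covered here.

```python
import ctypes
import struct

raw = ~34 ^ 0x55                        # -120 as a plain Python int

# c_ubyte wraps the value modulo 256 instead of raising.
wrapped = ctypes.c_ubyte(raw).value

# struct, by contrast, refuses a negative number for an unsigned byte.
try:
    struct.pack("B", raw)
    struct_accepts_negative = True
except struct.error:
    struct_accepts_negative = False

data = bytearray([34])
data[0] = ctypes.c_ubyte(~data[0] ^ 0x55).value
print(wrapped, struct_accepts_negative, data[0])
```

So `c_ubyte` gives the same result as masking with `& 0xff`, at the cost of creating a ctypes object per byte, while `struct` would still need the mask applied first.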

Is there a proper way to handle this, either for being 'pythonic' or for having better performance?

UtahJarhead
  • Related reading: [How do I do a bitwise Not operation in Python?](https://stackoverflow.com/q/31151107/953482). The "subtract from 0b11111111" suggestion in the top answer should give you values in 0-255 as desired. – Kevin Dec 03 '18 at 16:46
  • Ohhhh... so `~` is not the same as the `not` operator. I did not know that! – UtahJarhead Dec 03 '18 at 16:50
  • In that case, is it better to use `0b11111111 - `? Is there any performance difference if I just do a numeric subtraction with `255 - `? – UtahJarhead Dec 03 '18 at 16:53
  • `0b11111111 - ` and `255 - ` produce identical bytecode, so they should have the same performance. Personally, I'm curious whether `255 - ` is faster than ` ^ 255`... – Kevin Dec 03 '18 at 17:01
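The alternatives discussed in these comments can be checked against each other for every possible byte value. The sketch below also includes a small timing helper, `time_variants`, which is a hypothetical name; no timings are quoted here because they depend on the interpreter and machine.

```python
import timeit

# For any x in 0..255, these three expressions give the 8-bit inverse of x.
equivalent = all(255 - x == x ^ 255 == ~x & 0xFF for x in range(256))
print(equivalent)  # True


def time_variants(number: int = 100_000) -> dict:
    """Time each expression with timeit; returns seconds per variant."""
    setup = "x = 34"
    statements = ("255 - x", "x ^ 255", "~x & 0xFF")
    return {s: timeit.timeit(s, setup=setup, number=number) for s in statements}
```

Calling `time_variants()` on the target machine would show whether the difference matters in practice.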