I've seen a lot of posts about this subject, but I haven't found the solution I'm looking for (despite a lot of attempts).

In a nutshell: I'm migrating one of my libraries from Python 2 to Python 3. Its main data model is based on hex strings like '\xaa\xbb\xcc' (string length = 3). I've run into the (by now) well-known issue with binascii.a2b_hex('aabbcc'), which gives '\xaa\xbb\xcc' in Python 2 and b'\xaa\xbb\xcc' in Python 3. Since the whole library is based on this data model, including the external libraries that use it, reviewing the code line by line to migrate it to the bytes data model would take a lot of time.

So I'm looking for a Python 3 function that does the translation, i.e. b'\xaa\xbb\xcc' -> '\xaa\xbb\xcc' (string length = 3) or 'aabbcc' -> '\xaa\xbb\xcc' (string length = 3). Many thanks in advance!

I have tried hex(), binascii.hexlify() and format(), without success.

Robby

3 Answers


You can use the latin-1 codec, which maps each byte to the Unicode character with the same value, one for one.

x = b'\xaa\xbb\xcc'
x = x.decode('latin-1')
Mark Ransom
  • This looks good: x = 'ª»Ì'. What remains is to provide the hex representation (with len=3). – Robby Aug 17 '23 at 08:48
  • @Robby that's a different question. You want to convert your 3-character string to a 12-character string as you print it. I'm sure there's an easy way but I don't know what it is. – Mark Ransom Aug 17 '23 at 12:18
  • Indeed, .decode()/.encode() with the 'latin-1' codec does the magic for the bytes <-> string (same length) conversion. For pretty printing and input, it is better to avoid binascii.a2b_hex()/b2a_hex() altogether and use bytes.fromhex()/hex(), i.e. the b'\xaa\xbb\xcc' <-> 'aabbcc' conversion. Thank you! – Robby Aug 17 '23 at 16:37
  • @Robby I didn't think `hex()` would work for you since it didn't include the `\x` parts, I'm glad you found something satisfactory. – Mark Ransom Aug 17 '23 at 16:40
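
The combination described in the comments above (latin-1 for the bytes <-> str step, bytes.fromhex() and hex() for the 'aabbcc' form) can be sketched as two small helpers. The names hex_to_str and str_to_hex are made up for this illustration and are not part of any library; this is a minimal sketch assuming every character in the string has a code point of 0xFF or below.

```python
def hex_to_str(hex_text):
    # 'aabbcc' -> 3-character str with code points 0xAA, 0xBB, 0xCC
    return bytes.fromhex(hex_text).decode('latin-1')


def str_to_hex(text):
    # Inverse direction; raises UnicodeEncodeError if any
    # character has a code point above 0xFF.
    return text.encode('latin-1').hex()


s = hex_to_str('aabbcc')
print(len(s))                                  # 3
print(s == '\xaa\xbb\xcc')                     # True
print(s.encode('latin-1') == b'\xaa\xbb\xcc')  # True
print(str_to_hex(s))                           # aabbcc
```

Because latin-1 maps byte values 0 to 255 onto the same code points, the round trip is lossless in both directions for this kind of data.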

Here's a function that should achieve what you're looking for:

def bytes_to_escaped_hex(input_data):
    if isinstance(input_data, bytes):
        hex_string = ''.join([f'\\x{byte:02x}' for byte in input_data])
        return hex_string
    elif isinstance(input_data, str):
        # Remove '0x' prefix if present
        hex_string = input_data.replace('0x', '')

        if len(hex_string) % 2 != 0:
            hex_string = '0' + hex_string

        bytes_object = bytes.fromhex(hex_string)
        escaped_hex_string = ''.join([f'\\x{byte:02x}' for byte in bytes_object])
        return escaped_hex_string
    else:
        raise ValueError("Input must be bytes or a hex string")

# Examples
bytes_data = b'\xaa\xbb\xcc'
hex_string = 'aabbcc'

escaped_hex1 = bytes_to_escaped_hex(bytes_data)
escaped_hex2 = bytes_to_escaped_hex(hex_string)

print(escaped_hex1)  # Output: \xaa\xbb\xcc (12 characters of literal text)
print(escaped_hex2)  # Output: \xaa\xbb\xcc (12 characters of literal text)
Shekhar
  • What do `len(escaped_hex1)` and `len(escaped_hex2)` give you? If I read the question correctly, the desired result is 3. – Mark Ransom Aug 16 '23 at 18:20
  • Thank you Shekhar, but the resulting string doesn't have the right format: len(escaped_hex1) is 12 and len(escaped_hex2) is 12, while in this example I'm looking for a 3-character string. – Robby Aug 17 '23 at 08:30
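
The length difference raised in these comments can be checked directly. The snippet below is only an illustration with made-up variable names: it compares the 12-character escaped text with the 3-character string the question asks for, and uses the built-in ascii() as one possible way to display the short string with \x escapes.

```python
escaped = '\\xaa\\xbb\\xcc'  # literal backslash, 'x' and two hex digits, three times
actual = '\xaa\xbb\xcc'      # three characters: code points 0xAA, 0xBB, 0xCC

print(len(escaped))                   # 12
print(len(actual))                    # 3
print([hex(ord(c)) for c in actual])  # ['0xaa', '0xbb', '0xcc']

# ascii() escapes non-ASCII characters for display only;
# the string itself still has length 3.
print(ascii(actual))                  # '\xaa\xbb\xcc'
```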

Here's an alternative solution that uses the re module to achieve the same result:

import re

def bytes_to_escaped_hex(input_data):
    if isinstance(input_data, bytes):
        hex_string = ''.join([f'\\x{byte:02x}' for byte in input_data])
        return hex_string
    elif isinstance(input_data, str):
        hex_string = re.sub(r'(..)', r'\\x\1', input_data)
        return hex_string
    else:
        raise ValueError("Input must be bytes or hex string without '0x' prefix")

# Examples
bytes_data = b'\xaa\xbb\xcc'
hex_string = 'aabbcc'

escaped_hex1 = bytes_to_escaped_hex(bytes_data)
escaped_hex2 = bytes_to_escaped_hex(hex_string)

print(escaped_hex1)  # Output: \xaa\xbb\xcc (12 characters of literal text)
print(escaped_hex2)  # Output: \xaa\xbb\xcc (12 characters of literal text)
Shekhar