Given a byte string, for instance B = b"\x81\xc9\x00\x07I ABCD_\xe2\x86\x97_", I want to convert it to a valid, printable string that keeps as much of the UTF-8 content as possible: S = "\\x81\\xc9\\x00\\x07I ABCD_↗_"
Note that the first group of hex-escaped bytes is not valid printable UTF-8 (\x81 and \xc9 are invalid here, while \x00 and \x07 are control characters), but the last three hex-escaped bytes do define a valid UTF-8 character (the arrow). It seems like this should be part of codecs, but I cannot figure out how to make this happen.
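To make the distinction concrete, here is a small check of which pieces of B decode as UTF-8. The helper name is_valid_utf8 is made up for this illustration; it only wraps bytes.decode.

```python
B = b"\x81\xc9\x00\x07I ABCD_\xe2\x86\x97_"


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if `data` decodes as UTF-8 without errors."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The three bytes before the final underscore form one character, U+2197.
arrow = B[-4:-1].decode("utf-8")

# \x81 is a lone continuation byte and \xc9 starts a sequence that \x00 cannot continue.
leading_ok = [is_valid_utf8(B[:1]), is_valid_utf8(B[1:3])]

# \x00 and \x07 decode fine, but the result is not printable.
controls_printable = B[2:4].decode("utf-8").isprintable()
```

So there are really two kinds of bytes to escape: those that do not decode at all, and those that decode to non-printable characters.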
For instance, this attempt:
>>> codecs.decode(codecs.escape_encode(B, 'utf-8')[0], 'utf-8')
'\\x81\\xc9\\x00\\x07I\\x19ABCD_\\xe2\\x86\\x97_'
This escapes the valid UTF-8 character (the arrow) along with the invalid bytes.
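One direction that appears to get closer is sketched below. It assumes Python 3.5 or later, where the "backslashreplace" error handler can be used when decoding, followed by a second pass that escapes whatever str.isprintable rejects. The helper name escape_unprintable is made up for this sketch and is not part of codecs.

```python
B = b"\x81\xc9\x00\x07I ABCD_\xe2\x86\x97_"

# Step 1: decode as UTF-8; bytes that cannot be decoded become literal \xNN text.
# Valid sequences such as the arrow are decoded normally.
partial = B.decode("utf-8", errors="backslashreplace")


def escape_unprintable(text: str) -> str:
    """Hypothetical helper: escape characters that str.isprintable() rejects."""
    return "".join(
        ch if ch.isprintable() else ch.encode("unicode_escape").decode("ascii")
        for ch in text
    )


# Step 2: control characters such as \x00 and \x07 survive step 1, so escape them here.
S = escape_unprintable(partial)
print(S)
```

One caveat with this sketch: a literal backslash already present in the input is left alone, so the result cannot be unambiguously converted back to the original bytes.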