Need help adding an escape sequence to every element of a list, for later use as unicode

Question

>>> n
['de', 'db', 'aa', 'dC', 'be', 'Ad', 'Da', 'a7', 'Cb', 'Cc', 'Ed', 'D7', 'CA', 'Da', 'db', 'aa', 'bD', 'db', '7d', 'Ad', 'c4', 'DA', 'Ba', 'bD', 'cc', 'DC', 'da', 'dd', '2d', 'CD', 'bA', 'dA', 'EC', 'Cb', 'dC', 'aC', 'Dd', 'ec', 'CD', 'Ae', 'aC', 'dE', 'BE', 'CE', 'db', 'AC', 'EC', 'cb', 'DE']

I have a list like the one above. I want to add the escape sequence '\x' before each element. I can prepend '\x' to each element, but when I then join the elements to create a unicode string, it does not work.

Please suggest a way to do this.

Show an example of the desired output, and explain what 'its not working' means by including the error message. — DYZ, May 24 '17 at 05:42

score 2 · Accepted Answer · answered May 24 '17 at 05:34

You can and should use binascii.unhexlify.

from binascii import unhexlify

n = ['de', 'db', 'aa', 'dC', 'be', 'Ad', 'Da', 'a7', 'Cb', 'Cc', 'Ed', 'D7',
     'CA', 'Da', 'db', 'aa', 'bD', 'db', '7d', 'Ad', 'c4', 'DA', 'Ba', 'bD',
     'cc', 'DC', 'da', 'dd', '2d', 'CD', 'bA', 'dA', 'EC', 'Cb', 'dC', 'aC',
     'Dd', 'ec', 'CD', 'Ae', 'aC', 'dE', 'BE', 'CE', 'db', 'AC', 'EC', 'cb', 'DE']

print(repr(unhexlify(''.join(n))))

Example usage of binascii.unhexlify: unhexlify('abcdef') # '\xab\xcd\xef' (b'\xab\xcd\xef' on Python 3)

In your case, the hex digits are stored in a list, so concatenate them first using str.join, then pass the resulting string to binascii.unhexlify.
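As a minimal sketch of those two steps, assuming Python 3 (where unhexlify returns bytes) and assuming each byte should map to the code point of the same value, i.e. Latin-1, which the question does not confirm. The shortened list hex_pairs is only an illustrative sample:

```python
from binascii import unhexlify

# Shortened, illustrative sample of the list from the question.
hex_pairs = ['de', 'db', 'aa', 'dC']

# Join the two-digit strings into one hex string, then convert it to bytes.
raw = unhexlify(''.join(hex_pairs))

# Assumption: Latin-1 maps every byte to the code point with the same value.
text = raw.decode('latin-1')

print(repr(raw), repr(text))
```

If the bytes are actually in another encoding (UTF-8, for example), the decode step would need that codec instead.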

score 2 · Answer 2 · answered May 24 '17 at 05:37

You can't do this easily in the string domain. Fortunately, it's simple just using chr and int:

>>> n = ['de', 'db', 'aa', 'dC', 'be', 'Ad', 'Da', 'a7', 'Cb', 'Cc', 'Ed', 'D7', 'CA', 'Da', 'db', 'aa', 'bD', 'db', '7d', 'Ad', 'c4', 'DA', 'Ba', 'bD', 'cc', 'DC', 'da', 'dd', '2d', 'CD', 'bA', 'dA', 'EC', 'Cb', 'dC', 'aC', 'Dd', 'ec', 'CD', 'Ae', 'aC', 'dE', 'BE', 'CE', 'db', 'AC', 'EC', 'cb', 'DE']
>>> [chr(int(k, 16)) for k in n]
['Þ', 'Û', 'ª', 'Ü', '¾', '\xad', 'Ú', '§', 'Ë', 'Ì', 'í', '×', 'Ê', 'Ú', 'Û', 'ª', '½', 'Û', '}', '\xad', 'Ä', 'Ú', 'º', '½', 'Ì', 'Ü', 'Ú', 'Ý', '-', 'Í', 'º', 'Ú', 'ì', 'Ë', 'Ü', '¬', 'Ý', 'ì', 'Í', '®', '¬', 'Þ', '¾', 'Î', 'Û', '¬', 'ì', 'Ë', 'Þ']
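To illustrate why prefixing '\x' at runtime does not help, here is a small sketch assuming Python 3. The names hex_pairs, prefixed, converted and joined are made up for the example. An escape such as '\xde' is only interpreted when Python parses a string literal, so concatenation just produces the four characters backslash, x, d, e:

```python
# Shortened, illustrative sample of the list from the question.
hex_pairs = ['de', 'db', 'aa', 'dC']

# Concatenation builds plain text: backslash, 'x', 'd', 'e' (four characters).
prefixed = '\\x' + hex_pairs[0]

# chr(int(..., 16)) yields the single character with that code point.
converted = chr(int(hex_pairs[0], 16))

# Joining the converted characters gives one string.
joined = ''.join(chr(int(k, 16)) for k in hex_pairs)

print(len(prefixed), len(converted), repr(joined))
```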

Arya McCarthy · Answer 3 · 2017-05-24T05:46:27.987

Providing an alternative to hallazzang's answer: we can use literal_eval to safely turn the string representation of an escaped character into the character itself. The text '\xde' (written as "'\\xde'" in Python source), for instance, evaluates to 'Þ'.

>>> from ast import literal_eval
>>>
>>> [literal_eval("'\\x{}'".format(k)) for k in n]
['Þ', 'Û', 'ª', 'Ü', '¾', '\xad', 'Ú', '§', 'Ë', 'Ì', 'í', '×', 'Ê', 'Ú', 'Û', 'ª', '½', 'Û', '}', '\xad', 'Ä', 'Ú', 'º', '½', 'Ì', 'Ü', 'Ú', 'Ý', '-', 'Í', 'º', 'Ú', 'ì', 'Ë', 'Ü', '¬', 'Ý', 'ì', 'Í', '®', '¬', 'Þ', '¾', 'Î', 'Û', '¬', 'ì', 'Ë', 'Þ']

Note that literal_eval is a safe alternative to eval: eval allows arbitrary code to be run, whereas literal_eval only evaluates literals, things like ints, strings, lists, and dicts.
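A short sketch of that difference, assuming Python 3. The variable names char and rejected are just for illustration. literal_eval accepts a string literal but raises ValueError for an expression that contains a function call:

```python
from ast import literal_eval

# A string literal containing an escape is accepted and evaluated.
char = literal_eval("'\\xde'")

# An expression that is not a literal, such as a function call, is rejected.
try:
    literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True

print(repr(char), rejected)
```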
