Well, there is:
>>> b'\\u00C0'.decode('unicode-escape')
'À'
However, the unicode-escape
codec is aimed at a particular format of string encoding, the Python string literal. It may produce unexpected results when faced with other escape sequences that are special in Python, such as \xC0
, \n
, \\
or \U000000C0
and it may not recognise other escape sequences from other string literal formats. It may also handle characters outside the Basic Multilingual Plane incorrectly (eg JSON would encode U+10000 to surrogates\uD800\uDC00
).
So unless your input data really is a Python string literal shorn of its quote delimiters, this isn't the right thing to do and it'll likely produce unwanted results for some of these edge cases. There are lots of formats that use \u
to signal Unicode characters; you should try to find out what format it is exactly, and use a decoder for that scheme. For example if the file is JSON, the right thing to do would be to use a JSON parser instead of trying to deal with \u
/\n
/\\
/etc yourself.