I'm not entirely sure from the question what you want, so I'll cover both cases I can see.
Case 1: You just want to output the Arabic string from your code, using the Unicode literal syntax. In this case, prefix your string literal with a u and you'll be right as rain:
s = u"\u063a\u064a\u0646\u064a\u0627"
print(s)
This does the same as
print u'%s' % s
except shorter. Formatting s into a format string that contains nothing but %s doesn't make any sense here, because it doesn't change anything. In other words, u'%s' % s == s.
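A quick way to convince yourself of that identity is to check it directly. This is only a minimal sketch: the name formatted is illustrative, and the code uses print as a function so that it runs on both Python 2.7 and Python 3.3+ (where the u prefix is accepted).

```python
# The u prefix makes Python interpret the \uXXXX escapes while parsing the literal.
s = u"\u063a\u064a\u0646\u064a\u0627"

# Formatting s into a template that is nothing but %s returns an equal string.
formatted = u"%s" % s

print(formatted == s)  # True
print(len(s))          # 5: one code point per \uXXXX escape
```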
Case 2: You have an escaped string from some other source that you want to evaluate as a Unicode string. This is kind of what it looks like you're trying to do with print u'%s' %. This can be done with
import ast
s = r"\u063a\u064a\u0646\u064a\u0627"
print ast.literal_eval("u'{}'".format(s))
Note that unlike eval, this is safe, as literal_eval only evaluates literals (strings, numbers, tuples, lists, dicts, booleans and None) and doesn't allow anything like a function call. Also note that s here is an r-prefixed string, so the backslashes aren't escaping anything but are literally backslash characters.
Both pieces of code correctly output
غينيا
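To make the safety claim for Case 2 concrete, here is a minimal sketch. The helper name unescape_unicode is hypothetical, not something provided by ast. It assumes the escaped text contains no quote characters, since a stray ' would end the literal early and make literal_eval raise an error instead of returning a string.

```python
import ast

def unescape_unicode(escaped):
    # Hypothetical helper: wrap the escaped text in a u'...' literal and let
    # literal_eval parse it, which interprets the \uXXXX escapes.
    return ast.literal_eval(u"u'{}'".format(escaped))

raw = r"\u063a\u064a\u0646\u064a\u0627"   # literal backslashes, no escapes yet
decoded = unescape_unicode(raw)
print(decoded == u"\u063a\u064a\u0646\u064a\u0627")  # True

# literal_eval only accepts literals, so a function call is rejected
# rather than executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    call_rejected = False
except ValueError:
    call_rejected = True
print(call_rejected)  # True
```

With plain eval, the second expression would actually have been run, which is why literal_eval is the safer choice for text that comes from somewhere else.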
Some elaboration on print u'%s' % s for Case 2. This behaves differently, because if the string has already been escaped, the formatting won't evaluate it like a Unicode literal. Python only turns \uXXXX sequences into characters when it parses a Unicode literal in source code (such as the s in Case 1). Once the backslashes are literal characters in a string, normal string operations leave them alone, so you have to use literal_eval to evaluate the text again in order to print the string properly.
to evaluate it again in order to properly print the string. When you run
print u'%s' % s
the output is
\u063a\u064a\u0646\u064a\u0627
Note that this isn't a representation of a Unicode object with non-ASCII content: the text itself is literally ASCII, made up of backslashes, the letter u and hex digits.
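The difference between the two values of s is easy to see by comparing lengths. A minimal sketch, with the illustrative variable names escaped and literal standing in for the s of Case 2 and Case 1:

```python
escaped = r"\u063a\u064a\u0646\u064a\u0627"   # Case 2: backslashes kept as-is
literal = u"\u063a\u064a\u0646\u064a\u0627"   # Case 1: escapes interpreted

print(len(escaped))  # 30: five groups of backslash, u, and four hex digits
print(len(literal))  # 5: five Arabic code points

# String formatting copies the characters through unchanged; it never
# re-interprets the backslash sequences.
print(u"%s" % escaped == escaped)  # True
print(u"%s" % escaped == literal)  # False
```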