-1

This link lists out some python specific encodings.

One of the encoding is "unicode_escape".

I am just trying to figure out, is this special encoding really needed?

>>> l = r'C:\Users\userx\toot'
>>> l
'C:\\Users\\userx\\toot'
>>> l.encode('unicode_escape').decode()
'C:\\\\Users\\\\userx\\\\toot'

If you could see above, 'l' which is a unicode object, has already taken care of escaping the backslashes. Converting it to "unicode_escape" encoding adds one more set of escaped backslashes which doesn't make any sense to me.

Questions:

  1. Is "unicode_escape" encoding really needed?
  2. why did "unicode_escape" added one more sets of backslashes above?
khelwood
  • 55,782
  • 14
  • 81
  • 108
  • 1
    If you `print(l)` you'll see that in the actual content of the string, the backslashes are not escaped. The `repr` version of a string escapes backslashes in order to show it unambiguously to you, the developer. – khelwood Nov 18 '18 at 15:07
  • 1
    Why do you think `unicode_escape()` *is* or *would be* needed for your use case? Just because something exists doesn't mean it's meaningful or relevant to you. – Charles Duffy Nov 18 '18 at 15:15
  • @CharlesDuffy i am writing a python script which takes a windows style path as an argument. As such the script is working without any issues. i just wanted to make sure are there any cases where in i have to convert the arg input using "unicode_escape"? – karthiksatyanarayana Nov 18 '18 at 15:18
  • The purpose of `unicode_escape()` is to generate content that can be substituted directly into a plain-ASCII Python source file. You aren't doing that, so you don't need it. – Charles Duffy Nov 18 '18 at 15:19
  • @CharlesDuffy Could you give me an example where one has to "generate content that can be substituted directly into a plain-ASCII Python source file"? – karthiksatyanarayana Nov 18 '18 at 15:27
  • Hmm. I could potentially see needing to do that if you were writing a Python decompiler (though the existing ones are just as likely to implement that functionality directly themselves as to use what the library provides). I've also seen code-generation used for template engine generation, though more recent/modern examples build AST trees, not textual source. – Charles Duffy Nov 18 '18 at 15:40
  • Frankly, though, "how can X be used?" is a completely open-ended question, and we don't allow those here. Questions need to be specific, and based on an actual problem that you face. – Charles Duffy Nov 18 '18 at 15:42
  • @CharlesDuffy Got it! Thanks for the example though. – karthiksatyanarayana Nov 18 '18 at 15:50

1 Answers1

1

Quoting the document you linked:

Encoding suitable as the contents of a Unicode literal in ASCII-encoded Python source code, except that quotes are not escaped. Decodes from Latin-1 source code. Beware that Python source code actually uses UTF-8 by default.

Thus, print(l.encode('unicode_escape').decode()) does something almost exactly equivalent to print(repr(l)), except that it doesn't add quotes on the outside and escape quotes on the inside of the string.

When you leave off the print(), the REPL does a default repr(), so you get backslashes escaped twice -- exactly the same thing as happens when you run >>> repr(l).

Charles Duffy
  • 280,126
  • 43
  • 390
  • 441