42

I need to escape an & (ampersand) character in a string. Whenever I run string = string.replace('&', '\&'), the result is '\\&': an extra backslash appears to be added to escape the original backslash. How do I remove this extra backslash?

wjandrea
Dr. Johnson
  • Knowing zilch about python: string = string.replace('&', '&') ... maybe the replace method will escape that ampersand for you... heh – iandisme Feb 01 '10 at 19:43
  • If you still use SO, please mark a solution! – erikbstack Aug 07 '12 at 08:51
  • @Veedrac: How is this 4 year old question marked as a duplicate of a question asked 6 days ago? – User Jun 12 '14 at 15:46
  • @User Because the dupe has an accepted answer which is arguably more descriptive and explains the problem better than this one – Bojangles Jun 12 '14 at 16:08
  • It was decided on [chat] that the new question should be made canonical. Thus everything *else* is officially a dupe of it. Any disagreements should go to [Meta] for discussion. – Veedrac Jun 12 '14 at 16:09
  • I agree that the other question has a better title and a better answer, and in principle I prefer to favor the better question/answer; just went against how I understood stackoverflow to work. I know I have asked "duplicates" that were better expressed and had better answers than the originals. – User Jun 12 '14 at 17:17
  • This is not duplicate at all – Trect Nov 29 '19 at 13:24
  • Nothing here actually answers the question of adding backslashes without escaping. In C# I can write @myfile and it is a literal, whether a variable or a fixed string. Python seems to have no such operator. r does not work with myfilename, which becomes rmyfilename. – Ken Apr 03 '20 at 04:02

6 Answers

71

The '\\&' is only how the result is displayed; the actual string is \&:

>>> s = '&'
>>> new_s = s.replace('&', '\&')
>>> new_s
'\\&'
>>> print(new_s)
\&

Try it in a shell.
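
One way to confirm this without relying on how the shell displays the value is to inspect the string's length and characters directly. A minimal sketch, assuming Python 3; the variable names are illustrative, and the replacement is written as r'\&' so the literal itself is unambiguous:

```python
# Replace each ampersand with a backslash followed by an ampersand.
original = 'a & b'
escaped = original.replace('&', r'\&')

# The result grew by exactly one character: the single added backslash.
print(len(original), len(escaped))   # 5 6
# Listing the characters shows one backslash (repr displays it as '\\').
print(list(escaped))
# print() writes the string itself, not its repr.
print(escaped)                       # a \& b
```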

Emil Ivanov
29

No extra backslash is actually added; the repr() function just displays it doubled to indicate that it's a literal backslash. The interactive Python interpreter uses repr() (which calls __repr__() on the object) when it displays the result of an expression:

>>> '\\'
'\\'
>>> print('\\')
\
>>> print('\\'.__repr__())
'\\'
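
The difference between the string and its repr can also be measured. The following is only a sketch, assuming Python 3, with illustrative variable names:

```python
backslash = '\\'          # one character: a single backslash

shown = repr(backslash)   # the text the interactive prompt would display
print(len(backslash))     # 1
print(len(shown))         # 4: quote, backslash, backslash, quote
print(shown)              # the four-character display form
print(str(backslash))     # the single backslash itself
```

The doubled backslash exists only in the text produced by repr(), which is written so that it could be pasted back into source code.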
Thomas
24

Python treats \ in a string literal in a special way.
This is so you can type '\n' to mean newline or '\t' to mean tab.
Since '\&' doesn't mean anything special to Python, instead of causing an error, the Python lexical analyser keeps the backslash as a literal character, as if you had written '\\&'.

Really, it is better to write '\\&' or r'\&' instead of '\&'.

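The r here means raw string: \ isn't treated specially, except that a backslash right before a quote character keeps that quote from ending the string (the backslash itself stays in the string). As a consequence, a raw string cannot end in a single backslash.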
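
A short sketch of how raw literals behave, assuming Python 3. The '<example>' filename passed to compile() and the flag name are only illustrative:

```python
# A raw literal and a doubled-backslash literal produce the same string.
assert r'\&' == '\\&'
print(len(r'\&'))              # 2

# In a raw literal, \n stays as two characters instead of a newline.
print(len(r'\n'), len('\n'))   # 2 1

# A backslash before a quote keeps the quote from ending the literal,
# but the backslash itself stays in the string.
print(len(r'\''))              # 2

# A raw literal cannot end in a single backslash: this source text
# (r'abc\' followed by nothing) fails to compile.
try:
    compile("r'abc\\'", '<example>', 'eval')
    ends_in_backslash_ok = True
except SyntaxError:
    ends_in_backslash_ok = False
print(ends_in_backslash_ok)    # False
```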

In the interactive console, Python uses repr to display the result, which is why you see the doubled '\'. If you print your string or check len(string), you will see that it really is only 2 characters.

Some examples

>>> 'Here\'s a backslash: \\'
"Here's a backslash: \\"
>>> print('Here\'s a backslash: \\')
Here's a backslash: \
>>> 'Here\'s a backslash: \\. Here\'s a double quote: ".'
'Here\'s a backslash: \\. Here\'s a double quote: ".'
>>> print('Here\'s a backslash: \\. Here\'s a double quote: ".')
Here's a backslash: \. Here's a double quote: ".

To clarify the point Peter makes in his comment, see this passage from the Python language reference:

Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the string. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.) It is also important to note that the escape sequences marked as “(Unicode only)” in the table above fall into the category of unrecognized escapes for non-Unicode string literals.
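
The behavior described in that passage can be checked directly. This is a hedged sketch assuming a current Python 3, where unrecognized escapes are still accepted but trigger a warning. The literal is therefore evaluated from its source text with warnings silenced:

```python
import warnings

# The source text of the literal: quote, backslash, ampersand, quote.
source = r"'\&'"

with warnings.catch_warnings():
    # Recent Python 3 versions warn about unrecognized escapes; silence that here.
    warnings.simplefilter('ignore')
    value = eval(source)

print(len(value))        # 2
print(value == '\\&')    # True
```

Because of that warning, writing '\\&' or r'\&' is the safer spelling in new code.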

John La Rooy
  • Part of this isn't correct. Python does *not* "implicitly add the extra `\` for you". It does, however, double the backslash when you display the repr() output of the string, as at prompt, purely for presentation purposes. len("\&") is only 2, proving there is no implicit munging of your data (thank heavens!). – Peter Hansen Feb 01 '10 at 22:10
  • @PeterHansen I think the OP was pointing out that backslashes should normally be escaped in non-raw strings, so it would normally be written doubled. – Michael Mior Oct 28 '15 at 19:45
9
>>> '\\&' == '\&'
True
>>> len('\\&')
2
>>> print('\\&')
\&

Or in other words: '\\&' contains only one backslash. It's just escaped in the Python shell's output for clarity.

sepp2k
6

Printing a list can also cause this confusion (I'm new to Python, so it confused me a bit too):

>>> myList = ['\\']
>>> print(myList)
['\\']
>>> print(''.join(myList))
\

Similarly:

>>> myList = ['\&']
>>> print(myList)
['\\&']
>>> print(''.join(myList))
\&
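
The reason is that a list's printed form is built from the repr() of each element. A small sketch, assuming Python 3, with illustrative names:

```python
my_list = ['\\', r'\&']

# str() of a list uses repr() for each element, so every backslash
# appears doubled in the list's printed form.
print(str(my_list))

# Joining yields a plain string, which print() writes as-is.
joined = ''.join(my_list)
print(joined)
print(len(joined))   # 3: backslash, backslash, ampersand
```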
Kicsi
5

There is no extra backslash; it's just displayed that way in the interactive environment. Try:

print(string)

Then you can see that there really is no extra backslash.

Mark Byers