Remove \n from python string

Question

I have scraped a webpage using Beautiful Soup. I'm trying to get rid of a '\n' character, which isn't eliminated no matter what I try.

My effort so far:

wr=str(loc[i-1]).strip()
wr=wr.replace(r"\[|'u|\\n","")
print(wr)

Output:

    [u'\nWong; Voon Hon (Singapore, SG
Kandasamy; Ravi (Singapore, SG
Narasimalu; Srikanth (Singapore, SG
Larsen; Gerner (Hinnerup, DK
Abeyasekera; Tusitha (Aarhus N, DK

How do I eliminate the [u'\n? What am I doing wrong?

The full code is here.

You have a single quote before `\n` and after `u` in the list – thefourtheye, Sep 04 '16 at 17:04
I tried; that didn't work. Please see the updated code link in the question. – FlyingAura, Sep 04 '16 at 17:30

mpurg · Accepted Answer · 2016-09-04T17:36:40.360

1

You need to escape the backslash (write it twice, "\\n") so that you replace the literal characters \ and n rather than an actual newline:

rep=["[","u'","\\n"]
for r in rep:
    wr=wr.replace(r,"")

This is the same as @cricket_007's answer; however, the second part of his answer does not work for me. To my knowledge, str.replace() does not support this kind of regular expression lookup.
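For reference, here is a minimal, self-contained sketch of the difference. The sample value of `wr` is an assumption copied from the first line of the question's output (hard-coded so it also runs on Python 3, where the `u` prefix would not normally appear); in the real script it comes from `str(loc[i-1])`.

```python
# Hypothetical stand-in for str(loc[i-1]).strip(); note the text contains
# a literal backslash followed by "n", not an actual newline character.
wr = "[u'\\nWong; Voon Hon (Singapore, SG"

# str.replace() searches for this exact text, pipes and backslashes included,
# so nothing matches and the string comes back unchanged.
unchanged = wr.replace(r"\[|'u|\\n", "")

# Replacing each literal token in turn does work.
cleaned = wr
for token in ["[", "u'", "\\n"]:
    cleaned = cleaned.replace(token, "")

print(unchanged == wr)  # True
print(cleaned)          # Wong; Voon Hon (Singapore, SG
```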

edited Sep 04 '16 at 17:36

answered Sep 04 '16 at 17:30


That works! Thank you :) So we add extra \ because \n is a special character, right? – FlyingAura Sep 04 '16 at 17:40
Correct. Also, as @cricket_007 pointed out, you could also use a "raw string" representation: r"\n" – mpurg Sep 04 '16 at 17:47
You make a good point. I was thinking of `replace` of the `re` module – OneCricketeer Sep 04 '16 at 19:26

OneCricketeer · Answer 2 · 2016-09-04T19:20:24.120

0

You need to escape the backslash or use a raw string. Otherwise, it's a newline character, not a literal \n.

Also, I don't think Beautiful Soup is outputting unicode strings. You see the string representation in Python as u'blah'.
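If the `[u'...` wrapper is only the list's string representation, one way to sidestep the cleanup might be to work on the list's elements instead of calling `str()` on the whole list. This is a sketch that assumes `loc[i-1]` is a list of strings; the `row` variable and its contents are hypothetical stand-ins for it.

```python
# Hypothetical stand-in for loc[i-1]: a list holding one scraped string
# that starts with a real newline character.
row = ["\nWong; Voon Hon (Singapore, SG"]

# str.strip() removes leading and trailing whitespace, including newlines,
# from each element, so no "[", "u'" or "\n" text is ever produced.
names = [text.strip() for text in row]

for name in names:
    print(name)
```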

And you shouldn't need a list of elements to remove. The expression can be

r"\[|'u|\n"
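A single alternation like this is a regular expression, so it would have to go through the `re` module rather than `str.replace()`. Below is a hedged sketch with `re.sub`, assuming `wr` holds the text from the question's output (the sample value is hard-coded here). The pattern is adjusted to what that text actually contains: the quote comes after the `u`, and a literal backslash needs `\\` even in a raw string, because `\n` in a regex matches a real newline.

```python
import re

# Hypothetical stand-in for str(loc[i-1]).strip(): the text contains a
# literal backslash followed by "n".
wr = "[u'\\nWong; Voon Hon (Singapore, SG"

# \[   matches the opening bracket
# u'   matches the unicode prefix and its quote
# \\n  matches a literal backslash followed by "n"
pattern = r"\[|u'|\\n"

cleaned_re = re.sub(pattern, "", wr)
print(cleaned_re)  # Wong; Voon Hon (Singapore, SG
```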

edited Sep 04 '16 at 19:20

answered Sep 04 '16 at 17:11


How do I do that? – FlyingAura Sep 04 '16 at 17:12
Two backslashes `\\n` – OneCricketeer Sep 04 '16 at 17:14
By your advice, I did this: wr=wr.replace(r"\[|'u|\\n","") The result is still the same. – FlyingAura Sep 04 '16 at 17:27
You don't need the two slashes with the r in front of the string. – OneCricketeer Sep 04 '16 at 19:19
