 def __init__(self, text):
     self.text = text.strip("\r").strip("\n").strip("\r").strip(" ")
     print("TEXT:"+text+";SELFTEXT:"+self.text)

text is passed in as "http://thehill.com/homenews/senate/376515-jeff-flake-there-will-be-a-republican-challenger-to-trump-in-2020\r\n" and self.text comes out unchanged (the \r and the \n are not removed).

When the same code is run in the Python shell, it works as desired. Any ideas?

Edit: when the print statement is changed to print("REPR:"+repr(text)+"\nTEXT:"+text+";SELFTEISXT:"+self.text)

the output for a similar string is:

REPR:'http://thehill.com/homenews/senate/376548-lindsey-graham-war-with-north-korea-would-be-worth-it-in-the-long-run\r\n'
TEXT:http://thehill.com/homenews/senate/376548-lindsey-graham-war-with-north-korea-would-be-worth-it-in-the-long-run\r\n;SELFTEISXT:http://thehill.com/homenews/senate/376548-lindsey-graham-war-with-north-korea-would-be-worth-it-in-the-long-run\r\n

Dev Bali
  • Are you sure there aren't some other non-printable symbols after `text` that prevent `str.strip` from working? Can you print out `repr(text)` and see what the actual value is? – Blender Mar 04 '18 at 03:01
  • [mcve] please. (i.e., your code should be compilable without our having to guess and add your boilerplate code) – user202729 Mar 04 '18 at 03:05
  • Since the `print` adds a newline, how do you know that the newline wasn't removed? Did you try …`SELFTEXT:["+self.text+"]"` so that you have the `[]` pair around the text? It's a good way of identifying problems. – Jonathan Leffler Mar 04 '18 at 03:46
  • If you are trying to get text from HTML, take a look at Beautiful Soup and similar libraries: htmlparser, html2text, newspaper3k. Maybe also newsml. For newspaper feeds, beware of text that was cut and pasted from Word docs. – lxx Mar 04 '18 at 03:54
  • @Ixx I am getting the text from HTML, but the type of the variable is string. @Blender I added an edit with the output when I print repr(text), which seems to be the same. – Dev Bali Mar 04 '18 at 04:43
  • @user202729 what extra code would you like? – Dev Bali Mar 04 '18 at 04:44
  • @Blender THANK YOU SO MUCH! The repr(text) showed that it was actually "\n" in the file and not just a line break. So when I replaced the "\r" and the "\n" with "\\r" and "\\n" it worked!! – Dev Bali Mar 04 '18 at 04:51
  • @DevBali Please read the [mcve] page. You need to provide some code that (1) is compilable, and (2) has the problem you're facing. Your code, if run as is, ([Try it online!](https://tio.run/##K6gsycjPM/7/XyElNU0hPj4zL7MkPl6jODUnTUehJLWiRNOKSwEEQCJ6IAEFW7C4XnFJUWaBhlJMkZImnJ2HxEYSV1DShBhSUJSZV6KhFOIaEWKlpA0yRlvJOtjVxw0qArdE8/9/AA)) doesn't have the problem you're describing. – user202729 Mar 04 '18 at 05:18
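
The diagnosis reached in the comments (the data held the literal characters backslash-r and backslash-n rather than real line-ending characters) can be illustrated with a minimal sketch. The variable names `real` and `literal` and the sample value are assumptions made for illustration; they are not from the original code.

```python
# Two strings that look alike in source code but hold different characters.
real = "example\r\n"       # ends with two control characters: CR and LF
literal = r"example\r\n"   # raw string: ends with four characters \ r \ n

# repr() makes the difference visible: literal backslashes show up doubled.
print(repr(real))
print(repr(literal))

# strip() removes real CR/LF characters but leaves the literal text untouched.
stripped_real = real.strip("\r\n")
stripped_literal = literal.strip("\r\n")
print(repr(stripped_real))
print(repr(stripped_literal))
```

Wrapping the value in `repr()` (or in brackets, as suggested above) is what exposes the difference; a plain `print` of the two strings would not look the same, but it is easy to misread.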

2 Answers


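Try this, assigning the result back, since `str.replace` returns a new string rather than modifying `text` in place:

text = text.replace("\r\n", "")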
whackamadoodle3000

Try using a raw string. I have added "r" in replace(r"\r\n", "")

Example:

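# Raw string, so the text ends with the literal characters \r\n, as in the question's data
text = r"""http://thehill.com/homenews/senate/376515-jeff-flake-there-will-be-a-republican-challenger-to-trump-in-2020\r\n"""
print(text)
# r"\r\n" matches the literal backslash sequences, not a real CR/LF pair
print(text.replace(r"\r\n", "").strip())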

Output:

http://thehill.com/homenews/senate/376515-jeff-flake-there-will-be-a-republican-challenger-to-trump-in-2020\r\n
http://thehill.com/homenews/senate/376515-jeff-flake-there-will-be-a-republican-challenger-to-trump-in-2020
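
If the input might end with either real line-ending characters or the literal backslash sequences, a slightly more general variant could look like the sketch below. The helper name `strip_line_endings` is hypothetical and not part of the original code. It also assumes that a trailing literal `\r` or `\n` is never legitimate data, which may not hold for every input.

```python
def strip_line_endings(text):
    """Remove trailing spaces and line endings, real or written as literal escapes."""
    # Hypothetical helper: real CR/LF characters, the literal two-character
    # sequences backslash-r and backslash-n, and plain spaces.
    suffixes = ("\r", "\n", r"\r", r"\n", " ")
    changed = True
    while changed:
        changed = False
        for suffix in suffixes:
            if text.endswith(suffix):
                text = text[: -len(suffix)]
                changed = True
    return text


cleaned_literal = strip_line_endings(r"http://example.com/page\r\n")
cleaned_real = strip_line_endings("http://example.com/page\r\n")
print(cleaned_literal)
print(cleaned_real)
```

Only the end of the string is touched, so any such sequence in the middle of the text is left alone.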
Rakesh