I used urllib.parse.unquote and html.unescape to preprocess my string, but the result unexpectedly still contains '\xa0' and '\u200e' characters. Is there a Python function that does the last two replacements for me, in case more such characters turn up in my string?

import html
import urllib.parse

# res = res.replace("%20", "")
res = urllib.parse.unquote(res)  # decode percent-escapes such as %20
# res = res.replace(' ', ' ')
res = html.unescape(res)
res = res.replace('\xa0', '')    # U+00A0 NO-BREAK SPACE
res = res.replace('\u200e', '')  # U+200E LEFT-TO-RIGHT MARK
max yue
  • What is wrong with the solution you have? `replace()` replaces all occurrences. – Flomp Aug 24 '17 at 12:18
  • Maybe look through the answers [here](https://stackoverflow.com/questions/2365411/python-convert-unicode-to-ascii-without-errors)? – glibdud Aug 24 '17 at 12:30
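
One possible direction, as a minimal sketch rather than a definitive answer: the standard-library `unicodedata` module can handle whole classes of such characters instead of listing them one by one. The helper name `clean_text` is hypothetical. Note two assumptions that differ from the snippet in the question: NFKC normalization turns '\xa0' into a plain space instead of deleting it, and it also folds other compatibility characters (ligatures, full-width forms), which may or may not be wanted.

```python
import html
import unicodedata
import urllib.parse


def clean_text(raw: str) -> str:
    """Decode percent- and HTML-escapes, then drop invisible format characters.

    Hypothetical helper; the name and the NFKC choice are assumptions.
    """
    text = html.unescape(urllib.parse.unquote(raw))
    # NFKC folds compatibility characters; U+00A0 (no-break space)
    # becomes an ordinary space here rather than being removed.
    text = unicodedata.normalize("NFKC", text)
    # General category "Cf" (format) includes U+200E, U+200B, U+FEFF
    # and similar invisible marks, so they are all filtered out.
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
```

Calling `clean_text(res)` would replace the four steps in the question. If the no-break space should be deleted outright, as in the original code, a `replace('\xa0', '')` before the normalization step keeps that behavior.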

0 Answers