I have a bunch of strings which are sentences that look something like this:
Having two illnesses at the same time is known as \xe2\x80\x9ccomorbidity\xe2\x80\x9d and it can make treating each disorder more difficult.
I encoded the original string with .encode(), then compressed it with Python's bz2 library. I then decompressed it with bz2.decompress() and used .decode() to get it back.
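For reference, here is a minimal sketch of that round trip, assuming UTF-8 and an illustrative sentence (the variable names are placeholders, not the exact code that was run). Done this way, the text should come back unchanged. The last lines show one hypothetical way literal escapes like \xe2\x80\x9c could appear: calling str() on a bytes object instead of decoding it.

```python
import bz2

# Illustrative sentence; the curly quotes are the characters that show up escaped.
original = "Having two illnesses at the same time is known as \u201ccomorbidity\u201d."

# str -> UTF-8 bytes -> compressed bytes
compressed = bz2.compress(original.encode("utf-8"))

# compressed bytes -> UTF-8 bytes -> str
restored = bz2.decompress(compressed).decode("utf-8")

print(restored == original)  # True: a clean round trip preserves the quotes

# Hypothetical cause (an assumption, not confirmed): str() on bytes returns
# the repr of the bytes object, so the escapes become literal text.
stringified = str(original.encode("utf-8"))
print(stringified)
```

If the second print resembles the broken sentences, the conversion to text probably happened through str() or a similar repr somewhere before or after compression.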
Any ideas on how I can conveniently remove these byte escape sequences from the text, or make sure characters like quotes get decoded properly in the first place?
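One possible repair, sketched under the assumption that the strings contain literal backslash escapes (a backslash followed by x and two hex digits) that spell out UTF-8 bytes, and that all other characters fit in Latin-1. The helper name unescape_utf8_bytes is made up for illustration.

```python
def unescape_utf8_bytes(text: str) -> str:
    # Hypothetical helper: assumes the escapes are literal text and that
    # the bytes they describe are valid UTF-8.
    # Step 1: interpret the literal \xNN escapes, giving one code point per byte.
    raw = text.encode("latin-1").decode("unicode_escape")
    # Step 2: map those code points back to bytes and decode them as UTF-8.
    return raw.encode("latin-1").decode("utf-8")


# Raw string so the backslashes stay literal, as in the broken sentences.
broken = r"known as \xe2\x80\x9ccomorbidity\xe2\x80\x9d and it"
print(unescape_utf8_bytes(broken))
```

This would raise an error on text that already contains characters outside Latin-1, so it is only a patch for strings that are damaged in exactly this way; fixing the step that produced the escapes is likely the cleaner route.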
Thanks!