I am reading the contents of a file, and one of the characters in it is unknown to me. I copied the part of the file containing this character into my text editor and wrote the following script. I uploaded an image of the script because I can't paste the character on SO; it shows up empty. The unknown character appears as '<0x01>' in the image. What is this character? What is the proper way to remove it and other characters like it?

x = "<0x01> hello"

print x.decode('utf-8', 'ignore')
print x.replace("<0x01>", "")


You can see the image in the description now

Python 2.7.6

Ubuntu 14.04

SPYBUG96
mnm
  • defined encoding when reading file? – dejanmarich Nov 07 '18 at 16:23
  • Could you maybe post your code instead of images? – zipa Nov 07 '18 at 16:31
  • https://pastebin.com/GWNMf1fZ – mnm Nov 07 '18 at 16:35
  • @mnm Welcome to SO! However, please put the code in the question itself. Links don't last forever, and it's easier to answer the question if you don't need to follow a link. – Will Vousden Nov 07 '18 at 17:03
  • That's a Control-A character that somehow got inserted into your file. The normal way you'd write that character in a Python string would be `"\x01"`. The `<0x01>` appears to be your editor's way of showing unprintable characters, that's not something that you can type in yourself. – jasonharper Nov 07 '18 at 18:57
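To illustrate the comment above: a minimal sketch of removing `"\x01"` and similar characters, assuming the text is already a Unicode string. The helper name `strip_control_chars` and its `keep` parameter are made up for this example, and treating "other characters like them" as Unicode category `Cc` (control characters) is an assumption about what the asker wants. The first `print` removes only Control-A; the second removes every control character except tab, newline, and carriage return.

```python
import unicodedata


def strip_control_chars(text, keep=u"\t\n\r"):
    """Return text without Unicode control characters (category Cc).

    Hypothetical helper: characters listed in `keep` are preserved even
    though they are control characters themselves.
    """
    return u"".join(
        ch for ch in text
        if ch in keep or unicodedata.category(ch) != "Cc"
    )


x = u"\x01 hello"

# Remove only the Control-A character, written as the escape "\x01".
print(repr(x.replace(u"\x01", u"")))

# Remove every control character except tab, newline and carriage return.
print(repr(strip_control_chars(x)))
```

On Python 2, data read from a file is a byte `str`, so it would likely need decoding first (for example `data.decode('utf-8')`, which assumes the file is UTF-8) before being passed to the helper.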

0 Answers