1

When I open a .txt file containing special characters such as ö and ä, they appear as � in the .txt file itself and as ¿½ when I loop through the lines. How can I read them as the real special characters? I need to compare strings, and comparing ä == � returns False.

Nicke7117
  • If � is already in the file - that is you see that character when viewing the file in a text editor - then the characters have already been corrupted. – snakecharmerb Oct 22 '21 at 08:07
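The comment above can be checked on the raw bytes. This is a minimal sketch, assuming the file was saved as UTF-8 with the replacement character already in it: U+FFFD is stored as the bytes EF BF BD, and decoding those bytes as cp1252 gives `ï¿½`, which would explain the `¿½` seen when looping over the lines. The helper name `contains_replacement_char` and the file name `damaged.txt` are hypothetical.

```python
import os
import tempfile

# UTF-8 encoding of U+FFFD, the replacement character.
REPLACEMENT_BYTES = "\ufffd".encode("utf-8")  # b'\xef\xbf\xbd'


def contains_replacement_char(path):
    """Return True if the raw bytes of the file include an encoded U+FFFD."""
    with open(path, "rb") as f:  # binary mode: no decoding takes place
        return REPLACEMENT_BYTES in f.read()


# Demonstration on a temporary file with made-up content.
with tempfile.TemporaryDirectory() as tmp:
    damaged = os.path.join(tmp, "damaged.txt")
    with open(damaged, "wb") as f:
        f.write("h\ufffdr".encode("utf-8"))
    is_damaged = contains_replacement_char(damaged)

# Decoding the same three bytes as Windows-1252 produces mojibake.
mojibake = REPLACEMENT_BYTES.decode("cp1252")
print(is_damaged)
print(mojibake == "ï¿½")
```

If the check returns True for the real file, the original ä and ö bytes are probably gone, and no `encoding` argument will bring them back; the file would have to be regenerated from its source.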

2 Answers

0

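Python supports Unicode: in Python 3 every str is a sequence of Unicode code points, and source files are UTF-8 by default. When open() is called in text mode without an encoding argument, it decodes with the platform's locale encoding. On a system where that is UTF-8, you can simply open the file and read the content, and special characters are handled like any other Unicode character. On Windows the locale encoding is often a legacy code page such as cp1252, in which case the encoding has to be passed explicitly.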

For example:

with open('special', 'r') as inf:
  content = inf.read()
print(content[0])

$ cat special
ääää

$ python3 read.py
ä
Fred.W
  • It doesn't work for me. – Nicke7117 Oct 22 '21 at 01:10
  • You might want to check your file encoding first. Use [notepad++](https://notepad-plus-plus.org/downloads/) or similar software, which shows the file's encoding at the bottom right when you open the file. My best guess is that your text file is not encoded in UTF-8. – Fred.W Oct 22 '21 at 01:16
  • My file is encoded in utf-8. – Nicke7117 Oct 22 '21 at 01:19
  • @AmineLa it could be to do with the console not supporting those characters, in which case your best guess is maybe to print to another file and check if it printed the same character which it should because it would literally transfer the same data; if you want to check against it, however, you may need to use a unicode escape sequence – Matiiss Oct 22 '21 at 09:44
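The last comment's suggestion can be sketched as follows, assuming the console rather than the file is the problem. `ascii()` escapes every non-ASCII character, so the output is readable on any console, and comparing against an escape sequence does not depend on what the console can display. The variable `line` stands in for a line read from the file.

```python
line = "\u00e4\u00f6"  # "äö" written with escape sequences

# ascii() escapes every non-ASCII character, so this prints safely anywhere.
escaped = ascii(line)
print(escaped)

# Comparison works on code points, whatever the console can render.
first_is_a_umlaut = line[0] == "\u00e4"
print(first_is_a_umlaut)
```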
0

You can try using the encoding argument. It worked for me.

with open("text.txt", 'r', encoding='utf-8') as f:
    content = f.read()  # decoded as UTF-8 regardless of the system default
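A fuller sketch of the same idea, assuming the file really is valid UTF-8. It writes a small sample file first so that it runs on its own; the file name and its contents are made up for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "text.txt")  # hypothetical file name

    # Write sample lines as UTF-8 so the example needs no existing file.
    with open(path, "w", encoding="utf-8") as f:
        f.write("ä\nö\n")

    # Read them back with the same explicit encoding.
    with open(path, "r", encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f]

# Each line is now a normal str, so comparison behaves as expected.
matches = [line == "ä" for line in lines]
print(matches)
```

If the file was saved by an editor that adds a byte order mark, `encoding='utf-8-sig'` strips it on read; otherwise the first line would start with an invisible `\ufeff` and fail to compare equal.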