I am reading a large file in Python line by line with readline(). After roughly 672,280 lines I get an error:

 UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfd in position 228:
 invalid start byte.

However, I searched the file with grep for the byte 0xfd and found no matches. I also wrote C++ code to scan the file for a 0xfd byte and still found nothing, so I have no idea what is going on. Is the error caused by the file being too big?

I just don't see how a decoding error can happen for a byte that is not in the file.

Thanks

Neha Gangwar
Paul McVay
  • Possible duplicate of [Writing utf-8 string inside my python files](http://stackoverflow.com/questions/4812186/writing-utf-8-string-inside-my-python-files) – l'L'l Jan 12 '16 at 09:51

1 Answer

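You can try opening the file with the ISO-8859-1 (Latin-1) encoding. It maps every byte value to a character, so decoding never fails, although the text will be wrong wherever the file is actually in a different encoding.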

open('myfile.txt', encoding="ISO-8859-1")
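
File size by itself should not cause a UnicodeDecodeError. As far as I understand, the position in the message (228 here) is an offset into the chunk of bytes Python was decoding at that moment, not an offset into the whole file. To see which bytes are actually at fault, one option is to read the file in binary mode and try to decode each line yourself. The following is a minimal sketch of that idea; the function name find_undecodable_lines is hypothetical, and it assumes the file uses "\n" line endings and is meant to be UTF-8.

```python
def find_undecodable_lines(path, encoding="utf-8"):
    """Yield (line_number, file_offset, bad_byte) for each line that fails to decode.

    Hypothetical helper for illustration; only the first bad byte of each
    failing line is reported.
    """
    offset = 0  # byte offset of the current line's start within the file
    with open(path, "rb") as f:  # binary mode: nothing is decoded while reading
        for line_number, raw in enumerate(f, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError as exc:
                # exc.start is the index of the first bad byte within this line
                yield line_number, offset + exc.start, raw[exc.start]
            offset += len(raw)


# Example usage (assumes a file named myfile.txt exists):
# for line_number, file_offset, bad_byte in find_undecodable_lines("myfile.txt"):
#     print(line_number, file_offset, hex(bad_byte))
```

If this reports a 0xfd byte, the earlier grep or C++ search probably missed it for some other reason (for example, grep treating the pattern as text in the current locale rather than as a raw byte). If it reports nothing, the file being opened in Python may not be the same file that was searched.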
navyad