Python Unicode Decode Error for Byte not in file

Question

I am reading a large file in Python line by line with readline(). After about 672,280 lines I get an error:

 UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfd in position 228: invalid start byte

However, I searched the file with grep for the byte 0xfd and it found nothing. I also wrote C++ code to scan the file for the byte 0xfd and still found nothing, so I have no idea what is going on here. Is the error caused by the file being too big?

I just don't see how a decoding error can happen for a byte not in a file.

Thanks

Possible duplicate of [Writing utf-8 string inside my python files](http://stackoverflow.com/questions/4812186/writing-utf-8-string-inside-my-python-files) — l'L'l, Jan 12 '16 at 09:51

score 0 · Accepted Answer · answered Jan 12 '16 at 09:51


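You can try opening the file with the ISO-8859-1 (Latin-1) encoding, which maps every byte value to a character and therefore never raises a decode error: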

open('myfile.txt', encoding="ISO-8859-1")
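
Before switching encodings, it may help to confirm where the byte actually is. The sketch below is only an illustration, not part of the original answer: `find_byte` is a hypothetical helper name, and it assumes the file fits the usual newline-delimited layout. It reads the file in binary mode, so nothing is decoded, and reports every occurrence of 0xfd. Note that the "position 228" in the error message is most likely an offset into the chunk being decoded at that moment rather than an offset from the start of the file, which could explain why a search at that position finds nothing.

```python
def find_byte(path, target=0xFD):
    """Yield (line_number, column, file_offset) for each occurrence of target.

    Hypothetical helper for illustration; the file is opened in binary
    mode so no decoding (and no UnicodeDecodeError) can occur.
    """
    needle = bytes([target])
    offset = 0  # byte offset of the current line from the start of the file
    with open(path, "rb") as f:
        for line_number, raw in enumerate(f, start=1):
            column = raw.find(needle)
            while column != -1:
                yield line_number, column, offset + column
                column = raw.find(needle, column + 1)
            offset += len(raw)
```

Calling `list(find_byte('myfile.txt'))` returns the line number, the column within the line, and the absolute file offset of each hit. If the goal is only to get past the bad bytes while keeping UTF-8, `open('myfile.txt', encoding='utf-8', errors='replace')` substitutes U+FFFD for anything undecodable instead of raising.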


navyad

