I am reading a large file in Python line by line with readline(). After roughly 672,280 lines I get an error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfd in position 228: invalid start byte
However, I searched the file with grep for the byte 0xfd and it found nothing. I also wrote C++ code to go through the file and look for the byte 0xfd, and still found nothing. So I have no idea what is going on here. Is the error caused by the file being too big?
I just don't see how a decoding error can happen for a byte that is not in the file.
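
For reference, here is a minimal sketch of the same search done from Python itself, assuming the file is opened in binary mode so that no decoding happens while reading. The function name find_undecodable_lines and the path large_file.txt are hypothetical, not taken from my actual code. It reports the line number, the offset within that line, and the value of the first byte that fails to decode as UTF-8 on each bad line.

```python
def find_undecodable_lines(path, encoding="utf-8"):
    """Return (line_number, offset_in_line, byte_value) for each line
    whose raw bytes cannot be decoded with the given encoding."""
    problems = []
    # Binary mode: lines are split on b"\n" and nothing is decoded yet.
    with open(path, "rb") as f:
        for line_number, raw in enumerate(f, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError as exc:
                # exc.start is the index of the first offending byte in raw.
                problems.append((line_number, exc.start, raw[exc.start]))
    return problems


if __name__ == "__main__":
    # "large_file.txt" is a placeholder path (assumption).
    for line_number, offset, value in find_undecodable_lines("large_file.txt"):
        print(f"line {line_number}, offset {offset}: byte 0x{value:02x}")
```

If this prints nothing for the same file, the byte really is absent and the problem lies elsewhere, for example in which file is actually being opened.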
Thanks