How to replace characters that cannot be decoded using utf8 with whitespace?
# -*- coding: utf-8 -*-
print unicode('\x97', 'utf-8', errors='ignore')  # prints an empty string
print unicode('ABC\x97abc', 'utf-8', errors='ignore')  # prints ABCabc
How can I print out ABC abc instead of ABCabc? Note, \x97 is just an example character. The characters that cannot be decoded are unknown inputs.
- If we use errors='ignore', the undecodable character is dropped, so nothing is printed in its place.
- If we use errors='replace', the character is replaced with a special replacement character (U+FFFD) rather than a space.
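For reference, one possible approach is to register a custom error handler with codecs.register_error and pass its name as the errors argument. The sketch below is only an illustration: the handler name replace_with_space and the variables raw and text are made up for this example, and it assumes that one space per error reported by the decoder is acceptable.

```python
import codecs

def replace_with_space(exc):
    # Hypothetical handler: substitute a single space for each decoding
    # error and resume decoding right after the offending bytes.
    if isinstance(exc, UnicodeDecodeError):
        return (u' ', exc.end)
    # Anything else (e.g. encoding errors) is not handled here.
    raise exc

# Make the handler available under a name usable as the errors argument.
codecs.register_error('replace_with_space', replace_with_space)

raw = b'ABC\x97abc'
text = raw.decode('utf-8', 'replace_with_space')
print(text)  # ABC abc
```

If one space per undecodable byte is preferred instead, the handler could return u' ' * (exc.end - exc.start) as the replacement.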