I have a Python script called a.py:

#!/usr/bin/python2.7
# -*- coding: utf-8 -*-
print u''

In both bash and tcsh:

$ a.py

$ echo `a.py`
Traceback (most recent call last):
  File "a.py", line 3, in <module>
    print u''
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)

The error is coming from Python, not the shell. How can running the script under backticks affect the script itself? Note that this is not a problem if I switch the interpreter to Python 3 at the beginning of the script.

kennytm
Mitchell Model

1 Answer

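When Python 2 does not detect that it is printing to a terminal, as is the case inside backticks (command substitution), where stdout is a pipe, sys.stdout.encoding is set to None. When you print a unicode object in that state, the ascii codec is used. This raises a UnicodeEncodeError if the unicode object contains code points outside 0-127. Python 3 instead picks an encoding from the locale even when stdout is not a terminal, which is presumably why switching interpreters avoids the error.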
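The fallback described above can be modeled in a few lines. This is only a sketch of the behavior, not Python's actual implementation: PipedStdout and encode_for are hypothetical names, and the four characters are assumed to be U+F8FF, matching the traceback's "position 0-3".

```python
class PipedStdout(object):
    """Hypothetical stand-in for Python 2's sys.stdout when it is not a terminal."""
    encoding = None


def encode_for(stream, text, fallback="ascii"):
    """Encode text with the stream's encoding, or with fallback when it is None."""
    codec = getattr(stream, "encoding", None) or fallback
    return text.encode(codec)


text = u"\uf8ff" * 4  # assumption: four non-ASCII characters, as in a.py

# With no stream encoding, the ascii fallback cannot represent the text.
message = None
try:
    encode_for(PipedStdout(), text)
except UnicodeEncodeError as exc:
    message = str(exc)

# Choosing UTF-8 as the fallback succeeds: 3 bytes per character here.
utf8_bytes = encode_for(PipedStdout(), text, fallback="utf-8")
```

Here message holds the same "'ascii' codec can't encode characters" text seen in the traceback, while utf8_bytes contains 12 bytes.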

A way to fix this is to set the PYTHONIOENCODING environment variable to an appropriate encoding before Python starts. For example, in bash (in tcsh, use setenv PYTHONIOENCODING utf-8 in place of the export):

export PYTHONIOENCODING=utf-8; echo `a.py`
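To check the effect of the variable without a shell, one option is to start a child interpreter with its stdout piped and ask it what encoding it chose. A minimal sketch, assuming sys.executable points at the interpreter you care about; piped_stdout_encoding is a hypothetical helper name.

```python
import codecs
import os
import subprocess
import sys


def piped_stdout_encoding(io_encoding):
    """Run a child interpreter with stdout piped and PYTHONIOENCODING set,
    and return the normalized name of the child's sys.stdout.encoding."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    probe = "import sys; sys.stdout.write(str(sys.stdout.encoding))"
    raw = subprocess.check_output([sys.executable, "-c", probe], env=env)
    # codecs.lookup normalizes spelling variants such as UTF8 or utf_8.
    return codecs.lookup(raw.decode("ascii").strip()).name


print(piped_stdout_encoding("utf-8"))  # utf-8
```

The child reads the variable while it starts up, so it has to be in the environment before the interpreter is launched.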

Credits for this go to unutbu!

R. Q.
  • Wow. I never would have figured this out. So I tried adding this to a.py: import os; os.putenv('PYTHONIOENCODING', 'utf-8') but that doesn't work. Probably my lack of experience with bash (I mostly use tcsh), but I don't understand why your solution works but this doesn't. – Mitchell Model Apr 04 '17 at 18:18