4

Running ./sample.py --url http://blah.com works without error, but if I pipe the output, as in ./sample.py --url http://blah.com | wc -l or similar, I receive an error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u200f' in position 0: ordinal not in range(128)

How do I make a Python script work when its output is piped to other terminal commands? I keep seeing references to sys.stdin.isatty, though its use case appears to be the opposite.

mbb

2 Answers

6

When Python detects that it is printing to a terminal, sys.stdout.encoding is set to the encoding of the terminal. When you print a unicode object, it is encoded to a str using sys.stdout.encoding.

When Python does not detect that it is printing to a terminal (for example, when the output is piped or redirected to a file), sys.stdout.encoding is set to None. When you print a unicode object, the ascii codec is used (at least in Python 2). This raises a UnicodeEncodeError if the unicode object contains code points outside the range 0-127.
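To see which case applies, a script can inspect its own stdout. The following is a minimal sketch; the helper name describe_stream is hypothetical. Note that the None value described above is Python 2 behavior: Python 3 always sets sys.stdout.encoding, even when the output is piped.

```python
import sys


def describe_stream(stream):
    """Return (is_a_terminal, encoding) for a text stream."""
    is_tty = stream.isatty()
    # Some stream-like objects have no encoding attribute at all.
    encoding = getattr(stream, "encoding", None)
    return is_tty, encoding


# Report on stderr so the message stays visible when stdout is piped.
is_tty, encoding = describe_stream(sys.stdout)
sys.stderr.write("stdout isatty=%s encoding=%r\n" % (is_tty, encoding))
```

Running the script once directly and once with | wc -l appended should show isatty flip from True to False.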

One way to fix this is to explicitly encode your unicode objects before printing. That is perhaps the proper way, but it can be laborious if you have a lot of print statements scattered around.
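Explicit encoding could look roughly like the sketch below. The helper encode_for_output and its utf-8 fallback are assumptions made for illustration, not part of the original script: the helper uses the stream's encoding when one is set and falls back to UTF-8 when it is None.

```python
import sys


def encode_for_output(text, stream=None, fallback="utf-8"):
    """Encode text with the stream's encoding, or `fallback` if it has none."""
    stream = sys.stdout if stream is None else stream
    # A missing or None encoding is the piped-output case in Python 2.
    encoding = getattr(stream, "encoding", None) or fallback
    return text.encode(encoding)
```

In Python 2 the result is a str that can be printed directly. In Python 3 it is a bytes object, which would have to be written to sys.stdout.buffer instead.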

Another way to fix this is to set the PYTHONIOENCODING environment variable to an appropriate encoding. For example,

PYTHONIOENCODING=utf-8

Then this encoding will be used instead of ascii when the output goes to a pipe or a file.
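The effect can be checked from Python itself by running a child interpreter with its stdout piped. This is only a sketch: run_with_io_encoding is a hypothetical helper, and the child code simply prints the character from the error message in the question.

```python
import os
import subprocess
import sys


def run_with_io_encoding(code, encoding="utf-8"):
    """Run `code` in a child interpreter with stdout piped and PYTHONIOENCODING set."""
    # Copy the current environment and add the variable for the child only.
    env = dict(os.environ, PYTHONIOENCODING=encoding)
    return subprocess.check_output([sys.executable, "-c", code], env=env)


# The child prints U+200F; the parent receives the encoded bytes.
output = run_with_io_encoding("print(u'\\u200f')")
```

With the variable set to utf-8, the child encodes the character instead of raising UnicodeEncodeError, so output holds its UTF-8 bytes followed by a line ending.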

See the PrintFails wiki page for more information.

unutbu
  • Thank you for that @unutbu. Thought #1 - follow [this article](http://blog.notdot.net/2010/07/Getting-unicode-right-in-Python) from now on. #2 - Where is the right place to put this line for now? – mbb Nov 20 '12 at 22:23
  • @mjb: How you should set the `PYTHONIOENCODING` environment variable depends on the machine's OS. It is done the same way as you set the `PYTHONPATH` environment variable. On Linux, you could put `export PYTHONIOENCODING=utf-8` in your `~/.profile` or `~/.bashrc` file. – unutbu Nov 20 '12 at 23:13
  • @mjb: For a single command in bash: `PYTHONIOENCODING=utf-8 ./sample.py ...`. By the way, user-perceived characters and code points are different things, though that is a topic for another article. – jfs Nov 20 '12 at 23:57
-1

Try:

(./sample.py --url http://blah.com) | wc -l

This spawns a subshell to run your Python script, then pipes its stdout to wc.

sampson-chen