
Assume that

n = u"Tübingen"
repr(n) # u'T\xfcbingen' # Unicode
i = 1 # integer

The first of the following files throws

UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 82: ordinal not in range(128)

When I pass n.encode('utf8') instead of n, it works.

The second works flawlessly in both cases.

# Python File 1
#
#!/usr/bin/env python -B
# encoding: utf-8

print '{id}, {name}'.format(id=i, name=n)

# Python File 2
#
#!/usr/bin/env python -B
# encoding: utf-8

print '%i, %s'% (i, n)

Since the documentation encourages using format() instead of the % format operator, I don't understand why format() seems more "handicapped". Does format() only work with UTF-8 strings?

Tom van der Woerdt
Aufwind
  • When you did `u'{id}, {name}'.format(id=i, name=n)` what did you observe? Note that the formatting string is a Unicode string `u'...'`. Please add that to your examples and comment on it. – S.Lott Dec 22 '11 at 11:40
  • Thank you S.Lott, this was it. I understand now where my mistake was. `'{id}, {name}'` was a utf-8 string (defined by the *magic line* `# encoding: utf-8`) and `n` was in unicode. It is not possible to "concatenate" them. That is why `n.encode('utf8')` worked. Right? – Aufwind Dec 22 '11 at 11:44

1 Answer

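You're calling str.format (the template '{id}, {name}' is a byte string), but n is a unicode object. To build a byte-string result, Python 2 implicitly encodes n with the ASCII codec, which fails on u'\xfc'.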

print u'{id}, {name}'.format(id=i, name=n)

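will work, since it uses unicode.format instead. The % operator does not fail in your second file because it promotes the result to unicode as soon as one of the arguments is a unicode object.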
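As a rough illustration of what goes wrong, here is a minimal sketch that runs on both Python 2 and Python 3. It is not what the interpreter literally executes: the explicit `n.encode("ascii")` call only stands in for the implicit conversion Python 2 attempts when a byte-string template receives a unicode argument. The names `line`, `ascii_ok` and `utf8_bytes` are made up for this example.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

n = "T\xfcbingen"  # text (unicode) string, same value as n in the question
i = 1

# Text template + text argument: nothing has to be encoded, so this works.
line = "{id}, {name}".format(id=i, name=n)

# Stand-in for the implicit step Python 2 takes with a byte-string template:
# encoding the text argument with the ASCII codec, which cannot represent
# the character U+00FC.
try:
    n.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly to UTF-8 succeeds, which is why n.encode('utf8')
# made the first file work.
utf8_bytes = n.encode("utf-8")

print(ascii_ok, repr(utf8_bytes))
```

Keeping the template and the arguments the same type (both text, or both already-encoded bytes) avoids the implicit ASCII step altogether.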

Tom van der Woerdt