I'm using Python 2.7.3.
In Python, we use the magic methods __str__ and __unicode__ to define the behavior of str and unicode on our custom classes:
>>> class A(object):
    def __str__(self):
        print 'Casting A to str'
        return u'String'
    def __unicode__(self):
        print 'Casting A to unicode'
        return 'Unicode'
>>> a = A()
>>> str(a)
Casting A to str
'String'
>>> unicode(a)
Casting A to unicode
u'Unicode'
The behavior suggests that the return value from __str__ and __unicode__ is coerced to either str or unicode, depending on which magic method is run.
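Checking the types of the results seems consistent with that (a quick sketch reusing the a instance from above; the expected output is inferred from the session shown earlier):

>>> type(str(a))        # __str__ returned u'String', yet str() hands back a str
Casting A to str
<type 'str'>
>>> type(unicode(a))    # __unicode__ returned 'Unicode', yet unicode() hands back a unicode
Casting A to unicode
<type 'unicode'>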
However, if we do this:
>>> class B(object):
    def __str__(self):
        print 'Casting B to str'
        return A()
    def __unicode__(self):
        print 'Casting B to unicode'
        return A()
>>> b = B()
>>> str(b)
Casting B to str
Traceback (most recent call last):
  File "<pyshell#47>", line 1, in <module>
    str(b)
TypeError: __str__ returned non-string (type A)
>>> unicode(b)
Casting B to unicode
Traceback (most recent call last):
  File "<pyshell#48>", line 1, in <module>
    unicode(b)
TypeError: coercing to Unicode: need string or buffer, A found
Calling str.mro() and unicode.mro() shows that both are subclasses of basestring. However, __unicode__ also allows returning buffer objects, which inherit directly from object and not from basestring.
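For reference, here is roughly what the MROs look like, plus a sketch of a __unicode__ that returns a buffer (class C and the string literal are made up purely for illustration, and the output is what I would expect rather than a verified result):

>>> str.mro()
[<type 'str'>, <type 'basestring'>, <type 'object'>]
>>> unicode.mro()
[<type 'unicode'>, <type 'basestring'>, <type 'object'>]
>>> buffer.mro()
[<type 'buffer'>, <type 'object'>]
>>> class C(object):
    def __unicode__(self):
        # hand back a buffer instead of a str/unicode
        return buffer('From a buffer')

>>> unicode(C())
u'From a buffer'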
So, my question is: what actually happens when str and unicode are called? What are the return value requirements on __str__ and __unicode__ for use with str and unicode?