
I know that we should use the setattr function when setting attributes from outside an object. However, I have trouble calling setattr with a unicode key, which led me to call __setattr__ directly.

class MyObject(object):
    def __init__(self):
        self.__dict__["properties"] = dict()
    def __setattr__(self, k, v):
        self.properties[k] = v
obj = MyObject()

Depending on how I set the attribute, I get the following content of obj.properties:

  • setattr(obj, u"é", u"à"): raises UnicodeEncodeError
  • setattr(obj, "é", u"à"): {'\xc3\xa9': u'\xe0'}
  • obj.__setattr__(u"é", u"à"): {u'\xe9': u'\xe0'}

I don't understand why Python behaves differently in these three cases.

ChrisGPT was on strike

2 Answers


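Python 2.7? ASCII identifiers only, and attribute names have to be byte strings (str). That covers your case 2): "é" there is a byte string (the UTF-8 bytes '\xc3\xa9'), so setattr accepts it. It does not cover case 1): u"é" is a unicode object, setattr tries to convert it to a byte string with the ASCII codec, and that raises UnicodeEncodeError.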

Unicode identifiers in Python?

Case 3) calls your own __setattr__ directly, so all that happens is that a unicode key is set in a dictionary. Legal.

Note that __setattr__ is almost never meant to be used the way you are using it. It's meant to set attributes on an object, not to intercept them and stuff them into an internal dict attribute. I'd also avoid properties as a name, since it is easily confused with properties in the getter/setter sense.

Generally you want to use setattr, not the double-underscore variant that you ended up calling directly.

You typically don't call double-underscore methods yourself either: you define them, and Python's data model calls them on your behalf. A bit like the implicit get/set calls in JavaBeans (I think).
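To illustrate that point, here is a minimal sketch (the class name Traced and its log attribute are made up for this example). Both plain assignment and the built-in setattr end up in the __setattr__ you define, without you ever calling it by name:

```python
class Traced(object):
    """Hypothetical example class: records every attribute assignment."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the log itself.
        object.__setattr__(self, "log", [])

    def __setattr__(self, name, value):
        self.log.append(name)                  # remember what was set
        object.__setattr__(self, name, value)  # then store it normally


t = Traced()
t.x = 1             # Python calls Traced.__setattr__(t, "x", 1)
setattr(t, "y", 2)  # the built-in does the same
# t.log is now ["x", "y"]
```

Delegating to object.__setattr__ keeps normal attribute storage working, which is usually what you want unless you have a specific reason to redirect it.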

__setattr__ can be tricky. If you are not careful, it silently blocks attribute assignments in unexpected ways.

Here's a silly example:

class Foo(object):

    def __setattr__(self, attrname, value):
        """ let's uppercase variables starting with k"""

        if attrname.lower().startswith("k"):
            self.__dict__[attrname.upper()] = value

foo = Foo()

foo.kilometer = 1000
foo.meter = 1

print "foo.KILOMETER:%s" % getattr(foo, "KILOMETER", "unknown")
print "foo.meter:%s" % getattr(foo, "meter", "unknown")
print "foo.METER:%s" % getattr(foo, "METER", "unknown")

output:

foo.KILOMETER:1000
foo.meter:unknown
foo.METER:unknown

You need an else branch after the if:

        else:
            self.__dict__[attrname] = value

output:

foo.KILOMETER:1000
foo.meter:1
foo.METER:unknown
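For reference, the corrected class can also be written in one piece. This sketch swaps in a slightly different technique from the snippet above: instead of writing into self.__dict__, it hands every name over to object.__setattr__, so nothing is dropped by accident.

```python
class Foo(object):

    def __setattr__(self, attrname, value):
        """Uppercase names starting with k; store everything else as-is."""
        if attrname.lower().startswith("k"):
            attrname = attrname.upper()
        # Always fall through to the default behaviour.
        object.__setattr__(self, attrname, value)


foo = Foo()
foo.kilometer = 1000
foo.meter = 1
# foo.KILOMETER is 1000, foo.meter is 1, and foo.METER does not exist
```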

Last, if you are just starting out and unicode is a big deal, I'd evaluate Python 2 vs 3: Python 3 has much better, unified unicode support. There are tons of reasons you might or might not need to use 2.7 rather than 3, but unicode "pushes towards" 3.

JL Peyret
  • I finally solved it by calling `encode('utf-8')` before calling `setattr`. Concerning the last point, I have the following requirement: be able to access the object property "toto" via `obj.properties["toto"]` and also directly via `obj.toto`. Thus, intercepting `setattr` and `getattr` seems to be the only solution. – jbaptiste.trb Apr 21 '16 at 15:02
  • Français? The `toto` vs `foobar` gives that away ;-) *If* you only need to access via obj.toto for *reads*, then you can leave setattr alone and instead write a `__getattr__` that returns obj.properties[attrname]. Overriding `__getattr__` is common; `__setattr__` is more of a special case and needs careful consideration. I'd have something like my silly example with the k variable names and test for a leading `_` in attribute names to allow for normal internal variables. – JL Peyret Apr 21 '16 at 16:28
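A rough sketch of what the last comment describes, assuming the requirement is that obj.toto and obj.properties["toto"] refer to the same value. The class name PropertyBag is invented here, and the leading-underscore rule is only one possible convention:

```python
class PropertyBag(object):
    """Hypothetical sketch: public names live in self.properties."""

    def __init__(self):
        # Create the backing dict without going through __setattr__.
        object.__setattr__(self, "properties", {})

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        try:
            return self.__dict__["properties"][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        if name.startswith("_"):
            # Leading underscore: keep it as a normal internal attribute.
            object.__setattr__(self, name, value)
        else:
            self.properties[name] = value


obj = PropertyBag()
obj.toto = 1       # stored in obj.properties["toto"]
obj._cache = 5     # stored as a regular attribute
```

If reads are all you need, the __setattr__ override can be dropped entirely and values written straight into obj.properties.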

Python 2 doesn't allow unicode identifiers:

>>> é = 3
  File "<stdin>", line 1
    é = 3
    ^
SyntaxError: invalid syntax

You can't work around it the way you're trying, because the built-in setattr processes the attribute name before calling __setattr__: presumably it tries to convert a unicode name into a byte string, which fails for non-ASCII characters. You can show this by inserting a print at the very start of __setattr__: nothing gets printed, so the error is raised before your code runs.
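One way around this, and the one the asker reports using in the comments above, is to encode the name yourself before handing it to setattr. A minimal sketch, where set_attr_utf8 is a made-up helper name and UTF-8 is an assumed choice of encoding:

```python
import sys


class MyObject(object):
    def __init__(self):
        self.__dict__["properties"] = dict()

    def __setattr__(self, k, v):
        self.properties[k] = v


def set_attr_utf8(obj, name, value):
    """Hypothetical helper: encode unicode names on Python 2 only."""
    if sys.version_info[0] == 2 and isinstance(name, unicode):
        # Python 2's setattr wants a byte string for the name.
        name = name.encode("utf-8")
    setattr(obj, name, value)


obj = MyObject()
set_attr_utf8(obj, u"\xe9", u"\xe0")  # u"\xe9" is é, u"\xe0" is à
```

On Python 2 the key ends up as the byte string '\xc3\xa9', the same as in the second case of the question, so lookups have to use the encoded form as well. On Python 3 the helper passes the name through unchanged.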

Alex Hall