13

According to Python's documentation,

Data descriptors with __set__() and __get__() defined always override a redefinition in an instance dictionary.

I have no problem understanding this sentence, but can someone clarify why such a rule is in place? After all, if I want to override an attribute in an instance dictionary, I already need to do that explicitly (inst.__dict__["attr"] = val), since a naive inst.attr = val would call the descriptor's __set__ method, which would (usually) not override the attribute in the instance dictionary.

Edit: just to be clear, I understand what is happening; my question is about why such a rule was put in place.

Martijn Pieters
antony

1 Answer

10

The override applies to descriptors that are part of the class __dict__.

For a data descriptor, Python will always call type(instance).__dict__[attributename].__get__(instance, type(instance)), and will not search instance.__dict__ for an instance override.
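The precedence can be modelled in a few lines of Python 3. This is only a simplified sketch of what object.__getattribute__ does, not CPython's actual code: the function name `lookup` and the classes `NonData` and `Demo` are made up for illustration, and the sketch ignores `__getattr__` and `__slots__`.

```python
def lookup(instance, name):
    """Simplified model of instance attribute lookup (illustrative only)."""
    cls = type(instance)
    missing = object()  # sentinel: "not found on the class or its bases"

    # Search the class and its bases, in MRO order, for the name.
    class_attr = missing
    for base in cls.__mro__:
        if name in vars(base):
            class_attr = vars(base)[name]
            break

    has_get = is_data = False
    if class_attr is not missing:
        attr_type = type(class_attr)
        has_get = hasattr(attr_type, '__get__')
        is_data = (hasattr(attr_type, '__set__')
                   or hasattr(attr_type, '__delete__'))
        # 1. A data descriptor on the class wins over everything else.
        if has_get and is_data:
            return attr_type.__get__(class_attr, instance, cls)

    # 2. Otherwise the instance dictionary is consulted.
    instance_dict = getattr(instance, '__dict__', {})
    if name in instance_dict:
        return instance_dict[name]

    # 3. Then a non-data descriptor on the class.
    if has_get:
        return attr_type.__get__(class_attr, instance, cls)

    # 4. Finally a plain class attribute.
    if class_attr is not missing:
        return class_attr
    raise AttributeError(name)


class NonData:
    """Hypothetical descriptor with __get__ only."""
    def __get__(self, instance, owner=None):
        return 'from non-data descriptor'


class Demo:
    nondata = NonData()
    plain = 'class value'

    @property
    def spam(self):
        return 'from property'


demo = Demo()
demo.__dict__['spam'] = 'from instance dict'
demo.__dict__['nondata'] = 'from instance dict'

print(lookup(demo, 'spam'))     # the property is a data descriptor
print(lookup(demo, 'nondata'))  # the instance dict shadows a non-data descriptor
print(lookup(demo, 'plain'))    # falls through to the class attribute
```

Note that a property counts as a data descriptor even without a setter, because the property type itself defines `__set__`.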

Here is an example using a contrived Descriptor class and a property (which is a descriptor with both a __get__ and a __set__):

>>> class Descriptor(object):
...     def __init__(self, name):
...         self.name = name
...     def __get__(self, instance, cls):
...         print('Getting %s, with instance %r, class %r' % (self.name, instance, cls))
... 
>>> class Foo(object):
...     _spam = 'eggs'
...     @property
...     def spam(self):
...         return self._spam
...     @spam.setter
...     def spam(self, val):
...         self._spam = val
... 
>>> Foo().spam
'eggs'
>>> foo = Foo()
>>> foo.__dict__['spam'] = Descriptor('Override')
>>> foo.spam
'eggs'

As you can see, even though I add a spam entry to the instance __dict__, it is completely ignored and the Foo.spam property is still used. Python ignores the instance __dict__ because the spam property defines both a __get__ and a __set__.

If you use a descriptor that doesn't define a __set__, the override works (but its __get__ is not called, because descriptors stored on an instance are never invoked):

>>> class Foo(object):
...     desc = Descriptor('Class-stored descriptor')
... 
>>> Foo.desc
Getting Class-stored descriptor, with instance None, class <class '__main__.Foo'>
>>> Foo().desc
Getting Class-stored descriptor, with instance <__main__.Foo object at 0x1018df510>, class <class '__main__.Foo'>
>>> foo = Foo()
>>> foo.__dict__['desc'] = Descriptor('Instance-stored descriptor')
>>> foo.desc
<__main__.Descriptor object at 0x1018df1d0>
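One practical consequence of the instance __dict__ shadowing a non-data descriptor is lazy, compute-once attributes. The following is a minimal Python 3 sketch of that pattern; the names `lazy` and `Circle` are invented for illustration and are not part of the answer above.

```python
class lazy:
    """Hypothetical non-data descriptor: compute once, then cache on the instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.func(instance)
        # Because this class defines no __set__, the entry stored here
        # shadows the descriptor on every later lookup.
        instance.__dict__[self.name] = value
        return value


class Circle:
    def __init__(self, radius):
        self.radius = radius
        self.calls = 0  # counts how often area is actually computed

    @lazy
    def area(self):
        self.calls += 1
        return 3.14159 * self.radius ** 2


c = Circle(2)
first = c.area   # goes through lazy.__get__ and computes the value
second = c.area  # served straight from c.__dict__
print(c.calls)
```

If `lazy` also defined a `__set__`, it would become a data descriptor and `__get__` would run on every access, so the cached entry would never be used.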
Martijn Pieters
  • can u explain with an example? – sureshvv Oct 22 '12 at 08:19
  • 2
    Here's the implementation for CPython 2.7.3: [get](http://hg.python.org/cpython/file/70274d53c1dd/Objects/object.c#l1382) and [set](http://hg.python.org/cpython/file/70274d53c1dd/Objects/object.c#l1494). The order is data descriptor, instance attribute, non-data descriptor, and finally (get only) a class attribute. – Eryk Sun Oct 22 '12 at 11:42
  • 1
    Yes, I already understood that part. My question was more about the *rationale* for such a rule; i.e., I will never "accidentally" override a class' data descriptor in an instance (i.e. to do so I need to access __dict__) so why forbid me to do so if I really want to? – antony Oct 23 '12 at 21:55
  • 3
    @antony: this is a conscious decision, for a longer explanation see [PEP 252](http://www.python.org/dev/peps/pep-0252/); but the short of it is so that you cannot override static attributes like `__class__` in the instance `__dict__`. – Martijn Pieters Oct 24 '12 at 11:34
  • @antony: or, put differently, if a descriptor has `__set__` defined, it is interested in intercepting assignment (including denying assignment, making it a read-only attribute). To this end it is given precedence; Python cannot detect if the descriptor is allowing writing the attribute (`__set__` will not raise an exception) and thus the instance `__dict__` value cannot override it, as that would allow you to circumvent such a descriptor. – Martijn Pieters Oct 24 '12 at 11:50
  • 3
    Still not convinced (again, my point is that Python's approach is usually one of "consenting adults" and that fiddling with `__dict__` should get you, well, what you want -- if I wanted to dynamically update an attribute while respecting the descriptor protocol I can use `setattr`) but I guess I should not argue on language design here. – antony Oct 24 '12 at 17:51
  • @antony: I tend to agree that the 'nanny' rationales (read-only attributes, preventing nonsense like shadowing a ctypes Structure field, etc) are weak. However, it would be inefficient to always have to find the instance dict and look up the attribute, just for the rare desire to override a data descriptor. There's nothing stopping you from checking the instance manually in the descriptor methods. – Eryk Sun Oct 27 '12 at 18:03
  • 2
    On the contrary, the implemented approach looks less efficient to me: you must now check the type dict for a data descriptor, then the instance dict for a non-descriptor attribute, and then the type dict again for a non-data descriptor. – antony Oct 27 '12 at 21:07
  • @antony: It only calls `_PyType_Lookup(tp, name)` once, which caches recently accessed attributes, including `NULL` if the name wasn't found in the bases. – Eryk Sun Oct 28 '12 at 00:00
  • 1
    But without this rule you wouldn't need to call this even once if the attribute is found in the instance dict. (note: I don't really think performance is a good argument here, it's not as if this was the slowest part of python :)) – antony Oct 28 '12 at 00:11
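Eryk Sun's suggestion in the comments above, checking the instance manually inside the descriptor methods, could look roughly like the sketch below. It assumes Python 3.6+ (for `__set_name__`); the names `Overridable` and `Config` are hypothetical. The descriptor still denies ordinary assignment, but deliberately honours an explicit entry in the instance `__dict__`.

```python
class Overridable:
    """Hypothetical data descriptor that opts in to instance __dict__ overrides."""

    def __init__(self, default):
        self.default = default

    def __set_name__(self, owner, name):
        # Remember the attribute name this descriptor was assigned to.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # Manual check: prefer an explicit instance __dict__ entry if present.
        return instance.__dict__.get(self.name, self.default)

    def __set__(self, instance, value):
        # Defining __set__ makes this a data descriptor; plain assignment is denied.
        raise AttributeError('%s is read-only' % self.name)


class Config:
    limit = Overridable(10)


c = Config()
before = c.limit  # no override yet, so the default is returned

blocked = False
try:
    c.limit = 5   # routed to Overridable.__set__, which refuses
except AttributeError:
    blocked = True

c.__dict__['limit'] = 99  # explicit override, bypassing __set__
after = c.limit

print(before, blocked, after)
```

The override works here only because `__get__` chooses to look in the instance `__dict__`; the default lookup order is unchanged.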