41

I encountered a strange bug in Python: using the __new__ method of a class as a factory causes the __init__ method of the instantiated class to be called twice.

The original idea was to have the __new__ method of the parent class return an instance of one of its subclasses, chosen according to the parameters passed in, without having to declare a factory function outside the class.

I know that a factory function would be the best design pattern here, but changing the design at this point in the project would be costly. Hence my question: is there a way to avoid the double call so that __init__ runs only once in such a scheme?

class Shape(object):
    def __new__(cls, desc):
        if cls is Shape:
            if desc == 'big':   return Rectangle(desc)
            if desc == 'small': return Triangle(desc)
        else:
            return super(Shape, cls).__new__(cls, desc)

    def __init__(self, desc):
        print "init called"
        self.desc = desc

class Triangle(Shape):
    @property
    def number_of_edges(self): return 3

class Rectangle(Shape):
    @property
    def number_of_edges(self): return 4

instance = Shape('small')
print instance.number_of_edges

>>> init called
>>> init called
>>> 3

Any help greatly appreciated.

xApple

3 Answers

68

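When you construct an object, Python calls the class's __new__ method to create the object and then calls __init__ on the object that is returned, provided that object is an instance of the class (instances of subclasses count). When you create the object inside __new__ by calling Triangle(desc), that inner call runs __new__ and __init__ itself, so __init__ has already run once before the outer Shape(...) call runs it again on the returned object.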
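The construction sequence described above can be sketched in Python 3 as follows. The `construct` helper and the `calls` list are hypothetical names introduced only for illustration; `construct` is a simplified emulation of what calling a class does, not CPython's actual implementation. Unlike the question's code, this sketch does not forward `desc` to `object.__new__`, because Python 3 rejects the extra argument there.

```python
calls = []  # hypothetical log: class name of the instance each __init__ call received


class Shape(object):
    def __new__(cls, desc):
        if cls is Shape:
            # Calling a subclass here runs its __new__ AND its __init__.
            if desc == 'big':
                return Rectangle(desc)
            if desc == 'small':
                return Triangle(desc)
        # Subclasses (and unknown descriptions) are allocated normally.
        return super(Shape, cls).__new__(cls)

    def __init__(self, desc):
        calls.append(type(self).__name__)
        self.desc = desc


class Triangle(Shape):
    pass


class Rectangle(Shape):
    pass


def construct(cls, *args, **kwargs):
    """Simplified emulation of the steps behind cls(*args, **kwargs)."""
    obj = cls.__new__(cls, *args, **kwargs)
    # __init__ runs only if __new__ returned an instance of cls;
    # instances of subclasses count.
    if isinstance(obj, cls):
        type(obj).__init__(obj, *args, **kwargs)
    return obj


Shape('small')
print(calls)   # ['Triangle', 'Triangle']

del calls[:]
construct(Shape, 'small')
print(calls)   # ['Triangle', 'Triangle']
```

In both cases the first entry comes from the inner `Triangle(desc)` call and the second from the outer call re-initialising the object it got back.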

What you should do is:

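class Shape(object):
    def __new__(cls, desc):
        if cls is Shape:
            # Allocate the subclass instance directly; Python then calls
            # __init__ exactly once on the object returned from here.
            if desc == 'big':   return super(Shape, cls).__new__(Rectangle)
            if desc == 'small': return super(Shape, cls).__new__(Triangle)
        else:
            # object.__new__ takes no extra arguments, so desc is not forwarded.
            return super(Shape, cls).__new__(cls)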

which creates a Rectangle or Triangle without triggering a call to __init__ inside __new__, so __init__ is then called only once.
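For readers on Python 3, here is a minimal runnable sketch of the same approach. The `init_calls` counter is a hypothetical addition used only to make the number of `__init__` calls visible, and the fall-through for unknown descriptions is an assumption. `desc` is not forwarded to `object.__new__`, since Python 3 raises a TypeError for extra arguments there when `__new__` is overridden.

```python
class Shape(object):
    init_calls = 0  # hypothetical counter, for illustration only

    def __new__(cls, desc):
        if cls is Shape:
            # Allocate the subclass instance without calling its __init__.
            if desc == 'big':
                return super(Shape, cls).__new__(Rectangle)
            if desc == 'small':
                return super(Shape, cls).__new__(Triangle)
        # Subclasses (and unknown descriptions) are allocated normally.
        return super(Shape, cls).__new__(cls)

    def __init__(self, desc):
        Shape.init_calls += 1
        print("init called")
        self.desc = desc


class Triangle(Shape):
    @property
    def number_of_edges(self):
        return 3


class Rectangle(Shape):
    @property
    def number_of_edges(self):
        return 4


instance = Shape('small')
print(type(instance).__name__, instance.number_of_edges)
```

Running this prints "init called" once, followed by "Triangle 3".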

Edit to answer @Adrian's question about how super works:

super(Shape, cls) searches cls.__mro__ to find Shape and then searches the remainder of the sequence for the attribute.

Triangle.__mro__ is (Triangle, Shape, object) and Rectangle.__mro__ is (Rectangle, Shape, object), while Shape.__mro__ is just (Shape, object). In any of those cases, super(Shape, cls) ignores everything in the MRO sequence up to and including Shape, so the only thing left is the single-element tuple (object,), and that is used to find the desired attribute.

This would get more complicated if you had a diamond inheritance:

class A(object): pass
class B(A): pass
class C(A): pass
class D(B,C): pass

Now a method in B might use super(B, cls). If cls is B, the search covers (A, object); but if cls is D, the same call in B searches (C, A, object), because D.__mro__ is (D, B, C, A, object).
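A small sketch makes the diamond case concrete. The `who` method is a hypothetical name added for illustration; it records which classes the `super` chain visits.

```python
class A(object):
    def who(self):
        return ['A']


class B(A):
    def who(self):
        # The next class after B depends on the MRO of type(self).
        return ['B'] + super(B, self).who()


class C(A):
    def who(self):
        return ['C'] + super(C, self).who()


class D(B, C):
    def who(self):
        return ['D'] + super(D, self).who()


print([k.__name__ for k in D.__mro__])  # ['D', 'B', 'C', 'A', 'object']
print(B().who())                        # ['B', 'A']
print(D().who())                        # ['D', 'B', 'C', 'A']
```

The same `super(B, self)` call inside B reaches A for a B instance but reaches C first for a D instance.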

So in this particular case you could define a new mixin class that modifies the construction behaviour of the shapes and you could have specialised triangles and rectangles inheriting from the existing ones but constructed differently.

Duncan
  • 2
    Wouldn't it be better to use `return Rectangle.__new__(Rectangle)` because this would guarantee that `__new__` of `Rectangle` gets called if it's defined? – Georg Schölly May 12 '11 at 18:38
  • 2
    @Georg, If you do that you are going to have to be pretty careful to avoid infinite recursion. Any class specific initialisation should be in `__init__` so I think it is pretty safe here to assume that `__new__`'s only job is to create an object of the correct type. – Duncan May 13 '11 at 09:22
  • The code in your answer fixes the problem but doesn't explain why `Shape.__init__()` was called twice in the OP's code, even when `Shape.__new__()` returns a `Triangle(desc)`. This seems contrary to what the [docs](http://docs.python.org/2/reference/datamodel.html?highlight=__new__#basic-customization) say: "If `__new__()` does not return an instance of `cls`, then the new instance’s `__init__()` method will _not_ be invoked." – martineau Jan 28 '13 at 00:33
  • 1
    Never mind, I'm guessing in the OP's code it's `isinstance(, Shape)` being `True` for a `Triangle` object that's allowing the `Shape.__init__()` method to be called on it. – martineau Jan 28 '13 at 01:01
  • @martineau: Precisely, `__init__` is called if the returned object is an instance of the specified class _or a subclass_. (It's my frank opinion that this is a Bad Thing, but alas, it's the way it is.) – javawizard Jun 10 '13 at 20:06
  • @GeorgSchölly `super(Shape, cls)` - in general, how does `super` use the first and second arguments to return the thing it returns? [I know `cls` must be a subclass of `Shape`](https://docs.python.org/3/library/functions.html#super) but how does the thing that `super` returns relate to `Shape` and `cls`? – Bob Jul 30 '16 at 19:38
  • @Adrian: I don't know all the details of super by heart, but you can have a look [at the documentation](https://docs.python.org/3/library/functions.html#super) or [the code](https://hg.python.org/cpython/file/tip/Objects/typeobject.c#l7129). It must be something like this: `super(Class, instance).method(args)` is the same as `Class.method(instance, args)` where `instance` is usually called `self` inside the method. – Georg Schölly Jul 31 '16 at 07:46
  • @Adrian, I added a description of how `super` works. – Duncan Aug 02 '16 at 09:05
  • 2
    What would be the equivalent in python 3? – geekscrap Aug 03 '20 at 14:33
14

After posting my question, I continued searching for a solution and found a way to solve the problem that looks like a bit of a hack. It is inferior to Duncan's solution, but I thought it would be interesting to mention nonetheless. The Shape class becomes:

class ShapeFactory(type):
    def __call__(cls, desc):
        if cls is Shape:
            if desc == 'big':   return Rectangle(desc)
            if desc == 'small': return Triangle(desc)
        return type.__call__(cls, desc)

class Shape(object):
    __metaclass__ = ShapeFactory 
    def __init__(self, desc):
        print "init called"
        self.desc = desc
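The code above uses the Python 2 `__metaclass__` attribute. A rough Python 3 equivalent, assuming the same class names, passes the metaclass as a keyword in the class statement instead. The `init_calls` counter is a hypothetical addition that only makes the number of `__init__` calls visible.

```python
class ShapeFactory(type):
    def __call__(cls, desc):
        if cls is Shape:
            # Redirect Shape(...) to the matching subclass; that call goes
            # through type.__call__ below and initialises the object once.
            if desc == 'big':
                return Rectangle(desc)
            if desc == 'small':
                return Triangle(desc)
        return type.__call__(cls, desc)


class Shape(metaclass=ShapeFactory):  # Python 3 metaclass syntax
    init_calls = 0  # hypothetical counter, for illustration only

    def __init__(self, desc):
        Shape.init_calls += 1
        print("init called")
        self.desc = desc


class Triangle(Shape):
    @property
    def number_of_edges(self):
        return 3


class Rectangle(Shape):
    @property
    def number_of_edges(self):
        return 4


instance = Shape('small')
print(type(instance).__name__, instance.number_of_edges)
```

Because `ShapeFactory.__call__` returns the finished object directly, nothing runs `__init__` on it a second time.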
xApple
  • 4
    Why do you say this is inferior to Duncan's solution. This seems much clearer to what's going on, imo less hacky. Also meta. – Andy Hayden Sep 22 '13 at 01:14
  • 3
    I thought it was less obvious because it adds a fourth class to the program and involves the `__metaclass__` black magic that not everyone is familiar with. – xApple Sep 25 '13 at 08:22
  • Although most folks probably don't care, one advantage of using inheritance instead of a metaclass is that the syntax for specifying metaclasses is different in Python 3 than it is in Python 2. Inheritance, on the other hand, is written similarly in both. That could make it a better fit when attempting to write code that will work unchanged in both versions of Python. – martineau Jul 05 '18 at 04:05
  • IMHO a metaclass is cleaner because here one is trying to implement top-down inheritance (creating a child from the parent) instead of the normal bottom-up order (creating the parent from the child). A metaclass seems to be the right way to do it :) Also, the whole purpose of `__call__` and `__new__` is to deal with such cases... – Marine Galantin Apr 03 '21 at 23:02
-1

I can't actually reproduce this behavior in either of the Python interpreters I have installed, so this is something of a guess. However...

__init__ is being called twice because you are initializing two objects: the original Shape object, and then one of its subclasses. If you change your __init__ so it also prints the class of the object being initialized, you will see this.

print type(self), "init called"

This is harmless because the original Shape will be discarded, since you are not returning a reference to it in your __new__().

Since calling a function is syntactically identical to instantiating a class, you can change this to a function without changing anything else, and I recommend that you do exactly that. I don't understand your reluctance.

kindall
  • 2
    This is not correct - the first `__init__` call happens *inside* the outer `__new__` call (when `Triangle()` and `Rectangle()` are called), but then, because an instance of `Shape` is returned by `__new__`, the original `Shape()` call invokes `__init__` *again* on that already initialised object. Note that if the object returned by `__new__()` is *not* an instance of `Shape()` then `__init__` won't be called (which can foul up attempts to observe this behaviour if the class hierarchy isn't right). – ncoghlan May 11 '11 at 03:26
  • Indeed, both are syntactically identical. My reluctance stems from the fact that if I define a function called "Shape", I must rename my class to something like "_Shape". This will cause some variable renaming, of course, but mostly it will have complicated consequences on other things like the documentation that is generated by sphinx-autodoc. – xApple May 11 '11 at 08:02
  • I suppose you do want to expose documentation on those classes so you can't just declare them *in* the function. You might try just reassigning the instance's `__class__` attribute in `__init__()` and not messing with `__new__()` at all. – kindall May 11 '11 at 13:55
  • ... one of the pitfalls of this, by the way, is that you will have to manually re-bind the instance methods from the correct class to the instance. – kindall May 11 '11 at 16:57
  • Another approach might be to use a metaclass to override the `__call__()` method on the class, so that the `()` syntax does not necessarily instantiate the class. – kindall May 11 '11 at 17:49
  • ... in fact, the metaclass approach does work, but the solution you accepted is much simpler. – kindall May 11 '11 at 18:07