Code:

from weakref import WeakSet
se = WeakSet()
se.add(1)

Output:

TypeError: cannot create weak reference to 'int' object

Doc:

Several built-in types such as list and dict do not directly support weak references but can add support through subclassing:

...

Other built-in types such as tuple and int do not support weak references even when subclassed (This is an implementation detail and may be different across various Python implementations.).

This isn't detailed enough to explain:

  • Why don't some built-in types support weak references?

  • Exactly which types support weak references?


To add some thoughts:

For the above example, you can wrap the int in a user-defined wrapper class, and that wrapper class supports weak references (those who are familiar with Java will recall int and Integer):

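from weakref import WeakSet

se = WeakSet()

class Integer:
    """Minimal wrapper that makes an int value weak-referenceable."""
    def __init__(self, n=0):
        self.n = n

i = 1
I = Integer(1)

try:
    se.add(i)   # fails: int instances cannot be weakly referenced
except TypeError as exc:
    print(exc)

se.add(I)       # ok: instances of an ordinary class can be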

I'm not sure why Python doesn't provide auto-wrapping for commonly used built-in types (int, str, etc.) and instead simply says they don't support weak references. It might be for performance reasons, but not being able to weakly reference these built-in types greatly reduces the usefulness of weakref.
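To illustrate what the wrapper buys you, here is a minimal sketch (the `Integer` class mirrors the one above; the variable names are only for illustration). It shows that the WeakSet entry goes away once the last strong reference to the wrapper is dropped. The `gc.collect()` call is there for implementations that do not free objects immediately.

```python
import gc
from weakref import WeakSet

class Integer:
    """Same illustrative wrapper as above."""
    def __init__(self, n=0):
        self.n = n

se = WeakSet()
wrapped = Integer(1)
se.add(wrapped)
size_while_alive = len(se)   # the set holds one live entry

del wrapped                  # drop the only strong reference
gc.collect()                 # give non-refcounting implementations a chance to collect
size_after_delete = len(se)  # the entry has been discarded

print(size_while_alive, size_after_delete)
```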

Grokify
Cyker
  • Related: [What exactly is \_\_weakref\_\_ in Python?](//stackoverflow.com/q/36787603) (arguably the same question but backwards) – Aran-Fey Aug 24 '18 at 20:44
  • Do you know anything about the C API, and have you followed the link (immediately after the docs you quoted) to [Weak Reference Support](https://docs.python.org/3/extending/newtypes.html#weakref-support)? If so, part 1 of the answer is pretty simple, although part 2 is still not so much. If not, it's a lot more complicated, because it would require explaining most of what those docs explain. – abarnert Aug 24 '18 at 20:46
  • @Aran-Fey *What is `__weakref__`* is not *What is weakref*? That question mentions nothing about built-in types. – Cyker Aug 24 '18 at 20:53
  • @abarnert I followed the link but it doesn't tell you why built-in types don't support weakrefs. The only conclusion you can draw from that section might be *built-in types are too simple to support weakref*, but this is not expressive enough. – Cyker Aug 24 '18 at 20:55
  • @Cyker That’s why I said that answering part 1 would be pretty simple if you’d read that, rather than saying that reading it would give you the answer. (It tells you what builtin types would have to do if they wanted to be weakref-able, but you still need the explanation of why they don’t do that. And, for part 2, the only answer is slogging through each builtin type, either in source or in the REPL, to check which ones do.) – abarnert Aug 24 '18 at 20:58
  • Java's `int`/`Integer` distinction is completely different. Python has no concept of primitives; an int is already an object. In Java, you can't even put an int in a list. The wrappers are near-mandatory. Also, your code is prone to Integer + Integer or Integer + int TypeErrors, which are essentially impossible in the Java type system. – user2357112 Aug 24 '18 at 21:23
  • @user2357112 Well I understand in Python an `int` is an `object`. The example is posted at its simplest and only for providing a possible way of supporting weakref on built-in types. But given an `int` is an `object`, think about this: Inheriting from `object` gives you a class supporting weak references but inheriting from `int` doesn't. This is somewhat weird. – Cyker Aug 24 '18 at 21:31
  • Related: https://stackoverflow.com/questions/7711246/why-weakref-doesnt-support-built-in-types-in-python/7712764 – ws_e_c421 Feb 13 '20 at 17:19

2 Answers

First: this is all CPython-specific. Weakrefs work differently on different Python implementations.

Most built-in types don't support weak references because Python's weak reference mechanism adds some overhead to every object that supports weak references, and the Python dev team decided they didn't want most built-in types to pay that overhead. The simplest way this overhead manifests is that any object with weak reference support needs space for an extra pointer for weakref management, and most built-in objects don't reserve space for that pointer.

Attempting to compile a complete list of all types with weak reference support is about as fruitful as trying to compile a complete list of all humans with red hair. If you want to determine whether a type has weak reference support, you can check its __weakrefoffset__, which is nonzero for types with weak reference support:

>>> int.__weakrefoffset__
0
>>> type.__weakrefoffset__
368
>>> tuple.__weakrefoffset__
0
>>> class Foo(object):
...     pass
... 
>>> class Bar(tuple):
...     pass
... 
>>> Foo.__weakrefoffset__
24
>>> Bar.__weakrefoffset__
0

A type's __weakrefoffset__ is the offset in bytes from the start of an instance to the weakref pointer, or 0 if instances have no weakref pointer. It corresponds to the type struct's tp_weaklistoffset at C level. As of the time of this writing, __weakrefoffset__ is completely undocumented, but tp_weaklistoffset is documented, because people implementing extension types in C need to know about it.
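As a rough sketch of the check described above, the two helpers below (hypothetical names, not part of any library) compare the type-level `__weakrefoffset__` test with simply trying `weakref.ref` on an instance. This assumes CPython; `__weakrefoffset__` is undocumented, and its exact value varies by version and platform, so only "zero or nonzero" is relied on here.

```python
import weakref

def type_has_weakref_slot(tp):
    """CPython-specific: nonzero offset means instances carry a weakref pointer."""
    return tp.__weakrefoffset__ != 0

def can_weakref(obj):
    """Portable check: just try to create a weak reference."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True

class Foo(object):
    pass

class Bar(tuple):
    pass

class MyList(list):
    pass

for tp in (int, tuple, list, Foo, Bar, MyList):
    print(tp.__name__, type_has_weakref_slot(tp), can_weakref(tp()))
```

The try/except version is the safer choice in portable code, since it does not depend on a CPython implementation detail.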

user2357112
  • There’s also the historical fact that most of the builtin types predate weakrefs, so adding weakref support would have been a performance regression for tons of existing Python code. But otherwise I think this covers everything. – abarnert Aug 24 '18 at 21:01
  • Thank you for providing `__weakrefoffset__`. From the code you posted, can we safely say all user-defined classes directly inheriting from `object` (possibly recursively, but doesn't involve another built-in class in the chain) would support weak references? – Cyker Aug 24 '18 at 21:20
  • @Cyker: No, because `__slots__` complicates things. – user2357112 Aug 24 '18 at 21:21
  • @user2357112 Could you please give an example of that? I think most developers won't bother with `__slots__` when defining a custom class. If so, and they define the subclass by `class Foo(object): ...`, then `Foo` seems to always support weak references in CPython. Not sure about this on different Python implementations. – Cyker Aug 24 '18 at 21:36
  • @Cyker Any developer who needs to optimize memory usage will definitely think about using `__slots__` on their types, and any developer who doesn't need to optimize memory usage probably isn't looking for weakrefs, so I think user's point is pretty relevant. And meanwhile, [here's a trivial example](https://repl.it/repls/ToughShowyArtificialintelligence). – abarnert Aug 24 '18 at 22:40
  • Link to what's the "overhead" https://mail.python.org/pipermail/python-list/2005-March/346298.html – Bhupesh Varshney Aug 29 '22 at 18:01
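The `__slots__` caveat raised in the comments above can be sketched as follows. This is an illustrative example, not the code behind the linked repl, and the class names are made up. A class that defines `__slots__` without listing `'__weakref__'` gets no weakref pointer, so its instances cannot be weakly referenced even though it inherits directly from `object`.

```python
import weakref

class Plain:
    """Ordinary class: gets __dict__ and __weakref__ automatically."""

class Slotted:
    """__slots__ without '__weakref__': no room for a weakref pointer."""
    __slots__ = ("x",)

class SlottedWeak:
    """Listing '__weakref__' in __slots__ restores weak reference support."""
    __slots__ = ("x", "__weakref__")

def can_weakref(obj):
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True

results = {cls.__name__: can_weakref(cls()) for cls in (Plain, Slotted, SlottedWeak)}
print(results)
```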
There are two things that aren’t covered by user’s excellent answer.


First, weakref was added to Python in version 2.1.

For everything added after 2.1 (and that includes object and type), the default was to add weakref support unless there was a good reason not to.

But for everything that already existed, especially pretty small ones like int, adding another 4 bytes (most Python implementations were 32-bit at the time, so let’s just call a pointer 4 bytes) could cause a noticeable performance regression for all of the Python code out there that had been written for 1.6/2.0 or earlier. So, there was a higher bar to pass for adding weakref support to those types.


Second, Python allows the implementation to merge values of builtin types that it can prove are immutable, and for a few of those builtin types, CPython takes advantage of that. For example (the details vary across versions, so take this only as an example):

  • Integers from -5 to 256, the empty string, single-character printable ASCII strings, the empty bytes, single-byte bytes, and the empty tuple get singleton instances created at startup, and most attempts to construct a new value equal to one of these singletons instead get a reference to the singleton.
  • Many strings are cached in a string intern table, and many attempts to construct a string with the same value as an interned string instead get a reference to the existing one.
  • Within a single compilation unit, the compiler will merge two separate constants that are equal ints, strings, tuples of ints and strings, etc. into two references to the same constant.

So, weakrefs to these types wouldn’t be as useful as you’d initially think. Many values just aren’t ever going to go away, because they’re references to singletons or module constants or interned strings. Even for those that aren’t immortal, you probably have more references to them than you expected.
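The sharing described in the list above can be observed with a small sketch. This assumes CPython and, as noted, the details vary across versions. The values are built at run time from strings so that the compiler's constant merging does not affect the result; the variable names are only for illustration.

```python
import sys

# Build values at run time so the compiler cannot merge them as constants.
small_a, small_b = int("256"), int("256")
big_a, big_b = int("257"), int("257")

print(small_a is small_b)  # CPython: both names refer to the one cached 256 object
print(big_a is big_b)      # CPython: two distinct objects with equal values

# Explicit interning returns the same object for equal strings.
s1 = sys.intern("".join(["weak", "ref"]))
s2 = sys.intern("".join(["weak", "ref"]))
print(s1 is s2)
```

A weak reference to the cached `256` object would never be cleared, because the interpreter itself keeps that object alive.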

Sure, there are some cases where weakrefs would be useful anyway. If I calculate a billion large integers, most of those won’t be immortal, or shared. But it means they’re useful less often for these types, which has to be a factor when weighing the tradeoffs of making every int 4 bytes larger so you can save memory by safely releasing them in some relatively uncommon cases.

abarnert