Why does CPython return a new pointer to the True and False singletons and increment its reference count?

Question

A Python Boolean is described as follows in the documentation:

Booleans in Python are implemented as a subclass of integers. There are only two booleans, Py_False and Py_True.

As I understand it, Py_False and Py_True are singletons corresponding to False and True respectively.

Indeed, the following returns True in my Jupyter notebook:

a = True
b = True
a is b

False works the same way.
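A slightly fuller version of the same check, as a minimal sketch: it assumes CPython, and the variable names are made up for illustration. Every route to a Boolean value, whether a literal, a call to bool(), or a comparison, ends up at the same two objects.

```python
a = True
b = True
x = 5  # arbitrary value, only used to produce comparison results

same_literal = a is b                    # two names, one object
same_from_bool = bool(1) is True         # bool() returns the existing singleton
same_from_compare = (x > 3) is True      # comparisons return it as well
false_is_shared = bool(0) is (x < 3)     # False behaves the same way
bool_is_int = isinstance(True, int)      # bool is a subclass of int

print(same_literal, same_from_bool, same_from_compare,
      false_is_shared, bool_is_int)
```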

The PyBool_FromLong function returns a Boolean object for a given long. It does not create a new object, but it does return a new reference to the existing singleton, incrementing its reference count before returning it:

PyObject *PyBool_FromLong(long ok)
{
    PyObject *result;

    if (ok)
        result = Py_True;
    else
        result = Py_False;
    return Py_NewRef(result);
}
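One way to confirm from Python that this function hands back the existing singletons rather than fresh objects is to call it through ctypes. This is a sketch that assumes CPython, where ctypes.pythonapi exposes the C API; the names from_long, truthy and falsy are just for illustration.

```python
import ctypes

# Assumption: running on CPython, so ctypes.pythonapi gives access to the C API.
from_long = ctypes.pythonapi.PyBool_FromLong
from_long.argtypes = [ctypes.c_long]
# py_object makes ctypes hand the returned PyObject* back as a Python object.
from_long.restype = ctypes.py_object

truthy = from_long(42)  # any non-zero long maps to Py_True
falsy = from_long(0)    # zero maps to Py_False

print(truthy is True, falsy is False)
```

Both identity checks should hold, since the C code only ever selects between Py_True and Py_False.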

Py_True and Py_False are defined as follows:

/* Py_False and Py_True are the only two bools in existence.
Don't forget to apply Py_INCREF() when returning either!!! */

/* Don't use these directly */
PyAPI_DATA(PyLongObject) _Py_FalseStruct;
PyAPI_DATA(PyLongObject) _Py_TrueStruct;

/* Use these macros */
#define Py_False _PyObject_CAST(&_Py_FalseStruct)
#define Py_True _PyObject_CAST(&_Py_TrueStruct)

The comments above are quite insistent that you increment the reference count when returning either, and that's exactly what the function I showed above does. I'm somewhat confused as to why this is necessary, though, since (as I understand it) these are just singletons that will never be garbage collected.

I was able to find this Q&A about whether incrementing the ref count is always necessary, but I'm still confused about why it's needed in the first place, given that the True and False objects are singletons that would never be garbage collected.

I'm not sure if I'm missing something obvious, but can someone explain why it's necessary to increment the reference count when returning a reference to Py_False or Py_True? Or is this to prevent the object from ever being garbage collected?

[This](https://stackoverflow.com/a/41509916/3000206) answer appears to address why you need to take reference counts into consideration. It needs to quack like a duck. — Carcigenicate, Dec 04 '22 at 21:07
@Carcigenicate Good point. That being said, what happens if the reference count reaches 0? Is that even possible? I assume that the objects should never be garbage collected. — EJoshuaS - Stand with Ukraine, Dec 04 '22 at 21:10

score 0 · Accepted Answer · answered Jan 13 '23 at 16:51

Since the previous answer was deleted, I will summarize here. It turns out that this is to prevent the objects from being garbage collected.

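Since these objects are singletons, they should by definition never be garbage collected. Code that receives an object from the C API normally owns that reference and eventually releases it with Py_DECREF, so a function that handed out Py_True or Py_False without incrementing the count first would let the count drift down by one on every call. If we prevent the reference count from reaching zero, the interpreter will never deallocate them. This approach has the benefit that the interpreter does not need to explicitly check whether an object is a singleton before deallocating it, thus avoiding some potentially ugly special-case reasoning.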
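To make the bookkeeping concrete, here is a toy model of reference counting written in Python. It is only an illustration: ToyObject, incref, decref and the two from_long_* functions are hypothetical stand-ins, not CPython's actual implementation. The caller always releases the reference it was handed, as ordinary C API code would.

```python
class ToyObject:
    """Hypothetical stand-in for a PyObject with a manual reference count."""

    def __init__(self, name, refcount=1):
        self.name = name
        self.refcount = refcount
        self.freed = False


def incref(obj):
    obj.refcount += 1


def decref(obj):
    obj.refcount -= 1
    if obj.refcount == 0:
        obj.freed = True  # a real interpreter would deallocate the object here


def from_long_without_incref(singleton):
    # Hands out a reference the caller does not actually own.
    return singleton


def from_long_with_incref(singleton):
    # Mirrors the pattern in the question: increment, then return.
    incref(singleton)
    return singleton


def caller(get_bool, singleton):
    result = get_bool(singleton)
    decref(result)  # the caller releases what it was handed


broken = ToyObject("True")
caller(from_long_without_incref, broken)
print(broken.refcount, broken.freed)  # 0 True

safe = ToyObject("True")
caller(from_long_with_incref, safe)
print(safe.refcount, safe.freed)  # 1 False
```

Without the increment, a single well-behaved caller is enough to drive the count to zero. With it, every release is matched by an earlier increment and the count returns to its starting value.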

I assume (but haven't been able to prove yet) that the reference count will already be 1 prior to users explicitly referring to them. We can then just use the "ordinary" memory management mechanism on them without having to worry about them ever being garbage collected.
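The count can be inspected from Python with sys.getrefcount. This is a rough sketch, and the exact numbers depend on the interpreter version. On CPython releases before 3.12 the count for True should grow as more references are created. From 3.12 onward True and False are immortal objects (PEP 683), so the reported count is expected to stay fixed. The snippet therefore only prints version-independent facts.

```python
import sys

before = sys.getrefcount(True)
extra_refs = [True] * 1000  # hold 1000 additional references to the singleton
after = sys.getrefcount(True)

# The count is never below 1, and it never drops while extra_refs is alive.
print(before >= 1, after >= before)
```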
