1

I was working with multiprocessing and multithreading in Python when I learned about the GIL (Global Interpreter Lock), which allows only one thread to execute at any point in time. As a result, the interpreter can safely manage the reference count of each object.

Just out of curiosity, I decided to check the reference count for different data types, so I ran the code below:

import sys

integer_1 = 9
float_1 = 3.14
string_1 = "Some String"
boolean_1 = True
LIST_1 = [1, 3]
TUPLE_1 = (1, 3)
SET_1 = {1, 3}
DICTIONARY_1 = {1: "1", 3: "3"}

print("integer    : ", sys.getrefcount(integer_1))
print("float      : ", sys.getrefcount(float_1))
print("string     : ", sys.getrefcount(string_1))
print("boolean    : ", sys.getrefcount(boolean_1))
print("LIST       : ", sys.getrefcount(LIST_1))
print("TUPLE      : ", sys.getrefcount(TUPLE_1))
print("SET        : ", sys.getrefcount(SET_1))
print("DICTIONARY : ", sys.getrefcount(DICTIONARY_1))

And I got this shocking output:

integer    :  16
float      :  4
string     :  4
boolean    :  106
LIST       :  2
TUPLE      :  4
SET        :  2
DICTIONARY :  2

For LIST, SET and DICTIONARY the reference count is 2, which is completely understandable: Python passes objects by reference, so one reference is mine and the other is held by the getrefcount() function while it receives the argument.
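
As a rough check of that reasoning, the sketch below measures how the count of a fresh list changes relative to its starting value when another name or a container refers to it. The function name `refcount_deltas` is made up for illustration, and the absolute numbers are a CPython implementation detail that can vary between versions, which is why only differences are compared.

```python
import sys


def refcount_deltas():
    """Return how the refcount of a fresh list changes as references come and go."""
    data = [1, 3]
    # Baseline; it includes whatever temporary reference the call itself adds.
    base = sys.getrefcount(data)

    alias = data                    # a second name bound to the same object
    after_alias = sys.getrefcount(data) - base

    holder = {"key": data}          # a container that also refers to the object
    after_holder = sys.getrefcount(data) - base

    del alias, holder               # dropping both extra references
    after_del = sys.getrefcount(data) - base
    return after_alias, after_holder, after_del


print(refcount_deltas())
```

Each extra name or container should raise the count by one, and deleting them should bring it back to the baseline.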

But why do the other data types have different reference counts?

Usman Ghani Mughal

2 Answers

2

For mutable types you see 2, and you already know why.

References to immutable objects can be shared without causing problems. Also, your script isn't the only module that is loaded and executed by the Python interpreter. Other internal code has already used these objects (during startup, for instance).

Small integers in particular are the most heavily used integers elsewhere, for example as indices into sequence types and in many other places. (In fact, CPython caches the integers from -5 to 256 in an array and treats them as singletons. Whenever you create an integer in this range, you get back a reference to the cached object.)

The same is true for the booleans True and False.
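
A minimal sketch of how one might observe this caching, assuming CPython (the cache is an implementation detail, not a language guarantee). The helper name `same_object_each_time` is hypothetical. The integers are parsed from strings at runtime so that the compiler cannot merge equal literals into a single constant, which would hide the effect.

```python
def same_object_each_time(text):
    """Parse `text` twice at runtime and report whether both results are one object."""
    first = int(text)
    second = int(text)
    return first is second


# Values inside -5..256 are expected to come from the cache on CPython;
# values just outside that range are expected to be separate objects.
for text in ("-6", "-5", "9", "256", "257"):
    print(text, same_object_each_time(text))

# True and False are singletons, so bool() hands back the same two objects.
print(bool(1) is True, bool(0) is False)
```

Because every variable holding a cached value points at the same object, its reference count reflects all users in the whole interpreter, not just your script.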

One more note about multithreading: threads are only switched between bytecode instructions, so the reference counting system is thread-safe in this respect.

S.B
  • 1
    To maybe clarify this a bit: Some values (like byte-sized numbers and booleans) are not created when you assign them somewhere. Instead Python holds those static and just hands you a reference (hence you just +1 it). You can check that behaviour if you use `id(your_small_int_or_boolean)`. It's the same for every variable holding the same value. – Ric Hard Apr 22 '22 at 08:02
  • @RicHard You're right. I just added to the answer. – S.B Apr 22 '22 at 08:08
0

You could use gc.get_referrers() to see what the referring objects are, which may help with your shock.

I elided True, because it's understandably used in a bunch of places.

The "refs N" counts occasionally differ from the number of referrers listed by gc.get_referrers(), since getrefcount() only looks at the refcount of an object, and that can change while a CPython function (such as getrefcount() itself) is executing.

import gc
import sys

# Uses the variables defined in the question's script (integer_1, float_1, ...).
for obj in (integer_1, float_1, string_1, LIST_1, TUPLE_1, SET_1, DICTIONARY_1):
    print(repr(obj), "of type", type(obj), "refs", sys.getrefcount(obj))
    for ref in gc.get_referrers(obj):
        print("  ", type(ref), str(ref)[:70])

On my interpreter this outputs something like this (# lines are my commentary).

9 of type <class 'int'> refs 26
# module internals for `__main__`
   <class 'list'> [b'import', b'gc', 'gc', b'', 'gc', b'import', b'sys', 'sys', b'', 'sy
   <class 'tuple'> (0, None, 9, 3.14, 'Some String', True, 1, 3, (1, 3), '1', '3', 'of ty
# module internals for some other module
   <class 'tuple'> (9, 3.14, 'Some String', [1, 3], (1, 3), {1, 3}, {1: '1', 3: '3'})
   <class 'tuple'> (None, ('to',), <code object print_ at 0x10906f680, file "/usr/local/C
# a tuple from some other module
   <class 'tuple'> (LITERAL, 9)
# module globals dicts for various modules
   <class 'dict'> {'__name__': '_locale', '__doc__': 'Support for POSIX locales.', '__pa
   <class 'dict'> {'__name__': 'tokenize', '__doc__': 'Tokenization help for Python prog
   <class 'dict'> {'__name__': 'token', '__doc__': 'Token constants.', '__package__': ''
   <class 'dict'> {'__new__': <built-in method __new__ of type object at 0x108d7f748>, '
   <class 'dict'> {'__new__': <built-in method __new__ of type object at 0x108d81938>, '
   <class 'dict'> {'__name__': '_signal', '__doc__': 'This module provides mechanisms to
   <class 'dict'> {'__name__': '__main__', '__doc__': None, '__package__': None, '__load
   <class 'dict'> {'__name__': 'stat', '__doc__': 'Constants/functions for interpreting 
   <class 'dict'> {'__name__': '_stat', '__doc__': 'S_IFMT_: file type bits\nS_IFDIR: di


3.14 of type <class 'float'> refs 6
# module internals and globals dict for `__main__`
   <class 'list'> [b'import', b'gc', 'gc', b'', 'gc', b'import', b'sys', 'sys', b'', 'sy
   <class 'tuple'> (0, None, 9, 3.14, 'Some String', True, 1, 3, (1, 3), '1', '3', 'of ty
   <class 'tuple'> (9, 3.14, 'Some String', [1, 3], (1, 3), {1, 3}, {1: '1', 3: '3'})
   <class 'dict'> {'__name__': '__main__', '__doc__': None, '__package__': None, '__load

'Some String' of type <class 'str'> refs 6
# module internals and globals dict for `__main__`
   <class 'list'> [b'import', b'gc', 'gc', b'', 'gc', b'import', b'sys', 'sys', b'', 'sy
   <class 'tuple'> (0, None, 9, 3.14, 'Some String', True, 1, 3, (1, 3), '1', '3', 'of ty
   <class 'tuple'> (9, 3.14, 'Some String', [1, 3], (1, 3), {1, 3}, {1: '1', 3: '3'})
   <class 'dict'> {'__name__': '__main__', '__doc__': None, '__package__': None, '__load

[1, 3] of type <class 'list'> refs 4
# module internals and globals dict for `__main__`
   <class 'tuple'> (9, 3.14, 'Some String', [1, 3], (1, 3), {1, 3}, {1: '1', 3: '3'})
   <class 'dict'> {'__name__': '__main__', '__doc__': None, '__package__': None, '__load

(1, 3) of type <class 'tuple'> refs 6
# module internals and globals dict for `__main__`
   <class 'list'> [b'import', b'gc', 'gc', b'', 'gc', b'import', b'sys', 'sys', b'', 'sy
   <class 'tuple'> (0, None, 9, 3.14, 'Some String', True, 1, 3, (1, 3), '1', '3', 'of ty
   <class 'tuple'> (9, 3.14, 'Some String', [1, 3], (1, 3), {1, 3}, {1: '1', 3: '3'})
   <class 'dict'> {'__name__': '__main__', '__doc__': None, '__package__': None, '__load

{1, 3} of type <class 'set'> refs 4
# module internals and globals dict for `__main__`
   <class 'tuple'> (9, 3.14, 'Some String', [1, 3], (1, 3), {1, 3}, {1: '1', 3: '3'})
   <class 'dict'> {'__name__': '__main__', '__doc__': None, '__package__': None, '__load

{1: '1', 3: '3'} of type <class 'dict'> refs 4
# module internals and globals dict for `__main__`
   <class 'tuple'> (9, 3.14, 'Some String', [1, 3], (1, 3), {1, 3}, {1: '1', 3: '3'})
   <class 'dict'> {'__name__': '__main__', '__doc__': None, '__package__': None, '__load

This also ties into the CPython small integer cache; certain small integers (e.g. that 9) are always the same object for performance reasons, while arbitrary integers are likely allocated separately:

923852 of type <class 'int'> refs 6
   <class 'list'> [b'import', b'gc', 'gc', b'', 'gc', b'import', b'sys', 'sys', b'', 'sy
   <class 'tuple'> (0, None, 9, 923852, 3.14, 'Some String', True, 1, 3, (1, 3), '1', '3'
   <class 'tuple'> (923852,)
   <class 'dict'> {'__name__': '__main__', '__doc__': None, '__package__': None, '__load
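
To compare the two cases without the extra references that come from literals stored in the module's constants, here is a small sketch that builds both integers at runtime. The names `small` and `large` are chosen for illustration and the behaviour is CPython-specific. The exact numbers depend on the interpreter version: on recent CPython releases the cached small integers may be reported with a very large count, so only the relative order is meaningful.

```python
import sys

small = int("9")        # built at runtime; CPython is expected to return the cached object
large = int("923852")   # built at runtime; expected to be a brand-new object

# The cached 9 is shared with the rest of the interpreter, while the large
# integer is referenced only from this script.
print("small:", sys.getrefcount(small))
print("large:", sys.getrefcount(large))
```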
AKX