I recently learned about the "sentinel object" pattern in python. I was taken by it, and started to use it wherever I could. However, after using it someplace where it isn't needed, a coworker asked me about it. Now, I can't see the use of it, given that "x in dict" exists. Here is a (truncated) canonical example, from the functools LRU cache library:
def _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo):
# Constants shared by all lru cache instances:
sentinel = object() # unique object used to signal cache misses
make_key = _make_key # build a key from the function arguments
PREV, NEXT, KEY, RESULT = 0, 1, 2, 3 # names for the link fields
cache = {}
hits = misses = 0
full = False
cache_get = cache.get # bound method to lookup a key or return None
cache_len = cache.__len__ # get cache size without calling len()
lock = RLock() # because linkedlist updates aren't threadsafe
root = [] # root of the circular doubly linked list
root[:] = [root, root, None, None] # initialize by pointing to self
if maxsize == 0:
def wrapper(*args, **kwds):
# No caching -- just a statistics update after a successful call
nonlocal misses
result = user_function(*args, **kwds)
misses += 1
return result
elif maxsize is None:
def wrapper(*args, **kwds):
# Simple caching without ordering or size limit
nonlocal hits, misses
key = make_key(args, kwds, typed)
result = cache_get(key, sentinel)
if result is not sentinel:
hits += 1
return result
result = user_function(*args, **kwds)
cache[key] = result
misses += 1
return result
Now, just focusing on the part where the pattern is used:
result = cache_get(key, sentinel)
if result is not sentinel:
hits += 1
return result
result = user_function(*args, **kwds)
cache[key] = result
misses += 1
return result
As far as I can tell, this can be rewritten the following way:
if key not in cache:
result = user_function(*args, **kwds)
cache[key] = result
misses += 1
else:
result = cache_get(key)
hits += 1
return result
I wondered: What is the benefit of this sentinel method? I thought it might be efficiency. The python wiki says that "x in s" is O(n) average case, while get item is O(1) average case. But does this actually make a practical time difference?
I ran some quick tests on my laptop, and the runtimes are close, both in scenarios where most keys are hits and where most keys are misses.
In reply to @martineau, I don't think there is any additional functionality we get from this pattern, as demonstrated by this interactive session:
>>> d={1:None}
>>> if 1 in d:
... print('one is there')
...
one is there
>>> if 2 in d:
... print('two is not')
...
>>> d={1:None,None:3}
>>> if None in d:
... print('we can find a none key as well')
...
we can find a none key as well
So, the question remains: What is the point of this pattern?