
I have an object with a property, multiplier. It is accessed many times in my program, so I decided to use lru_cache() on it to improve execution speed. As expected, it is much faster, but the cached value goes stale when the attribute it depends on changes.

The following code shows the problem:

from functools import lru_cache

class MyClass(object):
    def __init__(self):
        self.current_contract = 201706
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

CF = MyClass()
assert CF.multiplier == 1000

CF.current_contract = 201712
assert CF.multiplier == 25

The second assert fails: the cached value is still 1000, because lru_cache() is unaware that the underlying attribute current_contract has changed.
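
The stale entry can be observed through the cache statistics. This is a minimal sketch, assuming the class from the question; the names `first`, `stale`, and `info` are introduced here for illustration only. The second access is counted as a cache hit because the cache key is the instance itself, and the instance's default hash does not depend on current_contract.

```python
from functools import lru_cache

class MyClass(object):
    def __init__(self):
        self.current_contract = 201706
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

cf = MyClass()
first = cf.multiplier            # computed and cached under the key (cf,)
cf.current_contract = 201712     # the cache is not told about this change
stale = cf.multiplier            # same key (cf,), so the cached value comes back

# The property object lives on the class; its fget is the lru_cache wrapper.
info = MyClass.multiplier.fget.cache_info()
print(first, stale, info.hits, info.misses)
```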

Is there a way to clear the cache when self.current_contract is updated?

Thanks!

Raymond Hettinger
agiap

2 Answers


Yes, quite simply: make current_contract a read/write property and clear the cache in the property's setter:

from functools import lru_cache

class MyClass(object):
    def __init__(self):
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}
        self.current_contract = 201706

    @property
    def current_contract(self):
        return self._current_contract

    @current_contract.setter
    def current_contract(self, value):
        self._current_contract = value
        type(self).multiplier.fget.cache_clear()

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

NB: I assume your real use case involves costly computations rather than a mere dict lookup; otherwise lru_cache might be a bit overkill ;)
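
To check the behaviour, the class above can be exercised as in the following sketch. The variable names (`cf`, `other`, `first`, `second`, `size_before`, `size_after`) are illustrative and not part of the original answer. It also shows one consequence worth knowing: the cache belongs to the function stored on the class, so cache_clear() drops the entries of every instance, not only the one whose contract changed.

```python
from functools import lru_cache

class MyClass(object):
    def __init__(self):
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}
        self.current_contract = 201706   # goes through the setter below

    @property
    def current_contract(self):
        return self._current_contract

    @current_contract.setter
    def current_contract(self, value):
        self._current_contract = value
        # type(self).multiplier is the property object itself;
        # its fget is the lru_cache wrapper, which exposes cache_clear().
        type(self).multiplier.fget.cache_clear()

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

cf = MyClass()
other = MyClass()
first = cf.multiplier                 # cached for cf
other.multiplier                      # cached for other
size_before = MyClass.multiplier.fget.cache_info().currsize

cf.current_contract = 201712          # setter clears the shared cache
size_after = MyClass.multiplier.fget.cache_info().currsize
second = cf.multiplier                # recomputed with the new contract
print(first, second, size_before, size_after)
```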

bruno desthuilliers
  • Actually it's really a mere dict lookup, but it's called hundreds of thousands of times in my program and using lru_cache made a big difference. I'll test it again with the new code. Many thanks for your help! It solved the problem. – agiap Jul 24 '17 at 14:38
  • If you need that level of optimization, you may want to use `self._current_contract` instead of `self.current_contract` in `multiplier` (to avoid the property call / method call / attribute resolution overhead), and possibly just make `multiplier` a plain attribute that gets set in the `current_contract` setter. Note that I haven't done any benchmarking, so you may want to `timeit` first to find out which solution is actually the fastest. – bruno desthuilliers Jul 24 '17 at 14:49
  • That's what I did, will test it soon. Thanks for the advice. – agiap Jul 25 '17 at 14:59
  • @brunodesthuilliers can you please give me some color on why you have to use `type(self).multiplier.fget.cache_clear()` instead of `self.multiplier.fget.cache_clear()`? thx – Steven G Jul 25 '17 at 15:39
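  • Because otherwise you'd trigger the property mechanism: `self.multiplier` evaluates the property and returns its value (here an int), which has no `fget` attribute, whereas `type(self).multiplier` returns the property object itself. You can read the official doc about descriptors (the general mechanism that supports computed attributes) to get the details. – bruno desthuilliers Jul 25 '17 at 20:54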
  • Hi everyone! Is there a way to clear the `lru_cache` for only one instance of the class? I mean, suppose changes to the attributes of one object require a cache flush, but another object doesn't require one. Is there any way to do that? – garciparedes Oct 19 '19 at 09:59
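
The plain-attribute variant suggested in the comments above could look like the following sketch. It is an illustration under assumptions, not benchmarked code: it drops lru_cache entirely and recomputes `multiplier` once per change of `current_contract`, inside the setter. Because the value is stored on the instance, updating one object does not affect any other.

```python
class MyClass(object):
    def __init__(self):
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}
        self.current_contract = 201706   # goes through the setter below

    @property
    def current_contract(self):
        return self._current_contract

    @current_contract.setter
    def current_contract(self, value):
        self._current_contract = value
        # Recompute once per change; reads of self.multiplier are then
        # ordinary instance attribute lookups with no function call.
        self.multiplier = self.futures[value]['multiplier']

cf = MyClass()
other = MyClass()
before = cf.multiplier
cf.current_contract = 201712
after = cf.multiplier
untouched = other.multiplier   # other keeps its own value
print(before, after, untouched)
```

Whether this is actually faster than the cached property is something to measure with `timeit` on the real workload.
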

Short Answer

Don't clear the cache when self.current_contract is updated. That is working against the cache and throws away information.

Instead, just add methods for __eq__ and __hash__. That will teach the cache (or any other mapping) which attributes are important for influencing the result.

Worked out example

Here we add __eq__ and __hash__ to your code. That tells the cache (or any other mapping) that current_contract is the relevant independent variable:

from functools import lru_cache

class MyClass(object):
    def __init__(self):
        self.current_contract = 201706
        self.futures = {201706: {'multiplier': 1000},
                        201712: {'multiplier': 25}}

    def __hash__(self):
        return hash(self.current_contract)

    def __eq__(self, other):
        return self.current_contract == other.current_contract

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

An immediate advantage is that as you switch between contract numbers, previous results are kept in the cache. Try switching between 201706 and 201712 a hundred times and you will get 98 cache hits and 2 cache misses:

cf = MyClass()
for i in range(50):
    cf.current_contract = 201712
    assert cf.multiplier == 25
    cf.current_contract = 201706
    assert cf.multiplier == 1000
print(vars(MyClass)['multiplier'].fget.cache_info())

This prints:

CacheInfo(hits=98, misses=2, maxsize=128, currsize=2)
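
One thing to keep in mind with this approach: equality and hashing now depend only on current_contract, so two instances with the same contract number are treated as the same cache key. The sketch below is an illustration with a hypothetical class `Contract` whose `futures` table is passed in per instance, which differs from the code above where every instance has the same table. When the tables differ, the second instance gets the first one's cached result.

```python
from functools import lru_cache

class Contract(object):
    # Hypothetical variant: each instance receives its own futures table.
    def __init__(self, futures):
        self.current_contract = 201706
        self.futures = futures

    def __hash__(self):
        return hash(self.current_contract)

    def __eq__(self, other):
        return self.current_contract == other.current_contract

    @property
    @lru_cache()
    def multiplier(self):
        return self.futures[self.current_contract]['multiplier']

a = Contract({201706: {'multiplier': 1000}})
b = Contract({201706: {'multiplier': 50}})

from_a = a.multiplier   # computed from a.futures and cached
from_b = b.multiplier   # b == a and hash(b) == hash(a), so this is a cache hit
print(from_a, from_b, Contract.multiplier.fget.cache_info().hits)
```

If the result depends on more than current_contract, those attributes would need to take part in __eq__ and __hash__ as well.
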
Raymond Hettinger