
I use Python's lru_cache on a function which returns a mutable object, like so:

import functools

@functools.lru_cache()
def f():
    x = [0, 1, 2]  # Stand-in for some long computation
    return x

If I call this function, mutate the result and call it again, I do not obtain a "fresh", unmutated object:

a = f()
a.append(3)
b = f()
print(a)  # [0, 1, 2, 3]
print(b)  # [0, 1, 2, 3]

I get why this happens, but it's not what I want. A fix would be to leave the caller in charge of using list.copy:

a = f().copy()
a.append(3)
b = f().copy()
print(a)  # [0, 1, 2, 3]
print(b)  # [0, 1, 2]

However I would like to fix this inside f. A pretty solution would be something like

@functools.lru_cache(copy=True)
def f():
    ...

though no copy argument is actually taken by functools.lru_cache.

Any suggestion as to how to best implement this behavior?

Edit

Based on the answer from holdenweb, this is my final implementation. It behaves exactly like the builtin functools.lru_cache by default, and extends it with the copying behavior when copy=True is supplied.

import functools
from copy import deepcopy

def lru_cache(maxsize=128, typed=False, copy=False):
    if not copy:
        return functools.lru_cache(maxsize, typed)
    def decorator(f):
        cached_func = functools.lru_cache(maxsize, typed)(f)
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            return deepcopy(cached_func(*args, **kwargs))
        return wrapper
    return decorator

# Tests below

@lru_cache()
def f():
    x = [0, 1, 2]  # Stand-in for some long computation
    return x

a = f()
a.append(3)
b = f()
print(a)  # [0, 1, 2, 3]
print(b)  # [0, 1, 2, 3]

@lru_cache(copy=True)
def f():
    x = [0, 1, 2]  # Stand-in for some long computation
    return x

a = f()
a.append(3)
b = f()
print(a)  # [0, 1, 2, 3]
print(b)  # [0, 1, 2]
jmd_dk
  • I don't think this is an appropriate use of `functools.lru_cache` because your function doesn't have any arguments (which are used to look-up previous results in the cache). To get what you have to work, make your function return a copy (or `deepcopy`) of `x`—which will likely defeat the purpose of using the decorator in this scenario. – martineau Feb 27 '19 at 16:18
  • @martineau The function in my actual use case will have arguments. This does not matter for the question. – jmd_dk Feb 27 '19 at 16:19
  • Well, it does in so far as it would have made providing a working answer a little easier, but I hope I've finally arrived at something you can use - if you must despite the documented warnings. – holdenweb Feb 28 '19 at 13:13
  • I use `lru_cache` as a workaround for `functools.cache` (memoization; needs Python >= 3.9). It works well, except that Python has no constness (not like C++; it's all value semantics, not object semantics), so a cached object can be modified by the user. That's troublesome, so creating a deep copy of the cached object is a really good idea here. – dvorak4tzx Mar 16 '23 at 12:36

1 Answer


Since the lru_cache decorator's behaviour is unsuitable for you, the best you can do is build your own decorator that returns a copy of what it gets from lru_cache. This means that the first call with a particular set of arguments creates two copies of the object: the one stored in the cache, which now serves only as a prototype, and the copy handed to the caller.

This question is made more difficult because lru_cache can take arguments (maxsize and typed), so a call to lru_cache returns a decorator. Remember that a decorator takes a function as its argument and (usually) returns a function. You will therefore have to replace lru_cache with a function that takes two arguments and returns a function, which in turn takes a function as an argument and returns a (wrapped) function. This is not an easy structure to wrap your head around.
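
To make the layering concrete, here is a small sketch of what the `@` syntax does with the standard decorator, written out by hand. The names `slow_square`, `decorator` and `cached_square` are made up for illustration.

```python
import functools

def slow_square(n):
    return n * n  # Stand-in for some long computation

# Layer 1: calling lru_cache with its options returns a decorator.
decorator = functools.lru_cache(maxsize=32, typed=False)

# Layer 2: calling that decorator with a function returns the wrapped function.
cached_square = decorator(slow_square)

# Together this is equivalent to putting
# @functools.lru_cache(maxsize=32, typed=False) above the def.
print(cached_square(4))  # 16
```

A copying variant has to reproduce both layers and add a third, innermost function that performs the copy.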

You would then write your functions using the copying_lru_cache decorator instead of the standard one, which is now applied "manually" inside the updated decorator.

Depending on how heavy the mutations are, you might get away without using deepcopy, but you don't give enough information to determine that.
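
As a rough illustration of the difference (a sketch with a made-up nested list standing in for a cached value), a shallow copy protects only the outer container, while deepcopy also protects the objects inside it:

```python
import copy

cached = [[0, 1], [2, 3]]  # Pretend this nested list lives in the cache

shallow = copy.copy(cached)
shallow.append([4, 5])     # Only changes the outer copy
shallow[0].append(99)      # Reaches an inner list that is shared with the cache

deep = copy.deepcopy(cached)
deep[1].append(-1)         # Inner lists were copied too, so the cache is unaffected

print(cached)  # [[0, 1, 99], [2, 3]]
```

If callers only ever add or remove top-level items, a shallow copy may be enough; if they mutate nested objects, it is not.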

So your code would then read

from functools import lru_cache
from copy import deepcopy

def copying_lru_cache(maxsize=10, typed=False):
    def decorator(f):
        cached_func = lru_cache(maxsize=maxsize, typed=typed)(f)
        def wrapper(*args, **kwargs):
            return deepcopy(cached_func(*args, **kwargs))
        return wrapper
    return decorator

@copying_lru_cache()
def f(arg):
    print(f"Called with {arg}")
    x = [0, 1, arg]  # Stand-in for some long computation
    return x

print(f(1), f(2), f(3), f(1))

This prints

Called with 1
Called with 2
Called with 3
[0, 1, 1] [0, 1, 2] [0, 1, 3] [0, 1, 1]

so the caching behaviour you require appears to be present. Note also that the documentation for lru_cache specifically warns that

In general, the LRU cache should only be used when you want to reuse previously computed values. Accordingly, it doesn’t make sense to cache functions with side-effects, functions that need to create distinct mutable objects on each call, or impure functions such as time() or random().
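
If you go ahead regardless, one possible refinement is sketched below. It is a variant under stated assumptions, not the only way to do this: it adds functools.wraps so the wrapper keeps the original function's name and docstring, and it re-exports the cache_info and cache_clear helpers that lru_cache attaches to the cached function. The default maxsize=128 is chosen here to match functools.lru_cache, and the function name `g` is purely illustrative.

```python
import functools
from copy import deepcopy

def copying_lru_cache(maxsize=128, typed=False):
    def decorator(f):
        cached_func = functools.lru_cache(maxsize=maxsize, typed=typed)(f)

        @functools.wraps(f)  # Keep f's name and docstring on the wrapper
        def wrapper(*args, **kwargs):
            return deepcopy(cached_func(*args, **kwargs))

        # Re-export the cache controls that lru_cache attaches to cached_func
        wrapper.cache_info = cached_func.cache_info
        wrapper.cache_clear = cached_func.cache_clear
        return wrapper
    return decorator

@copying_lru_cache(maxsize=32)
def g(arg):
    return [0, 1, arg]  # Stand-in for some long computation

first = g(1)
first.append("mutated")  # Changes only the caller's copy
second = g(1)            # Cache hit, copied again from the stored prototype
print(second)            # [0, 1, 1]
print(g.cache_info())
```

Note that the decorator must still be applied with parentheses, as `@copying_lru_cache()`, because the outer function expects options rather than the function to wrap.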

holdenweb
  • That looks better. However it seems like doing a copy on every call to the wrapped function is going to cancel-out to some degree the likely reason for using a LRU cache in the first place. – martineau Feb 27 '19 at 16:12
  • @martineau Well, each computation might take a long time, even though the returned object is small. – jmd_dk Feb 27 '19 at 16:13
  • @holdenweb Can you provide a small but complete example using your `copying_lru_cache` decorator? – jmd_dk Feb 27 '19 at 16:16
  • @jmd_dk: It's a decorator—use it like any other. – martineau Feb 27 '19 at 16:25
  • @martineau If I replace `@functools.lru_cache()` in my question with either of `@copying_lru_cache` or `@copying_lru_cache()` (after doing `from functools import lru_cache`) I get two separate errors... – jmd_dk Feb 27 '19 at 16:33
  • @jmd_dk: Putting `@copying_lru_cache` right before the function's `def f():` should work (unless there's something I'm missing). I'll let you and the author work things out. – martineau Feb 27 '19 at 16:37
  • @jmd_dk: Just occurred to me that it may not work on your example code because `f()` doesn't have any arguments... – martineau Feb 27 '19 at 16:46
  • This whole answer was pretty ill-thought out, and if I can't correct it (it's nearly bedtime) I should delete it. – holdenweb Feb 27 '19 at 22:54