
I'm using Python 3's built-in functools.lru_cache decorator to memoize some expensive functions. I would like to memoize as many calls as possible without using too much memory, since caching too many values causes thrashing.
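For context, the usage pattern looks roughly like this (expensive_func and the maxsize value are placeholders):

import functools

# A fixed maxsize is the crux of the problem: the right value depends on
# how much memory is actually free at runtime.
@functools.lru_cache(maxsize=100000)
def expensive_func(arg):
    ...  # some expensive computation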

Is there a preferred technique or library for accomplishing this in Python?

For example, this question led me to a Go library for system-memory-aware LRU caching. Something similar for Python would be ideal.


Note: I can't just estimate the memory used per value and set maxsize accordingly, since several processes will be calling the decorated function in parallel; a solution would need to dynamically check how much memory is actually free.

  • If you can't find something out there that does this already, you can try leveraging psutil (http://code.google.com/p/psutil/) to roll your own. – dano May 05 '14 at 16:49
  • Yes, that's what I'm looking into right now—in fact, do you happen to know how to find the source for Python 3's `lru_cache` implementation? The easiest way would be to simply check the memory usage within the decorator. It would add some overhead for sure, but in this application I don't think it would be significant. – Will May 05 '14 at 16:52
  • @Will for the source see [functools.py:370](http://hg.python.org/cpython/file/5d0783376c88/Lib/functools.py#l370) and a couple lines above for the cache key functions. – Lukas Graf May 05 '14 at 17:02
  • Or locally: launch an interactive Python interpreter, `import functools`, and just enter the module name `functools`. Works for locating the source of nearly any Python module (except C extensions, of course; see the snippet below these comments). – Lukas Graf May 05 '14 at 17:05
  • @Will BTW, `functools.lru_cache` was written by Raymond Hettinger. He posted several different (LRU) caching / memoization [recipes](http://code.activestate.com/recipes/users/178123/tags/cache/), maybe you can find something useful or at least inspirational in those ;-) – Lukas Graf May 05 '14 at 17:17
  • Thanks guys, seems to be working well. Posted the code in my answer. Let me know if it can be improved at all! – Will May 05 '14 at 21:07
  • I asked in https://github.com/tkem/cachetools/issues/152. We will see :) – Mišo Dec 06 '19 at 12:22
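Regarding the source-locating tip in the comments above, a quick interpreter session shows how it works (the exact path varies by installation):

>>> import functools
>>> functools  # the repr includes the file the module was loaded from
<module 'functools' from '/usr/lib/python3.4/functools.py'>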

1 Answer


I ended up modifying the built-in lru_cache to use psutil.

The modified decorator takes an additional optional argument use_memory_up_to. If set, the cache will be considered full if there are fewer than use_memory_up_to bytes of memory available (according to psutil.virtual_memory().available). For example:

from .lru_cache import lru_cache

GB = 1024**3

@lru_cache(use_memory_up_to=(1 * GB))
def expensive_func(args):
    ...

Note: setting use_memory_up_to will cause maxsize to have no effect.

Here's the code: lru_cache.py
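In case that link goes stale, here is a minimal standalone sketch of the same idea. It is not the exact code behind the link: the eviction and key-building details below are a simplified reconstruction built around psutil.virtual_memory().available.

from collections import OrderedDict
from functools import wraps

import psutil


def lru_cache(maxsize=128, use_memory_up_to=None):
    def decorator(func):
        cache = OrderedDict()  # keys kept in least- to most-recently-used order

        def full():
            if use_memory_up_to is not None:
                # Memory-aware mode: maxsize has no effect, as noted above.
                return psutil.virtual_memory().available < use_memory_up_to
            return maxsize is not None and len(cache) >= maxsize

        @wraps(func)
        def wrapper(*args, **kwargs):
            # Assumes hashable positional and keyword arguments.
            key = (args, tuple(sorted(kwargs.items())))
            if key in cache:
                cache.move_to_end(key)  # mark as most recently used
                return cache[key]
            result = func(*args, **kwargs)
            if full() and cache:
                cache.popitem(last=False)  # evict the least recently used entry
            cache[key] = result
            return result

        return wrapper

    return decorator

As mentioned in the comments, checking available memory on every cache miss adds some overhead, but for sufficiently expensive functions it should be negligible.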
