I'm implementing a small "persistent" version of functools.lru_cache, where "persistent" means that the cache survives between runs of the program. To this end, I thought I could simply replace the dictionary that functools.lru_cache uses with a shelve object, but I'm running into shelve's requirement that keys be strings. I tried to work around it with str(hash(...).to_int(...)), but hash(...) does not return the same value for equal objects across different runs of the CPython interpreter.

Is there a class like shelve that accepts any hashable key, rather than just strings, while still being transparently accessible like a dictionary?

Some details: my cache might be on the order of 100 MB. It is read fairly frequently and written infrequently.

gerrit
  • one question pops in my head, why not stringify your key `str()`? – zeapo Apr 30 '14 at 23:17
  • @zeapo As in `str(bytes(hash(...).to_int(...)))`? I thought of that after posting this question. It hurts me on the inside, though ;-) – gerrit Apr 30 '14 at 23:18
  • @zeapo Just realised that `hash(...)` is not consistent between different cpython runs. – gerrit May 01 '14 at 17:00
  • @zeapo But, I could just run `str(...)` on my key directly as long as its string-representation is reasonably short. That probably has some caveats but it will probably do for my use case. – gerrit May 01 '14 at 17:09

2 Answers

The concrete solution really depends on your needs, but it is likely you'll find that using a database is the best solution for your case. You can also use pickle instead of shelve, but that has its downsides of course.

  • Is the cache big? Then you need a solution that avoids reading and writing the entire cache on each access (which is what a pickle-based solution does), for example a database.

  • Is it accessed frequently? Then you probably also want an in-memory cache in front of it, to avoid frequent slow external accesses.
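To make the two points above concrete, here is a minimal sketch of a dict-like cache backed by SQLite with a plain in-memory dict in front of it. The class name `SqliteCache`, the table layout, and the choice of `repr(key)` as the stored key are all assumptions made for illustration, not an existing library. It also assumes keys are built from literals (ints, strings, tuples of those), so that `repr` is the same in every run and `ast.literal_eval` can turn it back into the key.

```python
import ast
import pickle
import sqlite3
from collections.abc import MutableMapping


class SqliteCache(MutableMapping):
    """Hypothetical dict-like cache stored in a SQLite file."""

    def __init__(self, path):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS cache (k TEXT PRIMARY KEY, v BLOB)"
        )
        # In-memory layer so repeated reads do not hit the disk.
        # It is unbounded here; a real cache would probably cap its size.
        self._hot = {}

    @staticmethod
    def _encode(key):
        # Assumption: repr(key) is deterministic across runs, which holds
        # for ints, strings and tuples of them, but not for sets.
        return repr(key)

    def __getitem__(self, key):
        if key in self._hot:
            return self._hot[key]
        row = self._db.execute(
            "SELECT v FROM cache WHERE k = ?", (self._encode(key),)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        value = pickle.loads(row[0])
        self._hot[key] = value
        return value

    def __setitem__(self, key, value):
        with self._db:  # commits the write on success
            self._db.execute(
                "INSERT OR REPLACE INTO cache (k, v) VALUES (?, ?)",
                (self._encode(key), pickle.dumps(value)),
            )
        self._hot[key] = value

    def __delitem__(self, key):
        with self._db:
            cur = self._db.execute(
                "DELETE FROM cache WHERE k = ?", (self._encode(key),)
            )
        self._hot.pop(key, None)
        if cur.rowcount == 0:
            raise KeyError(key)

    def __iter__(self):
        # Only valid under the "keys are literals" assumption above.
        for (k,) in self._db.execute("SELECT k FROM cache"):
            yield ast.literal_eval(k)

    def __len__(self):
        return self._db.execute("SELECT COUNT(*) FROM cache").fetchone()[0]

    def close(self):
        self._db.close()
```

Only the rows that are actually touched are read or written, so a cache of around 100 MB never has to be loaded as a whole, and the read-heavy access pattern is served mostly from the in-memory dict.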

Quick googling found this: persistentdict.

Also see this question.

shx2

Could subclassing Shelf work the way you want?

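from shelve import Shelf

class SubShelf(Shelf):
    # Shelf wraps an existing dict-like object (for example one returned
    # by dbm.open), so it has to be passed through to the base class.
    def __init__(self, db):
        super().__init__(db)

    def __setitem__(self, key, val):
        # Shelf keys must be strings, so store under the stringified hash.
        h = str(hash(key))
        super().__setitem__(h, val)

    def __getitem__(self, key):
        h = str(hash(key))
        return super().__getitem__(h)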
slushy
  • That doesn't work, because `hash(key)` is not identical for identical keys between subsequent runs of the Python interpreter. – gerrit May 01 '14 at 17:02
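
The run-to-run difference comes from hash randomization: since Python 3.3 the built-in `hash()` of `str` and `bytes` is salted per process unless `PYTHONHASHSEED` is fixed. One way the subclassing idea could be kept is to derive the string key from a digest instead. The following is only a sketch; `stable_key` and `StableKeyShelf` are hypothetical names, and it assumes that `repr(key)` is deterministic for the keys in use (true for ints, strings and tuples of them, not for sets or objects with the default `repr`).

```python
import hashlib
import shelve


def stable_key(key):
    """Return a string that is identical in every interpreter run.

    Assumption: repr(key) is deterministic for the keys being cached.
    """
    return hashlib.sha256(repr(key).encode("utf-8")).hexdigest()


class StableKeyShelf(shelve.DbfilenameShelf):
    """Hypothetical shelf that maps arbitrary keys to digest strings."""

    def __setitem__(self, key, value):
        super().__setitem__(stable_key(key), value)

    def __getitem__(self, key):
        return super().__getitem__(stable_key(key))

    def __contains__(self, key):
        return super().__contains__(stable_key(key))

    def __delitem__(self, key):
        super().__delitem__(stable_key(key))
```

Because only the digest is stored, iterating over such a shelf yields the hex strings rather than the original keys. For a function cache that only ever looks items up by key, that limitation may be acceptable.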