
Is there an existing (Python) implementation of a hash-like data structure that exists partially on disk? Or, can persist specific keys to some secondary storage, based on some criteria (like last-accessed-time)?

Example: "the data at key K has not been accessed in M milliseconds; serialize it to persistent storage (disk?) and delete it from memory".

I was referred to this, but I'm not sure I can digest it.

EDIT:

I've received two excellent answers (sqlite; gdbm); to determine a winner, I'll have to wait until I've tested both. Thank you!!

mikewaters

2 Answers


Go for SQLite. A big problem that you will face down the road is concurrency, file corruption, etc., and SQLite makes these very easy to avoid because it provides transactional integrity. Just define a single table with the schema (key string primary key, value string). SQLite is insanely fast, especially if you wrap batches of writes in a transaction.

GDBM, IMHO, also has licence problems depending on what you want to do (it is GPL-licensed), whereas SQLite is public domain.
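To make the single-table idea concrete, here is a minimal sketch of a dict-like wrapper that keeps recently used values in memory and moves idle ones into SQLite in one transaction. The class name `SpillingDict`, the table name `kv`, the `evict_idle` method, and the use of `pickle` for serialization are all assumptions made for illustration; none of them comes from the question or from any library.

```python
import pickle
import sqlite3
import time


class SpillingDict:
    """Hypothetical helper: hot values live in memory, idle ones in SQLite."""

    def __init__(self, path, max_idle_seconds):
        self.max_idle = max_idle_seconds
        self.hot = {}  # key -> (value, last access time)
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value BLOB)"
        )
        self.db.commit()

    def __setitem__(self, key, value):
        self.hot[key] = (value, time.monotonic())

    def __getitem__(self, key):
        if key in self.hot:
            value = self.hot[key][0]
        else:
            # Not in memory: fall back to the on-disk table.
            row = self.db.execute(
                "SELECT value FROM kv WHERE key = ?", (key,)
            ).fetchone()
            if row is None:
                raise KeyError(key)
            value = pickle.loads(row[0])
        # Every read refreshes the last access time.
        self.hot[key] = (value, time.monotonic())
        return value

    def evict_idle(self):
        """Persist all idle entries in one transaction, then drop them from memory."""
        cutoff = time.monotonic() - self.max_idle
        idle = [k for k, (_, seen) in self.hot.items() if seen <= cutoff]
        with self.db:  # commits on success, rolls back on an exception
            self.db.executemany(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                [(k, pickle.dumps(self.hot[k][0])) for k in idle],
            )
        for k in idle:
            del self.hot[k]
        return len(idle)
```

Something would still have to call `evict_idle()` periodically (a timer thread, or a check on every access); that scheduling is left out here. Note that `pickle` should only be used on data you trust.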


Sounds like you are looking for gdbm:

The gdbm module provides an interface to the GNU DBM library. gdbm objects behave like mappings (dictionaries), except that keys and values are always strings.

That's basically a dictionary on disk. Since keys and values are always strings, you might have to do a bit of serialization depending on your use.
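A rough sketch of the "bit of serialization" might look like the following. It assumes Python 3, where the old `gdbm` module is available as `dbm.gnu`; the generic `dbm.open` is used here so the example also runs where GNU DBM is not installed (it picks whichever backend is available). The helper names `store` and `load`, the key `user:42`, and the choice of `pickle` are illustrative assumptions.

```python
import dbm
import os
import pickle
import tempfile


def store(db, key, obj):
    """Serialize obj and write it under key (dbm stores bytes only)."""
    db[key.encode("utf-8")] = pickle.dumps(obj)


def load(db, key):
    """Read the bytes stored under key and deserialize them."""
    return pickle.loads(db[key.encode("utf-8")])


# A throwaway location for the demo database file(s).
path = os.path.join(tempfile.mkdtemp(), "cache")

# "c" opens the database for reading and writing, creating it if needed.
with dbm.open(path, "c") as db:
    store(db, "user:42", {"name": "Ada", "visits": 3})

# Reopen read-only to show that the value survived on disk.
with dbm.open(path, "r") as db:
    restored = load(db, "user:42")
```

If you would rather not handle the pickling yourself, the standard library's `shelve` module wraps a dbm database and does it for you.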

mu is too short