8

For my Python application I am thinking of using shelve, part of the standard library. There will be hundreds of processes, each writing to the same shelf. Every write adds a new key-value pair. The keys are unique, so no two processes will ever update the same entry.

What could go wrong in such a scenario?

Anas Elghafari
  • If you're worried about it, use a Queue to which all threads write, and then have one thread read from the queue and write to the shelf. And you might want to clarify whether you are using threads or subprocesses. – mhawke Aug 16 '14 at 09:03
  • @mhawke: I am actually using processes (parallel jobs) not threads. But good catch, the original version of the question used both words. I edited to fix that. Thanks. – Anas Elghafari Aug 16 '14 at 09:21

2 Answers

6

The shelve documentation is explicit about this.

The shelve module does not support concurrent read/write access to shelved objects. (Multiple simultaneous read accesses are safe.) When a program has a shelf open for writing, no other program should have it open for reading or writing. Unix file locking can be used to solve this, but this differs across Unix versions and requires knowledge about the database implementation used.

So, without process synchronisation, I wouldn't do it.

How are the processes started? If they are created by a master process, look at the multiprocessing module: have the child processes put their results on a Queue, and have the master remove items from the queue and write them to the shelf. An example of this sort of thing is at https://stackoverflow.com/a/24501437/21945.
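A minimal sketch of the Queue approach described above. The names worker, run_jobs and the job-N keys are made up for illustration, and the "fork" start method is an assumption that restricts this to Unix-like systems. Only the master process ever opens the shelf.

```python
import multiprocessing as mp
import os
import shelve
import tempfile

_SENTINEL = None  # a worker puts this on the queue when it has nothing more to send


def worker(worker_id, queue):
    """Do some work and hand the result to the master; never touch the shelf."""
    key = f"job-{worker_id}"        # hypothetical key scheme, unique per worker
    value = worker_id * worker_id   # stand-in for the real computation
    queue.put((key, value))
    queue.put(_SENTINEL)


def run_jobs(shelf_path, n_workers=4):
    """Start the workers and write everything they produce to the shelf."""
    ctx = mp.get_context("fork")    # assumption: Unix-like system
    queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(i, queue)) for i in range(n_workers)]
    for p in procs:
        p.start()

    finished = 0
    with shelve.open(shelf_path) as shelf:   # the single writer
        while finished < n_workers:
            item = queue.get()
            if item is _SENTINEL:
                finished += 1
            else:
                key, value = item
                shelf[key] = value

    for p in procs:
        p.join()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "results")
    run_jobs(path)
    with shelve.open(path) as shelf:
        print(sorted(shelf.items()))
```

The queue is drained before the workers are joined, so a worker can never block on a full queue while the master is waiting for it to exit.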

If you have no process hierarchy, then you'll need to use locking to control read and write access to the shelf file. If you are using Linux or similar, you might use a posix_ipc named semaphore.
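As a rough illustration of the locking idea, here is a sketch that swaps the posix_ipc semaphore for fcntl.flock from the standard library, taking an exclusive lock on a separate lock file. The helper names locked_shelf and add_entry and the ".lock" suffix are invented for this example. The lock is advisory and Unix-only, so it protects the shelf only if every process, readers included, opens it through the same helper.

```python
import fcntl
import os
import shelve
import tempfile
from contextlib import contextmanager


@contextmanager
def locked_shelf(shelf_path):
    """Open the shelf while holding an exclusive lock on a sidecar lock file."""
    with open(shelf_path + ".lock", "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)    # blocks until no other process holds the lock
        try:
            with shelve.open(shelf_path) as shelf:
                yield shelf                      # shelf is closed before the lock is released
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def add_entry(shelf_path, key, value):
    """Add one key-value pair; the shelf is open only while the lock is held."""
    with locked_shelf(shelf_path) as shelf:
        shelf[key] = value


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "shelf")
    add_entry(path, "a", 1)
    add_entry(path, "b", 2)
    with locked_shelf(path) as shelf:
        print(sorted(shelf.items()))
```

Opening and closing the shelf on every write is slow with hundreds of processes, but it keeps the window in which any process has the file open as short as possible.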

The other obvious option is to use a database server such as PostgreSQL.

mhawke
  • "Multiple simultaneous read accesses are safe". Could you please add information about how safe it is to have multiple read accesses during a write (of a different key)? – lucidbrot Aug 05 '19 at 12:21
0

In your case you'd probably have better luck using a more robust key-value store, such as Redis. It's pretty easy to set up a local Redis service or a remote one (such as on AWS's ElastiCache service).

Sean Azlin