
After some time with the MongoDB documentation and the PyMongo API, I am still no clearer on which route to take (I am more confused now than when I started). My problem concerns locks: not so much that I have tested and found major concurrency problems, but that I don't want to run into them after the fact.

I have a tkinter script with several functions. All of them need access to the same collection, and most of them access the same single document within that collection.

from pymongo import MongoClient

client = MongoClient()

def func_1():
    glob_client = client['ALPHA']['A-Z']
    # do work:
    # Also call subprocesses that use the same database document (glob_client) in another script.
    # There can be 3-10 instances of this subprocess running, listening to various HTTP streams in a while loop,
    # collecting data that can come in hundreds of times per second.

def func_2():
    glob_client = client['ALPHA']['A-Z']
...
def func_32():
    glob_client = client['ALPHA']['A-Z']

And the called subprocess (in a separate script, with multiple instances possible):

from pymongo import MongoClient

client = MongoClient()
glob_client = client['ALPHA']['A-Z']

while True:
    # do work with glob_client: updates, push, pull, reads
    pass

So, would it be enough in this case to just use client.close() in every function?

def func_1():
    glob_client = client['ALPHA']['A-Z']
    # do work
    client.close()

Similarly in the while loops:

while True:
    # do work with glob_client: updates, push, pull, reads
    client.close()

Would that suffice, or should I be looking to shard in this case? Or should I just go back to SQL!

MongoDB 3.0.6 32-bit, PyMongo 3.03, Python 2.7.

ajsp
  • Your own quote, *"not so much that I have tested and found there to be major concurrency problems"*, should lead you to the conclusion in this famous quote: *"Premature optimization is the root of all evil ( or at least most of it ) in computer programming"*. That is exactly what you are doing here. By all means "test, test and then test again"; where you find problems, deal with them in the appropriate manner, or ask peers like us for solutions to those problems. Under no circumstances, however, should you close a database connection until you are completely done with it. – Blakes Seven Sep 19 '15 at 10:22
  • And that means when your application has finished running. Opening a database connection is an expensive operation. Keep it open until the app is completely done. – Blakes Seven Sep 19 '15 at 10:23
  • @Blakes Seven Duly noted, good quote, very apt in this case too, the time I have wasted trying to plan this out now runs into several days. Just trying to be careful, when I really should be more of a buccaneer. Cheers. – ajsp Sep 19 '15 at 11:54
  • Calling client.close() inside a while True loop would probably be a bad idea. It should only be called at the end of all execution, including that of a daemon. Making a network connection is most likely the slowest and most expensive part of your program, so you want to hold connections for as long as possible. – Sammaye Sep 19 '15 at 14:52
  • @Sammaye Can I use the same connection in several different scripts? Say I had `client = MongoClient()` stored in `main_conn.py`, and then imported it in a different script, `test_con.py`, with `from main_conn import client`, for example. Would this be the same connection, regardless of however many scripts connected to it (with a view to limiting concurrency)? – ajsp Sep 19 '15 at 15:33
  • Hmm, I am not sure whether MongoClient is a singleton. Initial research tells me no, so I would say no to your question. – Sammaye Sep 19 '15 at 18:57
  • Any news on this? How did you finally solve it? I'm in the same situation. – Iván Rodríguez Torres Sep 20 '16 at 17:06
  • @Ivan Rodriguez See answer below, as far as I can remember it did the trick for what I needed at the time. – ajsp Sep 20 '16 at 23:53
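
The advice in the comments above (connect once, reuse the client, close only when the program ends) can be sketched roughly as follows. This is only an illustration: `StubClient`, `poll_closing_each_time`, and `poll_with_one_client` are hypothetical names, and `StubClient` is a stand-in that merely counts how often it is constructed, so the sketch runs without pymongo or a MongoDB server.

```python
import atexit


class StubClient(object):
    """Hypothetical stand-in for pymongo.MongoClient; it only counts connects."""
    connects = 0

    def __init__(self):
        StubClient.connects += 1  # a real client would open sockets around here
        self.closed = False

    def close(self):
        self.closed = True


def poll_closing_each_time(iterations):
    # Pattern from the question: a fresh connection and a close() on every pass.
    for _ in range(iterations):
        client = StubClient()
        # ... do work with the client ...
        client.close()


def poll_with_one_client(iterations):
    # Pattern from the comments: connect once and reuse the same client.
    client = StubClient()
    atexit.register(client.close)  # closed once, when the interpreter exits
    for _ in range(iterations):
        pass  # ... do work with the same client ...
    return client


StubClient.connects = 0
poll_closing_each_time(100)
per_pass = StubClient.connects

StubClient.connects = 0
client = poll_with_one_client(100)
reused = StubClient.connects

print("connects when closing each pass: %d, when reusing: %d" % (per_pass, reused))
```

With a real client the difference would show up as connection setup time rather than a counter, which is the cost the commenters are warning about.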

1 Answer


To avoid a lock in this case, I put the client in a separate module, foo.py:

import pymongo

CLIENT = pymongo.MongoClient(maxPoolSize=None, w=1)
COLLEC = CLIENT['ABC']['XYZ']

And then imported the collection anywhere I needed it throughout various scripts:

from foo import COLLEC 
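
As a rough check of what this import pattern shares, the sketch below writes a hypothetical `foo_stub.py` (a stand-in for foo.py whose `StubClient` only records the process ID, so no pymongo or server is needed) and imports it twice in one process and once from a subprocess. It relies only on standard Python behaviour: a module body runs once per interpreter, so importers in the same process get the same object, while a separately launched script builds its own.

```python
import importlib
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for foo.py; a real one would build a MongoClient.
MODULE_SOURCE = (
    "import os\n"
    "class StubClient(object):\n"
    "    def __init__(self):\n"
    "        self.pid = os.getpid()\n"
    "CLIENT = StubClient()\n"
)

workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "foo_stub.py"), "w") as handle:
    handle.write(MODULE_SOURCE)
sys.path.insert(0, workdir)

# Two imports in the same process: the module body runs once, so both
# names refer to the very same object.
first = importlib.import_module("foo_stub").CLIENT
second = importlib.import_module("foo_stub").CLIENT
same_in_process = first is second

# A subprocess starts a fresh interpreter, imports the module again and
# therefore builds its own client object in its own process.
child_output = subprocess.check_output(
    [sys.executable, "-c", "import foo_stub; print(foo_stub.CLIENT.pid)"],
    cwd=workdir,
    env=dict(os.environ, PYTHONPATH=workdir),
)
child_pid = int(child_output.decode().strip())
same_across_processes = child_pid == first.pid

print("shared within process: %s" % same_in_process)
print("shared across processes: %s" % same_across_processes)
```

So the functions in the tkinter script would all share one client, but each subprocess described in the question would still hold a client of its own.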
ajsp