
I've got a working program that calls an API for each item of a list (say, a book) to get back metadata about the book. It stores the book : metadata pairs in a dict for later use. This makes the user wait during the metadata gathering, so to avoid excess calls I am persisting the dict to CSV and loading it before making the aforementioned API calls, to ensure I only request responses when necessary.

However, when I introduce context managers to read the persisted dict and move the call-if-not-there logic into a function ("gatherfiles()"), the dict is no longer accessible to a third function.

I can see the dict is returned by gatherfiles() when I call it in a main function, but when I make the third function call (to "pickabook()") I get a KeyError and I see an empty dictionary.

I've put a redacted version of the code below. My guess is that somehow the context manager has changed the scoping (so it's treating one shimdict as global and one as local), but that doesn't seem right given what I can read online. So any thoughts here that aren't ugly?


shimdict = {}

def pickabook(book=None):

        print(shimdict, "<-this is {}. why?!?")
        picked = shimdict.pop(book)

def gatherfiles(directory):

    with open('test.csv', 'rb') as f:
      reader = csv.reader(f,)
      shimdict = dict((rows[0],rows[1]) for rows in reader)

    with open('test.csv', 'a+b') as f:  
      w = csv.writer(f)

      ff = os.listdir(directory)
      for f in ff:
          if f.rsplit('.', 1)[1].lower() in [....]:
                filename = os.path.join(directory, f)

                if filename in shimdict.keys():
                    print("already here")

                else:

                    print("make the api call, then write the value to dict & then csv")

                    shimdict[filename] = (returnedvalue)

                    w.writerow([filename, (returnedvalue)])

    return shimdict

def main():

    shimdict = gatherfiles(directory)
    print(shimdict, "<-dictionary works")


    while 1:
        print(shimdict, "<-dictionary works")
        current = pickabook(bookname)

---- edit below ---- I don't think I posed my question explicitly enough. I am able to access the dict "shimdict" in "pickabook()" if the context managers are removed i.e. I use this code:

def gatherfiles(directory):

    ff = os.listdir(directory)
    for f in ff:
        if f.rsplit('.', 1)[1].lower() in [....]:
            filename = os.path.join(directory, f)

            shimdict[filename] = (returnedvalue)

    return shimdict

So I completely understand that I can use global or pass the local dict to the function to fix this, but I want to know why adding the context manager changes the behavior.

Blckknght
SQLesion
    The difference is that you put it into a function, not that you used a context manager. In the function, use `global shimdict` to get write access to `shimdict`. – Waleed Khan May 14 '14 at 03:05

1 Answer


As Waleed Khan commented, the issue is that the shimdict variables in the main and gatherfiles functions are not the same as the shimdict global variable. The latter is initialized as an empty dictionary at the top of the module, and it stays empty. It is what pickabook tries to pop from. The others are local variables in their functions (which happen to refer to the same object in this case, though that's not necessarily the case for local variables with the same names in other functions). Python will always use local variables by default when you assign to a new name unless you use a global or nonlocal statement to tell it to do otherwise.

In this specific situation, you could make your functions work correctly in one of two ways. You could put global shimdict at the top of either main or gatherfiles (if the latter, you could skip returning the dictionary, since it's going to be accessed by its global name later anyway).
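As a rough sketch of the global-statement option, here is a stripped-down version. The function load_books() is a hypothetical stand-in for gatherfiles(), and the file name and metadata are made up for illustration:

```python
shimdict = {}  # module-level (global) dictionary


def load_books():
    # Hypothetical stand-in for gatherfiles(); the data is made up.
    global shimdict  # the assignment below now rebinds the module-level name
    shimdict = {"book_a.txt": "metadata A"}


def pick_a_book(book):
    # This function only reads the name, so it needs no global statement.
    return shimdict.pop(book)


load_books()
picked = pick_a_book("book_a.txt")
```

Without the global line, the assignment in load_books() would create a local variable and pick_a_book() would raise a KeyError on the still-empty module-level dictionary.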

A better solution, however, would probably be to get rid of the global variable completely and have main pass its local shimdict to pickabook. Just change the pickabook function declaration to:

def pickabook(shimdict, book=None):

and the calls to it from main to:

current = pickabook(shimdict, bookname)
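Put together, the parameter-passing version might look something like the sketch below. Here gather_files() is a hypothetical stand-in that returns made-up data instead of reading a CSV or calling an API, and main() returns its results only so they can be inspected:

```python
def gather_files():
    # Hypothetical stand-in for gatherfiles(); returns made-up data.
    return {"book_a.txt": "metadata A", "book_b.txt": "metadata B"}


def pick_a_book(shimdict, book=None):
    # Works on whichever dictionary the caller passes in.
    return shimdict.pop(book)


def main():
    shimdict = gather_files()  # local to main; no global needed
    current = pick_a_book(shimdict, "book_a.txt")
    return current, shimdict


current, remaining = main()
```

Because the dictionary is passed explicitly, there is no module-level shimdict left to get out of sync with the local one.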

Edit to answer the question edit:

The context manager remains inconsequential. The real difference between the working code you show in your edit and the non-working code in the original part of the question is that in the latter you're doing an assignment to the name shimdict:

shimdict = dict((rows[0],rows[1]) for rows in reader)

while in the former you only assign to individual items of the dictionary:

shimdict[filename] = (returnedvalue)

Assigning to the name creates a new local dictionary inside the function (unless you use a global statement). Assigning to an item never does; it looks the name up and always finds the global shimdict.
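The difference between the two kinds of assignment can be seen in isolation in a small sketch. The function names rebind() and mutate() and the keys are invented for illustration:

```python
shimdict = {}  # module-level dictionary


def rebind():
    # Assigning to the bare name makes shimdict local to this function.
    shimdict = {"a": 1}
    return shimdict


def mutate():
    # Item assignment looks the name up and finds the global dictionary.
    shimdict["b"] = 2


local_result = rebind()
after_rebind = dict(shimdict)  # snapshot: the global is still empty
mutate()
after_mutate = dict(shimdict)  # snapshot: the global now holds "b"
```

Neither function uses a with statement, which is consistent with the context manager playing no part in the behavior.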

So, I suppose an alternative solution would be to rewrite the code inside your with block to assign each item into the dictionary individually, rather than creating the whole thing with a generator expression:

with open('test.csv', 'rb') as f:
    reader = csv.reader(f,)
    for row in reader:
        shimdict[row[0]] = row[1]

I would still suggest avoiding global variables, as that often leads to complicated code with bugs that are hard to fix.

Blckknght
  • Hi - thanks for your answer. I didn't ask my question explicitly enough - can you comment on the edit? – SQLesion May 14 '14 at 15:52