
I'm trying to write a simple joblib helper function that evaluates an expression and pickles the result, loading the pickle instead if the file already exists. But when I put this function in another file and import it after adding that file's path to sys.path, I get errors.

from pathlib import Path
import joblib as jl    
def saveobj(filename, expression_obj,ignore_file = False):
    fname = Path(filename)
    if fname.exists() and not ignore_file:
        obj = jl.load(filename)
    else:
        obj = eval(expression_obj)
        jl.dump(obj,fname,compress = True)        
    return obj

Sample call:

rf_clf = saveobj(file, "rnd_cv.fit(X_train, np.ravel(y_train))", ignore_file=True)

Error:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-11-02c2cae43c5d> in <module>
      1 file = Path("rf.pickle")
----> 2 rf_clf = saveobj(file, "rnd_cv.fit(X_train, np.ravel(y_train))", ignore_file=True)

~/Dropbox/myfnlib/util_funs.py in saveobj(filename, expression_obj, ignore_file)
     37         obj = jl.load(filename)
     38     else:
---> 39         obj = eval(expression_obj)
     40         jl.dump(obj,fname,compress = True)
     41     return obj

~/Dropbox/myfnlib/util_funs.py in <module>

NameError: name 'rnd_cv' is not defined

I guess Python evaluates the expression in the scope of the function's own module, and since the objects don't exist in that scope, it throws this error. Is there a better way of doing this? I need to do this repeatedly, hence the function. Thanks a lot for your help.
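The guess about scope can be checked with a small sketch. It builds a stand-in for a separate file (the names `helper_mod`, `evaluate`, and `rnd_value` are hypothetical, not from the code above) and suggests that `eval` inside a function looks names up in the globals of the module that defines the function, not in the caller's:

```python
import types

# Stand-in for a separate file such as util_funs.py (hypothetical module name).
helper_mod = types.ModuleType("helper_mod")
exec("def evaluate(expr):\n    return eval(expr)\n", helper_mod.__dict__)

# Defined only in the caller's namespace, never inside helper_mod.
rnd_value = 42

try:
    helper_mod.evaluate("rnd_value")
    result = "found"
except NameError as exc:
    # eval() searched helper_mod's globals, where rnd_value does not exist.
    result = f"NameError: {exc}"

print(result)
```

Passing `globals()` from the caller into `eval` as its namespace argument is one way around this.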

agarg

2 Answers


You can check the documentation of eval:

Help on built-in function eval in module builtins:

eval(source, globals=None, locals=None, /)

Evaluate the given source in the context of globals and locals.

The source may be a string representing a Python expression
or a code object as returned by compile().
The globals must be a dictionary and locals can be any mapping,
defaulting to the current globals and locals.
If only globals is given, locals defaults to it.

eval accepts optional arguments for the global and local namespaces. So, in your case, you can pass the caller's namespaces in explicitly:

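from pathlib import Path
import joblib as jl

def saveobj(filename, expression_obj, global_vars, local_vars, ignore_file=False):
    # Named global_vars/local_vars because `global` is a reserved word
    fname = Path(filename)
    if fname.exists() and not ignore_file:
        # Reuse the cached result
        obj = jl.load(fname)
    else:
        # Evaluate in the caller's namespaces, then cache the result
        obj = eval(expression_obj, global_vars, local_vars)
        jl.dump(obj, fname, compress=True)
    return obj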

The call then becomes:

rf_clf = saveobj(file, "rnd_cv.fit(X_train, np.ravel(y_train))", globals(), locals(), ignore_file=True)
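As a self-contained sketch of the same idea, the version below swaps joblib for the standard-library `pickle` module so it runs without extra dependencies; that swap, the `demo` function, and the `base` variable are assumptions made for illustration. It shows an expression being evaluated against the caller's namespaces and the cached value being reused on the second call:

```python
import pickle
import tempfile
from pathlib import Path


def saveobj(filename, expression, global_vars, local_vars=None, ignore_file=False):
    """Load a cached result, or evaluate `expression` in the given namespaces and cache it."""
    fname = Path(filename)
    if fname.exists() and not ignore_file:
        with fname.open("rb") as fh:
            return pickle.load(fh)
    obj = eval(expression, global_vars, local_vars)
    with fname.open("wb") as fh:
        pickle.dump(obj, fh)
    return obj


def demo():
    base = 10  # visible to eval() only because locals() is passed in
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "result.pickle"
        first = saveobj(cache, "base * 2", globals(), locals())
        base = 99  # ignored on the next call: the cached file already exists
        second = saveobj(cache, "base * 2", globals(), locals())
    return first, second


first, second = demo()
print(first, second)
```

Keep in mind that `eval` runs whatever string it is given, so this pattern is only reasonable for expressions you write yourself.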
youkaichao

I was about to post an answer to my own question when I saw @youkaichao's answer. Thanks a lot. Here is one more way to skin the cat (although it is limited to keyword arguments):

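from pathlib import Path
import joblib as jl

def saveobj(filename, func, ignore_file=False, **kwargs):
    fname = Path(filename)
    if fname.exists() and not ignore_file:
        # Reuse the cached result
        obj = jl.load(fname)
    else:
        # Call the function with the forwarded keyword arguments, then cache the result
        obj = func(**kwargs)
        jl.dump(obj, fname, compress=True)
    return obj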

Changed call:

file = Path("rf.pickle")
rf_clf = saveobj(file, rnd_cv.fit, ignore_file=False, X=X_train, y=np.ravel(y_train))

I would still love to know which one is better, though.
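The keyword-only limitation of the callable approach could probably be lifted by forwarding positional arguments too, or by wrapping the whole call in a zero-argument lambda. The sketch below is a guess at how that might look; it uses the standard-library `pickle` in place of joblib so it is self-contained, and `cached_call`, `expensive`, and `calls` are hypothetical names:

```python
import pickle
import tempfile
from pathlib import Path


def cached_call(filename, func, *args, ignore_file=False, **kwargs):
    """Return the cached result if present, else call func(*args, **kwargs) and cache it."""
    fname = Path(filename)
    if fname.exists() and not ignore_file:
        with fname.open("rb") as fh:
            return pickle.load(fh)
    obj = func(*args, **kwargs)
    with fname.open("wb") as fh:
        pickle.dump(obj, fh)
    return obj


calls = []  # records every real invocation of expensive()


def expensive(x, y=1):
    calls.append((x, y))
    return x + y


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "sum.pickle"
    a = cached_call(cache, expensive, 2, y=3)   # computed and cached
    b = cached_call(cache, expensive, 2, y=3)   # loaded from the cache
    # A lambda captures its arguments from the calling scope, so no eval is needed.
    c = cached_call(cache, lambda: expensive(5, 5), ignore_file=True)

print(a, b, c, calls)
```

Because the callable is resolved in the caller's scope before it is passed in, this variant does not depend on which module the helper lives in.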

agarg
  • I think mine is better. With yours, the additional parameters you pass depend on ``func``. With mine, the call to ``saveobj`` doesn't change even if ``func`` changes. – youkaichao Jan 03 '20 at 02:49
  • @youkaichao Thanks for your input, though I was looking for a more nuanced answer. More specifically, how globals and locals are propagated, and what memory and computational overhead they add compared to a simple **kwargs call. BTW, I have accepted your answer as well – agarg Jan 03 '20 at 03:04
  • Nope, my answer is unaccepted :( You can only accept one answer. – youkaichao Jan 03 '20 at 03:06
  • All global and local variables are maintained by the Python interpreter whether or not you call ``globals()``, so ``globals()`` and ``locals()`` add negligible cost (just passing a reference). – youkaichao Jan 03 '20 at 03:09
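The point about passing a reference can be checked with a short sketch (the function name `receives` is hypothetical). It only covers `globals()`: in CPython, `globals()` returns the module's own namespace dictionary rather than a copy, so handing it to another function copies nothing:

```python
def receives(namespace):
    # The argument is the very dictionary the caller passed, not a copy.
    return namespace


module_globals = globals()
passed_through = receives(module_globals)

# Both names refer to the same dictionary object.
is_same_object = passed_through is globals()

# Changes made through the reference are visible as ordinary global names.
module_globals["added_via_reference"] = 123
value_seen = added_via_reference

print(is_same_object, value_seen)
```

`locals()` inside a function behaves differently, since it has to build a mapping of the current local variables, but for a handful of names that cost is small as well.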