I trained a model on a cluster, downloaded it (in pkl format), and tried to load it locally. I know that sklearn's bundled version of joblib was used to save the model as mymodel.pkl (but I don't know exactly which version).

from sklearn.externals import joblib

print(joblib.__version__)

model = joblib.load("mymodel.pkl")

Locally, I use version 0.13.0 of sklearn's joblib.

This is the error that I got:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-100-d0a3c42e5c53> in <module>
      3 print(joblib.__version__)
      4 
----> 5 model = joblib.load("mymodel.pkl")

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\externals\joblib\numpy_pickle.py in load(filename, mmap_mode)
    596                     return load_compatibility(fobj)
    597 
--> 598                 obj = _unpickle(fobj, filename, mmap_mode)
    599 
    600     return obj

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\externals\joblib\numpy_pickle.py in _unpickle(fobj, filename, mmap_mode)
    524     obj = None
    525     try:
--> 526         obj = unpickler.load()
    527         if unpickler.compat_mode:
    528             warnings.warn("The file '%s' has been generated with a "

~\AppData\Local\Continuum\anaconda3\lib\pickle.py in load(self)
   1083                     raise EOFError
   1084                 assert isinstance(key, bytes_types)
-> 1085                 dispatch[key[0]](self)
   1086         except _Stop as stopinst:
   1087             return stopinst.value

KeyError: 239

Update:

I also tried the following, but got AttributeError: 'str' object has no attribute 'readable':

with io.BufferedReader("mymodel.pkl") as pickle_file:
    model = pickle.load(pickle_file)
Fluxy
  • can you try unpickling it using pickle.load() – teoML Jan 04 '20 at 20:41
  • @t_e_o: thanks. in this case I get an error: `TypeError: file must have 'read' and 'readline' attributes` – Fluxy Jan 04 '20 at 20:44
  • can you try this : https://stackoverflow.com/questions/18261898/cannot-load-pickled-object – teoML Jan 04 '20 at 20:50
  • @t_e_o: Could you please clarify what is `'out/cache/' +hashed_url` in the accepted answer? My model is stored locally. I don't have any URL. – Fluxy Jan 04 '20 at 20:58
  • @t_e_o: Also, I tried `with open(filename, "rb") as f: model = pickle.loads(f)`. It gave me the error `TypeError: a bytes-like object is required, not '_io.BufferedReader'`. – Fluxy Jan 04 '20 at 21:00
  • @t_e_o: if you want, I can share the model... – Fluxy Jan 04 '20 at 21:02
  • @t_e_o It just so happens that I am having the same issue: `KeyError: 239` when using `joblib.load`, but when using `pickle.load` it gives `_pickle.UnpicklingError: invalid load key, '\xef'.` – Xavier Geerinck Jan 04 '20 at 21:03
  • @XavierGeerinck: The same for me. I also tried io.BufferedReader. Nothing helps... – Fluxy Jan 04 '20 at 21:05
  • did you try using pickle.load() without the with statement ? – teoML Jan 04 '20 at 21:08
  • @t_e_o: yes, I get `TypeError: file must have 'read' and 'readline' attributes` – Fluxy Jan 04 '20 at 21:09
  • with open('file.pkl','rb') as f: data = pickle.load(f) can you try it out? – teoML Jan 04 '20 at 21:14
  • Notice that you have load() and loads() – teoML Jan 04 '20 at 21:16
  • @t_e_o: I tried both `load` and `loads`. Also I tried `with open('file.pkl','rb') as f: data = pickle.load(f)`, and got: `EOFError: Ran out of input` – Fluxy Jan 04 '20 at 21:17
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/205361/discussion-between-t-e-o-and-fluxy). – teoML Jan 04 '20 at 21:21

1 Answer

You tried to dump it with joblib.dump('pipeline','mymodel.pkl'). Because 'pipeline' is in quotes, this only dumped the string 'pipeline', not your actual pipeline object.

Dump it correctly with:

joblib.dump(pipeline,'mymodel.pkl')

...then read back with:

model = joblib.load('mymodel.pkl')
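
To make the difference concrete, here is a minimal sketch of the quoted-name mistake and the fix. It uses the standard-library pickle as a stand-in so it runs without scikit-learn installed; joblib.dump also takes the object to save as its first argument, so the same mistake applies there. The dict named pipeline and the helper dump_and_reload are hypothetical placeholders, not part of the original code.

```python
import os
import pickle
import tempfile


def dump_and_reload(value, path):
    """Serialize `value` to `path`, then read it back (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(value, f)
    with open(path, "rb") as f:
        return pickle.load(f)


# Assumption: a plain dict stands in for the trained pipeline object.
pipeline = {"steps": ["scaler", "classifier"]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mymodel.pkl")

    # Mistake: the quoted name is a str, so only that text is saved.
    wrong = dump_and_reload("pipeline", path)

    # Fix: pass the variable itself, without quotes.
    right = dump_and_reload(pipeline, path)

print(type(wrong).__name__)  # str
print(type(right).__name__)  # dict
```

Checking the type of the loaded object right after saving, before the file leaves the cluster, would catch this kind of mistake early.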
smci
Satish