
I know there are already a lot of questions about model persistence, and I have tried both pickle and joblib.dump, but when I use either of them to save my random forest I get this:

ValueError: ("Buffer dtype mismatch, expected 'SIZE_t' but got 'long'", <type 'sklearn.tree._tree.ClassificationCriterion'>, (1, array([10])))

Can anyone tell me why?

Here is some code for review:

from sklearn.ensemble import RandomForestClassifier

# Train the forest
forest = RandomForestClassifier()
forest.fit(data[:n_samples], target[:n_samples])

# Save and reload with cPickle
import cPickle
with open('rf.pkl', 'wb') as f:
    cPickle.dump(forest, f)
with open('rf.pkl', 'rb') as f:
    forest = cPickle.load(f)

or

from sklearn.externals import joblib

# Save and reload with joblib
joblib.dump(forest, 'rf.pkl')
forest = joblib.load('rf.pkl')
mrbean

2 Answers


This is caused by saving and loading with different 32-bit/64-bit versions of Python, as the question Scikits-Learn RandomForrest trained on 64bit python wont open on 32bit python suggests.
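
To confirm whether this is what is happening, you can check the interpreter's word size on both the machine that saved the model and the one that loads it. A minimal sketch:

import platform
import struct

# 64-bit Python prints 64, 32-bit Python prints 32.
# Both machines must agree for the pickled trees to deserialize cleanly.
print(struct.calcsize("P") * 8)
print(platform.architecture()[0])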

xgdgsc

Try importing the joblib package directly, instead of going through sklearn.externals (which was deprecated and later removed in newer scikit-learn versions):

import joblib

# ...

# save
joblib.dump(rf, "some_path")

# load 
rf2 = joblib.load("some_path")

I've put the full working example with the code and comments here.
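
In case that link is unavailable, here is a minimal self-contained sketch of the same round trip; the iris dataset, file name, and hyperparameters are just placeholders for illustration:

import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Train a small forest on a toy dataset (illustration only).
X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=10, random_state=0)
rf.fit(X, y)

# Save to disk and load it back with joblib.
joblib.dump(rf, "rf.joblib")
rf2 = joblib.load("rf.joblib")

# The reloaded model should give identical predictions.
assert (rf.predict(X) == rf2.predict(X)).all()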

pplonski