5

I am using CatBoostRegressor from the Python version of the CatBoost library.

According to the documentation, it's possible to use the overfitting detector, which I am doing like this:

model = CatBoostRegressor(iterations=iters, learning_rate=0.03, depth=depth, verbose=True, od_pval=1, od_type='IncToDec', od_wait=20)
model.fit(train_pool, eval_set=validation_pool)

# this code was never executed
model.save_model(model_name)

However, after overfitting occurs, my Python script gets interrupted (prematurely stopped, pick any phrase you want) and the save-model part never executes, which leads to a lot of wasted time and no results in the end. I didn't get any stack trace.

Is there any possibility to handle it in CatBoost and save hours of fitting work?

Mysterion
  • Could you give more info about why and how your Python script got killed? – mcsim Dec 03 '17 at 12:49
  • 1
    I expect that this is what the overfitting detector has been doing. I'm not sure I fully grasp what you expect me to answer – Mysterion Dec 03 '17 at 12:51
  • A stack trace, for example. What does "killed" mean? – mcsim Dec 03 '17 at 12:52
  • What do you mean by `I've got my Python script killed`? I would expect an error to have been raised. – 00__00__00 Dec 03 '17 at 12:54
  • 1
    Okay, maybe I should rephrase it. I didn't get any stack trace; my script just got interrupted, ended prematurely, pick any word you want. If I had an error, of course I would paste it here. I'm pretty sure it's something with the library (CatBoost) that I'm using – Mysterion Dec 03 '17 at 12:55
  • What data size are we talking about, and is it a data-specific problem? If you try another dataset, does it work? – 00__00__00 Dec 03 '17 at 13:05
  • @ErroriSalvo the data isn't that big; it's rather the computations that are expensive – Mysterion Dec 03 '17 at 13:30
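
Since the comments ask for a stack trace and none was printed, one possible way to get more information from a silent exit is Python's standard `faulthandler` module. The sketch below is only an assumption about how the script could be instrumented; `train` is a hypothetical placeholder for the real `model.fit(...)` call, and nothing here is specific to CatBoost.

```python
import faulthandler

# Ask the interpreter to dump the Python traceback to stderr if it
# receives a fatal signal such as SIGSEGV or SIGABRT (for example,
# a crash inside native library code).
faulthandler.enable()


def train():
    # Hypothetical placeholder for the long-running fit call.
    return "done"


result = train()
print("faulthandler enabled:", faulthandler.is_enabled())
```

The same effect is available without editing the script by running it as `python -X faulthandler script.py`. This only helps with crashes that deliver a fatal signal the handler can catch; a process killed with SIGKILL (for instance by an out-of-memory killer) still exits without a trace.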

3 Answers

7

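Use this code. The finally block attempts to save the model whether fit completes or raises an exception.

try:
    model.fit(X, y)
finally:
    # save_model needs a file name; model_name is the path from the question
    model.save_model(model_name)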
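
To illustrate what the try/finally pattern does and does not cover, here is a minimal self-contained sketch. `StandInModel`, its `fail_at` argument, and `fit_and_always_save` are hypothetical names invented for this illustration; they are not part of CatBoost, and a real estimator may behave differently when saved before training has finished.

```python
import os
import pickle
import tempfile


class StandInModel:
    """Hypothetical stand-in for a real estimator; not a CatBoost class."""

    def __init__(self):
        self.trees = []

    def fit(self, n_iterations, fail_at=None):
        for i in range(n_iterations):
            if i == fail_at:
                raise RuntimeError("simulated failure during training")
            self.trees.append(i)
        return self

    def save_model(self, fname):
        with open(fname, "wb") as f:
            pickle.dump(self.trees, f)


def fit_and_always_save(model, fname, **fit_kwargs):
    """Run fit and attempt the save even if fit raises."""
    try:
        model.fit(**fit_kwargs)
    finally:
        # Runs on normal completion, on Python exceptions and on Ctrl+C,
        # but not if the operating system kills the process outright.
        model.save_model(fname)


model = StandInModel()
path = os.path.join(tempfile.mkdtemp(), "model.bin")
try:
    fit_and_always_save(model, path, n_iterations=10, fail_at=4)
except RuntimeError as exc:
    print("training failed:", exc)

print("saved file exists:", os.path.exists(path))
```

The state reached before the failure is written to disk, and the original exception still propagates to the caller. If the script dies without any traceback, as described in the question, the process may have been terminated below the Python level, in which case the finally block never gets a chance to run.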
Myles Hollowed
0

Well, I don't know how CatBoost works, but I would like to share a different way to save your trained model; maybe it could help.

import pickle
from catboost import CatBoostRegressor

model = CatBoostRegressor(iterations=iters, learning_rate=0.03, depth=depth, verbose=True, od_pval=1, od_type='IncToDec', od_wait=20)
model.fit(train_pool, eval_set=validation_pool)

# ---- To store the model ----
filename = 'final_model'  # name of the file to store the model in
with open(filename, 'wb') as f:
    pickle.dump(model, f)  # pickling

# ---- To load the model ----
with open(filename, 'rb') as f:
    loaded_model = pickle.load(f)
outlier
  • 1
    How will pickle help me? – Mysterion Dec 06 '17 at 08:56
  • How many iterations are you using? If the number of iterations is more than you specified in the training parameters, then it could be because of overfitting... Pickle is just another way to store a model. A long time ago, when I was doing multi-class classification, I used save_model and it didn't save my classifier, so I used pickle and it worked – outlier Dec 08 '17 at 06:09
-3

You can do it with pickle: just train your model and dump it using pickle.

 pickle.dump(regr, open("models/svrrbf.sav",'wb'))

Afterwards you can use that model to test your inputs. Hope it helps.

m00am
Rohit P