
With PyCaret, we can call the compare_models() function to get the model that best fits our data. It looks something like this:

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# compare models
best = compare_models()

PyCaret also comes with a tune_model() function, which tunes the hyperparameters of a given model. It looks like this:

# load dataset
from pycaret.datasets import get_data 
boston = get_data('boston') 

# init setup
from pycaret.regression import * 
reg1 = setup(data = boston, target = 'medv')

# train model
dt = create_model('dt')

# tune model
tuned_dt = tune_model(dt)

What I want to know is, should we call the tune_model() function on the best model we get from compare_models()? Or are the hyperparameters of this model already tuned?

In essence, I want to know whether I should do the following to get the best possible model:

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# compare models
best = compare_models()

# tune model
tuned_best = tune_model(best)

I could not find this explicitly mentioned in the documentation.

Minura Punchihewa

1 Answer


compare_models does not perform hyperparameter tuning. It simply calls create_model for each model type, so every model is trained with its default hyperparameters. Calling tune_model on all models would be very costly.

If you want to do both tuning and model selection across multiple models, one approach is to return the top N models from compare_models and then tune them individually, for example:

# Select top N models (defaults only, without tuning)
N = 5
best_N = compare_models(n_select = N)

# Then tune these N best models
tuned_models = [tune_model(model) for model in best_N]
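
To see why it can be worth tuning more than just the single best model, here is a toy sketch of the same select-then-tune idea in plain Python. It does not call PyCaret at all: the model families, their score functions, the default setting, and the grid are all made up for illustration, and compare_defaults, tune, and select_then_tune are hypothetical helper names.

```python
# Toy illustration only: no PyCaret calls; every name and number is hypothetical.
from typing import Callable, Dict, List, Tuple

# Each made-up "model family" maps one hyperparameter value to a validation score.
FAMILIES: Dict[str, Callable[[int], float]] = {
    "a": lambda h: 0.70 + 0.02 * h,
    "b": lambda h: 0.80 - 0.01 * abs(h - 3),
    "c": lambda h: 0.60 + 0.01 * h,
    "d": lambda h: 0.75 + 0.015 * h,
}
DEFAULT_H = 1            # assumed default hyperparameter for every family
GRID = [1, 2, 3, 4, 5]   # assumed search grid used when tuning


def compare_defaults(n_select: int) -> List[str]:
    """Score every family once at its default setting and return the top n names."""
    scores = {name: score(DEFAULT_H) for name, score in FAMILIES.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_select]


def tune(name: str) -> Tuple[int, float]:
    """Exhaustive grid search for one family; returns (best setting, best score)."""
    score = FAMILIES[name]
    best_h = max(GRID, key=score)
    return best_h, score(best_h)


def select_then_tune(n_select: int) -> Tuple[str, Dict[str, Tuple[int, float]]]:
    """Shortlist by default score, tune only the shortlist, and pick the winner."""
    shortlisted = compare_defaults(n_select)
    tuned = {name: tune(name) for name in shortlisted}
    winner = max(tuned, key=lambda name: tuned[name][1])
    return winner, tuned


if __name__ == "__main__":
    print(compare_defaults(2))     # shortlist ranked by default score
    print(select_then_tune(1)[0])  # winner when only the top model is tuned
    print(select_then_tune(2)[0])  # winner when the top two are tuned
```

In this toy setup the family that ranks first with default settings is not the one that wins after tuning, which is the reason for shortlisting several models. Tuning the top 2 costs 4 default evaluations plus 2 × 5 grid evaluations, compared with 4 × 5 for tuning every family.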
Nikhil Gupta