PyCaret predict_model(dt, data=data_unseen) shows only one row

Question

I use the Python pycaret module to analyze a large data set. setup, compare_models, and create_model all ran correctly, but when I use the model I created to predict on the unseen data I split off at the beginning, only one row comes back; there should be about 100k rows to predict. I skipped the tuning step because it takes too long, but I don't think that is the reason.

from pycaret.regression import setup, create_model, predict_model

# TSLA is a pandas DataFrame loaded earlier; hold out 20% of the rows as unseen data
TSLASAMPLE = TSLA.sample(frac=0.8)
data_unseen = TSLA.drop(TSLASAMPLE.index)
TSLASAMPLE.reset_index(drop=True, inplace=True)
data_unseen.reset_index(drop=True, inplace=True)
TSLAinput = setup(data=TSLASAMPLE, target='prtPrice', use_gpu=True, html=False, silent=True)
dt = create_model('dt')
prediction = predict_model(dt, data=data_unseen)

Output:

                     Model     MAE     MSE    RMSE      R2   RMSLE    MAPE
0  Decision Tree Regressor  0.1842  1.8393  1.3562  0.9996  0.0303  0.0082

score 1 · Answer 1 · answered May 13 '22 at 11:25

This is expected. The results (1 row) that you see are the metrics on the unseen data. The actual predictions are in your prediction variable.
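To see this for yourself, here is a minimal sketch that checks the shape of the returned DataFrame and lists the column(s) PyCaret appended. The helper names added_columns and inspect_predictions are made up for illustration and are not part of PyCaret. The sketch assumes a regression experiment has already been set up as in the question. It compares column names instead of hard-coding the prediction column, since that name differs between PyCaret versions.

```python
def added_columns(input_columns, output_columns):
    """Return the columns present in the output but not in the input, in order."""
    seen = set(input_columns)
    return [col for col in output_columns if col not in seen]


def inspect_predictions(model, data_unseen):
    """Run predict_model and show what the returned DataFrame holds.

    Hypothetical helper: assumes setup(...) from pycaret.regression
    has already been called, as in the question.
    """
    # Imported here so added_columns can be used without PyCaret installed.
    from pycaret.regression import predict_model

    # This call prints the one-row metrics table as a side effect.
    prediction = predict_model(model, data=data_unseen)

    # The return value should have one row per row of data_unseen.
    print(prediction.shape)

    # Columns that were not in the input are the ones PyCaret appended.
    new_cols = added_columns(data_unseen.columns, prediction.columns)
    print(new_cols)

    return prediction[new_cols]
```

Calling inspect_predictions(dt, data_unseen) should print a shape whose first entry matches len(data_unseen), which confirms that every unseen row received a prediction.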

Nikhil Gupta

score 0 · Answer 2 · answered May 15 '23 at 03:28

Note that create_model returns a single trained model, not a list. compare_models, on the other hand, returns only the best model by default, but it returns a list of the top models, ranked by the comparison metric, when you pass n_select greater than 1.

If you want predictions on unseen data from each of those models, loop through the list returned by compare_models.

You can try something like this:

model_list = compare_models(n_select=3)  # top 3 models, returned as a list

predictions = []
for model in model_list:
    model_prediction = predict_model(model, data=data_unseen)
    predictions.append(model_prediction)

The predictions list stores the resulting DataFrame for each model, in the same order as the models returned by compare_models().
