I have a Random Forest model saved in a .pkl file. I have loaded the model, and now I need to feed it test data and measure the accuracy of its predictions. How do I pass input data to a model loaded from a .pkl file?

import pickle

def read_from_pickle(RF):
    with open(RF, 'rb') as file:
        try:
            while True:
                yield pickle.load(file)
        except EOFError:
            pass

This is the code I used to load the model. What is the next step: how do I pass input to it?

  • Does this answer your question? [Save python random forest model to file](https://stackoverflow.com/questions/20662023/save-python-random-forest-model-to-file) – Stereo Jul 19 '22 at 11:57
  • Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. – Community Jul 19 '22 at 20:28

1 Answer

This solution uses a random forest regressor; my model was for dynamic price prediction.

import pandas as pd
import numpy as np
from sklearn import pipeline, preprocessing, metrics, model_selection, ensemble, linear_model
from sklearn_pandas import DataFrameMapper
from sklearn.metrics import mean_squared_error

# First we import these libraries, then load the dataset and do all the cleaning. After that:

data.to_csv("Pune_hpp.csv",index=False)

mapper = DataFrameMapper([
    (['area_type','size','new_total_sqft','bath','balcony'], preprocessing.StandardScaler()),
    # (['area_type','size'], preprocessing.OneHotEncoder())
], df_out=True)

# Here we create two pipelines, because we compared two algorithms (linear regression and random forest) using MSE and RMSE

pipeline_obj_LR=pipeline.Pipeline([ ('mapper',mapper), ("model",linear_model.LinearRegression()) ])

pipeline_obj=pipeline.Pipeline([ ('mapper',mapper), ("model",ensemble.RandomForestRegressor()) ])

X=['area_type','size','new_total_sqft','bath','balcony']  # input columns

Y=['price']  # output column

# Here the comparison starts

pipeline_obj_LR.fit(data[X],data[Y])  # linear regression (not logistic regression)

pipeline_obj.fit(data[X],data[Y])  # random forest

pipeline_obj.predict(data[X])  # a sample prediction with the random forest pipeline

# Below is how we compare the two algorithms and decide which one fits best

predict=pipeline_obj_LR.predict(data[X])

# MSE and RMSE for linear regression (computed here on the same data the model was fit on)

print('MSE using linear_regression: ', mean_squared_error(data[Y], predict))
print('RMSE using linear_regression: ', mean_squared_error(data[Y], predict)**(0.5))

predict=pipeline_obj.predict(data[X])

# MSE and RMSE for the random forest regressor (computed here on the same data the model was fit on)

print('MSE using randomforestregression: ', mean_squared_error(data[Y], predict))
print('RMSE using randomforestregression: ', mean_squared_error(data[Y], predict)**(0.5))

The last two prints are for the random forest regressor, which is the model I went with. I used joblib because my dataset was huge, it is easy to use, and it takes very few lines of code. As you can see, pipeline_obj_LR is not used from here on. This is how we saved the model to the .pkl file and loaded it back:

import joblib

joblib.dump(pipeline_obj,'dynamic_price_pred.pkl')

modelReload=joblib.load('dynamic_price_pred.pkl')
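
Once the model is reloaded, it is used like any fitted estimator: pass the test inputs to predict and compare the result with the known targets. The following is only a minimal sketch under stated assumptions, not the pipeline above: it uses synthetic numeric data in place of the housing dataset, a plain StandardScaler step instead of the DataFrameMapper, and a hypothetical file name demo_model.pkl written to a temporary directory.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption: 5 numeric input columns).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 5))
weights = np.array([3.0, 1.5, 2.0, 0.5, 0.2])
target = features @ weights + rng.normal(scale=0.1, size=200)

# Hold out rows that the model never sees during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.25, random_state=0
)

model = Pipeline([
    ("scaler", StandardScaler()),
    ("model", RandomForestRegressor(n_estimators=50, random_state=0)),
])
model.fit(X_train, y_train)

# Save the fitted pipeline, then load it back as a separate object.
model_path = os.path.join(tempfile.mkdtemp(), "demo_model.pkl")  # hypothetical path
joblib.dump(model, model_path)
reloaded = joblib.load(model_path)

# "Inputting" test data means calling predict on the reloaded object.
test_predictions = reloaded.predict(X_test)
rmse = mean_squared_error(y_test, test_predictions) ** 0.5
print("RMSE on held-out data:", rmse)
```

With the real pipeline, the same call would be modelReload.predict(test_data[X]), where test_data is a DataFrame that has the same input columns as the training data (test_data is a hypothetical name). For a classifier rather than a regressor, sklearn.metrics.accuracy_score(y_test, predictions) gives the accuracy the question asks about.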