I am trying to manually re-implement a nested regression model that TPOT gave me as its output. The TPOT output is:

RandomForestRegressor(XGBRegressor(XGBRegressor(**args1), **args2), **args3)

My code so far is:

from xgboost import XGBRegressor
from sklearn.ensemble import RandomForestRegressor

xgb1 = XGBRegressor(**args1)
xgb2 = XGBRegressor(**args2)
rf = RandomForestRegressor(**args3)

I am not sure how to combine them so that they are chained in the same order as in TPOT's output.

desertnaut
Debayan Paul

1 Answer

TPOTClassifier and TPOTRegressor already give you a scikit-learn Pipeline object, so you normally do not need to rebuild the model by hand.

According to the TPOT API, both TPOTClassifier and TPOTRegressor expose an attribute, fitted_pipeline_, which holds the best scikit-learn Pipeline that TPOT found. An example of such a Pipeline:

make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False, interaction_only=False),
    XGBRegressor(learning_rate=0.1, max_depth=4, min_child_weight=14, n_estimators=100, n_jobs=1, objective="reg:squarederror", subsample=1.0, verbosity=0)
)

You can either serialize the fitted pipeline (for example with pickle or joblib) and load it later, so you don't have to retrain the model, or use the built-in export function of TPOTClassifier and TPOTRegressor to write the optimized Pipeline out as Python code that you can re-fit yourself:

tpot.export('tpot_digits_pipeline.py')
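
For the first option (dumping the fitted pipeline), here is a minimal sketch using joblib. It assumes only scikit-learn is installed: the small PolynomialFeatures + Ridge pipeline is a stand-in for tpot.fitted_pipeline_, and the file name best_pipeline.joblib is made up for illustration. Any fitted scikit-learn Pipeline should be saved and restored the same way.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Toy data, only here to have something to fit (assumption, not from the question).
rng = np.random.RandomState(0)
X = rng.rand(50, 3)
y = X @ np.array([1.0, -2.0, 0.5])

# Stand-in for tpot.fitted_pipeline_: a fitted scikit-learn Pipeline.
fitted_pipeline = make_pipeline(PolynomialFeatures(degree=2, include_bias=False), Ridge())
fitted_pipeline.fit(X, y)

# Persist the fitted pipeline, then load it back without retraining.
model_path = os.path.join(tempfile.mkdtemp(), "best_pipeline.joblib")  # hypothetical file name
joblib.dump(fitted_pipeline, model_path)
restored_pipeline = joblib.load(model_path)

# The restored pipeline should reproduce the original predictions.
same_predictions = np.allclose(fitted_pipeline.predict(X), restored_pipeline.predict(X))
```

Keep in mind that a dumped pipeline should be loaded with the same library versions it was saved with, which is one reason the exported Python code can be the more portable option.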

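If for some reason you only have the output posted in the question, note that TPOT's nested notation describes stacking, not literal constructor arguments: each inner estimator is wrapped in TPOT's StackingEstimator, which adds that estimator's predictions to the feature matrix as an extra column before handing it to the next step. You can therefore recreate the scikit-learn Pipeline like this:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from tpot.builtins import StackingEstimator

tpot_data = pd.read_csv('PATH/TO/DATA/FILE', sep='COLUMN_SEPARATOR', dtype=np.float64)
features = tpot_data.drop('target', axis=1)
training_features, testing_features, training_target, testing_target = \
            train_test_split(features, tpot_data['target'], random_state=42)

# args1, args2 and args3 are the keyword-argument dicts from the question
exported_pipeline = make_pipeline(
    StackingEstimator(estimator=XGBRegressor(**args1)),  # innermost model runs first
    StackingEstimator(estimator=XGBRegressor(**args2)),
    RandomForestRegressor(**args3)  # outermost model makes the final prediction
)

exported_pipeline.fit(training_features, training_target)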
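
To make the nesting concrete, here is a rough sketch of the same stacking idea using only scikit-learn. Everything in it is an assumption for illustration: PredictionAsFeature is a hypothetical, simplified imitation of what TPOT's StackingEstimator does for regressors, GradientBoostingRegressor stands in for XGBRegressor, and the data is random. It is not TPOT's actual implementation.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.pipeline import make_pipeline


class PredictionAsFeature(TransformerMixin, BaseEstimator):
    """Hypothetical helper: fit a regressor, then prepend its predictions as a column."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y=None):
        # Fit a copy so the estimator passed to __init__ stays untouched.
        self.estimator_ = clone(self.estimator).fit(X, y)
        return self

    def transform(self, X):
        X = np.asarray(X)
        predictions = self.estimator_.predict(X).reshape(-1, 1)
        # The next step sees the original features plus one prediction column.
        return np.hstack((predictions, X))


# Toy regression data with 4 features (assumption, not from the question).
rng = np.random.RandomState(0)
X = rng.rand(80, 4)
y = X[:, 0] * 2.0 + X[:, 1] ** 2

# Same shape as the TPOT output: two stacked boosters feeding a random forest.
stacked_pipeline = make_pipeline(
    PredictionAsFeature(GradientBoostingRegressor(n_estimators=20, random_state=0)),
    PredictionAsFeature(GradientBoostingRegressor(n_estimators=20, random_state=1)),
    RandomForestRegressor(n_estimators=20, random_state=0),
)
stacked_pipeline.fit(X, y)

# Each stacking step adds one column, so the forest is trained on 4 + 2 features.
n_features_seen = stacked_pipeline[-1].n_features_in_
predictions = stacked_pipeline.predict(X)
```

Reading the TPOT output from the inside out gives the pipeline order: the innermost estimator becomes the first step and the outermost one becomes the final predictor.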
Alexandre Juma