I'm just getting started with auto-sklearn. I've implemented the code below, and it runs fine.

How do I find out which model and hyperparameters it chose, though?

Also, if it manages some of the preprocessing steps (e.g. imputing nulls & encoding), how do I then deploy the 'pipeline' including those steps?

Thanks a lot

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.metrics import accuracy_score
import autosklearn.classification

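# features / output: feature matrix and target, assumed to be defined earlier (not shown)
train_features, test_features, train_labels, test_labels = train_test_split(
    features, output, test_size=0.2, random_state=42
)

# Let auto-sklearn search over models and preprocessing, using all available cores
model = autosklearn.classification.AutoSklearnClassifier(n_jobs=-1)
model.fit(train_features, train_labels)
predictions = model.predict(test_features)
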
kikee1222

1 Answer


From the docs, there is model.cv_results_ for comprehensive stats on the configurations that were tried, and model.show_models() to show the models in the final ensemble. For deployment, presumably you would just serialize (e.g. pickle) the entire fitted autosklearn object, so that preprocessing still happens internally at prediction time. (In a quick example, the ensemble consisted of many SimpleClassificationPipeline objects, which are wrappers around sklearn Pipelines.)
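A rough sketch of what that could look like, not a definitive recipe. The helper names (summarize_search, save_fitted_model, load_fitted_model) are made up for illustration, and using the standard-library pickle module for the serialization step is my assumption. Only show_models() and cv_results_ come from the docs mentioned above, and the return type of show_models() may differ between auto-sklearn versions.

```python
import pickle
from pathlib import Path


def summarize_search(model):
    """Collect the two things mentioned above from a fitted auto-sklearn model.

    Hypothetical helper: it only calls show_models() and reads cv_results_.
    """
    return {
        "ensemble": model.show_models(),   # final ensemble members
        "cv_results": model.cv_results_,   # stats for every configuration tried
    }


def save_fitted_model(model, path):
    """Pickle the whole fitted estimator, so its preprocessing travels with it."""
    with Path(path).open("wb") as fh:
        pickle.dump(model, fh)


def load_fitted_model(path):
    """Load an estimator previously written by save_fitted_model."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


# Usage with the fitted `model` from the question (file name is arbitrary):
# summary = summarize_search(model)
# save_fitted_model(model, "automl.pkl")
# deployed = load_fitted_model("automl.pkl")
# predictions = deployed.predict(test_features)
```

Since the whole object is pickled, the serving environment would presumably need the same auto-sklearn and scikit-learn versions installed as the training one, and you should only unpickle files you trust.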

Ben Reiniger