
I'm trying to do parameter optimisation with HyperOptSearch and ray.tune. The code works with hyperopt alone (without tune), but I wanted it to run faster, hence tune. Unfortunately I could not find many examples, so I am not sure about the code. I use a pipeline with XGBoost, but I don't want to optimise only the XGBoost parameters; the pipeline has another parameter, used for the encoding, that I'd like to tune as well. Is this possible with tune? My code is below.

import ray
import xgboost as xgb
from category_encoders import OneHotEncoder, TargetEncoder
from hyperopt import hp
from hyperopt.pyll import scope  # needed for the scope.int(...) wrappers below
from imblearn.pipeline import Pipeline as IMBPipeline
from imblearn.under_sampling import RandomUnderSampler
from ray import tune
from ray.tune.suggest.hyperopt import HyperOptSearch
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import Pipeline as SKPipeline

def train_model(space, reporter):
    # target encoding for high-cardinality columns (cardinality above the tuned threshold)
    columns_te = no_of_classes[no_of_classes.counts > space['enc_threshold']].feature.values.tolist()
    # one-hot encoding for the remaining categorical columns
    columns_ohe = categorical.columns[~categorical.columns.isin(columns_te)].tolist()

    # split the sampled space: these keys are consumed by the pipeline
    # (or have no XGBClassifier equivalent); everything else goes to XGBoost
    non_xgb_keys = {'enc_threshold', 'min_samples_leaf', 'nrounds'}
    xgb_params = {k: v for k, v in space.items() if k not in non_xgb_keys}

    # encoding pipeline
    pipe1 = SKPipeline([('ohe',
                         OneHotEncoder(cols=columns_ohe, return_df=True,
                                       handle_unknown='ignore', use_cat_names=True)),
                        ('te',
                         TargetEncoder(cols=columns_te,
                                       min_samples_leaf=space['min_samples_leaf']))])

    # undersampling + classifier (imblearn pipeline)
    pipe2 = IMBPipeline([
        ('sampling', RandomUnderSampler()),
        ('clf', xgb.XGBClassifier(**xgb_params, n_jobs=-1))
    ])

    model = SKPipeline([('pipe1', pipe1), ('pipe2', pipe2)])

    dataset = xx  # the training DataFrame; 'yy' is the target column
    X, y = dataset.drop(columns=['yy']), dataset.yy
    model.fit(X, y)
    # fit() returns the estimator, not a score, so compute the metric explicitly
    # (ideally on a held-out split rather than on the training data)
    roc_auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
    reporter(roc_auc=roc_auc)  # key must match the metric passed to HyperOptSearch

if __name__ == '__main__':
    ray.init()

    space = {'eta': hp.uniform('eta', 0.001, 0.1),
             'max_depth': scope.int(hp.quniform('max_depth', 1, 5, 1)),
             'min_child_weight': hp.uniform('min_child_weight', 0.1, 1.5),
             'n_estimators': scope.int(hp.quniform('n_estimators', 20, 200, 10)),
             'subsample': hp.uniform('subsample', 0.5, 0.9),
             'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 0.9),
             'gamma': hp.uniform('gamma', 0, 5),
             'min_samples_leaf': scope.int(hp.quniform('min_samples_leaf', 10, 200, 20)),
             'nrounds': scope.int(hp.quniform('nrounds', 100, 1500, 50)),
             # train_model reads this key for the encoding split; the range is a placeholder
             'enc_threshold': scope.int(hp.quniform('enc_threshold', 2, 50, 1))
             }



    algo = HyperOptSearch(space, max_concurrent=5,  metric='roc_auc', mode="max")
    tune.run(train_model, num_samples=10, search_alg=algo)
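
As a usage note: assuming a Ray release recent enough that tune.run returns an ExperimentAnalysis object, the call above can also be captured to inspect the best configuration found, including the pipeline-level enc_threshold. A minimal sketch:

    analysis = tune.run(train_model, num_samples=10, search_alg=algo)
    # best hyperparameters according to the reported roc_auc
    print(analysis.get_best_config(metric='roc_auc', mode='max'))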
– corianne1234
  • That looks fine. Feel free to contribute this as an example once you get it working! – richliaw Dec 10 '19 at 02:15
  • I get an error I have no idea how to solve: redis.exceptions.ConnectionError: Error 104 while writing to socket. Connection reset by peer. – corianne1234 Dec 10 '19 at 08:55
  • Seems like it has to do with the size of the dataset. I was trying to follow this example, but I do not understand how they iterate over the dataset. Is data a list/dataframe? Also, if I just use a sample of my dataset it doesn't work. The example:

        def train_func(config, reporter):  # add a reporter arg
            model = ( ... )
            optimizer = SGD(model.parameters(), momentum=config["momentum"])
            dataset = ( ... )
            for idx, (data, target) in enumerate(dataset):
                accuracy = model.fit(data, target)
                reporter(mean_accuracy=accuracy)  # report metrics

    – corianne1234 Dec 10 '19 at 09:37
  • We've recently pushed an XGBoost example onto the documentation: https://docs.ray.io/en/master/tune/tutorials/tune-xgboost.html. Further, if you're running into ConnectionErrors with Redis, you can use this workaround: https://github.com/ray-project/ray/issues/2931#issuecomment-653859257 – richliaw Jul 12 '20 at 04:17
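
For context, the pattern in that linked XGBoost tutorial looks roughly like the sketch below. Assumptions: the sklearn breast-cancer demo data stands in for a real dataset, and a Ray version recent enough to provide tune.report; the parameter ranges are illustrative. This is a sketch of the tutorial's shape, not a drop-in solution for the pipeline above.

    import xgboost as xgb
    from ray import tune
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split

    def train_breast_cancer(config):
        # small demo dataset; split off a held-out evaluation set
        data, labels = load_breast_cancer(return_X_y=True)
        train_x, test_x, train_y, test_y = train_test_split(data, labels, test_size=0.25)
        train_set = xgb.DMatrix(train_x, label=train_y)
        test_set = xgb.DMatrix(test_x, label=test_y)
        results = {}
        xgb.train(config, train_set, evals=[(test_set, "eval")], evals_result=results)
        # report the final held-out error back to tune
        accuracy = 1.0 - results["eval"]["error"][-1]
        tune.report(mean_accuracy=accuracy, done=True)

    config = {
        "objective": "binary:logistic",
        "eval_metric": ["logloss", "error"],
        "max_depth": tune.randint(1, 9),
        "eta": tune.loguniform(1e-4, 1e-1),
    }
    tune.run(train_breast_cancer, config=config, num_samples=10)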

0 Answers