In the standard scikit-learn implementation of Gaussian Process Regression (GPR), the kernel hyperparameters are chosen by maximizing the log-marginal likelihood of the training set.
Is there an easy-to-use implementation of GPR (in Python) where the kernel hyperparameters are chosen based on a separate validation set? Cross-validation would also be a nice alternative for finding suitable hyperparameters (optimized to perform well on multiple train-val splits). (I would prefer a solution that builds on the scikit-learn GPR.)
In detail: a set of hyperparameters theta should be found that performs well under the following metric: compute the posterior GP from the training data (given the prior GP with hyperparameters theta), then evaluate the negative log likelihood of the validation data under that posterior. This negative log likelihood should be minimal for theta.
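For concreteness, here is a minimal sketch of how I imagine computing this validation negative log likelihood for a fixed theta on top of scikit-learn. The RBF + WhiteKernel choice and the name val_neg_log_likelihood are just placeholders, and theta is assumed to be in scikit-learn's log-transformed parameterization:

```python
from scipy.stats import multivariate_normal
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def val_neg_log_likelihood(theta, X_train, y_train, X_val, y_val):
    # Kernel with fixed hyperparameters theta (scikit-learn stores them in log-space);
    # optimizer=None keeps theta fixed instead of re-optimizing it on the training set.
    kernel = RBF() + WhiteKernel()
    kernel.theta = theta
    gpr = GaussianProcessRegressor(kernel=kernel, optimizer=None)
    gpr.fit(X_train, y_train)

    # Joint posterior (mean and covariance) over the validation inputs;
    # the WhiteKernel contributes the observation noise to the covariance.
    mean, cov = gpr.predict(X_val, return_cov=True)

    # Negative log density of the validation targets under that posterior,
    # i.e. -log P[ valData | trainData, theta ].
    return -multivariate_normal.logpdf(y_val, mean=mean, cov=cov, allow_singular=True)
```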
In other words, I want to find theta such that P[ valData | trainData, theta ] is maximal. A non-exact approximation that might be sufficient would be to find theta such that sum_i log(P[ valData_i | trainData, theta ]) is maximal, where P[ valData_i | trainData, theta ] is the Gaussian marginal posterior density of a single validation data point valData_i, given the training data and the prior GP with hyperparameters theta.

Edit: Since the exact P[ valData | trainData, theta ] has now been implemented (see my answer), the easier-to-implement approximation is no longer needed.
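For completeness, this is roughly how I imagine searching over theta, reusing the hypothetical val_neg_log_likelihood helper from the sketch above and scipy.optimize.minimize as a generic local optimizer (a multi-start over random initializations would guard against local minima; the toy data is only there to make the example self-contained):

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy data, only to make the sketch runnable end to end.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(60, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(60)
X_train, y_train, X_val, y_val = X[:40], y[:40], X[40:], y[40:]

# Minimize the validation negative log likelihood over theta (log-space),
# starting from the kernel's default hyperparameters.
kernel = RBF() + WhiteKernel()
result = minimize(
    val_neg_log_likelihood,
    x0=kernel.theta,
    args=(X_train, y_train, X_val, y_val),
    method="L-BFGS-B",
)
best_theta = result.x
```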