i'm studying gaussian process regression, and i'm trying to use the built-in functions from scikit-learn, and also trying to impement a custom function for doing so.
This is the code when using scikit-learn:
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor as gpr
from sklearn.gaussian_process.kernels import RBF,WhiteKernel,ConstantKernel as C
from scipy.optimize import minimize
import scipy.stats as s
X = np.linspace(0,10,10).reshape(-1,1) # Input Values
Y = 2*X + np.sin(X) # Function
v = 1
kernel = v*RBF() + WhiteKernel() #Defining kernel
gp = gpr(kernel=kernel,n_restarts_optimizer=50).fit(X,Y) #fitting the process to get optimized
hyperparameter
gp.kernel_ #Hyperparameters optimized by the GPR function in scikit-learn
Out[]: 14.1**2 * RBF(length_scale=3.7) + WhiteKernel(noise_level=1e-05) #result
And this is the code i wrote manually:
def marglike(par,X,Y): #defining log-marginal-likelihood
# print(par)
l,var,sigma_n = par
n = len(X)
dist_X = (X - X.T)**2
# print(dist_X)
k = var*np.exp(-(1/(2*(l**2)))*dist_X)
inverse = np.linalg.inv(k + (sigma_n**2)*np.eye(len(k)))
ml = (1/2)*np.dot(np.dot(Y.T,inverse),Y) + (1/2)*np.log(np.linalg.det(k +
(sigma_n**2)*np.eye(len(k)))) + (n/2)*np.log(2*np.pi)
return ml
b= [0.0005,100]
bnd = [b,b,b] #bounds used for "minimize" function
start = np.array([1.1,1.6,0.05]) #initial hyperparameters values
re = minimize(marglike,start,args=(X,Y),method="L-BFGS-B",options = {'disp':True},bounds=bnd) #the
method used is the same as the one used by scikit-learn
re.x #Hyperparameter results
Out[]: array([3.55266484e+00, 9.99986210e+01, 5.00000000e-04])
As you can see, the hyperparameter i got from the 2 methods are different, but yet i used the same data(X,Y) and same minimization method.
Could somebody help me to understand why and maybe how to get same results ?!