I've also asked this on the GPflow GitHub.
I found that the initial guesses for the hyperparameters, set with m.likelihood.variance.assign(0.01) and m.kernel.lengthscales.assign(0.3), significantly affect the final optimized hyperparameters. Is there a method for getting a good initial guess, for example by estimating it from the dataset?
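To make "estimating using the dataset" concrete, here is a minimal sketch of one common rule of thumb, not something GPflow provides: take the median pairwise distance between inputs as the starting lengthscale, and split the variance of the targets between signal and noise. The function name `initial_hyperparameters` and the `noise_fraction` parameter are hypothetical names chosen for illustration, and the 10% noise share is an assumption, not a recommendation from the library.

```python
import numpy as np


def initial_hyperparameters(X, Y, noise_fraction=0.1, max_points=1000, seed=0):
    """Heuristic starting values from the data (hypothetical helper, not a GPflow API)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    X = X.reshape(X.shape[0], -1)  # ensure shape (N, D)

    # Subsample so the pairwise distance matrix stays small for large datasets.
    if X.shape[0] > max_points:
        rng = np.random.default_rng(seed)
        X = X[rng.choice(X.shape[0], size=max_points, replace=False)]

    # Median heuristic: median Euclidean distance over all distinct pairs.
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    upper = np.triu_indices(X.shape[0], k=1)
    lengthscale = float(np.median(dists[upper]))

    # Assumption: a fixed share of the target variance is attributed to noise.
    y_var = float(np.var(Y))
    return {
        "lengthscale": lengthscale,
        "kernel_variance": (1.0 - noise_fraction) * y_var,
        "noise_variance": noise_fraction * y_var,
    }


# Usage with the model from the question (m is assumed to already exist):
# init = initial_hyperparameters(X, Y)
# m.kernel.lengthscales.assign(init["lengthscale"])
# m.likelihood.variance.assign(init["noise_variance"])
```

These values are only starting points. Because the marginal likelihood can have several local optima, it may still be worth running the optimizer from a few different starts and keeping the result with the best objective.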