
I get an out-of-memory error when creating the following Gaussian process model, and I would like to know whether GPflow has a feature for loading the data in batches instead of reading it all at once.

Here is the code I tried:

import gpflow

data = (X, Y)  # roughly 1e6 data points
model = gpflow.models.VGP(
    data,
    kernel=gpflow.kernels.SquaredExponential(),
    likelihood=gpflow.likelihoods.Bernoulli(),
)

This raises an out-of-memory (OOM) error.

Your Common Sense

1 Answer


If you use the SVGP model instead of VGP, you can train the model on data loaded in mini-batches, as sketched below. This is demonstrated in the "GPs for big data" notebook: https://gpflow.github.io/GPflow/2.7.1/notebooks/advanced/gps_for_big_data.html
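A minimal sketch of what that looks like, assuming your X and Y are NumPy arrays of shape (N, D) and (N, 1); the number of inducing points, batch size, learning rate and step count are illustrative choices, and the random data here just stands in for your real arrays:

import numpy as np
import tensorflow as tf
import gpflow

# Stand-in data with the same shapes as in the question (assumption).
N, D, M = 1_000_000, 1, 100
X = np.random.rand(N, D)
Y = (np.random.rand(N, 1) > 0.5).astype(float)

# Inducing points: a small set of input locations, here just a subset of X.
Z = X[:M].copy()

model = gpflow.models.SVGP(
    kernel=gpflow.kernels.SquaredExponential(),
    likelihood=gpflow.likelihoods.Bernoulli(),
    inducing_variable=Z,
    num_data=N,  # needed so the minibatch ELBO is scaled to the full dataset
)

# Stream the data in mini-batches instead of holding one big N x N computation.
batch_size = 256
dataset = (
    tf.data.Dataset.from_tensor_slices((X, Y))
    .repeat()
    .shuffle(buffer_size=10_000)
    .batch(batch_size)
)
data_iter = iter(dataset)

optimizer = tf.keras.optimizers.Adam(0.01)

@tf.function
def step(batch):
    # One stochastic optimisation step on a single mini-batch.
    with tf.GradientTape() as tape:
        loss = model.training_loss(batch)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

for _ in range(10_000):  # number of steps is an arbitrary example value
    step(next(data_iter))

The memory cost per step is then driven by the batch size and the number of inducing points M, not by N, which is what makes this workable for ~1e6 points.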

If you're only just past the edge of "not enough memory", there might be ways of computing things piece by piece (though I can't give concrete advice on how to do that), but a VGP model for N data points will in the end still need to allocate O(N^2) memory for the N x N covariance matrix.

STJ