When a formula contains categorical variables, patsy seems to need the full original dataset to rebuild the category levels and encoding.

After data is transformed to a design matrix, is there a way to retrieve patsy's levels and encoding for that data? I would like to avoid keeping the full dataset around just so that patsy can rebuild the category levels and encoding.

The context: I transform the training data into a design matrix with patsy during model training, and later need the levels and encoding to generate model predictions without keeping the original training data around.
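To make the question concrete, here is a minimal sketch of the workflow I have in mind. It assumes that the `design_info` attribute of the matrix returned by `dmatrix` records the levels and encoding, and that `build_design_matrices` can reuse it on new rows. The column names (`color`, `x`) and the data are made up for illustration.

```python
import numpy as np
from patsy import dmatrix, build_design_matrices

# Hypothetical training data: one categorical and one numerical column.
train = {"color": ["red", "green", "blue", "red"], "x": [1.0, 2.0, 3.0, 4.0]}

# Build the design matrix once, at training time.
X_train = dmatrix("C(color) + x", train)

# Keep only the metadata object, not the training data itself.
design_info = X_train.design_info

# Inspect the category levels patsy memorized for each categorical factor.
levels = {
    factor.name(): info.categories
    for factor, info in design_info.factor_infos.items()
    if info.type == "categorical"
}

# At prediction time, encode a new row that contains only one of the levels.
new = {"color": ["red"], "x": [5.0]}
X_new = build_design_matrices([design_info], new)[0]

# Map column names to values so the encoding of the new row is easy to read.
encoded_row = dict(zip(design_info.column_names, np.asarray(X_new)[0]))
```

If this is the intended approach, the remaining question is how to persist `design_info` between sessions, since as far as I can tell patsy does not support pickling these objects.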

morfys
