retrieve patsy's levels and encoding of categorical variables when transforming data to a design matrix

Asked Jun 16 '21 at 19:30

Active Jun 16 '21 at 19:30

Viewed 95 times

When a formula contains categorical variables, patsy seems to need the full original dataset in order to rebuild the category levels and encoding.

After data is transformed to a design matrix, is there a way to retrieve patsy's levels and encoding for that data? I would like to avoid keeping the full dataset around just so that patsy can rebuild the category levels and encoding.

The context is that I'm transforming training data to a design matrix with patsy during model training. At prediction time I would then like to apply the same levels and encoding to new data, without having to keep the original training data around.
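For concreteness, here is a minimal sketch of the workflow described above. It assumes the `design_info` attribute that patsy attaches to a design matrix is enough to carry the levels and encoding forward, and that `build_design_matrices` can reuse it on new data. The column names `color` and `x` and all values are made up for illustration. Whether `design_info` can be saved to disk is not addressed here; the sketch only keeps it in memory.

```python
from patsy import dmatrix, build_design_matrices

# Hypothetical training data (names and values are illustrative only).
train = {
    "color": ["red", "green", "blue", "green"],
    "x": [1.0, 2.0, 3.0, 4.0],
}

# Build the design matrix once, during training.
dm = dmatrix("color + x", train)

# design_info records how each factor was encoded.
info = dm.design_info

# Collect the levels patsy found for each categorical factor.
levels = {
    factor.name(): finfo.categories
    for factor, finfo in info.factor_infos.items()
    if finfo.type == "categorical"
}
print(levels)
print(info.column_names)

# At prediction time, reuse the stored design_info on new data
# without referring back to the training data.
new_data = {"color": ["red"], "x": [5.0]}
(new_dm,) = build_design_matrices([info], new_data)
print(new_dm)
```

In this sketch only `info` has to be kept after training; `train` and `dm` are not used again when the new row is encoded.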

asked Jun 16 '21 at 19:30

morfys

0 Answers