Preprocessing data in TensorFlow

Question

I have a simple sequential model written in Python using the TensorFlow library. The input consists of categorical and numerical columns, and the output is a single float.

I would like to deploy my model in a Windows application (.NET), and I am wondering how to deal with the data encoders (e.g. label encoder, normalization).

I seem to have at least two options:

save the encoders somehow - how?
add a preprocessing layer in TensorFlow (I personally prefer this option), but how? I am looking for a solution analogous to FeatureUnion/ColumnTransformer from sklearn. Is it possible to use a preprocessing layer with a separate encoder for each column? How?

score 0 · Answer 1 · answered May 06 '23 at 20:44

Do the preprocessing with sklearn, independently of the TensorFlow model, in your training script. Afterward, export both the sklearn preprocessing steps and the TensorFlow model to ONNX. Then, in your .NET application, either feed the output of the preprocessing model into the TensorFlow model, or use the ONNX helper to stitch the two models together in advance.
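A minimal sketch of the preprocessing half of this approach, under some assumptions: the column names `color` and `size`, the toy data, and the helper `export_preprocessor` are hypothetical, and the choice of `OrdinalEncoder` and `StandardScaler` only stands in for whatever label encoder and normalization you actually use. The export step assumes the `skl2onnx` package is installed; the Keras model would be converted separately (for example with the `tf2onnx` package), which is not shown here.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Hypothetical training data: one categorical and one numerical column.
train = pd.DataFrame({
    "color": ["red", "green", "blue", "green"],
    "size": [1.0, 2.0, 3.0, 4.0],
})

# One encoder per column, fitted outside the TensorFlow model.
preprocessor = ColumnTransformer([
    ("cat", OrdinalEncoder(), ["color"]),
    ("num", StandardScaler(), ["size"]),
])

# These float32 features are what the TensorFlow model would be trained on.
features = preprocessor.fit_transform(train).astype(np.float32)


def export_preprocessor(path):
    """Save the fitted preprocessor as an ONNX file (assumes skl2onnx is installed)."""
    # Imported lazily so the rest of the sketch runs without skl2onnx.
    from skl2onnx import convert_sklearn
    from skl2onnx.common.data_types import FloatTensorType, StringTensorType

    # One ONNX input per DataFrame column; names must match the column names.
    initial_types = [
        ("color", StringTensorType([None, 1])),
        ("size", FloatTensorType([None, 1])),
    ]
    onnx_model = convert_sklearn(preprocessor, initial_types=initial_types)
    with open(path, "wb") as f:
        f.write(onnx_model.SerializeToString())
```

Calling `export_preprocessor("preprocessor.onnx")` would write a file that an ONNX runtime in the .NET application can load and run before the model itself. If you prefer a single file, recent `onnx` releases provide `onnx.compose.merge_models`, which can connect the preprocessor's output to the model's input.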

P.S.

If you need a concrete example of how to combine several ONNX models into one file, you can refer to this file in my Falcon-ML library, where I have exactly the same use case.
