How can a DVC pipeline know to run the encoding pipeline when new data is added for modeling?

Asked May 31 '22 at 17:24

Active May 31 '22 at 17:24

Viewed 33 times

I have created separate pipelines for feature encoding and feature scaling in DVC. When new data comes in through my Flask API, how will these DVC pipelines run automatically to encode and scale that data for modelling?

DRP

Much more context and detail are needed to try to answer this. The short answer is to use `dvc repro`. Please see https://dvc.org/doc/command-reference/repro for now. – Jorge Orpinel Pérez May 31 '22 at 19:36
Do you mean incremental learning? Does the data come from the Flask API and get stored at some local path? And do you want training to start automatically? – karajan1001 Jun 01 '22 at 07:56
For example, I have a dataset with ten features, five of which are categorical. In the feature engineering part, I handled missing data, encoded the categorical features, and scaled the features. I've created a DVC pipeline with the stages load_data->fill_data->encode_data->scaled_data->split_data->train_data. I have now created a Flask API that takes input from the user. How will the DVC pipeline run on that new user-inputted data? – DRP Jun 02 '22 at 04:06
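
To make the `dvc repro` suggestion from the comments concrete, here is a minimal sketch, not a definitive setup. It assumes the first stage (load_data) lists a raw CSV as a dependency. The path `data/raw.csv` and the helper names `append_row` and `reproduce` are hypothetical, chosen only for illustration. The idea is that the API handler appends the new record to that file and then calls `dvc repro`. DVC compares dependency hashes and reruns only the stages affected by the change, so encode_data and scaled_data run again because their upstream input changed.

```python
import csv
import subprocess
from pathlib import Path
from typing import Mapping

# Hypothetical path: assumed to be the file that the load_data stage
# declares as a dependency in dvc.yaml.
RAW_DATA = Path("data/raw.csv")


def append_row(row: Mapping[str, object], path: Path = RAW_DATA) -> None:
    """Append one user-submitted record to the raw CSV.

    A header row is written first if the file does not exist or is empty.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row.keys()))
        if is_new:
            writer.writeheader()
        writer.writerow(row)


def reproduce(runner=subprocess.run):
    """Run `dvc repro` so stages with changed dependencies are re-executed.

    `runner` is injectable so the call can be replaced in tests;
    by default it shells out to the dvc CLI and raises on failure.
    """
    cmd = ["dvc", "repro"]
    runner(cmd, check=True)
    return cmd
```

A Flask view would call `append_row(request.get_json())` followed by `reproduce()`. Note that this retrains on the accumulated data. If the goal is only to predict on a single new input, rerunning the whole pipeline per request is probably not what you want, and the details depend on how the stages are defined.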
