I have trained a model on data with 6 text features (say f1, f2, ..., f6). When the model is deployed, however, each new data point for which I have to make a prediction has only 2 of these features (f1 and f2). So there is a feature mismatch between training and inference. How can I tackle this problem? I have a few thoughts, but none of them seems very effective.
- Train on only the two features (f1 and f2) and discard the others (f3, ..., f6). But this loses information, and my test set accuracy decreases.
- Learn some relation between (f3, ..., f6) and (f1, f2), so that even though (f3, ..., f6) are missing from the new data point, their information can be estimated from f1 and f2 alone.
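
To make the second idea concrete, here is a minimal sketch of one possible form of it: nearest-neighbour imputation. It assumes the features are short free-text strings and that token overlap (Jaccard similarity) on f1 and f2 is a reasonable notion of closeness. The helper names (`tokens`, `jaccard`, `impute_missing`) and the example rows are hypothetical, made up only for illustration.

```python
def tokens(text):
    """Lowercase a string and split it into a set of whitespace tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between the token sets of two strings."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def impute_missing(train_rows, new_row,
                   observed=("f1", "f2"),
                   missing=("f3", "f4", "f5", "f6")):
    """Fill the missing features of new_row from the most similar training row.

    Similarity is the sum of Jaccard scores over the observed features.
    """
    def similarity(row):
        return sum(jaccard(row[k], new_row[k]) for k in observed)

    nearest = max(train_rows, key=similarity)
    completed = dict(new_row)          # keep the observed values unchanged
    for k in missing:
        completed[k] = nearest[k]      # borrow the values that are absent
    return completed


# Hypothetical training rows with all six features present.
train_rows = [
    {"f1": "red running shoes", "f2": "lightweight mesh",
     "f3": "footwear", "f4": "sports", "f5": "unisex", "f6": "summer"},
    {"f1": "wool winter coat", "f2": "warm lined",
     "f3": "outerwear", "f4": "casual", "f5": "women", "f6": "winter"},
]

# A deployment-time point that only has f1 and f2.
new_row = {"f1": "blue running shoes", "f2": "breathable mesh"}

completed = impute_missing(train_rows, new_row)
print(completed)
```

The completed row could then be passed to the model trained on all six features. Whether this beats training on f1 and f2 alone depends on how predictable (f3, ..., f6) really are from (f1, f2), so it would need to be checked on a held-out set where f3 to f6 are deliberately hidden.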