I want to train an XGBoost (xgb) classifier to predict a class from a given set of variable values. Nothing difficult so far. However, I also want the model to pick up on how the variables evolve over time (this is the sequential / time-series part) in the run-up to a class change.
Example:
We have n different IDs, and for each ID we have p variables (features) that evolve over m time stamps. Initially the flag (y) is 0, and at time stamp m it becomes 1. This is what I want the model to observe: the evolution of the variables that makes the label change.
Questions:
- If I structure the data as follows (ci is the ID, and vijt is variable j of ID i at time stamp t):
X = [[c1 v111 v121 ... v1p1],
     [c1 v112 v122 ... v1p2],
     [c1 v113 v123 ... v1p3],
     ...
     [c1 v11(m-1) v12(m-1) ... v1p(m-1)],
     [c1 v11m v12m ... v1pm],
     [c2 v211 v221 ... v2p1],
     [c2 v212 v222 ... v2p2],
     ... ]
y = [ 0 0 0 ... 0 1 0 0 ... ]
Will the model learn the sequential part just from having the ID variable as one of the input columns in X?
- Are there any other models suited to prediction problems like this? (xgb supports parallel computation, so it is suitable for large data sets, which is also my case.)
Thank you and sorry if the presentation is not clear enough :)
Best, Vic
I've used the structure presented above, but I'm not sure whether the model actually picks up the evolution I want :)
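To make the layout concrete, here is a minimal sketch of how I build X and y in that row-per-time-stamp form. The function name `build_long_format`, the random feature values, and the sizes n=3, m=4, p=2 are placeholders for illustration only, not my real data.

```python
import numpy as np

def build_long_format(n_ids, m_steps, p_features, seed=0):
    """Stack every (ID, time stamp) pair as one row: [ID, v_1, ..., v_p]."""
    rng = np.random.default_rng(seed)
    # values[i, t, j]: feature j of ID i at time stamp t (random placeholder data)
    values = rng.normal(size=(n_ids, m_steps, p_features))
    # ID column: c1 repeated m times, then c2 repeated m times, and so on
    ids = np.repeat(np.arange(1, n_ids + 1), m_steps)
    # Rows end up ordered by ID first, then by time stamp within each ID
    X = np.column_stack([ids, values.reshape(n_ids * m_steps, p_features)])
    # The flag is 0 everywhere except the last time stamp of each ID
    y = np.zeros(n_ids * m_steps, dtype=int)
    y[m_steps - 1 :: m_steps] = 1
    return X, y

X, y = build_long_format(n_ids=3, m_steps=4, p_features=2)
print(X.shape)
print(y)
```

X and y in this form are what I pass to the classifier (for example `xgboost.XGBClassifier().fit(X, y)`). As far as I understand, each row is then treated as an independent sample, which is exactly why I doubt that the ordering within an ID is being used.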