I have a stream of data (e.g. 3D positions) generated by a system, which looks like this:
(pos1, time1) (pos2, time2) (pos3, time3) ...
I want to use a machine learning technique to detect a particular event in this stream, or to estimate its likelihood. What I have done so far:
- I tagged every frame with YES if the event occurred at that frame and with NO otherwise:
(pos1, time1, NO) (pos2, time2, YES) (pos3, time3, NO) ... (posK, timeK, YES) ...
- I chose a window length L and built each training sample from L consecutive frames, labelling the sample with the tag of the last frame in the window (L = 3 in this example):
(pos1, pos2, pos3, NO) (pos2, pos3, pos4, NO) (pos3, pos4, pos5, NO) ... (posK-2, posK-1, posK, YES) ...
- Finally, I trained my model on this set of windows.
- For testing, I concatenate L consecutive frames and ask the model to predict the tag for that window (i.e. YES or NO).
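To make the windowing step concrete, here is a minimal sketch of how I build the samples. The function name `make_windows` and the toy frames and labels are made up for illustration; my real data has many more frames.

```python
def make_windows(frames, labels, L):
    """Concatenate L consecutive frames; tag each window with its last frame's label."""
    X, y = [], []
    for end in range(L, len(frames) + 1):
        window = frames[end - L:end]
        # Flatten [(x, y, z), ...] into one feature vector of length 3 * L.
        X.append([coord for pos in window for coord in pos])
        y.append(labels[end - 1])
    return X, y

# Hypothetical toy stream: 5 frames, event on the 2nd and 5th frame.
frames = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0),
          (1.0, 1.0, 1.0), (2.0, 1.0, 1.0)]
labels = ["NO", "YES", "NO", "NO", "YES"]

X, y = make_windows(frames, labels, L=3)
print(len(X), len(X[0]))  # 3 9
print(y)                  # ['NO', 'NO', 'YES']
```

Note that the timestamps are dropped here, as in the example windows above; only the positions go into the feature vector.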
I realize that "NO" occurs far more often than "YES", simply because the system is idle most of the time and no event happens. This imbalance affects the training.
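As a rough illustration of how strong the imbalance can be, the sketch below counts the tags and computes the accuracy of a trivial model that always answers with the most frequent tag. The 95/5 split and the name `majority_baseline` are assumptions for the example, not my real numbers.

```python
from collections import Counter

def majority_baseline(labels):
    """Return tag counts, the most frequent tag, and the accuracy of always predicting it."""
    counts = Counter(labels)
    majority_label, majority_count = counts.most_common(1)[0]
    return counts, majority_label, majority_count / len(labels)

# Hypothetical split: 95 idle windows, 5 event windows.
labels = ["NO"] * 95 + ["YES"] * 5

counts, label, acc = majority_baseline(labels)
print(counts["NO"], counts["YES"])  # 95 5
print(label, acc)                   # NO 0.95
```

With a split like this, a model can reach high accuracy without ever predicting YES, so accuracy alone does not tell me whether the event is being detected.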
Could you give me some hints? 1) Which type of machine learning model is the best fit for this problem? 2) At the moment I classify the output as either "YES" or "NO", but I would like to obtain the probability that the event occurs at any given time. What kind of model do you suggest?
Thanks