
When we need to do gesture recognition, we train an HMM for each gesture. Then, to classify a new gesture, we compute the probability of the observation sequence under each HMM and take the model with the highest likelihood.
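To make the single-gesture setup concrete, here is roughly what I have in mind, sketched with hmmlearn (the library choice, the number of states, and the feature shapes are just illustrative, not fixed requirements):

    # One HMM per gesture; classification = pick the model with the
    # highest log-likelihood for the observation sequence.
    import numpy as np
    from hmmlearn import hmm

    def train_gesture_models(training_data, n_states=5):
        """training_data: dict mapping gesture name -> list of (T_i, D) feature arrays."""
        models = {}
        for gesture, sequences in training_data.items():
            X = np.concatenate(sequences)           # stack all training sequences
            lengths = [len(s) for s in sequences]   # hmmlearn needs per-sequence lengths
            m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
            m.fit(X, lengths)
            models[gesture] = m
        return models

    def classify(models, observation_seq):
        """Pick the gesture whose HMM assigns the highest log-likelihood."""
        return max(models, key=lambda g: models[g].score(observation_seq))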

But what do we do when we need to classify multiple gestures performed in a sequence, and we don't know how to segment the stream into individual gestures so that we can apply the same single-gesture approach?

So how can we do this kind of sequence classification? Is an HMM appropriate? Are there other approaches?

Thanks

1 Answer


NLP generally does this with real-time interpretation: set a match threshold, and when a sequence of motions resolves to a unique gesture and meets the threshold, you interpret it as that gesture.

This is simple to describe. In practice there is a lot of feedback, especially if some gestures are subsets of others, or if the matches are not quite as crisp as we'd like.
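Here is a rough sketch of that thresholded, sliding-window idea, assuming per-gesture HMM objects with an hmmlearn-style score() method like the ones in your question; the window size, threshold, and margin are placeholder values you would need to tune, and it assumes at least two gesture models:

    def segment_stream(models, stream, window=30, threshold=-200.0, margin=20.0):
        """Scan an unsegmented feature stream (a (T, D) array) and emit
        (gesture, end_frame) whenever one model both clears the threshold
        and beats every other model by `margin`."""
        detections = []
        t = window
        while t <= len(stream):
            chunk = stream[t - window:t]
            scored = sorted(((m.score(chunk), g) for g, m in models.items()), reverse=True)
            (best, best_gesture), (runner_up, _) = scored[0], scored[1]
            if best >= threshold and best - runner_up >= margin:
                detections.append((best_gesture, t))
                t += window   # jump past the matched gesture
            else:
                t += 1        # no unique, confident match yet; slide one frame
        return detections

The margin check is what implements "resolves to a unique gesture": a window only counts as a match when one model is clearly ahead of all the others, not merely above the threshold.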

If you want to use an HMM, can you seed it, after some training, with markers for the terminal states?

Prune
  • Thanks a lot, that is helpful. I understand the threshold part, but what do you mean by "when a sequence of motions resolves to a unique gesture"? Yes, I can mark the terminal states. I am trying to implement this for sign language recognition, so the gestures are words, and I am marking the start and end of each word as separate states. Also, if you have a tutorial or a paper explaining this in more detail, I would be more than grateful. Thanks – user5692502 Dec 19 '15 at 01:01
  • Some words are not atomic motions, such as "morning" or "together". Some are compounds of simpler concepts, such as "teacher". Your first parsing need is to delineate words in a relatively fluid sequence of motions. I'm sorry; I have no paper or tutorial on this. I have post-master's work in cognitive sciences, including computational linguistics, and I have an interest in sign language. Many of your problems are related to processing an audio stream into speech, and resolving that into words. – Prune Dec 19 '15 at 01:18