My educational project is about sign language recognition using a Kinect camera.

I want to compare hand motion trajectories using DTW (dynamic time warping) as the distance measure, and then do nearest-neighbor classification with that distance (NN-DTW).

Each hand trajectory is built from the position of the hand joint in consecutive frames, in a 3D coordinate system. The x, y, z coordinates of the hand joint in every frame are obtained from the Kinect camera.

Which option is more appropriate for measuring the distance between these trajectories: DTWi or DTWd?

At_Ta.stu

1 Answer

Short answer: DTWd (for your specific use-case)

You might want to have a look at this paper:

Shokoohi-Yekta, M., Wang, J., & Keogh, E. (2015). On the Non-Trivial Generalization of Dynamic Time Warping to the Multi-Dimensional Case. Proceedings of the 2015 SIAM International Conference on Data Mining, 289–297. https://doi.org/10.1137/1.9781611974010.33

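According to this paper, neither DTWi nor DTWd is better across the board: on a given dataset one of the two usually outperforms the other, and which one wins depends on the data. In general terms, the authors say that "results suggest if the data dimensions are dependently warped, use DTWd to classify the data. If the data dimensions are independently warped, DTWi will give you more accurate results for classifying the data". Since the x, y, z coordinates of a single hand joint are produced by the same physical motion, they are likely to be warped together, which is why DTWd looks like the better fit here.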

If you do not want to make that decision beforehand, the paper also describes an adaptive approach that chooses between DTWi and DTWd for each instance being classified, rather than fixing one of them for the whole dataset.
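To make the difference between the two options concrete, here is a minimal sketch in Python with NumPy. It is not the paper's code: the function names `dtw`, `dtw_d`, and `dtw_i` are made up for illustration, the local cost is assumed to be the squared difference, and no warping window is applied. DTWd computes one warping path shared by x, y, and z; DTWi warps each coordinate on its own and sums the three distances.

```python
import numpy as np

def dtw(cost):
    """Plain DTW over a precomputed (n, m) local-cost matrix (no window)."""
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Extend the cheapest of the three allowed predecessor cells.
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m]

def dtw_d(a, b):
    """Dependent DTW: a is (n, d), b is (m, d); one path for all dimensions."""
    diff = a[:, None, :] - b[None, :, :]      # shape (n, m, d)
    return dtw(np.sum(diff ** 2, axis=2))     # squared Euclidean local cost

def dtw_i(a, b):
    """Independent DTW: one path per dimension, distances summed."""
    return sum(
        dtw((a[:, None, k] - b[None, :, k]) ** 2) for k in range(a.shape[1])
    )

# Toy trajectories (hypothetical data): the bump in x and the bump in y
# occur in opposite order in the two sequences; z is constant.
a = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
b = np.array([[0, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)

print("DTWd:", dtw_d(a, b))
print("DTWi:", dtw_i(a, b))
```

With this definition DTWi can never exceed DTWd, because the shared path used by DTWd is also a valid path for each individual dimension. In the toy example DTWi can align each coordinate perfectly on its own, while DTWd cannot, since no single path lines up both bumps at once. For nearest-neighbor classification you would compute the chosen distance from the query trajectory to every labelled trajectory and return the label of the closest one.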

jdacoello