
I was building a system that processes poses from videos using Python, and a JavaScript (React) application that estimates the user's pose from the webcam in real time and compares it with the poses processed in Python.

The thing is that I started getting very different coordinates... As a test, I ran the same video through both applications, and the results are very discrepant. I've tried to find some pattern to transform the data (sometimes the X axis in Python seems to be the Y axis in JavaScript, and vice versa), but after testing more than one scenario, I just couldn't find a reliable pattern to transform and match the data.

I'm using the same version of MediaPipe in both applications. I know the Python and JavaScript MediaPipe implementations can be slightly different... but is it really that different, or am I missing something?

Thank you!


1 Answer

For anyone who struggles with the same thing someday, here is the solution I found:

The output of the MediaPipe BlazePose estimatePose method in JavaScript is analogous to pose_world_landmarks from the Python library, not to pose_landmarks! The difference is described here: https://github.com/google/mediapipe/blob/master/docs/solutions/pose.md
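
For reference, here is a minimal Python sketch (using the legacy mediapipe.solutions Pose API; the video path is just a placeholder) showing where the two coordinate sets come from, so you can pick the one that matches the JavaScript output:

    import cv2
    import mediapipe as mp

    mp_pose = mp.solutions.pose

    cap = cv2.VideoCapture("video.mp4")  # placeholder input video
    with mp_pose.Pose(static_image_mode=False) as pose:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.pose_world_landmarks:
                # 3D world landmarks: real-world meters, origin at the center of the hips.
                # This is the space that corresponds to the JS BlazePose output.
                world = results.pose_world_landmarks.landmark
                # Normalized image landmarks: x/y in [0, 1] relative to the frame,
                # so they are NOT directly comparable to the world coordinates.
                image_space = results.pose_landmarks.landmark
    cap.release()

So if you want the Python data to line up with what the React app gets from the webcam, export pose_world_landmarks rather than pose_landmarks.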
