Mediapipe `max_num_hands` not working properly

Question

I am trying to extract hand poses using Mediapipe. Even though `max_num_hands` is set to 2, `results.multi_hand_landmarks` sometimes contains 3 entries, which indicates 3 hands. I checked this with `print(len(results.multi_hand_landmarks))`.

The function I use to extract the pose (the commented-out code is for display):

import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
hands = mp_hands.Hands(static_image_mode=False,
                       max_num_hands=2,
                       min_detection_confidence=0.5,
                       min_tracking_confidence=0.5,
                       )

def extract_pose(frame, hands):
    frame = cv2.flip(frame, 1)
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = hands.process(frame_rgb)
    if results.multi_hand_landmarks:
        print(len(results.multi_hand_landmarks))
        # for hand_landmarks in results.multi_hand_landmarks:
        #     for landmark in hand_landmarks.landmark:
        #         x = int(landmark.x * frame.shape[1])
        #         y = int(landmark.y * frame.shape[0])
        #
        #         cv2.circle(frame, (x, y), 5, (0, 255, 0), -1)
    # cv2.imshow("Hand Pose", frame)

And the code to read images:

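for idx in range(960, 964):
    path = f'data/frame/s1_Jeremy/s1_Jeremy_2/s1_Jeremy_2_{idx:04d}.jpg'
    print(path)
    img = cv2.imread(path, cv2.IMREAD_COLOR)
    if img is None:
        # cv2.imread returns None when the file is missing or unreadable
        continue

    # The same `hands` object is reused for every frame
    extract_pose(img, hands)
    cv2.waitKey(0)
    cv2.destroyAllWindows()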

For some reason, this only happens when I process consecutive frames one after another (as in the loop above). The extra pose detected is very close to one of the existing hand poses.

How do I force Mediapipe to return at most 2 poses, each from a different hand, so that it never extracts a pose from the same hand twice?
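
One possible workaround, sketched here under assumptions rather than as a confirmed fix, is to post-filter the results: drop any detected hand whose landmarks lie almost on top of a hand that was already kept, then cap the list at 2. The helper names (`hand_distance`, `dedupe_hands`, `to_xy`) and the `min_separation=0.05` threshold are hypothetical and would need tuning on real frames. The filter works on plain `(x, y)` tuples in Mediapipe's normalized image coordinates, so it can be tried without Mediapipe installed.

```python
import math


def hand_distance(hand_a, hand_b):
    """Mean Euclidean distance between corresponding landmarks of two hands."""
    return sum(math.dist(p, q) for p, q in zip(hand_a, hand_b)) / len(hand_a)


def dedupe_hands(hands_xy, max_hands=2, min_separation=0.05):
    """Keep at most `max_hands` hands that are at least `min_separation` apart.

    hands_xy: list of hands, each a list of (x, y) tuples in normalized
    coordinates. `min_separation` is an assumed threshold, not a Mediapipe
    parameter.
    """
    kept = []
    for hand in hands_xy:
        # Keep the hand only if it is not a near-copy of one already kept
        if all(hand_distance(hand, other) >= min_separation for other in kept):
            kept.append(hand)
        if len(kept) == max_hands:
            break
    return kept


def to_xy(hand_landmarks):
    """Convert one entry of results.multi_hand_landmarks to (x, y) tuples."""
    return [(lm.x, lm.y) for lm in hand_landmarks.landmark]
```

Inside `extract_pose`, this could be used as `unique = dedupe_hands([to_xy(h) for h in results.multi_hand_landmarks])`. Since the duplicate only appears across consecutive frames, it may also be worth trying `static_image_mode=True`, which according to the Mediapipe documentation runs detection on every image instead of tracking from the previous frame; whether that removes the extra hand in this case is untested.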
