I want to add an action_recognition calculator node to the pose landmark detector (pose_landmark_gpu.pbtxt). Does anyone know whether a calculator implementation suited to that purpose already exists?
Specifically, it should provide:

  • Input: pose landmarks
  • Inference via tflite model
  • Output: probability values for the respective action classes

I've seen that the original pose landmark detector uses tensors_to_landmarks_calculator.cc. I would need a similar file, but for different input and output types. Is there a "template" .cc file that I could adapt to my use case?

Just for better understanding, here is my edited pbtxt of the pose_landmark detector with an additional node for action classification:

# GPU buffer. (GpuBuffer)
input_stream: "input_video"

output_stream: "output_video"  # Output image with rendered results. (GpuBuffer)
output_stream: "pose_landmarks"  # Pose landmarks. (NormalizedLandmarkList)
output_stream: "action_detection"  # Action Probabilities

node {
  calculator: "FlowLimiterCalculator"
  input_stream: "input_video"
  input_stream: "FINISHED:output_video"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_input_video"
}

# Subgraph that detects poses and corresponding landmarks.
node {
  calculator: "PoseLandmarkGpu"
  input_stream: "IMAGE:throttled_input_video"
  output_stream: "LANDMARKS:pose_landmarks"
  output_stream: "DETECTION:pose_detection"
  output_stream: "ROI_FROM_LANDMARKS:roi_from_landmarks"
}

# Subgraph that renders pose-landmark annotation onto the input image.
node {
  calculator: "PoseRendererGpu"
  input_stream: "IMAGE:throttled_input_video"
  input_stream: "LANDMARKS:pose_landmarks"
  input_stream: "ROI:roi_from_landmarks"
  input_stream: "DETECTION:pose_detection"
  output_stream: "IMAGE:output_video"
}

# Subgraph that detects actions from poses
node {
  calculator: "ActionDetectorGPU"
  input_stream: "LANDMARKS:pose_landmarks"
  output_stream: "ACTION:action_detection"
}

Update
There is an open-source project called SigNN that does the same thing I intend to do, except for hand pose classification (into American Sign Language letters). I'm going to work through it.

mcExchange

1 Answer

Here is a more general formulation of a similar problem. There is a solution using MediaPipeUnityPlugin. The same graph should also work in pure MediaPipe, although no driver code had been released at the time of writing.
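
For illustration only, here is a rough sketch of how the ActionDetectorGPU subgraph from the question might be assembled from calculators that ship with MediaPipe, rather than from a new .cc file. This is my own assumption, not the graph from the linked solution. The stream names, the model path, and the label map path are placeholders, and the tags and options should be checked against the MediaPipe version in use. In particular, I am assuming that LandmarksToTensorCalculator is available and that the landmarks arrive as a LandmarkList; a NormalizedLandmarkList such as pose_landmarks may need a different tag or a conversion step first.

```
# Hypothetical subgraph; every stream name and file path here is a placeholder.
type: "ActionDetectorGPU"
input_stream: "LANDMARKS:landmarks"             # assumed to be a LandmarkList
output_stream: "ACTION:action_classifications"  # ClassificationList

# Packs the selected landmark attributes into one flat float tensor.
node {
  calculator: "LandmarksToTensorCalculator"
  input_stream: "LANDMARKS:landmarks"
  output_stream: "TENSORS:landmark_tensors"
  options: {
    [mediapipe.LandmarksToTensorCalculatorOptions.ext] {
      attributes: [ X, Y, Z ]
      flatten: true
    }
  }
}

# Runs the TFLite action classifier on the landmark tensor.
node {
  calculator: "InferenceCalculator"
  input_stream: "TENSORS:landmark_tensors"
  output_stream: "TENSORS:action_tensors"
  options: {
    [mediapipe.InferenceCalculatorOptions.ext] {
      model_path: "path/to/action_classifier.tflite"  # placeholder
    }
  }
}

# Converts the output scores into labelled class probabilities.
node {
  calculator: "TensorsToClassificationCalculator"
  input_stream: "TENSORS:action_tensors"
  output_stream: "CLASSIFICATIONS:action_classifications"
  options: {
    [mediapipe.TensorsToClassificationCalculatorOptions.ext] {
      label_map_path: "path/to/action_labels.txt"  # placeholder
    }
  }
}
```

The model's input shape has to match what the first node emits (number of landmarks times number of selected attributes), and the subgraph still has to be registered in the build before the main graph can reference it by name.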

mcExchange