
I am currently implementing a multi-class semantic segmentation pipeline for my custom dataset. After training and evaluation, I get a mask for each of my classes/objects. From these output masks, I need to estimate the pose of the object belonging to one particular class.

My plan for the next step is to use either keypoint detection (similar to human pose estimation) or panoptic segmentation. Am I on the right track? If so:

  1. Which would be the better option?
  2. Can you link an implementation you have come across for a similar problem?

Thanks in advance!!

1 Answer


If you are able to label your custom dataset with keypoints, that is the way to go, especially for non-rigid objects. There are also methods for few-shot (or zero-shot) keypoint detection (https://github.com/AlanLuSun/Few-shot-keypoint-detection).

It is not clear from your question whether your dataset consists of objects with only a few degrees of freedom. If it does, you can simply align them in a common coordinate frame.
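One possible reading of "align them in coordinates" for a rigid, elongated object is to take the position and in-plane orientation straight from the predicted mask. The sketch below is an assumption about what that could look like, not a method from the question: `mask_pose_2d` is a hypothetical helper that uses the centroid of the foreground pixels as the position and the principal axis (PCA on pixel coordinates) as the orientation. It only gives a 2D pose in the image plane.

```python
import numpy as np

def mask_pose_2d(mask):
    """Estimate a 2D pose from a binary mask (hypothetical helper).

    Returns (centroid_xy, angle_rad), where angle_rad is the direction of
    the principal axis in image coordinates, folded into [0, pi).
    """
    ys, xs = np.nonzero(mask)
    if xs.size < 2:
        raise ValueError("mask needs at least two foreground pixels")

    # Pixel coordinates as (x, y) rows.
    pts = np.stack([xs, ys], axis=1).astype(float)
    centroid = pts.mean(axis=0)

    # PCA: the eigenvector with the largest eigenvalue of the covariance
    # matrix is the direction along which the mask is most elongated.
    cov = np.cov((pts - centroid).T)
    eigvals, eigvecs = np.linalg.eigh(cov)
    major = eigvecs[:, np.argmax(eigvals)]

    # An axis has no head or tail, so fold the angle into [0, pi).
    angle = np.arctan2(major[1], major[0]) % np.pi
    return centroid, angle

# Usage: a synthetic horizontal bar standing in for the class mask.
demo = np.zeros((20, 40), dtype=np.uint8)
demo[8:12, 5:35] = 1
centroid, angle = mask_pose_2d(demo)
print(centroid, np.degrees(angle))
```

Because the image y axis points down, the angle is measured clockwise on screen. For a symmetric object the principal axis cannot tell which end is which; resolving that needs extra information such as a labelled keypoint.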

Sergei Chicherin
  • Thanks!! To clarify: my custom dataset contains a fixed pipe along with some background (which I treat as multiple classes, and whose pose I don't want to estimate). I need to estimate the pose of this fixed pipe. My idea is to add a keypoint called 'pipe_centre' and use it to estimate the pose. Does that make sense? – Dr.strange Jul 20 '23 at 11:12
  • Yes. If you have a fixed, symmetric pipe and know its length, then two points, pipe_center and pipe_end (or just the two ends), are enough to determine its position and orientation in space. For more complicated objects the idea is the same. – Sergei Chicherin Jul 21 '23 at 09:55
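
As a rough illustration of the two-keypoint idea from the comment above, here is a minimal sketch in image coordinates. The function name `pipe_pose_2d` and the `half_length_m` parameter are hypothetical. The scale estimate assumes the pipe lies roughly parallel to the image plane; recovering a full 3D pose would additionally need camera calibration, which is not covered here.

```python
import math

def pipe_pose_2d(center_xy, end_xy, half_length_m=None):
    """2D pose from two predicted keypoints (hypothetical helper).

    center_xy, end_xy: (x, y) pixel coordinates of pipe_center and pipe_end.
    half_length_m: optional known centre-to-end distance in metres, used
                   only to derive an approximate image scale.
    """
    dx = end_xy[0] - center_xy[0]
    dy = end_xy[1] - center_xy[1]
    half_length_px = math.hypot(dx, dy)
    if half_length_px == 0:
        raise ValueError("keypoints coincide; orientation is undefined")

    # Direction from the centre towards the labelled end, in radians.
    angle_rad = math.atan2(dy, dx)

    # Known physical length / pixel length gives metres per pixel
    # (only meaningful if the pipe is parallel to the image plane).
    metres_per_pixel = None
    if half_length_m is not None:
        metres_per_pixel = half_length_m / half_length_px

    return {
        "position": (center_xy[0], center_xy[1]),
        "angle_rad": angle_rad,
        "half_length_px": half_length_px,
        "metres_per_pixel": metres_per_pixel,
    }

# Usage with made-up keypoint predictions.
pose = pipe_pose_2d((10.0, 10.0), (10.0, 30.0), half_length_m=0.5)
print(pose)
```

Using a labelled end rather than the mask alone removes the 180 degree ambiguity, since the angle points from the centre to a specific end.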