I am trying to detect objects in my images, mainly shoes. I know that MediaPipe can detect 3D objects (it is a single solution that accepts different models: 'Shoe', 'Chair', 'Cup', 'Camera'), but after running it I only get a bounding box around the object, and I don't see any other output that could help me get a mask.
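For reference, here is a minimal sketch of the kind of call I mean. It assumes the legacy `mp.solutions.objectron` Python API is available in the installed MediaPipe version and that OpenCV is used to load the image. The file name `shoe.jpg` and the helper names `box_to_pixels` and `detect_shoes` are placeholders of mine, not part of MediaPipe. As far as I can tell, each detected object only carries the 2D/3D box landmarks plus rotation, translation and scale, with nothing that looks like a mask.

```python
def box_to_pixels(landmarks_xy, width, height):
    """Convert normalized (x, y) box landmarks to integer pixel coordinates."""
    return [(int(round(x * width)), int(round(y * height))) for x, y in landmarks_xy]


def detect_shoes(image_path):
    """Run Objectron with the 'Shoe' model and return one list of box points per object."""
    # Imported here so the helper above can be used without MediaPipe/OpenCV installed.
    import cv2
    import mediapipe as mp

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    height, width = image.shape[:2]

    with mp.solutions.objectron.Objectron(
        static_image_mode=True,
        max_num_objects=5,
        min_detection_confidence=0.5,
        model_name="Shoe",
    ) as objectron:
        # Objectron expects RGB input, while OpenCV loads images as BGR.
        results = objectron.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

    boxes = []
    # detected_objects is empty/None when nothing is found.
    for obj in results.detected_objects or []:
        points = [(lm.x, lm.y) for lm in obj.landmarks_2d.landmark]
        boxes.append(box_to_pixels(points, width, height))
    return boxes


if __name__ == "__main__":
    # "shoe.jpg" is a placeholder path.
    for box in detect_shoes("shoe.jpg"):
        print(box)
```

So the only geometry I can recover this way is the projected corners of the 3D box, not the outline of the shoe itself.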
Is there any way to get a segmentation mask of the detected object? Something like this: