Template Matching through python API on Linux desktop

Question

I'm following the tutorial on using your own template images to do object 3D pose tracking, but I'm trying to get it working on Ubuntu 20.04 with a live webcam stream.

I was able to successfully make my index .pb file with extracted KNIFT features from my custom images.

It seems the next step is to load the provided template matching graph (mediapipe/graphs/template_matching/template_matching_desktop.pbtxt), replace the index_proto_filename of the BoxDetectorCalculator with my own index file, and run the graph on a video input stream to track my custom object.

I was hoping that would be easiest to do in python, but am running into dependency problems.

(I installed mediapipe python with pip3 install mediapipe)

First, I couldn't find how to directly load a .pbtxt file as a graph in the mediapipe python API, but that's ok. I just load the text it contains and use that.

import os
import mediapipe as mp
template_matching_graph_filepath = os.path.expanduser("~/mediapipe/mediapipe/graphs/template_matching/template_matching_desktop.pbtxt")
graph = mp.CalculatorGraph(graph_config=open(template_matching_graph_filepath).read())
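
Since the graph is loaded as plain text anyway, swapping in my own index file can be done on the string before it is handed to CalculatorGraph. A minimal sketch, assuming the option appears in the .pbtxt as index_proto_filename: "..."; the helper name set_index_proto_filename and the SAMPLE fragment are made up for illustration and are not part of mediapipe:

```python
import re


def set_index_proto_filename(graph_text: str, index_path: str) -> str:
    """Return graph_text with every index_proto_filename value replaced by index_path."""
    pattern = re.compile(r'(index_proto_filename\s*:\s*)"[^"]*"')
    # A function replacement avoids backslash-escape surprises in index_path.
    patched, count = pattern.subn(lambda m: f'{m.group(1)}"{index_path}"', graph_text)
    if count == 0:
        raise ValueError("no index_proto_filename field found in graph text")
    return patched


# Illustrative fragment only (assumed layout), not the full template_matching_desktop.pbtxt.
SAMPLE = '''
node {
  calculator: "BoxDetectorCalculator"
  options: {
    [mediapipe.BoxDetectorCalculatorOptions.ext] {
      index_proto_filename: "mediapipe/models/knift_index.pb"
    }
  }
}
'''

patched = set_index_proto_filename(SAMPLE, "/path/to/my/object_knift_index.pb")
print(patched)
```

The patched string would then be passed as graph_config in place of the raw file contents.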

But I get errors about missing calculators:

No registered object with name: OpenCvVideoDecoderCalculator; Unable to find Calculator "OpenCvVideoDecoderCalculator"

or

[libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/text_format.cc:309] Error parsing text-format mediapipe.CalculatorGraphConfig: 54:70: Could not find type "type.googleapis.com/mediapipe.TfLiteInferenceCalculatorOptions" stored in google.protobuf.Any.

It seems similar to this troubleshooting case but, since I'm not trying to compile an application, I'm not sure how to link in the missing calculators. How do I make the mediapipe python API aware of these calculators?
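
To see up front which calculators a graph asks for (and so which ones the Python package has to have linked in), the .pbtxt can be scanned for its calculator: "..." entries. This is only a sketch with a hypothetical helper name, list_calculators, and a made-up SAMPLE fragment; it works on the text alone, so it also lists subgraph names and does not tell you whether a calculator is actually registered:

```python
import re
from typing import List


def list_calculators(graph_text: str) -> List[str]:
    """Return the sorted, de-duplicated calculator names found in a .pbtxt graph."""
    # Strip '#' comments first so commented-out nodes are not counted.
    lines = [line.split("#", 1)[0] for line in graph_text.splitlines()]
    names = re.findall(r'calculator\s*:\s*"([^"]+)"', "\n".join(lines))
    return sorted(set(names))


# Illustrative fragment only, not the real template matching graph.
SAMPLE = '''
# calculator: "CommentedOutCalculator"
node {
  calculator: "OpenCvVideoDecoderCalculator"
}
node {
  calculator: "BoxDetectorCalculator"
}
'''

print(list_calculators(SAMPLE))
```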

UPDATE: I made decent progress by adding the calculators that the template matching graph depends on to the deps of the builtin_calculators cc_library in the mediapipe/python/BUILD file:

cc_library(
    name = "builtin_calculators",
    deps = [
        "//mediapipe/calculators/image:feature_detector_calculator",
        "//mediapipe/calculators/image:image_properties_calculator",
        "//mediapipe/calculators/video:opencv_video_decoder_calculator",
        "//mediapipe/calculators/video:opencv_video_encoder_calculator",
        "//mediapipe/calculators/video:box_detector_calculator",
        "//mediapipe/calculators/tflite:tflite_inference_calculator",
        "//mediapipe/calculators/tflite:tflite_tensors_to_floats_calculator",
        "//mediapipe/calculators/util:timed_box_list_id_to_label_calculator",
        "//mediapipe/calculators/util:timed_box_list_to_render_data_calculator",
        "//mediapipe/calculators/util:landmarks_to_render_data_calculator",
        "//mediapipe/calculators/util:annotation_overlay_calculator",
...

I also modified solution_base.py so it knows about BoxDetector's options.

from mediapipe.calculators.video import box_detector_calculator_pb2
...
CALCULATOR_TO_OPTIONS = {
'BoxDetectorCalculator':
    box_detector_calculator_pb2
    .BoxDetectorCalculatorOptions,

Then I rebuilt and installed mediapipe python from source with:

~/mediapipe$ python3 setup.py install --link-opencv

Then I was able to make my own class derived from SolutionBase

from typing import NamedTuple

import numpy as np
from mediapipe.python.solution_base import SolutionBase


class ObjectTracker(SolutionBase):
    """Process a video stream and output a video with edges of templates highlighted."""

    def __init__(self, object_knift_index_file_path):
        # object_pose_estimation_binary_file_path (path to the binary graph) is defined elsewhere.
        super().__init__(
            binary_graph_path=object_pose_estimation_binary_file_path,
            calculator_params={"BoxDetector.index_proto_filename": object_knift_index_file_path},
        )

    def process(self, image: np.ndarray) -> NamedTuple:
        return super().process(input_data={'input_video': image})


ot = ObjectTracker(object_knift_index_file_path="/path/to/my/object_knift_index.pb")

Finally, I process a video frame from a cv2.VideoCapture

import cv2

cv_video = cv2.VideoCapture(0)  # default webcam
result, frame = cv_video.read()  # result is False if no frame was grabbed
input_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV delivers BGR frames
res = ot.process(image=input_frame)
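
For the live webcam case, the single read above would turn into a loop. A rough sketch, assuming only that the capture object has a read() method returning (ok, frame) like cv2.VideoCapture does; process_frames is a hypothetical helper, and the colour conversion and tracker call are passed in as callables so the loop itself does not depend on OpenCV or mediapipe:

```python
from typing import Any, Callable, Iterator, Optional


def process_frames(capture: Any,
                   process: Callable[[Any], Any],
                   to_rgb: Optional[Callable[[Any], Any]] = None,
                   max_frames: Optional[int] = None) -> Iterator[Any]:
    """Yield process(frame) for each frame read from a cv2.VideoCapture-like object."""
    count = 0
    while max_frames is None or count < max_frames:
        ok, frame = capture.read()
        if not ok:  # no frame grabbed: stream ended or camera unavailable
            break
        yield process(to_rgb(frame) if to_rgb else frame)
        count += 1


# Intended use with the objects from this question (not executed here):
#   for res in process_frames(cv_video,
#                             lambda f: ot.process(image=f),
#                             to_rgb=lambda f: cv2.cvtColor(f, cv2.COLOR_BGR2RGB)):
#       ...
```

Stopping on a failed read also covers the case where the first read() returns no frame, which the one-shot version would pass straight to cvtColor.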

So close! But I run into this error which I just don't know what to do with.

/usr/local/lib/python3.8/dist-packages/mediapipe/python/solution_base.py in process(self, input_data)
    326         if data.shape[2] != RGB_CHANNELS:
    327           raise ValueError('Input image must contain three channel rgb data.')
--> 328         self._graph.add_packet_to_input_stream(
    329             stream=stream_name,
    330             packet=self._make_packet(input_stream_type,

RuntimeError: Graph has errors: 
Calculator::Open() for node "BoxDetector" failed: ; Error while reading file: /usr/local/lib/python3.8/dist-packages/

Looks like CalculatorNode::OpenNode() is trying to open the python API install path as a file. Maybe it has to do with the default_context. I have no idea where to go from here. :(
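
One thing that can at least be ruled out before the graph is built is the index path itself. The sketch below is not a diagnosis of the error above; it only assumes the index is meant to be a readable, non-empty regular file, and check_index_file is a hypothetical helper name:

```python
import os


def check_index_file(path: str) -> str:
    """Return the absolute, user-expanded path, or raise if it is not a usable file."""
    resolved = os.path.abspath(os.path.expanduser(path))
    if not os.path.isfile(resolved):
        raise FileNotFoundError(f"index file not found or not a regular file: {resolved}")
    if not os.access(resolved, os.R_OK):
        raise PermissionError(f"index file is not readable: {resolved}")
    if os.path.getsize(resolved) == 0:
        raise ValueError(f"index file is empty: {resolved}")
    return resolved
```

The returned absolute path could then be the value passed as object_knift_index_file_path.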

Did you manage to progress with this? – LemurPwned, Aug 25 '22 at 11:32

0 Answers