Is there any way to use two different weight files of Yolov5 for a video?

Question

I have two trained YoloV5 models, one for pen detection and one for pen cap detection (pen.pt, cap.pt).
I want to use both of these models on a single video, so I ran the following command:

! python detect.py --weights cap.pt pen.pt --img 640 --conf 0.50 --source VID_20220727_185703.mp4

It runs properly and detects pens and caps separately, but it labels both of them as cap.

Is there any way to solve this without retraining on the whole dataset?

score 1 · Answer 1 · answered Aug 06 '22 at 12:56

Running two models on the same input is handled in YoloV5 by model ensembling.

Model Ensembling Tutorial clearly defines:

Ensemble modeling is a process where multiple diverse models are created to predict an outcome, either by using many different modeling algorithms or using different training data sets. The ensemble model then aggregates the prediction of each base model and results in one final prediction for the unseen data. The motivation for using ensemble models is to reduce the generalization error of the prediction.

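So, model ensembling can improve mAP and recall during testing and inference, but the models must be trained on the same classes. An ensemble uses a single list of class names; as far as I can tell it takes that list from the first weights file (cap.pt in your command), which would explain why every detection is labeled cap.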

The same point is clarified in issue #1188.

So a workaround here may be to run inference with one model first and then use its output video as the input for inference with the second model.
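A variant of this workaround, which is not what the command above does, is to load both weight files separately and run each model on every frame, so the second model never sees boxes drawn by the first. The following is only a minimal sketch under assumptions: it loads the models through the YOLOv5 PyTorch Hub interface, `merge_detections` and `annotate_video` are hypothetical helper names, and the output file name is made up for illustration.

```python
from typing import List, Sequence, Tuple

# One detection: (x1, y1, x2, y2, confidence, class_id), the column order
# of the rows in YOLOv5's `results.xyxy[0]`.
Detection = Tuple[float, float, float, float, float, int]


def merge_detections(
    per_model: Sequence[Tuple[List[str], List[Detection]]],
) -> Tuple[List[str], List[Detection]]:
    """Combine detections from models trained on different classes.

    Each model's class ids are shifted by the number of classes of the
    models before it, so ids stay unique in the combined name list.
    """
    names: List[str] = []
    merged: List[Detection] = []
    for model_names, detections in per_model:
        offset = len(names)
        names.extend(model_names)
        for x1, y1, x2, y2, conf, cls in detections:
            merged.append((x1, y1, x2, y2, conf, int(cls) + offset))
    return names, merged


def annotate_video(source, output, weights=("pen.pt", "cap.pt"), conf=0.5):
    """Hypothetical helper: run every model on each frame and draw all boxes."""
    # Imported lazily so merge_detections can be used without these packages.
    import cv2
    import torch

    models = [torch.hub.load("ultralytics/yolov5", "custom", path=w) for w in weights]
    for m in models:
        m.conf = conf  # confidence threshold, like --conf on the command line

    cap = cv2.VideoCapture(source)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter(output, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # hub models expect RGB
        per_model = []
        for m in models:
            res = m(rgb, size=640)
            # m.names may be a list or a dict keyed by class id
            model_names = [m.names[i] for i in range(len(m.names))]
            per_model.append((model_names, res.xyxy[0].tolist()))
        names, dets = merge_detections(per_model)
        for x1, y1, x2, y2, score, cls in dets:
            top_left = (int(x1), int(y1))
            cv2.rectangle(frame, top_left, (int(x2), int(y2)), (0, 255, 0), 2)
            cv2.putText(frame, f"{names[cls]} {score:.2f}",
                        (top_left[0], max(top_left[1] - 5, 0)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        writer.write(frame)
    cap.release()
    writer.release()
```

It could then be called as `annotate_video("VID_20220727_185703.mp4", "annotated.mp4")`. Because each model keeps its own class names, pens and caps get their own labels, at the cost of running inference twice per frame.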

