
I'm planning to use YOLO for a supervised CNN regression task: given an image, predict the number of times it will be viewed. I'm inclined to use YOLO because it is an object detector, and highly viewed photos mostly contain objects (faces, animals, text, etc.) that belong to classes in the COCO dataset, on which YOLO was trained.

I already tried pretrained CNN models (VGGNet, MobileNet, etc.) with frozen weights, but the results were not good. Fine-tuning the pretrained models is not an option, since I don't have the computational resources to train on 100K+ images for x epochs just to get a good model for my problem.

desertnaut
Rayleigh

1 Answer


YOLO uses Darknet as its CNN backbone (feature extractor). You may therefore want to try a pre-trained Darknet as the feature extractor and replace its classifier with your regressor. YOLOv3 uses Darknet-53, while YOLOv2 uses Darknet-19.
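Given the limited compute mentioned in the question, one way to act on this is to run each image through the frozen backbone once, cache the feature vectors, and train only a small regression head on them. The following is a minimal sketch of that second step, not Darknet code: the synthetic vectors stand in for backbone features, the names `fit_linear_head` and `predict_views` are hypothetical, and the `log1p` transform of the view counts is an assumption (view counts are often heavy-tailed, so a log scale may be easier to fit).

```python
import math
import random


def fit_linear_head(features, view_counts, lr=0.5, epochs=1500):
    """Fit w, b so that w.x + b approximates log1p(views).

    `features` are assumed to be cached outputs of a frozen backbone;
    only this small linear head is trained (batch gradient descent on MSE).
    """
    n, d = len(features), len(features[0])
    targets = [math.log1p(v) for v in view_counts]  # assumption: log-scale target
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, t in zip(features, targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - t
            for j in range(d):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict_views(w, b, x):
    """Map a feature vector back to a view count (inverse of log1p)."""
    return math.expm1(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Synthetic demo data: 4-dimensional stand-ins for backbone features.
random.seed(0)
true_w, true_b = [2.0, -1.0, 0.5, 3.0], 1.0
demo_features = [[random.random() for _ in range(4)] for _ in range(100)]
demo_views = [
    math.expm1(sum(wi * xi for wi, xi in zip(true_w, x)) + true_b)
    for x in demo_features
]

w, b = fit_linear_head(demo_features, demo_views)
print("learned weights:", [round(wi, 3) for wi in w], "bias:", round(b, 3))
print("predicted views for first sample:", predict_views(w, b, demo_features[0]))
```

Because the backbone is never updated, the expensive forward pass happens once per image, and the head can be retrained cheaply. A real feature vector would be much larger than 4 values, and a small non-linear head may fit better than a linear one.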

Buoy Rina
  • Here is the link to the Darknet site: https://pjreddie.com/darknet/imagenet/ – Buoy Rina Sep 12 '21 at 06:11
  • I spent some time exploring the Darknet architecture. I'm stuck on creating the data file (a parameter needed for `./darknet classifier train ...`), which is expected to contain the `classes` and `labels`. I don't have these because I'm planning to do regression. Do you have any advice that can help me? Thank you very much. – Rayleigh Sep 14 '21 at 08:02