
I want to use YOLOv3 to detect, say, 5 custom object classes for which I already have data. I am going to use my training data for these classes to retrain the YOLO network, starting from pre-trained weights.

Now imagine this case:

After some time I want to add another class to my model. That changes the architecture of the model, so I would need to retrain it with all 5 + 1 classes, right?
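To make the architecture change concrete: in the standard YOLOv3 layout, each detection layer predicts 3 anchor boxes per grid cell, and each box carries 4 coordinates, 1 objectness score, and one score per class. Below is a minimal sketch of the resulting channel count, assuming that 3-anchor layout. The function name is illustrative and not part of any YOLO library.

```python
def yolo_v3_head_filters(num_classes: int, anchors_per_scale: int = 3) -> int:
    """Output channels of one YOLOv3 detection conv layer.

    Each anchor predicts 4 box coordinates + 1 objectness score
    + one score per class.
    """
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return anchors_per_scale * (num_classes + 5)


for n in (5, 6, 20):
    print(n, "classes ->", yolo_v3_head_filters(n), "filters")
# 5 classes -> 30 filters
# 6 classes -> 33 filters
# 20 classes -> 75 filters
```

So going from 5 to 6 classes changes the shape of every detection layer, which is why the trained 5-class head cannot be reused unchanged.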

To avoid this situation, my idea is to fix a maximum number of classes at the beginning, let's say 20. So I build a YOLO architecture with 20 classes and train it on the first 5 classes, for which data is available. When data for a new class becomes available, I will use stochastic gradient descent for online learning to train the model to detect the new class.
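A small sketch of what the reserved class slots could look like on the target side. It assumes per-class sigmoid outputs trained with binary cross-entropy, which is how the YOLOv3 paper describes its class predictions. The class names and helper functions here are hypothetical and only for illustration.

```python
import math

MAX_CLASSES = 20
# Hypothetical names for the 5 classes that have data so far.
KNOWN_CLASSES = ["car", "dog", "bike", "person", "sign"]


def class_target(name, known=KNOWN_CLASSES, max_classes=MAX_CLASSES):
    """One-hot target over all 20 slots; reserved slots stay at 0."""
    target = [0.0] * max_classes
    target[known.index(name)] = 1.0
    return target


def bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over independent per-class outputs."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(probs)


target = class_target("dog")
# Same prediction for the 5 known classes, different values in the reserved slots.
quiet_reserved = [0.1, 0.9, 0.1, 0.1, 0.1] + [0.01] * 15
noisy_reserved = [0.1, 0.9, 0.1, 0.1, 0.1] + [0.50] * 15
print(bce(quiet_reserved, target) < bce(noisy_reserved, target))
```

With this encoding the 15 reserved slots only ever see a target of 0 during the first training phase, so the loss pushes them toward "absent" rather than leaving them undefined.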

Here are my questions:

  • Does the model correctly learn the 5 classes at the beginning, without having data from the other 15 classes?
  • Is it possible to learn new classes bit by bit with stochastic gradient descent?
  • Is there any other convenient way to handle my problem?

Thanks for any advice!

f_3464gh
  • I believe your approach is fine, but it shouldn't make much difference whether you (a) start with 5 classes, add an output neuron, and retrain for every new class, or (b) start with 20 classes and retrain for every new class. It would be interesting to see what difference, if any, crops up between the two. As for SGD: yes, that's the standard way to train these networks. You can also start from the already-trained 5-class network and transfer-learn with the new data. This may be where starting with 20 classes pays off, if YOLO can't transfer-learn when the final layer changes. – jeremy_rutman Dec 04 '19 at 15:03
  • Thanks for your answer. "but it actually shouldn't make much difference whether you a. start with 5 classes and add an output neuron and retrain for every new class, or b. start with 20 classes and retrain for every new class" --> Does it really work, for a simple NN, to just add a neuron and retrain only for that new class? I'm even more unsure whether that works for the YOLO network. – f_3464gh Dec 04 '19 at 15:53
  • OK, I understand now: you only want to retrain with new data for the new class. In that case your method (b) is almost surely easier. It might be possible to freeze the weights between the penultimate layer and the first 5 neurons and train only the weights between the penultimate layer and the new neuron, but I don't think most ML toolkits are set up to allow this; Caffe, for instance, allows retraining whole layers but not parts of them. You may also need to guard against the weights simply shifting so that neuron 6 always wins during the new training, by occasionally mixing in examples of classes 1-5. – jeremy_rutman Dec 04 '19 at 17:12
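The per-neuron freezing described in the last comment can be sketched in plain Python. This is a toy under stated assumptions, not how any particular toolkit does it: the backbone is taken as frozen (so `features` is fixed), each class has its own sigmoid output, and `masked_sgd_step` is a hypothetical helper that only updates the rows listed in `trainable`.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def masked_sgd_step(weights, biases, features, targets, trainable, lr=0.1):
    """One SGD step on a per-class logistic output layer.

    weights[c] is the weight row feeding output neuron c. Only the
    neurons listed in `trainable` are updated; all other rows are frozen.
    """
    for c in trainable:
        z = sum(w * x for w, x in zip(weights[c], features)) + biases[c]
        err = sigmoid(z) - targets[c]  # gradient of BCE w.r.t. the logit
        weights[c] = [w - lr * err * x for w, x in zip(weights[c], features)]
        biases[c] -= lr * err


# Toy sizes: 6 output neurons over a 2-dimensional feature vector.
weights = [[0.3, -0.2] for _ in range(5)] + [[0.0, 0.0]]
biases = [0.1] * 5 + [0.0]
old_rows = [row[:] for row in weights[:5]]

features = [1.0, 0.5]          # output of the (frozen) penultimate layer
targets = [0.0] * 5 + [1.0]    # an example of the new, sixth class
masked_sgd_step(weights, biases, features, targets, trainable=[5])

print(weights[:5] == old_rows)  # rows for classes 1-5 are untouched
print(weights[5], biases[5])    # only the new neuron has moved
```

Because each class has its own independent output here, freezing rows 1-5 leaves their predictions unchanged as long as the features feeding them stay fixed. If the backbone is also trained, the advice about mixing in examples of classes 1-5 still applies.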

0 Answers