I've been going through the recently released tensorflow/models/../object_detection models, particularly Faster R-CNN.
The paper mentions 4-step alternating training, where you would:
- train the RPN, then freeze the RPN layers,
- train the RCNN, then freeze the RCNN layers,
- train the RPN, then freeze the RPN layers,
- train the RCNN.
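To make the schedule I have in mind concrete, here is a rough sketch of the four steps as a small table of which part is trained and which is frozen at each step. This only restates the list above. It is not code from the object_detection repo, the names `SCHEDULE` and `run_schedule` are made up for illustration, and I'm assuming that a part is unfrozen again when its turn to train comes around.

```python
# Hypothetical illustration of the 4-step schedule listed above.
# None of these names come from the object_detection code base.
SCHEDULE = [
    ("rpn", True),    # step 1: train RPN, then freeze it
    ("rcnn", True),   # step 2: train RCNN, then freeze it
    ("rpn", True),    # step 3: train RPN again, then freeze it
    ("rcnn", False),  # step 4: train RCNN, nothing left to freeze for
]


def run_schedule(schedule):
    """Return (step, trained_group, frozen_groups) for each step."""
    frozen = set()
    log = []
    for step, (group, freeze_after) in enumerate(schedule, start=1):
        # Assumption: the group being trained is unfrozen for its own step.
        frozen.discard(group)
        log.append((step, group, sorted(frozen)))
        if freeze_after:
            frozen.add(group)
    return log


for step, trained, frozen in run_schedule(SCHEDULE):
    print(f"step {step}: train {trained}, frozen: {frozen or 'nothing'}")
```

So what I'd expect to find somewhere in the training code is a switch between these four phases, not a single fixed configuration.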
From what I gather, in stage 2 (the RCNN), the RPN is indeed frozen with:
    if self._is_training:
        proposal_boxes = tf.stop_gradient(proposal_boxes)
So training the RPN and freezing its layers, followed by RCNN training, is covered, but where are the remaining steps performed?
Am I missing something?