I was just watching a video on transfer learning (training a model on a larger, similar dataset when your own dataset is small). I am confused about how the two datasets' different labels do not interfere with transfer learning.
I understand that transfer learning is typically used when there is only a small amount of data (let's call this Dataset A) for your target task (say, blurry cat photos), but a large dataset of similar data (let's call this Dataset B, a set of professionally taken, non-blurry wolf photos) whose lower-level features could be reused in learning Dataset A (the intuition being that the same edge and curve detection and other lower-level features that help in detecting wolves in Dataset B could also help in detecting cats in Dataset A).
From what I understand, you would first train the neural network on Dataset B, then set the weights of the last layers to random values and, keeping all other parameters constant, retrain on Dataset A.
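To make sure I have the mechanics right, here is a rough sketch of how I imagine that procedure (my own guess, written with PyTorch; the tiny model, the class counts, and the omitted training loops are placeholders I made up, not anything from the video):

```python
import torch
import torch.nn as nn

# Hypothetical label counts -- the real numbers would come from the two datasets.
NUM_WOLF_CLASSES = 10   # Dataset B (wolves)
NUM_CAT_CLASSES = 5     # Dataset A (cats)

# --- Step 1: train on the large Dataset B ---
# Tiny placeholder backbone; any convolutional network would do here.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, NUM_WOLF_CLASSES),  # output head sized for Dataset B's labels
)
# ... standard supervised training loop over Dataset B would go here ...

# --- Step 2: replace the Dataset-B-specific output layer ---
# The old head's weights are tied to the wolf labels, so it is discarded and
# re-created with random weights, now sized for Dataset A's label set.
model[-1] = nn.Linear(16, NUM_CAT_CLASSES)

# --- Step 3: freeze the earlier layers and retrain only the new head on Dataset A ---
for param in model.parameters():
    param.requires_grad = False       # keep the learned low-level features fixed
for param in model[-1].parameters():
    param.requires_grad = True        # only the new head gets updated

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
# ... standard training loop over Dataset A (blurry cat photos) would go here ...
```

The part I want to check is Step 2: the old output layer, which was built around Dataset B's label scheme, is thrown away entirely rather than reused.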
But given that the label scheme for Dataset B would be for wolves, while the labels for Dataset A are for cats, wouldn't the difference in labeling cause a problem?