8

Is there a way to do transfer learning with a decision tree or a random forest model? Concretely, I was wondering if there is a good and easy way of doing so in Python with a model trained with Scikit-learn.

All I can think of is training a random forest on the original dataset, and when new data arrive, train new trees and add these to your model. However, I wonder if this is a good approach and if there are any other better methods.

IGB
  • 117
  • 1
  • 8

2 Answers2

5

Possible but not practical.

The aim of transfer learning is to give initial weights to the deep learning (DL) models and speed up the learning process. You can find that given one same DL model, when applied to similar applications such as computer vision, all the generated DL models have relative range of values although not entirely significant but better than randomization of weights or even sparse.

Machine learning (ML) models have shallow architectures and you can simply use randomization of weights to train-test the model.

If you insist to do transfer learning, you can use the weights of the previous model you are referred to but make sure you have same input-output data and configure your model accordingly. You will notice that you cannot find transfer learning for ML anywhere and easier, because it is not practical. Better learn from scratch.

Eddy Piedad
  • 336
  • 1
  • 8
  • in addition to speeding up training time, transfer learning can also be used when your training data is relatively small or when you want to 'update' your current model of a new subset of data e.g. when new input data suddenly changes for some reason. I am interested if there is more information on this! – Phillip Maire Feb 28 '22 at 20:39
0

One method to use if you want you are using transfer learning to fit a small target dataset and are not concerned about training time. you could use all or part of the original data then add all of the target data in as well. in addition you could augment the target data to ensure the model is 'more fit' to it. I have done this with light GBM and it works well, so in theory, it should also work fine on a random forest.

Also, I found this thesis that explores the topic in more detail Transfer Learning Using Decision Forests by Noam Segev.

In this work we propose three transfer algorithms based on random forests. Two of our algorithms refine a classifier learned on the source set using the available target set, while the last uses both sets directly during tree induction.

Phillip Maire
  • 323
  • 2
  • 10