
I'm training a transformer model with OpenNMT-py on MIDI music files, but results are poor because I only have access to a small dataset in the style I want to study. To help the model learn something useful, I would like to pre-train it on a much larger dataset of other musical styles and then fine-tune on the small dataset.

I was thinking of freezing the encoder side of the transformer after pre-training and leaving the decoder free during fine-tuning. How would one do this with OpenNMT-py?
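For reference, the two-phase workflow I have in mind looks roughly like this with the standard OpenNMT-py training script (paths and step counts are placeholders), although this by itself does not freeze anything:

# Phase 1: pre-train on the large multi-style dataset
python train.py -data data/pretrain -save_model models/pretrain \
    -encoder_type transformer -decoder_type transformer \
    -train_steps 200000

# Phase 2: fine-tune on the small style-specific dataset,
# resuming from the pre-trained checkpoint
python train.py -data data/finetune -save_model models/finetune \
    -train_from models/pretrain_step_200000.pt \
    -train_steps 220000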

Gianluca Micchi

1 Answer


Please be more specific about your question and show some code; that will help you get a productive response from the SO community.

If I were in your place and wanted to freeze a neural network component, I would simply do:

# Disable gradient updates for every parameter in the encoder
for param in self.encoder.parameters():
    param.requires_grad = False
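If you build the optimizer yourself after freezing, you can also pass it only the parameters that still require gradients (a minimal sketch; model and the learning rate are placeholders):

import torch

# Hand only the still-trainable parameters to the optimizer,
# so the frozen encoder weights are never updated
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable_params, lr=1e-4)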

Here I assume that self.encoder belongs to an NN module like the following.

import torch.nn as nn
from onmt.encoders import TransformerEncoder


class Net(nn.Module):
    def __init__(self, params):
        super(Net, self).__init__()

        # num_layers, d_model, heads, d_ff, dropout, embeddings and
        # max_relative_positions are your hyperparameters / embedding module
        self.encoder = TransformerEncoder(num_layers,
                                          d_model,
                                          heads,
                                          d_ff,
                                          dropout,
                                          embeddings,
                                          max_relative_positions)

    def forward(self, src):
        # write your code
        pass
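Putting the two together, a quick sanity check could look like this (again a sketch; params stands for whatever configuration object you use):

model = Net(params)

# Freeze the encoder before fine-tuning
for param in model.encoder.parameters():
    param.requires_grad = False

# Verify that only the non-encoder parameters remain trainable
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"{trainable} of {total} parameters will be updated")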
Wasi Ahmad
  • I'm using OpenNMT as a black box, following the documentation on http://opennmt.net/OpenNMT-py/. Unfortunately, I don't know PyTorch, so I haven't dug into the source code of the library yet. But your answer might help me understand, thank you. – Gianluca Micchi May 04 '19 at 13:22