
I am training a Transformer model on a time series dataset, but my loss curves always show a gap between training loss and validation loss. I have tried different learning rates, batch sizes, dropout rates, numbers of heads, dim_feedforward values, and numbers of layers, but none of these changes closed the gap. Can anyone suggest ways to reduce it?

[Image: training and validation loss curves]

I also asked this question on the PyTorch forum, under the title "How to design a decoder for time series regression in Transformer?", but did not get a reply.

Jiangtao Liu

1 Answer


Your model is overfitting: it fits the training set better than the validation set. Some things to try:

1. Use more training data.
2. Add dropout layers.
3. Use lasso (L1) or ridge (L2) regularization.
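As a rough illustration of the dropout and lasso/ridge suggestions, here is a minimal PyTorch sketch. The `TinyTSTransformer` class, the `l1_penalty` helper, all layer sizes, and the penalty strengths are hypothetical choices for this example, not taken from the question, and positional encoding is left out for brevity. Dropout is set through the `dropout` argument of `nn.TransformerEncoderLayer`, a ridge-style L2 penalty through the optimizer's `weight_decay`, and a lasso-style L1 penalty is added to the loss by hand.

```python
import torch
import torch.nn as nn


class TinyTSTransformer(nn.Module):
    """Hypothetical encoder-only regressor; all sizes are illustrative."""

    def __init__(self, n_features=1, d_model=32, nhead=4,
                 num_layers=2, dropout=0.3):
        super().__init__()
        self.input_proj = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, dim_feedforward=64,
            dropout=dropout,      # dropout inside attention and feed-forward blocks
            batch_first=True,     # inputs are (batch, seq_len, features)
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):
        h = self.encoder(self.input_proj(x))
        return self.head(h[:, -1])  # predict from the last time step


def l1_penalty(model):
    """Lasso-style penalty: sum of absolute values of all parameters."""
    return sum(p.abs().sum() for p in model.parameters())


torch.manual_seed(0)
model = TinyTSTransformer()

# weight_decay adds an L2 (ridge-style) penalty on the weights.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

# Random stand-in data: 8 sequences of length 20 with 1 feature each.
x = torch.randn(8, 20, 1)
y = torch.randn(8, 1)

# One training step with an additional L1 term in the loss.
model.train()                     # dropout active
optimizer.zero_grad()
loss = nn.functional.mse_loss(model(x), y) + 1e-5 * l1_penalty(model)
loss.backward()
optimizer.step()

# Validation-style forward pass.
model.eval()                      # dropout disabled
with torch.no_grad():
    val_pred = model(x)
```

Calling `model.eval()` before computing validation loss matters here: dropout is active only in training mode, so training and validation losses are not measured under the same conditions. The penalty strengths (`1e-4`, `1e-5`) and the dropout rate are placeholders that would need tuning on the actual dataset.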

DholuBholu