
I am completely new to transformers. I built a transformer-based model that uses only the encoder and positional-embedding parts, stacked 12 blocks deep, to classify around 1 million samples of time-series data. The model is very slow (around half an hour per epoch). My GPU is an RTX 3080 on a laptop. Is it normal for transformers to train this slowly? Is there any way to improve the performance? Is there an easy way to tune the hyperparameters with a highly skewed and very noisy dataset?

I tried different learning rates to speed up the process; 0.001 gives me not-bad results, but training is still very slow. I implemented the model following the TensorFlow implementation.
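For context, here is a minimal sketch of what my model roughly looks like, written with the Keras functional API; all the shapes and sizes below are placeholders rather than my real values:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Placeholder shapes/sizes -- my real values differ
SEQ_LEN = 128      # time steps per sample
N_FEATURES = 8     # features per time step
N_CLASSES = 2      # labels (highly skewed)
D_MODEL = 64       # model (embedding) dimension
N_HEADS = 4
FF_DIM = 128
N_BLOCKS = 12      # 12 stacked encoder blocks

class PositionalEmbedding(layers.Layer):
    """Adds a learned positional embedding to the projected input sequence."""
    def __init__(self, seq_len, d_model, **kwargs):
        super().__init__(**kwargs)
        self.pos_emb = layers.Embedding(input_dim=seq_len, output_dim=d_model)

    def call(self, x):
        positions = tf.range(start=0, limit=tf.shape(x)[1], delta=1)
        return x + self.pos_emb(positions)

def encoder_block(x):
    # Multi-head self-attention with residual connection and layer norm
    attn = layers.MultiHeadAttention(num_heads=N_HEADS, key_dim=D_MODEL // N_HEADS)(x, x)
    x = layers.LayerNormalization(epsilon=1e-6)(x + attn)
    # Position-wise feed-forward network with residual connection and layer norm
    ff = layers.Dense(FF_DIM, activation="relu")(x)
    ff = layers.Dense(D_MODEL)(ff)
    return layers.LayerNormalization(epsilon=1e-6)(x + ff)

inputs = layers.Input(shape=(SEQ_LEN, N_FEATURES))
x = layers.Dense(D_MODEL)(inputs)            # project raw features to the model dimension
x = PositionalEmbedding(SEQ_LEN, D_MODEL)(x)

for _ in range(N_BLOCKS):                    # stack 12 encoder blocks
    x = encoder_block(x)

x = layers.GlobalAveragePooling1D()(x)       # pool over time steps
outputs = layers.Dense(N_CLASSES, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

I then train this with `model.fit` on the full ~1M-sample dataset, which is where each epoch takes about half an hour.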

