I have trained textsum for 5 days with the parameters recommended on the project page. I use a training set with more than 3 million article-summary pairs.
At first `running_average_loss` decreased slowly from around 9 to around 4, but since then its value has fluctuated over a wide range: it is sometimes higher than 5 and sometimes as low as 1. I also tested the model on some articles from the training set, but the output is far from the reference summaries. Can someone share their experience?
I'm confused about the following questions:
- `running_average_loss` is less than 10 every time I run. Is that normal?
- Is it overfitting, since `running_average_loss` varies over a wide range and shows no sign of converging?
- How long does it take to train a good enough model, and when should I stop training? Is there a signal that indicates when to stop?
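
To make the last question concrete, the kind of stopping signal I am asking about might look like the sketch below. It is only an illustration, not something taken from textsum: the helper names `ema` and `should_stop`, the `patience` and `min_delta` parameters, and the idea of feeding it losses measured on a held-out set are all my own assumptions.

```python
def ema(values, decay=0.9):
    """Exponentially smooth a sequence of loss values (hypothetical helper)."""
    smoothed = []
    avg = None
    for v in values:
        # The first value seeds the average; later values are blended in.
        avg = v if avg is None else decay * avg + (1 - decay) * v
        smoothed.append(avg)
    return smoothed


def should_stop(eval_losses, patience=3, min_delta=0.01):
    """Return True when the last `patience` evaluations did not beat the
    earlier best loss by at least `min_delta` (hypothetical stopping rule)."""
    if len(eval_losses) <= patience:
        return False  # not enough history to judge yet
    best_before = min(eval_losses[:-patience])
    best_recent = min(eval_losses[-patience:])
    return best_recent > best_before - min_delta
```

For example, `should_stop(ema(losses))` would smooth a noisy list of evaluation losses before checking for a plateau. Whether a rule like this is appropriate for this model is exactly what I am unsure about.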