I am using the ktrain package to classify text. My experiment is as follows:
lr_find and lr_plot are functions in ktrain. They can be used to highlight the best learning rate, which is shown as the red dot in the plot.
I do not understand how to read this plot:
- How do I convert the log-scale values on the axis to a normal linear scale?
- Why is the red dot the best learning rate?
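
To make the two questions concrete, here is a minimal sketch in plain NumPy. It assumes the x-axis is base-10 logarithmic and that the red dot marks the point where the loss falls fastest, which is a common learning-rate-finder heuristic; whether ktrain computes it exactly this way is an assumption. The helper names `log10_to_linear` and `steepest_descent_lr` and the synthetic loss curve are hypothetical, not part of ktrain.

```python
import numpy as np

def log10_to_linear(exponent):
    """Convert a position on a log10 axis (the exponent) to the actual value."""
    return 10.0 ** np.asarray(exponent, dtype=float)

def steepest_descent_lr(lrs, losses):
    """Return the learning rate at which the loss decreases fastest.

    Assumption: this mirrors the usual LR-finder suggestion, i.e. the point
    with the most negative numerical gradient of the loss curve.
    """
    lrs = np.asarray(lrs, dtype=float)
    losses = np.asarray(losses, dtype=float)
    grads = np.gradient(losses)           # finite-difference slope per step
    return lrs[int(np.argmin(grads))]     # most negative slope

# Synthetic loss curve (made up for illustration): flat, then a drop, then a blow-up.
lrs = np.logspace(-7, 0, 50)              # learning rates from 1e-7 to 1
log_lr = np.log10(lrs)
losses = 1.0 - 0.5 / (1.0 + np.exp(-3.0 * (log_lr + 4.0))) + np.exp(4.0 * (log_lr + 1.0))

best_lr = steepest_descent_lr(lrs, losses)
print("tick at -4 on a log10 axis means lr =", log10_to_linear(-4))
print("learning rate with the steepest loss decrease:", best_lr)
```

Under these assumptions, a tick labelled 10^-4 (or an exponent of -4) is simply the learning rate 0.0001, and the marked point is where one training step reduces the loss the most, well before the loss starts to diverge.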