In the official Keras documentation for the ReduceLROnPlateau class (https://keras.io/api/callbacks/reduce_lr_on_plateau/), it says:
"Models often benefit from reducing the learning rate"
Why is that so? It's counter-intuitive to me, since as far as I know a higher learning rate allows taking larger steps away from the current position.
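
For context, here is a minimal sketch of how the callback is typically wired into training; the factor/patience/min_lr values and the toy model/data below are just illustrative placeholders, not part of my question:

```python
import numpy as np
from tensorflow import keras

# Halve the learning rate whenever val_loss has not improved for 3 epochs,
# never going below 1e-6. These values are illustrative, not recommendations.
reduce_lr = keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss",
    factor=0.5,
    patience=3,
    min_lr=1e-6,
)

# Tiny random dataset and model, only so the snippet runs end to end.
x = np.random.rand(256, 20).astype("float32")
y = np.random.rand(256, 1).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3), loss="mse")

# The callback watches val_loss each epoch and shrinks the optimizer's
# learning rate when it plateaus.
model.fit(x, y, validation_split=0.2, epochs=10,
          callbacks=[reduce_lr], verbose=0)
```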
Thanks!