For a given function, gradient descent may converge to a local minimum that is not the global one. Is there a way to combine simulated annealing with gradient descent to find a better local minimum? Finding the global minimum would be ideal, but it may not be possible, which is why I am looking for a "better" local minimum.
As far as I know, SA will give you a less accurate minimum location, but has a higher chance of finding the global one. So I guess you could run SA first and then, once the loss is low enough, stop and continue with gradient descent. This is just a guess though, so feel free to experiment and tell me how it went! – Antoni Silvestrovič Jan 13 '20 at 01:07
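
To make the "SA first, then gradient descent" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption chosen for illustration rather than something taken from the question: the 1-D test function `f(x) = x^4 - 3x^2 + x` (which has a local minimum near `x ≈ 1.13` and a lower one near `x ≈ -1.30`), the geometric cooling schedule, the Gaussian proposal width, the learning rate, and a fixed step budget in place of the "loss is low enough" stopping rule.

```python
import math
import random


def f(x):
    # Illustrative test function (an assumption): two basins,
    # a local minimum near x = 1.13 and a lower one near x = -1.30.
    return x**4 - 3 * x**2 + x


def grad_f(x):
    # Analytic derivative of f.
    return 4 * x**3 - 6 * x + 1


def simulated_annealing(x0, rng, steps=2000, t_start=2.0, t_end=1e-3, step_size=0.5):
    """Coarse global search; returns the best point visited."""
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    for k in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        cand = x + rng.gauss(0.0, step_size)
        f_cand = f(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse points.
        if f_cand <= fx or rng.random() < math.exp((fx - f_cand) / t):
            x, fx = cand, f_cand
            if fx < best_fx:
                best_x, best_fx = x, fx
    return best_x


def gradient_descent(x0, lr=0.01, steps=500):
    """Local refinement with a fixed learning rate."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x


rng = random.Random(0)                 # fixed seed so runs are repeatable
x_gd_only = gradient_descent(2.0)      # plain GD from x = 2 settles in the nearer basin
x_sa = simulated_annealing(2.0, rng)   # rough location from SA
x_hybrid = gradient_descent(x_sa)      # polish the SA result with GD

print("GD only:", x_gd_only, f(x_gd_only))
print("SA + GD:", x_hybrid, f(x_hybrid))
```

On this toy function, gradient descent started at `x = 2` stays in the right-hand basin, while the SA stage will typically have visited the deeper left-hand basin, so the refinement starts from there. There is no guarantee of that in general: how often SA lands in a better basin depends heavily on the cooling schedule and proposal width, so those are the knobs to experiment with.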