
Sorting algorithms are well enough understood that Java's Collections.sort uses some flavor of MergeSort or Timsort. (Even though it is possible to hand-craft inputs that "fight" the algorithm and perform poorly, those choices are close enough to ideal for most real-world sorting situations.)

Statistical ML algorithms kinda/sorta have winners as well, e.g. "You won't go wrong first trying Logistic Regression, Random Forests, and SVM."

Q: Is there a similar "best of breed" choice among the various global-optimum approximation methods? For example, it seems that particle swarm optimization (PSO) amounts to several simulated annealing processes running in parallel and sharing information...
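(For concreteness, here is a minimal PSO sketch in Python. The objective, swarm size, bounds, and update constants are illustrative placeholders rather than recommendations; the `gbest` term is the "shared information" referred to above.)

```python
# Minimal particle swarm optimization sketch (illustrative, not tuned).
import numpy as np

def fitness(x):
    # Placeholder objective to minimize: the sphere function, minimum at the origin.
    return np.sum(x ** 2)

def pso(n_particles=30, dim=2, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))     # particle positions
    v = np.zeros_like(x)                           # particle velocities
    pbest = x.copy()                               # each particle's own best position
    pbest_val = np.apply_along_axis(fitness, 1, x)
    gbest = pbest[np.argmin(pbest_val)].copy()     # swarm-wide best: the shared information

    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        # Pull each particle toward its own best and toward the global best.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        vals = np.apply_along_axis(fitness, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, fitness(gbest)

print(pso())
```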

Benjamin H
    In a nutshell, no. Sorting is an extremely well-defined problem with easy success metrics. Optimization problems aren't and don't. If there's a standard practice, it's to try a range of algorithms and pick the one that works "best" by your definition on your kind of problem. – Gene Sep 21 '18 at 21:15
  • I think the search for a single best black box optimization routine was pursued less enthusiastically after https://en.wikipedia.org/wiki/No_free_lunch_in_search_and_optimization – mcdowella Sep 22 '18 at 04:54
  • Not looking for a free lunch! Looking for a consistently high-value lunch that provides a healthy option that wouldn't be bad to eat for long periods of time. Something like "Algo X costs more but in general is a good first choice." – Benjamin H Sep 22 '18 at 19:06
  • It depends on the situation. Do you have gradient information (or higher-order information, e.g. Hessians, Fisher info), even as stochastic estimates? Is the problem very high-dimensional? Are there many local minima? Is the "fitness" function very expensive to evaluate? In the latter case I'd use a Bayesian optimization approach. If there is derivative info, I'd use as much as possible, but if the dimension is too high, just the gradient. Outside of this, hill climbing is OK for smoothish landscapes. – user3658307 Oct 06 '18 at 20:24
  • Otherwise we're in the GA, PSO, cross-entropy optimization, etc. domain (the cross-entropy method is sketched after these comments). *Personally*, I tend to feel these are very similar (and can be tuned to be very similar to each other). Another tradeoff might be how much tuning is needed vs. computation time. Basically it turns out that even these are essentially different methods of robust derivative estimation... you still have to tune the tradeoff of more robust vs. better gradient, though. Check out [this paper btw](https://arxiv.org/abs/1712.06568). – user3658307 Oct 06 '18 at 20:27
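(To make the last comment concrete, here is a minimal sketch of the cross-entropy method in Python: sample candidates from a Gaussian, keep an elite fraction, and refit the distribution to the elites. The objective, population size, and all defaults are illustrative placeholders, not recommendations.)

```python
# Minimal cross-entropy method sketch (illustrative, not tuned).
import numpy as np

def fitness(x):
    # Placeholder objective to minimize: the sphere function.
    return np.sum(x ** 2)

def cem(dim=2, pop=100, elite_frac=0.2, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.full(dim, 5.0)      # initial sampling distribution
    n_elite = int(pop * elite_frac)
    for _ in range(iters):
        samples = rng.normal(mean, std, (pop, dim))
        vals = np.apply_along_axis(fitness, 1, samples)
        elite = samples[np.argsort(vals)[:n_elite]]   # keep the best fraction
        # Refit the sampling distribution to the elites; a small floor on the
        # standard deviation keeps it from collapsing prematurely.
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean, fitness(mean)

print(cem())
```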

0 Answers