
I have a problem that can be formulated as a convex optimization problem with a linear objective function and linear equality and inequality constraints, but with an enormous number of parameters. I can solve this problem in reasonable time on a single machine with a couple hundred thousand parameters, but not with the couple million I need.

I see that Aaron Staple/Databricks have implemented some portions of the Matlab TFOCS library in Spark, but the only examples I see solve unconstrained convex optimization or linear programs with canonical constraints (Ax = b, x >= 0, for a matrix A, a vector b, and x the vector of parameters to optimize over). But I need to solve a linear program with arbitrary linear equality and linear inequality constraints.
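For what it's worth, a general LP with both equality and inequality constraints can always be rewritten in the canonical form above (Mz = q, z >= 0), by splitting each free variable into a difference of two nonnegative parts and adding a slack variable per inequality. A minimal NumPy sketch of that reduction (the function name `to_canonical` and the dense-matrix representation are my own illustration, not part of Spark TFOCS):

```python
import numpy as np

def to_canonical(c, A_eq, b_eq, A_ub, b_ub):
    """Reduce  min c.x  s.t.  A_eq x = b_eq,  A_ub x <= b_ub  (x free)
    to canonical form  min c2.z  s.t.  M z = q,  z >= 0,
    via x = xp - xn (xp, xn >= 0) and one slack variable s >= 0
    per inequality row: A_ub x + s = b_ub."""
    m_eq, n = A_eq.shape
    m_ub = A_ub.shape[0]
    # z = [xp, xn, s]; equality rows get zero slack columns
    M_top = np.hstack([A_eq, -A_eq, np.zeros((m_eq, m_ub))])
    M_bot = np.hstack([A_ub, -A_ub, np.eye(m_ub)])
    M = np.vstack([M_top, M_bot])
    q = np.concatenate([b_eq, b_ub])
    # slacks carry zero objective cost
    c2 = np.concatenate([c, -c, np.zeros(m_ub)])
    return c2, M, q
```

The caveat is that this roughly doubles the variable count (plus one variable per inequality), which may matter at the scale in question, and a distributed solver would of course keep M sparse rather than dense.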

Anyone know if there are capabilities in Spark TFOCS that I'm missing that can solve my problem? Other ways to tackle this problem with available Spark tools?
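As a single-machine baseline (useful for validating a distributed formulation on a downsampled problem), an LP with arbitrary equality and inequality constraints and free variables can be handed directly to `scipy.optimize.linprog`; note `bounds=(None, None)` to lift the default nonnegativity bounds. This is just a small sanity-check sketch, not a solution at the multi-million-parameter scale:

```python
from scipy.optimize import linprog

# min  -x - 2y
# s.t.  x + y <= 4   (inequality)
#       x - y  = 1   (equality)
#       x, y free
res = linprog(
    c=[-1.0, -2.0],
    A_ub=[[1.0, 1.0]], b_ub=[4.0],
    A_eq=[[1.0, -1.0]], b_eq=[1.0],
    bounds=(None, None),  # lift the default x >= 0 bounds
    method="highs",
)
# optimum at x = 2.5, y = 1.5, objective -5.5
print(res.x, res.fun)
```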

goodepic
  • Usually for really large LPs you can resort to some decomposition principle, e.g., Dantzig-Wolfe decomposition with column generation. This usually involves decomposing the problem itself (mathematically) outside of the solver/modelling software, and thereafter using software to solve the "equivalent" problem to find optimal solutions that are likewise (by transformation) optimal in the original problem. Possibly (unless someone knows some neat tricks specific to this solver of yours) this could be one approach to keep in mind. – dfrib Jan 14 '16 at 01:18
  • Decomposition for LP is nowadays somewhat out of favor. To solve very large problems I would suggest to try out a commercial solver like Cplex, Gurobi or Mosek. Make sure to try out their interior point algorithms. Also if you have access to multi-core machines then you may want to try out the parallel versions. My guess would be that distributed LP solvers (as far as they exist) will not be competitive. – Erwin Kalvelagen Jan 14 '16 at 03:00
  • I agree with @Erwin Kalvelagen, the problem with Benders' and Dantzig-Wolfe decomposition is that they are designed for specially structured problems (meaning the constraint matrix) and sometimes do worse than regular Simplex on unstructured problems. From my experience, Cplex is extremely fast in terms of solving LPs, plus if you have RAM issues you can make it use the hard drive instead. I tested Cplex on two-stage stochastic programs (more than 5 million variables and constraints in the extensive form) and Cplex was done in half an hour. You might be eligible for the IBM academic program to get Cplex. – serge_k Jan 14 '16 at 07:47
  • Note: it's under development and will soon be released in spark packages. – Ehsan M. Kermani Apr 19 '16 at 15:16

0 Answers