
I am trying to use CVXPY to solve a linear program of the following form:

10 people respond to a survey asking for their household, age, sex, race, and generation. From these responses, I have written up many constraints and statistics such as "average age of all 10 individuals = 40" or "number of single-parent households = 0". Each field (a given person's age, race, etc.) is represented as a CVXPY Variable. The goal is to use these constraints to regenerate the original survey responses: pretend that an outsider sees the published constraints without seeing the survey responses and wants to determine what each individual originally answered.

I can encode my constraints as CVXPY constraints, but then I have no objective function to maximize, as all I have are many constraints. Is there a way to encode my objective function to return the number of constraints that are met by a given assignment of variables, so that the objective function is maximized when all constraints are met? I can't tell if there is a way to do this from the CVXPY documentation. Alternatively, is there another open-source optimizer that is better suited to solving this program? I have already solved it using a SAT solver, and want to do it with a nonlinear optimizer now.

Example data is in the following format, with columns in the order ID, Household #, Age, Sex, Race, Generation. ID is irrelevant and is only used to help me keep track of line numbers in other code.

1 1 80 1 1 2
2 1 40 0 0 1
3 1 70 1 0 2
4 1 30 1 1 1
5 1 90 0 0 2
6 2 10 0 1 0
7 2 10 0 1 0
8 2 10 1 0 0
9 2 40 1 0 1
10 2 20 0 1 1
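
As a sanity check on the rows above, here is a minimal sketch in plain Python that parses them and recomputes two statistics mentioned in this thread: the average age of 40 and the count of 3 children. The names `Person` and `people` are hypothetical, and the coding "Generation 0 = child" is assumed from the asker's later comment.

```python
from collections import namedtuple

# Columns as listed in the question: ID, Household #, Age, Sex, Race, Generation.
ROWS = """\
1 1 80 1 1 2
2 1 40 0 0 1
3 1 70 1 0 2
4 1 30 1 1 1
5 1 90 0 0 2
6 2 10 0 1 0
7 2 10 0 1 0
8 2 10 1 0 0
9 2 40 1 0 1
10 2 20 0 1 1
"""

# Hypothetical record type, used only for this illustration.
Person = namedtuple("Person", "id household age sex race generation")
people = [Person(*map(int, line.split())) for line in ROWS.strip().splitlines()]

# "average age of all 10 individuals = 40"
mean_age = sum(p.age for p in people) / len(people)

# "There are 3 children in the data" (assuming Generation 0 = child).
num_children = sum(1 for p in people if p.generation == 0)

print(mean_age, num_children)
```

Each published statistic is a function of the rows like these two, which is what makes it usable as a constraint on the unknown responses.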

Thank you, Christian

  • Is there any reason you want to do this in CVX? Further, can you provide some example data? – Jonas Adler Jul 19 '17 at 13:52
  • Counting feasible constraints results in a model with binary variables. – Erwin Kalvelagen Jul 19 '17 at 16:10
  • A small follow-up on Erwin's comment: you can easily rewrite your constraints in indicator form, e.g. x>=5 <-> x-5>=0 => x-4>=b0, where b0 is a binary variable (one for each constraint), and build an objective as Max(sum(b)). But this is now a mixed-integer program (hard to solve), and the solvers available within cvxpy are bad (default=ECOS_BB) unless you can use the supported commercial ones like Gurobi and Cplex, except for CBC, which needs a non-trivial install step. – sascha Jul 19 '17 at 18:11
  • Furthermore, this task is more suited to stochastic optimization, where you could pose an assumption such as: response x is normally distributed with unknown mean and variance. But this is even harder (although even simple assumptions should make the result much better)! – sascha Jul 19 '17 at 18:14
  • There seems to be some confusion here about indicator constraints in the comments. They have the form `b0=1 => x>=5`. Traditionally this is written as `x >= 5 - M*(1-b0)`. – Erwin Kalvelagen Jul 20 '17 at 11:26
  • @ErwinKalvelagen Of course you are right. I don't know how I came up with that hack, but in my defence: the example does not need a big-M constant in this case. – sascha Jul 20 '17 at 12:40
  • @JonasAdler The only reason I chose CVXPY is that it can support different solvers depending on what type of problem you have. If you know a better solver for this specific problem, please let me know. Example data is as follows, where the columns in order are ID # (unimportant), Household #, Age, Sex, Race, Generation. Race is 0=White and 1=Black, Sex is 0=Male, Generation is 0=child, 1=parent, 2=grandparent. Rows: 1 1 80 1 1 2; 2 1 40 0 0 1; 3 1 70 1 0 2; 4 1 30 1 1 1; 5 1 90 0 0 2; 6 2 10 0 1 0; 7 2 10 0 1 0; 8 2 10 1 0 0; 9 2 40 1 0 1; 10 2 20 0 1 1 – cmartindale Jul 20 '17 at 13:21
  • Can you edit the post to include the data? – Jonas Adler Jul 20 '17 at 13:29
  • @sascha Thanks for the advice on the indicators - that was what I needed. I have one more question: If I need to encode the constraint "There are 3 children in the data", I did that in a SAT solver by defining if(Generation_person_1 == child) return 1, else return 0, and then summing those for all 10 people and constraining that sum to be 3. Is there a way to do the equivalent in an optimizer? I can't find a way to do the equivalent of an if statement like in the "method" above. Thank you! – cmartindale Jul 20 '17 at 13:30
  • @JonasAdler Added - sorry about the bad formatting in the comment, I'm new to stackoverflow. – cmartindale Jul 20 '17 at 13:31
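
The two ideas discussed in the comments can be sketched in plain Python. This is not CVXPY code: brute-force enumeration stands in for a mixed-integer solver, and the three toy constraints, the bound `0 <= x <= 10`, the value `M = 10`, and the helper name `feasible` are all assumptions made for illustration. The first part uses the big-M form `x >= 5 - M*(1-b0)` from Erwin's comment to maximize the number of satisfied constraints. The second part shows one way to express "if Generation == child, count 1" without an if statement, by giving each person one binary per generation value.

```python
from itertools import product

# Part 1: count satisfied constraints with big-M indicators.
# Toy constraints (hypothetical): x >= 5, x <= 3, x >= 7, with integer x in 0..10.
M = 10  # large enough to switch off any of the three constraints when 0 <= x <= 10

def feasible(x, b):
    """b[i] = 1 forces constraint i to hold; b[i] = 0 relaxes it via big-M."""
    b1, b2, b3 = b
    return (
        x >= 5 - M * (1 - b1)
        and x <= 3 + M * (1 - b2)
        and x >= 7 - M * (1 - b3)
    )

# Enumerate every (x, b) pair and keep the feasible one with the largest sum(b).
best = max(
    (sum(b), x, b)
    for x in range(11)
    for b in product((0, 1), repeat=3)
    if feasible(x, b)
)
print("max satisfied constraints:", best[0])

# Part 2: counting children without an if statement.
# Generation column from the question's data (assuming 0 = child, 1 = parent,
# 2 = grandparent, as stated in the asker's comment).
generations = [2, 1, 2, 1, 2, 0, 0, 0, 1, 1]

# One binary per person per generation value; in a solver these would be the
# decision variables, here they are simply derived from the known data.
one_hot = [[int(g == k) for k in range(3)] for g in generations]

# Linking constraints a solver would enforce:
#   exactly one generation per person, and generation = 0*c + 1*p + 2*gp.
assert all(sum(row) == 1 for row in one_hot)
assert all(sum(k * row[k] for k in range(3)) == g for row, g in zip(one_hot, generations))

# "There are 3 children" becomes a linear constraint on the child binaries.
children = sum(row[0] for row in one_hot)
print("children:", children)
```

In CVXPY the binaries would, as far as I know, be declared with `cp.Variable(boolean=True)`, which turns the model into a mixed-integer program and therefore needs a solver that supports integer variables, as the comments point out.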
