8

23 pupils from level A, 24 from level B and 30 from level C need to be assigned into three classes. The classes need to be almost exactly of the same size. Different levels can be mixed into a single class, however it is better if it can be avoided. In any case, there should be either 0 pupils from one level in a class, or more than 6.

Can you help me solve this combinatorial optimization problem? Below is a sample input and output. Bonus points if you can show me how to solve the general problem!

Input:

pupils = { "A" : 23, "B" : 24, "C": 30 }

Example output (not very good!)

Class #1 : {'A': 9,  'B': 6, 'C': 10},
Class #2 : {'A': 10, 'B': 9, 'C': 7},
Class #3 : {'A': 11, 'B': 9, 'C': 6}

Edit: Here is my very hackish, completely undocumented, semi-bruteforce code. It's ugly, but it works! I'd love to learn how I could write a more elegant solution.

static_rtti
  • 53,760
  • 47
  • 136
  • 192
  • You say the example output is “not very good”. What's wrong with it? Why would any other 25-26-26 split be any better? – James Waldby - jwpat7 Jun 09 '13 at 18:57
  • @jwpat7 : as explained in the description, it's not very good because there are three levels in all three classes, which is harder for the teacher and best avoided. The constraint that classes should be the same size is stronger than this one, though. – static_rtti Jun 09 '13 at 19:07
  • How are you solving it now? As in, are you using an optimization solver? If so, what are your constraints and objectives? – user327301 Jun 09 '13 at 19:40
  • @raoulcousins : not knowing better, I've started writing a semi-bruteforce algorithm (I try to eliminate bad options as soon as possible). That may well work out, but I'd love to know if there are any better or more elegant options. – static_rtti Jun 09 '13 at 19:48
  • Ok, I'll write you up a quick optimization model to generalize it. What programming language are you using? How did you choose the bounds <= 0 or >= 6? If you had N levels and p_i pupils in each level i in N, what would the bounds <= 0 or >= 6 generalize to? – user327301 Jun 09 '13 at 20:18
  • @raoulcousins: thanks! I use python most of the time, but anything available on linux is fine. 0 is rather obvious, as you can't have negative pupils. 6 is arbitrary, it just means that you don't want to have to few pupils of a level in a classroom because you'll spend a lot of time on them for no good reason. But it should be a parameter, and not set in stone. – static_rtti Jun 09 '13 at 20:24
  • Just to give you an idea, here's my current, very hackish, completely undocumented solution: http://pastie.org/8027658 – static_rtti Jun 09 '13 at 20:58

1 Answers1

18

The fundamental difficulty here is that you sort of have a multiobjective optimization problem. You have three things I think you're interested in that you can either consider objectives or "soft constraints":

  1. Getting similar class sizes
  2. Minimizing number of levels per class
  3. Having enough students from a level in a class if there are any students in a class.

Note that I've written an optimization model for this in AMPL. Since you're using Python, there are similar optimization modeling languages for python like PuLP and pyomo that you could use. The model should not be too difficult to translate.

Here is an integer programming model and a data file that emphasizes objective number 1 while keeping the problem (integer) linear. With this objective, the optimization problem finds the same solution you gave in your example. Hopefully you can build on this and add other constraints and/or objective terms and get better solutions.

The objective is to minimize the largest class size. The variable of interest is y[i,j]. y[i,j] for i in LEVEL, j in CLASS is the number of students from level i assigned to class j. It assumes that you have input for the minimum number of students from each level in each class if any are assigned to that level.

The objective function may not be what you want, but it is a way of trying to equalize the class sizes which is linear. I also don't promise that this is the most efficient way of solving the problem. There may be a better custom algorithm for the problem, but all I had to do was express the constraints and objective and not write an algorithm. It's probably good enough for your use.

Using the solver Gurobi on neos-server.org (you could use lpsolve or another open source optimization solver), I got the solution

y :=
1 1   14
1 2    9
1 3    0
2 1    6
2 2    0
2 3   18
3 1    6
3 2   16
3 3    8
;

Model:

set LEVEL ordered;
set CLASS;

param maxClassSize {CLASS};
param minLevelNumberInClass {LEVEL, CLASS};
param numInLevel {LEVEL};

var z >= 0;
var y{LEVEL, CLASS} integer, >= 0;
var x{LEVEL, CLASS} binary;

#minimize maximum class size
minimize obj: 
  z;

subject to allStudentsAssigned {i in LEVEL}:
  sum {j in CLASS} y[i,j] = numInLevel[i];

#z is the largest of all classes sizes
subject to minMaxZ {j in CLASS}:
  z >= sum {i in LEVEL} y[i,j];

subject to maxClassSizeCon {j in CLASS}:
  sum {i in LEVEL} y[i,j] <= maxClassSize[j];

#xij = 1 if any students from level i are in class j
subject to defineX {i in LEVEL, j in CLASS}:
  y[i,j] <= min(numInLevel[i], maxClassSize[j]) * x[i,j];

#if any students from level i are assigned to class j, then there is a minimum
#if x[i,j] = 1,  y[i,j] >= minLevelNumberInClass[i,j]
subject to minLevel {i in LEVEL, j in CLASS}:
  minLevelNumberInClass[i,j] * x[i,j] <= y[i,j];

Data file for your example:

set LEVEL := 1 2 3;
set CLASS := 1 2 3;

param minLevelNumberInClass:  
  1 2 3 :=
1 6 6 6
2 6 6 6
3 6 6 6
;

param maxClassSize :=
1 77
2 77
3 77
;

param numInLevel :=
1 23
2 24
3 30
;
user327301
  • 2,641
  • 2
  • 26
  • 33
  • Thanks, that was a great introduction to the world of Linear Programming. Thanks a lot for taking the time to write this up. – static_rtti Jun 10 '13 at 20:06
  • 1
    No problem. I really threw you into the deep end if that was an introduction. If you're interested in writing good optimization models, Model Building in Mathematical Programming by H. Paul Williams is a great applied book. – user327301 Jun 11 '13 at 02:02
  • Quick follow-up question: is it possible to get more than one optimal answer? This problem has quite a few optimal answers, and it's interesting to see them all. – static_rtti Jun 11 '13 at 07:21
  • 1
    Yes, but that deserves its own post. This has some details: http://orinanobworld.blogspot.com/2012/05/k-best-solutions-in-amplcplex.html – user327301 Jun 11 '13 at 14:21