
I have a normal assignment problem, where I want to match workers to jobs. But there are several kinds of jobs, each with a fixed number of positions. So for example I would need 10,000 builders, 5,000 welders etc. Each worker of course has the same preference for every position of the same kind of job.

My current approach is to use the Hungarian Algorithm and to just extend the matrix columns to account for that. So for example it would have 10,000 builder columns, 5,000 welder columns etc. Of course with O(n^3) and a matrix that big, getting results may take a while.
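To make the column-extension idea concrete, here is a minimal sketch in Python. It assumes SciPy's `linear_sum_assignment` as the assignment solver (not necessarily the implementation I am using), and the function name and the toy numbers are made up for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def assign_by_column_expansion(cost, capacity):
    """Solve by repeating each job-type column once per open position.

    cost:     (workers x job types) array, lower is better.
    capacity: number of positions per job type.
    Hypothetical helper, for illustration only.
    """
    cost = np.asarray(cost)
    capacity = np.asarray(capacity)
    # Column k of the expanded matrix belongs to job type col_to_type[k].
    col_to_type = np.repeat(np.arange(len(capacity)), capacity)
    expanded = cost[:, col_to_type]  # workers x total positions
    rows, cols = linear_sum_assignment(expanded)
    # -1 marks a worker left without a job (only if workers > positions).
    job_of_worker = np.full(cost.shape[0], -1)
    job_of_worker[rows] = col_to_type[cols]
    return job_of_worker, expanded[rows, cols].sum()


# Toy data (made up): 4 workers, 2 job types with 3 and 1 positions.
toy_cost = np.array([[10, 90], [20, 5], [30, 40], [70, 60]])
toy_jobs, toy_total = assign_by_column_expansion(toy_cost, [3, 1])
print(toy_jobs, toy_total)
```

The expanded matrix has one column per position, which is exactly what makes it 50k x 50k at full size.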

Is there any variation of the Hungarian algorithm, or a different algorithm, that exploits the fact that there can be multiple connections to one job node? Or should I rather look into Monte Carlo or genetic search tree algorithms?

edit:

Formal description as Sascha proposed:

Set W for workers, J for jobs, weight function formula for the preference, function formula for the amount of jobs available

So the function I want to minimize would be: formula

where formula

Constraints would be:

formula

and

formula
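Written out, a model consistent with this description could look like the following. The names $c(i,j)$ for the preference weight and $x_{ij}$ for the 0/1 decision "worker $i$ takes a job of type $j$" are assumptions for illustration; $A(j)$ is the number of positions of job type $j$, as in the constraint quoted in the comments.

```latex
\min \sum_{i \in W} \sum_{j \in J} c(i,j)\, x_{ij}
\qquad \text{where } x_{ij} \in \{0, 1\}

\text{subject to}

\sum_{j \in J} x_{ij} = 1 \quad \text{for } i \in W

\sum_{i \in W} x_{ij} = A(j) \quad \text{for } j \in J
```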

As asked by Yay295, it would be ok if it ran for a day or two on a normal consumer machine. There are 50k workers right now with 10 kinds of jobs and 50k jobs total. So the matrix is 50k x 50k (extended) in the case of the Hungarian algorithm I'm using right now, or 50k x 10 for LP with the additional constraint $\sum_{i \in W} x_{ij} = A(j) \text{ for } j \in J$, while $\sum_{j \in J} A(j) = 50000$. Preference values in the matrix go from 0 to 100.

Yay295
user2368505
  • Formalize your problem to get more help. This problem should be easily solved by linear programming, but I don't want to code it while there is only this informal description. (For the classic assignment problem, LP solvers are often even faster than the dedicated Hungarian algorithm.) – sascha Jun 04 '16 at 11:44
  • Added formalization, I just hope it's correct. – user2368505 Jun 04 '16 at 13:15
  • This is now looking more like a transportation problem. You may want to consider using a decent LP solver, as this allows both assignment and transportation problems (and other variations). – Erwin Kalvelagen Jun 04 '16 at 15:31
  • How fast does it have to be, and what's the largest value you would have in the matrix? – Yay295 Jun 05 '16 at 02:38
  • It would be ok if it ran for a day or two on a normal consumer machine. There are 50k workers right now with 10 kinds of jobs and 50k jobs total. So the matrix is 50k x 50k (extended) in the case of the Hungarian algorithm I'm using right now, or 50k x 10 for LP with the additional constraint $\sum_{i \in W} x_{ij} = A(j) \text{ for } j \in J$, while $\sum_{j \in J} A(j) = 50000$. – user2368505 Jun 05 '16 at 10:26
  • Ah, and preference values in the matrix go from 0-100. – user2368505 Jun 05 '16 at 10:34

1 Answer


This is actually called the Transportation Problem. The Transportation Problem is similar to the Assignment Problem in that they both have sources and destinations, but the Transportation Problem has two more values: each source has a supply, and each destination has a demand. The Assignment Problem is a simplification of the Transportation Problem in which the supply of each source and the demand of each destination is 1.

In your case, you have 50,000 sources (your workers) each with a supply of 1 (each worker can only work one job). You also have 10 destinations (the job types) each with some amount of demand (the number of openings for that type).
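As a rough sketch of that setup, the transportation LP can be handed to a general LP solver. The example below assumes SciPy's `linprog` with the HiGHS backend; the function name and the toy numbers are made up, and the dense constraint matrices are only practical for small instances (at 50,000 workers you would presumably want sparse matrices or a dedicated solver).

```python
import numpy as np
from scipy.optimize import linprog


def solve_transport_lp(cost, demand):
    """Transportation LP: every worker has supply 1, job type j has demand[j].

    Hypothetical helper, for illustration only. Assumes that
    sum(demand) equals the number of workers.
    """
    cost = np.asarray(cost, dtype=float)
    n_workers, n_types = cost.shape
    # Variables x[i, j] are flattened row-major: index = i * n_types + j.
    # Each worker takes exactly one job: sum_j x[i, j] = 1.
    a_worker = np.kron(np.eye(n_workers), np.ones((1, n_types)))
    # Each job type is filled exactly: sum_i x[i, j] = demand[j].
    a_job = np.kron(np.ones((1, n_workers)), np.eye(n_types))
    a_eq = np.vstack([a_worker, a_job])
    b_eq = np.concatenate([np.ones(n_workers), np.asarray(demand, dtype=float)])
    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq,
                  bounds=(0, 1), method="highs")
    x = res.x.reshape(n_workers, n_types)
    # Read off the job type with the largest share for each worker.
    return x.argmax(axis=1), res.fun


# Toy data (made up): 4 workers, 2 job types needing 3 and 1 workers.
toy_cost = [[10, 90], [20, 5], [30, 40], [70, 60]]
jobs, total = solve_transport_lp(toy_cost, [3, 1])
print(jobs, total)
```

Note that the problem now has only 50,000 x 10 variables instead of a 50,000 x 50,000 matrix.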

The Transportation Problem is traditionally solved with the Simplex Algorithm. I couldn't tell you how it works off the top of my head, but there is plenty of information available elsewhere online on how to do it. I would recommend these two videos: first, second.

 

Alternatively, the Transportation Problem can also be attacked with the Hungarian Algorithm. The idea is to keep track of your supply and demand separately, and then use the Hungarian Algorithm (or any other algorithm for the Assignment Problem) to solve it as if every supply and demand were 1 (this can be incredibly fast when the problem is as lopsided as 50,000 sources to 10 destinations, as in your case). After each solve, use the result to decrement the supply and demand of the assigned sources and destinations. Repeat until the sum of either supply or demand is zero.
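The round-by-round scheme could be sketched like this. It assumes SciPy's `linear_sum_assignment` as a stand-in for the Hungarian Algorithm, and the function name is made up. Be aware that filling one position per job type per round is greedy, so the result is not guaranteed to be the overall optimum; it is worth comparing against an exact solver on small cases.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def assign_in_rounds(cost, demand):
    """Repeatedly solve a workers x (open job types) assignment problem.

    Hypothetical helper, for illustration only. Each round fills at most
    one position per job type that still has demand left.
    """
    cost = np.asarray(cost)
    remaining = np.array(demand, dtype=int)
    job_of_worker = np.full(cost.shape[0], -1)  # -1 = not yet assigned
    while remaining.sum() > 0 and (job_of_worker < 0).any():
        free_workers = np.flatnonzero(job_of_worker < 0)
        open_types = np.flatnonzero(remaining > 0)
        sub = cost[np.ix_(free_workers, open_types)]
        # Rectangular input is allowed: min(rows, cols) pairs are matched.
        rows, cols = linear_sum_assignment(sub)
        job_of_worker[free_workers[rows]] = open_types[cols]
        remaining[open_types[cols]] -= 1  # each type is matched at most once
    return job_of_worker


# Toy data (made up): 4 workers, 2 job types needing 3 and 1 workers.
toy_cost = np.array([[10, 90], [20, 5], [30, 40], [70, 60]])
print(assign_in_rounds(toy_cost, [3, 1]))
```

With 10 job types, each round is a tiny problem, but up to max(demand) rounds are needed.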

However, none of this may be necessary. I wrote my own Assignment Problem solver in C++ a few years ago, and despite using 2.5GB of RAM, it can solve a 50,000 by 50,000 assignment problem in less than 5 seconds. The trick is to write your own. Before I wrote mine I had a look around at what was available online, and they were all pretty bad, often with obvious bugs. If you are going to write your own code for this though, it would be better to use the Simplex Algorithm as described in the videos I linked above. I don't know that one is faster than the other, but the Hungarian Algorithm wasn't made for the Transportation Problem.

 

ps: The same person who did the two lectures I linked above also did one on the Assignment Problem and the Hungarian Algorithm.

Yay295