How to add a PuLP constraint that involves two variables that are subject to be optimized

Question

I am a newbie with PuLP and trying to solve a potentially trivial problem.

I have some projects that required employees with a certain skillset per project, defined in a matrix project_skill_demand. The projects demand certain (wo)manpower to be maintained, eg. 1.5 (150%) of a FTE. Developers may work no more than .8 across all projects they are assigned to, and the total project demand must be fulfilled, defined in project_utilization_demand.

I hard coded 5 developers assuming that not all are assigned to a project if possible.

Here are my two problems: Distributing employees (here developers) across projects looked straight forward, however I fail to define a minimization function that maximizes a developers utilization while at the same time minimizing the total number of assigned workers (which should be a consequence of the former).

What I additionally can't figure out is how to have the solver minimize the required skill sets for a given developer working on a project. I want to optimize for the smallest skill set possible while fulfilling all demands of a project they are assigned to.

Here is what I have so far:

import pulp as plp


Projects = ["P1", "P2", "P3"]
Devs = ["D1", "D2", "D3", "D4", "D5", "D6"]
Skills = ["S1", "S2", "S3", "S4"]


# Define demand for each project in pct FTE
project_utilization_demand = {"P1": 2.0, "P2": 0.5, "P3": 1.5,}

# Define sets of required skills for each project, unused so far
_project_skill_demand = [  # Skills
    # S1 S2 S3 S4
    [1, 1, 1, 1],  # P1
    [1, 1, 0, 0],  # P2    Projects
    [0, 1, 0, 1],  # P3
]

project_skill_demand = plp.makeDict([Projects, Skills], _project_skill_demand)

DevSkills = [(d, s) for d in Devs for s in Skills]
Assignments = [(p, d) for p in Projects for d in Devs]

# These variables I'd like to have optimized 
# Skill requirements for devs should be as little as possible
dev_skills = plp.LpVariable.dicts("DevSkills", (Devs, Skills), 0, 1, cat=plp.LpInteger)
# dev_skills = plp.LpVariable.dicts("DevSkills", (Devs, Skills), cat=plp.LpBinary) # Is there any difference using a binary variable here?

# Devs should be assigned to as few projects as possible
# Can I use information from `project_utilization_demand` as the upper limit here, currently None?
assignments = plp.LpVariable.dicts("pct_assignment", (Projects, Devs), 0, None, plp.LpContinuous)

prob = plp.LpProblem("LinearProgramming", plp.LpMinimize)

# minimizing assignments has no effect since the constraints set them to > sum(project_demand), which is 4
# and there is no reason to go beyond that
# 
prob += (
    plp.lpSum([assignments[p][d] for p, d in Assignments]) # plp.lpSum(assignements) is just the same
    + plp.lpSum(dev_skills) # Without a constraint, this will remain all 0
)

# Constraints
# Devs may only be utilized to max .8 (80%)
for d in Devs:
    prob += (
        plp.lpSum([assignments[p][d] for p in Projects]) <= .8,
        f"Sum of projects for Dev {d}",
        
    )   

# total assignments of devs to projects must fulfill demand
for p in Projects:
    prob += (
        plp.lpSum([assignments[p][d] for d in Devs]) >= project_utilization_demand[p],
        f"Sum of Devs in P {p}",
    )

    # Skills of devs _eventually assigned_ to a project must satisfy the project's requirements
    #
    # HOW?
    # 


prob.solve()
status = plp.LpStatus[prob.status]

for v in prob.variables():
    print(f"{v.name} = {v.varValue}")

print(f"Total Costs = {prob.objective.value()}")

As I read your code it appears that the worker’s skills are a variable as well? Is that intent—seems a bit odd. I would suspect that the workers skill sets would be a known parameter, this would be much more straightforward…? — AirSquid, Apr 27 '22 at 16:20
They are indeed variables, the skills required per project are hard coded though. The initial goal was to optimize/minimize the skill sets that are required to optimally staff the projects, if that makes sense. — TomTom101, Apr 28 '22 at 19:44
it is also the case that each worker assigned to the project needs to have *all* of the skills (or a superset) in order to be assigned to a project? Or could a worker with only `S1` contribute to some of the demand for `P2`, for example? — AirSquid, May 01 '22 at 23:57
I'd like to experiment with that, I don't know yet. I'd be happy to at all get to this stage, though ;) — TomTom101, May 04 '22 at 08:00

score 1 · Answer 1 · answered May 04 '22 at 16:06

Here's a cut at this....

A couple notes:

With 4 skills, you have 15 possible combinations of skills, yet in your problem, only 4 of them are applicable, so I collapsed the problem down by generating the "useful skillsets" from the requirement and used that to regulate the assignments
We are assigning skillsets to tasks here, so I used that as a variable, and then used a second variable to roll up the totality of the assignments into an integer "person" to hire with that skillset.
As mentioned you are tinkering with a multi-objective problem here (skills & dev count) so I just chose to "cost" the skillsets as a means of making an objective function. I costed them by count of skills, but anything is possible with a LUT or such with cost breakdowns.
Note in this model, there is no sense of which individual is assigned to which project. If that becomes needed in additional constraints, we'd have to add that as an index as well

Code

from typing import Union
import pulp as plp
from itertools import chain, combinations

# from: https://docs.python.org/3/library/itertools.html#itertools-recipes
def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

projects = ["P1", "P2", "P3"]
#Devs = ["D1", "D2", "D3", "D4", "D5", "D6"]
skills = ["S1", "S2", "S3", "S4"]

# Define demand for each project in pct FTE
project_utilization_demand = {"P1": 2.0, "P2": 0.5, "P3": 1.5,}

proj_skill_reqts = {
    'P1': {'S1', 'S2', 'S3', 'S4'},
    'P2': {'S1', 'S2',       'S4'},
    'P3': {      'S2',       'S4'}
}

all_skillsets = {frozenset(t) for t in powerset(skills)}  # a set of all 16 combinations of 4 skills
desired_skillsets = proj_skill_reqts.values()
# cull the list to only the ones usable on the projects...
usable_skillsets = {t for t in all_skillsets if any(t>=s for s in desired_skillsets)}

# some cost structure for the skills.  Will use just a count of the # of skills
# and in OBJ, will weight with this
wt = 1/3  # (this could be anything...)
skillset_costs = {ss:len(ss)*wt for ss in usable_skillsets}

# now, we need a way to make subets of all of the usable skills indexed by project they
# are usable on.
project_usable_ss = {p:{ss for ss in usable_skillsets if ss >= proj_skill_reqts[p]} for p in projects}

# print(project_usable_ss)

# set up the pulp problem

# VARIABLES
# assign some proportion skillset ss time to project p
assignments = [(ss, p) for p in projects for ss in usable_skillsets]
assign = plp.LpVariable.dicts("assign", assignments, lowBound=0, cat=plp.LpContinuous)
# hire/train this number of employees with skillset ss
hire = plp.LpVariable.dicts("hire", usable_skillsets, lowBound=0, cat=plp.LpInteger)

# Define problem
prob = plp.LpProblem("assign_employees", plp.LpMinimize)

# OBJ:  minimize the number of employees needed, and minimize cost of skillsets
prob += plp.lpSum(hire[ss]*skillset_costs[ss] for ss in usable_skillsets)

# CONSTRAINTS
# hire to cover the assignment demand
max_ute = 0.8
for ss in usable_skillsets:
    prob += hire[ss] >= (1/max_ute)*plp.lpSum(assign[ss, p] for p in projects)

# cover the project demands with assignments
# note we are using the project-specific subsets as the source of
# qualified assignments
for p in projects:
    prob += plp.lpSum(assign[ss, p] for ss in project_usable_ss[p]) >= project_utilization_demand[p]


#print(prob)

prob.solve()
status = plp.LpStatus[prob.status]
assert(status=='Optimal')  # fail here & prevent buffoonery if not optimal

def set_print(x: Union[set, frozenset]) -> str:
    x = sorted(x)
    if not x: 
        return None
    res = '{'
    res += f'{x[0]}'
    for t in x[1:]:
        res += ', '
        res += str( t)
    res += '}'
    return res

for ss in usable_skillsets:
    print(f'hire {hire[ss].varValue} people with skills: {set_print(ss)}')
    
print()
for p in projects:
    for ss in project_usable_ss[p]:
        if assign[ss, p].varValue:
            print(f'assign {assign[ss, p].varValue}FTE with skillset {set_print(ss)} to project {p}')

print()
print(f"Total Costs = {prob.objective.value():0.2f}")

Yields:

Result - Optimal solution found

Objective value:                5.66666667
Enumerated nodes:               0
Total iterations:               0
Time (CPU seconds):             0.00
Time (Wallclock seconds):       0.00

Option for printingOptions changed from normal to all
Total time (CPU seconds):       0.00   (Wallclock seconds):       0.00

hire 0.0 people with skills: {S2, S3, S4}
hire 1.0 people with skills: {S1, S2, S4}
hire 1.0 people with skills: {S2, S4}
hire 3.0 people with skills: {S1, S2, S3, S4}

assign 2.0FTE with skillset {S1, S2, S3, S4} to project P1
assign 0.5FTE with skillset {S1, S2, S4} to project P2
assign 0.3FTE with skillset {S1, S2, S4} to project P3
assign 0.8FTE with skillset {S2, S4} to project P3
assign 0.4FTE with skillset {S1, S2, S3, S4} to project P3

Total Costs = 5.67
[Finished in 166ms]

Thanks for giving this a whirl! When I played around it does give some suboptimal advise ;) Change P1 demand to 2.5 and our LP-HR department lets us hire four S1-4 skilled personell only to assign 2.5 of them to P1, 0.5 to P2 and the third one gets lucky and is not assigned at all ;) — TomTom101, May 06 '22 at 18:55
Hire 4 people with skills: {S1, S2, S3, S4} Hire 2 people with skills: {S2, S4} Assign 2.5 with skillset {S1, S2, S3, S4} to project P1 Assign 0.5 with skillset {S1, S2, S3, S4} to project P2 Assign 1.6 with skillset {S2, S4} to project P3 — TomTom101, May 06 '22 at 18:55
These are optimal, given the constraints you stated. You stated 80% max UTE rate, so in the first case, even though we are "assigning" the full 2.5 + 0.5 = 3.0 UTE, we need to hire 4 people to cover this as 0.8 * 4 = 3.2 usable FTE — AirSquid, May 06 '22 at 19:21

score 0 · Answer 2 · answered Apr 26 '22 at 08:30

0

Re 1. This is a case of multi-objective optimization. This is a fairly large field. You may want to do some reading on this subject.

answered Apr 26 '22 at 08:30

Erwin Kalvelagen

15,677
2
14
39

So much for "potentially trivial" ;) I would reckon this everyday problem has been solved a million times already. Thanks however! – TomTom101 Apr 26 '22 at 08:42

How to add a PuLP constraint that involves two variables that are subject to be optimized

2 Answers2

Code

Yields: