## Monday, January 23, 2017

### Employee Scheduling II: column generation

In (1) and (2) an Employee Scheduling model is considered where all possible shifts are enumerated before we solve the actual model. Although such models models tend to solve quickly (especially considering their size), generating all possible shifts may not be an option for larger problems. One possible way to attack this problem is using a traditional column generation approach. This method will generate attractive candidate solutions and add them to the problem.

Our original problem was:

 \begin{align}\min\>&\sum_s c_s x_s\\ &\sum_{s|E(e,s)} x_s \le 1 \>\forall e\\ &\sum_{s|S(s,t)} x_s \ge R_t \>\forall t\\ &x_s \in\{0,1\}\end{align}

where

 $x_s = \begin{cases}1 & \text{if shift s is selected}\\ 0 & \text{otherwise} \end{cases}$

Instead of generating all possible shifts in advance, we start with a small initial set $$s \in S$$. We want this initial set to form a feasible solution for the problem. To make this easier we rewrite the problem as:

 \begin{align}\min\> & \sum_s c_s x_s + \sum_t c_t s_t\\ &\sum_{s|E(e,s)} x_s \le 1\>\forall e\\ &\sum_{s|S(s,t)} x_s + s_t \ge R_t\>\forall t \\ &x_s \in\{0,1\}\\ & s_t \ge 0\end{align}

Here $$s_t$$ is a slack with a penalty $$c_t$$. We also can consider this as extra, flexible, hourly personnel to hire at a hourly rate of $$c_t$$ to cover staffing shortages. Sometimes we call this scheme: making the model “elastic”.

The column generation algorithm iterates between two different problems. First we have a master problem that is essentially a relaxed version of the elastic model we just discussed. We replace the condition $$x_s \in \{0,1\}$$ by $$x_s \in [0,1]$$. Using this we now have an LP (and thus we can generate duals). Second, there is a sub problem that calculates a candidate column (or candidate shift).

The sub problem uses a particular objective: it minimizes the reduced cost of the new LP column $$A_j$$ wrt to the master problem. I.e. the objective is basically

 $\min \sigma_j = c_j – \pi^T A_j$

where $$\pi$$ are the LP duals of the master problem.

This scheme is called column generation as we increase the size of the number of variables $$x_s$$ during the algorithm:

Note that the columns $$s_t$$ stay fixed.

We stop iterating as soon as we can no longer find a new column with $$\sigma_j < 0$$, indicating a “profitable” column. We now have collection of shifts and an LP solution. I.e. we have fractional variable values. To make this a usable schedule we need to solve one more MIP: introduce again the condition $$x_s \in\{0,1\}$$. The complete algorithm can look like:

1. Generate an initial (small) set $$S$$ of feasible shifts
2. Solve the relaxed master problem (LP)
3. Solve the sub problem
4. If $$\sigma_j < 0$$ add shift to the set $$S$$ and go to step 2
5. Solve master problem as MIP

#### Initial Set

For the problem in (1) we generate 100 random feasible shifts. The first 10 shifts are as follows:

Note that we have here two shifts for employee White. This is no problem: the master problem will select the most suitable one to include in the schedule. All these shifts are feasible: they obey the minimum and maximum hours and will not use hours an employee is not available. If the shifts are not sufficient to cover all staffing requirements, no problem: the LP model will start hiring the expensive hourly “emergency” workers. Indeed for these 100 random shifts we need some extra hourly help:

 ----    214 VARIABLE short.L  staffing shortage in time t t5  1.000,    t6  1.000,    t7  1.000,    t9  2.000,    t13 1.000,    t15 1.000,    t16 1.000,    t17 1.000t18 1.000,    t19 1.000,    t23 1.000

#### Master Problem

The master problem expressed in GAMS looks like:

 set   s0 'shifts' /s1*s1000/   s(s0) 'dynamic subset will grow during algorithm'   esmap(e,s0) 'mapping between employee and shift'   shifts(s0,t) 'data structire to hold shifts';parameters   a(t,s0) 'matrix growing in dimension s'   c(s0)   'cost vector growing in dimension s';scalar   shortcost 'cost to hire extra hourly staff' /200/;binary variable x(s0) 'main variable: shift is selected';positive variable short(t) 'staffing shortage in time t';variable cost;equations  masterObj        'calculate total cost'  oneShift(e)      'each employee can do at most one shift'  coverPeriod(t)   'cover each period';masterObj..   cost =e= sum(s, c(s)*x(s)) + sum(t,shortcost*short(t));oneShift(e)..   sum(esmap(e,s), x(s)) =l= 1;coverPeriod(t)..   sum(shifts(s,t), x(s)) + short(t) =g= requirements(t);model Master /masterObj,oneShift,coverPeriod/;

#### Sub problem

The sub problem is somewhat more difficult. We need to find a feasible shift and minimize the reduced cost $$\sigma_j$$.

 set sh 'feasible shift hours' /sh2*sh8/;parameter shlen(sh);shlen(sh) = ord(sh)+1;binary variables   w(t) 'work at time t'   start(t) 'start of shift'   emp(e) 'select shift from employee e'   shft(sh) 'select feasible shift length'   se(sh,e) 'select (sh,e) combination';variables   shiftlen  'length of shift'   wagecost   redcost   'reduced cost';equations   startShift(t)    'start of shift: w(t) goes from 0 to 1'   oneShiftStart    'exactly one start of shift'   calcLen          'calculate length of shift'   oneEmployee      'pick one employee'   minLength        'observe minimum shift length'   maxLength        'observe maximum shift length'   feasibleShift(t) 'make sure shift is feasible'   calcShift1       'select shift length'   calcShift2       'id'   secomb(sh,e)     'shift/employee combination'   calcwage         'calculate cost of shift'   subObj           'objective';startShift(t)..  start(t) =g= w(t) - w(t--1);oneShiftStart..  sum(t,start(t)) =e= 1;calcLen..        shiftLen =e= sum(t, w(t));oneEmployee..    sum(e, emp(e)) =e= 1;minLength..      shiftLen =g= sum(e, minlen(e)*emp(e));maxLength..      shiftLen =l= sum(e, maxlen(e)*emp(e));feasibleShift(t).. w(t) =l= sum(canwork(e,t),emp(e));calcShift1..     sum(sh,shft(sh)) =e= 1;calcShift2..     sum(sh,shlen(sh)*shft(sh)) =e= shiftlen;secomb(sh,e)..   se(sh,e) =g= shft(sh) + emp(e) - 1;calcwage..       wageCost =e= sum((sh,e),shlen(sh)*wage(e)*se(sh,e));subObj..  redcost =e= wageCost - sum(e,emp(e)*oneShift.m(e)) - sum(t, w(t)*coverPeriod.m(t));model sub /subObj,startShift,oneShiftStart,calcLen,oneEmployee,minLength,           maxLength,feasibleShift,calcShift1,calcShift2,secomb,calcWage/;

Some of the modeling issues with the sub problem are explained in more detail in (3).

#### Column Generation Algorithm

The CG algorithm itself can look like:

 sets  iter /iter1*iter100/;scalars   continue /1/;parameter   masterObjVal(iter);loop(iter$continue, solve master using rmip minimizing cost; masterObjVal(iter) = cost.L; solve sub using mip minimizing redcost; if (redcost.l < -0.001, esmap(e,nexts)$(emp.l(e)>0.5) = yes;      shifts(nexts,t)$(w.l(t)>0.5) = yes; c(nexts) = wagecost.L; s(nexts) = yes; nexts(s0) = nexts(s0-1); else continue = 0; ););solve master using mip minimizing cost; #### Results We need to do 36 cycles before we can conclude there are no more interesting columns. The objective of the relaxed master problem decreases to an LP objective representing a cost of$4670. Of course the LP objective is non-increasing: adding a column (i.e. adding a shift) can only help.

After these 36 iterations we have collected 135 shifts (remember we starting in the first iteration with our 100 randomly generated initial shifts). This compares quite favorably to the 1295 shifts we obtained by enumerating all possible shifts. When we reintroduce the integer restrictions $$x_s\in \{0,1\}$$ the objective value deteriorates to $4675. This is just a little bit worse than the global optimal objective of$4670. Indeed this method does not guarantee to find absolute best integer solution, however often it finds very good solutions.

The final schedule looks like:

We don’t need any expensive hourly “emergency” staff, i.e. $$s_t = 0$$. Again we have no slack in the system: we precisely meet the staffing requirements every hour:

#### References

1. Loren Shure, Generating an Optimal Employee Work Schedule Using Integer Linear Programming, January 6, 2016, http://blogs.mathworks.com/loren/2016/01/06/generating-an-optimal-employee-work-schedule-using-integer-linear-programming/ describes the problem with a Matlab implementation.
2. Employee Scheduling I: Matlab vs GAMS, http://yetanothermathprogrammingconsultant.blogspot.com/2017/01/employee-scheduling-i-matlab-vs-gams.html has a GAMS version of the model in (1).
3. Employee Scheduling III: generating all shifts using the Cplex solution pool, http://yetanothermathprogrammingconsultant.blogspot.com/2017/01/employee-scheduling-iii-generating-all.html
4. Erwin Kalvelagen, Column Generation with GAMS, http://www.amsterdamoptimization.com/pdf/colgen.pdf has some background on Column Generation in GAMS.
5. Employee Scheduling IV: direct optimization, http://yetanothermathprogrammingconsultant.blogspot.com/2017/03/employee-scheduling-iv-direct.html