Tuesday, February 21, 2017

Multiplication of binary with continuous variable between zero and one

A small tidbit. The standard linearization for the multiplication of two binary variables \(z=x \cdot y\) with \(x,y \in \{0,1\}\) is:
\[\boxed{\begin{align}&z\le x\\&z \le y\\& z\ge x+y-1\\&z \in \{0,1\}\end{align}}\]

What if \(y\) is continuous between zero and one, \(y\in[0,1]\), while \(x\) remains a binary variable? In that case we can use the formulation from (1), which handles the general case of a continuous variable \(y \in [y^{lo},y^{up}]\):

\[\boxed{\begin{align}&\min\{0,y^{lo}\} \le z \le \max\{0,y^{up}\} \\
&y^{lo} \cdot x \le z \le y^{up} \cdot x \\
&y - y^{up} \cdot (1 - x) \le z \le y - y^{lo} \cdot (1 - x)\end{align}}\]

When we plug in \(y^{lo}=0\) and \(y^{up}=1\), we end up with essentially the same formulation, except that \(z\) is now continuous:

\[\boxed{\begin{align}&z\le x\\&z \le y\\& z\ge x+y-1\\&z \in [0,1]\end{align}}\]

We can easily verify the correctness by checking what happens when \(x=0\) or \(x=1\):

\[\begin{align}
    &x=0 & \implies & \begin{cases}
                              z\le 0\\
                              z \le y &\text{(NB)}\\
                              z \ge y-1 &\text{(NB)} \\
                              z \in [0,1]
                             \end{cases} & \implies  & z=0 \\
    &x=1 & \implies & \begin{cases}
                              z\le 1 & \text{(NB)}\\
                              z \le y \\
                              z \ge y \\
                              z \in [0,1]
                             \end{cases} & \implies  & z=y
\end{align}\]

where \(\text{NB}\) means non-binding, i.e. a constraint we can ignore.
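In GAMS the general-bounds formulation could look like the following sketch (the names and the example bounds are just for illustration):

scalars ylo / -1 /, yup / 2 /;

binary variable x;
variables y, z;
y.lo = ylo;  y.up = yup;
z.lo = min(0,ylo);  z.up = max(0,yup);

equations e1, e2, e3, e4;
e1.. ylo*x =l= z;
e2.. z =l= yup*x;
e3.. y - yup*(1-x) =l= z;
e4.. z =l= y - ylo*(1-x);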

References
  1. Multiplication of a continuous and a binary variable, http://yetanothermathprogrammingconsultant.blogspot.com/2008/05/multiplication-of-continuous-and-binary.html

Saturday, February 18, 2017

Solving Kakuro puzzles as a MIP

The Kakuro puzzle is another logic puzzle, similar to Sudoku, KenKen and Takuzu. A small example from (1) is:

image 

The rules are as follows:

  1. Each (white) cell must be filled with a value \(1,\dots,9\).
  2. Each (white) cell belongs to a horizontal and/or a vertical block of contiguous cells.
  3. The values of the cells in each block add up to a given constant. These constants are shown to the left of (or above) the block, using the notation a\b: a is the sum for the block below and b is the sum for the block to the right.
  4. The values in each block must be unique (i.e. no duplicates).

Model

The actual model is rather simple. Similar to (2) and (3) I define a binary decision variable:

\[x_{i,j,k} = \begin{cases}1 & \text{if cell $(i,j)$ has the value $k$} \\
     0& \text{otherwise}\end{cases}\]

where \(k \in \{1,\dots,9\}\). Of course, we can select only one value per cell, i.e.:

\[ \sum_k x_{i,j,k} = 1 \>\>\forall i,j|IJ(i,j)\]

where \(IJ(i,j)\) is the set of white cells we need to fill in. To make the model simpler we also define an integer variable \(v_{i,j}\) with the actual value:

\[v_{i,j} = \sum_k k\cdot x_{i,j,k}\]

The values in each block need to add up to the given constant. So:

\[\sum_{i,j|B(b,i,j)} v_{i,j} = C_b \>\>\forall b\]

where the set \(B(b,i,j)\) is the mapping between block number and cells \((i,j)\). The uniqueness constraint is not very difficult at all:

\[\sum_{i,j|B(b,i,j)} x_{i,j,k} \le 1 \>\>\forall b,k\]

Note that (1) uses a different numbering scheme for the cells, but otherwise the models are very similar. My version of the model is smaller in terms of nonzero elements in the matrix, due to my use of intermediate variables \(v_{i,j}\).
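Putting these pieces together, a minimal GAMS sketch of the model could look like this (the set \(IJ\), the mapping \(B\) and the constants \(C\) are assumed to be derived from the data shown below):

binary variable x(i,j,k) 'cell (i,j) has value k';
integer variable v(i,j) 'value of cell (i,j)';
variable dummy;

equations oneValue(i,j), defValue(i,j), blockSum(b), unique(b,k), dummyObj;

oneValue(i,j)$IJ(i,j).. sum(k, x(i,j,k)) =e= 1;
defValue(i,j)$IJ(i,j).. v(i,j) =e= sum(k, ord(k)*x(i,j,k));
blockSum(b)..           sum(B(b,i,j), v(i,j)) =e= C(b);
unique(b,k)..           sum(B(b,i,j), x(i,j,k)) =l= 1;
dummyObj..              dummy =e= 0;

model kakuro /all/;
solve kakuro using mip minimizing dummy;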

Data entry

A compact way to enter the data for the example problem in GAMS is:

parameter block(b,i,j) /
   block1.r1.(c1*c2)   16  ,   block13.(r1*r3).c1  23
   block2.r1.(c5*c7)   24  ,   block14.(r6*r7).c1  11
   block3.r2.(c1*c2)   17  ,   block15.(r1*r4).c2  30
   block4.r2.(c4*c7)   29  ,   block16.(r6*r7).c2  10
   block5.r3.(c1*c5)   35  ,   block17.(r3*r7).c3  15
   block6.r4.(c2*c3)    7  ,   block18.(r2*r3).c4  17
   block7.r4.(c5*c6)    8  ,   block19.(r5*r6).c4   7
   block8.r5.(c3*c7)   16  ,   block20.(r1*r5).c5  27
   block9.r6.(c1*c4)   21  ,   block21.(r1*r2).c6  12
   block10.r6.(c6*c7)   5  ,   block22.(r4*r7).c6  12
   block11.r7.(c1*c3)   6  ,   block23.(r1*r2).c7  16
   block12.r7.(c6*c7)   3  ,   block24.(r5*r7).c7   7
/;

The sets \(IJ(i,j)\) and parameter \(C_b\) can be easily extracted from this.
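For instance (a sketch; the smax just picks up the constant that is stored with each cell of a block):

set B(b,i,j) 'mapping between blocks and cells';
set IJ(i,j) 'white cells';
parameter C(b) 'sum for each block';

B(b,i,j) = yes$block(b,i,j);
IJ(i,j) = sum(b, B(b,i,j));
C(b) = smax(B(b,i,j), block(b,i,j));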

Results

After adding a dummy objective, we can solve the model as a MIP (Mixed Integer Programming) Model. The output looks like:

----     75 VARIABLE v.L 

            c1          c2          c3          c4          c5          c6          c7

r1           9           7                                   8           7           9
r2           8           9                       8           9           5           7
r3           6           8           5           9           7
r4                       6           1                       2           6
r5                                   4           6           1           3           2
r6           8           9           3           1                       1           4
r7           3           1           2                                   2           1

Note that although I declared \(x_{i,j,k}\) over all cells, only the subset of cells we actually need to fill in appears in the equations. GAMS only generates variables that appear in the equations, so the “black” cells are not part of the model.

Cplex can solve the model in zero iterations and zero nodes: it solves the model completely in the presolve.

Interestingly, somewhat disappointing performance results are reported in (1). They use CBC and a solver called eplex (never heard of that one). I checked the above small instance with CBC, and it also solves it in zero nodes, just like Cplex.

A large, difficult example

This example is taken from (4):

image

For this problem Cplex has to do a little bit of work. The log looks like:

---   3,665 rows  4,921 columns  19,189 non-zeroes
---   4,920 discrete-columns

. . .

Dual simplex solved model.

Root relaxation solution time = 0.44 sec. (102.53 ticks)

        Nodes                                         Cuts/
   Node  Left     Objective  IInf  Best Integer    Best Bound    ItCnt     Gap

      0     0        0.0000   325                      0.0000     1727        
      0     0        0.0000   188                    Cuts: 39     2562        
      0     0        0.0000   233                    Cuts: 49     3018        
      0     0        0.0000   227                    Cuts: 18     3403        
      0     0        0.0000   147                    Cuts: 64     4037        
*     0+    0                            0.0000        0.0000             0.00%
Found incumbent of value 0.000000 after 4.42 sec. (1071.33 ticks)
      0     0        cutoff              0.0000        0.0000     4423    0.00%
Elapsed time = 4.44 sec. (1072.36 ticks, tree = 0.01 MB, solutions = 1)

. . .

Proven optimal solution.

MIP Solution:            0.000000    (4423 iterations, 0 nodes)

For completeness: CBC takes a little more time:

Search completed - best objective 0, took 55117 iterations and 140 nodes (42.00 seconds)
Strong branching done 1748 times (104829 iterations), fathomed 9 nodes and fixed 122 variables
Maximum depth 18, 0 variables fixed on reduced cost

This model does not solve completely in the presolve. However, the computation times look quite good to me, especially when compared to what is reported in (1).

The solution looks like:

image

Proving uniqueness

We can prove the uniqueness of this solution by adding the constraint:

\[\sum_{i,j,k|IJ(i,j)} x^*_{i,j,k}\, x_{i,j,k} \le |IJ| - 1 \]

where \(x^*\) is the previously found optimal solution, and \(|IJ|\) is the cardinality of the set \(IJ(i,j)\) (i.e. the number of “white” cells). Note that \(\sum x^*_{i,j,k}=\sum x_{i,j,k} = |IJ|\). With this constraint we should either find a different solution (so the solution was not unique), or the model should be declared “infeasible”.
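In GAMS this cut could be added as follows (a sketch; the parameter xstar holds the previously found solution):

parameter xstar(i,j,k) 'previously found optimal solution';
xstar(i,j,k) = round(x.l(i,j,k));

equation cut 'exclude the previous solution';
cut.. sum((i,j,k)$IJ(i,j), xstar(i,j,k)*x(i,j,k)) =l= card(IJ) - 1;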

References
  1. Helmut Simonis, Kakuro as a Constraint Problem, Cork Constraint Computation Centre (4C), University College Cork, 2008.
  2. MIP Modeling: From Sukodu to Kenken via Logarithms, http://yetanothermathprogrammingconsultant.blogspot.com/2016/10/mip-modeling-from-sudoku-to-kenken.html 
  3. Solving Takuzu puzzles as a MIP using xor, http://yetanothermathprogrammingconsultant.blogspot.com/2017/01/solving-takuzu-puzzles-as-mip-using-xor.html
  4. http://pagesperso.g-scop.grenoble-inp.fr/~cambazah/page1/page7/assets/data_generated_hard.ecl has some larger data sets.
  5. http://www.gecode.org/doc-latest/MPG.pdf, Chapter 21 is about solving Kakuro using Gecode, a constraint programming solver.

A script to generate power pivot tables

In (1) I provided a GAMS/VBScript script to generate an Excel spreadsheet containing a pivot table that uses a CSV file as an external data source. Essentially the pivot table provides a view of the data in the CSV file. Here I show a script that stores the data inside the Excel data model: no external CSV file is needed after the data has been loaded into the data model.

$ontext

  Example code that creates a pivot table with large data loaded
  into Excel's data model.

$offtext

sets c,cty,yr,attr;
parameter psd(c,cty,yr,attr) 'production, supply and distribution';

$gdxin psd_alldata.gdx
$loaddc c=dim1 cty=dim2 yr=dim3 attr=dim4 psd

parameter TotalProduction(c,yr);
TotalProduction(c,yr) = sum(cty,psd(c,cty,yr,'Production'));

alias (Commodities,Countries,Year,Attributes,*);
parameter Cube(Commodities,Countries,Year,Attributes);

Cube(c,cty,yr,attr) = psd(c,cty,yr,attr);
Cube(c,'World',yr,'Production') = TotalProduction(c,yr);

$set gdxfile         cube.gdx
$set csvfile         cube.csv

execute_unload "%gdxfile%",Cube;
execute "gdxdump %gdxfile% output=%csvfile% format=csv symb=Cube";

$set xlsfile         %system.fp%pivotcube.xlsx
$set scriptname      script.vbs

execute "cscript %scriptname% //Nologo";

$onecho > %scriptname%
Wscript.Echo "Creating pivot table..."
Wscript.Echo "This will take a minute."
Wscript.Echo ""
Set xl = CreateObject("Excel.Application")
xl.DisplayAlerts=False
Wscript.Echo "  Creating Workbook..."
Set wb = xl.Workbooks.Add

Wscript.Echo "  Creating Query..."
wb.Queries.Add "cube", _
  "let Source = Csv.Document(File.Contents(""%system.fp%%csvfile%""),[Delimiter="",""])," & _
  "  #""Headers"" = Table.PromoteHeaders(Source)," & _
  "  #""Type"" = Table.TransformColumnTypes(#""Headers"",{{""Val"", Number.Type}}) in #""Type"""

Wscript.Echo "  Creating Connection (loading data)..."
Set con = wb.Connections.Add2("Query - cube", _
       "Connection to the 'cube' query in the workbook.", _
       "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=cube" _
       , """cube""", 6, True, False)

Wscript.Echo "  Creating PivotCache..."
Set pc = wb.PivotCaches.Create(2,con,6)

If wb.Sheets.count = 0 Then
  Set sh = wb.Sheets.Add()
Else
  Set sh = wb.Sheets(1)
End If

Wscript.Echo "  Creating PivotTable..."
Set pt = pc.CreatePivotTable(sh.Range("A1"))
pt.SmallGrid = False

Wscript.Echo "  Initial Layout..."
pt.CubeFields("[cube].[Commodities]").Orientation=3
pt.CubeFields("[cube].[Attributes]").Orientation=3
pt.CubeFields("[cube].[Year]").Orientation=2
pt.CubeFields("[cube].[Countries]").Orientation=1

xlSum = -4157
call pt.CubeFields.GetMeasure("[cube].[Val]",xlSum,"Sum of Val")
call pt.AddDataField(pt.CubeFields("[Measures].[Sum of Val]"), "Sum of Val")

pt.PivotFields("[cube].[Commodities].[Commodities]").ClearAllFilters
pt.PivotFields("[cube].[Commodities].[Commodities]").CurrentPageName = _
  "[cube].[Commodities].&[Coffee, Green]"
pt.PivotFields("[cube].[Attributes].[Attributes]").ClearAllFilters
pt.PivotFields("[cube].[Attributes].[Attributes]").CurrentPageName = _
  "[cube].[Attributes].&[Production]"

pt.TableStyle2 = "PivotStyleDark7"

Wscript.Echo "  Saving %xlsfile%..."
wb.SaveAs("%xlsfile%")
wb.Close
xl.Quit

$offecho

The steps are as follows:

  1. Load data into GAMS.
  2. Create a 4-dimensional cube parameter suited for display as a pivot table. The dimensions are commodity, country, year, and attribute.
  3. Export the cube as GDX file.
  4. Convert the GDX file to a CSV file.
  5. Run a VBScript script that talks to Excel and performs the following steps:
    1. Load the data into the data model.
    2. Create a pivot table for this data.
    3. Create an initial layout that makes sense for a user of the data (as opposed to an empty pivot table).

The result will look like:

image

We see three regions. On the left we see the actual pivot table:

image

The rightmost panel allows changing the layout of the pivot table:

image

Note that our 4-dimensional cube parameter has become a 5-column table: the values form a separate column.

The smaller panel in the middle shows the queries in the data model:

image

Data model

The steps that load the CSV data into Excel's data model are actually written in what is called the M language:

  1. Source = Csv.Document(File.Contents("cube.csv"),[Delimiter=","])
  2. #"Headers" = Table.PromoteHeaders(Source)
  3. #"Type" = Table.TransformColumnTypes(#"Headers",{{"Val", Number.Type}}) in #"Type"

Interestingly, we can replay these steps after the fact in the Excel Query Editor. After the first step (Source) we see that the raw CSV data has been imported:

image

The second step promotes the headers:

image

and the last step changes the type of the value column to numeric:

image

File Sizes

The method in (1) requires both an .xlsx file and a .csv file to be present. For this example, we end up with:

02/18/2017  12:49 PM         6,397,317 pivotcube.xlsx
02/18/2017  12:48 PM        62,767,308 cube.csv

The method shown here produces a self-contained .xlsx file. We have:

02/18/2017  12:52 PM         2,958,573 pivotcube.xlsx

I am not sure why this version is even smaller than the pivot table with external CSV data, but apparently the internal data model can be compressed very efficiently. Some of it may have to do with the internal VertiPaq compression engine, which allows for very efficient columnar storage of the data in the data model (3).

References
  1. http://yetanothermathprogrammingconsultant.blogspot.com/2016/04/export-large-result-sets-to-pivot-table.html contains a script to generate a pivot table with a CSV file as an external data source.
  2. http://yetanothermathprogrammingconsultant.blogspot.com/2016/07/excel-pivot-table-and-large-data-sets.html shows how to load CSV data into the data model and create a pivot table by hand.
  3. Alberto Ferrari and Marco Russo, The VertiPaq Engine in DAX, https://www.microsoftpressstore.com/articles/article.aspx?p=2449192&seqNum=3 (chapter from The Definitive Guide to DAX, Business intelligence with Microsoft Excel, SQL Server Analysis Server, and Power BI, Microsoft Press, 2015).

Thursday, February 9, 2017

New book: Testing R Code

Testing R Code book cover

Table of contents

  1. Introduction
  2. Run-Time Testing with assertive
  3. Development-Time Testing with testthat
  4. Writing Easily Maintainable and Testable Code
  5. Integrating Testing into Packages
  6. Advanced Development-Time Testing
  7. Writing Your Own Assertions and Expectations

190 pages


Price: $0.26 per page (Amazon)

I liked it at first sight when leafing through the pages. Will try to use this in my next R project.

Wednesday, February 8, 2017

Failing to use optimal solution as initial point

In a somewhat complex non-linear programming model I observed the following. Here I do two solves in a row:

solve m2 minimizing zrf using dnlp;
solve m2 minimizing zrf using dnlp;

We solve the model, and then in the second solve we use the optimal solution from the first solve as an initial point. This should make the second solve rather easy! But what I see is somewhat different.

The first solve has no problems (I fed it a close-to-optimal initial point). The log looks like:

--- Generating DNLP model m2
--- testmodel6.gms(83981) 156 Mb  22 secs
--- testmodel6.gms(84048) 160 Mb
---   139,443 rows  139,450 columns  746,438 non-zeroes
---   8,504,043 nl-code  467,552 nl-non-zeroes
--- testmodel6.gms(84048) 116 Mb
--- Executing CONOPT: elapsed 0:01:44.102
CONOPT 3         24.7.3 r58181 Released Jul 11, 2016 WEI x86 64bit/MS Windows
 
 
    C O N O P T 3   version 3.17A
    Copyright (C)   ARKI Consulting and Development A/S
                    Bagsvaerdvej 246 A
                    DK-2880 Bagsvaerd, Denmark
 
 
   Iter Phase Ninf   Infeasibility   RGmax    NSB   Step InItr MX OK
      0   0        8.7137917110E-04 (Input point)
 
                   Pre-triangular equations:   9720
                   Post-triangular equations:  129722
 
      1   0        3.0000000000E-05 (After pre-processing)
      2   0        4.6875000000E-07 (After scaling)
      3   0     0  7.3256941462E-17               1.0E+00      F  T
 
** Feasible solution. Value of objective =   0.905233579345
 
   Iter Phase Ninf     Objective     RGmax    NSB   Step InItr MX OK
      4   3        9.0523357374E-01 1.9E-03     6 1.0E+00    3 T  T
      5   3        9.0523357341E-01 1.0E-03     3 1.0E+00    2 T  T
      6   3        9.0523357341E-01 1.3E-12     3
 
** Optimal solution. Reduced gradient less than tolerance.

We are hopeful that the second solve will be even easier, but instead we see:

--- Generating DNLP model m2
--- testmodel6.gms(83981) 69 Mb
*** Error at line 83981: overflow in * operation (mulop)
--- testmodel6.gms(83981) 155 Mb 1 Error  22 secs
--- testmodel6.gms(84049) 116 Mb 1 Error
*** SOLVE aborted

How can an optimal solution be an invalid initial point?

My analysis is the following:

  • The equation we cannot generate is part of what Conopt calls “post-triangular equations”. These equations are set aside by Conopt when solving the model and are reintroduced just before reporting the solution, because Conopt knows they will not change the solution. Typical examples are “accounting rows”: constraints that could have been implemented as assignments in post-processing, as they just calculate some additional information about the solution. In most cases these post-triangular equations do not require the evaluation of gradients.
  • When we generate the model for the second time, GAMS will try to evaluate the gradients. It is here that we encounter the overflow. In this model things are way more complicated, but think of the function \(f(x)=\sqrt{x}\) at \(x=0\): the function itself can be evaluated at \(x=0\), but the gradient cannot (see the small sketch after this list). (Actually GAMS will patch the derivative of \(\sqrt{x}\) at \(x=0\) to fix this, but it is not smart enough to do this for my functions.)
  • This also means there is an architectural flaw here: we should do the presolve before evaluating gradients. Here GAMS evaluates the gradients, and subsequently CONOPT presolves the model. This is exactly the wrong order. Note that AMPL does a presolve when generating the model, as opposed to GAMS, which leaves it to the solver. I believe AMPL is doing the right thing here.
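To illustrate the shape of the problem, here is a tiny hypothetical sketch (not the actual model): the gradient of \(\sqrt{x}\) is \(1/(2\sqrt{x})\), which blows up at \(x=0\), so generating a model with \(x\) at this bound can trigger an evaluation error even though the point itself is feasible. (As noted above, GAMS patches \(\sqrt{x}\) itself, so this exact model may well generate fine; a more involved expression with the same singularity would not.)

variables x, y, obj;
x.lo = 0;
x.l  = 0;   * the function value sqrt(0) is fine...

equations e, eobj;
e..    y =e= sqrt(x);   * ...but the derivative 1/(2*sqrt(x)) is infinite at x=0
eobj.. obj =e= y;

model tiny /all/;
solve tiny using dnlp minimizing obj;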

A work-around is:

solve m2 minimizing zrf using dnlp;
x.l(jd) = max(x.l(jd),1e-6);
solve m2 minimizing zrf using dnlp;

Monday, January 30, 2017

Little trick: random positive definite matrix

Sometimes we want to test a model with some random data. When dealing with quadratic problems we may need to invent a positive definite matrix \(Q\). One simple way to do this is as follows:

image

I.e. in matrix terms: \(Q:=Q^TQ\) (the assignment above is not done element-by-element but rather in parallel). To be precise: \(Q\) will be positive semi-definite, i.e. \(x^TQx\ge 0\) for all \(x\).
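In GAMS the fragment in the image presumably looks something like this sketch (the set size and the uniform range are illustrative):

set i /i1*i10/;
alias (i,j,k);
parameter Q(i,j);

Q(i,j) = uniform(-1,1);
* Q := Q'Q : the right-hand side uses the old values of Q for all (i,j)
Q(i,j) = sum(k, Q(k,i)*Q(k,j));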

This simple device can help when generating random data for convex quadratic models.

Thursday, January 26, 2017

MIP modeling: Selecting subsets

Consider the following problem, originally from (1):

We have a population of items with values having a mean \(\mu\). Furthermore we have a number of subsets of this population. Each subset \(i\) has a known count \(n_i\) and mean \(\mu_i\). It can be assumed the subsets do not overlap: the pairwise intersection between two subsets is always empty. We want to select \(N\) subsets such that the total average of all selected items is as close to \(\mu\) as possible.

OK, let’s see. Let:

\[\delta_i = \begin{cases}1 & \text{if subset $i$ is selected}\\
                                   0 & \text{otherwise}\end{cases}\]

We could model the problem as:

\[\boxed{\begin{align} \min \> & \left| \mu - \frac{\sum_i n_i \mu_i \delta_i}{\sum_i n_i \delta_i} \right|\\
    & \sum_i \delta_i = N \\
    & \delta_i \in \{0,1\} \end{align}} \]

where \(|\cdot|\) indicates the absolute value.

The objective is somewhat complicated. The ratio is the average of the items in the selected subsets. We can see this as follows. The average is the sum of the values of the items in the selected subsets divided by the number of selected items. Of course, the sum of the values in a subset \(i\) is \(n_i \mu_i\), which explains the numerator in the fraction.

This model is quite nonlinear, but there is a way to linearize this. The first thing we can do is introduce a variable \(y\) defined by

\[y=\frac{\sum_i n_i \mu_i \delta_i}{\sum_i n_i \delta_i}\]

and reformulate as:

\[ \sum_i n_i (y \delta_i ) = \sum_i n_i \mu_i \delta_i\]

This is nonlinear because of the multiplication \(y\cdot\delta_i\). However, a multiplication of a binary variable and a continuous variable can be linearized with a little bit of effort (2). Let’s introduce a new variable \(x_i = y \cdot \delta_i\), then we can form the linear inequalities:

\[\begin{align}
    &y^{LO} \cdot \delta_i \le x_i \le y^{UP} \cdot \delta_i\\
    &y-y^{UP}\cdot (1-\delta_i) \le x_i \le y-y^{LO}\cdot (1-\delta_i)
\end{align}\]

Here \(y^{LO},y^{UP}\) is the lower- and upper-bound on \(y\). We can use:

\[\begin{align} &y^{LO} := \min_i \mu_i\\
&y^{UP} := \max_i \mu_i \end{align}\]

With this we can rewrite the condition \(\sum_i n_i (y \delta_i) = \sum_i n_i \mu_i \delta_i\) as:

\[\sum_i n_i x_i = \sum_i n_i \mu_i \delta_i\]

The absolute value can be handled in a standard way, e.g.:

\[\begin{align} \min\>& z\\
                              &-z \le \mu - y \le z \end{align}\]

Now we have all the tools to write the complete linear model:

\[\boxed{\begin{align}
      \min\>& z\\
               &-z \le \mu - y \le z \\
                  &y^{LO} \cdot \delta_i \le x_i \le y^{UP} \cdot \delta_i\\
    &y-y^{UP}\cdot (1-\delta_i) \le x_i \le y-y^{LO}\cdot (1-\delta_i) \\
    &\sum_i n_i x_i = \sum_i n_i \mu_i \delta_i\\
    & \sum_i \delta_i = N \\
    & \delta_i \in \{0,1\} \\
    & z \ge 0 \\
    & y^{LO} \le y \le y^{UP} \\
    & x_i \>\text{free}
\end{align}}\]
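A sketch of this model in GAMS with random data (all names and the data generation are illustrative; since the subsets do not overlap and are assumed to cover the whole population, \(\mu\) can be computed from the subset data):

set i 'subsets' /i1*i20/;
scalar NSelect 'number of subsets to select' /5/;

parameter n(i) 'count', mu(i) 'mean';
n(i)  = uniformint(10,50);
mu(i) = uniform(0,10);

scalars muPop 'population mean', ylo, yup;
muPop = sum(i, n(i)*mu(i))/sum(i, n(i));
ylo = smin(i, mu(i));
yup = smax(i, mu(i));

binary variable delta(i) 'subset i is selected';
variables y 'average of selected items', x(i) 'x(i)=y*delta(i)', z '|mu-y|';

equations
   abs1, abs2        'linearize absolute value'
   xb1(i), xb2(i)    'linearize x(i)=y*delta(i)'
   xb3(i), xb4(i)
   defy              'defines y via x(i)'
   count             'select N subsets';

abs1..   -z =l= muPop - y;
abs2..   muPop - y =l= z;
xb1(i).. ylo*delta(i) =l= x(i);
xb2(i).. x(i) =l= yup*delta(i);
xb3(i).. y - yup*(1-delta(i)) =l= x(i);
xb4(i).. x(i) =l= y - ylo*(1-delta(i));
defy..   sum(i, n(i)*x(i)) =e= sum(i, n(i)*mu(i)*delta(i));
count..  sum(i, delta(i)) =e= NSelect;

y.lo = ylo;  y.up = yup;
z.lo = 0;

model selectSubsets /all/;
solve selectSubsets using mip minimizing z;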

What if the subsets can overlap? That means a data element \(a_k\) can be a member of multiple subsets. If we want it to be counted just once, we need to consider the data elements themselves. This implies the need to introduce a binary variable that indicates whether \(a_k\) is in any of the selected subsets. I believe the high-level model can look like:

\[\boxed{\begin{align}
    \min\> & \left| \mu - \frac{\sum_k a_k \alpha_k}{\sum_k \alpha_k} \right|\\
              & \alpha_k = 1 - \prod_i \left(1 - s_{i,k}\, \delta_i\right)\\
              & \sum_i \delta_i = N \\
              & \delta_i, \alpha_k \in \{0,1\}
\end{align}}\]

Here we have data: \(s_{i,k}=1\) if value \(a_k\) is in subset \(i\). The product construct makes \(\alpha_k=1\) exactly when \(a_k\) is in at least one selected subset, i.e. it is a Boolean “or”. This can be linearized in a similar fashion, as shown below.
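The standard linearization of this “or” condition would be:

\[\begin{align}
&\alpha_k \ge \delta_i && \forall i,k|s_{i,k}=1\\
&\alpha_k \le \sum_{i|s_{i,k}=1} \delta_i && \forall k
\end{align}\]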

References
  1. http://stackoverflow.com/questions/41879502/find-subset-with-similar-mean-as-full-set
  2. Multiplication of a continuous and a binary variable, http://yetanothermathprogrammingconsultant.blogspot.com/2008/05/multiplication-of-continuous-and-binary.html

Tuesday, January 24, 2017

Employee Scheduling III: generating all shifts using the Cplex solution pool

In (1) and (2) all possible shifts for a small Employee Scheduling problem are calculated by programming. In (3) we generate new shifts one by one with a MIP model, as part of a Column Generation algorithm. Here I want to show a different approach: generating all possible feasible shifts with a MIP model using the Cplex solution pool. So this is a post about “modeling instead of programming”.

The model is very similar to the sub-problem in the Column Generation scheme:

binary variables
   w(t)     'work at time t'
   start(t) 'start of shift'
   emp(e)   'select shift from employee e'
;
positive variables
   shiftlen 'length of shift'
;
variable dummy;

equations
   startShift(t)    'start of shift: w(t) goes from 0 to 1'
   oneShiftStart    'exactly one start of shift'
   calcLen          'calculate length of shift'
   oneEmployee      'pick one employee'
   minLength        'observe minimum shift length'
   maxLength        'observe maximum shift length'
   feasibleShift(t) 'make sure shift is feasible'
   dummyObj
;

startShift(t)..    start(t) =g= w(t) - w(t--1);
oneShiftStart..    sum(t,start(t)) =e= 1;
calcLen..          shiftLen =e= sum(t, w(t));
oneEmployee..      sum(e, emp(e)) =e= 1;
minLength..        shiftLen =g= sum(e, minlen(e)*emp(e));
maxLength..        shiftLen =l= sum(e, maxlen(e)*emp(e));
feasibleShift(t).. w(t) =l= sum(canwork(e,t),emp(e));
dummyObj..         dummy =e= 0;

option mip=cplex;

$onecho > cplex.opt
SolnPoolAGap = 0.0
SolnPoolIntensity = 4
PopulateLim = 10000
SolnPoolPop = 2
solnpoolmerge solutions.gdx
$offecho

model solpool /all/;
solpool.optfile=1;
solve solpool using mip minimizing dummy;

Actually this model is somewhat simpler, as in the CG sub-problem we needed a rather complicated objective. Here we use a dummy objective: we just need to find feasible shifts.

This model finds a shift encoded by the binary variable \(w_t\) and a corresponding employee denoted by \(emp_e \in \{0,1\}\) (the selected employee will have a value of one).

The start of a shift is easily found: we look for the pattern \(w_{t-1},w_t = 0,1\). Note that \(t=1\) is special: the shift may be a continuation of a shift with \(w_{24}=1\). This is the reason the GAMS model uses “start(t) =g= w(t) - w(t--1)” with a circular lag operator -- (a small standalone sketch of this operator follows below). To make sure shifts are contiguous we only need to require \(\sum_t start_t \le 1\) (and as \(\sum_t start_t = 0\) cannot happen, we can just as well write \(\sum_t start_t = 1\)).
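To recap how the circular lag works, here is a tiny hypothetical sketch (unrelated to the model data):

set t /t1*t4/;
parameter p(t), prev(t);

p(t) = ord(t);
prev(t) = p(t--1);   * circular: t1 wraps around to t4
display prev;
* gives prev('t1')=4, prev('t2')=1, prev('t3')=2, prev('t4')=3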

Equation feasibleShift checks that a shift only uses allowed time slots. Here we use the set canwork(e,t), indicating which time periods are allowed for a given employee (see (2)). We require that if canwork=0 for the selected employee, the corresponding \(w_t=0\).

The Cplex solution pool (4) allows us to find all feasible integer solutions of a MIP very quickly. This technology is used here to find all feasible shifts, and indeed we find:

image

We find the same 1295 shifts as before, although in a different order (the order is not important for our purposes).

Now that we have this complete collection of feasible shifts, we can solve our scheduling model:

\[\begin{align}\min\>&\sum_s c_s x_s\\
       &\sum_{s|E(e,s)} x_s \le 1 \>\forall e\\
       &\sum_{s|S(s,t)} x_s \ge R_t \> \forall t\\
        &x_s \in\{0,1\}\end{align}\]

With this approach we just use one MIP solve to find all feasible shifts and a second MIP solve to find the optimal schedule.

References

  1. Loren Shure, Generating an Optimal Employee Work Schedule Using Integer Linear Programming, January 6, 2016, http://blogs.mathworks.com/loren/2016/01/06/generating-an-optimal-employee-work-schedule-using-integer-linear-programming/ describes the problem with a Matlab implementation.
  2. Employee Scheduling I : Matlab vs GAMS, http://yetanothermathprogrammingconsultant.blogspot.com/2017/01/employee-scheduling-i-matlab-vs-gams.html has a GAMS version of the model in (1).
  3. Employee Scheduling II : Column Generation, http://yetanothermathprogrammingconsultant.blogspot.com/2017/01/employee-scheduling-ii-column-generation.html
  4. IBM, What is the solution pool?, http://www.ibm.com/support/knowledgecenter/SSSA5P_12.7.0/ilog.odms.cplex.help/CPLEX/UsrMan/topics/discr_optim/soln_pool/02_soln_pool_defn.html

Monday, January 23, 2017

Employee Scheduling II: column generation

In (1) and (2) an Employee Scheduling model is considered where all possible shifts are enumerated before we solve the actual model. Although such models tend to solve quickly (especially considering their size), generating all possible shifts may not be an option for larger problems. One possible way to attack this is a traditional column generation approach. This method generates attractive candidate shifts and adds them to the problem.

Our original problem was:

\[\begin{align}\min\>&\sum_s c_s x_s\\ &\sum_{s|E(e,s)} x_s \le 1 \>\forall e\\ &\sum_{s|S(s,t)} x_s \ge R_t \>\forall t\\ &x_s \in\{0,1\}\end{align}\]

where

\[x_s = \begin{cases}1 & \text{if shift $s$ is selected}\\  0 & \text{otherwise} \end{cases}\]

Instead of generating all possible shifts in advance, we start with a small initial set \(s \in S\). We want this initial set to form a feasible solution for the problem. To make this easier we rewrite the problem as:

\[\begin{align}
\min\> & \sum_s c_s x_s + \sum_t c_t s_t\\ &\sum_{s|E(e,s)} x_s \le 1\>\forall e\\ &\sum_{s|S(s,t)} x_s + s_t \ge R_t\>\forall t \\ &x_s \in\{0,1\}\\ & s_t \ge 0\end{align}\]

Here \(s_t\) is a slack variable with a penalty \(c_t\). We can also view this as extra, flexible, hourly personnel hired at an hourly rate of \(c_t\) to cover staffing shortages. This scheme is sometimes called making the model “elastic”.

The column generation algorithm iterates between two different problems. First we have a master problem, which is essentially a relaxed version of the elastic model we just discussed: we replace the condition \(x_s \in \{0,1\}\) by \(x_s \in [0,1]\). With this we have an LP (and thus we can obtain duals). Second, there is a sub problem that calculates a candidate column (or candidate shift).

image  

The sub problem uses a particular objective: it minimizes the reduced cost of the new LP column \(A_j\) with respect to the master problem. I.e. the objective is basically

\[\min\> \sigma_j = c_j - \pi^T A_j\]

where \(\pi\) are the LP duals of the master problem.

This scheme is called column generation as we grow the number of variables \(x_s\) during the algorithm:

image

Note that the columns \(s_t\) stay fixed.

We stop iterating as soon as we can no longer find a new column with \(\sigma_j < 0\), i.e. a “profitable” column. We now have a collection of shifts and an LP solution, which means possibly fractional variable values. To turn this into a usable schedule we need to solve one more MIP: we reintroduce the condition \(x_s \in\{0,1\}\). The complete algorithm can look like:

  1. Generate an initial (small) set \(S\) of feasible shifts
  2. Solve the relaxed master problem (LP)
  3. Solve the sub problem
  4. If \(\sigma_j < 0\) add shift to the set \(S\) and go to step 2
  5. Solve master problem as MIP

Initial Set

For the problem in (1) we generate 100 random feasible shifts. The first 10 shifts are as follows:

image

Note that we have two shifts here for employee White. This is no problem: the master problem will select the most suitable one to include in the schedule. All these shifts are feasible: they obey the minimum and maximum hours and do not use hours in which an employee is not available. If the shifts are not sufficient to cover all staffing requirements, no problem: the LP model will start hiring the expensive hourly “emergency” workers. Indeed, for these 100 random shifts we need some extra hourly help:

----    214 VARIABLE short.L  staffing shortage in time t

t5  1.000,    t6  1.000,    t7  1.000,    t9  2.000,    t13 1.000,    t15 1.000,    t16 1.000,    t17 1.000
t18 1.000,    t19 1.000,    t23 1.000


Master Problem

The master problem expressed in GAMS looks like:

set
   s0 'shifts' /s1*s1000/
   s(s0) 'dynamic subset will grow during algorithm'
   esmap(e,s0) 'mapping between employee and shift'
   shifts(s0,t) 'data structure to hold shifts'
;

parameters
   a(t,s0) 'matrix growing in dimension s'
   c(s0)   'cost vector growing in dimension s'
;

scalar
   shortcost 'cost to hire extra hourly staff' /200/
;

binary variable x(s0) 'main variable: shift is selected';
positive variable short(t) 'staffing shortage in time t';
variable cost;

equations
   masterObj      'calculate total cost'
   oneShift(e)    'each employee can do at most one shift'
   coverPeriod(t) 'cover each period'
;

masterObj..      cost =e= sum(s, c(s)*x(s)) + sum(t,shortcost*short(t));
oneShift(e)..    sum(esmap(e,s), x(s)) =l= 1;
coverPeriod(t).. sum(shifts(s,t), x(s)) + short(t) =g= requirements(t);

model Master /masterObj,oneShift,coverPeriod/;


Sub problem

The sub problem is somewhat more difficult. We need to find a feasible shift and minimize the reduced cost \(\sigma_j\).

set sh 'feasible shift hours' /sh2*sh8/;
parameter shlen(sh);
shlen(sh) = ord(sh)+1;

binary variables
   w(t)     'work at time t'
   start(t) 'start of shift'
   emp(e)   'select shift from employee e'
   shft(sh) 'select feasible shift length'
   se(sh,e) 'select (sh,e) combination'
;
variables
   shiftlen 'length of shift'
   wagecost
   redcost  'reduced cost'
;

equations
   startShift(t)    'start of shift: w(t) goes from 0 to 1'
   oneShiftStart    'exactly one start of shift'
   calcLen          'calculate length of shift'
   oneEmployee      'pick one employee'
   minLength        'observe minimum shift length'
   maxLength        'observe maximum shift length'
   feasibleShift(t) 'make sure shift is feasible'
   calcShift1       'select shift length'
   calcShift2       'id'
   secomb(sh,e)     'shift/employee combination'
   calcwage         'calculate cost of shift'
   subObj           'objective'
;

startShift(t)..    start(t) =g= w(t) - w(t--1);
oneShiftStart..    sum(t,start(t)) =e= 1;
calcLen..          shiftLen =e= sum(t, w(t));
oneEmployee..      sum(e, emp(e)) =e= 1;
minLength..        shiftLen =g= sum(e, minlen(e)*emp(e));
maxLength..        shiftLen =l= sum(e, maxlen(e)*emp(e));
feasibleShift(t).. w(t) =l= sum(canwork(e,t),emp(e));
calcShift1..       sum(sh,shft(sh)) =e= 1;
calcShift2..       sum(sh,shlen(sh)*shft(sh)) =e= shiftlen;
secomb(sh,e)..     se(sh,e) =g= shft(sh) + emp(e) - 1;
calcwage..         wageCost =e= sum((sh,e),shlen(sh)*wage(e)*se(sh,e));
subObj..           redcost =e= wageCost - sum(e,emp(e)*oneShift.m(e))
                            - sum(t, w(t)*coverPeriod.m(t));

model sub /subObj,startShift,oneShiftStart,calcLen,oneEmployee,minLength,
           maxLength,feasibleShift,calcShift1,calcShift2,secomb,calcWage/;

Some of the modeling issues with the sub problem are explained in more detail in (3).

Column Generation Algorithm

The CG algorithm itself can look like:

sets
   iter /iter1*iter100/
;

scalars
   continue /1/
;
parameter
   masterObjVal(iter)
;

loop(iter$continue,

   solve master using rmip minimizing cost;
   masterObjVal(iter) = cost.L;

   solve sub using mip minimizing redcost;

   if(redcost.l < -0.001,
      esmap(e,nexts)$(emp.l(e)>0.5) = yes;
      shifts(nexts,t)$(w.l(t)>0.5) = yes;
      c(nexts) = wagecost.L;
      s(nexts) = yes;
      nexts(s0) = nexts(s0-1);
   else
      continue = 0;
   );

);

solve master using mip minimizing cost;


Results

We need 36 cycles before we can conclude there are no more interesting columns. The objective of the relaxed master problem decreases to an LP cost of $4670. Of course the LP objective is non-increasing: adding a column (i.e. adding a shift) can only help.

image

After these 36 iterations we have collected 135 shifts (remember we started the first iteration with our 100 randomly generated initial shifts). This compares quite favorably to the 1295 shifts we obtained by enumerating all possible shifts. When we reintroduce the integer restrictions \(x_s\in \{0,1\}\), the objective value deteriorates to $4675. This is just a little worse than the globally optimal objective of $4670. Indeed, this method does not guarantee finding the absolute best integer solution; however, it often finds very good solutions.

The final schedule looks like:

image

We don’t need any expensive hourly “emergency” staff, i.e. \(s_t = 0\). Again we have no slack in the system: we precisely meet the staffing requirements every hour:

image


References

  1. Loren Shure, Generating an Optimal Employee Work Schedule Using Integer Linear Programming, January 6, 2016, http://blogs.mathworks.com/loren/2016/01/06/generating-an-optimal-employee-work-schedule-using-integer-linear-programming/ describes the problem with a Matlab implementation.
  2. Employee Scheduling I: Matlab vs GAMS, http://yetanothermathprogrammingconsultant.blogspot.com/2017/01/employee-scheduling-i-matlab-vs-gams.html has a GAMS version of the model in (1).
  3. Employee Scheduling III: generating all shifts using the Cplex solution pool, http://yetanothermathprogrammingconsultant.blogspot.com/2017/01/employee-scheduling-iii-generating-all.html
  4. Erwin Kalvelagen, Column Generation with GAMS, http://www.amsterdamoptimization.com/pdf/colgen.pdf has some background on Column Generation in GAMS.