Yet Another Math Programming Consultant: May 2020

Saturday, May 30, 2020

Pictures don't match

The Financial Times had an interesting picture of per capita excess deaths. In many respects, this is the best measure to compare how severe different countries are hit by the Coronavirus.

Here is the post [1]:

However, when we go to the corresponding FT page [2], we actually see a different graph:

I suspect the first picture is wrong w.r.t. Spain.

Indeed in a PS of the FT page [2] we see:

This article has been amended to take into account a one-off revision to Spanish data on Thursday. This meant the UK now has the second-highest death rate from coronavirus after Spain rather than the highest rate as originally reported. This article has been modified to replace a chart linking excess deaths to lockdown dates with one linking excess deaths per million.

Looking at the complete picture:

I am wondering about the difference between the first (total excess deaths per capita) with the last (percentage difference with average). I expected these to be more similar. Could it be that Peru has a low death rate (younger population)? Again the text has a note on this:

Peru has seen a large rise in deaths this year partly because it has had to battle other diseases, in addition to coronavirus, with its overstretched health system.

References

UK has Europe's highest excess death toll and second highest in the world, https://www.youtube.com/post/UgxDGaBKDom1wrc91fp4AaABCQ
UK suffers second-highest death rate from coronavirus, https://www.ft.com/content/6b4c784e-c259-4ca4-9a82-648ffde71bf0

Installing Pyomo on colab by google

Google has a nice online environment (https://colab.research.google.com/) to run Python code. So let's try Pyomo...

The installation instructions are simple:

The first step was easy:

However, the second step was unsuccessful:

This is of course a little bit disappointing. I am not aware of a workaround.

Monday, May 25, 2020

The Miracle Sudoku

Introduction

In [1] another strange Sudoku variant is presented.

The givens are just:

Obviously, more than just the standard Sudoku constraints are needed to have just one unique feasible solution. So, for this problem we have the following rules:

Standard Sudoku rules apply. So, for each row, column, and sub-block (a.k.a. nonet [3]), we have uniqueness constraints (each cell in a region has a unique number between 1 and 9).
Borrowing from chess: any two cells corresponding to a knight's move, must have different values.
Similarly, any two cells that form a king's move, must also be different.
Orthogonally adjacent cells must be non-consecutive.

Standard Sudoku setup

When we want to solve Sudokus, the easiest approach is to define the following binary decision variables [4]: \[x_{i,j,k} = \begin{cases} 1 & \text{if cell $(i,j)$ has value $k$} \\ 0 & \text{otherwise}\end{cases}\] Here $k \in \{1,\dots,9\}$. We have 27 areas we need to check for unique values: rows, columns and nonets. We can organize this as a set: \[u_{a,i,j}\>\text{exists if and only if area $a$ contains cell $(i,j)$}\] This is data. We also have these two given cells, i.e. fixed variables for these cells. The resulting MIP model can look like [4]:

Mixed-Integer Programming Model for standard Sudoku
\[\begin{align}\min\>& 0 && && \text{Dummy objective}\\ & \sum_k \color{darkred}x_{i,j,k} = 1 &&\forall i,j && \text{One $k$ in each cell}\\ & \sum_{i,j\|\color{darkblue}u_{a,i,j}} \color{darkred}x_{i,j,k} = 1 && \forall a,k && \text{Unique values in each area}\\ & \color{darkred}x_{i,j,k} = 1 && \text{where } k=\color{darkblue}{\mathit{Given}}_{i,j} &&\text{Fix given values}\\ &\color{darkred}x_{i,j,k} \in \{0,1\} \end{align}\]

Mixed-Integer Programming Model for standard Sudoku

\[\begin{align}\min\>& 0 && && \text{Dummy objective}\\ & \sum_k \color{darkred}x_{i,j,k} = 1 &&\forall i,j && \text{One $k$ in each cell}\\ & \sum_{i,j|\color{darkblue}u_{a,i,j}} \color{darkred}x_{i,j,k} = 1 && \forall a,k && \text{Unique values in each area}\\ & \color{darkred}x_{i,j,k} = 1 && \text{where } k=\color{darkblue}{\mathit{Given}}_{i,j} &&\text{Fix given values}\\ &\color{darkred}x_{i,j,k} \in \{0,1\} \end{align}\]

During post-processing, we can calculate the completed grid using the optimal values of $x$: \[v_{i,j} = \sum_k k \cdot x^*_{i,j,k}\] or we can include them as "accounting rows" to the model. Accounting rows are constraints that have the function to calculate some quantities for reporting purposes.

The implementation in GAMS uses a set u(a,i,j)that is populated as:

set
a 'areas' /r1*r9,c1*c9,s1*s9/
i(a) 'rows' /r1*r9/
j(a) 'columns' /c1*c9/
s(a) 'squares' /s1*s9/
u(a,i,j) 'areas with unique values'
;

* populate u(a,i,j): u(a,i,j)=YES if area a contains cell (i,j)
* see https://yetanothermathprogrammingconsultant.blogspot.com/2016/10/mip-modeling-from-sudoku-to-kenken.html
u(i,i,j) = yes;
u(j,i,j) = yes;
u(s,i,j)$(ord(s)=3*ceil(ord(i)/3)+ceil(ord(j)/3)-3) = yes;

The model equations look like:

binary variable x(i,j,k);
* v0(i,j) are the givens
x.fx(i,j,k)$(v0(i,j)=ord(k)) = 1;
variable z 'dummy objective';
integer variable v(i,j);
v.lo(i,j) = 1;
v.up(i,j) = 9;

equation
   dummy 'objective'
   unique(a,k) 'all-different'
   values(i,j) 'one value per cell'
   eqv(i,j) 'calculate values'
;

dummy.. z =e= 0;
unique(a,k).. sum(u(a,i,j), x(i,j,k)) =e= 1;
values(i,j).. sum(k, x(i,j,k)) =e= 1;
eqv(i,j).. v(i,j) =e= sum(k, ord(k)*x(i,j,k));

model sudoku /all/;
solve sudoku minimizing z using mip;

We can test this approach with the above data. We will actually find a feasible solution, but of course it will not be unique. I believe there are likely millions of different solutions (or more). Counting them is not that easy. I tried to give it a shot with a solution pool approach: after setting a limit if 1,000,000 solutions, the solver stopped after reaching this limit. So, there are more than 1 million solutions for this (partial) problem.

Knight's jumps

To model how knight's moves are covering cells, we can have a look at [5].

To place knights we set up a set $\mathit{jump}_{i,j,i',j'}$ indicating if we can jump from cell $(i,j)$ to cell $(i',j')$. If we can jump from $(i,j) \rightarrow (i',j')$ we can also jump from $(i',j') \rightarrow (i,j)$. We don't want to check both cases. To prevent this double check, we only need to look forward. So for each $(i,j)$ we need to consider just four cases:

	$j$	$j+1$	$j+2$
$i-2$		$x_{i-2,j+1}$
$i-1$			$x_{i-1,j+2}$
$i$	$x_{i,j}$
$i+1$			$x_{i+1,j+2}$
$i+2$		$x_{i+2,j+1}$

Note that near the border we may have fewer than four cases. In GAMS we can populate the set $\mathit{jump}$ in a straightforward manner:

alias (i,ii),(j,jj);
set jump(i,j,ii,jj);

jump(i,j,i-2,j+1) = yes;
Jump(i,j,i-1,j+2) = yes;
Jump(i,j,i+1,j+2) = yes;
Jump(i,j,i+2,j+1) = yes;

There are 224 elements in the set $\mathit{jump}$.

Now comes the more complicated part: how to make sure that the values of cell $(i,j)$ and $(i',j')$ are not the same. There are basically two approaches:

Compare the values $v_{i,j}$ and $v_{i',j'}$. The constraint becomes \[|v_{i,j}-v_{i',j'}| \ge 1 \>\> \forall i,j,i',j'|\mathit{jump}(i,j,i',j')\] This can be linearized using \[v_{i,j}-v_{i',j'} \ge 1 \textbf{ or } v_{i',j'}-v_{i,j} \ge 1\] or \[\begin{align}& v_{i,j}-v_{i',j'} \ge 1 - M\cdot \delta_{i,j,i',j'} \\ & v_{i',j'}-v_{i,j} \ge 1- M (1-\delta_{i,j,i',j'}) \\ &\delta_{i,j,i',j'} \in \{0,1\}\end{align}\] Here $M$ is a large enough constant ($M=10$ should suffice).
Directly work with $x_{i,j,k}$ and $x_{i',j',k}$. The constraint should be \[x_{i,j,k} x_{i',j',k} = 0 \>\> \forall k, \forall i,j,i',j'|\mathit{jump}(i,j,i',j') \] This non-linear constraint can be linearized as \[x_{i,j,k}+x_{i',j',k}\le 1\] This clearly is the easier route.

In GAMS this can look like:

equation Different(i,j,ii,jj,k) 'cells forming a knights move should be different';

Different(jump(i,j,ii,jj),k)..
x(i,j,k) + x(ii,jj,k) =l= 1;

Kings's moves

Here we want to implement a similar restriction as for the knight's move. As the constraints are the same, we just have to augment the set $\mathit{jump}$ a bit.

alias (i,ii),(j,jj);
set jump(i,j,ii,jj) 'knights or kings move';

* knight
jump(i,j,i-2,j+1) = yes;
jump(i,j,i-1,j+2) = yes;
jump(i,j,i+1,j+2) = yes;
jump(i,j,i+2,j+1) = yes;
* king
jump(i,j,i+1,j+1) = yes;
jump(i,j,i-1,j+1) = yes;

After this, the set $\mathit{jump}$ has grown from 224 to 352 elements.

Orthogonal neighbors can't be consecutive

First we need to build a set that models "orthogonal neighbors". Similar to our set $\mathit{jump}$, we want to prevent comparing pairs twice. So given a cell $(i,j)$, we only need to consider two neighbors.

	$j$	$j+1$
$i-1$	$x_{i-1,j}$
$i$	$x_{i,j}$	$x_{i,j+1}$
$i+1$

This is coded in GAMS as:

set nb(i,j,ii,jj) 'orthogonal neighbors';
nb(i,j,i-1,j) = yes;
nb(i,j,i,j+1) = yes;

This set has 144 elements.

The constraints are: \[\begin{align}&x_{i,j,k} + x_{i',j',k+1} \le 1\\&x_{i,j,k} + x_{i',j',k-1} \le 1\end{align}\>\> \forall i,j,i',j'|nb(i,j,i',j'), \forall k\]

The GAMS representation is:

equations
   NonConsecutive1(i,j,ii,jj,k) 'orthogonal neighbors constraint'
   NonConsecutive2(i,j,ii,jj,k) 'orthogonal neighbors constraint'
;

NonConsecutive1(nb(i,j,ii,jj),k+1)..
     x(i,j,k) + x(ii,jj,k+1) =l= 1;
NonConsecutive2(nb(i,j,ii,jj),k-1)..
     x(i,j,k) + x(ii,jj,k-1) =l= 1;

Detail: The last index of the equation definition has a $k+1$ or $k-1$. This tells GAMS not to generate all possile constraints for all $k$ but rather one less. This is not strictly needed: in GAMS addressing outside the domain results in a zero. But this trick will prevent singleton constraints $x_{i,j,k} + 0 \le 1$ to be generated.

Results

The complete model has 811 variables (808 of them binary or integer). The number of constraints is 5,878. The results look like:

----    111 VARIABLE v.L  

            c1          c2          c3          c4          c5          c6          c7          c8          c9

r1           4           8           3           7           2           6           1           5           9
r2           7           2           6           1           5           9           4           8           3
r3           1           5           9           4           8           3           7           2           6
r4           8           3           7           2           6           1           5           9           4
r5           2           6           1           5           9           4           8           3           7
r6           5           9           4           8           3           7           2           6           1
r7           3           7           2           6           1           5           9           4           8
r8           6           1           5           9           4           8           3           7           2
r9           9           4           8           3           7           2           6           1           5

The problem solves in the presolve phase, i.e. 0 iterations and 0 nodes.

Uniqueness of the solution

To prove the uniqueness of the solution, we add a cut that forbids the current solution $x^*$: \[\sum_{i,j,k} x^*_{i,j,k} x_{i,j,k} \le 9^2-1\] The resulting model is infeasible. This means that the solution is indeed unique.

Conclusion

This Sudoku variant combines standard Sudoku rules with additional constraints, some of them borrowed from chess. This makes the modeling at bit trickier. In the development of the model, I have placed much of the logic in sets. This has the advantage that the constraints are fairly simple. In general, sets are easier to debug than constraints: we can display and debug sets before we have a working model.

Both the mathematical model and the GAMS representation is interesting. The mathematical formulation shows we can model this without extra (discrete) variables. The GAMS implementation details assembling sets that simplify constraints.

Sudoku models typically solve extremely fast: they are solved by the presolver. This model is no exception.

References

The Miracle Sudoku, https://youtu.be/yKf9aUIxdb4
Puzzled man solving 'miracle' sudoku becomes YouTube sensation,https://www.theguardian.com/lifeandstyle/2020/may/22/cracking-the-cryptic-puzzled-uk-men-become-internet-sensation-with-sudoku-channel
Glossary of Sudoku, https://en.wikipedia.org/wiki/Glossary_of_Sudoku
MIP Modeling: from Sudoku to KenKen via Logarithms, https://yetanothermathprogrammingconsultant.blogspot.com/2016/10/mip-modeling-from-sudoku-to-kenken.html
Chess and solution pool, https://yetanothermathprogrammingconsultant.blogspot.com/2018/11/chess-and-solution-pool.html

Deep learning in practice

A wonderfully mystical menu

or maybe, as someone suggested, an expression of modern Yemenite poetry. Presumably the result of some automatic translation tool.

https://twitter.com/vladadraws/status/1264355635434418177

Thursday, May 21, 2020

A small time arbitrage model

In [1] the following question was posed:

We have monthly prices for a product and I have predictions for the next 12 months. The idea is to exploit this and buy when prices are low and sell when they are high. We have limits on how much we can buy and sell and there are also inventory capacity constraints. What plan is the most profitable?

This is an interesting little problem. We will use R to experiment with this.

Data

The data set is as follows:

# data
price = c(12, 11, 12, 13, 16, 17, 18, 17, 18, 16, 17, 13)
capacity = 25
max_units_buy = 4
max_units_sell = 8

Model

The main vehicle we will use is the well-known inventory balance equation:\[\mathit{inv}_t = \mathit{inv}_{t-1} + \mathit{buy}_t - \mathit{sell}_t\] I assume that the initial inventory is \[\mathit{inv}_0 = 0\]

The optimization model for this problem can look like:

Model 1
\begin{align}\max&\sum_t \color{darkblue}{\mathit{price}}_t \cdot(\color{darkred}{\mathit{sell}}_t-\color{darkred}{\mathit{buy}}_t)\\ &\color{darkred}{\mathit{inv}}_t = \color{darkred}{\mathit{inv}}_{t-1} + \color{darkred}{\mathit{buy}}_t - \color{darkred}{\mathit{sell}}_t\\ &\color{darkred}{\mathit{inv}}_0 = 0\\ &\color{darkred}{\mathit{inv}}_t \in \{0,1,\dots,\color{darkblue}{\mathit{capacity}}\}\\ &\color{darkred}{\mathit{buy}}_t \in \{0,1,\dots,\color{darkblue}{\mathit{maxbuy}}\}\\ &\color{darkred}{\mathit{sell}}_t \in \{0,1,\dots,\color{darkblue}{\mathit{maxsell}}\}\\ \end{align}

Model 1

I used here integer decision variables: the assumption is we cannot deal with fractional values. Obviously, we can easily relax this and work with continuous variables.

Implimentation

To try this out, I am using CVXR [2] with the Glpk solver [3].

CVXR is using a matrix oriented notation. It is interesting to see how this can be applied to our inventory balance equation. The main trick is that we use a Lag operator implemented as a matrix-vector multiplication: \[ L \cdot \mathit{inv}\]

The Lag operator. We know that multiplication by an identity matrix gives us the same vector: \[ v = I \cdot v \] If we shift the diagonal down by one, we get what we want. Let's do a little experiment:

> n <- 5
> v <- 1:n
> v
[1] 1 2 3 4 5
> # matrix-vector multiplication with identity matrix
> diag(n) %*% v
     [,1]
[1,]    1
[2,]    2
[3,]    3
[4,]    4
[5,]    5
> # lag operator
> L = cbind(rbind(0,diag(n-1)),0)
> L
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    1    0    0    0    0
[3,]    0    1    0    0    0
[4,]    0    0    1    0    0
[5,]    0    0    0    1    0
> # matrix-vector multiplication with Lag operator
> L %*% v
     [,1]
[1,]    0
[2,]    1
[3,]    2
[4,]    3
[5,]    4
>

We see that $L$ is indeed the identity matrix but shifted one position down. When we multiply $Lv$ we see that the result vector is also shifted one down. The newly inserted element at the first position is zero, which is exactly what we want.

Note: if we want a non-zero initial inventory we need to add a vector $\mathit{initinv}$ with the first element being the initial inventory and the other elements all zero. Then we can write the inventory balance equation in matrix notation as \[\mathit{inv} = \mathit{initinv} + L \cdot \mathit{inv} + \mathit{buy} - \mathit{sell}\] In the R code below, we assumed this was not needed.

With this, we are ready to solve our problem:

> library(CVXR)
> 
> # data
> price = c(12, 11, 12, 13, 16, 17, 18, 17, 18, 16, 17, 13)
> capacity = 25
> max_units_buy = 4
> max_units_sell = 8
> 
> # number of time periods
> NT <- length(price)
> 
> # Decision variables
> inv = Variable(NT,integer=T)
> buy = Variable(NT,integer=T)
> sell = Variable(NT,integer=T)
> 
> # Lag operator
> L = cbind(rbind(0,diag(NT-1)),0)
> 
> # optimization model
> problem <- Problem(Maximize(sum(price*(sell-buy))),
+                    list(inv == L %*% inv + buy - sell,
+                         inv >= 0, inv <= capacity,
+                         buy >= 0, buy <= max_units_buy,
+                         sell >= 0, sell <= max_units_sell))
> result <- solve(problem,verbose=T)
GLPK Simplex Optimizer, v4.47
84 rows, 36 columns, 119 non-zeros
*     0: obj =  0.000000000e+000  infeas = 0.000e+000 (12)
*    35: obj = -1.040000000e+002  infeas = 0.000e+000 (0)
OPTIMAL SOLUTION FOUND
GLPK Integer Optimizer, v4.47
84 rows, 36 columns, 119 non-zeros
36 integer variables, none of which are binary
Integer optimization begins...
+    35: mip =     not found yet >=              -inf        (1; 0)
+    35: >>>>> -1.040000000e+002 >= -1.040000000e+002   0.0% (1; 0)
+    35: mip = -1.040000000e+002 >=     tree is empty   0.0% (0; 1)
INTEGER OPTIMAL SOLUTION FOUND
> cat("status:",result$status)
status: optimal
> cat("objective:",result$value)
objective: 104
> # print results
> data.frame(price,
+            buy=result$getValue(buy),
+            sell=result$getValue(sell),
+            inv=result$getValue(inv))
   price buy sell inv
1     12   4    0   4
2     11   4    0   8
3     12   4    0  12
4     13   4    0  16
5     16   4    0  20
6     17   0    8  12
7     18   0    8   4
8     17   4    0   8
9     18   0    8   0
10    16   4    0   4
11    17   0    4   0
12    13   0    0   0
>

We see indeed we are buying when things are cheap and selling when prices are high. This gives us a net profit of 104. The inventory capacity is never binding here.

A complication

In a subsequent post, a non-trivial wrinkle was added to the problem [4].

price = c(12, 11, 12, 13, 16, 17, 18, 17, 18, 16, 17, 13)
capacity = 25
max_units_buy_30 = 4 # when inventory level is lower then 30% it is possible to buy 0 to 4 units
max_units_buy_65 = 3 # when inventory level is between 30% and 65% it is possible to buy 0 to 3 units
max_units_buy_100 = 2 # when inventory level is between 65% and 100% it is possible to buy 0 to 2 units
max_units_sell_30 = 4 # when inventory level is lower then 30% it is possible to sell 0 to 4 units
max_units_sell_70 = 6 # when inventory level is between 30% and 70% it is possible to sell 0 to 6 units
max_units_sell_100 = 8 # when inventory level is between 70% and 100% it is possible to sell 0 to 8 units

This requires a little bit of work.

Time

The first thing to consider is how time is handled in the model. In Model 1, we just used a time index $t$ and did not really say much about it. Here, we need to be a bit more precise.

The variables buy and sell are what we call flow variables. Buying and selling takes place during time period $t$. Inventory, however, is measured at a specific point in time. In our case, this is at the end of period $t$. This is called a stock variable.

Looking at the picture we should put limits on $\mathit{buy}_t$ and $\mathit{sell}_t$ based on the inventory level at the end of the previous period: $\mathit{inv}_{t-1}$.

Segmenting inventory levels

From the question, we see that we have different intervals:

For buying, we have: 0%-30%, 30%-65% and 65%-100%
For selling, we have: 0%-30%, 30%-70% and 70%-100%

I prefer to deal with one set of intervals, so we combine this into:

0%-30%, 30%-65%, 65%-70% and 70%-100%

The idea is to introduce a binary variable $\delta_{k,t}$ indicating in which segment $k$ we are. Obviously, we can only be in one segment, so: \[\begin{align} &\sum_k \delta_{k,t} = 1 && \forall t\\ & \delta_{k,t} \in \{0,1\}\end{align}\] So depending on which $\delta_{k,t}$ is turned on, we have different lower- and upperbounds on the inventory levels.

This means we can write:\[\sum_k \mathit{invlb}_k \cdot \delta_{k,t}\le \mathit{inv}_{t-1} \le \sum_k \mathit{invub}_k \cdot \delta_{k,t}\] Notice that we put the bounds on the inventory level at the end of the previous period.

Limiting buying and selling

Here we need to link the values for $\delta_{k,t}$ to bounds on buying and selling. This is very similar to what we did with respect to inventory: \[\begin{align}\mathit{buy}_t \le \sum_k \mathit{buyub}_k \cdot \delta_{k,t} && \forall t \\\mathit{sell}_t \le \sum_k \mathit{sellub}_k \cdot \delta_{k,t} && \forall t\end{align}\]

Model

We now have all the pieces to write down the complete model.

Model 2
\begin{align}\max&\sum_t \color{darkblue}{\mathit{price}}_t \cdot(\color{darkred}{\mathit{sell}}_t-\color{darkred}{\mathit{buy}}_t)\\ &\color{darkred}{\mathit{inv}}_t = \color{darkred}{\mathit{inv}}_{t-1} + \color{darkred}{\mathit{buy}}_t - \color{darkred}{\mathit{sell}}_t&& \forall t\\ &\color{darkred}{\mathit{inv}}_0 = 0\\ &\color{darkred}{\mathit{inv}}_{t-1} \ge \sum_k \color{darkblue}{\mathit{invlb}}_k \cdot \color{darkred}\delta_{k,t} && \forall t \\ &\color{darkred}{\mathit{inv}}_{t-1} \le \sum_k \color{darkblue}{\mathit{invub}}_k \cdot \color{darkred}\delta_{k,t} && \forall t \\ &\color{darkred}{\mathit{buy}}_t \le \sum_k \color{darkblue}{\mathit{buyub}}_k \cdot \color{darkred}\delta_{k,t} && \forall t \\ &\color{darkred}{\mathit{sell}}_t \le \sum_k \color{darkblue}{\mathit{sellub}}_k \cdot \color{darkred}\delta_{k,t} && \forall t \\ &\sum_k \color{darkred}\delta_{k,t} = 1 && \forall t\\ &\color{darkred}{\mathit{inv}}_t \in \{0,1,\dots,\color{darkblue}{\mathit{capacity}}\}\\ &\color{darkred}{\mathit{buy}}_t \in \{0,1,\dots\}\\ &\color{darkred}{\mathit{sell}}_t \in \{0,1,\dots\}\\ &\color{darkred}\delta_{k,t} \in \{0,1\} \end{align}

Model 2

\begin{align}\max&\sum_t \color{darkblue}{\mathit{price}}_t \cdot(\color{darkred}{\mathit{sell}}_t-\color{darkred}{\mathit{buy}}_t)\\ &\color{darkred}{\mathit{inv}}_t = \color{darkred}{\mathit{inv}}_{t-1} + \color{darkred}{\mathit{buy}}_t - \color{darkred}{\mathit{sell}}_t&& \forall t\\ &\color{darkred}{\mathit{inv}}_0 = 0\\ &\color{darkred}{\mathit{inv}}_{t-1} \ge \sum_k \color{darkblue}{\mathit{invlb}}_k \cdot \color{darkred}\delta_{k,t} && \forall t \\ &\color{darkred}{\mathit{inv}}_{t-1} \le \sum_k \color{darkblue}{\mathit{invub}}_k \cdot \color{darkred}\delta_{k,t} && \forall t \\ &\color{darkred}{\mathit{buy}}_t \le \sum_k \color{darkblue}{\mathit{buyub}}_k \cdot \color{darkred}\delta_{k,t} && \forall t \\ &\color{darkred}{\mathit{sell}}_t \le \sum_k \color{darkblue}{\mathit{sellub}}_k \cdot \color{darkred}\delta_{k,t} && \forall t \\ &\sum_k \color{darkred}\delta_{k,t} = 1 && \forall t\\ &\color{darkred}{\mathit{inv}}_t \in \{0,1,\dots,\color{darkblue}{\mathit{capacity}}\}\\ &\color{darkred}{\mathit{buy}}_t \in \{0,1,\dots\}\\ &\color{darkred}{\mathit{sell}}_t \in \{0,1,\dots\}\\ &\color{darkred}\delta_{k,t} \in \{0,1\} \end{align}

Implementation

The R implementation can look like:

> library(CVXR)
> 
> # data
> price = c(12, 11, 12, 13, 16, 17, 18, 17, 18, 16, 17, 13)
> capacity = 25
> max_units_buy = 4
> max_units_sell = 8
> 
> # capacity segments
> s <- c(0,0.3,0.65,0.7,1)
> 
> # corresponding lower and upper bounds
> invlb <- s[1:(length(s)-1)] * capacity
> invlb
[1]  0.00  7.50 16.25 17.50
> invub <- s[2:length(s)] * capacity
> invub
[1]  7.50 16.25 17.50 25.00
> 
> buyub <- c(4,3,2,2)
> sellub <- c(4,6,6,8)
> 
> # number of time periods
> NT <- length(price)
> NT
[1] 12
> 
> # number of capacity segments
> NS <- length(s)-1
> NS
[1] 4
> 
> # Decision variables
> inv = Variable(NT,integer=T)
> buy = Variable(NT,integer=T)
> sell = Variable(NT,integer=T)
> delta = Variable(NS,NT,boolean=T)
> 
> # Lag operator
> L = cbind(rbind(0,diag(NT-1)),0)
> 
> # optimization model
> problem <- Problem(Maximize(sum(price*(sell-buy))),
+                    list(inv == L %*% inv + buy - sell,
+                         sum_entries(delta,axis=2)==1, 
+                         L %*% inv >= t(delta) %*% invlb,
+                         L %*% inv <= t(delta) %*% invub,
+                         buy <= t(delta) %*% buyub,
+                         sell <= t(delta) %*% sellub,
+                         inv >= 0, inv <= capacity,
+                         buy >= 0, sell >= 0))
> result <- solve(problem,verbose=T)
GLPK Simplex Optimizer, v4.47
120 rows, 84 columns, 369 non-zeros
      0: obj =  0.000000000e+000  infeas = 1.200e+001 (24)
*    23: obj =  0.000000000e+000  infeas = 0.000e+000 (24)
*    85: obj = -9.875986758e+001  infeas = 0.000e+000 (2)
OPTIMAL SOLUTION FOUND
GLPK Integer Optimizer, v4.47
120 rows, 84 columns, 369 non-zeros
84 integer variables, 48 of which are binary
Integer optimization begins...
+    85: mip =     not found yet >=              -inf        (1; 0)
+   123: >>>>> -8.800000000e+001 >= -9.100000000e+001   3.4% (17; 0)
+   126: >>>>> -9.000000000e+001 >= -9.100000000e+001   1.1% (9; 11)
+   142: mip = -9.000000000e+001 >=     tree is empty   0.0% (0; 35)
INTEGER OPTIMAL SOLUTION FOUND
> cat("status:",result$status)
status: optimal
> cat("objective:",result$value)
objective: 90
> # print results
> data.frame(price,
+            buy=result$getValue(buy),
+            sell=result$getValue(sell),
+            inv=result$getValue(inv),
+            maxbuy=t(result$getValue(delta)) %*% buyub,
+            maxsell=t(result$getValue(delta)) %*% sellub)
   price buy sell inv maxbuy maxsell
1     12   3    0   3      4       4
2     11   4    0   7      4       4
3     12   4    0  11      4       4
4     13   3    0  14      3       6
5     16   3    0  17      3       6
6     17   1    0  18      2       6
7     18   0    8  10      2       8
8     17   0    6   4      3       6
9     18   0    4   0      4       4
10    16   4    0   4      4       4
11    17   0    4   0      4       4
12    13   0    0   0      4       4
> # display the optimal values for delta
> print(result$getValue(delta))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,]    1    1    1    0    0    0    0    0    1     1     1     1
[2,]    0    0    0    1    1    0    0    1    0     0     0     0
[3,]    0    0    0    0    0    1    0    0    0     0     0     0
[4,]    0    0    0    0    0    0    1    0    0     0     0     0
>

The results are interesting. Compared to the first model, the profit has decreased from 140 to 90.

One question that comes to mind is: why only buy 3 in the first period? Well, if we buy 4 and 4 units in periods 1 and 2, we can no longer buy 4 units in period 3. These kinds of decisions are very difficult to make by hand: it is just too complex.

Conclusion

This is a nice little example, where some formal modeling really helps. It also shows both the strengths and weaknesses of a CVX-like modeling tool: it can make matrix- and vector-based models very easy to express. But when we stray away from this a bit and want to implement modeling concepts like lags, things become somewhat more complicated.

References

R optimisation buy sell, https://stackoverflow.com/questions/61873468/r-optimisation-buy-sell
CVXR, Convex Optimization in R, https://cvxr.rbind.io/
Package Rglpk, https://cran.r-project.org/web/packages/Rglpk/Rglpk.pdf
R optimisation max buy/sell dependent on stock level, https://stackoverflow.com/questions/61900256/r-optimisation-max-buy-sell-dependent-on-stock-level

Friday, May 8, 2020

A scheduling problem

Problem Statement

The problem we are trying to solve here is a simple scheduling model [1]. But as we shall see, it has some interesting performance issues.

There are $N$ tasks (jobs) and $M$ rooms. We need to assign jobs to rooms where they are executed. We assume we execute one job at the time in each room (but jobs in different rooms can execute concurrently). Jobs require some resources (for example water tap, gas piping). Rooms provide certain resources. We need to make sure tasks are assigned to rooms that contain the resources that are needed. Finally, all jobs have a due date.

Data

Let's generate some data.

----     33 SET use  resource usage

         resource1   resource2   resource3   resource4   resource5

task2                                              YES
task3                                                          YES
task5                                  YES
task7          YES
task9                      YES                     YES
task11                                                         YES
task12         YES                                 YES
task13         YES
task14         YES
task15                                 YES
task16                                 YES                     YES
task17                     YES
task20                     YES                     YES
task21         YES         YES
task23                     YES
task24                                             YES
task25         YES                                             YES
task26                                 YES
task28                                                         YES


----     33 SET avail  resource availability

        resource1   resource2   resource3   resource4   resource5

room1                     YES         YES         YES         YES
room2                                 YES         YES
room3                                             YES         YES
room4         YES         YES         YES                     YES
room5         YES                     YES         YES         YES


----     33 PARAMETER length  job length

task1  2.335,    task2  4.935,    task3  4.066,    task4  1.440,    task5  4.979,    task6  3.321,    task7  1.666
task8  3.573,    task9  2.377,    task10 4.649,    task11 4.600,    task12 1.065,    task13 2.475,    task14 3.658
task15 3.374,    task16 1.138,    task17 4.367,    task18 4.728,    task19 3.032,    task20 2.198,    task21 2.986
task22 1.180,    task23 4.095,    task24 3.132,    task25 3.987,    task26 3.880,    task27 3.526,    task28 1.460
task29 4.885,    task30 3.827


----     33 PARAMETER due  job due dates

task1   5.166,    task2   5.333,    task3   5.493,    task4   5.540,    task5   6.226,    task6   8.105
task7   8.271,    task8   8.556,    task9   8.677,    task10  8.922,    task11 10.184,    task12 11.711
task13 11.975,    task14 12.814,    task15 12.867,    task16 14.023,    task17 14.200,    task18 15.820
task19 15.877,    task20 16.156,    task21 16.438,    task22 16.885,    task23 17.033,    task24 17.813
task25 21.109,    task26 21.713,    task27 23.655,    task28 23.977,    task29 24.014,    task30 24.507

We have 30 tasks, 5 rooms, and 5 resources. Note that some tasks don't need special resources (e.g. task1). They can execute in any room. Some jobs require resources that allow only one room. For instance, task9 needs resources 2 and 4. Only room1 provides this combination.

In the model, we actually don't need to know about resource usage. The only thing we need to know is whether job $i$ can be assigned to room $j$. So I calculated a set Allowed:

----     37 SET allowed  task is allowed to be executed in room

             room1       room2       room3       room4       room5

task1          YES         YES         YES         YES         YES
task2          YES         YES         YES                     YES
task3          YES                     YES         YES         YES
task4          YES         YES         YES         YES         YES
task5          YES         YES                     YES         YES
task6          YES         YES         YES         YES         YES
task7                                              YES         YES
task8          YES         YES         YES         YES         YES
task9          YES
task10         YES         YES         YES         YES         YES
task11         YES                     YES         YES         YES
task12                                                         YES
task13                                             YES         YES
task14                                             YES         YES
task15         YES         YES                     YES         YES
task16         YES                                 YES         YES
task17         YES                                 YES
task18         YES         YES         YES         YES         YES
task19         YES         YES         YES         YES         YES
task20         YES
task21                                             YES
task22         YES         YES         YES         YES         YES
task23         YES                                 YES
task24         YES         YES         YES                     YES
task25                                             YES         YES
task26         YES         YES                     YES         YES
task27         YES         YES         YES         YES         YES
task28         YES                     YES         YES         YES
task29         YES         YES         YES         YES         YES
task30         YES         YES         YES         YES         YES

Model 1

My first approach is to use no-overlap constraints for jobs that are assigned to the same room.

Mixed Integer Programming Model 1
\[ \begin{align} \min\>&\color{darkred}{\mathit{Makespan}}\\ &\sum_{j\|\color{darkblue}{\mathit{Allowed}}(i,j)} \color{darkred}{\mathit{Assign}}_{i,j} = 1 && \forall i \\ &\color{darkred}{\mathit{Finish}}_{i} = \color{darkred}{\mathit{Start}}_{i} + \color{darkblue}{\mathit{Length}}_{i} && \forall i \\ &\color{darkred}{\mathit{Start}}_{i} \ge \color{darkred}{\mathit{Finish}}_{i'} - \color{darkblue}M \cdot\color{darkred}\delta_{i,i',j} - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j}) - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i',j}) && \forall i \lt i',j \| \color{darkblue}{\mathit{Allowed}}(i,j) \>\mathbf{and}\>\color{darkblue}{\mathit{Allowed}}(i',j)\\ &\color{darkred}{\mathit{Start}}_{i'} \ge \color{darkred}{\mathit{Finish}}_{i} - \color{darkblue}M (1-\color{darkred}\delta_{i,i',j}) - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j}) - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i',j}) &&\forall i \lt i',j \| \color{darkblue}{\mathit{Allowed}}(i,j) \>\mathbf{and}\>\color{darkblue}{\mathit{Allowed}}(i',j) \\ &\color{darkred}{\mathit{Finish}}_i \le \color{darkblue}{\mathit{DueDate}}_{i} && \forall i\\ &\color{darkred}{\mathit{Makespan}} \ge \color{darkred}{\mathit{Finish}}_{i} && \forall i \\ & \color{darkred}{\mathit{Assign}}_{i,j} \in \{0,1\} \\ & \color{darkred}{\mathit{Start}}_{i},\color{darkred}{\mathit{Finish}}_{i} \ge 0\\ & \color{darkred}\delta_{i,i',j} \in \{0,1\} \end{align} \]

Mixed Integer Programming Model 1

\[ \begin{align} \min\>&\color{darkred}{\mathit{Makespan}}\\ &\sum_{j|\color{darkblue}{\mathit{Allowed}}(i,j)} \color{darkred}{\mathit{Assign}}_{i,j} = 1 && \forall i \\ &\color{darkred}{\mathit{Finish}}_{i} = \color{darkred}{\mathit{Start}}_{i} + \color{darkblue}{\mathit{Length}}_{i} && \forall i \\ &\color{darkred}{\mathit{Start}}_{i} \ge \color{darkred}{\mathit{Finish}}_{i'} - \color{darkblue}M \cdot\color{darkred}\delta_{i,i',j} - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j}) - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i',j}) && \forall i \lt i',j | \color{darkblue}{\mathit{Allowed}}(i,j) \>\mathbf{and}\>\color{darkblue}{\mathit{Allowed}}(i',j)\\ &\color{darkred}{\mathit{Start}}_{i'} \ge \color{darkred}{\mathit{Finish}}_{i} - \color{darkblue}M (1-\color{darkred}\delta_{i,i',j}) - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j}) - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i',j}) &&\forall i \lt i',j | \color{darkblue}{\mathit{Allowed}}(i,j) \>\mathbf{and}\>\color{darkblue}{\mathit{Allowed}}(i',j) \\ &\color{darkred}{\mathit{Finish}}_i \le \color{darkblue}{\mathit{DueDate}}_{i} && \forall i\\ &\color{darkred}{\mathit{Makespan}} \ge \color{darkred}{\mathit{Finish}}_{i} && \forall i \\ & \color{darkred}{\mathit{Assign}}_{i,j} \in \{0,1\} \\ & \color{darkred}{\mathit{Start}}_{i},\color{darkred}{\mathit{Finish}}_{i} \ge 0\\ & \color{darkred}\delta_{i,i',j} \in \{0,1\} \end{align} \]

The no-overlap constraints say, if jobs $i$ and $i'$ execute in the same room $j$ then either $i$ has to execute before $i'$ or $i'$ has to execute before $i$. As usual, the big-M values are used to make constraints non-binding when they are not to be obeyed. For this problem, I simply used $M=100$.

This model does not perform very well at all. After an hour, I still saw a gap of 27% In addition, the number of constraints is large (given the small data set): 2,352.

We can improve on this formulation by observing that we don't need all variables $\delta_{i,i',j}$. Instead we can use $\delta_{i,i'}$. This improves the performance a bit, but it is still not very good. This version is called "Improved Model 1" in the results table further down.

Model 2

Let's try a different model. First, we make sure all jobs are ordered by the due date. This means that we can find the finish time for job $i$ placed in room $j$ by calculating the processing time of all previous jobs assigned to room $j$: \[\sum_{i'|i'\le i\>\mathbf{and} \mathit{Allowed}(i',j)} {\mathit{Length}}_{i'} \cdot {\mathit{Assign}}_{i',j} \] This means: we execute jobs assigned to a room back-to-back (no holes). Using this approach we can write:

Mixed Integer Programming Model 2
\[ \begin{align} \min\>&\color{darkred}{\mathit{Makespan}}\\ &\color{darkred}{\mathit{Finish}}_{i} \ge \sum_{i'\|i'\le i\> \mathbf{and}\>\color{darkblue}{\mathit{Allowed}}(i',j)} \color{darkblue}{\mathit{Length}}_{i'} \cdot \color{darkred}{\mathit{Assign}}_{i',j} - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j})&& \forall i,j\|\color{darkblue}{\mathit{Allowed}}(i,j)\\ &\color{darkred}{\mathit{Finish}}_i \le \color{darkblue}{\mathit{DueDate}}_{i} && \forall i\\ &\sum_{j\|\color{darkblue}{\mathit{Allowed}}(i,j)} \color{darkred}{\mathit{Assign}}_{i,j} = 1 && \forall i \\ &\color{darkred}{\mathit{Makespan}} \ge \color{darkred}{\mathit{Finish}}_{i} && \forall i \\ & \color{darkred}{\mathit{Assign}}_{i,j} \in \{0,1\} \\ & \color{darkred}{\mathit{Finish}}_{i} \ge 0\\ \end{align} \]

Mixed Integer Programming Model 2

\[ \begin{align} \min\>&\color{darkred}{\mathit{Makespan}}\\ &\color{darkred}{\mathit{Finish}}_{i} \ge \sum_{i'|i'\le i\> \mathbf{and}\>\color{darkblue}{\mathit{Allowed}}(i',j)} \color{darkblue}{\mathit{Length}}_{i'} \cdot \color{darkred}{\mathit{Assign}}_{i',j} - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j})&& \forall i,j|\color{darkblue}{\mathit{Allowed}}(i,j)\\ &\color{darkred}{\mathit{Finish}}_i \le \color{darkblue}{\mathit{DueDate}}_{i} && \forall i\\ &\sum_{j|\color{darkblue}{\mathit{Allowed}}(i,j)} \color{darkred}{\mathit{Assign}}_{i,j} = 1 && \forall i \\ &\color{darkred}{\mathit{Makespan}} \ge \color{darkred}{\mathit{Finish}}_{i} && \forall i \\ & \color{darkred}{\mathit{Assign}}_{i,j} \in \{0,1\} \\ & \color{darkred}{\mathit{Finish}}_{i} \ge 0\\ \end{align} \]

Again we rely heavily on the set Allowed.

Note that in the finish constraint we repeat large parts of the summation in subsequent constraints. For large models, we may want to look into this (e.g. by adding extra variables and constraints to reduce the number of nonzero elements). In this case, I just left the constraint as specified in the mathematical model.

This model 2 turns out to be much faster:

	Model 1	Improved Model 1	Model 2
Rows	2,352	2,352	286
Columns	1,283	585	137
Binary Columns	1,222	524	106
Time	>3,600	>3,600	24
Status	Time Limit	Time Limit	Optimal
Objective	19.5142	19.4359	19.4322
Gap	27%	22%	0%

The difference between models 1 and 2 is rather dramatic.

Solution

The solution values for the variables $\mathit{Finish}_i$ are not necessarily as small as possible. The objective does not push all job completion times down, only the ones involved with the total makespan (i.e. on the critical path). When reporting it makes sense just to take the optimal assignments from the model and then execute jobs as early as possible. This is what I did here. Jobs on a single line are ordered by the due date. The recalculated solution is:

----    129 PARAMETER results  

                   start      length      finish     duedate

room1.task1                    2.335       2.335       5.166
room1.task9        2.335       2.377       4.712       8.677
room1.task11       4.712       4.600       9.312      10.184
room1.task20       9.312       2.198      11.510      16.156
room1.task23      11.510       4.095      15.605      17.033
room1.task30      15.605       3.827      19.432      24.507
room2.task6                    3.321       3.321       8.105
room2.task10       3.321       4.649       7.971       8.922
room2.task15       7.971       3.374      11.344      12.867
room2.task24      11.344       3.132      14.476      17.813
room2.task29      14.476       4.885      19.361      24.014
room3.task2                    4.935       4.935       5.333
room3.task8        4.935       3.573       8.508       8.556
room3.task18       8.508       4.728      13.237      15.820
room3.task22      13.237       1.180      14.416      16.885
room3.task27      14.416       3.526      17.943      23.655
room3.task28      17.943       1.460      19.403      23.977
room4.task3                    4.066       4.066       5.493
room4.task4        4.066       1.440       5.506       5.540
room4.task13       5.506       2.475       7.981      11.975
room4.task17       7.981       4.367      12.348      14.200
room4.task21      12.348       2.986      15.335      16.438
room4.task25      15.335       3.987      19.322      21.109
room5.task5                    4.979       4.979       6.226
room5.task7        4.979       1.666       6.645       8.271
room5.task12       6.645       1.065       7.710      11.711
room5.task14       7.710       3.658      11.367      12.814
room5.task16      11.367       1.138      12.506      14.023
room5.task19      12.506       3.032      15.538      15.877
room5.task26      15.538       3.880      19.418      21.713

The highlighted completion time corresponds to the makespan.

Conclusion

This is a bit more interesting model than I initially thought.

The standard no-overlap constraints for a continuous-time model do not perform very well. In addition, they are a bit complicated due to several big-M terms (the constraints should only hold in certain specific cases: two jobs run in the same room).

By using the assumption that jobs in a single room are executed in order of the due date (not a bad assumption), we can create a much simpler MIP model that also performs much, much better.

When we have an infeasible solution (i.e. we cannot meet all due dates), we may want to deliver a good schedule that minimizes the damage. This is easy is we just add a slack variable to each due date constraint, and add the slack with a cost (penalty) to the objective. This essentially minimized the sum of the due date violations. There may be a reason to also look at minimizing the number of tardy jobs. It is noted that because we fixed the order of jobs in a single room, we may not get the best if we want to minimize the number of tardy jobs.

References

Allocating and scheduling tasks into rooms with conditions - optimization algorithm, https://stackoverflow.com/questions/61656492/allocating-and-scheduling-tasks-into-rooms-with-conditions-optimization-algori

	\(j\)	\(j+1\)	\(j+2\)
\(i-2\)		\(x_{i-2,j+1}\)
\(i-1\)			\(x_{i-1,j+2}\)
\(i\)	\(x_{i,j}\)
\(i+1\)			\(x_{i+1,j+2}\)
\(i+2\)		\(x_{i+2,j+1}\)

	\(j\)	\(j+1\)
\(i-1\)	\(x_{i-1,j}\)
\(i\)	\(x_{i,j}\)	\(x_{i,j+1}\)
\(i+1\)