Yet Another Math Programming Consultant: March 2020

Thursday, March 26, 2020

Special Sudoku (2)

In [1], a variant of Sudoku was discussed, where a side constraint is:

Orthogonally neigboring cells differ in value by at most 5

This max 5 difference is a convex constraint, so it can be expressed quite easily: \[-5 \le v_{i,j} - v_{i',j'} \le 5\] for neighboring cells $(i,j)$ and $(i',j')$.

A more difficult restriction is: neighboring cells must differ by at least 2. Here is such a problem, again designed by Aad van de Wetering:

Problem data

The rules are:

Standard Sudoku-rules apply. cells in each row, column and sub-block have unique values from 1 through 9.
In addition, neighboring (horizontal and vertical) cells have values that differ by 2 or more.
Also, values 1 and 9 are considered to have a difference of 1. So neighboring cells, cannot have values 1 and 9.

I will try to solve this problem in two different ways. The first is an extension of what we did in [1]. The second one is a bit different.

Approach I

The constraint: neighbors must be different by at least two, can be formalized as \[\left| v_{i,j} - v_{i',j'} \right| \ge 2\] for neighboring cells $(i,j)$ and $(i',j')$. The non-convexity of this constraint \[\begin{align} & v_{i,j} - v_{i',j'} \ge 2 \\ & \mathbf{or} \\&v_{i,'j'} - v_{i,j} \ge 2 \end{align} \] requires extra binary variables and big-M constraints: \[\begin{align} & v_{i,j} - v_{i',j'} \ge 2 - M\cdot \delta_{i,j,i',j'} \\ & v_{i,'j'} - v_{i,j} \ge 2 - M\cdot (1-\delta_{i,j,i',j'})\\ & \delta_{i,j,i',j'} \in \{0,1\}\end{align}\] So, how large should $M$ be? The maximum difference between two different values is 8. So we need $2-M\le -8$ or $M=10$.

We ignored that values 1 and 9 also have a difference of 1. As all other combinations are allowed, we can handle this by \[\left|v_{i,j}-v_{i',j'}\right| \le 7\] This is a constraint we already know how to handle: \[-7 \le v_{i,j}-v_{i',j'} \le 7\] Interestingly, this addition restriction also can help us in tightening our big-M value. We can now use $M = 9$. Note that with this smaller big-M we still need this constraint: the big-M constraint only covers one of the needed two sides of this constraint.

The complete model can look like:

Mixed-Integer Programming Model I
\[\begin{align}\min\>& 0 && && \text{Dummy objective}\\ & \sum_k \color{darkred}x_{i,j,k} = 1 &&\forall i,j && \text{One $k$ in each cell}\\ & \sum_{i,j\|\color{darkblue}u_{a,i,j}} \color{darkred}x_{i,j,k} = 1 && \forall a,k && \text{Unique values in each area}\\ &\color{darkred}v_{i,j} =\sum_k \color{darkblue}k\cdot\color{darkred}x_{i,j,k} && \forall i,j &&\text{Value of cell}\\ & \color{darkred}v_{i,j} - \color{darkred}v_{i',j'} \ge 2 - \color{darkblue}M\cdot \color{darkred}\delta_{i,j,i',j'}&&\forall i,j,i',j'\|\color{darkblue}{nb}_{i,j,i',j'} &&\text{Min difference of 2} \\ & \color{darkred}v_{i,'j'} - \color{darkred}v_{i,j} \ge 2 - \color{darkblue}M\cdot (1-\color{darkred}\delta_{i,j,i',j'}) &&\forall i,j,i',j'\|\color{darkblue}{nb}_{i,j,i',j'}\\ & -7 \le \color{darkred}v_{i,j} - \color{darkred}v_{i',j'} \le 7 &&\forall i,j,i',j'\|\color{darkblue}{nb}_{i,j,i',j'} && \text{Neighbors differ by max 7} \\ & \color{darkred}x_{i,j,k} = 1 && \text{where } k=\color{darkblue}{\mathit{Given}}_{i,j} &&\text{Fix given values}\\&\color{darkred}x_{i,j,k} \in \{0,1\} \\ &\color{darkred}v_{i,j} \in \{1,\dots,9\} \\ & \color{darkred}\delta_{i,j,i',j'} \in \{0,1\} \\ & \color{darkblue}M = 9\end{align}\]

Mixed-Integer Programming Model I

\[\begin{align}\min\>& 0 && && \text{Dummy objective}\\ & \sum_k \color{darkred}x_{i,j,k} = 1 &&\forall i,j && \text{One $k$ in each cell}\\ & \sum_{i,j|\color{darkblue}u_{a,i,j}} \color{darkred}x_{i,j,k} = 1 && \forall a,k && \text{Unique values in each area}\\ &\color{darkred}v_{i,j} =\sum_k \color{darkblue}k\cdot\color{darkred}x_{i,j,k} && \forall i,j &&\text{Value of cell}\\ & \color{darkred}v_{i,j} - \color{darkred}v_{i',j'} \ge 2 - \color{darkblue}M\cdot \color{darkred}\delta_{i,j,i',j'}&&\forall i,j,i',j'|\color{darkblue}{nb}_{i,j,i',j'} &&\text{Min difference of 2} \\ & \color{darkred}v_{i,'j'} - \color{darkred}v_{i,j} \ge 2 - \color{darkblue}M\cdot (1-\color{darkred}\delta_{i,j,i',j'}) &&\forall i,j,i',j'|\color{darkblue}{nb}_{i,j,i',j'}\\ & -7 \le \color{darkred}v_{i,j} - \color{darkred}v_{i',j'} \le 7 &&\forall i,j,i',j'|\color{darkblue}{nb}_{i,j,i',j'} && \text{Neighbors differ by max 7} \\ & \color{darkred}x_{i,j,k} = 1 && \text{where } k=\color{darkblue}{\mathit{Given}}_{i,j} &&\text{Fix given values}\\&\color{darkred}x_{i,j,k} \in \{0,1\} \\ &\color{darkred}v_{i,j} \in \{1,\dots,9\} \\ & \color{darkred}\delta_{i,j,i',j'} \in \{0,1\} \\ & \color{darkblue}M = 9\end{align}\]

Notes:

This model solves quickly: the presolve can remove all rows and columns.
The solution is unique (see [1]).
Using the above data set we actually can drop the "max 7" constraint. I think that is just a property of this data set. In general, we should include this constraint.

Approach II

A simpler approach is to consider the pairs of values that cannot be neighbors: \[(1,2),(2,3),(3,4),(4,5),(5,6),(6,7),(7,8),(8,9),(9,1)\] We also need to add their symmetrical counterparts \[(2,1),(3,2),\dots\] Detail: this is due to the way we organized the neighbors. The neighbor set did not have both: i.e. cells $\text{$(1,2)$-$(1,3)$}$ are neighbors, so this is stored. But, I did not also store the combination $\text{$(1,3)$-$(1,2)$}$.

After automatically generating the combinations $(k,k')$ that we need to prevent, the forbid data can look like:

----     55 SET forbid  combinations of values we want to exclude

            v1          v2          v3          v4          v5          v6          v7          v8          v9

v1                     YES                                                                                 YES
v2         YES                     YES
v3                     YES                     YES
v4                                 YES                     YES
v5                                             YES                     YES
v6                                                         YES                     YES
v7                                                                     YES                     YES
v8                                                                                 YES                     YES
v9         YES                                                                                 YES

This table contains elements $\mathit{forbid}_{k,k'}$ we want to forbid. Note that we can exclude the diagonal from this: neighboring cells already cannot have the same value. For this set of value-pairs, we want to add the constraint:\[x_{i,j,k} + x_{i',j',k'} \le 1 \] for all neighboring cells $(i,j)$ and $(i',j')$ and for all forbidden value combinations $(k,k')$. We have 18 of these combinations and 144 neighbors to check, yielding 2592 of these constraints.

The full model can look like:

Mixed-Integer Programming Model II
\[\begin{align}\min\>& 0 && && \text{Dummy objective}\\ & \sum_k \color{darkred}x_{i,j,k} = 1 &&\forall i,j && \text{One $k$ in each cell}\\ & \sum_{i,j\|\color{darkblue}u_{a,i,j}} \color{darkred}x_{i,j,k} = 1 && \forall a,k && \text{Unique values in each area}\\ & \color{darkred}x_{i,j,k} + \color{darkred}x_{i',j',k'} \le 1 && \begin{aligned}& \forall i,j,i',j'\|\color{darkblue}{nb}_{i,j,i',j'} \\ & \forall k,k'\|\color{darkblue}{\mathit{forbid}}_{k,k'} \end{aligned}&&\text{Forbid combinations} \\ & \color{darkred}x_{i,j,k} = 1 && \text{where } k=\color{darkblue}{\mathit{Given}}_{i,j} &&\text{Fix given values}\\&\color{darkred}x_{i,j,k} \in \{0,1\} \end{align}\]

Mixed-Integer Programming Model II

When we run this model, we will see the solution:

----     84 PARAMETER v  values (postprocessing)

            c1          c2          c3          c4          c5          c6          c7          c8          c9

r1           9           2           6           4           1           7           3           5           8
r2           3           5           8           2           6           9           7           1           4
r3           1           7           4           8           3           5           2           6           9
r4           7           4           1           6           9           2           5           8           3
r5           5           8           3           1           7           4           9           2           6
r6           2           6           9           3           5           8           4           7           1
r7           6           9           2           5           8           3           1           4           7
r8           8           3           5           7           4           1           6           9           2
r9           4           1           7           9           2           6           8           3           5

This solution is unique. The presolver completely solves the model, so no actual iterations are performed.

References

Special Sudoku, http://yetanothermathprogrammingconsultant.blogspot.com/2020/03/special-sudoku.html

Sunday, March 22, 2020

Special Sudoku

Here is a difficult Sudoku variant designed by Aad van de Wetering. Aad is a former colleague of mine.

Problem

The standard Sudoku rules apply:

Each cell has an integer value between 1 and 9
In each row, column and sub-block cells must be unique
Some cells have a predefined value

In addition, we have the following restriction:

The maximum difference between the value of two neighboring cells (orthogonal, that is horizontal or vertical) is 5

To solve this you can follow this video [1]:

Of course, we want to try to solve this as a Mixed-Integer Programming model.

Review: a linear integer programming model for standard Sudoku

When we want to solve Sudokus, the easiest approach is to define the following binary decision variables [2]: \[x_{i,j,k} = \begin{cases} 1 & \text{if cell $(i,j)$ has value $k$} \\ 0 & \text{otherwise}\end{cases}\] Here $k \in \{1,\dots,9\}$. We have 27 areas we need to check for unique values: rows, columns and sub-blocks. We can organize this as a set: \[u_{a,i,j}\>\text{exists if and only if area $a$ contains cell $(i,j)$}\] This is data. We also have a set of given cells, which we can fix. The resultimg MIP model can look like [2]:

Mixed-Integer Programming Model for standard Sudoku
\[\begin{align}\min\>& 0 && && \text{Dummy objective}\\ & \sum_k \color{darkred}x_{i,j,k} = 1 &&\forall i,j && \text{One $k$ in each cell}\\ & \sum_{i,j\|\color{darkblue}u_{a,i,j}} \color{darkred}x_{i,j,k} = 1 && \forall a,k && \text{Unique values in each area}\\ & \color{darkred}x_{i,j,k} = 1 && \text{where } k=\color{darkblue}{\mathit{Given}}_{i,j} &&\text{Fix given values}\\ &\color{darkred}x_{i,j,k} \in \{0,1\} \end{align}\]

Mixed-Integer Programming Model for standard Sudoku

During post-processing, we can calculate the completed grid using the optimal values of $x$: \[v_{i,j} = \sum_k k \cdot x^*_{i,j,k}\]

Linear integer programming model for Sudoku variant

We need to introduce the concept of "neighbors". We can define the set: \[nb_{i,j,i',j'} \> \text{exists iff cells $(i,j)$ and $(i',j')$ are neighbors}\] I excluded symmetric cases. To be precise: in GAMS, I populated this set with:

alias (i,ii),(j,jj);
set
nb(i,j,ii,jj) 'neighbors'
;

nb(i,j,i-1,j) = yes;
nb(i,j,i,j-1) = yes;

We can introduce the post-processing step directly into the MIP and then enforce our difference constraint on the variables $v_{i,j}$. This can look like \[\begin{align} & v_{i,j} = \sum_k k \cdot x_{i,j,k} && \text{cell values as constraint} \\ & -5 \le v_{i,j} - v_{i',j'} \le 5 && \text{for neigboring cells $(i,j)$ and $(i',j')$}\end{align}\]
With this, we can formulate:

Mixed-Integer Programming Model for Sudoku Variant
\[\begin{align}\min\>& 0 && && \text{Dummy objective}\\ & \sum_k \color{darkred}x_{i,j,k} = 1 &&\forall i,j && \text{One $k$ in each cell}\\ & \sum_{i,j\|\color{darkblue}u_{a,i,j}} \color{darkred}x_{i,j,k} = 1 && \forall a,k && \text{Unique values in each area}\\ &\color{darkred}v_{i,j} =\sum_k \color{darkblue}k\cdot\color{darkred}x_{i,j,k} && \forall i,j &&\text{Value of cell}\\ & -5 \le \color{darkred}v_{i,j} - \color{darkred}v_{i',j'} \le 5 &&\forall i,j,i',j'\|\color{darkblue}{nb}_{i,j,i',j'} && \text{Neighbors differ by max 5} \\ & \color{darkred}x_{i,j,k} = 1 && \text{where } k=\color{darkblue}{\mathit{Given}}_{i,j} &&\text{Fix given values}\\&\color{darkred}x_{i,j,k} \in \{0,1\} \\ &\color{darkred}v_{i,j} \in \{1,\dots,9\} \end{align}\]

Mixed-Integer Programming Model for Sudoku Variant

This model solves very easily. The presolver can solve this model completely, so we don't even have to start the branch-and-bound phase. As a result, we can make a much shorter movie:

The solution of this model looks like:

----     80 VARIABLE v.L  value of each cell

            c1          c2          c3          c4          c5          c6          c7          c8          c9

r1           2           4           9           6           7           5           8           3           1
r2           3           8           5           1           4           9           6           7           2
r3           7           6           1           2           3           8           9           5           4
r4           8           9           6           3           1           4           5           2           7
r5           5           7           3           8           6           2           1           4           9
r6           1           2           4           5           9           7           3           8           6
r7           6           1           2           4           8           3           7           9           5
r8           4           3           7           9           5           1           2           6           8
r9           9           5           8           7           2           6           4           1           3

Unique solution

A proper Sudoku puzzle has one unique solution. We can verify this by adding a cut to the problem after solving it. The cut should forbid the current solution (and allow all other solutions). In this case, the cut can look like \[\sum_{i,j,k} x^*_{i,j,k} x_{i,j,k}\le 9^2-1\] where $x^*$ is the previously found solution. Indeed, this puzzle has a single, unique solution: after adding the cut, the problem becomes (integer) infeasible.

References

A Sudoku With Just 11 Digits = Dutch Genius!, https://www.youtube.com/watch?v=ZU5fSDHJq8k
MIP Modeling: from Sudoku to KenKen via Logarithms, https://yetanothermathprogrammingconsultant.blogspot.com/2016/10/mip-modeling-from-sudoku-to-kenken.html

Thursday, March 12, 2020

CellBlock or Shikaku puzzle

This is a simple puzzle: given a grid with some cells containing a positive integer, form rectangles such that

Each rectangle contains exactly one cell with a number
The size of the rectangle is equal to this number
Every cell in the grid is covered by exactly one rectangle

Example:

Shikaku Problem and Solution

Approach

One possible approach is a two-step algorithm:

Generate all feasible rectangles,
Use a simple MIP model to select a subset of these rectangles such that each cell is covered exactly once.

For the problem above, we just have 31 feasible rectangles. Feasible means here: (a) rectangle contains exactly one cell with a number, (b) the size of the rectangle is equal to that number.

All feasible rectangles

Designing an algorithm that generates all feasible rectangles is easier said than done, especially if you want to do this efficiently for larger instances. This is mainly a programming exercise, rather than a modeling problem.

The second part is surprisingly simple. Assume all $k\in K$ rectangles are stored in a single large boolean data structure: \[\mathit{allRectangles}_{k,i,j} = \begin{cases} 1 & \text{if cell $(i,j)$ is part of rectangle $k$}\\ 0 & \text{otherwise}\end{cases}\] then the optimization model is a simple set covering model:

Set Covering Problem
\[\begin{align}\min\>& 0\\ &\sum_k \color{darkblue}{\mathit{allRectangles}}_{k,i,j} \color{darkred}x_{k} = 1&& \forall i,j\\ &\color{darkred}x_{k} \in \{0,1\} \end{align}\]

In this case, we don't have an objective. We simply state that every cell $(i,j)$ must be covered by exactly one rectangle.

This is a tiny model and solves very easily. Typically it can be solved completely in the presolve phase:

Tried aggregator 1 time.
MIP Presolve eliminated 37 rows and 32 columns.
MIP Presolve modified 3 coefficients.
All rows and columns eliminated.

The solution can look like:

----     89 VARIABLE x.L  selection of rectangles

k1  1.000,    k3  1.000,    k5  1.000,    k8  1.000,    k9  1.000,    k12 1.000,    k20 1.000,    k22 1.000
k26 1.000,    k29 1.000,    k31 1.000


----     97 PARAMETER result  solution: selected rectangles

            c1          c2          c3          c4          c5          c6

r1           1           3           5           5           8           9
r2           1           3           5           5           8           9
r3          12           3           5           5           8           9
r4          12           3          20          20          22          22
r5          12           3          26          26          26          26
r6          29          29          29          29          31          31

The model selects 11 rectangles. We could have predicted this: there are 11 numbered cells in the input data. The parameter result displays the selected rectangle numbers on the grid. You should be able to match this up with the picture of the feasible rectangles shown earlier.

GAMS Model

$ontext

   Cell Block or Shikaku puzzle
   Set Covering model for (small) square grids
   erwin@amsterdamoptimization.com

$offtext

sets
i 'rows' /r1*r6/
j 'columns' /c1*c6/
;

scalars
   m 'rows'
   n 'columns'
;
m = card(i);
n = card(j);

alias(i,p,h);
alias(j,q,w);

table grid(i,j)
     c1 c2 c3 c4 c5 c6
r1    2              3
r2             6 3
r3       5
r4    3     2     2
r5                4
r6          4        2
;

scalar nn 'sum of numbers';
nn = sum((i,j),grid(i,j));
display nn;

abort$(nn <> card(i)*card(j)) "error in grid specification";

*------------------------------------------
* generate all possible rectangles
* requirements:
* 1. block contains exactly one number from grid
* 2. size of block is equal to this number
*------------------------------------------

set
k 'rectangle number' /k1*k10000/
allRectangles(k,i,j)
r(i,j) 'current rectangle'
;
singleton set
kk(k) 'current index' /k1/
;
loop((p,q),
   loop((w,h)$(ord(p)+ord(h)-1<=m and ord(q)+ord(w)-1<=n),
* candidate rectangle
      r(i,j) = (ord(i)>=ord(p) and ord(i)<=ord(p)+ord(h)-1 and
                ord(j)>=ord(q) and ord(j)<=ord(q)+ord(w)-1);
* check: only one grid number and size = gridnumber
      if(sum(r$grid(r),1)=1 and card(r)=sum(r,grid(r)),
         allRectangles(kk,r) = yes;
         kk(k)=kk(k-1);
    );
   );
);

*------------------------------------------
* model: select rectangles such that each
* cell is covered exactly once
*------------------------------------------

binary variable x(k) 'selection of rectangles';
variable z 'dummy objective';

equations
   dummy         'dummy objective'
   covering(i,j) 'each cell is covered by exactly one rectangle'
;

covering(i,j).. sum(allRectangles(k,i,j),x(k)) =e= 1;
dummy.. z =e= 0;

model cover /all/;
* just need feasibility
solve cover minimizing z using mip;

display x.l;

* reporting
parameter result(i,j) 'solution: selected rectangles';
loop(k$(x.l(k)>0.5),
   result(i,j)$allRectangles(k,i,j) = ord(k);
);
option result:0;
display result;

Notes:

The data check makes sure the sum of the numbers is equal to the size (area) of the grid.
The enumeration of the feasible rectangles is using a simplistic and crude algorithm. It may need to be refined for larger instances.
allRectangles(k,i,j) is implemented as a set.
As we are looking for a feasible solution only, there is no need to specify a gap with the optcr option.

A larger problem

A larger problem is taken from [1].

17 x 15 problem

Some statistics:

This problem has 17 rows and 15 columns.
The number of feasible rectangles is 253.
The number of numbered cells is 66. This is also the number of rectangles we need to select.
If one wants to make a big impression, mention that the number of ways we can choose 66 out of 253 is: \[ \begin{align} {253 \choose 66} &= 66 281 207 211 442 081 043 167 193 447 008 690 678 844 358 496 354 619 442 275 550 \\ &\approx 6.6 \times 10^{61}\end{align}\]

The results look like:

----    105 VARIABLE x.L  selection of rectangles

k1   1.000,    k3   1.000,    k6   1.000,    k11  1.000,    k13  1.000,    k16  1.000,    k30  1.000,    k34  1.000
k36  1.000,    k37  1.000,    k41  1.000,    k44  1.000,    k45  1.000,    k51  1.000,    k54  1.000,    k63  1.000
k65  1.000,    k69  1.000,    k73  1.000,    k82  1.000,    k84  1.000,    k87  1.000,    k90  1.000,    k91  1.000
k94  1.000,    k103 1.000,    k107 1.000,    k110 1.000,    k112 1.000,    k125 1.000,    k128 1.000,    k130 1.000
k132 1.000,    k134 1.000,    k136 1.000,    k140 1.000,    k143 1.000,    k150 1.000,    k152 1.000,    k154 1.000
k159 1.000,    k162 1.000,    k171 1.000,    k174 1.000,    k190 1.000,    k192 1.000,    k195 1.000,    k199 1.000
k201 1.000,    k202 1.000,    k203 1.000,    k204 1.000,    k209 1.000,    k216 1.000,    k219 1.000,    k222 1.000
k226 1.000,    k228 1.000,    k235 1.000,    k238 1.000,    k244 1.000,    k245 1.000,    k247 1.000,    k249 1.000
k251 1.000,    k253 1.000


----    113 PARAMETER result  solution: selected rectangles

             c1          c2          c3          c4          c5          c6          c7          c8          c9

r1            1           3           3           6           6          11          11          11          11
r2            1           3           3           6           6          11          11          11          11
r3           30          30          30          30          34          34          36          37          37
r4           30          30          30          30          34          34          36          51          51
r5           30          30          30          30          34          34          63          65          65
r6           73          73          73          73          34          34          63          65          65
r7           87          87          90          90          91          91          94          65          65
r8          103         103         103         103         103         103          94         107         107
r9          103         103         103         103         103         103          94         125         125
r10         134         136         136         140         140         143         143         125         125
r11         134         154         154         159         159         159         159         159         159
r12         171         154         154         174         174         174         174         174         174
r13         171         190         190         192         195         195         195         199         199
r14         204         190         190         192         195         195         195         209         209
r15         204         190         190         216         216         219         219         222         222
r16         204         190         190         216         216         235         235         235         238
r17         204         244         244         245         245         245         247         247         249

  +         c10         c11         c12         c13         c14         c15

r1           11          11          13          13          16          16
r2           11          11          13          13          16          16
r3           41          41          13          13          44          45
r4           54          54          13          13          44          45
r5           54          54          69          69          44          45
r6           82          84          84          84          44          45
r7           82          84          84          84          44          45
r8          107         110         110         112          44          45
r9          128         128         130         112         132         132
r10         128         128         130         150         150         152
r11         162         162         130         150         150         152
r12         162         162         130         150         150         152
r13         199         201         201         202         203         203
r14         209         209         209         202         203         203
r15         222         222         222         226         226         228
r16         238         238         238         226         226         228
r17         249         249         251         251         253         253

Again the presolver could eliminate all rows and columns.

Questions

I still have some lingering questions about this problem.

How would a better algorithm that enumerates all feasible rectangles look like? I guess it would start with the cells containing numbers.
Can we use the solution pool to automate this step?
Is the solution unique?
Are there cases where the presolver is not solving the complete problem?
Nerdy: when disabling the presolver. it looks like we can solve the problem as an LP. I.e. the binary variables are integer-valued automatically. Is this true?

Q2: Solution Pool to generate feasible rectangles

In the comments, a MIP model is proposed that will deliver feasible rectangles. It is worthwhile to reproduce it here. A high-level version of this model can look like:

High-level Feasible Rectangle Model
\[\begin{align}\min\>& 0 && && \text{Dummy objective}\\ &\sum_{i,j\|\color{darkblue}{\mathit{Grid}}_{i,j}} \color{darkred}{\mathit{useCell}}_{i,j} = 1 && && \text{One numbered cell in rectangle}\\ & \sum_{i,j} \color{darkblue}{\mathit{Grid}}_{i,j} \color{darkred}{\mathit{useCell}}_{i,j} = \sum_{i,j} \color{darkred}{\mathit{useCell}}_{i,j} &&&&\text{Number equal to area of rectangle} \\ & \color{darkred}{\mathit{useCell}}_{i,j} = \color{darkred}{\mathit{useRow}}_i \cdot \color{darkred}{\mathit{useColumn}}_j &&\forall i,j && \text{Interior of rectangle} \\ & \color{darkred}{\mathit{useRow}}_i \ge \color{darkred}{\mathit{useRow}}_{i1} \cdot \color{darkred}{\mathit{useRow}}_{i2} && \forall i1 \lt i \lt i2 && \text{Contiguous rows} \\ & \color{darkred}{\mathit{useColumn}}_j \ge \color{darkred}{\mathit{useColumn}}_{j1} \cdot \color{darkred}{\mathit{useColumn}}_{j2} && \forall j1 \lt j \lt j2 && \text{Contiguous columns}\\ & \color{darkred}{\mathit{useCell}}_{i,j},\color{darkred}{\mathit{useRow}}_i,\color{darkred}{\mathit{useColumn}}_j \in \{0,1\} \end{align}\]

High-level Feasible Rectangle Model

\[\begin{align}\min\>& 0 && && \text{Dummy objective}\\ &\sum_{i,j|\color{darkblue}{\mathit{Grid}}_{i,j}} \color{darkred}{\mathit{useCell}}_{i,j} = 1 && && \text{One numbered cell in rectangle}\\ & \sum_{i,j} \color{darkblue}{\mathit{Grid}}_{i,j} \color{darkred}{\mathit{useCell}}_{i,j} = \sum_{i,j} \color{darkred}{\mathit{useCell}}_{i,j} &&&&\text{Number equal to area of rectangle} \\ & \color{darkred}{\mathit{useCell}}_{i,j} = \color{darkred}{\mathit{useRow}}_i \cdot \color{darkred}{\mathit{useColumn}}_j &&\forall i,j && \text{Interior of rectangle} \\ & \color{darkred}{\mathit{useRow}}_i \ge \color{darkred}{\mathit{useRow}}_{i1} \cdot \color{darkred}{\mathit{useRow}}_{i2} && \forall i1 \lt i \lt i2 && \text{Contiguous rows} \\ & \color{darkred}{\mathit{useColumn}}_j \ge \color{darkred}{\mathit{useColumn}}_{j1} \cdot \color{darkred}{\mathit{useColumn}}_{j2} && \forall j1 \lt j \lt j2 && \text{Contiguous columns}\\ & \color{darkred}{\mathit{useCell}}_{i,j},\color{darkred}{\mathit{useRow}}_i,\color{darkred}{\mathit{useColumn}}_j \in \{0,1\} \end{align}\]

This model is nonlinear: there are quadratic terms. We are lucky, however. The quadratic terms involve only binary variables. This means that we can use a standard linearization:\[z=x\cdot y \iff \begin{cases} z \le x \\ z \le y \\ z \ge x+y-1 \end{cases}\] We assume here that $x,y,x$ are binary variables. When we apply this, the linearized model looks like:

Linearized Feasible Rectangle Model
\[\begin{align}\min\>& 0 \\ &\sum_{i,j\|\color{darkblue}{\mathit{Grid}}_{i,j}} \color{darkred}{\mathit{useCell}}_{i,j} = 1 \\ & \sum_{i,j} \color{darkblue}{\mathit{Grid}}_{i,j} \color{darkred}{\mathit{useCell}}_{i,j} = \sum_{i,j} \color{darkred}{\mathit{useCell}}_{i,j} \\ & \color{darkred}{\mathit{useCell}}_{i,j} \le \color{darkred}{\mathit{useRow}}_i &&\forall i,j \\ & \color{darkred}{\mathit{useCell}}_{i,j} \le \color{darkred}{\mathit{useColumn}}_j &&\forall i,j \\ & \color{darkred}{\mathit{useCell}}_{i,j} \ge \color{darkred}{\mathit{useRow}}_i + \color{darkred}{\mathit{useColumn}}_j - 1 &&\forall i,j \\ & \color{darkred}{\mathit{useRow}}_i \ge \color{darkred}{\mathit{useRow}}_{i1} + \color{darkred}{\mathit{useRow}}_{i2} - 1 && \forall i1 \lt i \lt i2 \\ & \color{darkred}{\mathit{useColumn}}_j \ge \color{darkred}{\mathit{useColumn}}_{j1} + \color{darkred}{\mathit{useColumn}}_{j2} - 1 && \forall j1 \lt j \lt j2 \\ & \color{darkred}{\mathit{useCell}}_{i,j},\color{darkred}{\mathit{useRow}}_i,\color{darkred}{\mathit{useColumn}}_j \in \{0,1\} \end{align}\]

Linearized Feasible Rectangle Model

\[\begin{align}\min\>& 0 \\ &\sum_{i,j|\color{darkblue}{\mathit{Grid}}_{i,j}} \color{darkred}{\mathit{useCell}}_{i,j} = 1 \\ & \sum_{i,j} \color{darkblue}{\mathit{Grid}}_{i,j} \color{darkred}{\mathit{useCell}}_{i,j} = \sum_{i,j} \color{darkred}{\mathit{useCell}}_{i,j} \\ & \color{darkred}{\mathit{useCell}}_{i,j} \le \color{darkred}{\mathit{useRow}}_i &&\forall i,j \\ & \color{darkred}{\mathit{useCell}}_{i,j} \le \color{darkred}{\mathit{useColumn}}_j &&\forall i,j \\ & \color{darkred}{\mathit{useCell}}_{i,j} \ge \color{darkred}{\mathit{useRow}}_i + \color{darkred}{\mathit{useColumn}}_j - 1 &&\forall i,j \\ & \color{darkred}{\mathit{useRow}}_i \ge \color{darkred}{\mathit{useRow}}_{i1} + \color{darkred}{\mathit{useRow}}_{i2} - 1 && \forall i1 \lt i \lt i2 \\ & \color{darkred}{\mathit{useColumn}}_j \ge \color{darkred}{\mathit{useColumn}}_{j1} + \color{darkred}{\mathit{useColumn}}_{j2} - 1 && \forall j1 \lt j \lt j2 \\ & \color{darkred}{\mathit{useCell}}_{i,j},\color{darkred}{\mathit{useRow}}_i,\color{darkred}{\mathit{useColumn}}_j \in \{0,1\} \end{align}\]

After using a solution pool algorithm (this is part of solvers like Cplex and Gurobi) on this model, we get an enumeration of all feasible rectangles:

Feasible rectangles

For data set 1, we get 31 rectangles and for the second data set, we get 253 rectangles.

Q3. Uniqueness

To establish whether solutions are unique, we can use a standard approach:

Solve the optimization problem. Record the optimal values as $x_k^*$.
Add the cut: \[ \sum_k x_k^* x_k - \sum_k (1-x_k^*)x_k \le \sum_k x_k^* - 1\] to the problem. This cut forbids the solution $x_k^*$ but allows any other solution.
Resolve. If infeasible then the solution $x_k^*$ is unique.

The cut can be specialized. Let $M$ be the number of numbered cells. We know that \[\sum_k x_k = M\] With this our cut can look like: \[\sum_k x_k^* x_k \le M-1\]

An alternative, somewhat heavy-handed approach would be to use the solution pool to prove uniqueness.

When we try this, we see that both data sets yield unique solutions.

References

https://puzzlemadness.co.uk/

Monday, March 9, 2020

Tiling

This problem is original from [1]. I made it a bit more general.

Problem statement

Given a collection of square tiles, what is the maximum square space we can fill with them?

Example

Let's consider our inventory:

----     29 PARAMETER input  inventory data

            count       width        area

tile1           6           1           1
tile2           5           2           4
tile3           4           3           9
tile4           3           4          16
tile5           2           5          25
tile6           1           6          36
total          21                     196

So we have a total of 21 tiles of different sizes. Their total area is 196. This means that the best we can hope for is to fill square space of $14 \times 14 = 196$.

The optimization model shows we can fill this whole $14 \times 14$ area, using up all the tiles. No leftovers. As we will see later, this is somewhat of a special case. The optimal solution can look like:

----    110 PARAMETER tiles  solution

                  x           y       width        area

tile1-1                      12           1           1
tile1-2                      13           1           1
tile1-3          13                       1           1
tile1-4          13           1           1           1
tile1-5          13           2           1           1
tile1-6          13           3           1           1
tile2-1           3          10           2           4
tile2-2           7           6           2           4
tile2-3           1          12           2           4
tile2-4           3          12           2           4
tile2-5           7           8           2           4
tile3-1                                   3           9
tile3-2                       3           3           9
tile3-3                       6           3           9
tile3-4                       9           3           9
tile4-1           3           6           4          16
tile4-2           9                       4          16
tile4-3           5          10           4          16
tile5-1           9           4           5          25
tile5-2           9           9           5          25
tile6-1           3                       6          36
space                                    14         196

The solution presented in a numeric table is a bit difficult to grasp, so let's make a picture:

An optimal solution

It is noted that this solution is not unique.

A proper tiling has no overlapping tiles and no uncovered spaces, i.e., no holes. The picture confirms this is the case here.

Below I will discuss two different formulations. Warning: they are a bit tricky.

Formulation A: no-overlap constraints

One way to model this is to consider no-overlap constraints between two squares. The idea is as follows. Assume we have two squares $i,j$ with size $s_k$ and location $(x_k,y_k)$, then we can say: \[\begin{align} &x_j \ge x_i + s_i \>\mathbf{or} \\ & y_j \ge y_i+ s_i \>\mathbf{or} \\ & x_i \ge x_j + s_j \>\mathbf{or} \\ & y_i \ge y_j + s_j\end{align}\] A standard way to shoehorn this into a MIP model is to use big-M constraints: \[\begin{align} &x_j \ge x_i + s_i - M (1-\delta^{(1)}_{i,j}) \\ & y_j \ge y_i + s_i - M (1-\delta^{(2)}_{i,j}) \\ & x_i \ge x_j + s_j - M (1-\delta^{(3)}_{i,j}) \\ & y_i \ge y_j + s_j - M (1-\delta^{(4)}_{i,j})\\ & \sum_d \delta^{(d)}_{i,j} \ge 1\end{align}\] Alternatively, using the complement: \[\begin{align} &x_j \ge x_i + s_i - M \delta^{(1)}_{i,j} \\ & y_j \ge y_i + s_i - M\delta^{(2)}_{i,j} \\ & x_i \ge x_j + s_j - M \delta^{(3)}_{i,j} \\ & y_i \ge y_j + s_j - M \delta^{(4)}_{i,j}\\ & \sum_d \delta^{(d)}_{i,j} \le 3\end{align}\] The disadvantage of this formulation is that we need good, tight values for $M$.

Using this framework we can formulate the model:

Model A: No-overlap Constraints
Sets	\[\begin{align}&i,j&&\text{tiles}\\ &a&&\text{possible widths for space to fill: $1,2,\dots$} \\ & c =\{x,y\} && \text{coordinates}\\ & k = \{i\lt j,j\lt i\} && \text{no-overlap cases} \end{align}\]
Parameters	\[\begin{align} & \color{darkblue}{\mathit{size}}_{i} && \text{size of tile $i$}\\ &\color{darkblue} M &&\text{big-$M$ constants}\\ &\color{darkblue}{\mathit{sizeSpace}}_a = 1,2,\dots,\color{darkblue}{\mathit{maxSpace}} &&\text{possible sizes of total space we can fill}\end{align}\]
Variables	\[\begin{align} &\color{darkred}t_i \in \{0,1\}&&\text{select tile $i$}\\ &\color{darkred}{p}_{i,c} \ge 0 &&\text{position of tile $i$, $(x,y)$ coordinates} \\ &\color{darkred}W && \text{width of space we can fill} \\ &\color{darkred}A && \text{area of space to fill} \\ & \color{darkred}\delta_{i,j,k,c} \in \{0,1\} && \text{no-overlap cases} \\ & \color{darkred}\theta_a \in \{0,1\} && \text{select width of space to fill}\end{align} \]
Objective	\[\max \color{darkred}W \]	Maximize width of space we can fill
Constraints	\[ \color{darkred}p_{i,c} + \color{darkblue}{\mathit{size}}_{i}\le\color{darkred}{\mathit{W}}\>\>\forall i,c \]	Each tile should be located inside the space we want to fill. Really only important for selected tiles, but it is easier to apply this constraint to all tiles.
	\[\begin{align} & \color{darkred}p_{j,c} \ge \color{darkred}p_{i,c} + \color{darkblue}{\mathit{size}}_{i} - \color{darkblue}M (1-\color{darkred}\delta_{i,j,i\lt j,c}) - \color{darkblue}M(1-\color{darkred}t_i) - \color{darkblue}M(1-\color{darkred}t_j)&& \forall i\lt j,c \\ & \color{darkred}p_{j,c} + \color{darkblue}{\mathit{size}}_{j} \le \color{darkred}p_{i,c} + \color{darkblue}M (1-\color{darkred}\delta_{i,j,j\lt i,c}) + \color{darkblue}M(1-\color{darkred}t_i) + \color{darkblue}M(1-\color{darkred}t_j)&& \forall i\lt j,c \\ & \sum_{k,c} \color{darkred}\delta_{i,j,k,c} \ge 1 && \forall i\lt j\end{align} \]	No-overlap constraints. Only compare when both tiles $i$ and $j$ are selected.
	\[\begin{align} &\color{darkred}W = \sum_a \color{darkblue}{\mathit{sizeSpace}}_a \cdot\color{darkred}\theta_a\\ &\color{darkred}A = \sum_a \color{darkblue}{\mathit{sizeSpace}}_a^2 \cdot\color{darkred}\theta_a \\& \sum_a \color{darkred}\theta_a = 1 \end{align}\]	Select width of space to fill. Both size and area are calculated. We use binary variables to prevent the model to become quadratic.
	\[\sum_i \color{darkblue}{\mathit{size}}_{i}\cdot \color{darkred}t_i = \color{darkred}A \]	Area of the selected tiles should be equal to the area of the outer space. This will ensure the whole space is covered with tiles.

In the model, we need both the area and the length of each side (called size in the model). The easiest would be to add \[A = W^2\] However this would make the model non-linear. We use here that the outer area can only have integer-valued sizes.

A way to get good values for the big-$M$ constants is the following. From our inventory of tiles, we can calculate the total area we can cover. The maximum size of the outer box is the square root of this number, rounded down to the nearest integer: \[\mathit{maxSpace} = \left\lfloor \sqrt{\sum_i \mathit{size}_i^2} \right\rfloor\] We can simply use \[M = \mathit{maxSpace}\] We can refine this further by taking the size of the individual tiles into account (basically replacing $M$ by $M_{i,j}$.

I also added some ordering constraints: \[t_i \ge t_{i+1} \>\>\text{if $\mathit{size}_i = \mathit{size}_{i+1}$}\] This will select tiles according to their ordinal number. Furthermore, \[\sum_c p_{i,c} \ge \sum_c p_{i+1,c} \>\>\text{if $\mathit{size}_i = \mathit{size}_{i+1}$}\] This will order the locations, and hopefully breaks some symmetry in the model.

This model works fine for our example data. However, for larger instances, things become very slow.

Formulation B: grid approach

In this formulation, we work with binary variables: \[x_{i,p,q}=\begin{cases} 1 & \text{if tile $i$ is placed at location $(p,q)$} \\ 0 & \text{otherwise}\end{cases}\] Then for each location $(p,q)$ we check that the number tiles covering this location is equal to one within the area to fill and zero outside. To model this we develop a binary parameter \[\mathit{cover}_{p1,q1,i,p,q}=\begin{cases} 1 & \text{if $(p1,q1)$ is covered when tile $i$ is placed at $(p,q)$}\\ 0 & \text{otherwise}\end{cases}\] Furthermore, we maintain a boolean matrix $y_{p1,q1}$ with \[y_{p1,q1} = \begin{cases} 1 & \text{if $(p1,q1)$ is within our current area to fill} \\ 0 & \text{otherwise}\end{cases}\] Notice that this area is shrinking during the solution process, so this matrix is dynamic (i.e. a variable).

y matrix

Detail: $y$ has $P+1$ rows and columns. Initially, we just have one final row and column of zeroes. The size of $P$ is determined from the inventory: sum all areas of the tiles, take the square root and round down. The way we keep track of $y$ is to introduce a binary variable $\delta_p\in \{0,1\}$ with \[\begin{align} &\delta_p \le \delta_{p-1} \\ & \sum_p \delta_p = W\end{align}\] Now we just can do $y_{p,q} = \delta_p \cdot \delta_q$, for which a standard linearization is available.

With this, the covering constraint can look like: \[ \sum_{i,p,q} \mathit{cover}_{p1,q1,i,p,q} x_{i,p,q} = y_{p1,q1}\>\>\forall p1,q1\] This says: each cell inside the area to fill is covered by exactly one tile, and cells outside can not be covered. The complete model can look like:

Model B: Grid Approach
Sets	\[\begin{align}&i&&\text{tiles}\\ &p,q && \text{gridpoints, $p,q\in \{1,..,P\}$ }\\ & p1,q1 && \text{gridpoints plus extra row and column, $p1,q1\in\{1,\dots,P+1\}$} \end{align}\]
Parameters	\[\begin{align} & \color{darkblue}{\mathit{size}}_{i} && \text{size of tile $i$}\\ & \color{darkblue}{\mathit{cover}}_{p1,q1,i,p,q}\in \{0,1\} && \text{1 if $(p1,q1)$ is covered by tile $i$ at $(p,q)$} \end{align}\]
Variables	\[\begin{align} &\color{darkred}x_{i,p,q} \in \{0,1\}&&\text{place tile $i$ at location $(p,q)$}\\ &\color{darkred}{y}_{p1,q1} \in \{0,1\} &&\text{1 inside area to be filled, 0 outside} \\ & \color{darkred}\delta_{p1} \in \{0,1\} && \text{1 if $p1\le W$}\\ &\color{darkred}W && \text{width of space we can fill} \end{align} \]	The $y$ variables can be relaxed to ${y}_{p1,q1} \in [0,1]$.
Objective	\[\max \color{darkred}W \]	Maximize width of space we can fill
Constraints	\[ \sum_{p,q} \color{darkred}x_{i,p,q} \le 1 \>\>\forall i \]	Each tile can be placed once
	\[ \sum_{i,p,q} \color{darkblue}{\mathit{cover}}_{p1,q1,i,p,q} \cdot \color{darkred} x_{i,p,q} = \color{darkred}y_{p1,q1}\>\>\forall p1,q1\]	Cover constraint. Inside the fill-area, each cell must be covered exactly once. Outside: zero coverage.
	\[\begin{align} &\color{darkred} \delta_{p1} \le \color{darkred}\delta_{p1-1}\\ & \sum_{p1} \color{darkred}\delta_{p1}=\color{darkred} W\end{align}\]	Force a pattern like 111000. Number of ones is $W$. The last element is a 0. This is used to form the $y$ matrix.
	\[\begin{align} &\color{darkred}y_{p1,q1} \le \color{darkred}\delta_{p1}\\ & \color{darkred}y_{p1,q1} \le \color{darkred}\delta_{q1} \\ & \color{darkred}y_{p1,q1} \ge \color{darkred} \delta_{p1} +\color{darkred}\delta_{q1}-1\end{align}\]	Maintain the $y$ matrix. The first W rows and columns are ones. The other entries are zeros. One extra row and column of zeros for safeguarding against covering outside the area. These constraints are a linearization of a binary multiplication.

This model can handle larger problems. I tried the inventory:

----     56 PARAMETER input  inventory data

            count       width        area

tile1           9           1           1
tile2           8           2           4
tile3           7           3           9
tile4           6           4          16
tile5           5           5          25
tile6           4           6          36
tile7           3           7          49
tile8           2           8          64
tile9           1           9          81
total          45                     825

This is a larger data set with 45 tiles. The total area of the tiles is 825. The square root is \[\sqrt{825} = 28.723\] So the maximum square we can hope to fill is $28 \times 28$ with an area of 784. The generated MIP model is very large:

MODEL STATISTICS

BLOCKS OF EQUATIONS           8     SINGLE EQUATIONS        3,473
BLOCKS OF VARIABLES           5     SINGLE VARIABLES       36,196  1 projected
NON ZERO ELEMENTS       614,345     DISCRETE VARIABLES     35,353

Although it is large, it solves rather easily. We just needed 10 nodes. It has the following optimal solution:

----    161 PARAMETER tiles  solution

                  x           y       width        area

item1-1          27          23           1           1
item1-2                      10           1           1
item1-3          10          13           1           1
item1-4                       9           1           1
item1-5          14           5           1           1
item1-6          14          18           1           1
item1-7          20           6           1           1
item1-8          27          24           1           1
item2-1          25          23           2           4
item2-2           1           9           2           4
item2-3                      14           2           4
item2-4                      24           2           4
item2-5                      16           2           4
item2-6                      26           2           4
item3-1                                   3           9
item3-2          11          16           3           9
item3-3                       6           3           9
item3-4                      11           3           9
item3-5                       3           3           9
item3-6          25          25           3           9
item3-7          11          13           3           9
item4-1           2          14           4          16
item4-2          10           5           4          16
item4-3          10           9           4          16
item4-4           2          24           4          16
item5-1          10                       5          25
item5-2          15          18           5          25
item5-3          15          23           5          25
item5-4           6          14           5          25
item5-5          20          23           5          25
item6-1          15                       6          36
item6-2          14           6           6          36
item6-3                      18           6          36
item6-4          14          12           6          36
item7-1           3                       7          49
item7-2           3           7           7          49
item7-3          21                       7          49
item8-1          20          15           8          64
item8-2          20           7           8          64
item9-1           6          19           9          81
space                                    28         784

Indeed we find the $28 \times 28$ tiling.

An optimal solution

Summary of performance

These results are with Cplex, 8 threads, default settings.

	Data set 1	Data set 2
Data
Tiles	21	45
Area	196	824
Solution
Width	14	28
Area	196	784
Tiles used	21	40
Model A
Variables	919	4,125
Binary variables	875	4,033
Constraints	1,126	5,116
Solution time	39	NA
Nodes	73,975	NA
Model B
Variables	4,378	36,196
Binary variables	4,151	35,353
Constraints	950	3,473
Solution time	1.3	256
Nodes	0	10

Obviously, approach B is performing much better, even though it generates much larger MIP models.

An example with smaller covered area

Above we discussed two examples:

In the first example, we could use all the tiles of the inventory.
In the second example, the square area we could cover was equal to the maximum possible area calculated from the total area of the tiles.

Here is a small example where we are much below this maximum estimate. The data looks like:

----     44 PARAMETER input  inventory data

            count       width        area

tile1           1           1           1
tile2           1           2           4
tile3           1           3           9
tile4           1           4          16
tile5           1           5          25
tile6           1           6          36
tile7           1           7          49
tile8           1           8          64
tile9           1           9          81
total           9                     285

So we have one of each size from 1 to 9. Looking at the total area, the maximum size we can expect is:\[\left\lfloor \sqrt{285} \right\rfloor = 16\] When we solve this, we see:

----    125 PARAMETER tiles  solution

              width        area

tile9-1           9          81
space             9          81

I.e. just the $9 \times 9$ tile is used.

Conclusion

This problem is a bit more complex to model as a MIP. The first "no-overlap" formulation did not perform very well. The second "grid cover" formulation did much better, and I could solve some larger problems in a few minutes.

References

Max square side length of smaller squares, https://stackoverflow.com/questions/60526058/max-square-side-lenght-of-smaller-squares