Yet Another Math Programming Consultant: (Pre)solving Killer Sudokus

Killer Sudoku [1] is a variant of Sudoku where an additional constraint is that cells belonging to a cage (the colored regions in the picture) add up to a given value.

Killer Sudoku (from [1])

As in standard Sudokus, we need to fill each cell with a number $1,\dots,9$, and values in each row, column and $(3 \times 3)$ box must be unique. In addition, all values in a cage should be unique. In this puzzle we don't see that some cells already have given values: we need to fill-in all the cells. The solution for the above puzzle is:

Solution (from [1])

Using the standard decision variables for Sudoku puzzles: \[x_{i,j,k} = \begin{cases} 1 & \text{if cell $(i,j)$ has value $k$} \\ 0 & \text{otherwise}\end{cases}\] we can solve this easily with a MIP model. In the model below we use two data structures:

a mapping $\mathit{amap}(a,i,j)$ between area $a$ (an area is a row, column or box) and cells $(i,j)$: $\mathit{amap}(a,i,j)$ is $\mathit{true}$ if cell $(i,j)$ belongs to area $a$,
a similar mapping $\mathit{cmap}(c,i,j)$ between a cage $c$ and cell numbers $(i,j)$.

With this we can form a compact model:

Killer Sudoku model
\[\begin{align} & \sum_k \color{DarkRed} x_{i,j,k} = 1 && \forall i,j && \text{One value per cell}\\ & \sum_{i,j\| \color{DarkBlue}{\mathit{amap}}(a,i,j)} \color{DarkRed} x_{i,j,k} = 1 && \forall a,k && \text{Unique values in each row, column and box} \\ & \sum_{i,j\| \color{DarkBlue}{\mathit{cmap}}(c,i,j)} \color{DarkRed} x_{i,j,k} \le 1 && \forall c,k && \text{Unique values in each cage}\\ & \sum_{i,j,k\| \color{DarkBlue}{\mathit{cmap}}(c,i,j)} \color{DarkBlue} k \cdot \color{DarkRed} x_{i,j,k} = \color{DarkBlue} {\mathit{CageSum}}_c && \forall c && \text{Cell values add up to cage sum} \\ & \color{DarkRed} x_{i,j,k} \in \{0,1\} \end{align}\]

Killer Sudoku model

\[\begin{align}
& \sum_k \color{DarkRed} x_{i,j,k} = 1 && \forall i,j && \text{One value per cell}\\
& \sum_{i,j| \color{DarkBlue}{\mathit{amap}}(a,i,j)} \color{DarkRed} x_{i,j,k} = 1 && \forall a,k && \text{Unique values in each row, column and box} \\
& \sum_{i,j| \color{DarkBlue}{\mathit{cmap}}(c,i,j)} \color{DarkRed} x_{i,j,k} \le 1 && \forall c,k && \text{Unique values in each cage}\\
& \sum_{i,j,k| \color{DarkBlue}{\mathit{cmap}}(c,i,j)} \color{DarkBlue} k \cdot \color{DarkRed} x_{i,j,k} = \color{DarkBlue} {\mathit{CageSum}}_c && \forall c && \text{Cell values add up to cage sum} \\
& \color{DarkRed} x_{i,j,k} \in \{0,1\} \end{align}\]

There is no objective in this model: we are just looking for a feasible integer solution.

Implementation

The equivalent GAMS model can look like:

set
a     'area' /i1*i9,j1*j9,b1*b9/
i(a) 'rows' /i1*i9/
j(a) 'columns' /j1*j9/
b(a) 'boxes' /b1*b9/
amap(a,i,j) 'mapping area<->cells'
k     'values' /1*9/
;

*
* populate amap
* rows, columns, and boxes a.k.a nonets
*
amap(i,i,j) = yes;
amap(j,i,j) = yes;
amap(b,i,j) = ord(b) = 3*ceil(ord(i)/3) + ceil(ord(j)/3) - 3;
display amap;

*
* cages
*
sets
c 'cages' /cage1*cage29/
cmap(c,i,j) 'mapping cage<->cells'
;
table cageno(i,j) 'cage numbering'
      j1 j2 j3 j4 j5 j6 j7 j8 j9
i1   1 1 2 2 2 3 4 5 6
i2   7 7 8 8 3 3 4 5 6
i3   7 7 9 9 3 10 11 11 6
i4 12 13 13 9 14 10 11 15 6
i5 12 16 16 17 14 10 15 15 18
i6 19 16 20 17 14 21 22 22 18
i7 19 20 20 17 23 21 21 24 24
i8 19 25 26 23 23 27 27 24 24
i9 19 25 26 23 28 28 28 29 29
;
cmap(c,i,j) = ord(c)=cageno(i,j);
parameter cagesum(c) /
cage1   3, cage11 20, cage21 20
cage2 15, cage12   6, cage22   6
cage3 22, cage13 14, cage23 10
cage4   4, cage14 17, cage24 14
cage5 16, cage15 17, cage25   8
cage6 15, cage16 13, cage26 16
cage7 25, cage17 20, cage27 15
cage8 17, cage18 12, cage28 13
cage9   9, cage19 27, cage29 17
cage10 8, cage20   6
/;

binary variable x(i,j,k);
variable z 'obj';

equation
   cellvalue(i,j) 'one value per cell'
   unique(a,k)    'unique values in each area'
   cunique(c,k)   'unique values in each cage'
   cagesums(c)    'summation of cage cells'
   obj            'dummy objective'
;

cellvalue(i,j).. sum(k, x(i,j,k)) =e= 1;
unique(a,k)..    sum(amap(a,i,j),x(i,j,k)) =e= 1;
cunique(c,k)..   sum(cmap(c,i,j),x(i,j,k)) =l= 1;
cagesums(c)..    sum((cmap(c,i,j),k), k.val*x(i,j,k)) =e= cagesum(c);
obj..            z =e= 0;

model killer /all/;
solve killer using mip minimizing z;

parameter v(i,j) 'optimal values';
v(i,j) = sum(k, k.val*x.l(i,j,k));
option v:0;
display v;

The function \[f(i,j) = 3\left\lceil i/3 \right\rceil + \left\lceil j/3 \right\rceil – 3\] maps $(i,j)$ to a box number. We can check this by the fragment:

set
i /i1*i9/
j /j1*j9/
;
parameter f(i,j) 'box number';
f(i,j) = 3*ceil(ord(i)/3) + ceil(ord(j)/3) - 3;
display f;

The output of this fragment is:

----      7 PARAMETER f  box number

            j1          j2          j3          j4          j5          j6          j7          j8          j9

i1       1.000       1.000       1.000       2.000       2.000       2.000       3.000       3.000       3.000
i2       1.000       1.000       1.000       2.000       2.000       2.000       3.000       3.000       3.000
i3       1.000       1.000       1.000       2.000       2.000       2.000       3.000       3.000       3.000
i4       4.000       4.000       4.000       5.000       5.000       5.000       6.000       6.000       6.000
i5       4.000       4.000       4.000       5.000       5.000       5.000       6.000       6.000       6.000
i6       4.000       4.000       4.000       5.000       5.000       5.000       6.000       6.000       6.000
i7       7.000       7.000       7.000       8.000       8.000       8.000       9.000       9.000       9.000
i8       7.000       7.000       7.000       8.000       8.000       8.000       9.000       9.000       9.000
i9       7.000       7.000       7.000       8.000       8.000       8.000       9.000       9.000       9.000

For more information see [2].

The results of the complete MIP model are:

----     75 PARAMETER v  optimal values

            j1          j2          j3          j4          j5          j6          j7          j8          j9

i1           2           1           5           6           4           7           3           9           8
i2           3           6           8           9           5           2           1           7           4
i3           7           9           4           3           8           1           6           5           2
i4           5           8           6           2           7           4           9           3           1
i5           1           4           2           5           9           3           8           6           7
i6           9           7           3           8           1           6           4           2           5
i7           8           2           1           7           3           9           5           4           6
i8           6           5           9           4           2           8           7           1           3
i9           4           3           7           1           6           5           2           8           9

Sudokus are typically solved in the presolver: the MIP solver does not need to do any Simplex iterations or Branch-and-Bound nodes. We see the same thing here. The Cplex solver log shows:

MIP Presolve eliminated 608 rows and 724 columns.
MIP Presolve modified 882 coefficients.
Aggregator did 6 substitutions.
All rows and columns eliminated.
Presolve time = 0.11 sec. (5.58 ticks)

Proven optimal solution.

MIP Solution: 0.000000 (0 iterations, 0 nodes)

CBC is also able to solve this without much sweat:

Calling CBC main solution routine...

No integer variables out of 560 objects (560 integer) have costs

branch on satisfied N create fake objective Y random cost Y

Integer solution of 0 found by feasibility pump after 0 iterations and 0 nodes (0.94 seconds)

Search completed - best objective 0, took 0 iterations and 0 nodes (0.96 seconds)

It looks like CBC is not able to presolve the whole model away. It subsequently invokes the feasibility pump (a heuristic to find feasible solutions quickly).

Multiple solutions?

We can easily prove this solution is unique, by adding a cut that forbids this solution, and resolving. I added to the model:

equation cut;
cut.. sum((i,j,k),(2*x.l(i,j,k)-1)*x(i,j,k)) =l= sum((i,j,k),x.l(i,j,k))-1;

Indeed this solution is unique: adding the cut made the model infeasible.

Unique values in a cage

Firstly, it is noted that for this data set we could have removed the condition that within a cage we only want unique values. This constraint is non-binding: if we remove it we find the same solution. Even stronger: after removing the cunique constraint, we still get a single, unique solution.

An alternative formulation is to combine the constraints unique and cunique into one big equation For this we add the cages $c$ to our amap data structure:

*
* populate amap
* rows, columns, boxes and cages
*
amap(i,i,j) = yes;
amap(j,i,j) = yes;
amap(b,i,j) = ord(b) = 3*ceil(ord(i)/3) + ceil(ord(j)/3) - 3;
amap(c,i,j) = cmap(c,i,j);
display amap;

Now the simplified model can look like:

Killer Sudoku model 2
\[\begin{align} & \sum_k \color{DarkRed} x_{i,j,k} = 1 && \forall i,j && \text{One value per cell}\\ & \sum_{i,j\| \color{DarkBlue}{\mathit{amap}}(a,i,j)} \color{DarkRed} x_{i,j,k} \le 1 && \forall a,k && \text{Unique values in each row, column, box and cage} \\ & \sum_{i,j,k\| \color{DarkBlue}{\mathit{cmap}}(c,i,j)} \color{DarkBlue} k \cdot \color{DarkRed} x_{i,j,k} = \color{DarkBlue} {\mathit{CageSum}}_c && \forall c && \text{Cell values add up to cage sum} \\ & \color{DarkRed} x_{i,j,k} \in \{0,1\} \end{align}\]

Killer Sudoku model 2

\[\begin{align}
& \sum_k \color{DarkRed} x_{i,j,k} = 1 && \forall i,j && \text{One value per cell}\\
& \sum_{i,j| \color{DarkBlue}{\mathit{amap}}(a,i,j)} \color{DarkRed} x_{i,j,k} \le 1 && \forall a,k && \text{Unique values in each row, column, box and cage} \\ & \sum_{i,j,k| \color{DarkBlue}{\mathit{cmap}}(c,i,j)} \color{DarkBlue} k \cdot \color{DarkRed} x_{i,j,k} = \color{DarkBlue} {\mathit{CageSum}}_c && \forall c && \text{Cell values add up to cage sum} \\
& \color{DarkRed} x_{i,j,k} \in \{0,1\} \end{align}\]

It requires a bit of thought that we can use a $\le$ constraint for all uniqueness constraints.

When we solve this model with Cplex, it is interesting to see that the presolver is not as effective:

MIP Presolve eliminated 328 rows and 382 columns.

MIP Presolve modified 825 coefficients.

Aggregator did 7 substitutions.

Reduced MIP has 280 rows, 341 columns, and 1645 nonzeros.

Reduced MIP has 341 binaries, 0 generals, 0 SOSs, and 0 indicators.

Presolve time = 0.05 sec. (7.14 ticks)

Found incumbent of value 0.000000 after 0.14 sec. (17.27 ticks)

The message "All rows and columns eliminated" is notably missing.

References

Killer Sudoku, https://en.wikipedia.org/wiki/Killer_sudoku
MIP Modeling: from Sudoku to KenKen via Logarithms, https://yetanothermathprogrammingconsultant.blogspot.com/2016/10/mip-modeling-from-sudoku-to-kenken.html

Yet Another Math Programming Consultant

Sunday, January 6, 2019

(Pre)solving Killer Sudokus

Implementation

Multiple solutions?

Unique values in a cage

References

No comments:

Post a Comment