Wednesday, May 4, 2022

Evil Sudoku

On sudoku.com[1] there is now an "evil" difficulty level. I found these very difficult to solve by hand (I usually give up).



 

Of course, a good MIP presolver/preprocessor devours these in no time: a model like this should solve in less than 0.1 seconds, using zero Simplex iterations and zero branch-and-bound nodes.


Version identifier: 20.1.0.1 | 2021-04-07 | 3a818710c
CPXPARAM_Advance                                 0
CPXPARAM_Threads                                 1
CPXPARAM_MIP_Display                             4
CPXPARAM_MIP_Pool_Capacity                       0
CPXPARAM_MIP_Tolerances_AbsMIPGap                0
Generic callback                                 0x50
Tried aggregator 2 times.
MIP Presolve eliminated 168 rows and 580 columns.
Aggregator did 28 substitutions.
Reduced MIP has 128 rows, 122 columns, and 501 nonzeros.
Reduced MIP has 122 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.06 sec. (0.93 ticks)
Found incumbent of value 0.000000 after 0.06 sec. (1.66 ticks)

Root node processing (before b&c):
  Real time             =    0.06 sec. (1.67 ticks)
Sequential b&c:
  Real time             =    0.00 sec. (0.00 ticks)
                          ------------
Total (root+branch&cut) =    0.06 sec. (1.67 ticks)

--- MIP status (101): integer optimal solution.
--- Cplex Time: 0.06sec (det. 1.67 ticks)

Proven optimal solution
MIP Solution:            0.000000    (0 iterations, 0 nodes)
Final Solve:             0.000000    (0 iterations)

Best possible:           0.000000
Absolute gap:            0.000000
Relative gap:            0.000000


CBC is a bit terser here (that is not always the case):


COIN-OR CBC      38.2.1 96226ea8 Feb 19, 2022          WEI x86 64bit/MS Window

COIN-OR Branch and Cut (CBC Library 2.10)
written by J. Forrest

Calling CBC main solution routine...
No integer variables - nothing to do

Solved to optimality (within gap tolerances optca and optcr).
MIP solution:    0.000000e+00   (0 nodes, 0.019 seconds)

Best possible:   0.000000e+00
Absolute gap:    0.000000e+00   (absolute tolerance optca: 0)
Relative gap:    0.000000e+00   (relative tolerance optcr: 0.0001)



References


  1. https://sudoku.com/evil/

Appendix: GAMS model


$ontext

  Evil Sudoku from sudoku.com

  References:
  https://yetanothermathprogrammingconsultant.blogspot.com/2022/05/evil-sudoku.html

$offtext

set
  a        'all areas (rows,columns,squares)' /r1*r9,c1*c9,s1*s9/
  i(a)     'rows'    /r1*r9/
  j(a)     'columns' /c1*c9/
  s(a)     'squares' /s1*s9/
  u(a,i,j) 'areas with unique values: a=area name, (i,j) are the cells'
  k        'values'  /1*9/
;

*
* populate u(a,i,j)
*
u(i,i,j) = yes;
u(j,i,j) = yes;
u(s,i,j)$(ord(s)=3*ceil(ord(i)/3)+ceil(ord(j)/3)-3) = yes;
display u;

*
* given values
*
table v0(i,j)
    c1 c2 c3 c4 c5 c6 c7 c8 c9
r1                 9     5
r2  7                    1
r3        8  2        7     9
r4  6              4  9     8
r5     4     1
r6                          5
r7        2     3
r8  8              1  6     4
r9                       7
;

scalar nGiven 'number of given values';
nGiven = card(v0);
display nGiven;

*
* MIP model
*
binary variable x(i,j,k);
x.fx(i,j,k)$(v0(i,j)=k.val) = 1;
variable z 'dummy objective';

equations
   dummy       'objective'
   unique(a,k) 'all-different'
   values(i,j) 'one value per cell'
;

dummy..        z =e= 0;
unique(a,k)..  sum(u(a,i,j), x(i,j,k)) =e= 1;
values(i,j)..  sum(k, x(i,j,k)) =e= 1;

model sudoku /all/;

*
* solve
*
solve sudoku minimizing z using mip;

*
* display solution
*
parameter v(i,j) 'solution';
v(i,j) = sum(k, k.val*x.l(i,j,k));
option v:0;
display v;



Most models have separate constraints for rows, columns, and squares. In this formulation, I generalize these to "areas". The multidimensional set \(u(a,i,j)\) contains, for each area \(a\), the corresponding cells \((i,j)\). You can display this set to inspect it. Once \(u\) is properly populated, the uniqueness constraints become exceedingly simple. The idea is that constraints are more difficult to debug than sets and parameters, so I am inclined to spend more effort on building somewhat complex sets, expecting a pay-off in constraints that are so simple that little can go wrong.
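For example, cell \((r5,c7)\) gives square index \(3\lceil 5/3\rceil + \lceil 7/3\rceil - 3 = 3\cdot 2 + 3 - 3 = 6\), so \(u\) receives the element (s6,r5,c7), in addition to (r5,r5,c7) and (c7,r5,c7) for the row and column areas.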


u(a,i,j) for different values of a



As usual, we have \[\color{darkred}x_{i,j,k} = \begin{cases} 1 &  \text{if cell $(i,j)$ has value $k$} \\ 0 & \text{otherwise}\end{cases}\] To recover the Sudoku solution, we form \[ \color{darkblue}v_{i,j} = \sum_k \color{darkblue}k\cdot \color{darkred}x^*_{i,j,k}\] 
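Apart from the dummy objective, the whole model is \[\begin{align}&\sum_{(i,j)|\color{darkblue}u_{a,i,j}} \color{darkred}x_{i,j,k}=1 && \forall a,k\\ &\sum_k \color{darkred}x_{i,j,k}=1 && \forall i,j\\ &\color{darkred}x_{i,j,k}\in\{0,1\}\end{align}\] with the given cells fixed to their values. As a small, optional post-solve sanity check (not part of the original model; the scalar name viol below is just for illustration), one could verify in GAMS that every area indeed receives each value exactly once:

* optional check (not in the original model): each area must
* contain every value exactly once
scalar viol 'violations of the all-different structure';
viol = sum((a,k), abs(sum(u(a,i,j), x.l(i,j,k)) - 1));
display viol;
abort$(viol > 0.5) "not a valid Sudoku solution";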

The final solution is:

----     77 PARAMETER v  solution

            c1          c2          c3          c4          c5          c6          c7          c8          c9

r1           3           2           4           7           1           9           8           5           6
r2           7           5           9           8           6           3           4           1           2
r3           1           6           8           2           4           5           7           3           9
r4           6           7           1           3           5           4           9           2           8
r5           9           4           5           1           8           2           3           6           7
r6           2           8           3           9           7           6           1           4           5
r7           4           9           2           6           3           7           5           8           1
r8           8           3           7           5           2           1           6           9           4
r9           5           1           6           4           9           8           2           7           3

5 comments:

  1. A question:
    How do you decide which cells to specify so that the difficulty is Easy, Medium, Hard, or Evil?
    Can this be formulated as a MILP problem?
    How does sudoku.com do it?

    Replies
    1. I have heard about CP being used to generate Sudokus for the French newspaper Le Monde. Another approach would be to generate random problems and check their difficulty. There seems to be an approach for quantifying the difficulty of a problem: https://www.technologyreview.com/2012/08/06/18948/mathematics-of-sudoku-leads-to-richter-scale-of-puzzle-hardness/

    2. Thanks Erwin. I had a chance to read the mentioned paper. However, the material there is not explained as clearly as you explain things. I thought you might have some ideas on how to quantify the difficulty and add it as a constraint in a MILP formulation, something like:
      Solve problem
      s.t.
      constraints +
      difficulty level > threshold

    3. In 2005 Helmut Simonis wrote a paper on modelling Sudoku with constraint programming (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.88.2964) where he classified a large set of Sudoku instances by what type of propagation would solve them without search. He found that with domain-consistent propagation for the 27 all_different constraints (in his paper called hyper-arc consistency) and shaving (sometimes also called singleton arc consistency: removing values for cells by testing if that value for that cell would lead to a contradiction), all instances could be solved without search. Some minimal instances could not be solved with shaving and just value consistency (called forward checking in the paper: naive propagation that just removes assigned values from other variables in the constraint).

      I added the above instance to a standard MiniZinc Sudoku formulation and tested it with the Gecode constraint solver. With just normal propagation, search was needed (although stronger propagation meant fewer nodes). With bounds shaving (-O4), all the propagation levels could solve it without search. Note that this means that some instances in Simonis' paper were harder according to his metric than the above instance.

      The full model and some basic test results are in this Gist: https://gist.github.com/zayenz/5f9a796445f954c0470b60eb681e6b56
