Yet Another Math Programming Consultant: Small MIP, proving optimality is difficult

This is a simple, smallish MIP model that is difficult to solve to proven optimality. The problem is stated as [1]:

I have grid of dimensions H and W of boolean variables. The only constraint is that if a variable is false then at least one of the adjacent variables (top, right, left, bottom, diagonals don't count) must be true. The goal is to minimize the number of true values in the grid.

A high-level mathematical representation of this can be:

High-level model
\[\begin{align}\min&\sum_{i,j}\color{darkred}x_{i,j}\\ & \color{darkred}x_{i,j}=0 \implies \color{darkred}x_{i-1,j} + \color{darkred}x_{i+1,j} + \color{darkred}x_{i,j-1} + \color{darkred}x_{i,j+1} \ge 1 \\ & \color{darkred}x_{i,j}\in \{0,1\} \end{align}\]

The implication can be written as a simple linear constraint: \[ \color{darkred}x_{i-1,j} + \color{darkred}x_{i+1,j} + \color{darkred}x_{i,j-1} + \color{darkred}x_{i,j+1} \ge 1 - \color{darkred}x_{i,j}\] or \[ \color{darkred}x_{i,j} + \color{darkred}x_{i-1,j} + \color{darkred}x_{i+1,j} + \color{darkred}x_{i,j-1} + \color{darkred}x_{i,j+1} \ge 1 \] We can also find the last constraint directly from the description. Just a different way of looking at the problem statement: at least one from $\color{darkred}x_{i,j}$ and its direct neighbors should be 1.

One issue that always warrants attention is: how to deal with the border? In this case we can assume $\color{darkred}x_{i-1,j}=0$ for $i=1$. In other words, all $\color{darkred}x_{i,j}$ outside the board are zero. This is exactly how GAMS behaves. GAMS will return a zero when addressing a variable or parameter with leads or lags outside its domain. So, we can simply use the following GAMS implementation:

set
i 'rows' /row1*row32/
j 'columns' /col1*col32/
;

variable z 'objective';
binary variable x(i,j) 'cells';

equations
obj 'objective'
eq(i,j) 'implication'
;

obj.. z =e= sum((i,j),x(i,j));
eq(i,j).. x(i,j) + x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) =g= 1;

model m /all/;
solve m minimizing z using mip;

option x:0:1:1;
display x.l;

A solution with objective 227 can be found rather quickly, but proving optimality is more difficult. The solution of 227 was found after 65 seconds, but I could not prove this was the optimum before hitting my time limit of 1 hour.

A model with just 1024 binary variables can be difficult even for high-end solvers. Of course $2^{1024}$ is not that small:

Improvement

One way to improve on the performance is to notice that the interior is rather regular and that the behavior near the border is more unpredictable. That makes sense as we assume all values outside the board are zero. That makes the constraint more difficult to obey on or near the border. My idea is to write the equations as:

\[\begin{align} & \color{darkred}x_{i,j} + \color{darkred}x_{i-1,j} + \color{darkred}x_{i+1,j} + \color{darkred}x_{i,j-1} + \color{darkred}x_{i,j+1} = 1 && \text{for $(i,j)$ in the interior part} \\ & \color{darkred}x_{i,j} + \color{darkred}x_{i-1,j} + \color{darkred}x_{i+1,j} + \color{darkred}x_{i,j-1} + \color{darkred}x_{i,j+1} \ge 1 && \text{for $(i,j)$ near the border}\end{align}\]

I define here the border region as the first and last two rows and columns. The depth of the border region needed to be two levels. When using a skinny border of just one cell, the model was infeasible. This version of the model is solved very fast (0.6 seconds; zero nodes were needed and only 141 simplex iterations, with proven optimality). It is quite remarkable how much impact this new formulation has on the performance. We are talking about a > 6000x performance increase. The picture becomes:

I have shaded the border area. The interior part now has a perfectly clean regular pattern, and all the dirt has been swept to the border.

We could even do the inner portion "by hand" (i.e. by some code), and let the solver worry about the border only. Obviously, inspiration for this approach came from looking at the pictures we produced. This example illustrates how important good, meaningful visualizations can be.

Complete GAMS Model

$onText

https://stackoverflow.com/questions/78987960/suggestion-on-which-constraint-to-adds-to-optimize-minizincs-model

Problem definition:

   I have grid of dimensions H and W of boolean variables. The only constraint is that if
   a variable is false then at least one of the adjacent variables (top, right, left, bottom,
   diagonals don't count) must be true. The goal is to minimize the number of true values in the grid.

I.e.
     x(i,j) = 0 => x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) >= 1
     
This can be written as:

     x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) >= 1 - x(i,j)
     
or

     x(i,j) + x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) >= 1
     

$offText

*------------------------------------------------
* problem size
*------------------------------------------------

set
   i 'rows' /row1*row32/
   j 'columns' /col1*col32/
;

*------------------------------------------------
* MIP model (difficult to prove optimality)
*------------------------------------------------

variable z 'objective';
binary variable x(i,j) 'cells';

equations
    obj 'objective'
    eq(i,j) 'implication'
;

obj.. z =e= sum((i,j),x(i,j));
eq(i,j).. x(i,j) + x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) =g= 1;

model m /all/;
option reslim = 100;
solve m minimizing z using mip;

option x:0:0:7;
display x.l;

*------------------------------------------------
* improved MIP model (solves very fast)
*------------------------------------------------

set
  interior(i,j)
  border(i,j)
;

interior(i,j) = ord(i) > 2 and ord(i) < card(i)-1 and ord(j) > 2 and ord(j) < card(j)-1;
border(i,j) = not interior(i,j);

equations
    eq1(i,j) 'interior constraints'
    eq2(i,j) 'border constraints'
;

eq1(interior(i,j)).. x(i,j) + x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) =e= 1;
eq2(border(i,j))..   x(i,j) + x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) =g= 1;

model m2 /obj,eq1,eq2/;
solve m2 minimizing z using mip;

display x.l;

*------------------------------------------------
* Visualization
*------------------------------------------------

$set html board.html
$set drawborder 1


file f /%html%/;
put f;

put '<h1>Board Problem</h1>'/;

put '<table>'
    '<tr><td>Board size<td><td><pre>',card(i):0:0,'x',card(j):0:0,'</pre></td></tr>'
    '<tr><td>Binary variables<td><td><pre>',card(x):0:0,'</pre></td></tr>'
    '<tr><td>Objective<td><td><pre>',z.l:0:0,'</pre></td></tr>'
    '</table><p>'/; 

put '<svg height="500" width="500" viewbox="0 0 35 35">'/;
loop((i,j),
     put '<rect x="',ord(j):0:0,'" y="',ord(i):0:0,'" width="1" height="1" ';
     put$(x.l(i,j)<0.5) 'fill="white" ';
     put$(x.l(i,j)>0.5) 'fill="lightblue" ';
     put 'stroke-width="0.01" stroke="black" '
     put$(%drawborder%=1 and border(i,j)) 'style="filter: brightness(90%);"';
     put '/>'/;
);
put '</svg>'/;

putclose;

executetool 'win32.ShellExecute "%html%"';

Known solutions

Rob Pratt pointed me to the series listed in https://oeis.org/A104519 with objective values. I initially misread this, but our 227 solution can be found as a(34).